On one occasion I encountered an interesting piece of code.
There was a user table and the only unique identifier was the identity field itself. I would guess as the month and years progressed this table was enriched with additional columns until someone decided to call a halt on adding columns, but rather created a key value pair table to store additional attributes.
So this led to an interesting insert process. Inserting a single row was fairly simple. i.e. Insert row into main table, use the scope_identity to then insert additional details into the key value pair table.
However, the process to insert in bulk made use of cursors and xml. Each row was processed within a cursor row by row to allow the identity of the first insert to be used to join it the key value pair table. I assume that this was done, because, if the data was inserted in bulk into the first table first, there would be no way to join back to it to identify the identities that had just been created, because there was no secondary unique identifier.
Whatever your reason may be, I sometimes use this little feature of the merge statement to capture identities
Before we go anywhere, lets have a quick overview of scope identity
This will return the identity of the last inserted value. It will be limited to the current session and the current scope.
i.e. If you insert into a table and that table has a trigger to insert into a second table, it will return the value of the first table.
This will return the identity of the last inserted value. It will be limited to the current session.
i.e. If you insert into a table and that table has a trigger to insert into a second table, it will return the value of the second table.
This will return the identity of the last inserted value. It is not bound to any session.
Now that we’ve got this out of the way, lets focus on bulk capturing rather than individual inserts
CREATE TABLE dbo.Users
The unique key is the ID.. So if we were to insert ‘Fred’ twice it would be impossible after the inserts are successful to capture which process created the entry. (I know it’s bad table design :))
INSERT INTO dbo.Users
As you can see, two rows were inserted and the final output from the scope identities show the final ID as 2 and obviously disregard the first.
SQL server does also offer the inserted table as reference via the output clause.
But there is still no way that this can be tied back to the original data. The source dataset cannot be referenced within the output clause.
Roll on the merge syntax
So breaking this down
- The CTE constructs two rows to be inserted. The column “WhichRow” is used to verify the data at the end
- Merge against the User table
- Merge using 1=2 .. This means that every row in the source dataset will be treated as not matched
- Insert the data
- Return the inserted data and the source data together via the output clause
- Note: the could be directed into a temp table to be used for a second process