
SQL SERVER: Find Duplicate Rows with Row_Number
12/31/2008This is very known question to find Duplicate Rows from the Table and Delete Duplicate rows from the table
I came to know about “ROW_NUMBER()” of SQL SERVER 2005.
Lets see how we can use this feature:
There is one Table:
CREATE TABLE FindDuplicates(
EmpID INT IDENTITY(1,1),
EmpName VARCHAR(500)
)
INSERT some data into the table:
INSERT INTO FindDuplicates VALUES('A')
INSERT INTO FindDuplicates VALUES('A')
INSERT INTO FindDuplicates VALUES('B')
INSERT INTO FindDuplicates VALUES('C')
INSERT INTO FindDuplicates VALUES('C')
INSERT INTO FindDuplicates VALUES('D')
INSERT INTO FindDuplicates VALUES('D')
Now we have data like:

Now, we need to Find the duplicate Rows from the table. So we need to get “A”, “C” and “D”.
Lets see the use of Row_Number():
;With CTE AS(
SELECT ROW_NUMBER() OVER(PARTITION BY EmpName ORDER BY EmpName) As RowID,
EmpID,
EmpName
FROM FindDuplicates
)
SELECT *
FROM CTE
WHERE RowID > 1
Output will be like:

So we can Delete Duplicate Rows as we have EmpID (Primary Key)
I Used:
PARTITION BY: which will find same rows and assign ID accordingly.
ORDER BY: In which order we want to assign ID. By Default its Ascending
Fantastic! If I know that some days ago …
Thank you
Your query won’t work if you had more than two records with a duplicate EmpName value.
Standard grouping syntax would probably work as well…
SELECT MIN (EmpID) AS EmpID, EmpName
FROM FindDuplicates
GROUP BY EmpName
While you don’t get the rank number, you really don’t need it in the result set. If you don’t want the first record as the query above, use MAX instead for the EmpID.
Cheers,
Ben
Hi benstaylor,
If I have more than two records, then it will give me both the duplicate Emp Name.
Lets say now I insert like this:
INSERT INTO FindDuplicates VALUES(‘A’)
INSERT INTO FindDuplicates VALUES(‘A’)
INSERT INTO FindDuplicates VALUES(‘A’)
INSERT INTO FindDuplicates VALUES(‘B’)
INSERT INTO FindDuplicates VALUES(‘C’)
INSERT INTO FindDuplicates VALUES(‘C’)
INSERT INTO FindDuplicates VALUES(‘C’)
INSERT INTO FindDuplicates VALUES(‘D’)
INSERT INTO FindDuplicates VALUES(‘D’)
this query will return me
RowID EmpID EmpName
2 2 A
3 3 A
2 6 C
3 9 C
2 8 D
Could you let me know what is the problem?
Thanks,
Tejas
very nice!
Also see here how to use row_number() function for various purposes
http://beyondrelational.com/blogs/madhivanan/archive/2007/08/27/multipurpose-row-number-function.aspx
Madhivanan
Hi,
I am in urgent need of a solution. My problem is, i need to retrieve the data which has duplicate records and need to have the row number start from 1 to the number of records.
Ex:
ProductID Amount RowNumber
1 200.00 1
1 330.00 2
2 400.00 1
2 600.00 2
2 350.00 3
3 200.00 1
4 330.00 1
5 200.00 1
5 330.00 2
Please help me in solving this.
Thanks,
Niranjan
very helpfull