If you have worked with data, you must have come across a duplicate record.
These records are quite annoying and we should know the best possible way to deal with these records.
Today I’m going to show you the 10 best possible ways to remove duplicate records from the database
Each method has its own pros and cons
I’ll walk you through each one of them step by step.
So that you have a clear understanding of all these strategies beforehand
updated on:
Here is the list of strategies that you are going to learn today:
We’ll use below data records as an exercise sheet:
To follow along you may download the same here.
Here’s the step by step process:
First, select the entire data.
In our case will select the data from a range A1 to C36 from sheet M1
Under Data tab look for Data Tools group
Click on Remove Duplicates
A Remove Duplicate window will show up
Keep the default selection as it is and click on OK
A message box from Microsoft Excel will pop-up saying :
16 duplicate values found and removed; 19 unique values remain.
This is how you can use Excel Built-in Option to Remove duplicate records
Take away from this method:
This is also an excel build-in option under Home Tab – Styles group
Here are the steps:
Select the data you want to remove duplicates from
In our case select Range A1 to C36 from sheet M2
From Conditional Formatting drop-down under Home tab choose :
Highlight Cells Rules – Duplicate Values
Duplicate Values window will pop-up
From this window, you can format the cells based on your requirements.
To follow along with this example, keep the default selection as it is and click on ok
Once you click on the ok button..,
All the cells with the duplicate value will get highlighted
This is how you can use Conditional Formatting Option to Remove duplicate records
Take away from this method:
Most of us may know that..,
PIVOT Table is one of the best tools to manipulate data quickly:
so as to remove duplicate records
Here’s are the steps to follow:
From sheet, M3 select the data Range A1 to C36
Click on PivotTable from Insert tab Tables group
Create PivotTable window will show up:
Under this window choose Existing Worksheet
Click on the input box under Location and select cell E1:
Click on OK
Now
Under PivotTable Fields, drag and drop the fields as shown below:
Right click on cell E1 and choose PivotTable Options…
Under PivotTable Options, window click on Display tab and select Classic PivotTable layout
Click on OK
From top Ribbon bar select Design tab
Click on Subtotals dropdown arrow and choose Do Not Show Subtotals under layout group
Arrange the column to AutoFit the Column Width
Now you could see all the 19 unique records
Plus under the Total column, you could see how many times these records were repeated.
Quite interesting right…
Take away from this method:
This option is one of my personal favorite on this list
Let’s get started:
From our exercise workbook select sheet M4
From Data tab click on Advanced under Sort & Filter group
An Advanced Filter window will pop up
From this window choose Copy to another location under Action
Select range A1 to C36 under list range
For criteria range select cell A1
Under Copy to select cell E1
Checkmark the Unique records only and click on OK
boom…
All the unique records are shown up
AutoFit the Column Width to get an exact same formatting
Take away from this method:
An old age method
This strategy was mostly used in the office version 2003 and beyond… when there wasn’t any inbuilt option within excel to remove duplicates
Let’s follow the steps:
Select sheet M5 from our exercise workbook
We need to add an extra column to work on.. so let’s add an extra column next to column “C” and give it a title as “Unique Ref”
I just made up this name, you can come up with your own title name
Select the range A1 to D36
From Data tab click on Filter under Sort and Filter group
From Cell A1 filter drop-down choose Sort A to Z.
So that all the records including duplicate records get placed below each other
Now that all the records are placed in proper order, it’s time for some action
In a cell, D2 place the formula as =A2=A3 and drag it till D36
WOW, you are done!
From Cell D1 filter drop-down choose FALSE for all 19 unique records and TRUE for all 16 duplicate records
Take away from this method:
Till now we have not used any excel functions to remove duplicate records.
However, In this method, we are going to make use of excel function to find duplicate records.
Let’s move on:
Select sheet M6 from our exercise workbook
In this method also we need to add an extra column to work on
So let’s add an extra column and give it a title as “Unique Ref”
Now in Cell D2 we will use countif function to get an exact count of each record
Put the formula as:
=COUNTIF(A$2:A2,A2)
in cell D2 and drag it till D36
Select the range A1 to D36
From Data tab click on Filter under Sort and Filter group
made it till here… Great!
Now from Cell D1 filter drop-down choose 1 for all 19 unique records and choose all other numbers for duplicate records
Take away from this method:
This method is a combination of multiple Excel functions like :
IFERROR; INDEX; MATCH; COUNTIF
Using all these functions we will create an ARRAY formula to detect a duplicate records
So, let’s get started:
From our exercise workbook select sheet M7
Copy the Title from A1 to C1 and Paste it in cell E1 to G1
In Cell E2 place the formula as:
=IFERROR(INDEX($A$2:$A$36, MATCH(0,COUNTIF($E$1:E1, $A$2:$A$36), 0)),””)
Now we need to convert this formula in an ARRAY
To do that:
Select cell E2 and press function key F2 from your keyboard to get the cursor within a cell
With a cursor within a cell press keyboard key Ctrl + Shift + Enter
do it all at once to get curly braces {..} around the formula
Your formula should look like:
{=IFERROR(INDEX($A$2:$A$36, MATCH(0,COUNTIF($E$1:E1, $A$2:$A$36), 0)),””)}
Next:
Drag the formula from range E2 to E36
OR
Select the range E2 to E36 and press keyboard shortcut key Ctrl + d
This will give you all the unique Ref id
To get the name of Book and Author
Place the formula:
=IFERROR(INDEX($A$2:$C$36, MATCH($E2,$A$2:$A$36,0), MATCH(F$1,$A$1:$C$1,0)),””)
in cell F2
Now copy the cell F2
Select the range F2 to G36 and Paste it to get the Book and Author name.
That’s it!
Take away from this method:
This method is almost the same as countif.
The only difference is we can set multiple criteria using a countifs function
Steps to follow:
Select sheet M8 from our exercise workbook
Add an extra column and give it a title as “Unique Ref”
Place a formula:
=COUNTIFS(A$2:A2,A2,B$2:B2,B2,C$2:C2,C2)
in cell D2 and drag it till D36
Select the range A1 to D36
From Data tab click on Filter under Sort and Filter group
Now from Cell D1 filter drop-down choose 1 for all 19 unique records and choose all other numbers for duplicate records
Take away from this method:
This is a very Impressive and revolutionary tool from Microsoft.
It’s a completely new feature in office 2016.
However, you can get this feature for office version 2010 and 2013 by downloading a Free add-in from Microsoft.
Let’s get back to our topic Remove duplicate records using power query
Select sheet M9 from our exercise workbook
To use a power query we need to convert our data range into a table.
To do that, select the range A1 to C36
Click on Table from Insert tab Tables group
Create Table window will pop-up
Keep the default selection as it is and click on OK
Your data range will now get converted into a table
Next:
From Data tab click on From Table under Get & Transform group
A new Power Query Editor window will open up
Right click on Ref Id under query editor and click on Remove Duplicates
Now we need to close the query editor:
To do that, click on Close & Load option from Home tab query editor
A new sheet with unique records will get added to your workbook
Take away from this method:
There are various ways to remove duplicates using Excel VBA Macros
I will walk you through two of my best strategies:
Our first Macro will automatically remove all the duplicate records from your data.
&
Our second Macro will create a unique record, without messing with the original records.
Sounds good!
Let’s first create a fresh data records:
From our Exercise workbook copy the Row data from sheet1
Press keyboard shortcut key ctr + n to open a new workbook
From the clipboard paste the copied data in sheet1 and sheet2
Your fresh data records are ready.
Close all your other workbooks, so that you do not mess around
Now let’s create a macro using our first method.
Automatically remove all the duplicate records from your data.
From Sheet1 press keyboard shortcut key Alt + F11 to open Microsoft Visual Basic for Applications window.
From this window click on Insert tab and choose Module
Module1 will get added
Double click on the Module1 and paste the below code:
Sub Remove_duplicates_m1()
ActiveSheet.Range("A:C").RemoveDuplicates _
Columns:=Array(1, 2, 3), Header:=xlYes
End Sub
Now close the Visual Basic Application window
From sheet1 press keyboard shortcut key Alt + F8 to open a Macro Name Window
Choose the Macro name we just created “Remove_duplicates_m1” and click on Run
WOW
All the unique records are shown up
Cool right…
Now let’s look at our second method
Create a unique record, without messing with the original data points
From the active workbook, select sheet2
Use keyboard shortcut key Alt + F11 to open Microsoft Visual Basic for Applications window.
Select the Module1 we just created in our earlier example
Paste the below code within the module
Sub Remove_duplicates_m2()
ActiveSheet.Range("A:C").AdvancedFilter _
Action:=xlFilterCopy, CopyToRange:=Range("F1"), _
Unique:=True
End Sub
Here’s how it should look like:
Close the Visual Basic Application window
From sheet2 press keyboard shortcut key Alt + F8 to open a Macro Name Window
Choose the Macro name “Remove_duplicates_m2” and click on Run
Now you must see all the unique records in cell F1 to H20.
Take away from both method:
Which of these 10 strategies would you like to try first?
Is it the Advanced filter method OR the VBA Macro method
Or maybe I missed one of your favorite methods, that could be an addon to this list
Either way, let me know by leaving a comment below right now.
Leave a Reply