Find Duplicates between two Tables in Access

Black lists are an important component of marketing. It is not difficult to maintain them. And with the help of the DataQualityTools, black lists can also easily be processed.

Black lists are a good method of respecting the wishes of individual addressees to not receive any advertising, and even save money in doing so. Furthermore, creating your own black lists allows you to exclude certain recipients from a particular marketing campaign, such as competitors or your own employees, and for campaigns designed to recruit new customers, you can exclude your existing customers.

To process black lists, you need a program that is able to carry out an error-tolerant duplicate detection between two tables, such as the DataQualityTools. By doing a search for duplicates between two tables, you can also synchronize address lists, for example to remove addresses that were already solicited with a previous marketing campaign so as not to solicit them again, or to transfer information, such as the telephone or customer number, from one address list to another based on the results of the comparison. There are many ways to make use of a duplicate detection between two tables. And with the DataQualityTools, a search for duplicates between two tables is done in a snap.

To do so, follow the steps below:

  1. If you have not already done so, download DataQualityTools free of charge here. Install the program and request a trial activation. Then you can work with the program for one whole week without any restrictions.
  2. The functions we need can be found in the menu of the 'Matching between two tables' block. Let’s choose 'Universal Matching'.

    Universal Matching

  3. After starting this function, the project management appears. Here, you create a new project with a project name of your choice and then click on the 'Next' button.
  4. In the next step, we start by opening the two Access files with the data to be processed by clicking on the 'Open file' button.

    Data source Access

    After that, we enter the name of the database server. After clicking on the 'Connect to server' button, the access data have to be entered. Finally, the desired database containing the table can be selected in the corresponding selection lists.
  5. Then you have to tell the program which of the columns from the table you want to compare:

    Matching definition

    In this example, the column 'City' is to be compared with. This contains the city name, which is why 'City' was selected from the selection list for the field content. And 70% has been chosen as the threshold for the configence score. The city name must therefore be at least 70% the same, so that the data record in question appears as a hit in the results.

    If required, individual columns can also be combined to form a group:

    Matching definition

    The contents of the columns are then summarised in the group before the comparison and are thus compared together. In this example, the contents of the data fields 'Street' and 'HouseNo' from Table 1 is compared with the contents of the data field 'Street' from Table 2.
  6. With a click on the button 'Next', we get to a dialogue with further options. But we don't need them here.
  7. A click on the 'Next' button starts the matching. In no time at all, a summary of the results is displayed. 

    Matching results

    If the program has found duplicates between the processed tables, a click on the 'OK' button takes you to the dialogue with the functions for transforming the results. Otherwise, reduce the threshold for the degree of matching and the re-start the comparison.
  8. In the dialogue with the functions for transforming the results, there is a button at the very top with the caption 'Manual post-editing'. Here, the results of the comparison is presented in table form. Those records that should be deleted are marked with a red X, which can be removed if necessary.

    Matching results

  9. Finally, you must transform the results. For example, we could delete the records marked with a deletion flag directly of the source file. To do this, we select the appropriate function by first clicking on 'Deleting functions':

    Deleting functions

    And then click on 'Deleting in the Source Table':

    Deleting in the Source Table

You can use DedupeWizard as well to execute a search for duplicates between two tables. It can only process Excel files and comprises only the most important functions of DataQualityTools.

Note: There is a video tutorial available for DataQualityTools, introducing the user to the operation of the program based on the example of a duplicate detection within one table.

You can read about how to use DataQualityTools to search for duplicates within an address list in the article 'Remove Duplicates in Access'.