On Thu, 16 Sep 2010 06:22:15 -0700, Andreas maps...@gmx.net wrote:
It's not only typos that need catching. There are variations in how
things are written that are not necessarily wrong.
e.g.
Miller's Bakery
Bakery Miller
Bakery Miller, Ltd.
Bakery Miller and sons
Bakery Smith (formerly Miller)
and the
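
To illustrate the kind of variation listed above, here is a minimal Python sketch that reduces each spelling to a comparable key before scoring similarity. The NOISE_TOKENS set and the normalize/similarity helpers are hypothetical names made up for this example, not something proposed in the thread, and the rules would need tuning against real data.

```python
import re
from difflib import SequenceMatcher

# Hypothetical filler / legal-form tokens to ignore; extend for real data.
NOISE_TOKENS = {"ltd", "and", "sons"}

def normalize(name: str) -> str:
    """Reduce a company name to a sorted, lowercase token key."""
    name = name.lower()
    name = re.sub(r"\(.*?\)", " ", name)    # drop remarks like "(formerly Miller)"
    name = re.sub(r"'s\b", "", name)        # "miller's" -> "miller"
    tokens = re.findall(r"[a-z0-9]+", name)  # split on punctuation and spaces
    return " ".join(sorted(t for t in tokens if t not in NOISE_TOKENS))

def similarity(a: str, b: str) -> float:
    """Similarity of the normalized keys, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

names = [
    "Miller's Bakery",
    "Bakery Miller",
    "Bakery Miller, Ltd.",
    "Bakery Miller and sons",
    "Bakery Smith (formerly Miller)",
]
for n in names:
    print(f"{n!r:35} -> {normalize(n)!r}")
```

The first four names collapse to the same key, while the last one does not, because dropping the parenthesised remark also drops the hint that it used to be Miller. Cases like that would still need a manual look.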
On Sep 15, 2010, at 10:40 PM, Andreas wrote:
I need to clean up a lot of contact data because of a merge of customer lists
that used to be kept separate.
I already know that there are duplicate entries within the lists and they do
overlap, too.
Relevant fields could be name, street, zip,
On Thu, Sep 16, 2010 at 04:40:42AM +0200, Andreas wrote:
I need to clean up a lot of contact data because of a merge of customer
lists that used to be kept separate.
I already know that there are duplicate entries within the lists and they
do overlap, too.
Relevant fields could be name,
On 16.09.2010 13:18, Sam Mason wrote:
On Thu, Sep 16, 2010 at 04:40:42AM +0200, Andreas wrote:
I need to clean up a lot of contact data because of a merge of customer
lists that used to be kept separate.
What to do depends on how much data you have; a few thousand and you can
do lots of
On Thu, Sep 16, 2010 at 03:22:15PM +0200, Andreas wrote:
We are talking about nearly 500,000 records with considerable overlap.
Another thing to consider is whether each one contains unique entries and
hence whether you can do a best match between datasets; FULL OUTER JOIN is
your friend here, but
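
As a rough illustration of what a full outer join between two lists gives you, here is a minimal Python sketch that pairs rows on a shared key and keeps unmatched rows from either side. The function name, the sample rows and the (zip, lowercased name) key are assumptions made up for this example; in PostgreSQL itself the same idea would be a FULL OUTER JOIN on the chosen key columns.

```python
def full_outer_join(left, right, key):
    """Pair rows on key(row); rows without a partner are paired with None."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(key(row), []).append(row)

    result = []
    matched = set()
    for l_row in left:
        k = key(l_row)
        if k in right_by_key:
            matched.add(k)
            result.extend((l_row, r_row) for r_row in right_by_key[k])
        else:
            result.append((l_row, None))   # only in the left list
    for k, rows in right_by_key.items():
        if k not in matched:
            result.extend((None, r_row) for r_row in rows)  # only in the right list
    return result

# Hypothetical sample data: (name, zip) tuples from two customer lists.
list_a = [("Bakery Miller", "12345"), ("Bakery Smith", "54321")]
list_b = [("bakery miller", "12345"), ("Butcher Jones", "99999")]

def match_key(row):
    name, zip_code = row
    return (zip_code, name.lower())

pairs = full_outer_join(list_a, list_b, match_key)
for a, b in pairs:
    print(a, "<->", b)
```

Rows that come back with None on one side exist in only one list; rows with both sides filled are candidate duplicates to review.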
On 9/16/2010 5:18 AM, Sam Mason wrote:
On Thu, Sep 16, 2010 at 04:40:42AM +0200, Andreas wrote:
I need to clean up a lot of contact data because of a merge of customer
lists that used to be kept separate.
I already know that there are duplicate entries within the lists and they
do
Hi,
I need to clean up a lot of contact data because of a merge of customer
lists that used to be kept separate.
I already know that there are duplicate entries within the lists and they
do overlap, too.
Relevant fields could be name, street, zip, city, phone
Is there a way to do something
Andreas wrote:
I need to clean up a lot of contact data because of a merge of customer
lists that used to be kept separate.
I already know that there are duplicate entries within the lists and they
do overlap, too.
Relevant fields could be name, street, zip, city, phone
Is there a way to do
Andreas,
Relevant fields could be name, street, zip, city, phone
Is there a way to do something like this with PostgreSQL?
I fear this will still need a lot of manual sorting and searching even when
potential peers get automatically identified.
One of the techniques I use to increase the