Andreas wrote:
> Hi,
>
> how can I find double entries in varchar columns where the content is not
> 100% identical because of a spelling error or the person considered it
> "looked nicer" that way?
>
> I'd like to identify and then merge records of e.g. 'google', 'gogle',
> 'guugle'
> Then I wa
Vivek Khera wrote:
>
> On Apr 15, 2008, at 11:23 PM, Tom Lane wrote:
>> What's really a duplicate sounds like a judgment call here, so you
>> probably shouldn't even think of automating it completely.
>
> I did a consulting gig about 10 years ago for a company that made
> software to normalize street addresses and names. Lit
On Wed, 16 Apr 2008, Andreas <[EMAIL PROTECTED]> writes:
> how can I find double entries in varchar columns where the content is
> not 100% identical because of a spelling error or the person
> considered it "looked nicer" that way?
>
> I'd like to identify and then merge records of e.g. 'google'
From: [EMAIL PROTECTED] on behalf of Andreas
Sent: Tue 4/15/2008 8:15 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] How to find double entries
Hi,
how can I find double entries in varchar columns where the content is
not 100% identical because of a spelling error or the person considered
it "l
Andreas <[EMAIL PROTECTED]> writes:
> I'd like to identify and then merge records of e.g. 'google', 'gogle',
> 'guugle'
> Then I want to match abbreviations like 'A-Company Ltd.', 'a company
> ltd.', 'A-Company Limited'
> Is there a way to do this?
> It would be OK just to list candidates up
Andreas wrote:
> Hi,
>
> how can I find double entries in varchar columns where the content is
> not 100% identical because of a spelling error or the person considered
> it "looked nicer" that way?
When doing some near-duplicate elimination as part of converting a
legacy data set to PostgreSQL I
Hi,
how can I find double entries in varchar columns where the content is
not 100% identical because of a spelling error or the person considered
it "looked nicer" that way?
I'd like to identify and then merge records of e.g. 'google', 'gogle',
'guugle'
Then I want to match abbreviations
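
For what it's worth, here is a minimal sketch of the "list candidates, then let a person decide" idea, written in Python outside the database. It is not anyone's method from this thread. The abbreviation table, the function names and the 0.6 threshold are assumptions made up for illustration, and the similarity measure is just the ratio from Python's standard difflib module. It only lists candidate pairs and merges nothing, in line with the point that what counts as a duplicate is a judgment call.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical abbreviation table (an assumption); extend it for your own data.
ABBREVIATIONS = {"ltd": "limited"}


def normalize(name):
    """Lowercase, turn punctuation into spaces, expand known abbreviations."""
    words = re.sub(r"[^a-z0-9]+", " ", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def candidate_pairs(names, threshold=0.6):
    """Return (a, b, score) for each pair whose normalized forms look similar.

    The threshold is an arbitrary starting point, not a recommended value.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    # Most similar pairs first, so a reviewer sees the likeliest duplicates on top.
    return sorted(pairs, key=lambda p: -p[2])


if __name__ == "__main__":
    names = ["google", "gogle", "guugle",
             "A-Company Ltd.", "a company ltd.", "A-Company Limited"]
    for a, b, score in candidate_pairs(names):
        print(f"{score:.2f}  {a!r}  ~  {b!r}")
```

This compares every pair, so it is only practical for a few thousand distinct values. A lower threshold lists more candidates at the cost of more false matches to review by hand.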