Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

Josh Berkus Tue, 17 Jun 2014 14:42:18 -0700

On 06/17/2014 02:36 PM, Tom Lane wrote:
> Josh Berkus writes:
>> (2) If there are multiple columns with the same Levenshtein distance,
>> which one do you suggest?  The current code picks a random one, which
>> I'm OK with.  The other option would be to list all of the columns.
> 
> I objected to that upthread.  I don't think that picking a random one is
> sane at all.  Listing them all might be OK (I notice that that seems to be
> what both bash and git do).
> 
> Another issue is whether to print only those having exactly the minimum
> observed Levenshtein distance, or to print everything less than some
> cutoff.  The former approach seems to me to be placing a great deal of
> faith in something that's only a heuristic.


Well, that depends on what the cutoff is.  If it's high, like 0.5, that
could be a LOT of columns.  Like, I plan to test this feature with a
3-table join that has a combined 300 columns.  I can completely imagine
coming up with a string which is within 0.5 or even 0.3 of 40 column names.

So if we want to list everything below a cutoff, we'd need to make that
cutoff fairly narrow, like 0.2.  But that means we'd miss a lot of
potential matches on short column names.
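
To make the tradeoff concrete, here is a rough sketch of the two
candidate policies (everything under a cutoff vs. everything tied at the
minimum distance). This is not the patch's code. The function names are
hypothetical, and it assumes the cutoff is applied to the edit distance
divided by the length of the longer string, which may not match how the
patch normalizes it.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def normalized(typo: str, col: str) -> float:
    # Assumption: normalize by the longer of the two strings.
    longest = max(len(typo), len(col))
    return levenshtein(typo, col) / longest if longest else 0.0


def below_cutoff(typo: str, columns: list[str], cutoff: float) -> list[str]:
    """Policy A: list every column whose normalized distance is <= cutoff."""
    return [c for c in columns if normalized(typo, c) <= cutoff]


def all_minimum(typo: str, columns: list[str]) -> list[str]:
    """Policy B: list every column tied at the smallest raw distance."""
    if not columns:
        return []
    dists = {c: levenshtein(typo, c) for c in columns}
    best = min(dists.values())
    return [c for c in columns if dists[c] == best]
```

With a cutoff of 0.2, a one-character typo such as "ix" for "id" scores
0.5 and is never suggested, which is the short-column-name problem. With
policy B, "user_nam" ties between "user_name" and "user_num" and both
would be listed.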

I really think we're overthinking this: it is just a HINT, and we can
improve it in future PostgreSQL versions, and most of our users will
ignore it anyway because they'll be using a client which doesn't display
HINTs.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

