Re: [GENERAL] something better than pgtrgm?

2012-10-10 Thread Willy-Bas Loos
Thanks, but no, we do need the performance. And we have admins (not users) enter the names and codes, but we can't make it way complicated to do that. I thought you meant that they see to it that the names end up in the database under the correct encoding (which is a logical thing to do...). Thanks

[GENERAL] something better than pgtrgm?

2012-10-09 Thread Willy-Bas Loos
Hi, I need a *language unaware* text comparison algorithm, so I found pgtrgm. But I am not so content with it, because the similarities it finds are:
- biased to favor text that is the same in the first character
- much dependent on similar length of the strings
Are there any other
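
The two complaints above can be reproduced with a small sketch of trigram similarity. This is an approximation written for illustration, not the module's actual code: it assumes the documented pg_trgm behaviour of lowercasing, padding each word with two leading spaces and one trailing space, and scoring shared trigrams over the union of both sets. The function names `trigrams` and `similarity` here are plain Python helpers, and the word splitting via `\w+` is a simplification.

```python
import re


def trigrams(text):
    """Return the set of padded trigrams for every word in text.

    Assumption: each word gets two leading spaces and one trailing space,
    which is how the pg_trgm documentation describes trigram extraction.
    """
    grams = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by the union of both trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A typo in the first character destroys three trigrams,
# a typo in the last character only two.
print(similarity("sparrow", "xparrow"))      # first letter differs
print(similarity("sparrow", "sparrox"))      # last letter differs
# A much longer string drags the score down even though it contains the word.
print(similarity("sparrow", "sparrowhawk"))
```

Under these assumptions the first-character typo scores 5/11 while the last-character typo scores 6/10, because the extra leading pad gives the first letter a share in three trigrams. The length dependence comes from the union in the denominator, which grows with every trigram that only the longer string has.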

Re: [GENERAL] something better than pgtrgm?

2012-10-09 Thread Andrew Sullivan
On Tue, Oct 09, 2012 at 02:10:26PM +0200, Willy-Bas Loos wrote:
> Hi, I need a *language unaware* text comparison algorithm [. . .]
> (i want to use it for *did you mean ...?* for approx 6-10 character codes or 8-20 letter words of mixed languages)
I don't think this is going to do what you

Re: [GENERAL] something better than pgtrgm?

2012-10-09 Thread Willy-Bas Loos
Hi Andrew, thanks for replying.
On Tue, Oct 9, 2012 at 2:18 PM, Andrew Sullivan a...@crankycanuck.ca wrote:
> But for the mixed languages case, surely it's not _any_ mixed language? Are you mixing Arabic, Farsi, Chinese, and Hindi, for instance?
We're mixing species names of birds in Greek and

Re: [GENERAL] something better than pgtrgm?

2012-10-09 Thread Andrew Sullivan
On Tue, Oct 09, 2012 at 03:10:31PM +0200, Willy-Bas Loos wrote: We're mixing species names of birds in Greek and Latin (scientific names), and all languages spoken in Africa, Europe and western Asia. Yike. I'm not very knowledgeable about scripts around the world, but I am afraid that the

Re: [GENERAL] something better than pgtrgm?

2012-10-09 Thread Willy-Bas Loos
On Tue, Oct 9, 2012 at 3:23 PM, Andrew Sullivan a...@crankycanuck.ca wrote:
> you will need to be extremely rigorous about normalizing spellings on the way in. Is that a possibility?
Yes, it is.
> If so, I can almost imagine a way this could work
Great! How?
-- Quality comes from focus
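
What "normalizing spellings on the way in" might look like is not spelled out in the thread, so the following is only one possible sketch. It assumes that Unicode compatibility decomposition, removal of combining marks, case folding, and whitespace collapsing are acceptable for the names being stored. The helper name `normalize_name` is hypothetical. Stripping diacritics loses information and may be wrong for some of the languages involved, so treat that step as an assumption to verify.

```python
import unicodedata


def normalize_name(name):
    """Return a canonical comparison key for a name (hypothetical helper).

    Assumed steps, none of which come from the thread itself:
    1. NFKD decomposition, so accented letters split into base + mark.
    2. Drop combining marks (lossy: removes diacritics).
    3. Case-fold and collapse runs of whitespace.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


print(normalize_name("  Érithacus   RUBECULA "))
```

A key like this would typically be stored in its own column next to the name as entered, so the admins still see the original spelling while comparisons run on the normalized form.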

Re: [GENERAL] something better than pgtrgm?

2012-10-09 Thread Andrew Sullivan
On Tue, Oct 09, 2012 at 03:54:35PM +0200, Willy-Bas Loos wrote:
> > If so, I can almost imagine a way this could work
> Great! How?
Well, it involves very large tables. But basically, you work out a variant table for any language you like, and then query across it with subsets of the