Christian Marie wrote:
> A developer I work with was trying to use dmetaphone to group people's names
> into equivalence classes. He found that many long names were grouped together
> when they shouldn't be. This turned out to be because dmetaphone has an
> undocumented upper bound of four characters on its output length. This is
> obviously impractical for many use cases.
> 
> This patch addresses this by adding and documenting an optional argument to
> dmetaphone and dmetaphone_alt that specifies the maximum output length. This
> makes it possible to use dmetaphone on much longer inputs.
> 
> Backwards compatibility is catered for by making the new argument optional,
> defaulting to the old, hard-coded value of four. We now have:
> 
>       dmetaphone(text source) returns text
>       dmetaphone(text source, int max_output_length) returns text
>       dmetaphone_alt(text source) returns text
>       dmetaphone_alt(text source, int max_output_length) returns text

I like the idea.

How about:
    dmetaphone(text source, int max_output_length DEFAULT 4) returns text
    dmetaphone_alt(text source, int max_output_length DEFAULT 4) returns text

Saves two functions and is self-documenting.
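
To illustrate the mechanism, here is a minimal sketch using a stand-in
function. demo_key is hypothetical and not part of fuzzystrmatch; it only
upper-cases and truncates its input, so the DEFAULT behaviour can be tried
without the patch applied. It assumes PostgreSQL 9.2 or later, since the
SQL function body refers to its parameters by name.

```sql
-- Hypothetical stand-in, NOT part of fuzzystrmatch: it only upper-cases and
-- truncates, so that the DEFAULT mechanism can be tried without the patch.
CREATE FUNCTION demo_key(source text, max_output_length int DEFAULT 4)
RETURNS text
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT left(upper(source), max_output_length) $$;

-- One declaration serves both call forms:
SELECT demo_key('unicorns');     -- second argument omitted, default 4 applies
SELECT demo_key('unicorns', 8);  -- explicit maximum length
```

With this shape, \df shows the default value in the argument list, which
is what I mean by self-documenting.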

> +postgres=# select dmetaphone('unicorns');
> + dmetaphone
> +------------
> + ANKR
> +(1 row)
> +
> +postgres=# select dmetaphone('unicorns', 8);
> + dmetaphone
>  ------------
> - KMP
> + ANKRNS
>  (1 row)
>  </screen>
>   </sect2>

Yeah, "ponies" would have been too short...

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
