Hi Jack,

That’s been helpful, thanks. We’ve aligned our Damerau/Levenshtein
algorithms; the latest version should behave as expected [1, 2].

Best,
Christian

[1] https://files.basex.org/releases/latest/
[2] https://github.com/BaseXdb/basex/commit/6889ac108c6b32d448d640d53ec098bbb8938f06


On Thu, May 9, 2024 at 8:29 AM Jack Steyn <steynj...@gmail.com> wrote:

> Hi,
>
> According to my copy of BaseX 10.7,
>
> string:levenshtein('oil field', 'oilfield')
>
> and
>
> string:levenshtein('oil field', 'coalfield')
>
> both return the same value, 0.7777777777777778.
>
> My understanding is that the Levenshtein-Damerau distance between 'oil
> field' and 'oilfield' is 1 and between 'oil field' and 'coalfield' is 3, so
> following the formula from
> https://docs.basex.org/wiki/String_Module#string:levenshtein
>
> 1.0 – distance / max(length of strings)
>
> should give 0.888... and 0.666... respectively.
>
> Am I off-base here or is there something awry with string:levenshtein?
>
> Cheers,
>
> Jack
>
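
For reference, the expected values in the quoted message can be checked
independently of BaseX. The following is a minimal Python sketch, not the
BaseX implementation: it assumes the "optimal string alignment" variant of
the Damerau-Levenshtein distance and applies the formula quoted above,
1.0 - distance / max(length of strings). The function names osa_distance
and similarity are hypothetical and chosen only for this illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Damerau-Levenshtein distance, optimal string alignment variant.

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters. This is an assumed variant, not necessarily
    the one BaseX uses.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    # Turning a prefix into the empty string costs one deletion per char.
    for i in range(rows):
        d[i][0] = i
    # Building a prefix from the empty string costs one insertion per char.
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Transposition of two adjacent characters.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity: 1.0 - distance / max(length of strings)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - osa_distance(a, b) / longest


if __name__ == "__main__":
    for other in ("oilfield", "coalfield"):
        print(other, osa_distance("oil field", other),
              similarity("oil field", other))
```

Under these assumptions the distances come out as 1 and 3, which gives
similarities of 8/9 and 6/9, matching the 0.888... and 0.666... expected
in the quoted message.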
