Useful keywords; thank you!

Also more of a development effort than this project will support, alas.
(Unless someone's willing to provide a pointer to their public release of
such a solution, free for commercial use?  Which doesn't seem a whole lot
more likely than someone throwing a gold brick through my window.)
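
For the archives, here is a rough sketch of what the fingerprinting
suggestion quoted below could mean in its very simplest form. It is
Python rather than XQuery, and the names fingerprint() and
group_similar() are made up for illustration; nothing here is a BaseX
API. The key is computed from one paragraph alone, so it can be used as
a grouping key, but it only catches paragraphs that differ in case,
punctuation, word order, or repeated words. It will not catch typos or
inserted words the way a Levenshtein comparison would; that probably
needs something like shingling with MinHash or SimHash, which is the
larger development effort.

```python
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Crude per-paragraph key: sorted, de-duplicated, lowercased word tokens.

    Assumption: this stands in for a real content fingerprint. It is an
    intrinsic property of a single string, unlike an edit distance, which
    is only defined for a pair of strings.
    """
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(sorted(set(tokens)))


def group_similar(paragraphs):
    """Group paragraphs that share a fingerprint, keeping first-seen order."""
    groups = defaultdict(list)
    for paragraph in paragraphs:
        groups[fingerprint(paragraph)].append(paragraph)
    return list(groups.values())


if __name__ == "__main__":
    sample = [
        "The quick brown fox.",
        "the quick, brown FOX",
        "Something else entirely",
    ]
    for group in group_similar(sample):
        print(group)
```

The same shape maps onto the "let $key ... group by $key" query in the
original question: whatever computes the key has to look at one
paragraph at a time.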

On Wed, Nov 11, 2020 at 6:42 PM Imsieke, Gerrit, le-tex <
[email protected]> wrote:

> This is probably difficult since in BaseX, fuzzy matching is implemented
> using the Levenshtein distance between two strings [1]. Therefore
> similarity is a relation between pairs of paragraphs rather than an
> intrinsic property of an individual paragraph.
>
> You should look for content fingerprinting/clustering techniques.
>
> [1] https://docs.basex.org/wiki/Full-Text#Fuzzy_Querying
>
>
> On 12.11.2020 00:00, Graydon Saunders wrote:
> > Hello --
> >
> > Is there some way to assign the abstraction of a fuzzy match to a
> > variable, so that something like
> >
> > for $x in //p
> >    let $key := get-fuzzy-match-value($x)
> >    group by $key
> >    return <similar-paragraphs>{$x}</similar-paragraphs>
> >
> > would be possible?
> >
> > I'm supposing this is one of those things that's either easy or
> > impossible.
> >
> > Thanks!
> > Graydon
>
>
