Useful keywords; thank you! Also more of a development effort than this project will support, alas. (Unless someone's willing to provide a pointer to their public release of such a solution, free for commercial use? Which doesn't seem a whole lot more likely than someone throwing a gold brick through my window.)
On Wed, Nov 11, 2020 at 6:42 PM Imsieke, Gerrit, le-tex <[email protected]> wrote:

> This is probably difficult since in BaseX, fuzzy matching is implemented
> using the Levenshtein distance between two strings [1]. Therefore
> similarity is a relation between pairs of paragraphs rather than an
> intrinsic property of an individual paragraph.
>
> You should look for content fingerprinting/clustering techniques.
>
> [1] https://docs.basex.org/wiki/Full-Text#Fuzzy_Querying
>
>
> On 12.11.2020 00:00, Graydon Saunders wrote:
> > Hello --
> >
> > Is there some way to assign the abstraction of a fuzzy match to a
> > variable, so that something like
> >
> > for $x in //p
> > let $key := get-fuzzy-match-value($x)
> > group by $key
> > return <similar-paragraphs>{$x}</similar-paragraphs>
> >
> > would be possible?
> >
> > I'm supposing this is one of those things that's either easy or impossible.
> >
> > Thanks!
> > Graydon
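
For what it's worth, the crudest form of the fingerprinting idea quoted above can be sketched in plain XQuery 3.1. This is only a sketch under a strong assumption: that paragraphs count as "similar" when they contain the same set of words after lower-casing and stripping punctuation. It is not Levenshtein-based and it is not BaseX's fuzzy matching; it will not group paragraphs that differ by a typo or an extra word. The function name local:fingerprint is made up for illustration.

```xquery
xquery version "3.1";

(: Hypothetical helper, not part of BaseX: computes a key that is an
   intrinsic property of a single paragraph, so "group by" can use it.
   Assumption: same set of normalized words means "similar". :)
declare function local:fingerprint($p as element()) as xs:string {
  (: lower-case and collapse runs of whitespace :)
  let $text   := lower-case(normalize-space(string($p)))
  (: drop everything except letters, digits and spaces :)
  let $clean  := replace($text, '[^\p{L}\p{N} ]', '')
  (: unique words; empty tokens come from doubled spaces left by the replace :)
  let $tokens := distinct-values(tokenize($clean, ' ')[. ne ''])
  (: sort so that word order does not affect the key :)
  return string-join(sort($tokens), ' ')
};

for $x in //p
let $key := local:fingerprint($x)
group by $key
return <similar-paragraphs>{$x}</similar-paragraphs>
```

The query expects a context document or an opened database for //p. Anything looser than this (shingling, MinHash, clustering on pairwise distances) is where the real development effort would go.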

