Thank you very much for all the feedback! The last example also crashed, 
so I used trial and error to trace it down into the library, and it 
looks like the problem is these 2 lines in the file 
src/libsimmetrics/simmetrics/tokenizer.c:

    tmp = calloc((init_len + qtype->qgram_len), sizeof(char));

Probably both lines should be changed to:

    tmp = calloc((init_len + 2 * qtype->qgram_len), sizeof(char));

@Andrea: can you verify this and include it in the library?

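However, your last SQL example shows an interesting thing: calling 
stringmetrics twice in one query gives 100 and 36.67 in one order, but 
40 and 100 when the two calls are swapped. So the similarity for the 
same pair of last names changes from 36.67 to 40 depending on the order 
of the calls. This is also not good :(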

sqlite> .load ./libstringmetrics.so
sqlite> select a.firstname, b.firstname, a.lastname, b.lastname,
   ...> stringmetrics("qgrams_distance","similarity",a.firstname, b.firstname,"") first_dist,
   ...> stringmetrics("qgrams_distance","similarity",a.lastname, b.lastname,"") last_dist
   ...> from
   ...> (select "Milan" as firstname, "Roubal" as lastname ) a,
   ...> (select "Milan" as firstname, "RoubalRoubalRoubalRo" as lastname ) b
   ...> ;
Milan|Milan|Roubal|RoubalRoubalRoubalRo|100.0|36.6666679382324
sqlite> select a.firstname, b.firstname, a.lastname, b.lastname,
   ...> stringmetrics("qgrams_distance","similarity",a.lastname, b.lastname,"") last_dist,
   ...> stringmetrics("qgrams_distance","similarity",a.firstname, b.firstname,"") first_dist
   ...> from
   ...> (select "Milan" as firstname, "Roubal" as lastname ) a,
   ...> (select "Milan" as firstname, "RoubalRoubalRoubalRo" as lastname ) b
   ...> ;
Milan|Milan|Roubal|RoubalRoubalRoubalRo|40.0|100.0

    Thank you
    Best Regards
    Milan