A fixed up version can be found at 
http://www.dessus.com/libstringmetrics-fixed.zip which includes a working DLL 
compiled with 32-bit MinGW, plus modified source and the "compile.cmd" used to 
compile on Windows with MinGW installed.  I will remove it once Andrea has 
merged the changes.

>-----Original Message-----
>From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
>boun...@sqlite.org] On Behalf Of Keith Medcalf
>Sent: Sunday, 28 September, 2014 14:21
>To: General Discussion of SQLite Database
>Subject: Re: [sqlite] A new extension for sqlite to analyze the
>stringmetrics
>
>
>Andrea, here are some problems and their solutions:
>
>(1) Where do you get a "tolower" function that takes a char* and returns
>a char*?  The one in ctype.h works a character at a time (thus takes an
>int and returns an int).
>
>(2) Why is the entire sqlite3 engine included in a DLL which is loaded
>as an extension to sqlite3?
>
>Please follow along and I will show you where we can find and fix these
>issues ... this might be helpful to other extension writers as well.
>
>First, the -std=c99 needs to be -std=gnu99 to permit the non-ISO
>functions to be recognized.
>With -std=c99 the compiler defines __STRICT_ANSI__, and the MinGW headers
>then hide a bunch of non-ANSI names.  Once this change is made 99% of the
>errors go away.
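
For extension writers who want to confirm which mode a build is in, here is a minimal sketch. GCC predefines __STRICT_ANSI__ under -ansi and the -std=c* options, but not under the -std=gnu* options. The function name strict_ansi_enabled is made up for this illustration and is not part of the project.

```c
/* Reports whether this translation unit was compiled in strict ISO mode.
 * Returns 1 when the compiler defined __STRICT_ANSI__ (for example with
 * -std=c99), and 0 otherwise (for example with -std=gnu99).
 * strict_ansi_enabled is a hypothetical name used only for this sketch. */
int strict_ansi_enabled(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Building the same file once with each -std option and printing the result shows which set of header declarations the compiler will see.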
>
>src\wrapper_functions.c: In function 'stringmetricsFunc':
>src\wrapper_functions.c:350:16: warning: 'return' with a value, in
>function returning void [enabled by default]
>                return (1);
>                ^
>
>This is easy.  An SQLite scalar function callback is declared to return
>void: it hands back its value with one of the sqlite3_result_*()
>functions and reports a failure with sqlite3_result_error().  So keep the
>void return type, and replace "return (1);" with a call to
>sqlite3_result_error() followed by a plain "return;".
>
>
>src\wrapper_functions.c:353:4: warning: implicit declaration of function
>'tolower' [-Wimplicit-function-declaration]
>    if(strcmp(tolower(kindofoutput),"similarity")==0) {
>    ^
>src\wrapper_functions.c:353:4: warning: passing argument 1 of 'strcmp'
>makes pointer from integer without a cast [enabled by default]
>In file included from src\wrapper_functions.c:57:0:
>c:\apps\mingw\include\string.h:55:37: note: expected 'const char *' but
>argument is of type 'int'
> _CRTIMP int __cdecl __MINGW_NOTHROW strcmp (const char*, const char*)
>__MINGW_ATTRIB_PURE;
>                                     ^
>src\wrapper_functions.c:355:4: warning: passing argument 1 of 'strcmp'
>makes pointer from integer without a cast [enabled by default]
>    } else if(strcmp(tolower(kindofoutput),"metric")==0) {
>    ^
>In file included from src\wrapper_functions.c:57:0:
>c:\apps\mingw\include\string.h:55:37: note: expected 'const char *' but
>argument is of type 'int'
> _CRTIMP int __cdecl __MINGW_NOTHROW strcmp (const char*, const char*)
>__MINGW_ATTRIB_PURE;
>
>
>The "tolower" function works on an int (single character) and returns an
>int (single character).  It does not work on whole strings.  The function
>for doing a case-insensitive string compare is "stricmp" (a Microsoft
>runtime function, not part of ISO C).
>
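
Because stricmp is not part of ISO C, code that must also build elsewhere could carry its own comparison. The following is only a sketch under that assumption: ascii_casecmp is a hypothetical helper name, and it applies tolower to one character at a time, which is how tolower is meant to be used.

```c
#include <ctype.h>

/* Case-insensitive comparison of two NUL-terminated strings.
 * Returns 0 when they match ignoring case, a negative value when a sorts
 * before b, and a positive value when a sorts after b.
 * ascii_casecmp is a hypothetical name, not an SQLite or MinGW function. */
int ascii_casecmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        /* The cast avoids undefined behaviour for negative char values. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
}
```

A call such as ascii_casecmp(kindofoutput, "similarity") == 0 would then play the role of the stricmp test.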
>This can be fixed by making the following changes in wrapper_functions.c:
>
>        if(kindofoutput!=NULL) {
>            if(stricmp(kindofoutput,"similarity")==0) {
>                sqlite3_result_double(context, similarity);
>            } else if(stricmp(kindofoutput,"metric")==0) {
>                /* -1: read up to the NUL; SQLITE_TRANSIENT: SQLite copies */
>                sqlite3_result_text(context, metrics, -1, SQLITE_TRANSIENT);
>            } else {
>                /* par1 and par2 are printed too, so count them in the size */
>                mex = malloc(strlen(sm_name) + strlen(par1) + strlen(par2)
>                             + strlen(metrics) + 200 + 1);
>                sprintf(mex,"%s between \"%s\" & \"%s\" is \"%s\""
>                        " and yields a %3.0f%% similarity",
>                        sm_name,par1,par2,metrics,similarity*100);
>                sqlite3_result_text(context, mex, -1, SQLITE_TRANSIENT);
>            }
>        } else {
>            mex = malloc(strlen(sm_name) + strlen(par1) + strlen(par2)
>                         + strlen(metrics) + 200 + 1);
>            sprintf(mex,"%s between \"%s\" & \"%s\" is \"%s\""
>                    " and yields a %3.0f%% similarity",
>                    sm_name,par1,par2,metrics,similarity*100);
>            sqlite3_result_text(context, mex, -1, SQLITE_TRANSIENT);
>        }
>
>(basically a global search and replace for
>"strcmp(tolower(kindofoutput)," and replacing it with
>"stricmp(kindofoutput,")
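
The 200-byte allowance in the malloc calls above is a guess at the size of the fixed text. A minimal sketch of a way to avoid the guess is to let snprintf measure the message first. This assumes a C99-conforming snprintf (one that returns the required length when given a NULL buffer); format_metric_message and METRIC_MSG_FMT are hypothetical names introduced only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Same message layout as the sprintf calls in wrapper_functions.c. */
#define METRIC_MSG_FMT \
    "%s between \"%s\" & \"%s\" is \"%s\" and yields a %3.0f%% similarity"

/* Builds the message in a freshly allocated buffer of exactly the right
 * size.  Returns NULL on failure; the caller owns the returned buffer.
 * format_metric_message is a hypothetical helper, not project code. */
char *format_metric_message(const char *sm_name, const char *par1,
                            const char *par2, const char *metrics,
                            double similarity)
{
    /* First pass: a NULL buffer with size 0 only computes the length. */
    int needed = snprintf(NULL, 0, METRIC_MSG_FMT,
                          sm_name, par1, par2, metrics, similarity * 100);
    if (needed < 0)
        return NULL;

    char *mex = malloc((size_t)needed + 1);   /* +1 for the NUL */
    if (mex == NULL)
        return NULL;

    /* Second pass: write the message into the sized buffer. */
    snprintf(mex, (size_t)needed + 1, METRIC_MSG_FMT,
             sm_name, par1, par2, metrics, similarity * 100);
    return mex;
}
```

The returned buffer could then be handed over with sqlite3_result_text(context, mex, -1, free), so that SQLite releases it when it is finished with the value.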
>
>
>Now we have left only the problem that the entirety of SQLite3 itself is
>compiled into the extension.
>
>Since we are not compiling the extension into the core, you simply need
>to use the correct header.  "wrapper_functions.c" should be using
>sqlite3ext.h, not sqlite3.h.  You then need to add a macro to get a
>reference to the sqlite3_api thus:
>
>
>#include <sqlite3ext.h>
>#include <string.h>
>#include <stdlib.h>
>#include <malloc.h>
>#include <stddef.h>
>#include "simmetrics.h"
>
>SQLITE_EXTENSION_INIT3
>
>    const int SIMMETC = 27;
>
>
>SQLITE_EXTENSION_INIT1 defines the "sqlite3_api" pointer.
>SQLITE_EXTENSION_INIT2 initializes its value.  If you need to access the
>"sqlite3_api" in a source file which is "linked with" something which has
>a definition and initialization of sqlite3_api, then you just put the
>SQLITE_EXTENSION_INIT3 macro at the top of those modules.  (The
>definitions are at the end of sqlite3ext.h.)  sqlite3ext.h redirects
>every sqlite3_xxx() call through that pointer, which is why the
>extension does not need its own copy of the engine.
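
To make the role of the three macros concrete, here is a toy model of the same pattern in plain C. It is not SQLite code: host_api, host_api_ptr, helper_length, toy_extension_init and toy_host_table are all hypothetical stand-ins, chosen so the sketch compiles without the SQLite headers. The comments say which macro each piece corresponds to.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for sqlite3_api_routines: a table of function pointers that
 * the host hands to the extension.  All names here are hypothetical. */
typedef struct host_api {
    size_t (*text_length)(const char *);
} host_api;

/* Role of SQLITE_EXTENSION_INIT3: declare the pointer in a module that
 * uses the API but does not own the definition. */
extern const host_api *host_api_ptr;

/* A helper in such a module; every call goes through the pointer. */
size_t helper_length(const char *s)
{
    return host_api_ptr->text_length(s);
}

/* Role of SQLITE_EXTENSION_INIT1: the single definition of the pointer. */
const host_api *host_api_ptr = NULL;

/* Role of SQLITE_EXTENSION_INIT2: the entry point stores the table that
 * the host passed in. */
int toy_extension_init(const host_api *api)
{
    host_api_ptr = api;
    return 0;
}

/* Host side: the real implementation lives here, not in the extension. */
static size_t host_text_length(const char *s)
{
    return strlen(s);
}

static const host_api host_table = { host_text_length };

const host_api *toy_host_table(void)
{
    return &host_table;
}
```

After toy_extension_init(toy_host_table()) has run, helper_length("metric") reaches the host's implementation through the pointer, in the same way that an extension reaches the engine that loaded it.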
>
>You then change the compile command thusly:
>
> gcc -s -O3 -std=gnu99 -mdll -mthreads -Bl,--static -static-libgcc
>     -I src
>     -I src\libsimmetrics\include
>     -I ..\sqlite\dist
>     src\*.c
>     src\libsimmetrics\simmetrics\*.c
>     -o stringmetrics.dll
>
>where "..\sqlite\dist" is the location of the sqlite3 header files (I
>point them to my own SQLite3 build directories, you can carry an extra
>copy in the src/sqlite3 directory and refer to those if you prefer).
>This produces a 73K extension module with no external dependencies (other
>than on the MSVCRT.DLL runtime library) and produces no diagnostic
>output.
>
>I added -mthreads because I presume this may be used in a multithreaded
>environment.  It added no linkage to the thread library code, so I assume
>the base functions used were already thread-safe (or could not be made
>so).  I haven't looked into which is the case.
>
>After these changes we get the following (on Windows 8.1 x64, with the
>current MinGW 32-bit compiler) (slightly reformatted to fit your screen):
>
>2014-09-28 14:05:30 [D:\Source\libstringmetrics-master]
>>gcc -s -O3 -std=gnu99 -mdll -mthreads -Bl,--static -static-libgcc
>     -I src
>     -I src\libsimmetrics\include
>     -I ..\sqlite\dist
>     src\*.c
>     src\libsimmetrics\simmetrics\*.c
>     -o stringmetrics.dll
>
>and comparing this extension to the original included in the distribution
>(I stripped it, so it is smaller than the one in the distribution because
>the internal symbol table is gone)
>
>2014-09-28 14:05:33 [D:\Source\libstringmetrics-master]
>>dir *.dll
>
>2014-09-28  12:57           769,038 libstringmetrics.dll
>2014-09-28  14:05            75,776 stringmetrics.dll
>
>and running it:
>
>2014-09-28 13:55:55 [D:\Source\libstringmetrics-master]
>>sqlite
>SQLite version 3.8.7 2014-09-26 18:30:11
>Enter ".help" for usage hints.
>Connected to a transient in-memory database.
>Use ".open FILENAME" to reopen on a persistent database.
>sqlite> .load stringmetrics
>sqlite> .echo on
>sqlite> .read test.sql
>.read test.sql
>select load_extension("libstringmetrics.dll");
>
>select stringmetrics("block_distance_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Block Distance customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "0" and yields a 100% similarity
>select stringmetrics("cosine_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Cosine Similarity customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
>select stringmetrics("dice_custom","phrase","via giuseppe-garibaldi,25",
>"via giuseppe garibaldi 25",",-");
>Dice Similarity customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
>select stringmetrics("euclidean_distance","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Euclidean Distance between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "2.00" and yields a  55% similarity
>select stringmetrics("euclidean_distance_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Euclidean Distance customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "0" and yields a 100% similarity
>select stringmetrics("jaccard","phrase","via giuseppe-garibaldi,25", "via
>giuseppe garibaldi 25",",-");
>Jaccard Similarity between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "0.200000" and yields a  20% similarity
>select stringmetrics("jaccard_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Jaccard Similarity customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
>select stringmetrics("jaro","phrase","via giuseppe-garibaldi,25", "via
>giuseppe garibaldi 25",",-");
>Jaro Similarity between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "0.920000" and yields a  92% similarity
>select stringmetrics("jaro_winkler","phrase","via giuseppe-garibaldi,25",
>"via giuseppe garibaldi 25",",-");
>Jaro Winkler Similarity between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "0.968000" and yields a  97% similarity
>select stringmetrics("levenshtein","phrase","via giuseppe-garibaldi,25",
>"via giuseppe garibaldi 25",",-");
>Levenshtein Distance between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "2" and yields a  92% similarity
>select stringmetrics("matching_coefficient","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Matching Coefficient SimMetrics between "via giuseppe-garibaldi,25" &
>"via giuseppe garibaldi 25" is "1.00" and yields a  25% similarity
>select stringmetrics("matching_coefficient_custom","phrase","via
>giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
>Matching Coefficient SimMetrics customized between "via giuseppe-
>garibaldi,25" & "via giuseppe garibaldi 25" is "4.00" and yields a 100%
>similarity
>select stringmetrics("monge_elkan","phrase","via giuseppe-garibaldi,25",
>"via giuseppe garibaldi 25",",-");
>Monge Elkan Similarity between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "1.012500" and yields a 101% similarity
>select stringmetrics("monge_elkan_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Matching Coefficient SimMetrics customized STILL NOT IMPLEMENTED between
>"via giuseppe-garibaldi,25" & "via giuseppe garibaldi 25" is "still not
>implemented" and yields a   0% similarity
>select stringmetrics("needleman_wunch","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Needleman Wunch SimMetrics between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "2.00" and yields a  96% similarity
>select stringmetrics("overlap_coefficient","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Overlap Coefficient Similarity between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "0.500000" and yields a  50% similarity
>select stringmetrics("overlap_coefficient_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Overlap Coefficient Similarity customized between "via giuseppe-
>garibaldi,25" & "via giuseppe garibaldi 25" is "1.000000" and yields a
>100% similarity
>select stringmetrics("qgrams_distance","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>QGrams Distance between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "12" and yields a  78% similarity
>select stringmetrics("qgrams_distance_custom","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>QGrams Distance customized between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "0" and yields a 100% similarity
>select stringmetrics("smith_waterman","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Smith Waterman SimMetrics between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "21.00" and yields a  84% similarity
>select stringmetrics("smith_waterman_gotoh","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Smith Waterman Gotoh SimMetrics between "via giuseppe-garibaldi,25" &
>"via giuseppe garibaldi 25" is "109.00" and yields a  87% similarity
>select stringmetrics("soundex_phonetics","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Soundex Phonetics between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "V221 & V221" and yields a 100% similarity
>select stringmetrics("metaphone_phonetics","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Metaphone Phonetics between "via giuseppe-garibaldi,25" & "via giuseppe
>garibaldi 25" is "FJSP & FJSP" and yields a 100% similarity
>select stringmetrics("double_metaphone_phonetics","phrase","via giuseppe-
>garibaldi,25", "via giuseppe garibaldi 25",",-");
>Double Metaphone Phonetics between "via giuseppe-garibaldi,25" & "via
>giuseppe garibaldi 25" is "FJSP & FJSP" and yields a 100% similarity
>
>sqlite>
>
>
>
>>-----Original Message-----
>>From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
>>boun...@sqlite.org] On Behalf Of Andrea Peri
>>Sent: Sunday, 28 September, 2014 02:53
>>To: Gert Van Assche; General Discussion of SQLite Database
>>Subject: Re: [sqlite] A new extension for sqlite to analyze the
>>stringmetrics
>>
>>You should use the 32-bit SQLite.
>>On 28 Sep 2014 10:45, "Gert Van Assche" <ger...@gmail.com> wrote:
>>
>>> Thanks Andrea.
>>> When I download the DLL I get exactly the same error.
>>> I'm using the 32bit SQLite3.exe on a Win 64 bit machine.
>>> Could that cause the error?
>>>
>>> thanks
>>>
>>> gert
>>>
>>> 2014-09-27 20:27 GMT+02:00 Andrea Peri <aperi2...@gmail.com>:
>>>
>>>> https://github.com/aperi2007/libstringmetrics
>>>>
>>>>
>>>> >Andrea, where do I find it?
>>>> >
>>>> >thanks
>>>> >
>>>> >gert
>>>>
>>>>
>>>>
>>>> --
>>>> -----------------
>>>> Andrea Peri
>>>> . . . . . . . . .
>>>> qwerty àèìòù
>>>> -----------------
>>>>
>>>
>>>
>>_______________________________________________
>>sqlite-users mailing list
>>sqlite-users@sqlite.org
>>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
>



