I had to implement something like this for comparing passages from statutes (see the Introduction in Douglas Hay and Paul Craven, *Masters, Servants and Magistrates in Britain and the Empire, 1562-1955* [UNC Press, 2004] for an illustration).
You need to isolate the keywords, in whatever order, count them, and measure the distances (number of words) between them. SQLite is great for managing the tables of keywords, the lists of texts that contain them, and the tables of distances. But it is not the optimal tool for breaking down the texts and extracting the keywords and distances. I used Perl for this job, and found that I could easily adapt recipes from the Perl Cookbook and similar repositories to build my routines. I wrote the disaggregated lists of keywords, distances and texts as SQL tables and analysed them in SQLite.

Paul Craven
York University

----------------------------------

Date: Wed, 13 Jun 2012 23:09:35 +0200
From: Philip Bennefall <phi...@blastbay.com>
To: <sqlite-users@sqlite.org>
Subject: [sqlite] Full text search without full phrase matches
Message-ID: <A12309DB130E42BBA0590D664F66922A@chicken>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I am new to this mailing list and to SQLite, so I wanted to start by thanking all of those who make this project a reality. It is a great tool.

Now, to my question. I am trying to use the full text search feature to find rough matches for a chat robot. Basically I want to match as many keywords as possible, but not necessarily all of them. The results should be sorted based on how many keywords were found in the phrase and how closely their order matches the query. In other words the ordering doesn't have to be exact, but the closer it is, the higher the result should rank. Similarly, a phrase should match even if only one or two of the words are found, but it should rank higher the more of the words are present.

I have read the reference and I see the NEAR statement and the matchinfo function, as well as the example of how to use it, but I cannot figure out how to apply this knowledge to my specific problem. Does anyone have any suggestions?

Thanks in advance for your help.

Kind regards,
Philip Bennefall
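
The approach described in the reply (extract keyword positions outside the database, then store and analyse keywords, texts and distances in SQLite) might look roughly like the sketch below. The reply used Perl for the extraction step; this sketch substitutes Python with its standard `sqlite3` module. The table names (`texts`, `hits`, `distances`), the function names, the tokenising regex and the ranking rule (more distinct keywords first, then smaller mean word distance) are all assumptions made for illustration, not the schema or scoring actually used for the statute comparison.

```python
import re
import sqlite3
from itertools import combinations


def keyword_positions(text, keywords):
    """Return (keyword, word_index) for every keyword occurrence in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    wanted = {k.lower() for k in keywords}
    return [(w, i) for i, w in enumerate(words) if w in wanted]


def build_tables(conn, texts, keywords):
    """Write texts, keyword hits and pairwise word distances as SQL tables."""
    # Hypothetical schema, chosen only for this sketch.
    conn.executescript("""
        CREATE TABLE texts(text_id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE hits(text_id INTEGER, keyword TEXT, pos INTEGER);
        CREATE TABLE distances(text_id INTEGER, kw_a TEXT, kw_b TEXT,
                               dist INTEGER);
    """)
    for text_id, body in enumerate(texts, start=1):
        conn.execute("INSERT INTO texts VALUES (?, ?)", (text_id, body))
        hits = keyword_positions(body, keywords)
        conn.executemany(
            "INSERT INTO hits VALUES (?, ?, ?)",
            [(text_id, kw, pos) for kw, pos in hits],
        )
        # Distance in words between each pair of different keywords.
        conn.executemany(
            "INSERT INTO distances VALUES (?, ?, ?, ?)",
            [(text_id, a, b, abs(pb - pa))
             for (a, pa), (b, pb) in combinations(hits, 2) if a != b],
        )
    conn.commit()


def rank(conn):
    """Rank texts: most distinct keywords first, then smallest mean distance.

    Texts containing no keyword at all do not appear in the result.
    """
    return conn.execute("""
        SELECT h.text_id,
               COUNT(DISTINCT h.keyword) AS n_keywords,
               (SELECT AVG(d.dist) FROM distances d
                 WHERE d.text_id = h.text_id) AS mean_dist
          FROM hits h
         GROUP BY h.text_id
         ORDER BY n_keywords DESC, mean_dist ASC
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_tables(
        conn,
        ["the servant shall be paid wages by the master",
         "the master may dismiss a servant without wages",
         "a magistrate may fine the master"],
        ["master", "servant", "wages"],
    )
    for row in rank(conn):
        print(row)  # (text_id, n_keywords, mean_dist)
```

With these three sample texts, the second one ranks first: it contains all three keywords and they sit closer together than in the first text. The third text still matches on a single keyword but ranks last. This measures proximity only; scoring how closely the keyword order follows the query, as the original question asks, would need an extra comparison of the `pos` values against the query order.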