Hi Paul, and thank you for your reply.

The trouble I have is that in my query, not all of the keywords have to be present for a successful match to be made. SQLite's FTS only seems to match if all the keywords are present, which is not what I need.

I am not familiar with Perl; I work exclusively in C++.

The input I am processing is arbitrary, and so is the data that I am searching through in the index. The incoming data is user messages, and the index contains old messages that the robot has given to users (stemmed and stripped in various ways to make matches more probable). Another column contains an appropriate answer to return if that query is matched.

I want it to match as many keywords as possible, but not necessarily all, and order the results by:
1. How many keywords were matched, with some minimum threshold below which no match is made.
2. How well the ordering matched.
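
To make those two criteria concrete, this is roughly the scoring I have in mind, as a C++ sketch done outside of SQLite. The names (MatchScore, score_candidate, ranks_higher) are made up for illustration. Using the longest common subsequence as the ordering measure is only one assumption about what "how well the ordering matched" could mean, and the input is assumed to be already stemmed and split on whitespace.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for one candidate row.
struct MatchScore {
    std::size_t matched = 0;   // distinct query keywords present in the candidate
    std::size_t in_order = 0;  // longest run of keywords in the same relative order
    bool accepted = false;     // false when matched is below the threshold
};

// Split on whitespace; real input would be stemmed and stripped first.
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

MatchScore score_candidate(const std::vector<std::string>& query,
                           const std::vector<std::string>& candidate,
                           std::size_t min_matched) {
    MatchScore s;

    // Criterion 1: count distinct query keywords found anywhere in the candidate.
    std::vector<std::string> seen;
    for (const auto& q : query) {
        if (std::find(seen.begin(), seen.end(), q) != seen.end()) continue;
        seen.push_back(q);
        if (std::find(candidate.begin(), candidate.end(), q) != candidate.end())
            ++s.matched;
    }

    // Criterion 2: longest common subsequence of the two word lists
    // (assumed ordering measure), computed with the usual DP table.
    const std::size_t n = query.size();
    const std::size_t m = candidate.size();
    std::vector<std::vector<std::size_t>> lcs(
        n + 1, std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            lcs[i][j] = (query[i - 1] == candidate[j - 1])
                            ? lcs[i - 1][j - 1] + 1
                            : std::max(lcs[i - 1][j], lcs[i][j - 1]);
        }
    }
    s.in_order = lcs[n][m];

    s.accepted = s.matched >= min_matched;
    return s;
}

// Sort order: more matched keywords first, then better ordering.
bool ranks_higher(const MatchScore& a, const MatchScore& b) {
    if (a.matched != b.matched) return a.matched > b.matched;
    return a.in_order > b.in_order;
}
```

With something like this, every candidate that shares at least one keyword with the query would be scored, the ones with accepted == false dropped, and the rest sorted with ranks_higher.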

Do you have any tips?

Kind regards,

Philip Bennefall
----- Original Message -----
From: <pc...@sympatico.ca>
To: <sqlite-users@sqlite.org>
Sent: Thursday, June 14, 2012 7:01 PM
Subject: [sqlite] Full text search without full phrase matches


I had to implement something like this for comparing passages from statutes (see the Introduction in Douglas Hay and Paul Craven, *Masters, Servants and Magistrates in Britain and the Empire, 1562-1955* [UNCP Press, 2004] for an illustration).

You need to isolate the keywords, in whatever order they occur, count them, and measure the distances (number of words) between them. SQLite is great for managing the tables of keywords, the lists of texts that contain them, and the tables of distances. But it is not the optimal tool for breaking down the texts and extracting the keywords and distances. I used Perl for this job, and found that I could easily adapt recipes from the Perl Cookbook and similar repositories to build my routines. I wrote out the disaggregated lists of keywords, distances and texts as SQL tables and analysed them in SQLite.
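
As an illustration of the extraction step, here is a minimal sketch in C++. It is not the Perl routines described above. The function names keyword_positions and keyword_gaps are hypothetical, and the text is assumed to be lowercased and free of punctuation already.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// For each keyword that occurs in the text, list the 0-based word
// positions at which it occurs.
std::map<std::string, std::vector<std::size_t>> keyword_positions(
    const std::string& text, const std::set<std::string>& keywords) {
    std::map<std::string, std::vector<std::size_t>> positions;
    std::istringstream in(text);
    std::size_t index = 0;
    for (std::string w; in >> w; ++index) {
        if (keywords.count(w)) positions[w].push_back(index);
    }
    return positions;
}

// Distances, in words, between each keyword hit and the previous hit,
// in the order the hits appear in the text.
std::vector<std::size_t> keyword_gaps(const std::string& text,
                                      const std::set<std::string>& keywords) {
    std::vector<std::size_t> gaps;
    std::istringstream in(text);
    std::size_t index = 0;
    std::size_t previous = 0;
    bool have_previous = false;
    for (std::string w; in >> w; ++index) {
        if (!keywords.count(w)) continue;  // not a keyword, keep counting words
        if (have_previous) gaps.push_back(index - previous);
        previous = index;
        have_previous = true;
    }
    return gaps;
}
```

The positions and gaps produced this way are the kind of rows that can then be written to keyword and distance tables and analysed in SQLite.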

Paul Craven
York University

----------------------------------

Date: Wed, 13 Jun 2012 23:09:35 +0200
From: Philip Bennefall <phi...@blastbay.com>
To: <sqlite-users@sqlite.org>
Subject: [sqlite] Full text search without full phrase matches
Message-ID: <A12309DB130E42BBA0590D664F66922A@chicken>
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I am new to this mailing list and to SQLite, so I wanted to start by thanking all of those who make this project a reality. It is a great tool.

Now, to my question. I am trying to use the full text search feature to find rough matches for a chat robot. Basically I want to match as many keywords as possible, but not necessarily all of them. The results should be sorted based on how many keywords were found in the phrase and how closely their order follows that of the query. In other words, the ordering doesn't have to be exact, but the closer it is, the higher the result should rank. Similarly, a phrase should match even if only one or two of the words are found in it, but it should rank higher the more of the words are present. I have read the reference and I see the NEAR operator and the matchinfo function, as well as the example of how to use it, but I cannot figure out how to apply this knowledge to my specific problem. Does anyone have any suggestions?

Thanks in advance for your help.

Kind regards,

Philip Bennefall
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users