I built something like that: each word was translated into a token, and
a key built from the token and the position of the word was used to
build a tree. Tree access was fast, and it could probably be adapted to
produce strict ranking by position. The complication with this method is
that it needs a dictionary to convert each word to a token.
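A minimal sketch of that idea, not the original implementation: the class name PositionalIndex, the document ids, and the key layout (token, doc_id, position) are all assumptions made for illustration, and a sorted list searched with bisect stands in for the tree.

```python
import bisect


class PositionalIndex:
    """Sorted (token, doc_id, position) keys; a sorted list stands in for the tree."""

    def __init__(self):
        self.dictionary = {}  # word -> integer token (the dictionary mentioned above)
        self.keys = []        # sorted list of (token, doc_id, position)

    def _token(self, word, create=False):
        word = word.lower()
        if create and word not in self.dictionary:
            self.dictionary[word] = len(self.dictionary)
        return self.dictionary.get(word)

    def add(self, doc_id, text):
        # One key per word occurrence, kept in sorted order.
        for position, word in enumerate(text.split()):
            key = (self._token(word, create=True), doc_id, position)
            bisect.insort(self.keys, key)

    def lookup(self, word):
        # All (doc_id, position) pairs for a word, found by two binary searches.
        token = self._token(word)
        if token is None:
            return []
        lo = bisect.bisect_left(self.keys, (token,))
        hi = bisect.bisect_left(self.keys, (token + 1,))
        return [(doc, pos) for _, doc, pos in self.keys[lo:hi]]
```

Because the keys for one word sit next to each other and are ordered by position within each document, ranking by position only needs a walk over that slice.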
Martin Pfeifle wrote:
Unfortunately, the fts module of SQLite does not support fuzzy text search
of the kind Google offers.
What you first need is a similarity measure between strings, e.g. the
edit distance.
Based on such a similarity measure, you could build an appropriate index
structure, e.g. a relational M-tree (cf.
deposit.ddb.de/cgi-bin/dokserv?idn=972667849&dok_var=d1&dok_ext=pdf&filename=972667849.pdf
Chapter 10.3).
Such a module should support not only range queries, e.g. give me all strings
whose distance to my query string is smaller than eps, but also ranked
nearest-neighbor queries.
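A minimal sketch of the two query types, assuming the Levenshtein edit distance as the similarity measure. The function names are made up for illustration, and a plain linear scan stands in for the M-tree, so this shows only what the queries return, not how an index would speed them up.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def range_query(query, strings, eps):
    """All strings whose distance to the query is smaller than eps."""
    return [s for s in strings if edit_distance(query, s) < eps]


def ranked_nearest(query, strings, k):
    """The k strings closest to the query, nearest first (ties broken alphabetically)."""
    return sorted(strings, key=lambda s: (edit_distance(query, s), s))[:k]
```

An index structure would replace the scan over all strings in both functions; the distance function itself stays the same.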
We also urgently need such a module and are thinking about implementing it ourselves. I would appreciate it if efforts could be coordinated.
Best Martin
----- Original Message ----
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, March 6, 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database
Henrik Ræder wrote:
Hi
(First post - hope it's an appropriate place)
I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.
Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is
returned.
For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.
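A minimal sketch of such a grading, assuming whitespace tokenization and a score that adds the keyword count to a bonus for query words appearing next to each other in order. The function names and the bonus weight of 5 are arbitrary choices for illustration, not something SQLite provides.

```python
def score(text, query):
    """Keyword frequency plus a bonus for adjacent query terms (weight 5 is arbitrary)."""
    words = text.lower().split()
    terms = query.lower().split()
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}
    frequency = sum(len(p) for p in positions.values())
    adjacent = 0
    for first, second in zip(terms, terms[1:]):
        following = set(positions[second])
        # Count places where `second` directly follows `first`.
        adjacent += sum(1 for i in positions[first] if i + 1 in following)
    return frequency + 5 * adjacent


def search(documents, query):
    """Ids of documents with a non-zero score, best first (ties broken by id)."""
    scored = [(score(text, query), doc_id) for doc_id, text in documents.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in ranked if s > 0]
```

With this scoring, a record containing "mp3 player" as a phrase outranks one that merely mentions both words several times apart.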
This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.
Any ideas (pointers) how to tackle this?
You have come to the right place.
Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex
Michael