Re: AW: [sqlite] Soft search in database
I built something like that: each word was translated into a token, and a key built from the token and the position of the word was used to build a tree. Tree access was fast, and the scheme could probably be adapted to produce strict ranking by position. The complication of the method is that it needs a dictionary to convert each word to its token.

Martin Pfeifle wrote:
> Unfortunately, the fts module of SQLite does not support "fuzzy text
> search = Google search". What you need first is a similarity measure
> between strings, e.g. the edit distance. Based on such a similarity
> measure, you could build an appropriate index structure, e.g. a
> Relational M-tree (cf.
> deposit.ddb.de/cgi-bin/dokserv?idn=972667849_var=d1_ext=pdf=972667849.pdf
> Chapter 10.3).
>
> Such a module should support not only range queries (e.g. give me all
> strings whose distance to my query string is smaller than eps) but
> also ranked nearest-neighbor queries.
>
> We also urgently need such a module and are thinking about
> implementing it on our own. I would appreciate it if efforts could be
> synchronized.
>
> Best
> Martin
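For illustration, the token-and-position keying described at the top of this message might look roughly like the sketch below. It is not the poster's actual implementation, which is not shown in the thread. The class name PositionalIndex and its methods are hypothetical, a sorted Python list stands in for the tree, and the document id is an added assumption so that several records can share one index.

```python
import bisect


class PositionalIndex:
    """Sorted (token, doc_id, position) keys; a stand-in for a real tree."""

    def __init__(self):
        self.dictionary = {}  # word -> integer token (the required dictionary)
        self.keys = []        # kept sorted so range scans behave like a tree

    def _token(self, word, create=False):
        word = word.lower()
        if create:
            # New words get the next free integer as their token.
            return self.dictionary.setdefault(word, len(self.dictionary))
        return self.dictionary.get(word)

    def add(self, doc_id, text):
        # One key per word occurrence: (token, document, word position).
        for position, word in enumerate(text.split()):
            key = (self._token(word, create=True), doc_id, position)
            bisect.insort(self.keys, key)

    def postings(self, word):
        """All (doc_id, position) pairs for a word, via one range scan."""
        token = self._token(word)
        if token is None:
            return []
        lo = bisect.bisect_left(self.keys, (token,))
        hi = bisect.bisect_left(self.keys, (token + 1,))
        return [(doc, pos) for _, doc, pos in self.keys[lo:hi]]

    def adjacent(self, first, second):
        """Documents in which `second` directly follows `first`."""
        followers = set(self.postings(second))
        return sorted({doc for doc, pos in self.postings(first)
                       if (doc, pos + 1) in followers})
```

With an index like this, a query such as "MP3 Player" could rank the documents returned by adjacent("MP3", "player") above those that merely contain both words, and the length of each postings list gives an occurrence count for a second ranking signal.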
AW: [sqlite] Soft search in database
Unfortunately, the fts module of SQLite does not support "fuzzy text search = Google search". What you need first is a similarity measure between strings, e.g. the edit distance. Based on such a similarity measure, you could build an appropriate index structure, e.g. a Relational M-tree (cf. deposit.ddb.de/cgi-bin/dokserv?idn=972667849_var=d1_ext=pdf=972667849.pdf Chapter 10.3).

Such a module should support not only range queries (e.g. give me all strings whose distance to my query string is smaller than eps) but also ranked nearest-neighbor queries.

We also urgently need such a module and are thinking about implementing it on our own. I would appreciate it if efforts could be synchronized.

Best
Martin

----- Original message -----
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database

Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing
> magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a
> full-text search function which is not the usual simplistic 'found' /
> 'not found', but more Google-style where a graded list of results is
> returned.
>
> For example, in a search for "MP3 Player", results with the phrases next
> to each other would get a high rating, as would records with a high
> occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a
> relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?

You have come to the right place. Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael

--
Michael Schlenker
Software Engineer
CONTACT Software GmbH
http://www.contact.de/
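To make the two query types from Martin's message concrete, here is a minimal sketch of a range query and a ranked nearest-neighbor query over the edit (Levenshtein) distance. It scans every string, so it shows only what the queries return, not the Relational M-tree index, which would exist precisely to avoid such a full scan. The function names are hypothetical and not part of SQLite or its fts module.

```python
import heapq


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free on a match
            ))
        previous = current
    return previous[-1]


def range_query(query, strings, eps):
    """All strings whose distance to `query` is smaller than eps."""
    return [s for s in strings if edit_distance(query, s) < eps]


def nearest_neighbors(query, strings, k):
    """The k strings closest to `query`, ordered by increasing distance."""
    return heapq.nsmallest(k, strings,
                           key=lambda s: edit_distance(query, s))
```

For example, range_query("player", words, 2) would keep only words at most one edit away from "player", while nearest_neighbors("player", words, 3) returns the three best matches even when none of them is very close.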