Hello Teijo,

On 03/24/2017 01:24 AM, Teijo wrote:
> Hello,
>
> If I search given word with search.cgi, I get correct number of occurrences.
>
> But if I do it with SQL (no matter in mysql or sqlite3), they show extra
> occurrence. For example, if a given word is in a given original file
> twice, they tell that there are three occurrences. SQL query is almost
> the same one found in Mnogosearch's manual, except that I am using only
> one word:
>
> SELECT url.url, count(*) AS RANK FROM dict, url WHERE
> url.rec_id=dict.url_id AND dict.word IN ('word') GROUP BY url.url ORDER
> BY rank DESC;
>
> I'd like to know (by SQL query) position of word in the original file
> (to use filepos function). There is at least coord column in dict table.
> Coord contains section id and word's position in relationship to
> section, if I have understood correctly. How to extract the relative
> position from coord, or is the position information elsewhere in
> database? If I disabled all sections, would coord actually contain the
> absolute position?
>
> I'm using "single mode" as to database.

Coord is a 32-bit number.

- The highest 8 bits are the section ID (e.g. title, body, etc.,
  according to the Section commands in indexer.conf).
- The lowest 24 bits are the position inside this section.
- The last hit inside each combination (url_id,word,secno) is the
  section length (i.e. the total number of words in this section)
  in this document.
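
Since the last record in each (url_id,word,secno) group is the section length rather than a hit, a plain count(*) comes out one too high per section. The following is only a minimal Python sketch of the same bit arithmetic, not code from mnoGoSearch itself. The names decode_coord and real_hits and the sample rows are made up for illustration, and it assumes the section-length record is the one with the largest position in its group.

```python
from collections import defaultdict

SECNO_SHIFT = 24      # the highest 8 bits hold the section ID
POS_MASK = 0xFFFFFF   # the lowest 24 bits hold the position


def decode_coord(coord):
    """Split a 32-bit coord value into (secno, pos)."""
    return (coord >> SECNO_SHIFT) & 0xFF, coord & POS_MASK


def real_hits(rows):
    """Group (url_id, word, coord) rows and drop the section-length record.

    Returns {(url_id, word, secno): [pos, ...]} holding only actual hits.
    Assumption: the section-length record has the largest pos in its group.
    """
    grouped = defaultdict(list)
    for url_id, word, coord in rows:
        secno, pos = decode_coord(coord)
        grouped[(url_id, word, secno)].append(pos)
    # Sort each group and discard its last entry (the section length).
    return {key: sorted(positions)[:-1] for key, positions in grouped.items()}


# Hypothetical sample rows, as they might come back from
# "SELECT url_id, word, coord FROM dict WHERE word='mnogosearch'".
rows = [
    (1, "mnogosearch", (2 << 24) | 1),
    (1, "mnogosearch", (2 << 24) | 6),    # section 2 length
    (1, "mnogosearch", (3 << 24) | 54),
    (1, "mnogosearch", (3 << 24) | 69),   # section 3 length
]

hits = real_hits(rows)
total_hits = sum(len(positions) for positions in hits.values())
```

Here the four rows reduce to two real hits, one in section 2 and one in section 3. The same correction could be made in SQL by subtracting one per (url_id,word,secno) group from count(*).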

This MySQL query returns the information in a readable form:

SELECT url_id,word,coord>>24 AS secno,coord&0xFFFFFF AS pos
FROM dict WHERE word='mnogosearch' ORDER BY secno,pos;

+--------+-------------+-------+-----+
| url_id | word        | secno | pos |
+--------+-------------+-------+-----+
|      1 | mnogosearch |     1 |   1 |
|      1 | mnogosearch |     1 |  14 |
|      1 | mnogosearch |     1 |  28 |
|      1 | mnogosearch |     1 |  42 |
|      1 | mnogosearch |     1 |  76 |
|      1 | mnogosearch |     1 |  77 |
|      1 | mnogosearch |     1 |  85 |
|      1 | mnogosearch |     1 | 105 | <- section 1 length
|      1 | mnogosearch |     2 |   1 |
|      1 | mnogosearch |     2 |   6 | <- section 2 length
|      1 | mnogosearch |     3 |  54 |
|      1 | mnogosearch |     3 |  69 | <- section 3 length
|      1 | mnogosearch |     4 |   1 |
|      1 | mnogosearch |     4 |  11 | <- section 4 length
|      1 | mnogosearch |     8 |   2 |
|      1 | mnogosearch |     8 |   4 | <- section 8 length
+--------+-------------+-------+-----+

Lines that are not marked as "section X length" are actual word hits.

>
> Best regards,
>
> Teijo
> _______________________________________________
> General mailing list
> General@mnogosearch.org
> http://lists.mnogosearch.org/listinfo/general