Hi, I'm studying the indexing mechanism of FTS3/4. I can pretty much understand how doclists, terms, and segments are created and stored, but one thing I can't grasp is how updating and deleting documents keeps the index up to date. From the source comments:
[quote]
** Since we're using a segmented structure, with no docid-oriented
** index into the term index, we clearly cannot simply update the term
** index when a document is deleted or updated. For deletions, we
** write an empty doclist (varint(docid) varint(POS_END)), for updates
** we simply write the new doclist. Segment merges overwrite older
** data for a particular docid with newer data, so deletes or updates
** will eventually overtake the earlier data and knock it out. The
** query logic likewise merges doclists so that newer data knocks out
** older data.
[/quote]

It's clear to me that, given the way things are stored, it would be crazy to update every doclist that contains matches for a single docid. What I don't see is how a segment merge can possibly know which doclist is older or newer, other than by the level.

What happens when a document stored in a level 0 segment (recently inserted) is updated or deleted? Which version is kept and goes up to a level 1 segment?
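
To make the quoted comment concrete, here is a minimal sketch of how I read "newer data knocks out older data". It is not the FTS3 code. It assumes the merge is handed the segments for one term in oldest-to-newest order, and that ordering is exactly the part I can't find. The names merge_doclists and visible_docs are hypothetical, and an empty position list stands in for the empty doclist (varint(docid) varint(POS_END)) written on delete.

```python
def merge_doclists(segments):
    """Merge the doclists of one term taken from several segments.

    `segments` is a list of dicts mapping docid -> list of positions,
    ordered oldest first (an assumption of this sketch). An empty
    position list plays the role of the delete marker.
    """
    merged = {}
    for doclist in segments:              # walk oldest -> newest
        for docid, positions in doclist.items():
            merged[docid] = positions     # newer entry overwrites the older one
    return merged


def visible_docs(merged):
    """Drop docids whose latest entry is the empty (deleted) doclist."""
    return {docid: pos for docid, pos in merged.items() if pos}


# Older segment: docs 1 and 2 both contain the term.
older = {1: [3, 7], 2: [1]}
# Newer segment: doc 1 was deleted, doc 2 was updated.
newer = {1: [], 2: [4, 9]}

merged = merge_doclists([older, newer])
print(merged)                # {1: [], 2: [4, 9]}
print(visible_docs(merged))  # {2: [4, 9]}
```

In this sketch the result depends entirely on the order in which the segments are passed in. Swapping older and newer resurrects the deleted document, which is why I'm asking how two segments at the same level are ordered.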