Hi, I'm studying the indexing mechanism of FTS3/4. I can pretty much
understand how doclists, terms, and segments are created and stored, but one
thing I can't grasp is how the index is kept up to date when docs are
updated or deleted. From the source comments:

[quote]
** Since we're using a segmented structure, with no docid-oriented
** index into the term index, we clearly cannot simply update the term
** index when a document is deleted or updated.  For deletions, we
** write an empty doclist (varint(docid) varint(POS_END)), for updates
** we simply write the new doclist.  Segment merges overwrite older
** data for a particular docid with newer data, so deletes or updates
** will eventually overtake the earlier data and knock it out.  The
** query logic likewise merges doclists so that newer data knocks out
** older data.
[/quote]
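
To check that I'm reading that comment correctly, here is a minimal Python
sketch of the "newer data knocks out older data" rule. It is only my mental
model, not the FTS3 code: the names (merge_doclists, drop_deleted) are made
up, a doclist is simplified to a dict of docid -> positions, and it assumes
the caller already knows which doclist is newer, which is exactly the part
I'm unsure about.

```python
from typing import Dict, List

# Simplified model (an assumption, not the on-disk FTS3 format):
# a doclist maps docid -> list of term positions.
# An empty position list stands for the "empty doclist" written on delete.
Doclist = Dict[int, List[int]]


def merge_doclists(doclists_newest_first: List[Doclist],
                   drop_deleted: bool = True) -> Doclist:
    """Merge doclists for one term, letting newer entries win per docid.

    doclists_newest_first must be ordered from newest to oldest; how the
    real merge derives that order is not modelled here.
    """
    merged: Doclist = {}
    for doclist in doclists_newest_first:
        for docid, positions in doclist.items():
            # The first entry seen for a docid comes from the newest
            # doclist, so any later (older) entry is ignored.
            if docid not in merged:
                merged[docid] = positions
    if drop_deleted:
        # Drop deletion markers. This is only safe when no older data for
        # the same docid can still exist elsewhere (hypothetical policy).
        merged = {docid: pos for docid, pos in merged.items() if pos}
    # Doclists are kept sorted by docid.
    return dict(sorted(merged.items()))


older = {1: [0, 4], 2: [3], 3: [7]}
newer = {2: [], 3: [1, 2]}   # doc 2 deleted, doc 3 updated
print(merge_doclists([newer, older]))
```

With drop_deleted=False the marker for doc 2 is carried along instead of
being discarded, which I assume is what must happen while older segments
may still hold data for that docid.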

It's clear to me that, with the way things are stored, it would be crazy to
update every doclist that contains a match for a single docid.
I just don't see how a segment merge can possibly know which doclist is
older or newer, other than by the level. What happens when a document stored
in a level 0 segment (recently inserted) is updated or deleted? Which of the
entries will be kept and go up to a level 1 segment?

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
