[ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845950#action_12845950 ]
Jason Rutherglen commented on LUCENE-2312: ------------------------------------------ {quote}The tricky part is to make sure that a reader always sees a consistent snapshot of the index. At the same time a reader must not follow pointers to non-published locations (e.g. array blocks). {quote} Right. In what case in the term enum, term docs chain of doc scoring would a reader potentially try to follow a pointer to a byte array that doesn't exist? I think we're strictly preventing it via last doc ids? Also, when we flush, I think we need to block further doc writing (via an RW lock?) and wait for any currently writing docs to complete, then forcibly publish the byte arrays, then release the write lock? This way we always have published data that's consistent for readers (eg, the inverted index can be read completely, and there won't be any wild writes still occurring to a byte array that's been published). > Search on IndexWriter's RAM Buffer > ---------------------------------- > > Key: LUCENE-2312 > URL: https://issues.apache.org/jira/browse/LUCENE-2312 > Project: Lucene - Java > Issue Type: New Feature > Components: Search > Affects Versions: 3.0.1 > Reporter: Jason Rutherglen > Assignee: Michael Busch > Fix For: 3.1 > > > In order to offer user's near realtime search, without incurring > an indexing performance penalty, we can implement search on > IndexWriter's RAM buffer. This is the buffer that is filled in > RAM as documents are indexed. Currently the RAM buffer is > flushed to the underlying directory (usually disk) before being > made searchable. > Todays Lucene based NRT systems must incur the cost of merging > segments, which can slow indexing. > Michael Busch has good suggestions regarding how to handle deletes using max > doc ids. > https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923 > The area that isn't fully fleshed out is the terms dictionary, > which needs to be sorted prior to queries executing. Currently > IW implements a specialized hash table. Michael B has a > suggestion here: > https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org