On Thu, Sep 8, 2011 at 4:21 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> The delete by query is solved by recording the primary / UID of the
> document(s) deleted. It's only expensive if the transaction log
> implementation is not designed properly. :)

Phew, I don't think this is realistic. This could be a lot of documents,
which means looking up a lot of primary keys; plus you need to know what
the primary key is, and you somehow need to do this asynchronously. I
don't consider this an option.

simon

>
> On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
> <simon.willna...@googlemail.com> wrote:
>> hey folks,
>>
>> we already have transaction logging on the Solr side, so I should have
>> started this discussion earlier. However, I want to bring this up to
>> the list since I think this is a very valuable feature for plain
>> Lucene users too, and eventually it should also be available to them. I
>> don't think this needs to be a core feature at all, but I think we need
>> to provide the necessary hooks in Lucene core to make this reliable
>> and consistent. I have a couple of concerns that, with the current
>> extension mechanism we provide on the IndexWriter side, this feature
>> can only be implemented in a sub-optimal way on the Solr side (or
>> basically on top of Lucene), but let me elaborate on this a little.
>>
>> IndexWriter doesn't provide any transaction guarantees, nor does it
>> give any guarantees on ordering. So if you index two versions of a
>> document with the same delete key, you can't tell which one wins unless
>> you prevent IW from seeing those two documents at the same time, i.e.
>> by locking before you hit IW. This is basically what other
>> implementations do, like ElasticSearch, which uses locks assigned to
>> buckets in an array selected based on the del term's hash. However,
>> this gets a little more complex once you get to DeleteQueries, where
>> you can't tell which documents are affected, so they might be misplaced
>> in the transaction log if its order doesn't match the order the IW
>> sees. Under the hood, IW does maintain such an order inside the
>> DocumentsWriterDeleteQueue, which could be utilized to provide a total
>> ordering that IMO should be reflected in the transaction log.
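
For readers following the thread, the bucket-lock approach mentioned in the quoted paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `StripedUpdateLog`, its in-memory `log` list, and the `index` map are hypothetical stand-ins for a durable transaction log and for IndexWriter, and none of it is taken from Lucene, Solr, or ElasticSearch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: locks in an array, selected by the delete term's hash. */
final class StripedUpdateLog {
    private final ReentrantLock[] locks;
    // Stand-in for a transaction log; a real one would be written durably.
    private final List<String> log = new ArrayList<>();
    // Stand-in for the index; a real caller would hand the update to IndexWriter.
    private final Map<String, String> index = new ConcurrentHashMap<>();

    StripedUpdateLog(int stripes) {
        locks = new ReentrantLock[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    private ReentrantLock lockFor(String delTerm) {
        // floorMod keeps the bucket index non-negative for negative hash codes.
        return locks[Math.floorMod(delTerm.hashCode(), locks.length)];
    }

    /** Logs and applies an update while holding the bucket lock for its delete term. */
    void update(String delTerm, String doc) {
        ReentrantLock lock = lockFor(delTerm);
        lock.lock();
        try {
            // Both steps happen under the same bucket lock, so two updates with
            // the same delete term reach the log and the index in the same order.
            synchronized (log) {
                log.add(delTerm + "=" + doc);
            }
            index.put(delTerm, doc);
        } finally {
            lock.unlock();
        }
    }

    String get(String delTerm) {
        return index.get(delTerm);
    }

    List<String> logSnapshot() {
        synchronized (log) {
            return new ArrayList<>(log);
        }
    }
}
```

Note that this only orders updates sharing a delete term. A delete-by-query has no single term to hash, so it cannot pick a bucket, which is exactly the gap the quoted paragraph points out.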
>>
>> Before I propose ways this could be implemented, I want to check
>> whether others think we should provide more reliable ways for users
>> with the need for durability and consistent recovery.
>>
>> simon
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org