Delete by query is solved by recording the primary key / UID of each
document deleted.  It's only expensive if the transaction log
implementation is not designed properly.  :)
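
As a rough illustration of the idea (a sketch only, not Solr's actual
transaction log): the query is resolved to UIDs at delete time, and the
log stores one delete entry per UID rather than the query itself. The
names TxLogSketch, Entry, deleteByQuery and replay are hypothetical, and
a plain map stands in for the index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory transaction log; not Lucene or Solr API. */
class TxLogSketch {
    /** One log entry: an ADD carries the body, a DEL carries only the UID. */
    static final class Entry {
        final long seq;
        final String op;
        final String uid;
        final String body;

        Entry(long seq, String op, String uid, String body) {
            this.seq = seq;
            this.op = op;
            this.uid = uid;
            this.body = body;
        }
    }

    // Assumption: a map from UID to document body stands in for the index.
    private final Map<String, String> docs = new LinkedHashMap<>();
    private final List<Entry> log = new ArrayList<>();
    private long nextSeq = 0;

    synchronized void add(String uid, String body) {
        docs.put(uid, body);
        log.add(new Entry(nextSeq++, "ADD", uid, body));
    }

    /** Resolve the query to UIDs now and log those, not the query. */
    synchronized List<String> deleteByQuery(Predicate<String> query) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (query.test(e.getValue())) {
                hits.add(e.getKey());
            }
        }
        for (String uid : hits) {
            docs.remove(uid);
            log.add(new Entry(nextSeq++, "DEL", uid, null));
        }
        return hits;
    }

    /** Replay rebuilds the state without re-evaluating any query. */
    static Map<String, String> replay(List<Entry> entries) {
        Map<String, String> state = new LinkedHashMap<>();
        for (Entry e : entries) {
            if (e.op.equals("ADD")) {
                state.put(e.uid, e.body);
            } else {
                state.remove(e.uid);
            }
        }
        return state;
    }

    synchronized List<Entry> entries() {
        return new ArrayList<>(log);
    }

    synchronized Map<String, String> snapshot() {
        return new LinkedHashMap<>(docs);
    }
}
```

Because each delete entry names a concrete UID, replay does not depend on
where the query sat relative to concurrent adds. The cost is one log
entry per matched document, which is where a careless log format would
get expensive.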

On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
<simon.willna...@googlemail.com> wrote:
> hey folks,
>
> we already have transaction logging on Solr side so I should have
> started this discussion earlier. However, I want to bring this up to
> the list since I think this is a very valuable feature also for plain
> Lucene users and eventually this should also be available to them. I
> don't think this needs to be a core feature at all but I think we need
> to provide the necessary hooks in Lucene core to make this reliable
> and consistent. I have a couple of concerns: with the current
> extension mechanism we provide on the IndexWriter side, this feature
> can only be implemented in a sub-optimal way in Solr (or basically
> anywhere on top of Lucene), but lemme elaborate on this a little.
>
> IndexWriter doesn't provide any transaction guarantees, nor does it
> give any guarantees on ordering. So if you index two versions of a
> document with the same delete key, you can't tell which one wins unless
> you prevent IW from seeing those two documents at the same time, i.e.
> by locking before you hit IW. This is basically what other implementations
> do, like ElasticSearch, which uses locks assigned to buckets in an array
> selected based on the delete term's hash. However, this gets a little more
> complex once you get to DeleteQueries where you can't tell which
> document is affected so they might be misplaced in the transaction log
> if the order doesn't match the order the IW sees. Under the hood IW
> does maintain such an order inside the DocumentsWriterDeleteQueue
> which could be utilized to provide a total ordering that IMO should be
> reflected in the transaction log.
>
> Before I propose ways this could be implemented, I
> want to check whether others think we should provide more reliable ways
> for users who need durability and consistent recovery.
>
> simon
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
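
For reference, the bucket-lock scheme described in the quoted message
can be sketched as follows. This is a minimal illustration of lock
striping by delete-term hash, not ElasticSearch or Lucene code; the class
StripedTermLocks and its methods are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical lock striping keyed by delete term. */
class StripedTermLocks {
    private final ReentrantLock[] stripes;

    StripedTermLocks(int size) {
        stripes = new ReentrantLock[size];
        for (int i = 0; i < size; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Map a delete term to a bucket; floorMod keeps the index non-negative. */
    int stripeFor(String deleteTerm) {
        return Math.floorMod(deleteTerm.hashCode(), stripes.length);
    }

    /**
     * Run an update while holding the stripe for its delete term, so two
     * updates with the same term never reach the writer at the same time.
     */
    <T> T withLock(String deleteTerm, Supplier<T> action) {
        ReentrantLock lock = stripes[stripeFor(deleteTerm)];
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}
```

The caller would append to the transaction log and hand the document to
the writer inside withLock, so both see the same per-term order. A delete
query has no single term to hash, so this scheme alone gives it no
position in that order.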

