On Thu, Sep 8, 2011 at 4:21 PM, Jason Rutherglen
<jason.rutherg...@gmail.com> wrote:
> The delete by query is solved by recording the primary / UID of the
> document(s) deleted.  It's only expensive if the transaction log
> implementation is not designed properly.  :)

Phew, I don't think this is realistic. A delete by query can match a lot
of documents, which means looking up a lot of primary keys. You also need
to know what the primary key is, and you somehow need to do all of this
asynchronously. I don't consider this an option.

simon
>
> On Thu, Sep 8, 2011 at 5:35 AM, Simon Willnauer
> <simon.willna...@googlemail.com> wrote:
>> hey folks,
>>
>> we already have transaction logging on Solr side so I should have
>> started this discussion earlier. However, I want to bring this up to
>> the list since I think this is a very valuable feature also for plain
>> Lucene users and eventually this should also be available to them. I
>> don't think this needs to be a core feature at all but I think we need
>> to provide the necessary hooks in Lucene core to make this reliable
>> and consistent. My concern is that, with the current extension
>> mechanism we provide on the IndexWriter side, this feature can only be
>> implemented in a sub-optimal way in Solr (or basically anything on top
>> of Lucene), but let me elaborate a little.
>>
>> IndexWriter doesn't provide any transaction guarantees, nor does it
>> give any guarantees on ordering. So if you index two versions of a
>> document with the same delete key, you can't tell which one wins unless
>> you prevent IW from seeing those two documents at the same time, i.e.
>> by locking before you hit IW. This is basically what other
>> implementations do, such as ElasticSearch, which uses locks assigned to
>> buckets in an array, selected based on the delete term's hash. However,
>> this gets a little more complex once you get to delete-by-query
>> operations, where you can't tell which documents are affected, so they
>> might be misplaced in the transaction log if its order doesn't match
>> the order the IW sees. Under the hood, IW does maintain such an order
>> inside the DocumentsWriterDeleteQueue, which could be utilized to
>> provide a total ordering that IMO should be reflected in the
>> transaction log.
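
The bucket-locking approach described in the quoted paragraph can be sketched roughly as follows. This is a minimal illustration and not code from Lucene, Solr, or ElasticSearch: the class StripedTxLog, its methods, and the string log format are all hypothetical. It assumes each update carries a single delete term, picks a lock from a fixed array by that term's hash, and writes the log entry while the lock is held, so two versions of the same document are never logged or indexed concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: lock striping keyed on the delete term's hash. */
class StripedTxLog {
    private final ReentrantLock[] stripes;            // one lock per bucket
    private final List<String> log = new ArrayList<>(); // stand-in for a durable log
    private long nextSeq = 0;                          // guarded by synchronized (log)

    StripedTxLog(int stripeCount) {
        stripes = new ReentrantLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Selects the bucket lock from the delete term's hash. */
    private ReentrantLock stripeFor(String delTerm) {
        return stripes[Math.floorMod(delTerm.hashCode(), stripes.length)];
    }

    /** Logs an update under the term's stripe lock and returns its sequence number. */
    long update(String delTerm, String doc) {
        ReentrantLock lock = stripeFor(delTerm);
        lock.lock();
        try {
            long seq;
            synchronized (log) {
                // Sequence number and log append happen together, so the
                // list order equals the sequence order.
                seq = nextSeq++;
                log.add(seq + " UPDATE " + delTerm + " -> " + doc);
            }
            // Assumption: a real caller would hand the document to
            // IndexWriter.updateDocument(...) here, still holding the stripe
            // lock, so updates for the same term reach IW in logged order.
            return seq;
        } finally {
            lock.unlock();
        }
    }

    /** Returns a snapshot of the log entries in sequence order. */
    List<String> entries() {
        synchronized (log) {
            return new ArrayList<>(log);
        }
    }
}
```

Note what the sketch cannot do: a delete by query has no single term to hash, so there is no one stripe to take. It would have to block every stripe or rely on an ordering supplied by IndexWriter itself, which is the gap the quoted paragraph points at.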
>>
>> Before I propose ways this could be implemented, I want to check
>> whether others think we should provide more reliable ways for users
>> who need durability and consistent recovery.
>>
>> simon
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
