@Erick, I see, thanks for the clarification.

@Shawn, Good idea for the workaround! I will try that and see if it
resolves the issue.
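
For reference, here is a rough sketch of how I read the workaround, using
only the Python standard library against Solr's HTTP API. The function
names, the base URL, and the assumption that the uniqueKey field is called
"id" are all mine and purely illustrative. It pages through the matches
with cursorMark, asking only for the uniqueKey via fl, and then sends the
collected IDs to /update as delete-by-ID batches.

```python
import json
import urllib.parse
import urllib.request


def http_json(url, payload=None):
    """GET the URL, or POST payload as JSON when given, and decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def matching_ids(base_url, query, key="id", rows=1000, fetch=http_json):
    """Yield the uniqueKey of every document matching query.

    Assumes the uniqueKey field is named by `key` ("id" here is a guess).
    cursorMark paging requires a sort that includes the uniqueKey.
    """
    cursor = "*"
    while True:
        params = urllib.parse.urlencode({
            "q": query,
            "fl": key,              # limited fl: fetch only the uniqueKey
            "sort": f"{key} asc",
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        })
        body = fetch(f"{base_url}/select?{params}")
        for doc in body["response"]["docs"]:
            yield doc[key]
        next_cursor = body.get("nextCursorMark", cursor)
        if next_cursor == cursor:   # unchanged cursor means no more results
            return
        cursor = next_cursor


def delete_by_ids(base_url, query, batch=500, fetch=http_json):
    """Replace a deleteByQuery with a query followed by deleteById batches."""
    # Collect all IDs first so the deletes never interleave with the paging.
    ids = list(matching_ids(base_url, query, fetch=fetch))
    for start in range(0, len(ids), batch):
        # Solr's JSON update format: {"delete": [id1, id2, ...]} deletes by ID.
        fetch(f"{base_url}/update", {"delete": ids[start:start + batch]})
    return len(ids)
```

A call would look something like delete_by_ids("http://localhost:8983/solr/mycollection", "type:stale"),
where both the URL and the query are placeholders. Note that this is not
an exact replacement for deleteByQuery: documents that start matching
after the query has run will not be deleted, and the sketch does not
issue a commit, so it relies on whatever autoCommit settings are in place.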

Thanks,

Chris

On Tue, Nov 7, 2017 at 1:09 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> bq: you think it is caused by the DBQ deleting a document while a
> document with that same ID
>
> No. I'm saying that DBQ has no idea _if_ that would be the case so
> can't carry out the operations in parallel because it _might_ be the
> case.
>
> Shawn:
>
> IIUC, here's the problem. For deleteById, I can guarantee the
> sequencing through the same optimistic locking that regular updates
> use (i.e. the _version_ field). But I'm kind of guessing here.
>
> Best,
> Erick
>
> On Tue, Nov 7, 2017 at 8:51 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 11/5/2017 12:20 PM, Chris Troullis wrote:
> >> The issue I am seeing is when some
> >> threads are adding/updating documents while other threads are issuing
> >> deletes (using deleteByQuery), Solr seems to get into a state of extreme
> >> blocking on the replica
> >
> > The deleteByQuery operation cannot coexist very well with other indexing
> > operations.  Let me tell you about something I discovered.  I think your
> > problem is very similar.
> >
> > Solr 4.0 and later is supposed to be able to handle indexing operations
> > at the same time that the index is being optimized (in Lucene,
> > forceMerge).  I have some indexes that take about two hours to optimize,
> > so having indexing stop while that happens is a less than ideal
> > situation.  Ongoing indexing is similar in many ways to a merge, enough
> > that it is handled by the same Merge Scheduler that handles an optimize.
> >
> > I could indeed add documents to the index without issues at the same
> > time as an optimize, but when I would try my full indexing cycle while
> > an optimize was underway, I found that all operations stopped until the
> > optimize finished.
> >
> > Ultimately what was determined (I think it was Yonik that figured it
> > out) was that *most* indexing operations can happen during the optimize,
> > *except* for deleteByQuery.  The deleteById operation works just fine.
> >
> > I do not understand the low-level reasons for this, but apparently it's
> > not something that can be easily fixed.
> >
> > A workaround is to send the query you plan to use with deleteByQuery as
> > a standard query with a limited fl parameter, to retrieve matching
> > uniqueKey values from the index, then do a deleteById with that list of
> > ID values instead.
> >
> > Thanks,
> > Shawn
> >
>
