May I ask why you haven't used the sku as the primary key? Do you need to
keep multiple versions of the same sku?
As I understand it, if the sku can be the primary key, almost all of the
deleteByQuery calls become unnecessary.

On Thu, Jun 16, 2022 at 4:38 PM Shawn Heisey <[email protected]> wrote:

> On 6/16/22 02:59, Marius Grigaitis wrote:
> > In the end what caught our eye is a few deleteByQuery lines in stacks of
> > running threads while Solr is overloaded. We temporarily removed
> > deleteByQuery and it had around 10x performance improvement on indexing
> > speed.
>
> I do not understand all the low-level interactions.  But I have seen
> deleteByQuery cause some major problems.  It seems to create a blocking
> situation where Lucene waits for things to complete before it actually
> does the delete, and anything sent AFTER the delete waits for the
> delete.  Imagine this situation:
>
> 1) Ongoing indexing begins a segment merge, one that will take 15
> minutes to complete.
> 2) A deleteByQuery is sent.
> 3) More index changes are sent.
>
> What happens in this situation is that step 2 will wait for the merge to
> complete, and step 3 will wait for step 2 to complete.  I have seen
> automatic segment merges that take a lot longer than 15 minutes.
>
> If step 2 is changed to query for ID and then use deleteById, then steps
> 2 and 3 will run concurrently with the merge.
>
> It took a lot of headscratching to figure out why my indexing process
> sometimes stalled for LONG time spans.
>
> Thanks,
> Shawn
>
>
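
For reference, the workaround Shawn describes in the quoted message (query
for the IDs, then delete by ID) could be sketched roughly as below. This is
only a minimal sketch, not anything from the thread: it assumes Solr's
standard /select and /update JSON handlers, a unique key field named "id",
and a hypothetical helper name (delete_by_query_as_ids). It handles at most
`rows` matches per call, does not commit, and documents that start matching
between the query and the delete are not removed.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url, payload=None):
    """GET the URL, or POST `payload` as JSON when one is given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def delete_by_query_as_ids(base_url, query, id_field="id", rows=1000,
                           fetch=fetch_json):
    """Look up the IDs matching `query`, then delete those documents by ID.

    `base_url` is the collection URL (assumed form: http://host:8983/solr/name).
    Returns the list of IDs that were sent for deletion.
    """
    # Step 1: an ordinary query that returns only the unique key field.
    params = urllib.parse.urlencode(
        {"q": query, "fl": id_field, "rows": rows, "wt": "json"}
    )
    result = fetch(f"{base_url}/select?{params}")
    ids = [doc[id_field] for doc in result["response"]["docs"]]

    # Step 2: delete by ID instead of sending the query to /update.
    # {"delete": [...]} is Solr's JSON update syntax for delete-by-ID.
    if ids:
        fetch(f"{base_url}/update", {"delete": ids})
    return ids
```

The `fetch` parameter is there only so the HTTP layer can be swapped out,
for example in a test; a SolrJ client would do the same thing with a query
followed by deleteById.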

-- 
Vincenzo D'Amore
