[ 
https://issues.apache.org/jira/browse/SOLR-17650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925714#comment-17925714
 ] 

Mark Robert Miller commented on SOLR-17650:
-------------------------------------------

I took a look at this. Regarding the multiple ways to read the updateLog:

We started with no parallel replay. You need to be able to read in sorted order 
because updates might be reordered, say going from leader to replica, you 
don’t drop based on versions at the tlog level, and you have things like 
dependent updates. Reading with a sorted reader makes sure you read the tlog 
in logical order.
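
As a rough illustration of the sorted read, here is a minimal sketch. It 
assumes each entry carries a version that increases in logical order. The 
LogEntry class and the inLogicalOrder method are hypothetical names made up 
for this example and are not Solr's actual tlog reader API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedReplaySketch {
    // Hypothetical stand-in for a tlog entry; not Solr's actual record format.
    static final class LogEntry {
        final long version;
        final String id;
        final String op;

        LogEntry(long version, String id, String op) {
            this.version = version;
            this.id = id;
            this.op = op;
        }
    }

    // Returns a copy of the entries sorted by version, leaving the input as-is.
    static List<LogEntry> inLogicalOrder(List<LogEntry> arrivalOrder) {
        List<LogEntry> sorted = new ArrayList<>(arrivalOrder);
        sorted.sort(Comparator.comparingLong((LogEntry e) -> e.version));
        return sorted;
    }

    public static void main(String[] args) {
        // Arrival order on a replica: the dependent update arrived first.
        List<LogEntry> tlog = List.of(
            new LogEntry(103L, "doc1", "update that depends on version 101"),
            new LogEntry(101L, "doc1", "add"),
            new LogEntry(102L, "doc2", "add"));
        for (LogEntry e : inLogicalOrder(tlog)) {
            System.out.println(e.version + " " + e.id + " " + e.op);
        }
    }
}
```

Replaying the sorted list applies the add for doc1 before the update that 
depends on it, even though the log recorded them the other way around.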

The reason we wanted to add parallel replay is that if you have buffered 
updates and are trying to catch up while updates keep coming in, catching up 
can take a very long time, or never finish, because you are buffering as fast 
as or faster than you are replaying. So OrderedExecutor was added. In this 
case you don’t have to read in sorted order: the OrderedExecutor handles 
reorders of updates to the same ID, and delete by query is handled by waiting 
for the executor to drain before replaying it.
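
To make the idea concrete, here is a minimal sketch of keyed parallel replay. 
It is not Solr's OrderedExecutor implementation. The KeyedReplaySketch class, 
its submit/drain methods, and the String[] log format are all assumptions 
invented for this example. It also assumes a single reader thread submits the 
updates and that tasks do not throw.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class KeyedReplaySketch {
    private final ExecutorService pool;
    // Tail of the task chain for each document ID.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    KeyedReplaySketch(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    // Tasks for the same id run in submission order; different ids may overlap.
    void submit(String id, Runnable task) {
        tails.compute(id, (key, tail) -> {
            CompletableFuture<Void> prev =
                (tail == null) ? CompletableFuture.<Void>completedFuture(null) : tail;
            return prev.thenRunAsync(task, pool);
        });
    }

    // Blocks until every task submitted so far has finished.
    void drain() {
        CompletableFuture.allOf(tails.values().toArray(new CompletableFuture[0])).join();
        tails.clear();
    }

    void shutdown() {
        pool.shutdown();
    }

    // Hypothetical log format: {"add", id, value} or {"dbq", valueToDelete}.
    static Map<String, String> replay(List<String[]> log, int threads) {
        Map<String, String> index = new ConcurrentHashMap<>();
        KeyedReplaySketch exec = new KeyedReplaySketch(threads);
        try {
            for (String[] op : log) {
                if (op[0].equals("add")) {
                    exec.submit(op[1], () -> index.put(op[1], op[2]));
                } else {
                    // Delete by query: wait for in-flight adds, then apply inline.
                    exec.drain();
                    index.values().removeIf(v -> v.equals(op[1]));
                }
            }
            exec.drain();
        } finally {
            exec.shutdown();
        }
        return index;
    }

    public static void main(String[] args) {
        List<String[]> log = List.of(
            new String[] {"add", "doc1", "old"},
            new String[] {"add", "doc2", "keep"},
            new String[] {"add", "doc1", "stale"},
            new String[] {"dbq", "stale"},
            new String[] {"add", "doc3", "stale"});
        System.out.println(replay(log, 4));
    }
}
```

In this example doc1 ends up deleted, because its second add is guaranteed to 
run after its first and before the delete by query. doc3 survives because it 
is only submitted after the delete has been applied. Chaining per ID is just 
one way to get same-ID ordering; a lock keyed by ID would be another.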

I think Yonik didn’t always use parallel replay because typically you wouldn’t 
expect to be replaying a lot, so the coordination overhead is probably not 
worth it in the normal case.

As to the failures, Houston's analysis looks correct to me: before SOLR-17391 
only one thread appears to have been used, and now thread scheduling can 
change the order in which buffered updates are applied.

> Determine executor for UpdateLog reading
> ----------------------------------------
>
>                 Key: SOLR-17650
>                 URL: https://issues.apache.org/jira/browse/SOLR-17650
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Houston Putman
>            Priority: Major
>
> Currently most operations that read from the updateLog use the 
> OrderedExecutor that is set up for that purpose. However, TLOG replicas, when 
> taking leadership, read from the updateLog without the executor (using the 
> inSortedOrder = true parameter).
> SOLR-17391 changed the default executor for updateLog reading to use a fixed 
> coreSize thread pool, which seems to have started enabling parallel reading 
> of the updateLog (which was supposed to be true beforehand, but it looks like 
> our understanding of the executor was wrong).
> This ticket's purpose is to:
> # Investigate the correct way of reading the updateLog using an executor
> # Determine if there needs to actually be multiple ways 
> (inSortedOrder=true/false)
> # Make sure the tests pass with whatever we decide.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
