Can there be a situation where the index writer fails after the document
was added to the tlog and a success response was sent to the user? I think
we want to avoid such a situation, don't we?

On Thu, 8 Oct, 2020, 8:25 pm Cao Mạnh Đạt, <[email protected]> wrote:

> > Can you explain a little more on how this would impact durability of
> updates?
> Since we persist updates into the tlog, I do not think this will be an issue.
>
> > What does a failure look like, and how does that information get
> propagated back to the client app?
> I have not been able to do much research, but I think this is going to be
> the same as the way our current asyncId works. In this case the asyncId
> will be the version of an update (in the case of a distributed queue it
> will be the offset). Failed updates will be put into a time-to-live map so
> users can query the failure; for successes we can skip that by leveraging
> the max succeeded version so far.
>
> On Thu, Oct 8, 2020 at 9:31 PM Mike Drob <[email protected]> wrote:
>
>> Interesting idea! Can you explain a little more on how this would impact
>> durability of updates? What does a failure look like, and how does that
>> information get propagated back to the client app?
>>
>> Mike
>>
>> On Thu, Oct 8, 2020 at 9:21 AM Cao Mạnh Đạt <[email protected]> wrote:
>>
>>> Hi guys,
>>>
>>> First of all, it seems that I have used the term async a lot recently :D.
>>> Recently I have been thinking a lot about changing the current indexing
>>> model of Solr from the synchronous way it works now (the user submits an
>>> update request and waits for the response). What about changing it to an
>>> async model, where nodes only persist the update into the tlog and then
>>> return immediately, much like what the tlog is doing now? Then we have a
>>> dedicated executor which reads from the tlog to do the indexing (a
>>> producer-consumer model with the tlog acting as the queue).
>>>
>>> I do see several big benefits of this approach:
>>>
>>>    - We can batch updates in a single call. Right now we do not use the
>>>    writer.add(documents) API from Lucene; batching updates is going to
>>>    boost indexing performance.
>>>    - One common problem with Solr now is that we have a lot of threads
>>>    doing indexing, which can end up producing many small segments. Using
>>>    this model we can have bigger segments and so less merge cost.
>>>    - Another huge reason is that after switching to this model, we can
>>>    remove the tlog and use a distributed queue like Kafka or Pulsar. Since
>>>    the purpose of the leader in SolrCloud now is ordering updates, and the
>>>    distributed queue already orders updates for us, there is no need to
>>>    have a dedicated leader. That is just the beginning of the things we
>>>    can do after moving to a distributed queue.
>>>
>>> What do you guys think about this? I just want to hear from you guys
>>> before going deep into this rabbit hole.
>>>
>>> Thanks!
>>>
>>>
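
As a rough illustration of the producer-consumer model proposed at the
start of this thread, here is a minimal Java sketch. It is not Solr code:
the class AsyncIndexingSketch, its methods, and the in-memory queue standing
in for the tlog are all made up for illustration, and a real tlog would be
durable on disk. With Lucene, the batch handler would presumably end up
calling IndexWriter.addDocuments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class AsyncIndexingSketch {
    /** One logged update: a version plus the document payload (hypothetical shape). */
    static final class Update {
        final long version;
        final String doc;
        Update(long version, String doc) {
            this.version = version;
            this.doc = doc;
        }
    }

    // Stand-in for the tlog. Assumption: an in-memory queue, so nothing here is durable.
    private final BlockingQueue<Update> log = new LinkedBlockingQueue<>();
    private long nextVersion = 1;

    /** Producer side: append the update to the log and return its version right away. */
    synchronized long submit(String doc) {
        long version = nextVersion++;
        log.add(new Update(version, doc));
        return version;
    }

    /** Consumer side: take up to maxBatch logged updates and index them in one call. */
    int indexNextBatch(int maxBatch, Consumer<List<Update>> batchIndexer) {
        List<Update> batch = new ArrayList<>();
        log.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            batchIndexer.accept(batch);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        AsyncIndexingSketch sketch = new AsyncIndexingSketch();
        sketch.submit("doc-a");
        sketch.submit("doc-b");
        sketch.submit("doc-c");
        // A dedicated executor thread would run this in a loop.
        sketch.indexNextBatch(2, batch ->
            System.out.println("indexing " + batch.size() + " updates in one call"));
    }
}
```

Because submit returns before any indexing happens, the version it returns
is the only handle the client has for asking later whether the update
actually made it into the index.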

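The failure-reporting idea discussed above (failed versions kept in a
time-to-live map, successes inferred from the max succeeded version so far)
could look roughly like the following. Again this is only a sketch under
assumptions: UpdateStatusTracker, its method names, and the injected clock
are hypothetical and not part of Solr.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

class UpdateStatusTracker {
    enum Status { PENDING, SUCCEEDED, FAILED }

    // version -> time (millis) at which the failure record expires
    private final Map<Long, Long> failureExpiry = new ConcurrentHashMap<>();
    private final AtomicLong maxSucceededVersion = new AtomicLong(0);
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the TTL can be tested

    UpdateStatusTracker(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Successes are not stored one by one; only the highest version is kept. */
    void recordSuccess(long version) {
        maxSucceededVersion.accumulateAndGet(version, Math::max);
    }

    /** Failures are remembered individually, but only for ttlMillis. */
    void recordFailure(long version) {
        failureExpiry.put(version, clock.getAsLong() + ttlMillis);
    }

    Status status(long version) {
        Long expiry = failureExpiry.get(version);
        if (expiry != null) {
            if (clock.getAsLong() < expiry) {
                return Status.FAILED;
            }
            failureExpiry.remove(version); // the record outlived its TTL
        }
        return version <= maxSucceededVersion.get() ? Status.SUCCEEDED : Status.PENDING;
    }

    public static void main(String[] args) {
        UpdateStatusTracker tracker = new UpdateStatusTracker(60_000, System::currentTimeMillis);
        tracker.recordSuccess(1);
        tracker.recordFailure(2);
        tracker.recordSuccess(3);
        System.out.println(tracker.status(2)); // FAILED while the record is live
    }
}
```

One limitation of this sketch: once a failure record expires, a failed
version that is below the max succeeded version is reported as succeeded,
so the TTL has to be long enough for clients to come back and ask. It also
does not address the case raised at the top of the thread, where the client
has already been told "accepted" before indexing fails.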