I suppose failures would be returned to the client on the async response? How would one keep the tlog from growing forever if the actual indexing took a long time?
I'm guessing that this would be optional..

On Thu, Oct 8, 2020, 11:14 Ishan Chattopadhyaya <[email protected]> wrote:

> Can there be a situation where the index writer fails after the document
> was added to tlog and a success is sent to the user? I think we want to
> avoid such a situation, isn't it?
>
> On Thu, 8 Oct, 2020, 8:25 pm Cao Mạnh Đạt, <[email protected]> wrote:
>
>> > Can you explain a little more on how this would impact durability of
>> updates?
>> Since we persist updates into tlog, I do not think this will be an issue
>>
>> > What does a failure look like, and how does that information get
>> propagated back to the client app?
>> I did not be able to do much research but I think this is gonna be the
>> same as the current way of our asyncId. In this case asyncId will be the
>> version of an update (in case of distributed queue it will be offset)
>> failures update will be put into a time-to-live map so users can query the
>> failure, for success we can skip that by leverage the max succeeded version
>> so far.
>>
>> On Thu, Oct 8, 2020 at 9:31 PM Mike Drob <[email protected]> wrote:
>>
>>> Interesting idea! Can you explain a little more on how this would impact
>>> durability of updates? What does a failure look like, and how does that
>>> information get propagated back to the client app?
>>>
>>> Mike
>>>
>>> On Thu, Oct 8, 2020 at 9:21 AM Cao Mạnh Đạt <[email protected]> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> First of all it seems that I used the term async a lot recently :D.
>>>> Recently I have been thinking a lot about changing the current indexing
>>>> model of Solr from sync way like currently (user submit an update request
>>>> waiting for response). What about changing it to async model, where nodes
>>>> will only persist the update into tlog then return immediately much like
>>>> what tlog is doing now. Then we have a dedicated executor which reads from
>>>> tlog to do indexing (producer consumer model with tlog acting like the
>>>> queue).
>>>>
>>>> I do see several big benefits of this approach
>>>>
>>>> - We can batching updates in a single call, right now we do not use
>>>> writer.add(documents) api from lucene, by batching updates this gonna
>>>> boost the performance of indexing
>>>> - One common problems with Solr now is we have lot of threads doing
>>>> indexing so that can ends up with many small segments. Using this model
>>>> we can have bigger segments so less merge cost
>>>> - Another huge reason here is after switching to this model, we can
>>>> remove tlog and use a distributed queue like Kafka, Pulsar. Since the
>>>> purpose of leader in SolrCloud now is ordering updates, the distributed
>>>> queue is already ordering updates for us, so no need to have a dedicated
>>>> leader. That is just the beginning of things that we can do after using
>>>> a distributed queue.
>>>>
>>>> What do your guys think about this? Just want to hear from your guys
>>>> before going deep into this rabbit hole.
>>>>
>>>> Thanks!
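
To make the model discussed in the thread concrete, here is a rough, language-neutral sketch in Python. It is not Solr code. An in-memory queue stands in for the tlog (or a distributed queue), each update's version serves as its async id, failures go into a time-to-live map, and successes are tracked only through the maximum succeeded version. The names `AsyncIndexer`, `submit`, `drain_once`, `status` and `index_batch` are all hypothetical, and a real implementation would run `drain_once` in a loop on a dedicated executor thread.

```python
import queue
import threading
import time


class AsyncIndexer:
    """Toy model of the proposal: log the update now, index it later in batches."""

    def __init__(self, index_batch, batch_size=100, failure_ttl=60.0):
        self._log = queue.Queue()        # stand-in for the tlog / distributed queue
        self._index_batch = index_batch  # hypothetical callable; raises on failure
        self._batch_size = batch_size
        self._failure_ttl = failure_ttl  # seconds a failure record stays queryable
        self._failures = {}              # version -> (error message, expiry time)
        self._max_succeeded = 0
        self._next_version = 0
        self._lock = threading.Lock()

    def submit(self, doc):
        """Producer side: record the update and return its version immediately."""
        with self._lock:
            self._next_version += 1
            version = self._next_version
            self._log.put((version, doc))
        return version

    def drain_once(self):
        """Consumer side: index up to batch_size logged updates in a single call."""
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._log.get_nowait())
            except queue.Empty:
                break
        if not batch:
            return 0
        try:
            self._index_batch([doc for _, doc in batch])
        except Exception as exc:
            # The whole batch is marked failed; records expire after failure_ttl.
            expiry = time.monotonic() + self._failure_ttl
            with self._lock:
                for version, _ in batch:
                    self._failures[version] = (str(exc), expiry)
        else:
            with self._lock:
                self._max_succeeded = max(self._max_succeeded, batch[-1][0])
        return len(batch)

    def status(self, version):
        """Return 'failed', 'succeeded' or 'pending' for a submitted version."""
        now = time.monotonic()
        with self._lock:
            # Drop expired failure records (the time-to-live part).
            self._failures = {v: f for v, f in self._failures.items() if f[1] > now}
            if version in self._failures:
                return "failed"
            if version <= self._max_succeeded:
                return "succeeded"
            return "pending"
```

Two caveats in this sketch: once a failure record expires, a failed version below the maximum succeeded version would be reported as succeeded, and nothing bounds the log if the consumer falls behind. Both would need an explicit answer in a real design.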
