: So, I thought it can be simplified by moving this state-transition and
: processing logic into Solr by writing a custom update processor. The idea
: occurred to me when I was thinking about Solr serializing multiple
: concurrent requests for a document on the leader replica. So, my thought
: process was: if I am getting this serialization for free, I can implement
: the entire processing inside Solr, and a dumb client to push records to
: Solr would be sufficient. But that's not working. Perhaps the point I
: missed is that even though this processing is moved inside Solr, I still
: have a race condition because of the time-of-check to time-of-update gap.
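To make the race in the quoted text concrete, here is a minimal, self-contained sketch (no Solr classes involved; the document id, the NEW -> ACTIVE transition rule, and all names are hypothetical stand-ins for the real business logic). Both requests read the document's state before either one writes, so both pass a check that should only admit one of them:

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic simulation of the time-of-check to time-of-update gap:
// the "check" phase of two concurrent updates for the same document runs
// before either "update" phase, so both pass a transition check that the
// business rule says should succeed only once.
public class ToctouDemo {

    // Toy rule: a document may go NEW -> ACTIVE exactly once.
    static boolean legalTransition(String from, String to) {
        return "NEW".equals(from) && "ACTIVE".equals(to);
    }

    /** Runs the two-request simulation; returns how many updates were applied. */
    public static int racyUpdates() {
        Map<String, String> index = new HashMap<>();
        index.put("doc1", "NEW");

        // Check phase for both requests happens before either update phase.
        String seenByReq1 = index.get("doc1");
        String seenByReq2 = index.get("doc1");
        boolean req1Ok = legalTransition(seenByReq1, "ACTIVE");
        boolean req2Ok = legalTransition(seenByReq2, "ACTIVE");

        // Update phase: both writes go through, although the rule
        // should have rejected the second one.
        int applied = 0;
        if (req1Ok) { index.put("doc1", "ACTIVE"); applied++; }
        if (req2Ok) { index.put("doc1", "ACTIVE"); applied++; }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println("updates applied: " + racyUpdates());
    }
}
```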
Correct. Solr is (hand wavy) "locking" updates to documents by id on the leader node to ensure they are transactional, but that locking happens inside DistributedUpdateProcessor; other update processors don't run "inside" that lock.

: While writing this it just occurred to me that I'm running my custom
: update processor before DistributedUpdateProcessor. I'm committing the
: same XY crime again, but if I run it after DistributedUpdateProcessor,
: can this race condition be avoided?

No. That would just introduce a whole new host of problems that are a much more involved conversation to get into (remember: the processors after DistributedUpdateProcessor run on every replica, after the leader has already assigned a version and said this update should go through ... so now imagine what your error handling logic has to look like?)

Ultimately the goal that you're talking about really feels like "business logic that requires synchronizing/blocking updates", but you're trying to avoid writing a synchronized client to do that synchronization and error handling before forwarding those updates to Solr.

I mean -- even with your explanation of your goal, there is a whole host of nuance / use-case-specific logic that has to go into "based on various conflicts it modifies the records for which update failed" -- and that logic seems like it would affect the locking: if you get a request that violates the legal state transition because of another request that (blocked it until it) just finished ... now what? Fail? Apply some new rules? This seems like logic you should really want in a "middleware" layer that your clients talk to, which then sends docs to Solr.

If you *REALLY* want to try and piggyback this logic into Solr, then there is _one_ place I can think of where you can "hook in" to the logic DistributedUpdateProcessor does while "locking" an id on the leader, and that would be extending the AtomicUpdateDocumentMerger...
It's marked experimental, and I don't really understand the use cases for why it exists, and in order to customize this you would have to also subclass DistributedUpdateProcessorFactory to build your custom instance and pass it to the DistributedUpdateProcessor constructor, but then -- in theory -- you could intercept any document update *after* the RTG, and before it's written to the TLOG, and apply some business logic. But I wouldn't recommend this ... "There be Dragons!"

-Hoss
http://www.lucidworks.com/
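The "middleware" layer suggested above could look roughly like the following self-contained sketch: a per-id lock map serializes check-and-update for each document, and only updates that pass the transition check are forwarded to Solr. The Solr client is stubbed out as a `Consumer<String>`, and the class, method names, and the NEW -> ACTIVE rule are all hypothetical stand-ins for real business logic:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Sketch of a synchronizing middleware in front of Solr: all updates for
// the same document id are serialized on a per-id lock, the legal-state-
// transition check runs while holding that lock, and only then is the
// update forwarded. The "now what?" conflict-handling question from the
// discussion above lives in the rejected branch.
public class UpdateGateway {
    private final Map<String, Object> idLocks = new ConcurrentHashMap<>();
    private final Map<String, String> lastKnownState = new ConcurrentHashMap<>();
    private final Consumer<String> sendToSolr; // stand-in for a real Solr client

    public UpdateGateway(Consumer<String> sendToSolr) {
        this.sendToSolr = sendToSolr;
    }

    // Toy rule: documents start as NEW and may go NEW -> ACTIVE once.
    static boolean legalTransition(String from, String to) {
        return from == null ? "NEW".equals(to)
                            : "NEW".equals(from) && "ACTIVE".equals(to);
    }

    /** Returns true if the update was forwarded to Solr, false if rejected. */
    public boolean update(String id, String newState) {
        Object lock = idLocks.computeIfAbsent(id, k -> new Object());
        synchronized (lock) { // check and update are atomic per document id
            if (!legalTransition(lastKnownState.get(id), newState)) {
                return false; // conflict handling / "apply some new rules" goes here
            }
            lastKnownState.put(id, newState);
            sendToSolr.accept(id + "=" + newState);
            return true;
        }
    }
}
```

Because the check and the forward both happen inside the per-id `synchronized` block, the time-of-check to time-of-update gap from the quoted question is closed before the update ever reaches Solr.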
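For completeness, the AtomicUpdateDocumentMerger hook described above might be sketched as follows. This is a hypothetical, non-standalone sketch: it requires solr-core on the classpath, the class and method names come from the message above, and the exact constructor signatures vary between Solr versions, so verify them against your version before attempting this:

```java
// Hypothetical sketch only -- will not compile without solr-core, and the
// AtomicUpdateDocumentMerger API is marked experimental.
public class BusinessRuleUpdateProcessorFactory extends UpdateRequestProcessorFactory {
    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                                              SolrQueryResponse rsp,
                                              UpdateRequestProcessor next) {
        // The custom merger runs after the RTG and before the TLOG write,
        // while the leader is "locking" the document id.
        AtomicUpdateDocumentMerger merger = new AtomicUpdateDocumentMerger(req) {
            @Override
            public SolrInputDocument merge(SolrInputDocument fromDoc,
                                           SolrInputDocument toDoc) {
                // State-transition / business-rule checks would go here;
                // throwing an exception here rejects the update.
                return super.merge(fromDoc, toDoc);
            }
        };
        // Pass the custom merger into DistributedUpdateProcessor's constructor.
        return new DistributedUpdateProcessor(req, rsp, merger, next);
    }
}
```

As the message warns, this is experimental API territory and not recommended.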