steveniemitz commented on PR #22190:
URL: https://github.com/apache/beam/pull/22190#issuecomment-1178323588

   Answers inline
   
   > This is neat, do you have results for the various implementations you tried as to their performance impact?
   
   The concurrent set change implementation performed about the same (anything other than the original won't even show up in profiles; it just becomes noise in the rest of the execution). My concern was around double-counting samples at the end. If we don't care about that, it might be worth revisiting, since the queue solution here is much more complicated (although conceptually cool to write :P ). There might be some concern around thread safety inside the tracker itself; I doubt it was ever used concurrently in the past. We could also probably synchronize inside the tracker, which would almost never be contended.
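
   To make the tradeoff concrete, here's a minimal sketch of the concurrent-set variant (hypothetical names, not the actual runners-core classes): activation and deactivation become lock-free set operations, and the sampling thread iterates the set without holding a shared lock. The double-counting risk is visible in the iteration: a tracker deactivated mid-iteration can still be sampled one last time.

   ```java
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;

   class ConcurrentSetSampler {
     /** Stand-in for the real tracker type; only what the sketch needs. */
     interface Tracker {
       void takeSample(long millisSinceLastSample);
     }

     private final Set<Tracker> activeTrackers = ConcurrentHashMap.newKeySet();

     // Bundle start/finish: lock-free add/remove instead of mutating a
     // synchronized HashSet under a shared lock.
     void activate(Tracker tracker) { activeTrackers.add(tracker); }
     void deactivate(Tracker tracker) { activeTrackers.remove(tracker); }

     // Sampling thread. A tracker deactivated mid-iteration may be sampled
     // once more, which is the double-counting concern mentioned above.
     void sampleOnce(long millisSinceLastSample) {
       for (Tracker tracker : activeTrackers) {
         tracker.takeSample(millisSinceLastSample);
       }
     }
   }
   ```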
   
   > 
   > To answer your questions:
   > 
   > 1. Double sampling isn't an issue even if you record the sample to the wrong bundle, as long as that bundle is effectively processing the same portion of the graph. Something that we can easily do in the SDK harness implementation.
   
   I think this is fine; at least it wouldn't have changed between the different implementations.
   
   > 2. LMAX is fine as I've had other use cases for it in mind but would 
rather have the benchmarks speak for themselves.
   
   LMAX has very good latency but slightly lower throughput than a standard Java concurrent queue. I mainly used it here to keep the queuing operation allocation-free. I had found a benchmark a long time ago comparing a Java concurrent queue to the LMAX Disruptor, but I can't find it now.
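
   As a rough illustration of why the Disruptor keeps the enqueue allocation-free (hypothetical event and handler names, not the code in this PR): the ring buffer preallocates its mutable event slots once, and publishing just claims a slot and mutates it in place.

   ```java
   import com.lmax.disruptor.RingBuffer;
   import com.lmax.disruptor.dsl.Disruptor;
   import com.lmax.disruptor.util.DaemonThreadFactory;

   public class TrackerEventQueue {
     // Mutable event slot, preallocated once per ring buffer entry; the hot
     // path only reuses these slots, so publishing allocates nothing.
     static final class TrackerEvent {
       Object tracker;
       boolean activated;
     }

     public static void main(String[] args) {
       // Ring buffer size must be a power of two.
       Disruptor<TrackerEvent> disruptor =
           new Disruptor<>(TrackerEvent::new, 1024, DaemonThreadFactory.INSTANCE);

       // A single consumer drains activation/deactivation events off the hot path.
       disruptor.handleEventsWith(
           (event, sequence, endOfBatch) ->
               System.out.println(
                   (event.activated ? "activated " : "deactivated ") + event.tracker));
       disruptor.start();

       RingBuffer<TrackerEvent> ring = disruptor.getRingBuffer();
       // publishEvent claims a slot and mutates it in place: no allocation here.
       ring.publishEvent(
           (event, sequence, tracker, activated) -> {
             event.tracker = tracker;
             event.activated = activated;
           },
           "tracker-1",
           true);
     }
   }
   ```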
   
   > Other observations: A) Whatever we come up with seems valuable for the SDK harness version. Note that I created a new state sampler implementation there since the runners-core one is being used by all the runners and has to have stricter synchronization, while the SDK harness has a very specific thread access pattern that we can exploit for greater gains.
   
   Agreed and I took a quick look at that already.
   
   > 
   > B) One of the approaches I had considered was to have the state tracker only be added to the set when the bundle processor is created (e.g. DoFns deserialized, graph is wired up) and removed when the entire bundle processor is destroyed. The sampling thread would then iterate over the trackers and just check to see if they are active, skipping them if they aren't. This would make the bundle creation hot path faster at the cost of making the state sampling slower, as the bundle processor lifetime should be pretty long as long as they are being processed regularly.
   
   Hm, this is interesting, and the number of bundle processors really should still be fairly bounded (~1000s?), so it's not a big deal to iterate them all; a sketch of that approach is below. Changing it in the non-fn-execution worker might be a little more complicated than patching it up here, though.
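
   A hedged sketch of what B could look like (hypothetical names again, not the actual implementation): the set mutation moves from per-bundle to per-processor, and the per-bundle hot path shrinks to a flag flip that the sampling thread reads.

   ```java
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.AtomicBoolean;

   // Hypothetical sketch of approach B: a tracker is registered for the whole
   // bundle processor lifetime; per-bundle work is just a flag flip.
   class LifetimeRegisteredTracker {
     private static final Set<LifetimeRegisteredTracker> REGISTERED =
         ConcurrentHashMap.newKeySet();

     private final AtomicBoolean activeBundle = new AtomicBoolean();
     private volatile long sampledMillis;

     // Once, when the bundle processor is created (DoFns deserialized,
     // graph wired up).
     LifetimeRegisteredTracker() { REGISTERED.add(this); }

     // Hot path, twice per bundle: no shared lock involved.
     void startBundle() { activeBundle.set(true); }
     void finishBundle() { activeBundle.set(false); }

     // Once, when the entire bundle processor is destroyed.
     void close() { REGISTERED.remove(this); }

     // Sampling thread: iterate every registered tracker and skip idle ones.
     // Even with ~1000s of processors this stays cheap per sampling period.
     static void sampleAll(long elapsedMillis) {
       for (LifetimeRegisteredTracker tracker : REGISTERED) {
         if (tracker.activeBundle.get()) {
           tracker.sampledMillis += elapsedMillis;
         }
       }
     }
   }
   ```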
   
   > 
   > C) Finally, I'm guessing that the pipeline that you're running is taking too long in the synchronized portion, which is causing a backlog across threads. This could be because hashset is too slow to add/remove or that the stateSampling is taking too long, causing everyone else to get backed up behind that thread even though it doesn't do much. Have you tried passing in the experiment to decrease state sampling to like every 5 seconds (`--experiment state_sampling_period_millis=5000`) to see if that resolves thread contention for the most part?
   
   I actually completely disabled the sampler and it didn't really matter (my theory was originally the same as yours, that the sampler was blocking things). The problem is just many (max 300) threads all racing (very quickly) to acquire a lock twice per bundle. The time spent actually inside the lock is very small, but according to the stats in the JFR profile I was looking at, the average time to acquire the lock is ~80ms.
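
   For anyone wanting to reproduce those numbers from a JFR recording: contended-lock stats show up as `jdk.JavaMonitorEnter` events, which can be aggregated with the standard JDK JFR consumer API. This is a generic sketch, not tied to this pipeline's recording:

   ```java
   import java.nio.file.Path;
   import jdk.jfr.consumer.RecordedEvent;
   import jdk.jfr.consumer.RecordingFile;

   // Reads a JFR recording and prints how long threads spent blocked entering
   // monitors, i.e. where an ~80ms average acquire time would show up.
   public class MonitorEnterStats {
     public static void main(String[] args) throws Exception {
       long count = 0;
       long totalNanos = 0;
       for (RecordedEvent event : RecordingFile.readAllEvents(Path.of(args[0]))) {
         if (event.getEventType().getName().equals("jdk.JavaMonitorEnter")) {
           count++;
           totalNanos += event.getDuration().toNanos();
         }
       }
       System.out.printf("monitor-enter events: %d, avg blocked: %.2f ms%n",
           count, count == 0 ? 0.0 : totalNanos / 1e6 / count);
     }
   }
   ```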
   

