steveniemitz commented on PR #22103:
URL: https://github.com/apache/beam/pull/22103#issuecomment-1178114663

   Ah this is cool and was next up on my hit list, although for a different 
reason.  I've never seen the actual sampler code itself show up as a 
significant CPU/memory offender, however, the contention around 
add/removeTracker is significant since all worker threads need to serialize 
through it.
   
   There's a couple options I've tried to remove the lock there:
   - Switch `activeTrackers` back to a concurrent set, the problem here was 
that deactivating a tracker could still race with the sampler thread and 
double-sample a sampler.  I'm not sure if this is a big enough deal to worry 
about?
   - Rather than worry about synchronization at all, push all add/remove/sample 
operations into a queue and have a single thread consuming from it and 
maintaining activeTrackers and doing the sampling.  Doing so removes the race 
above and simplifies the logic a lot (I think).  I used the LMAX Disruptor 
queue to handle the multi-consumer-single-producer setup here, but you could 
probably use a built-in java concurrent blocking queue for this as well.
   
   In any case, I was going to put up a review for the other version of this 
today-ish, we can apply whatever we end up deciding on to this as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to