> On July 18, 2015, 8:33 p.m., Ivan Mitic wrote:
> > 1.
> > {code}
> > // Using @AtomicBoolean.compareAndSet so that only first thread to
> > // execute the compare and set gets through to initializing
> > // the maps
> > if (isClusterInitialized.compareAndSet(false, true)) {
> >   initProviderMaps(clusterName);
> > }
> > {code}
> > This ensures that only one thread calls into initProviderMaps() however,
> > the thread that loses might move on without this initialization being
> > completed. Is this the desired behavior?
> >
> > 2. Have you analyzed what can happen if threads start calling checkInit,
> > resetInit in random orders? Will things hold?
> >
> > 3. Were you able to pinpoint the change that introduced this problem? I
> > believe this is a recent regression - last month or so.
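To make point 1 concrete, here is a minimal sketch of the race being asked about. It is not the Ambari code: the class LazyInitSketch, its fields, and both method names are hypothetical, and the two latches exist only so the demo can hold the winning thread inside the initializer. The second method shows one possible alternative (losers wait for the winner), offered as an assumption and not as what the patch does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the provider module under review; not Ambari code.
class LazyInitSketch {
    private final AtomicBoolean initClaimed = new AtomicBoolean(false);
    private final CountDownLatch initFinished = new CountDownLatch(1);
    private volatile boolean mapsReady = false;

    // Hooks used only by demo() to hold the winning thread inside the initializer.
    private final CountDownLatch enteredInit = new CountDownLatch(1);
    private final CountDownLatch mayFinishInit = new CountDownLatch(1);

    private static void awaitLatch(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void joinThread(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for initProviderMaps(clusterName); blocks until the demo releases it.
    private void initMaps() {
        enteredInit.countDown();
        awaitLatch(mayFinishInit);
        mapsReady = true;
    }

    // Pattern quoted in the review: the CAS winner initializes, losers return at once.
    boolean checkInitCasOnly() {
        if (initClaimed.compareAndSet(false, true)) {
            initMaps();
        }
        return mapsReady; // can be false for a thread that lost the CAS
    }

    // One possible alternative (an assumption, not the patch): losers wait for the winner.
    boolean checkInitAndWait() {
        if (initClaimed.compareAndSet(false, true)) {
            try {
                initMaps();
            } finally {
                initFinished.countDown(); // release waiters even if init throws
            }
        }
        awaitLatch(initFinished);
        return mapsReady;
    }

    // Returns {what a CAS loser saw mid-init, what a waiting loser saw}.
    static boolean[] demo() {
        LazyInitSketch a = new LazyInitSketch();
        Thread winnerA = new Thread(() -> { a.checkInitCasOnly(); });
        winnerA.start();
        awaitLatch(a.enteredInit);                    // winner is now inside initMaps()
        boolean loserSawReady = a.checkInitCasOnly(); // loses the CAS, returns immediately
        a.mayFinishInit.countDown();
        joinThread(winnerA);

        LazyInitSketch b = new LazyInitSketch();
        AtomicBoolean waiterSawReady = new AtomicBoolean(false);
        Thread winnerB = new Thread(() -> { b.checkInitAndWait(); });
        winnerB.start();
        awaitLatch(b.enteredInit);                    // winner has claimed the CAS
        Thread loserB = new Thread(() -> waiterSawReady.set(b.checkInitAndWait()));
        loserB.start();
        b.mayFinishInit.countDown();                  // let the winner complete
        joinThread(winnerB);
        joinThread(loserB);
        return new boolean[] { loserSawReady, waiterSawReady.get() };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("CAS-only loser saw ready maps: " + r[0]);
        System.out.println("Waiting loser saw ready maps:  " + r[1]);
    }
}
```

In the CAS-only variant the losing caller returns while the maps are still being built, which is the behavior the question is about. The waiting variant closes that window, but it does not by itself answer point 2, since a reset arriving between the compare-and-set and the end of initialization would still need separate handling.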
- Have you analyzed what can happen if threads start calling checkInit, resetInit in random orders? Will things hold?

Valid point. I think it is OK to let init be called by more than one thread rather than lose a reset. I will update with new changes.

- Were you able to pinpoint the change that introduced this problem? I believe this is a recent regression - last month or so.

This is not a regression; this code has been unchanged for some time. What changed is the blueprints flow: we now allow major CRUD operations while monitoring the still-nascent cluster at the same time. As I explained already, this is not a valid case for the UI. The blueprint processing has changed a lot recently, and Jon Speidel might be able to provide more reasoning behind the changes to the architecture in terms of when write locks were acquired previously as opposed to now. That said, the deadlock situation is pretty clear.

- Sid


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36587/#review92187
-----------------------------------------------------------


On July 18, 2015, 5:25 p.m., Sid Wagle wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36587/
> -----------------------------------------------------------
>
> (Updated July 18, 2015, 5:25 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Jonathan Hurley, Mahadev
> Konar, Myroslav Papirkovskyy, and Sumit Mohanty.
>
>
> Bugs: AMBARI-12453
>     https://issues.apache.org/jira/browse/AMBARI-12453
>
>
> Repository: ambari
>
>
> Description
> -------
>
> The high-level picture seems to be contention between requests made from the
> UI for host metrics, which try to figure out the actual metrics service, and
> the code that updates the in-memory maps that hold information about where
> that service is, what ports to use to connect to it, etc.
> These property maps are updated by Observers on important events like
> Cluster/Service/Host CRUD by resetting a volatile variable.
>
> The contention occurs because the thread that enters the monitor protecting
> the volatile var then asks for another lock on the cluster. That lock is
> held by some other thread, which also needs a value from the in-memory maps
> and waits on the object monitor that it cannot enter.
>
> Note: Web-based deployments get away with this because not a lot of CRUD ops
> happen in parallel to reads, as they do in the use case of monitoring the
> Blueprint deploy while the cluster is being provisioned.
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/AbstractProviderModule.java 380a0fe
>
> Diff: https://reviews.apache.org/r/36587/diff/
>
>
> Testing
> -------
>
> All unit tests passed.
>
>
> Thanks,
>
> Sid Wagle
>
>
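The lock-order problem in the Description can be sketched roughly as follows. Everything here is hypothetical (the class LockOrderSketch, the field names, and the use of ReentrantReadWriteLock as a stand-in for the cluster lock), and the timed tryLock is there only so the sketch terminates instead of hanging; the code under review is assumed to block at that point.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the lock-order inversion; not Ambari code.
class LockOrderSketch {
    private final Object initMonitor = new Object();
    private final ReentrantReadWriteLock clusterLock = new ReentrantReadWriteLock();

    // Reader path: enter the monitor first, then ask for the cluster lock.
    // A timed tryLock replaces a blocking lock() so the demo cannot hang.
    boolean initUnderMonitor(long timeoutMs) throws InterruptedException {
        synchronized (initMonitor) {
            if (clusterLock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                try {
                    return true; // would rebuild the in-memory maps here
                } finally {
                    clusterLock.readLock().unlock();
                }
            }
            return false; // cluster lock is held elsewhere; a blocking lock() would wait
        }
    }

    // Returns whether the reader path obtained the cluster lock while a writer held it.
    static boolean demo() {
        LockOrderSketch s = new LockOrderSketch();
        CountDownLatch writerHoldsLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Writer path: holds the cluster write lock (for example during a CRUD operation).
        // In the scenario from the Description it would next wait on initMonitor,
        // which the reader owns, so neither thread could make progress.
        Thread writer = new Thread(() -> {
            s.clusterLock.writeLock().lock();
            try {
                writerHoldsLock.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                s.clusterLock.writeLock().unlock();
            }
        });
        writer.start();

        try {
            writerHoldsLock.await();
            boolean acquired = s.initUnderMonitor(100);
            release.countDown();
            writer.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Reader got cluster lock under monitor: " + demo());
    }
}
```

The reader times out because the writer still holds the cluster lock. With blocking acquisition on both sides, and the writer then waiting on the monitor, that is the cycle described above. Avoiding the nested acquisition, or always taking the two locks in the same order, are the usual ways to break such a cycle.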