> On Sept. 11, 2014, 12:30 a.m., Chinmay Soman wrote:
> > samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala, line 45
> > <https://reviews.apache.org/r/25522/diff/1/?file=684858#file684858line45>
> >
> > If a particular container fails, this will be set to False.
> >
> > However, when that container is restarted -> shouldn't we set this back to True ?
> >
> > From the current code, it seems like this will remain False after the first incident.

Good point. After a container is allocated and state.neededContainers has been decremented, we should check whether all containers are running again. If they are, jobHealthy should be set back to true.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25522/#review52983
-----------------------------------------------------------


On Sept. 11, 2014, 12:19 a.m., David Chen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25522/
> -----------------------------------------------------------
>
> (Updated Sept. 11, 2014, 12:19 a.m.)
>
>
> Review request for samza.
>
>
> Bugs: SAMZA-408
>     https://issues.apache.org/jira/browse/SAMZA-408
>
>
> Repository: samza
>
>
> Description
> -------
>
> SAMZA-408: Expose metric for tracking AM availability.
>
>
> Diffs
> -----
>
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterLifecycle.scala 3d17632e17d3495a4335a6a80bcdb9e40db9d184
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 09b1237d670307d8c51303bf1086bf863bad4756
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterState.scala 471eff499f9af4f76d434e8b5d79f618a5dcaeb8
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterTaskManager.scala ee08cfbce7ec3079ebf35bb510a22bcc0df1feb1
>
> Diff: https://reviews.apache.org/r/25522/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> David Chen
>
>
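
To make the proposed fix concrete, here is a rough sketch of the check described in the reply at the top of this message. It is not the actual patch: AppMasterStateSketch and HealthTrackingSketch are hypothetical stand-ins, only the neededContainers and jobHealthy names come from this thread, and the assumption that neededContainers == 0 means "all containers are running" is mine.

```scala
// Hypothetical, simplified stand-in for SamzaAppMasterState. Only the two
// fields discussed in this thread are modelled; the real class has more.
class AppMasterStateSketch {
  // Number of containers still waiting to be allocated (assumed meaning).
  var neededContainers: Int = 0
  // Health flag that the availability metric would report.
  var jobHealthy: Boolean = true
}

// Hypothetical handlers standing in for the task manager callbacks.
object HealthTrackingSketch {
  // A container failed: one more container is needed, so the job is unhealthy.
  def onContainerFailed(state: AppMasterStateSketch): Unit = {
    state.neededContainers += 1
    state.jobHealthy = false
  }

  // A replacement container was allocated: decrement first, then check
  // whether every container is running again before restoring the flag.
  def onContainerAllocated(state: AppMasterStateSketch): Unit = {
    state.neededContainers -= 1
    if (state.neededContainers == 0) {
      state.jobHealthy = true
    }
  }
}
```

With two failures outstanding, the flag stays false after the first replacement is allocated and only returns to true after the second, which avoids the "remains False after the first incident" problem raised in the review comment.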
