[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/2710 ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208340502 --- Diff: storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java --- @@ -365,6 +384,7 @@ void addReferencesToBlobs(PortAndAssignment pna, BlobChangingCallback cb) if (!localResourceList.isEmpty()) { getBlobs(localResourceList, pna, cb); } +pna.complete(); --- End diff -- just one more nit: put this outside of this method since the method `addReferencesToBlobs` doesn't indicate `complete` ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user zd-project commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208292571 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Container.java --- @@ -346,6 +358,10 @@ public void cleanUp() throws IOException { _usedMemory.remove(_port); _reservedMemory.remove(_port); cleanUpForRestart(); +} catch (IOException e) { +//This may or may not be reported depending on when process exits --- End diff -- I'll remove this after review is completed ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user zd-project commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208286166 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Container.java --- @@ -346,6 +358,10 @@ public void cleanUp() throws IOException { _usedMemory.remove(_port); _reservedMemory.remove(_port); cleanUpForRestart(); +} catch (IOException e) { +//This may or may not be reported depending on when process exits --- End diff -- My bad. I thought Container runs on it's own process when I started working on this. This should not be a problem if it's just a daemon thread. ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208092337 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Container.java --- @@ -346,6 +358,10 @@ public void cleanUp() throws IOException { _usedMemory.remove(_port); _reservedMemory.remove(_port); cleanUpForRestart(); +} catch (IOException e) { +//This may or may not be reported depending on when process exits --- End diff -- I would like to better understand this. In which case the meter will not be reported? ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208095554 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java --- @@ -1008,6 +1015,13 @@ public void close() throws Exception { } static class DynamicState { +private static final Map workerStateTransition = EnumUtil.toEnumMap(MachineState.class, +machineState -> StormMetricsRegistry.registerMeter("supervisor:num-transitions-into-" + machineState.toString())); +private static final Map workerStateDuration = EnumUtil.toEnumMap(MachineState.class, +machineState -> StormMetricsRegistry.registerTimer( +"supervisor:num-transitions-out-" + machineState.toString() + "-and-duration-ms") --- End diff -- It would be good if we can find a better name ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208279317 --- Diff: storm-client/src/jvm/org/apache/storm/daemon/supervisor/ClientSupervisorUtils.java --- @@ -28,11 +29,14 @@ import org.apache.storm.shade.org.apache.commons.lang.StringUtils; import org.apache.storm.utils.ConfigUtils; import org.apache.storm.utils.ObjectReader; +import org.apache.storm.utils.ShellUtils; import org.apache.storm.utils.Utils; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public class ClientSupervisorUtils { +public static final Meter numWorkerLaunchExceptions = ShellUtils.numShellExceptions; --- End diff -- I would like to see some comments so people can understand why `ClientSupervisorUtils` is using the same meter with `ShellUtils` ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208096352 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java --- @@ -1147,8 +1162,19 @@ public DynamicState withPendingLocalization(Future pendingDownload) { */ public DynamicState withState(final MachineState state) { long newStartTime = Time.currentTimeMillis(); +//We may (though unlikely) lose metering here if state transition is too frequent (less than a millisecond) +workerStateDuration.get(this.state).update(newStartTime - startTime, TimeUnit.MILLISECONDS); +workerStateTransition.get(state).mark(); + +LocalAssignment assignment = this.currentAssignment; +if (this.state != MachineState.RUNNING && state == MachineState.RUNNING +&& this.currentAssignment instanceof TimerDecoratedAssignment) { +((TimerDecoratedAssignment) assignment).stopTiming(); +assignment = new LocalAssignment(this.currentAssignment); --- End diff -- A few comments would be helpful. ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user Ethanlm commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r208279468 --- Diff: storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java --- @@ -251,11 +261,18 @@ private LocalizedResource getUserFile(String user, String key) { long localVersion = blob.getLocalVersion(); long remoteVersion = blob.getRemoteVersion(blobStore); if (localVersion != remoteVersion || !blob.isFullyDownloaded()) { +if (!blob.isFullyDownloaded()) { --- End diff -- Should be `if (!blob.isFullyDownloaded()) {` ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user zd-project commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r207645153 --- Diff: storm-client/src/jvm/org/apache/storm/daemon/supervisor/ClientSupervisorUtils.java --- @@ -28,11 +29,14 @@ import org.apache.storm.shade.org.apache.commons.lang.StringUtils; import org.apache.storm.utils.ConfigUtils; import org.apache.storm.utils.ObjectReader; +import org.apache.storm.utils.ShellUtils; import org.apache.storm.utils.Utils; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public class ClientSupervisorUtils { +public static final Meter numWorkerLaunchExceptions = ShellUtils.numShellExceptions; --- End diff -- Logged in Jira ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r207578433 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java --- @@ -1147,8 +1162,19 @@ public DynamicState withPendingLocalization(Future pendingDownload) { */ public DynamicState withState(final MachineState state) { long newStartTime = Time.currentTimeMillis(); +//TODO: potential lost metrics due to timing accuracy (Timer only tracks one call per millisecond) --- End diff -- Please don't leave TODOs int he code. Either fix it, file a follow on JIRA to fix it, or accept it and just have it be a comment and not a TODO. ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r207576736 --- Diff: storm-client/src/jvm/org/apache/storm/daemon/supervisor/ClientSupervisorUtils.java --- @@ -28,11 +29,14 @@ import org.apache.storm.shade.org.apache.commons.lang.StringUtils; import org.apache.storm.utils.ConfigUtils; import org.apache.storm.utils.ObjectReader; +import org.apache.storm.utils.ShellUtils; import org.apache.storm.utils.Utils; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public class ClientSupervisorUtils { +public static final Meter numWorkerLaunchExceptions = ShellUtils.numShellExceptions; --- End diff -- This all feels a bit too confusing to me. `ShellUtils.numShellExceptions` is static and pulled in from multiple different locations, but only registered once in the supervisor. I personally would rather see it where `ClientSupervisorUtils` registers the Meter, as it is tied to the supervisor having it do it here is fine. For ShellUtils, I would like to see it not have a static meter, but instead optionally pass a meter in when calling run, or perhaps optionally include it in the constructor. ---
[GitHub] storm pull request #2710: STORM-3099: Extend metrics on supervisor, workers ...
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/storm/pull/2710#discussion_r199710919 --- Diff: storm-server/src/main/java/org/apache/storm/daemon/drpc/DRPC.java --- @@ -208,6 +208,7 @@ public void failRequest(String id, DRPCExecutionException e) throws Authorizatio } } +//Should this be changed to private? --- End diff -- Leave this to the comment on PR or file an issue. TODO-like comment is tend to not be addressed afterwards. ---