[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/442 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user jmartell7 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-92563897 @Lewuathe, I think this would be a great addition, but I'm a little concerned about overhead for very simple serialization/deserialization. Have you measured the impact of collecting this data? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Lewuathe commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-91247461 I updated to remove unrelated lines and check the state of new metrics. Can you review it? Thank you so much! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Lewuathe commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-90081039 @Parth-Brahmbhatt Thank you for advice. I merged master and updated this patch. Can you review it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-90122615 Overall things look fairly good. The only issue is that we are adding in new data to the heartbeat. If we want to be able to do a rolling upgrade we need to make sure that the new stat change is optional, and that we don't expect it to always be there when we read. For example there is a topology up and running. If nimbus is upgraded before any of the workers are, nimbus will crash because the new data is not there. Similarly if the ui is updated before nimbus is. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/442#discussion_r27805696 --- Diff: storm-core/src/storm.thrift --- @@ -174,12 +174,14 @@ struct BoltStats { 3: required mapstring, mapGlobalStreamId, double process_ms_avg; 4: required mapstring, mapGlobalStreamId, i64 executed; 5: required mapstring, mapGlobalStreamId, double execute_ms_avg; + 6: required mapstring, mapGlobalStreamId, double deserialize_time; --- End diff -- This is still marked as required, if we want a rolling upgrade this needs to be optional, possibly with a reasonable default, and every time we access it, mostly for reads, we need to check to see if it is set. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Parth-Brahmbhatt commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-90117819 I am +1 once you remove all the generated files that has no change other then the generated date. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/442#discussion_r27805711 --- Diff: storm-core/src/storm.thrift --- @@ -174,12 +174,14 @@ struct BoltStats { 3: required mapstring, mapGlobalStreamId, double process_ms_avg; 4: required mapstring, mapGlobalStreamId, i64 executed; 5: required mapstring, mapGlobalStreamId, double execute_ms_avg; + 6: required mapstring, mapGlobalStreamId, double deserialize_time; } struct SpoutStats { 1: required mapstring, mapstring, i64 acked; 2: required mapstring, mapstring, i64 failed; 3: required mapstring, mapstring, double complete_ms_avg; + 4: required mapstring, mapstring, double deserialize_time; --- End diff -- This is still marked as required, if we want a rolling upgrade this needs to be optional, possibly with a reasonable default, and every time we access it, mostly for reads, we need to check to see if it is set. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Lewuathe commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-89586916 @revans2 Thank you for supporting. I'll change this metrics as optional and thrift interface. So all I have to do it 2 points, @Parth-Brahmbhatt ? If there is any other points I should care about, please let me know. * Store metrics in thrift in ZooKeeper. * Change serialization as optional metrics. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-89333447 @Lewuathe, I am really sorry that this took so long. Now that https://issues.apache.org/jira/browse/STORM-634 has gone in, the code does not merge cleanly. The big difference is that now we are storing the metrics in thrift in zookeeper. We also want to support rolling upgrades, so we need to make sure that the new metrics are optional in both the code, and in thrift. If this is something that you just don't have time to finish I understand, and I'll try to come up with a patch at some point. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-89333934 @Lewuathe I forgot to mention that @Parth-Brahmbhatt might be able to help you out in understanding the changes you need to make to stats for this to work. Especially the rolling upgrade part. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Lewuathe commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-76501848 @revans2 Thank you so much for reviewing. I updated to solve above exception and include builtin metrics. Please check it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user Lewuathe commented on a diff in the pull request: https://github.com/apache/storm/pull/442#discussion_r25424494 --- Diff: storm-core/src/clj/backtype/storm/stats.clj --- @@ -164,13 +164,13 @@ (def COMMON-FIELDS [:emitted :transferred]) (defrecord CommonStats [emitted transferred rate]) -(def BOLT-FIELDS [:acked :failed :process-latencies :executed :execute-latencies]) +(def BOLT-FIELDS [:acked :failed :process-latencies :executed :execute-latencies :deserialize-time]) ;;acked and failed count individual tuples -(defrecord BoltExecutorStats [common acked failed process-latencies executed execute-latencies]) +(defrecord BoltExecutorStats [common acked failed process-latencies executed execute-latencies deserialize-time]) -(def SPOUT-FIELDS [:acked :failed :complete-latencies]) +(def SPOUT-FIELDS [:acked :failed :complete-latencies :deserialize-time]) --- End diff -- This might be the point I didn't fully understand. In [here](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/executor.clj#L413-L429), all tuple to even spout can go through task-receiver right? I think these values are sent as bolt ones. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-76253245 When I run a topology with acing actually turned on I get the following in the nimbus log in response to trying to view the UI page. The topology seems to be running just fine though. ``` 2015-02-26T13:38:38.319-0600 o.a.t.s.AbstractNonblockingServer$FrameBuffer [ERROR] Unexpected throwable while invoking! java.lang.ClassCastException: clojure.lang.PersistentVector cannot be cast to java.lang.String at backtype.storm.generated.SpoutStats$SpoutStatsStandardScheme.write(SpoutStats.java:929) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.SpoutStats$SpoutStatsStandardScheme.write(SpoutStats.java:702) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.SpoutStats.write(SpoutStats.java:616) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorSpecificStats.standardSchemeWriteValue(ExecutorSpecificStats.java:220) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.TUnion$TUnionStandardScheme.write(TUnion.java:244) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.TUnion$TUnionStandardScheme.write(TUnion.java:213) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.TUnion.write(TUnion.java:152) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorStats$ExecutorStatsStandardScheme.write(ExecutorStats.java:705) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorStats$ExecutorStatsStandardScheme.write(ExecutorStats.java:563) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorStats.write(ExecutorStats.java:489) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorSummary$ExecutorSummaryStandardScheme.write(ExecutorSummary.java:862) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorSummary$ExecutorSummaryStandardScheme.write(ExecutorSummary.java:763) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.ExecutorSummary.write(ExecutorSummary.java:655) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.TopologyInfo$TopologyInfoStandardScheme.write(TopologyInfo.java:1118) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.TopologyInfo$TopologyInfoStandardScheme.write(TopologyInfo.java:976) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.TopologyInfo.write(TopologyInfo.java:848) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.Nimbus$getTopologyInfoWithOpts_result$getTopologyInfoWithOpts_resultStandardScheme.write(Nimbus.java:18525) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.Nimbus$getTopologyInfoWithOpts_result$getTopologyInfoWithOpts_resultStandardScheme.write(Nimbus.java:18471) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.generated.Nimbus$getTopologyInfoWithOpts_result.write(Nimbus.java:18406) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.ProcessFunction.process(ProcessFunction.java:53) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.TBaseProcessor.process(TBaseProcessor.java:39) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at backtype.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:156) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at org.apache.thrift7.server.Invocation.run(Invocation.java:18) [storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0] at java.lang.Thread.run(Thread.java:744) [na:1.8.0] ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/442#discussion_r25346450 --- Diff: storm-core/src/clj/backtype/storm/stats.clj --- @@ -164,13 +164,13 @@ (def COMMON-FIELDS [:emitted :transferred]) (defrecord CommonStats [emitted transferred rate]) -(def BOLT-FIELDS [:acked :failed :process-latencies :executed :execute-latencies]) +(def BOLT-FIELDS [:acked :failed :process-latencies :executed :execute-latencies :deserialize-time]) ;;acked and failed count individual tuples -(defrecord BoltExecutorStats [common acked failed process-latencies executed execute-latencies]) +(defrecord BoltExecutorStats [common acked failed process-latencies executed execute-latencies deserialize-time]) -(def SPOUT-FIELDS [:acked :failed :complete-latencies]) +(def SPOUT-FIELDS [:acked :failed :complete-latencies :deserialize-time]) --- End diff -- Are we even collecting de-serialize time for the spout? I didn't see it anywhere. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/442#discussion_r25346357 --- Diff: storm-core/src/clj/backtype/storm/daemon/executor.clj --- @@ -410,7 +410,11 @@ (disruptor/clojure-handler (fn [tuple-batch sequence-id end-of-batch?] (fast-list-iter [[task-id msg] tuple-batch] - (let [^TupleImpl tuple (if (instance? Tuple msg) msg (.deserialize deserializer msg))] + (let [[deserialize-time ^TupleImpl tuple] (if (instance? Tuple msg) [0.0 msg] (with-time (.deserialize deserializer msg)))] --- End diff -- In all other cases the timings were sampled (we didn't measure everything). Not totally sure if we can do the same thing here or not. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/442#issuecomment-75978633 For the most part things look good. I really would like to see these stats sent to the MetricsConsumer instances, not just the UI (the systems are separate). I also would like to see the documentation for the restful web service https://github.com/apache/storm/blob/master/STORM-UI-REST-API.md updated to include the new fields you added. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm pull request: [STORM-671] Measure tuple serialization/deseri...
GitHub user Lewuathe opened a pull request: https://github.com/apache/storm/pull/442 [STORM-671] Measure tuple serialization/deserialization latency. Measure deserialization time in spout and bolt. These values can be see on dashboard UI. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Lewuathe/storm deserialization-time Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/442.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #442 commit f5bc66f32b8bbb4d770ebb245ba6590d97ff2c1d Author: lewuathe lewua...@me.com Date: 2015-02-25T10:02:32Z [STORM-671] Measure tuple serialization/deserialization latency. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---