[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968358#comment-14968358 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user wangli1426 commented on the pull request:

    https://github.com/apache/storm/pull/753#issuecomment-150077548

    Hi @d2r,
    Could you please review this PR? Your response would be highly appreciated. Thanks.

> Add throughput metric to spout/bolt and display them on web ui
> --------------------------------------------------------------
>
>                 Key: STORM-1057
>                 URL: https://issues.apache.org/jira/browse/STORM-1057
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: Li Wang
>            Assignee: Li Wang
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Throughput is a fundamental metric for reasoning about the performance
> bottleneck of a topology. Displaying the throughput of components and tasks
> on the web UI could greatly help the user identify the performance
> bottleneck and check whether the workload among components and tasks
> is balanced.
> What to do:
> 1. Measure the throughput of each spout/bolt.
> 2. Display the throughput metrics on web UI.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[GitHub] storm pull request: [STORM-1057] Add throughput metrics to spouts/...
Github user wangli1426 commented on the pull request: https://github.com/apache/storm/pull/753#issuecomment-150077548 Hi @d2r, Could you please review this PR? Your response is highly appreciated. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968355#comment-14968355 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user wangli1426 commented on the pull request:

    https://github.com/apache/storm/pull/753#issuecomment-150076986

    Hi @revans2,
    Thank you very much for your time and effort. Following your suggestions, I have made `update-values` a regular function and fixed the problem in `storm/STORM-UI-REST-API.md`.
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968354#comment-14968354 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user wangli1426 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/753#discussion_r42704561

    --- Diff: STORM-UI-REST-API.md ---
    @@ -351,11 +354,13 @@ Sample response:
                 "executors": 12,
                 "emitted": 184580,
                 "transferred": 0,
    +            "throughput": "195.000",
                 "acked": 184640,
                 "executeLatency": "0.048",
                 "tasks": 12,
                 "executed": 184620,
                 "processLatency": "0.043",
    +            "throughput":
    --- End diff --

    My bad. This will be fixed in this revision.
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968353#comment-14968353 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user wangli1426 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/753#discussion_r42704498

    --- Diff: storm-core/src/clj/backtype/storm/stats.clj ---
    @@ -277,15 +277,34 @@
         (value-stats stats SPOUT-FIELDS)
         {:type :spout}))

    +(defn values-divided-by [pairs t]
    +  (let [update-values (fn [m f & args]
    +                        (into {} (for [[k v] m] [k (apply f v args)])))]
    +    (update-values pairs / (double t
    --- End diff --

    Thanks for the comment. I will make `update-values` a regular function in this revision.
Re: Storm at Stackoverflow
Nice work, Matthias! Upvoted.

Jungtaek Lim (HeartSaVioR)

2015-10-22 1:16 GMT+09:00 Matthias J. Sax:

> Hi,
>
> currently, there are two tags (apache-storm and storm) used on SO. I
> just suggested "apache-storm" to be the main tag and "storm" to be a
> synonym for it. This ensures that all questions get tagged with a unique
> tag. Old and new questions get re-tagged from storm to apache-storm
> automatically if the synonym gets accepted. For this to happen, at least
> 4 upvotes must be cast.
>
> If you have an SO account, please upvote here:
> https://stackoverflow.com/tags/apache-storm/synonyms
>
> Thanks for your support!
>
> -Matthias

--
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
[jira] [Updated] (STORM-1017) If ignoreZkOffsets set true,KafkaSpout will reset zk offset when recover from failure.
[ https://issues.apache.org/jira/browse/STORM-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim updated STORM-1017:
--------------------------------
    Fix Version/s: 0.11.0

> If ignoreZkOffsets set true,KafkaSpout will reset zk offset when recover from failure.
> --------------------------------------------------------------------------------------
>
>                 Key: STORM-1017
>                 URL: https://issues.apache.org/jira/browse/STORM-1017
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-kafka
>            Reporter: Renkai Ge
>            Assignee: Priyank Shah
>             Fix For: 0.11.0
>
> When ignoreZkOffsets is set to true and startOffsetTime =
> kafka.api.OffsetRequest.EarliestTime():
> workers running -> topology shut down by the user and restarted -> workers
> will read from the earliest time again
> workers running -> one of the workers shuts down by accident and the
> supervisor restarts it -> what offset will the restarted worker read from?
> More details on
> https://github.com/apache/storm/pull/493#issuecomment-135783234
> It will cause a lot of unwanted duplicate messages in some scenarios.
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968155#comment-14968155 ]

ASF GitHub Bot commented on STORM-1121:
---------------------------------------

Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/810#issuecomment-150051486

    @kishorvpatil rebalance does not seem to work, because it requires the topology to be alive at that moment.

> Improve Nimbus Topology submission time
> ---------------------------------------
>
>                 Key: STORM-1121
>                 URL: https://issues.apache.org/jira/browse/STORM-1121
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Kishor Patil
>            Assignee: Kishor Patil
>
> It appears that nimbus blocks itself as the active topology count goes up,
> which increases the submitTopology response time exponentially for newly
> submitted topologies.
[jira] [Resolved] (STORM-1017) If ignoreZkOffsets set true,KafkaSpout will reset zk offset when recover from failure.
[ https://issues.apache.org/jira/browse/STORM-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyank Shah resolved STORM-1017.
---------------------------------
    Resolution: Fixed
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968137#comment-14968137 ]

ASF GitHub Bot commented on STORM-1121:
---------------------------------------

Github user kishorvpatil commented on the pull request:

    https://github.com/apache/storm/pull/810#issuecomment-150049582

    @HeartSaVioR Looking into the compilation issue. Somehow it did not get committed. Re-running the build.
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968116#comment-14968116 ]

ASF GitHub Bot commented on STORM-1121:
---------------------------------------

Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/810#issuecomment-150047429

    I also love the concept.

    Btw, in order to rely on the recurring mk-assignments:

    - We should remove `nimbus.reassign`, since it has to be true for new topologies to be assigned. We can't change the `when` statement, because that would make nimbus.reassign ineffective.
    - It would be better to let users know that a new topology will be assigned by nimbus not immediately but within `nimbus.monitor.freq.secs`.

    ```
    (schedule-recurring (:timer nimbus)
                        0
                        (conf NIMBUS-MONITOR-FREQ-SECS)
                        (fn []
                          (when (conf NIMBUS-REASSIGN)
                            (locking (:submit-lock nimbus)
                              (mk-assignments nimbus)))
                          (do-cleanup nimbus)
                          ))
    ```
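The concern in the comment above can be illustrated with a small sketch. It is written in Java rather than the Clojure used by nimbus, and every name in it (`RecurringAssignSketch`, `tick`, the two counters) is hypothetical; the counters only stand in for `mk-assignments` and `do-cleanup`. It shows that if assignment happens only inside the recurring task, a false `nimbus.reassign` flag means assignments never run, while cleanup still runs on every tick.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the recurring task discussed above; not Storm code.
final class RecurringAssignSketch {
    private final Object submitLock = new Object();  // stands in for (:submit-lock nimbus)
    private final boolean reassign;                  // stands in for nimbus.reassign
    final AtomicInteger assignments = new AtomicInteger();
    final AtomicInteger cleanups = new AtomicInteger();

    RecurringAssignSketch(boolean reassign) {
        this.reassign = reassign;
    }

    // Body of one timer tick: assignments run only when the flag is set,
    // cleanup runs on every tick.
    void tick() {
        if (reassign) {
            synchronized (submitLock) {
                assignments.incrementAndGet();       // placeholder for mk-assignments
            }
        }
        cleanups.incrementAndGet();                  // placeholder for do-cleanup
    }
}
```

In a real scheduler, `tick` would be registered once with something like `ScheduledExecutorService.scheduleAtFixedRate(sketch::tick, 0, freqSecs, TimeUnit.SECONDS)`, so a newly submitted topology would wait up to `freqSecs` seconds for its first assignment, which is the delay the second bullet asks to document.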
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968106#comment-14968106 ]

ASF GitHub Bot commented on STORM-1121:
---------------------------------------

Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/810#issuecomment-150045478

    @kishorvpatil You may want to check the Travis build failure; it seems there's a missing spot.

    > Exception in thread "main" java.lang.IllegalArgumentException: Unable to resolve classname: RebalanceOptions, compiling:(backtype/storm/transactional_test.clj:417:26)
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967757#comment-14967757 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/753#issuecomment-150007774

    I just have two minor questions now. After that I am +1, but I would like to hear from @d2r. He wrote a lot of the recent metrics code changes and I value his opinion in this area a lot.
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967756#comment-14967756 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/753#discussion_r42673357

    --- Diff: storm-core/src/clj/backtype/storm/stats.clj ---
    @@ -277,15 +277,34 @@
         (value-stats stats SPOUT-FIELDS)
         {:type :spout}))

    +(defn values-divided-by [pairs t]
    +  (let [update-values (fn [m f & args]
    +                        (into {} (for [[k v] m] [k (apply f v args)])))]
    +    (update-values pairs / (double t
    --- End diff --

    I would prefer to see update-values be a regular function instead of being defined each time values-divided-by is called.
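The review suggestion above, hoisting the helper out of the function that uses it, can be sketched as follows. This is a Java rendering rather than the PR's Clojure, and the class and method names (`ThroughputSketch`, `updateValues`, `valuesDividedBy`) are hypothetical. It also assumes that `t` is a window length in seconds and that the map values are tuple counts, which the thread does not state explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration; not code from the pull request.
final class ThroughputSketch {
    private ThroughputSketch() {}

    // Top-level helper, defined once: applies f to every value of m
    // and returns a new map with the same keys.
    static Map<String, Double> updateValues(Map<String, Double> m, UnaryOperator<Double> f) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : m.entrySet()) {
            out.put(e.getKey(), f.apply(e.getValue()));
        }
        return out;
    }

    // Divides every value by t; with counts and a window length in seconds
    // (an assumption), the result is a per-second rate.
    static Map<String, Double> valuesDividedBy(Map<String, Double> counts, double t) {
        return updateValues(counts, v -> v / t);
    }
}
```

Because `updateValues` is an ordinary static method, it is created once and can be reused by other callers, instead of being rebuilt as a local closure on every call.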
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967738#comment-14967738 ]

ASF GitHub Bot commented on STORM-1057:
---------------------------------------

Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/753#discussion_r42671953

    --- Diff: STORM-UI-REST-API.md ---
    @@ -351,11 +354,13 @@ Sample response:
                 "executors": 12,
                 "emitted": 184580,
                 "transferred": 0,
    +            "throughput": "195.000",
                 "acked": 184640,
                 "executeLatency": "0.048",
                 "tasks": 12,
                 "executed": 184620,
                 "processLatency": "0.043",
    +            "throughput":
    --- End diff --

    This seems to be missing a value.
[jira] [Resolved] (STORM-1115) Stale leader-lock key effectively bans all nodes from becoming leaders
[ https://issues.apache.org/jira/browse/STORM-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans resolved STORM-1115.
----------------------------------------
       Resolution: Fixed
    Fix Version/s: 0.11.0

Thanks [~danielschonfeld], I merged this into master. Keep up the good work.

> Stale leader-lock key effectively bans all nodes from becoming leaders
> ----------------------------------------------------------------------
>
>                 Key: STORM-1115
>                 URL: https://issues.apache.org/jira/browse/STORM-1115
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Daniel Schonfeld
>            Assignee: Daniel Schonfeld
>             Fix For: 0.11.0
>
> I believe this curator bug is what's in play causing the above described
> situation.
> https://issues.apache.org/jira/browse/CURATOR-202
> Whenever we were hit by this bug we'd start seeing problems in submitting
> topologies to nimbus, as well as having problems
> activating/deactivating/killing topologies. Basically any topology that
> utilizes the `is-leader` macro, since no nimbus believes itself to be the
> leader based on LeaderLatch.hasLeadership()
[jira] [Commented] (STORM-1115) Stale leader-lock key effectively bans all nodes from becoming leaders
[ https://issues.apache.org/jira/browse/STORM-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967732#comment-14967732 ]

ASF GitHub Bot commented on STORM-1115:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/802
[jira] [Updated] (STORM-1115) Stale leader-lock key effectively bans all nodes from becoming leaders
[ https://issues.apache.org/jira/browse/STORM-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated STORM-1115:
---------------------------------------
    Assignee: Daniel Schonfeld
[jira] [Commented] (STORM-1111) Fix Validation for lots of different configs
[ https://issues.apache.org/jira/browse/STORM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967715#comment-14967715 ]

ASF GitHub Bot commented on STORM-:
---
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/807#issuecomment-150001051

Just two minor nits and then I am +1 on this.

> Fix Validation for lots of different configs
>
> Key: STORM-
> URL: https://issues.apache.org/jira/browse/STORM-
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Robert Joseph Evans
> Assignee: Boyang Jerry Peng
>
> Once https://github.com/apache/storm/pull/785 goes in the validation logic is more obvious about what is happening, and we have a lot of configs for which the validation is incomplete. We should look at all of the configs and update the validation logic + comments to show what can be stored in these configs, and that we validate them correctly. The following is an incomplete list of some of these changes that need to be made.
> ```
> TOPOLOGY_ISOLATED_MACHINES needs @isPositiveNumber and @isInteger
> All of the 'ZMQ_' configs should be deprecated.
> TRANSACTIONAL_ZOOKEEPER_PORT needs @isPositiveNumber and @isInteger
> It would be great if we could restrict TOPOLOGY_LOGGING_SENSITIVITY to one of the allowed values "S0", "S1", "S2", "S3"
> TOPOLOGY_SHELLBOLT_MAX_PENDING needs @isPositiveNumber
> TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS needs @isPositiveNumber
> TOPOLOGY_MAX_ERROR_REPORT_PER_INTERVAL and TOPOLOGY_ERROR_THROTTLE_INTERVAL_SECS both seem to need @isPositiveNumber
> TOPOLOGY_TRANSFER_BUFFER_SIZE needs to be @isPowerOf2
> TOPOLOGY_ENVIRONMENT should be @isMapEntryType(keyType = String.class, valueType = String.class).
> TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS needs @isPositiveNumber(includeZero = true)
> TOPOLOGY_MAX_SPOUT_PENDING needs @isPositiveNumber
> TOPOLOGY_MAX_TASK_PARALLELISM needs @isPositiveNumber
> WORKER_METRICS and TOPOLOGY_WORKER_METRICS should be @isMapEntryType(keyType = String.class, valueType = String.class).
> TOPOLOGY_METRICS_CONSUMER_REGISTER should have a custom validator (you might not have time to do it, so we might need a follow on JIRA for this). Something like @isListEntryCustom(entryValidatorClasses={MetricRegistryValidator.class}) MetricRegistryValidator.class needs to check that it is a map, with a "class" key that points to a string, a "parallelism.hint" key that points to a positive non-null integer.
> TOPOLOGY_EVENTLOGGER_EXECUTORS needs @isPositiveNumber
> TOPOLOGY_ACKER_EXECUTORS needs @isPositiveNumber
> TOPOLOGY_TASKS and TOPOLOGY_WORKERS need @isPositiveNumber
> TASK_CREDENTIALS_POLL_SECS needs @isPositiveNumber
> TASK_REFRESH_POLL_SECS TASK_HEARTBEAT_FREQUENCY_SECS WORKER_HEARTBEAT_FREQUENCY_SECS and WORKER_RECEIVER_THREAD_COUNT need @isPositiveNumber
> SUPERVISOR_MONITOR_FREQUENCY_SECS and SUPERVISOR_HEARTBEAT_FREQUENCY_SECS need @isPositiveNumber
> SUPERVISOR_WORKER_SHUTDOWN_SLEEP_SECS needs @isPositiveNumber
> DRPC_HTTP_FILTER_PARAMS should be @isMapEntryType(keyType = String.class, valueType = String.class).
> DRPC_INVOCATIONS_THREADS and DRPC_INVOCATIONS_PORT need @isPositiveNumber
> DRPC_QUEUE_SIZE DRPC_MAX_BUFFER_SIZE and DRPC_WORKER_THREADS need @isPositiveNumber
> DRPC_AUTHORIZER_ACL needs to be a Map>>. This too probably needs a custom validator in a follow on JIRA.
> DRPC_PORT needs @isPositiveNumber
> DRPC_HTTPS_PORT and DRPC_HTTP_PORT need @isPositiveNumber
> UI_HTTPS_PORT and UI_HEADER_BUFFER_BYTES need @isPositiveNumber
> UI_FILTER_PARAMS should be @isMapEntryType(keyType = String.class, valueType = String.class).
> LOGVIEWER_HTTPS_PORT needs @isPositiveNumber
> LOGVIEWER_PORT and UI_PORT need @isPositiveNumber
> NIMBUS_CREDENTIAL_RENEW_FREQ_SECS needs @isPositiveNumber
> NIMBUS_IMPERSONATION_ACL needs to be updated, because I don't think it is a Map of string to map. It is more complex than that.
> NIMBUS_TASK_LAUNCH_SECS NIMBUS_SUPERVISOR_TIMEOUT_SECS NIMBUS_INBOX_JAR_EXPIRATION_SECS NIMBUS_CLEANUP_INBOX_FREQ_SECS NIMBUS_MONITOR_FREQ_SECS and NIMBUS_TASK_TIMEOUT_SECS need @isPositiveNumber
> NIMBUS_THRIFT_MAX_BUFFER_SIZE needs @isPositiveNumber
> NIMBUS_THRIFT_THREADS and NIMBUS_THRIFT_PORT need @isPositiveNumber
> ```
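To make the annotation-driven validation described in the issue concrete, here is a minimal sketch of how annotations such as `@isInteger` and `@isPositiveNumber(includeZero = true)` could be checked by reflection. Everything in it is an assumption for illustration: the annotations are local stand-ins declared in the snippet, and `SketchConfig` and `SketchConfigValidator` are hypothetical names, not Storm's actual `Config` class or validator code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Local stand-ins for the annotations named in the issue (NOT Storm's own definitions).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface isInteger {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface isPositiveNumber {
    boolean includeZero() default false;
}

// Hypothetical config holder: each constant's value is the key looked up in the conf map.
class SketchConfig {
    @isInteger
    @isPositiveNumber
    static final String TOPOLOGY_WORKERS = "topology.workers";

    @isPositiveNumber(includeZero = true)
    static final String TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS =
            "topology.sleep.spout.wait.strategy.time.ms";
}

// Hypothetical validator that walks the annotated constants of SketchConfig.
class SketchConfigValidator {
    /** Returns one message per violated annotation; an empty list means the map passed. */
    static List<String> validate(Map<String, Object> conf) {
        List<String> errors = new ArrayList<>();
        for (Field field : SketchConfig.class.getDeclaredFields()) {
            // Only static String constants are treated as config keys.
            if (!Modifier.isStatic(field.getModifiers()) || field.getType() != String.class) {
                continue;
            }
            String key;
            try {
                key = (String) field.get(null);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            Object value = conf.get(key);
            if (value == null) {
                continue; // unset keys are skipped in this sketch
            }
            if (field.isAnnotationPresent(isInteger.class)
                    && !(value instanceof Integer || value instanceof Long)) {
                errors.add(key + " must be an integer, got " + value);
            }
            isPositiveNumber positive = field.getAnnotation(isPositiveNumber.class);
            if (positive != null) {
                boolean ok = value instanceof Number
                        && (positive.includeZero()
                                ? ((Number) value).doubleValue() >= 0
                                : ((Number) value).doubleValue() > 0);
                if (!ok) {
                    errors.add(key + " must be a positive number, got " + value);
                }
            }
        }
        return errors;
    }
}
```

Calling `SketchConfigValidator.validate(conf)` with `topology.workers` set to `0` yields a single error, because the `@isPositiveNumber` stand-in on that constant does not set `includeZero`. Keeping each rule next to the constant it guards is the property the issue asks for: the annotation documents what may be stored and drives the check at the same time.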
[jira] [Commented] (STORM-1111) Fix Validation for lots of different configs
[ https://issues.apache.org/jira/browse/STORM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967714#comment-14967714 ] ASF GitHub Bot commented on STORM-: --- Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/807#discussion_r42669817 --- Diff: storm-core/src/jvm/backtype/storm/Config.java --- @@ -1215,6 +1254,7 @@ * to be equal to the number of workers configured for this topology. If this variable is set to 0, * event logging will be disabled. */ +@isPositiveNumber --- End diff -- This is a really minor nit, but everywhere else `@isInteger` is above `@isPositiveNumber` it would be nice to be consistent.
[GitHub] storm pull request: [STORM-1111] - Fix Validation for lots of diff...
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/807#discussion_r42669817

    --- Diff: storm-core/src/jvm/backtype/storm/Config.java ---
    @@ -1215,6 +1254,7 @@
          * to be equal to the number of workers configured for this topology. If this variable is set to 0,
          * event logging will be disabled.
          */
    +@isPositiveNumber
    --- End diff --

This is a really minor nit, but everywhere else `@isInteger` is above `@isPositiveNumber` it would be nice to be consistent.
[jira] [Commented] (STORM-1111) Fix Validation for lots of different configs
[ https://issues.apache.org/jira/browse/STORM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967711#comment-14967711 ] ASF GitHub Bot commented on STORM-: --- Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/807#discussion_r42669491 --- Diff: storm-core/src/jvm/backtype/storm/Config.java --- @@ -247,6 +253,7 @@ * * Defaults to false. */ +@Deprecated --- End diff -- This should actually not be deprecated. It says zmq, but it is not ZMQ. It should probably be renamed, but that is a follow on JIRA.
[GitHub] storm pull request: [STORM-1111] - Fix Validation for lots of diff...
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/807#discussion_r42669491

    --- Diff: storm-core/src/jvm/backtype/storm/Config.java ---
    @@ -247,6 +253,7 @@
          *
          * Defaults to false.
          */
    +@Deprecated
    --- End diff --

This should actually not be deprecated. It says zmq, but it is not ZMQ. It should probably be renamed, but that is a follow on JIRA.
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967689#comment-14967689 ] ASF GitHub Bot commented on STORM-1121: --- Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/810#issuecomment-149996906 I love the concept, but I would like to see the tests updated so we are not adding several mins to the time it takes to run the unit tests. > Improve Nimbus Topology submission time > --- > > Key: STORM-1121 > URL: https://issues.apache.org/jira/browse/STORM-1121 > Project: Apache Storm > Issue Type: Bug >Reporter: Kishor Patil >Assignee: Kishor Patil > > It appears, nimbus is blocking itself as active topologies count goes up. It > increases submitTopology response time exponentially for submission of newer > topology. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] storm pull request: [STORM-1121] Remove method call to avoid overh...
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/810#issuecomment-149996906 I love the concept, but I would like to see the tests updated so we are not adding several mins to the time it takes to run the unit tests.
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967654#comment-14967654 ] ASF GitHub Bot commented on STORM-1121: --- Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/810#discussion_r42666392 --- Diff: storm-core/test/clj/backtype/storm/integration_test.clj --- @@ -236,6 +237,7 @@ "acking-test1" {} (:topology tracked)) + (Thread/sleep 11000) --- End diff -- Is there a way we can force mk-assignments to be called? I don't really like adding around 1 min to the time it takes to run the tests needlessly.
[GitHub] storm pull request: [STORM-1121] Remove method call to avoid overh...
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/810#discussion_r42666392

    --- Diff: storm-core/test/clj/backtype/storm/integration_test.clj ---
    @@ -236,6 +237,7 @@
     "acking-test1"
     {}
     (:topology tracked))
    + (Thread/sleep 11000)
    --- End diff --

Is there a way we can force mk-assignments to be called? I don't really like adding around 1 min to the time it takes to run the tests needlessly.
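The concern above is the cost of a fixed `Thread/sleep` in the tests. One generic alternative, sketched here under assumptions, is to poll for the expected state and return as soon as it holds, so the full timeout is only paid on failure. This does not force mk-assignments to run, which is what the review actually asks for; `SketchAwait` and `awaitCondition` are hypothetical names, not part of Storm's test utilities.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper: wait for a condition instead of sleeping a fixed time.
class SketchAwait {
    /**
     * Polls the condition every pollMs milliseconds until it holds or
     * timeoutMs milliseconds have elapsed. Returns true if the condition held.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test would then call something like `SketchAwait.awaitCondition(() -> assignmentsExist(), 11000, 50)`, where `assignmentsExist` is a placeholder for whatever check the test can make. In the common case the wait ends within one poll interval of the assignment appearing rather than after the full 11 seconds.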
[jira] [Commented] (STORM-855) Add tuple batching
[ https://issues.apache.org/jira/browse/STORM-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967622#comment-14967622 ] ASF GitHub Bot commented on STORM-855: -- Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/765#issuecomment-149987537
[GitHub] storm pull request: Disruptor batching v2
Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/765#issuecomment-149987537

So I have been running a number of tests trying to come to a conclusive decision on how storm should handle batching, and trying to understand the difference between my test results and the test results from #694. I ran the word count test I wrote as a part of #805 on a 35 node storm cluster. This was done against several different storm versions: the baseline in the #805 pull request; this patch + #805 (batch-v2); and #694 + #805 + modifications to use the hybrid approach to enable acking and batch to work in a multi-process topology (STORM-855). To avoid having all of the numbers be hard to parse I am just going to include some charts, but if anyone wants to see the raw numbers or reproduce it themselves I am happy to provide data and/or branches.

The numbers below were collected after the topology had been running for at least 200 seconds. This is to avoid startup issues like JIT etc. I filtered out any 30 second interval where the measured throughput was not within +/- 10% of the target throughput, on the assumption that if the topology cannot keep up with the desired throughput or it was trying to catch up from previous slowness it would not be within that range. I did not filter based off of the number of failures that happened, simply because that would have resulted in removing all of the STORM-855 with batching enabled results. None of the other test configurations saw any failures at all during testing.

![throughput-vs-latency](https://cloud.githubusercontent.com/assets/3441321/10644336/d0393222-77ed-11e5-849a-0b6be6ac5178.png)

This shows the 99%-ile latency vs measured throughput. It is not too interesting except to note that batching in STORM-855 at low throughput resulted in nothing being fully processed. All of the tuples timed out before they could finish. Only at a medium throughput above 16,000 sentences/second were we able to maintain enough tuples to complete batches regularly, but even then many tuples would still time out. This should be able to be fixed with a batch timeout, but that is not implemented yet.

To get a better view I adjusted the latency to be a log scale.

![throughput-vs-latency-log](https://cloud.githubusercontent.com/assets/3441321/10644335/d02ab29c-77ed-11e5-883e-a647f6b4279b.png)

From this we can see that on the very low end batching-v2 is increasing the 99%-ile latency from 5-10 ms to 19-21 ms. Most of that you can get back by configuring the batch size to 1, instead of the default 100 tuples. However, once the baseline stops functioning at around 7000 sentences/sec the batching code is able to continue working, with either a batch size of 1 or 100. I believe that this has to do with the automatic backpressure. In the baseline code backpressure does not take into account the overflow buffer, but in the batching code it does. I think this gives the topology more stability in maintaining a throughput, but I don't have any solid evidence for that.

I then zoomed in on the graphs to show what a 2 second SLA would look like

![throughput-vs-latency-2-sec](https://cloud.githubusercontent.com/assets/3441321/10644332/d0176f5c-77ed-11e5-98c4-d2e7a9e48c70.png)

and a 100 ms SLA.

![throughput-vs-latency-100-ms](https://cloud.githubusercontent.com/assets/3441321/10644334/d0291540-77ed-11e5-9fb3-9c9c97f504f9.png)

In both cases the batching v2 with a batch size of 100 was able to handle the highest throughput for that given latency.

Then I wanted to look at memory and CPU utilization.

![throughput-vs-mem](https://cloud.githubusercontent.com/assets/3441321/10644337/d03c3094-77ed-11e5-8cda-cf53fe3a2389.png)

Memory does not show much. The amount of memory used varies a bit from one to the other, but if you realize this is for 35 worker processes it is varying from 70 MB/worker to about 200 MB/worker. The numbers simply show that as the throughput increases the memory utilization does too, and it does not vary too much from one implementation to another.

![throughput-vs-cpu](https://cloud.githubusercontent.com/assets/3441321/10645834/6ba799e0-77f5-11e5-88fd-7e09475a5b6c.png)

CPU however shows that on the low end we are going from 7 or 8 cores worth of CPU time to about 35 cores worth for the batching code. This seems to be the result of the batch flushing threads waking up periodically. We should be able to mitigate this by adjusting that interval to be larger, but that would in turn impact the latency. I believe that with further work we should be able to reduce that CPU utilization and the latency on the low end by dynamically adjusting the batch size and timeout based off of a specified SLA. At this point I feel this branch is ready for a formal revi
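The comment above notes that size-only batching stalls at low throughput and that a batch timeout should fix it. As a rough illustration of that idea, the sketch below flushes a batch either when it reaches a size limit or when its oldest element has waited too long. It is an assumption-laden toy, not the disruptor batching code in either pull request: `SketchBatcher`, `add`, and `tick` are hypothetical names, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batcher: flush on size OR on age of the oldest pending item.
class SketchBatcher<T> {
    private final int maxSize;
    private final long maxAgeMs;
    private final Consumer<List<T>> sink;
    private List<T> pending = new ArrayList<>();
    private long oldestMs; // arrival time of the first item in the pending batch

    SketchBatcher(int maxSize, long maxAgeMs, Consumer<List<T>> sink) {
        this.maxSize = maxSize;
        this.maxAgeMs = maxAgeMs;
        this.sink = sink;
    }

    /** Adds an item; flushes immediately once the batch is full. */
    void add(T item, long nowMs) {
        if (pending.isEmpty()) {
            oldestMs = nowMs;
        }
        pending.add(item);
        if (pending.size() >= maxSize) {
            flush();
        }
    }

    /** Called periodically (for example by a flusher thread); flushes a stale partial batch. */
    void tick(long nowMs) {
        if (!pending.isEmpty() && nowMs - oldestMs >= maxAgeMs) {
            flush();
        }
    }

    private void flush() {
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }
}
```

With `maxSize = 100` and a low input rate, `tick` is what bounds latency: a partial batch is emitted once it is `maxAgeMs` old instead of waiting for 100 items. How often `tick` runs is the trade-off mentioned in the comment, since a shorter interval lowers latency but wakes the flushing thread more often.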
[GitHub] storm pull request: Minor grammar fix to FAQ
Github user jerrypeng commented on the pull request: https://github.com/apache/storm/pull/808#issuecomment-149974237 +1
[jira] [Commented] (STORM-1121) Improve Nimbus Topology submission time
[ https://issues.apache.org/jira/browse/STORM-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967521#comment-14967521 ]

ASF GitHub Bot commented on STORM-1121:
---
GitHub user kishorvpatil opened a pull request:

    https://github.com/apache/storm/pull/810

    [STORM-1121] Remove method call to avoid overhead during topology submission time

Nimbus calls mk-assignments from SubmitTopology while holding a lock, which makes the submission wait for the processing of all heartbeats. This also conflicts with the recurring mk-assignments call. A topology can be submitted without making assignments, because for a new topology the assignments are made as part of the next scheduling cycle. This gives a more consistent response time because it avoids locking within Nimbus.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kishorvpatil/incubator-storm STORM-1121

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/810.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #810

commit 1593a374a4a381238f39fd9b0d078586d1aaa305
Author: Kishor Patil
Date: 2015-10-19T23:32:18Z

    Remove method call to avoid overhead during topology submission time
[GitHub] storm pull request: [STORM-1121] Remove method call to avoid overh...
GitHub user kishorvpatil opened a pull request:

    https://github.com/apache/storm/pull/810

    [STORM-1121] Remove method call to avoid overhead during topology submission time

Nimbus calls mk-assignments from SubmitTopology while holding a lock, which makes the submission wait for the processing of all heartbeats. This also conflicts with the recurring mk-assignments call. A topology can be submitted without making assignments, because for a new topology the assignments are made as part of the next scheduling cycle. This gives a more consistent response time because it avoids locking within Nimbus.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kishorvpatil/incubator-storm STORM-1121

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/810.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #810

commit 1593a374a4a381238f39fd9b0d078586d1aaa305
Author: Kishor Patil
Date: 2015-10-19T23:32:18Z

    Remove method call to avoid overhead during topology submission time
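The change described in this pull request can be pictured with a small sketch: submission only records the topology, and a separate recurring cycle does the assignment work under the lock. This is an illustration under assumptions, not Nimbus code. `SketchScheduler`, `submitTopology`, and `runSchedulingCycle` are hypothetical names, and the "assignment" is just a slot number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical scheduler: submit is cheap, assignments happen in the periodic cycle.
class SketchScheduler {
    private final Object schedulingLock = new Object();
    private final Queue<String> unassigned = new ConcurrentLinkedQueue<>();
    private final Map<String, Integer> assignments = new ConcurrentHashMap<>();
    private int nextSlot = 0; // guarded by schedulingLock

    /** Records the topology without taking the scheduling lock. */
    void submitTopology(String name) {
        unassigned.add(name);
    }

    /** The recurring cycle: assigns every topology submitted since the last run. */
    List<String> runSchedulingCycle() {
        List<String> assignedNow = new ArrayList<>();
        synchronized (schedulingLock) {
            String name;
            while ((name = unassigned.poll()) != null) {
                assignments.put(name, nextSlot++);
                assignedNow.add(name);
            }
        }
        return assignedNow;
    }

    boolean isAssigned(String name) {
        return assignments.containsKey(name);
    }
}
```

In this arrangement `submitTopology` returns in roughly constant time regardless of how long a scheduling cycle takes. The cost is that a newly submitted topology stays unassigned until the next cycle runs, which is the delay the test changes in this pull request wait for.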
[jira] [Created] (STORM-1121) Improve Nimbus Topology submission time
Kishor Patil created STORM-1121: --- Summary: Improve Nimbus Topology submission time Key: STORM-1121 URL: https://issues.apache.org/jira/browse/STORM-1121 Project: Apache Storm Issue Type: Bug Reporter: Kishor Patil Assignee: Kishor Patil It appears, nimbus is blocking itself as active topologies count goes up. It increases submitTopology response time exponentially for submission of newer topology. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Storm at Stackoverflow
Hi, currently there are two tags (apache-storm and storm) in use on SO. I just suggested making "apache-storm" the main tag and "storm" a synonym for it. This ensures that all questions are tagged with a single tag. Old and new questions are automatically re-tagged from storm to apache-storm if the synonym is accepted. For this to happen, at least 4 upvotes must be cast. If you have an SO account, please upvote here: https://stackoverflow.com/tags/apache-storm/synonyms Thanks for your support! -Matthias
[jira] [Commented] (STORM-1057) Add throughput metric to spout/bolt and display them on web ui
[ https://issues.apache.org/jira/browse/STORM-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967379#comment-14967379 ] ASF GitHub Bot commented on STORM-1057: --- Github user wangli1426 commented on the pull request: https://github.com/apache/storm/pull/753#issuecomment-149948793 Hi @revans2, Recent commits to the master branch caused conflicts with my PR, so I upmerged my PR in 4de33d1. Could you please review the code again? Thank you very much.
[GitHub] storm pull request: [STORM-1057] Add throughput metrics to spouts/...
Github user wangli1426 commented on the pull request: https://github.com/apache/storm/pull/753#issuecomment-149948793 Hi @revans2, Recent commits to the master branch caused conflicts with my PR, so I upmerged my PR in 4de33d1. Could you please review the code again? Thank you very much.
[GitHub] storm pull request: fix keyword (schema -> scheme) from main-route...
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/storm/pull/809 fix keyword (schema -> scheme) from main-routes You can merge this pull request into a Git repository by running: $ git pull https://github.com/HeartSaVioR/storm STORM-1120 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/809.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #809 commit 90eadfba0aaf62d45b719153eecef14712132546 Author: Jungtaek Lim Date: 2015-10-21T14:10:36Z fix keyword (schema -> scheme) from main-routes
[jira] [Created] (STORM-1120) main-routes in backtype.storm.ui.core uses wrong keyword - schema
Jungtaek Lim created STORM-1120: --- Summary: main-routes in backtype.storm.ui.core uses wrong keyword - schema Key: STORM-1120 URL: https://issues.apache.org/jira/browse/STORM-1120 Project: Apache Storm Issue Type: Bug Components: storm-core Affects Versions: 0.11.0 Reporter: Jungtaek Lim Assignee: Jungtaek Lim We're using both the 'schema' and 'scheme' keywords in main-routes, but [~knusbaum] confirmed that 'scheme' is correct. https://github.com/apache/storm/pull/717#issuecomment-146656531