[jira] [Updated] (STORM-3626) storm-kafka-migration should pull in storm-client as "provided" dependency
[ https://issues.apache.org/jira/browse/STORM-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3626: -- Labels: pull-request-available (was: ) > storm-kafka-migration should pull in storm-client as "provided" dependency > -- > > Key: STORM-3626 > URL: https://issues.apache.org/jira/browse/STORM-3626 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > https://github.com/apache/storm/blob/master/external/storm-kafka-migration/pom.xml#L34-L39 > {code:java} > <dependency> > <groupId>org.apache.storm</groupId> > <artifactId>storm-client</artifactId> > <version>${project.version}</version> > </dependency> > {code} > It is a "compile" dependency as of now. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3623) v2 metrics tick reports all worker metrics within each executor
[ https://issues.apache.org/jira/browse/STORM-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3623: -- Labels: pull-request-available (was: ) > v2 metrics tick reports all worker metrics within each executor > --- > > Key: STORM-3623 > URL: https://issues.apache.org/jira/browse/STORM-3623 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > see > https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/Executor.java#L335-L341 > {code:java} > private void addV2Metrics(List dataPoints) { > boolean enableV2MetricsDataPoints = > ObjectReader.getBoolean(topoConf.get(Config.TOPOLOGY_ENABLE_V2_METRICS_TICK), > false); > if (!enableV2MetricsDataPoints) { > return; > } > StormMetricRegistry stormMetricRegistry = > workerData.getMetricRegistry(); > {code} > This should be reporting just the metrics for the Executor. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3624) Race condition on ArtifactoryConfigLoader.load
[ https://issues.apache.org/jira/browse/STORM-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3624: -- Labels: pull-request-available (was: ) > Race condition on ArtifactoryConfigLoader.load > -- > > Key: STORM-3624 > URL: https://issues.apache.org/jira/browse/STORM-3624 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Priority: Major > Labels: pull-request-available > > https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/ResourceAwareScheduler.java#L100-L102 > config() is called in multiple threads. But ArtifactoryConfigLoader.load is > not thread-safe. For example, > https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/scheduler/utils/ArtifactoryConfigLoader.java#L181-L187 > {code:java} > JSONObject returnValue; > try { > returnValue = (JSONObject) jsonParser.parse(metadataStr); > } catch (ParseException e) { > LOG.error("Could not parse JSON string {}", metadataStr, e); > return null; > } > {code} > Multiple threads use the same jsonParser and since JsonParser is not > thread-safe, the return value will be corrupted. > I propose to create a separate thread to load scheduler configs periodically. > This also makes the config loading logic cleaner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
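The proposal in STORM-3624 (one dedicated thread loads the scheduler config periodically, and scheduling threads only read the latest snapshot) could look roughly like the sketch below. All names here are hypothetical and not taken from the Storm code base: PeriodicConfigLoader is invented for illustration, and the Supplier merely stands in for a non-thread-safe loader such as ArtifactoryConfigLoader.load.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: the loader is only ever invoked from one thread at a time,
// while any number of threads may read the most recent snapshot.
class PeriodicConfigLoader implements AutoCloseable {
    private final Supplier<Map<String, Object>> loader; // stand-in for the real, non-thread-safe loader
    private final AtomicReference<Map<String, Object>> latest =
        new AtomicReference<>(Collections.emptyMap());
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "scheduler-config-loader");
            t.setDaemon(true);
            return t;
        });

    PeriodicConfigLoader(Supplier<Map<String, Object>> loader) {
        this.loader = loader;
    }

    // Load once on the calling thread, then refresh on the single timer thread.
    void start(long period, TimeUnit unit) {
        reload();
        timer.scheduleWithFixedDelay(this::reload, period, period, unit);
    }

    // Replaces the snapshot; keeps the previous one if the load fails or returns null.
    void reload() {
        try {
            Map<String, Object> loaded = loader.get();
            if (loaded != null) {
                latest.set(Collections.unmodifiableMap(loaded));
            }
        } catch (RuntimeException e) {
            // Swallowed on purpose: an exception escaping a scheduled task cancels later runs.
        }
    }

    // Cheap and safe to call from many scheduling threads.
    Map<String, Object> config() {
        return latest.get();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

With this shape, config() never parses anything itself, so a shared parser instance is no longer touched by several threads at once.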
[jira] [Updated] (STORM-3622) Race Condition in CachedThreadStatesGaugeSet registered at SystemBolt
[ https://issues.apache.org/jira/browse/STORM-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3622: -- Labels: pull-request-available (was: ) > Race Condition in CachedThreadStatesGaugeSet registered at SystemBolt > - > > Key: STORM-3622 > URL: https://issues.apache.org/jira/browse/STORM-3622 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > We noticed that with the change in https://github.com/apache/storm/pull/3242, > there is a race condition causing NPE. > {code:java} > 2020-04-14 18:22:12.997 o.a.s.u.Utils Thread-17-__acker-executor[16, 16] > [ERROR] Async loop died! > java.lang.RuntimeException: java.lang.NullPointerException > at org.apache.storm.executor.Executor.accept(Executor.java:291) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:131) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.JCQueue.consume(JCQueue.java:111) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:172) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.Utils$1.run(Utils.java:434) > [storm-client-2.2.0.y.jar:2.2.0.y] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242] > Caused by: java.lang.NullPointerException > at > com.codahale.metrics.jvm.ThreadStatesGaugeSet.getThreadCount(ThreadStatesGaugeSet.java:95) > ~[metrics-jvm-3.2.6.jar:3.2.6] > at > com.codahale.metrics.jvm.ThreadStatesGaugeSet.access$000(ThreadStatesGaugeSet.java:20) > ~[metrics-jvm-3.2.6.jar:3.2.6] > at > com.codahale.metrics.jvm.ThreadStatesGaugeSet$1.getValue(ThreadStatesGaugeSet.java:56) > ~[metrics-jvm-3.2.6.jar:3.2.6] > at org.apache.storm.executor.Executor.addV2Metrics(Executor.java:344) > 
~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.executor.Executor.metricsTick(Executor.java:320) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:218) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.executor.Executor.accept(Executor.java:287) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > ... 6 more > {code} > This is due to a race condition in CachedGauge: > https://github.com/dropwizard/metrics/blob/v3.2.6/metrics-core/src/main/java/com/codahale/metrics/CachedGauge.java#L49-L53 > There are two issues here. > The first one is > https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/Executor.java#L335-L341. > > This makes every executor read the values of all the metrics, so the same metric, including the thread-state gauges, is now accessed by multiple threads. > The second issue is in CachedGauge: > {code:java} > @Override > public T getValue() { > if (shouldLoad()) { > this.value = loadValue(); > } > return value; > } > {code} > This method is not thread-safe. Two threads can reach getValue() at the same > time. > The first thread to reach shouldLoad() learns that it needs to reload, so it goes on to > the next line, this.value = loadValue(). > The second thread arrives slightly later, so shouldLoad() returns false and it > returns value directly. > This is a race between the first thread calling loadValue() and the > second thread returning value. > If the first thread finishes loadValue() first, both threads get the same, current value. But if the second thread > returns earlier, it gets the original value (which is null), > hence the NPE. > To summarize, the second issue is that CachedThreadStatesGaugeSet is not > thread-safe. > To fix this NPE, we should avoid using CachedThreadStatesGaugeSet. 
> But we still need to fix > https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/Executor.java#L335-L341 > to avoid unnecessary computations and redundant metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
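For illustration only, the race described in STORM-3622 disappears if the staleness check, the load, and the read all happen under one lock. The class below is a hypothetical, self-contained stand-in (SafeCachedValue is not a Dropwizard or Storm class, and the injectable clock is an assumption made to keep the sketch testable); it is not the fix that was applied in Storm.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical stand-in for a cached gauge; not the Dropwizard CachedGauge class.
class SafeCachedValue<T> {
    private final Supplier<T> loader;
    private final LongSupplier nanoClock;
    private final long timeoutNanos;
    private boolean loaded;   // guarded by "this"
    private long reloadAt;    // guarded by "this"
    private T value;          // guarded by "this"

    SafeCachedValue(Supplier<T> loader, long timeout, TimeUnit unit, LongSupplier nanoClock) {
        this.loader = loader;
        this.timeoutNanos = unit.toNanos(timeout);
        this.nanoClock = nanoClock;
    }

    // Check, load and read are one atomic step, so a second caller can never
    // return the field before the first load has been stored.
    synchronized T getValue() {
        long now = nanoClock.getAsLong();
        if (!loaded || now - reloadAt >= 0) {
            value = loader.get();
            reloadAt = now + timeoutNanos;
            loaded = true;
        }
        return value;
    }
}
```

In real use the clock argument would typically be System::nanoTime. A synchronized method trades some throughput for simplicity, which seems acceptable for a gauge that is read once per metrics tick.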
[jira] [Updated] (STORM-3619) Add null check for the topology name
[ https://issues.apache.org/jira/browse/STORM-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3619: -- Labels: pull-request-available (was: ) > Add null check for the topology name > > > Key: STORM-3619 > URL: https://issues.apache.org/jira/browse/STORM-3619 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > Currently with > {code:java} > StormSubmitter.submitTopology(null, ...) > {code} > > submission will fail: > {code:java} > Exception in thread "main" java.lang.RuntimeException: > org.apache.storm.thrift.TApplicationException: Internal error processing > isTopologyNameAllowed > at > org.apache.storm.StormSubmitter.topologyNameExists(StormSubmitter.java:438) > at > org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:247) > at > org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:206) > at > org.apache.storm.StormSubmitter.submitTopologyWithProgressBar(StormSubmitter.java:411) > at > org.apache.storm.StormSubmitter.submitTopologyWithProgressBar(StormSubmitter.java:392) > at > com.example.SimplifiedTwoCompTopology.main(SimplifiedTwoCompTopology.java:49) > Caused by: org.apache.storm.thrift.TApplicationException: Internal error > processing isTopologyNameAllowed > at > org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:79) > at > org.apache.storm.generated.Nimbus$Client.recv_isTopologyNameAllowed(Nimbus.java:1209) > at > org.apache.storm.generated.Nimbus$Client.isTopologyNameAllowed(Nimbus.java:1196) > at > org.apache.storm.StormSubmitter.topologyNameExists(StormSubmitter.java:436) > ... 
5 more > {code} > And on nimbus: > {code:java} > 2020-04-08 18:38:46.356 o.a.s.t.ProcessFunction pool-34-thread-475 [ERROR] > Internal error processing isTopologyNameAllowed > java.lang.NullPointerException: null > at java.util.regex.Matcher.getTextLength(Matcher.java:1283) > ~[?:1.8.0_242] > at java.util.regex.Matcher.reset(Matcher.java:309) ~[?:1.8.0_242] > at java.util.regex.Matcher.<init>(Matcher.java:229) ~[?:1.8.0_242] > at java.util.regex.Pattern.matcher(Pattern.java:1093) ~[?:1.8.0_242] > at > org.apache.storm.daemon.nimbus.Nimbus.validateTopologyName(Nimbus.java:1189) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.isTopologyNameAllowed(Nimbus.java:4667) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.generated.Nimbus$Processor$isTopologyNameAllowed.getResult(Nimbus.java:4423) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.generated.Nimbus$Processor$isTopologyNameAllowed.getResult(Nimbus.java:4402) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) > [storm-shaded-deps-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > [storm-shaded-deps-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.security.auth.sasl.SaslTransportPlugin$TUGIWrapProcessor.process(SaslTransportPlugin.java:152) > [storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:291) > [storm-shaded-deps-2.2.0.y.jar:2.2.0.y] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_242] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_242] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242] > {code} > This is because > https://github.com/apache/storm/blob/v2.1.0/storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java#L1175-L1180 > the topology 
name is null, so a NullPointerException is thrown. > But the error message is not obvious. We should add a null check when > validating the topology name and report an error indicating that the topology name > is null.
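A minimal sketch of the null check requested in STORM-3619 is shown below. The class name, the exception type, and the character pattern are assumptions made for illustration; the real validation lives in Nimbus.validateTopologyName and may differ.

```java
import java.util.regex.Pattern;

// Hypothetical validator; names and pattern are illustrative, not copied from Nimbus.
class TopologyNameValidator {
    // Assumed rule: a non-empty name without '/', '.', ':' or '\'.
    private static final Pattern ALLOWED = Pattern.compile("^[^/.:\\\\]+$");

    static void validate(String name) {
        // Checking for null first avoids the NullPointerException inside Pattern.matcher
        // and lets the caller see a message that names the actual problem.
        if (name == null) {
            throw new IllegalArgumentException("Topology name cannot be null");
        }
        if (!ALLOWED.matcher(name).matches()) {
            throw new IllegalArgumentException(
                "Topology name is empty or contains illegal characters: '" + name + "'");
        }
    }
}
```

Calling TopologyNameValidator.validate(null) then fails with a message about the null name instead of an internal error from the regex matcher.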
[jira] [Updated] (STORM-3618) add meter for tracking internal scheduling errors
[ https://issues.apache.org/jira/browse/STORM-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3618: -- Labels: pull-request-available (was: ) > add meter for tracking internal scheduling errors > - > > Key: STORM-3618 > URL: https://issues.apache.org/jira/browse/STORM-3618 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3606) AutoTGT shouldn't invoke TGT renewal thread (from UserGroupInformation.loginUserFromSubject)
[ https://issues.apache.org/jira/browse/STORM-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3606: -- Labels: pull-request-available (was: ) > AutoTGT shouldn't invoke TGT renewal thread (from > UserGroupInformation.loginUserFromSubject) > > > Key: STORM-3606 > URL: https://issues.apache.org/jira/browse/STORM-3606 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.2.3, 2.1.0 >Reporter: Ethan Li >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > When Hadoop security is enabled, > https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/security/auth/kerberos/AutoTGT.java#L199-L209 > AutoTGT invokes "loginUserFromSubject", which spawns a TGT renewal > thread ("TGT Renewer for "): > https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L928-L957 > That thread eventually invokes the system command "kinit -R", which then fails with > the exception > {code:java} > org.apache.hadoop.util.Shell$ExitCodeException: kinit: Credentials cache file > '/tmp/krb5cc_xxx' not found while renewing credentials > at org.apache.hadoop.util.Shell.runCommand(Shell.java:1004) > ~[stormjar.jar:?] > at org.apache.hadoop.util.Shell.run(Shell.java:898) ~[stormjar.jar:?] > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213) > ~[stormjar.jar:?] > at org.apache.hadoop.util.Shell.execCommand(Shell.java:1307) > ~[stormjar.jar:?] > at org.apache.hadoop.util.Shell.execCommand(Shell.java:1289) > ~[stormjar.jar:?] > at > org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:1011) > [stormjar.jar:?] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] > {code} > "kinit" will never work from a worker process because Storm doesn't keep the TGT in a > local cache. Instead, the TGT is saved in ZooKeeper and in the memory of the worker > process. 
> This exception is confusing but not harmful to topologies, and the TGT > renewal thread will eventually abort. > It would be better to find a real solution for it, but for now we can document what > might happen in the AutoTGT code. > To be clear, we still need loginUserFromSubject or something of the sort, but we don't > want to spawn the TGT renewal thread. This was found with hadoop-2.8.5. Other > versions behave similarly, but this may change in a future release.
[jira] [Updated] (STORM-3600) ResourceAwareScheduler taking too long to schedule
[ https://issues.apache.org/jira/browse/STORM-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3600: -- Labels: pull-request-available (was: ) > ResourceAwareScheduler taking too long to schedule > -- > > Key: STORM-3600 > URL: https://issues.apache.org/jira/browse/STORM-3600 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Major > Labels: pull-request-available > Fix For: 2.2.0 > > > Review BaseResourceAwareStrategy and sortNodes() used by > GenericResourceAwareStrategy for improvement in speed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3616) If running upload credentials and no autocreds are found, we should have an option to fail
[ https://issues.apache.org/jira/browse/STORM-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3616: -- Labels: pull-request-available (was: ) > If running upload credentials and no autocreds are found, we should have an > option to fail > -- > > Key: STORM-3616 > URL: https://issues.apache.org/jira/browse/STORM-3616 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > User tried to upload credentials on a box with a bad setup. Because the > command passed, it was assumed to be ok. Then the topology certs timed out > and dev got involved. > If the command had just failed, we would not have gotten involved at all. > Existing output: > {code:java} > 1809 [main] INFO b.s.s.a.AuthUtils - Got AutoCreds [] > 1810 [main] WARN b.s.StormSubmitter - No credentials were found to push to > edgestorm > 1812 [main] INFO b.s.c.upload-credentials - Uploaded new creds to topology: > edgestorm{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3614) update SystemBolt metrics to use v2 API
[ https://issues.apache.org/jira/browse/STORM-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3614: -- Labels: pull-request-available (was: ) > update SystemBolt metrics to use v2 API > --- > > Key: STORM-3614 > URL: https://issues.apache.org/jira/browse/STORM-3614 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3613) storm.py should include lib-worker instead of lib directory in the classpath while submitting a topology
[ https://issues.apache.org/jira/browse/STORM-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3613: -- Labels: pull-request-available (was: ) > storm.py should include lib-worker instead of lib directory in the classpath > while submitting a topology > > > Key: STORM-3613 > URL: https://issues.apache.org/jira/browse/STORM-3613 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > Currently the classpath is: > {code:java} > -cp > //storm/2.2.0/*://storm/2.2.0/lib/*://storm/2.2.0/extlib/*:/tmp/storm-examples-1.0-SNAPSHOT.jar://storm/2.2.0/conf://storm/2.2.0/bin: > > {code} > for "storm jar" command. > It should include lib-worker/ instead of lib/. > This can cause problems because we don't shade deps in lib/ so topology jar > could conflict with jars in lib/. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-1316) port storm.trident.state-test to java
[ https://issues.apache.org/jira/browse/STORM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-1316: -- Labels: java-migration jstorm-merger pull-request-available (was: java-migration jstorm-merger) > port storm.trident.state-test to java > - > > Key: STORM-1316 > URL: https://issues.apache.org/jira/browse/STORM-1316 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Robert Joseph Evans >Priority: Major > Labels: java-migration, jstorm-merger, pull-request-available > > Test some of trident state -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3609) ClassCastException when credentials are updated for ICredentialsListener spout/bolt instances
[ https://issues.apache.org/jira/browse/STORM-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3609: -- Labels: pull-request-available (was: ) > ClassCastException when credentials are updated for ICredentialsListener > spout/bolt instances > - > > Key: STORM-3609 > URL: https://issues.apache.org/jira/browse/STORM-3609 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > {code:java} > 2020-03-26 21:04:38.526 o.a.s.u.Utils Thread-14-spout-executor[2, 2] [ERROR] > Async loop died! > java.lang.RuntimeException: java.lang.ClassCastException: > org.apache.storm.generated.Credentials cannot be cast to java.util.Map > at org.apache.storm.executor.Executor.accept(Executor.java:291) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:131) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.JCQueue.consume(JCQueue.java:111) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.JCQueue.consume(JCQueue.java:102) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.executor.spout.SpoutExecutor$2.call(SpoutExecutor.java:170) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.executor.spout.SpoutExecutor$2.call(SpoutExecutor.java:159) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.utils.Utils$1.run(Utils.java:433) > [storm-client-2.2.0.y.jar:2.2.0.y] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242] > Caused by: java.lang.ClassCastException: > org.apache.storm.generated.Credentials cannot be cast to java.util.Map > at > org.apache.storm.executor.spout.SpoutExecutor.tupleActionFn(SpoutExecutor.java:303) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.executor.Executor.accept(Executor.java:287) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > ... 7 more > {code} > note: "2.2.0.y" is our internal version, which is master branch. 
[jira] [Updated] (STORM-3608) Upgrade snakeyaml from 1.11 to 1.26
[ https://issues.apache.org/jira/browse/STORM-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3608: -- Labels: pull-request-available (was: ) > Upgrade snakeyaml from 1.11 to 1.26 > --- > > Key: STORM-3608 > URL: https://issues.apache.org/jira/browse/STORM-3608 > Project: Apache Storm > Issue Type: Dependency upgrade >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > snakeyaml-1.11 (sep 2012) is really old. We should just upgrade to latest > version 1.26 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3607) Document the exceptions topologies will see from TGT renewal thread
[ https://issues.apache.org/jira/browse/STORM-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3607: -- Labels: pull-request-available (was: ) > Document the exceptions topologies will see from TGT renewal thread > --- > > Key: STORM-3607 > URL: https://issues.apache.org/jira/browse/STORM-3607 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Ethan Li >Priority: Minor > Labels: pull-request-available > > This is to document STORM-3606 in the code so users are less confused > by the exceptions from the TGT renewal thread.
[jira] [Updated] (STORM-3605) add meter to track scheduling timeouts
[ https://issues.apache.org/jira/browse/STORM-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3605: -- Labels: pull-request-available (was: ) > add meter to track scheduling timeouts > -- > > Key: STORM-3605 > URL: https://issues.apache.org/jira/browse/STORM-3605 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-1304) port backtype.storm.submitter-test to java
[ https://issues.apache.org/jira/browse/STORM-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-1304: -- Labels: java-migration jstorm-merger pull-request-available (was: java-migration jstorm-merger) > port backtype.storm.submitter-test to java > --- > > Key: STORM-1304 > URL: https://issues.apache.org/jira/browse/STORM-1304 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Robert Joseph Evans >Assignee: Jark Wu >Priority: Major > Labels: java-migration, jstorm-merger, pull-request-available > > Test ZookeeperAuthentication payload generation that is a part of > StormSubmitter -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-1293) port backtype.storm.messaging.netty-integration-test to java
[ https://issues.apache.org/jira/browse/STORM-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-1293: -- Labels: java-migration jstorm-merger pull-request-available (was: java-migration jstorm-merger) > port backtype.storm.messaging.netty-integration-test to java > - > > Key: STORM-1293 > URL: https://issues.apache.org/jira/browse/STORM-1293 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Robert Joseph Evans >Priority: Major > Labels: java-migration, jstorm-merger, pull-request-available > > Integration tests for netty messaging layer -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-1241) port backtype.storm.security.auth.auto-login-module-test to java
[ https://issues.apache.org/jira/browse/STORM-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-1241: -- Labels: java-migration jstorm-merger pull-request-available (was: java-migration jstorm-merger) > port backtype.storm.security.auth.auto-login-module-test to java > - > > Key: STORM-1241 > URL: https://issues.apache.org/jira/browse/STORM-1241 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Robert Joseph Evans >Priority: Major > Labels: java-migration, jstorm-merger, pull-request-available > > junit migration -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3604) HealthChecker should print out error message when it fails
[ https://issues.apache.org/jira/browse/STORM-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3604: -- Labels: pull-request-available (was: ) > HealthChecker should print out error message when it fails > -- > > Key: STORM-3604 > URL: https://issues.apache.org/jira/browse/STORM-3604 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > Currently in the code > https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/healthcheck/HealthChecker.java#L122-L130 > {code:java} >if (process.exitValue() != 0) { > String str; > InputStream stdin = process.getInputStream(); > BufferedReader reader = new BufferedReader(new > InputStreamReader(stdin)); > while ((str = reader.readLine()) != null) { > if (str.startsWith("ERROR")) { > LOG.warn("The healthcheck process {} exited with code > {}", script, process.exitValue()); > return FAILED; > } > } > return FAILED_WITH_EXIT_CODE; > } > {code} > The healthcheck doesn't really print out the error message so it's not easy > to debug when healthcheck fails. -- This message was sent by Atlassian Jira (v8.3.4#803005)
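One way to make the failure debuggable, sketched under assumptions, is to collect every line the health-check script printed and hand it back to the caller for logging. HealthCheckOutput and its string constants are hypothetical; only the exit-code and "ERROR" prefix logic mirrors the snippet quoted in STORM-3604.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: classifies a finished health-check run and keeps its output.
class HealthCheckOutput {
    static final String SUCCESS = "success";                              // assumed constant values
    static final String FAILED = "failed";
    static final String FAILED_WITH_EXIT_CODE = "failed_with_exit_code";

    final String result;
    final List<String> lines;

    private HealthCheckOutput(String result, List<String> lines) {
        this.result = result;
        this.lines = Collections.unmodifiableList(lines);
    }

    // Reads the whole output instead of returning at the first "ERROR" line,
    // so the caller can log what the script actually reported.
    static HealthCheckOutput evaluate(int exitValue, InputStream stdout) {
        if (exitValue == 0) {
            return new HealthCheckOutput(SUCCESS, new ArrayList<>());
        }
        List<String> lines = new ArrayList<>();
        boolean sawError = false;
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(stdout, StandardCharsets.UTF_8))) {
            String str;
            while ((str = reader.readLine()) != null) {
                lines.add(str);
                if (str.startsWith("ERROR")) {
                    sawError = true;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new HealthCheckOutput(sawError ? FAILED : FAILED_WITH_EXIT_CODE, lines);
    }
}
```

A caller could then log the script name, the exit code, and the collected lines in a single warning before returning the result.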
[jira] [Updated] (STORM-2483) wrong parameters order
[ https://issues.apache.org/jira/browse/STORM-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-2483: -- Labels: pull-request-available (was: ) > wrong parameters order > -- > > Key: STORM-2483 > URL: https://issues.apache.org/jira/browse/STORM-2483 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.3 > Environment: storm-core:1.0.3 >Reporter: Jacob Liu >Priority: Major > Labels: pull-request-available > > org.apache.storm.utils.Utils#getGlobalStreamId passes its parameters in the wrong order: > > public static GlobalStreamId getGlobalStreamId(String streamId, String > componentId) { > if (componentId == null) { > return new GlobalStreamId(streamId, DEFAULT_STREAM_ID); > } > return new GlobalStreamId(streamId, componentId); > } > But the GlobalStreamId constructor is: public GlobalStreamId( > String componentId, > String streamId) > So I think the correct code is: > public static GlobalStreamId getGlobalStreamId(String streamId, String > componentId) { > if (streamId == null) { > return new GlobalStreamId(componentId, DEFAULT_STREAM_ID); > } > return new GlobalStreamId(componentId, streamId); > }
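The effect of the swapped arguments is easy to see with a small stand-alone example. The GlobalStreamId class below is a minimal stand-in for the Thrift-generated class, StreamIds is a hypothetical holder for the method, and the value "default" for DEFAULT_STREAM_ID is an assumption of this sketch.

```java
// Minimal stand-in for org.apache.storm.generated.GlobalStreamId (illustration only).
class GlobalStreamId {
    final String componentId;
    final String streamId;

    GlobalStreamId(String componentId, String streamId) {
        this.componentId = componentId;
        this.streamId = streamId;
    }
}

// Hypothetical holder for the corrected method proposed in the ticket.
class StreamIds {
    static final String DEFAULT_STREAM_ID = "default"; // assumed value

    // Arguments are passed to the constructor in (componentId, streamId) order,
    // and the default applies when the stream id, not the component id, is missing.
    static GlobalStreamId getGlobalStreamId(String streamId, String componentId) {
        if (streamId == null) {
            return new GlobalStreamId(componentId, DEFAULT_STREAM_ID);
        }
        return new GlobalStreamId(componentId, streamId);
    }
}
```

With the original ordering, getGlobalStreamId("stream1", "spout1") would have produced an id whose component is "stream1" and whose stream is "spout1".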
[jira] [Updated] (STORM-3306) Some tests in storm-core/test/jvm/org/apache/storm/integration/TopologyIntegrationTest.java are using Thrift to build topologies. They should use TopologyBuilder instead.
[ https://issues.apache.org/jira/browse/STORM-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3306: -- Labels: newbie pull-request-available (was: newbie) > Some tests in > storm-core/test/jvm/org/apache/storm/integration/TopologyIntegrationTest.java > are using Thrift to build topologies. They should use TopologyBuilder > instead. > --- > > Key: STORM-3306 > URL: https://issues.apache.org/jira/browse/STORM-3306 > Project: Apache Storm > Issue Type: Task > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Stig Rohde Døssing >Priority: Minor > Labels: newbie, pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3602) loadaware shuffle can overload local worker
[ https://issues.apache.org/jira/browse/STORM-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3602: -- Labels: pull-request-available (was: ) > loadaware shuffle can overload local worker > --- > > Key: STORM-3602 > URL: https://issues.apache.org/jira/browse/STORM-3602 > Project: Apache Storm > Issue Type: Bug >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > We were seeing a worker overloaded and tuples timing out with loadaware > shuffle enabled. From investigating, we found that the code allows switching > from Host local to Worker local if the load average is lower than the low > water mark. It really should be checking the load on the worker instead. > > What's happening is the worker is overloaded with tons of idle host local > tasks, so it switches to HOST_LOCAL. Then the calculation across all the > host tasks is below the low water mark and it immediately switches back to > the overloaded worker local task. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
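The flip-flop described in STORM-3602 can be illustrated with a deliberately simplified model. Everything below is hypothetical: the two-value Scope enum, the method names, and the way load is passed in are assumptions for this sketch and do not reproduce the actual load-aware shuffle grouping code.

```java
// Hypothetical, simplified model of the scope switch; not the Storm implementation.
class ScopeSwitchSketch {
    enum Scope { WORKER_LOCAL, HOST_LOCAL }

    // Behaviour as described in the ticket: narrowing back to WORKER_LOCAL is
    // decided by the average load across all host-local tasks.
    static Scope narrowOnHostAverage(Scope current, double workerLoad, double hostAvgLoad,
                                     double lowWaterMark, double highWaterMark) {
        if (current == Scope.WORKER_LOCAL && workerLoad >= highWaterMark) {
            return Scope.HOST_LOCAL;
        }
        if (current == Scope.HOST_LOCAL && hostAvgLoad <= lowWaterMark) {
            return Scope.WORKER_LOCAL;
        }
        return current;
    }

    // Suggested direction: only narrow when the worker-local load itself is low.
    static Scope narrowOnWorkerLoad(Scope current, double workerLoad,
                                    double lowWaterMark, double highWaterMark) {
        if (current == Scope.WORKER_LOCAL && workerLoad >= highWaterMark) {
            return Scope.HOST_LOCAL;
        }
        if (current == Scope.HOST_LOCAL && workerLoad <= lowWaterMark) {
            return Scope.WORKER_LOCAL;
        }
        return current;
    }
}
```

With an overloaded worker (load 0.9) surrounded by idle host-local tasks (average 0.1), the first variant immediately returns to WORKER_LOCAL, while the second stays at HOST_LOCAL until the worker's own load drops.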
[jira] [Updated] (STORM-3599) Bump the rocksdbjni to 5.18.4
[ https://issues.apache.org/jira/browse/STORM-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3599: -- Labels: pull-request-available (was: ) > Bump the rocksdbjni to 5.18.4 > - > > Key: STORM-3599 > URL: https://issues.apache.org/jira/browse/STORM-3599 > Project: Apache Storm > Issue Type: Sub-task > Components: build >Reporter: Yikun Jiang >Priority: Minor > Labels: pull-request-available > > The RocksDB community released a special release for ARM (aarch64) [1]. > We can bump Storm's rocksdbjni from 5.18.3 to 5.18.4 to enable ARM > support for Storm. > [1] https://github.com/facebook/rocksdb/releases/tag/v5.18.4
[jira] [Updated] (STORM-3585) Change ConstraintSolverStrategy to allow max co-Location Count for spreading components
[ https://issues.apache.org/jira/browse/STORM-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3585: -- Labels: pull-request-available (was: ) > Change ConstraintSolverStrategy to allow max co-Location Count for spreading > components > --- > > Key: STORM-3585 > URL: https://issues.apache.org/jira/browse/STORM-3585 > Project: Apache Storm > Issue Type: New Feature > Components: storm-server >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Minor > Labels: pull-request-available > Fix For: 2.2.0 > > > We need constraint solver strategy to evolve around to using Map maxCoLocationCount> instead of simple list of components that get maximum of > 1 collocation as specified in config _topology.spread.components_. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3598) Storm UI visualization throws NullPointerException
[ https://issues.apache.org/jira/browse/STORM-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3598: -- Labels: pull-request-available (was: ) > Storm UI visualization throws NullPointerException > -- > > Key: STORM-3598 > URL: https://issues.apache.org/jira/browse/STORM-3598 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > We encountered an issue with visualization on UI. > > {code:java} > 2020-03-09 19:59:01.756 o.a.s.d.u.r.StormApiResource qtp1919834117-167291 > [ERROR] Failure getting topology visualization > java.lang.NullPointerException: null > at > org.apache.storm.stats.StatsUtil.mergeWithAddPair(StatsUtil.java:1855) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.stats.StatsUtil.expandAveragesSeq(StatsUtil.java:2308) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.stats.StatsUtil.aggregateAverages(StatsUtil.java:832) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.stats.StatsUtil.aggregateBoltStats(StatsUtil.java:731) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.stats.StatsUtil.boltStreamsStats(StatsUtil.java:900) > ~[storm-server-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.daemon.ui.UIHelpers.getVisualizationData(UIHelpers.java:1939) > ~[storm-webapp-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.daemon.ui.resources.StormApiResource.getTopologyVisualization(StormApiResource.java:423) > ~[storm-webapp-2.2.0.y.jar:2.2.0.y] > {code} > This is a bug in the code. 
> https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/stats/StatsUtil.java#L1846-L1858 > {code:java} > for (K kk : mm1.keySet()) { > List seq1 = mm1.get(kk); > List seq2 = mm2.get(kk); > List sums = new ArrayList(); > for (int i = 0; i < seq1.size(); i++) { > if (seq1.get(i) instanceof Long) { > sums.add(((Number) seq1.get(i)).longValue() + > ((Number) seq2.get(i)).longValue()); > } else { > sums.add(((Number) seq1.get(i)).doubleValue() + > ((Number) seq2.get(i)).doubleValue()); > } > } > tmp.put(kk, sums); > } > {code} > It assumes mm1 and mm2 always have the same keys, which is not true. > It can be reproduced with this example code: > {code:java} > public class WordCountTopology extends ConfigurableTopology { > private static final Logger LOG = > LoggerFactory.getLogger(WordCountTopology.class); > public static void main(String[] args) { > ConfigurableTopology.start(new WordCountTopology(), args); > } > protected int run(String[] args) { > TopologyBuilder builder = new TopologyBuilder(); > builder.setSpout("spout1", new RandomSpout(1), 1); > builder.setSpout("spout2", new RandomSpout(2), 1); > builder.setBolt("bolt", new RandomBolt(), 2).directGrouping("spout1", > "stream1") > .directGrouping("spout2", "stream2"); > String topologyName = "word-count"; > conf.setNumWorkers(3); > if (args != null && args.length > 0) { > topologyName = args[0]; > } > return submit(topologyName, conf, builder); > } > static class RandomSpout extends BaseRichSpout { > String stream; > int id; > public RandomSpout(int id) { > this.id = id; > stream = "stream" + id; > } > int taskId = 0; > SpoutOutputCollector collector; > public void open(Map conf, TopologyContext context, > SpoutOutputCollector collector) { > taskId = context.getThisTaskId(); > this.collector = collector; > } > /** > * Different spout send tuples to different bolt via different stream. 
> */ > public void nextTuple() { > LOG.info("emitting {}", id); > if (id == 1) { > Values val = new Values("test a sentence"); > collector.emitDirect(2, stream, val, val); > } else { > Values val = new Values("test 2 sentence"); > collector.emitDirect(3, stream, val, val); > } > try { > Thread.sleep(1000); > } catch (InterruptedException e) { > e.printStackTrace(); > } > } > public void declareOutputFields(OutputFieldsDeclarer declarer) { >
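(The quoted example in the STORM-3598 message is cut off at this point.) The NullPointerException in the mergeWithAddPair loop quoted for STORM-3598 comes from calling seq2.get(i) when mm2 has no entry for a key in mm1. One possible way to make such a merge tolerate differing key sets is sketched below. This is a standalone illustration, not the change made in Storm: the class name, the helper addPair, and the choice to copy a sequence unchanged when only one map has the key are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeSketch {
    // Element-wise sum of two sequences, assumed to have the same length.
    // Mirrors the quoted loop: Long entries are summed as longs, others as doubles.
    static List<Number> addPair(List<Number> seq1, List<Number> seq2) {
        List<Number> sums = new ArrayList<>();
        for (int i = 0; i < seq1.size(); i++) {
            if (seq1.get(i) instanceof Long) {
                sums.add(seq1.get(i).longValue() + seq2.get(i).longValue());
            } else {
                sums.add(seq1.get(i).doubleValue() + seq2.get(i).doubleValue());
            }
        }
        return sums;
    }

    // Merge over the union of keys. A key present in only one map keeps its
    // sequence as-is instead of triggering a null dereference.
    static <K> Map<K, List<Number>> mergeWithAddPair(Map<K, List<Number>> mm1,
                                                     Map<K, List<Number>> mm2) {
        Map<K, List<Number>> merged = new HashMap<>();
        if (mm1 != null) {
            for (Map.Entry<K, List<Number>> e : mm1.entrySet()) {
                merged.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        if (mm2 != null) {
            for (Map.Entry<K, List<Number>> e : mm2.entrySet()) {
                List<Number> existing = merged.get(e.getKey());
                if (existing == null) {
                    merged.put(e.getKey(), new ArrayList<>(e.getValue()));
                } else {
                    merged.put(e.getKey(), addPair(existing, e.getValue()));
                }
            }
        }
        return merged;
    }
}
```

In the reproduction topology each bolt task receives tuples on only one of the two streams, so the per-stream maps from different executors have different keys, which is the case this sketch handles.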
[jira] [Updated] (STORM-3595) Refactor Resource Aware Strategies (Base, Generic, Default)
[ https://issues.apache.org/jira/browse/STORM-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3595: -- Labels: pull-request-available (was: ) > Refactor Resource Aware Strategies (Base, Generic, Default) > --- > > Key: STORM-3595 > URL: https://issues.apache.org/jira/browse/STORM-3595 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Minor > Labels: pull-request-available > Fix For: 2.2.0 > > > Code is duplicated in incompatible ways. Refactor BaseResourceAwareStrategy, > DefaultResourceAwareStrategy and GenericResourceAwareStrategy to remove > duplicated code and minor incompatibilities. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3596) Feed send assignment status into blacklist scheduler
[ https://issues.apache.org/jira/browse/STORM-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3596: -- Labels: pull-request-available (was: ) > Feed send assignment status into blacklist scheduler > > > Key: STORM-3596 > URL: https://issues.apache.org/jira/browse/STORM-3596 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > We occasionally see hiccups sending assignments to supervisors, which are > usually transitory. But we have seen more persistent issues with a > supervisor when its disk became read-only. The supervisor remained up and > was unable to start workers. Nimbus continually tried to send it assignments > and failed, but just ate the exception and continued on. > > We should be able to send this information to the blacklist scheduler and add > the node to the blacklist when some threshold occurs. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3594) Add checkstyle rule WhitespaceAfter
[ https://issues.apache.org/jira/browse/STORM-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3594: -- Labels: pull-request-available (was: ) > Add checkstyle rule WhitespaceAfter > --- > > Key: STORM-3594 > URL: https://issues.apache.org/jira/browse/STORM-3594 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > [https://checkstyle.sourceforge.io/config_whitespace.html#WhitespaceAfter] > > This is already exercised in Storm code base. Adding it to make it clearer -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3591) Improve GRAS Strategy Log
[ https://issues.apache.org/jira/browse/STORM-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3591: -- Labels: pull-request-available (was: ) > Improve GRAS Strategy Log > - > > Key: STORM-3591 > URL: https://issues.apache.org/jira/browse/STORM-3591 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > [https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/GenericResourceAwareStrategy.java#L123] > {code:java} > 2020-02-24 14:53:59.652 o.a.s.s.r.s.s.GenericResourceAwareStrategy > pool-21-thread-1 [WARN] Scheduling [[1, 1]] left over task (most likely sys > tasks) > {code} > This message is confusing when debugging. > [https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/DefaultResourceAwareStrategy.java#L82] > The Default Strategy logs the same condition at debug level instead of warn. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3590) Add a test to validate that GRAS's node sorting is stable to prevent excessive fragmentation and starvation of non-GRAS topologies
[ https://issues.apache.org/jira/browse/STORM-3590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3590: -- Labels: pull-request-available (was: ) > Add a test to validate that GRAS's node sorting is stable to prevent > excessive fragmentation and starvation of non-GRAS topologies > -- > > Key: STORM-3590 > URL: https://issues.apache.org/jira/browse/STORM-3590 > Project: Apache Storm > Issue Type: Test >Reporter: Govind Menon >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3589) Iterator in BaseResourceStrategy is potentially buggy
[ https://issues.apache.org/jira/browse/STORM-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3589: -- Labels: pull-request-available (was: ) > Iterator in BaseResourceStrategy is potentially buggy > - > > Key: STORM-3589 > URL: https://issues.apache.org/jira/browse/STORM-3589 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > [https://github.com/apache/storm/blame/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/BaseResourceAwareStrategy.java#L280] > The hasNext() function should probably only peek at the next value in > nodeIterator, not remove it. > > [https://github.com/apache/storm/blame/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/BaseResourceAwareStrategy.java#L296-L300] > > Also, two consecutive next() calls will cause a problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
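The two concerns in STORM-3589 (hasNext() consuming a value, and next() depending on a prior hasNext()) are both addressed by the usual look-ahead pattern, sketched below. This is a generic illustration under assumed names (LookaheadIterator and its accept predicate are hypothetical), not the iterator in BaseResourceAwareStrategy.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Wraps a source iterator and skips elements rejected by the predicate.
class LookaheadIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<T> accept;
    private T buffered;          // element found by hasNext() but not yet returned
    private boolean hasBuffered; // true while 'buffered' holds a pending element

    LookaheadIterator(Iterator<T> source, Predicate<T> accept) {
        this.source = source;
        this.accept = accept;
    }

    // Idempotent: repeated calls never drop an element, because the value
    // pulled from the source is kept in 'buffered' until next() hands it out.
    @Override
    public boolean hasNext() {
        while (!hasBuffered && source.hasNext()) {
            T candidate = source.next();
            if (accept.test(candidate)) {
                buffered = candidate;
                hasBuffered = true;
            }
        }
        return hasBuffered;
    }

    // Safe to call without a preceding hasNext(): it performs the look-ahead itself.
    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        hasBuffered = false;
        return result;
    }
}
```

Calling hasNext() several times in a row leaves the iteration position unchanged, and consecutive next() calls return consecutive accepted elements.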
[jira] [Updated] (STORM-3583) Localizer should not cause supervisor restart on FileNotFoundException
[ https://issues.apache.org/jira/browse/STORM-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3583: -- Labels: pull-request-available (was: ) > Localizer should not cause supervisor restart on FileNotFoundException > -- > > Key: STORM-3583 > URL: https://issues.apache.org/jira/browse/STORM-3583 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Reporter: Kishor Patil >Assignee: Kishor Patil >Priority: Major > Labels: pull-request-available > > The supervisor should not restart if the blobstore does not have a file it is > trying to localize. The supervisor should simply assume that the blob will be > available soon; there is no need for the daemon to keel over. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3588) RAS scheduler should not pre-empt and evict topologies due to generic resource
[ https://issues.apache.org/jira/browse/STORM-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3588: -- Labels: pull-request-available (was: ) > RAS scheduler should not pre-empt and evict topologies due to generic resource > -- > > Key: STORM-3588 > URL: https://issues.apache.org/jira/browse/STORM-3588 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Major > Labels: pull-request-available > > Currently, with generic resource support enabled in our cluster, the RAS > scheduler will evict scheduled topologies if a new topology asks for an > excessive amount of generic resources. > We have identified that the current getScore() function in > DefaultSchedulingPriorityStrategy does not consider the generic resource > factor. Ideally, if a user topology asks for far too many generic resources, > it should be assigned a lower scheduling priority so that it does not affect other > running topologies. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3587) Allow Scheduler futureTask to gracefully exit and register message on timeout
[ https://issues.apache.org/jira/browse/STORM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3587: -- Labels: pull-request-available (was: ) > Allow Scheduler futureTask to gracefully exit and register message on timeout > - > > Key: STORM-3587 > URL: https://issues.apache.org/jira/browse/STORM-3587 > Project: Apache Storm > Issue Type: New Feature > Components: storm-server >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Minor > Labels: pull-request-available > Fix For: 2.2.0 > > > ResourceAwareScheduler creates a FutureTask with a timeout specified in config. > The scheduling strategy (e.g. ConstraintSolverStrategy) uses the same > configuration variable to determine when to terminate its effort. > Increase the timeout in ResourceAwareScheduler to allow the FutureTask to > gracefully exit and record its error message. -- This message was sent by Atlassian Jira (v8.3.4#803005)
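The timing issue in STORM-3587 can be sketched in isolation: if the caller waits on the FutureTask for exactly the same duration that the task uses as its own deadline, the caller may time out just before the task can return its message. A minimal illustration, assuming hypothetical names (SchedulingTimeoutSketch, strategy, runWithGrace) and an arbitrary grace period rather than anything from ResourceAwareScheduler:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SchedulingTimeoutSketch {

    // Stand-in for a scheduling strategy that polls its own deadline and
    // returns a descriptive message when it gives up.
    static String strategy(long timeoutMs, long workMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (long done = 0; done < workMs; done += 5) {
            if (System.nanoTime() >= deadline) {
                return "strategy gave up after " + timeoutMs + " ms";
            }
            Thread.sleep(5); // simulated unit of scheduling work
        }
        return "scheduled";
    }

    // The caller waits timeoutMs + graceMs, so the strategy has time to
    // notice its own deadline and report before the outer wait expires.
    static String runWithGrace(long timeoutMs, long graceMs, long workMs) {
        FutureTask<String> task = new FutureTask<>(() -> strategy(timeoutMs, workMs));
        Thread worker = new Thread(task, "scheduling-sketch");
        worker.setDaemon(true);
        worker.start();
        try {
            return task.get(timeoutMs + graceMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            task.cancel(true); // interrupt the worker; its message is lost
            return "timed out before the strategy could report";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a grace period of zero, the outer wait and the inner deadline race each other, which is the situation the issue wants to avoid.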
[jira] [Updated] (STORM-3584) Support getting version info from a wildcard classpath entry
[ https://issues.apache.org/jira/browse/STORM-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3584: -- Labels: pull-request-available (was: ) > Support getting version info from a wildcard classpath entry > - > > Key: STORM-3584 > URL: https://issues.apache.org/jira/browse/STORM-3584 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > [https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/utils/VersionInfo.java#L134] > > Current VersionInfo.getFromClasspath method only tries to get version info > from the specific property file under a directory, or jar/zip files. It > should support a classpath entry like //*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
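For STORM-3584, a wildcard entry has to be expanded into concrete files before each one can be checked for version info. The sketch below shows only that expansion step, following the JVM's own convention that a classpath entry ending in * stands for the .jar and .JAR files directly inside that directory. The class and method names are hypothetical and this is not the VersionInfo code.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class WildcardClasspathSketch {

    // Returns the entry itself for ordinary entries, or the sorted list of
    // jar files directly inside the directory for an entry such as "lib/*".
    static List<String> expand(String entry) {
        List<String> result = new ArrayList<>();
        boolean wildcard = entry.endsWith("/*") || entry.endsWith(File.separator + "*");
        if (!wildcard) {
            result.add(entry);
            return result;
        }
        // Drop only the trailing '*', keeping the separator so "/*" maps to "/".
        File dir = new File(entry.substring(0, entry.length() - 1));
        File[] files = dir.listFiles(); // null if the directory does not exist
        if (files != null) {
            for (File f : files) {
                String name = f.getName().toLowerCase(Locale.ROOT);
                if (f.isFile() && name.endsWith(".jar")) {
                    result.add(f.getPath());
                }
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Each returned path could then go through the existing jar/zip handling that the issue mentions; subdirectories are deliberately not searched, matching the JVM's non-recursive wildcard semantics.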
[jira] [Updated] (STORM-3581) Change log level to info to show the config classes being used for validation
[ https://issues.apache.org/jira/browse/STORM-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3581: -- Labels: pull-request-available (was: ) > Change log level to info to show the config classes being used for validation > - > > Key: STORM-3581 > URL: https://issues.apache.org/jira/browse/STORM-3581 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Trivial > Labels: pull-request-available > > [https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/validation/ConfigValidation.java#L82] > This is trivial but since it's caused some confusion, it's better to have it > in the log as INFO instead of DEBUG > {code:java} > LOG.debug("Will use {} for validation", ret); > {code} > > Because the classes being used for validation depends on whether the > following file is in the classpath or not > > [https://github.com/apache/storm/blob/master/storm-server/src/main/resources/META-INF/services/org.apache.storm.validation.Validated] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3579) Fix Kerberos connection from Worker to Nimbus/Supervisor
[ https://issues.apache.org/jira/browse/STORM-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3579: -- Labels: pull-request-available (was: ) > Fix Kerberos connection from Worker to Nimbus/Supervisor > > > Key: STORM-3579 > URL: https://issues.apache.org/jira/browse/STORM-3579 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > BUG2 in the parent JIRA -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3580) Config overrides supplied using -c in storm.py not passed to all commands
[ https://issues.apache.org/jira/browse/STORM-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3580: -- Labels: pull-request-available (was: ) > Config overrides supplied using -c in storm.py not passed to all commands > - > > Key: STORM-3580 > URL: https://issues.apache.org/jira/browse/STORM-3580 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Major > Labels: pull-request-available > > -c is used to supply configuration override options. storm.py in the client > code converts these overrides into one -Dstorm.options system property. > However, this JVM option is not handled properly when the actual command is > executed. For example, the Rebalance command completely ignores this setting. > Commands that currently process "-c" options: > * Activate - Not Needed > * *AdminCommands - Yes* > * BasicDrpcClient - Not Needed > * Blobstore - Not Needed > * CLI - Not Needed > * *ConfigValue - Yes* > * Deactivate - Not Needed > * *DevZookeeper - Yes* > * *DRPCServer - Yes* > * GetErrors - Not Needed > * *HealthCheck - Yes* > * *Heartbeats - Yes* > * KillTopology - Not Needed > * *KillWorkers - Yes* > * ListTopologies - Not Needed > * *LogViewerServer - Yes* > * *LocalCluster - Yes* > * Monitor - Not Needed > * *Nimbus - Yes* > * *Pacemaker - Yes* > * Rebalance - {color:#FF}Add as part of this Jira{color} > * SetLogLevel - Not Needed > * *ShellSubmission - Yes* > * *StormSqlRunner - Yes* > * Supervisor - {color:#FF}Not Needed?{color} > * *UI - Yes* > * *UploadCredentials - Yes, but specific options* > * VersionsInfo - Not Needed > -- This message was sent by Atlassian Jira (v8.3.4#803005)
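What "processing -c options" means for a command in STORM-3580 can be sketched as: read the storm.options system property, split it into key/value pairs, and lay those over the base configuration. The sketch below is standalone and makes assumptions that should be checked against the real client code: that the property is a comma-separated list of key=value pairs with URL-encoded content, that values can be kept as plain strings, and the class and method names themselves, which are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class StormOptionsSketch {

    // Parses a string such as "topology.workers=3,nimbus.seeds=a%2Cb".
    // Assumed format: pairs separated by ',', each pair URL-encoded.
    static Map<String, String> parseOverrides(String stormOptions) {
        Map<String, String> overrides = new LinkedHashMap<>();
        if (stormOptions == null || stormOptions.isEmpty()) {
            return overrides;
        }
        for (String pair : stormOptions.split(",")) {
            String decoded;
            try {
                decoded = URLDecoder.decode(pair, "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always supported
            }
            String[] kv = decoded.split("=", 2);
            if (kv.length == 2) {
                overrides.put(kv[0], kv[1]);
            }
        }
        return overrides;
    }

    // Overrides win over the base configuration; the base map is not modified.
    static Map<String, Object> applyOverrides(Map<String, Object> base, String stormOptions) {
        Map<String, Object> conf = new LinkedHashMap<>(base);
        conf.putAll(parseOverrides(stormOptions));
        return conf;
    }
}
```

A command would call something like applyOverrides(baseConf, System.getProperty("storm.options")) before acting on the configuration; a command that skips this step silently ignores -c, which is the behavior reported for Rebalance.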
[jira] [Updated] (STORM-3578) ClientAuthUtils.insertWorkerTokens removes exiting and new WorkerToken altogether if they are equal
[ https://issues.apache.org/jira/browse/STORM-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3578: -- Labels: pull-request-available (was: ) > ClientAuthUtils.insertWorkerTokens removes exiting and new WorkerToken > altogether if they are equal > --- > > Key: STORM-3578 > URL: https://issues.apache.org/jira/browse/STORM-3578 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Major > Labels: pull-request-available > > BUG1 in the parent JIRA -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3577) upload-credentials Breaks Topology in secure cluster
[ https://issues.apache.org/jira/browse/STORM-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3577: -- Labels: pull-request-available (was: ) > upload-credentials Breaks Topology in secure cluster > > > Key: STORM-3577 > URL: https://issues.apache.org/jira/browse/STORM-3577 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Priority: Critical > Labels: pull-request-available > > *Background* > Worker uses WorkerToken to connect to Nimbus/Supervisor, (e.g. in > Worker.doHeartBeat method). If WorkerToken is not in place, it will fall back > to Kerberos. > > *Issue:* > Users can submit topology and the topology is running fine. > But error shows up in worker log if "storm upload-credentials" is executed. > (2.2.0.y is our internal version of apache-storm master branch) > > {code:java} > 2020-02-04 00:12:57.975 o.a.s.d.w.Worker heartbeat-timer [WARN] Exception > when send heartbeat to local supervisor > 2020-02-04 00:12:57.984 o.a.s.s.a.k.ClientCallbackHandler heartbeat-timer > [WARN] Could not login: the client is being asked for a password, but the > client code does not currently support obtaining a password from the user. > Make sure that the client is configured to use a ticket cache (using the JAAS > configuration setting 'useTicketCache=true)' and restart the client. If you > still get this message after that, the TGT in the ticket cache has expired > and must be manually refreshed. To do so, first determine if you are using a > password or a keytab. If the former, run kinit in a Unix shell in the > environment of the user who is running this client using the command 'kinit > ' (where is the name of the client's Kerberos principal). If > the latter, do 'kinit -k -t ' (where is the name of > the Kerberos principal, and is the location of the keytab file). > After manually refreshing your cache, restart this client. 
If you continue to > see this message after manually refreshing your cache, ensure that your KDC > host's clock is in sync with this host's clock. > 2020-02-04 00:12:57.984 o.a.s.s.a.k.KerberosSaslTransportPlugin > heartbeat-timer [ERROR] Server failed to login in > principal:javax.security.auth.login.LoginException: No password provided > javax.security.auth.login.LoginException: No password provided > at > com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:919) > ~[?:1.8.0_181] > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) > ~[?:1.8.0_181] > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > ~[?:1.8.0_181] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_181] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_181] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_181] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > ~[?:1.8.0_181] > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > ~[?:1.8.0_181] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_181] > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > ~[?:1.8.0_181] > at org.apache.storm.messaging.netty.Login.login(Login.java:300) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at org.apache.storm.messaging.netty.Login.(Login.java:84) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > 
org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin.mkLogin(KerberosSaslTransportPlugin.java:112) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin.kerberosConnect(KerberosSaslTransportPlugin.java:171) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin.connect(KerberosSaslTransportPlugin.java:138) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:98) > ~[storm-client-2.2.0.y.jar:2.2.0.y] > at > org.apache.storm.
[jira] [Updated] (STORM-3575) Fix Scheduler Status on failure after multiple attempts
[ https://issues.apache.org/jira/browse/STORM-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3575: -- Labels: pull-request-available (was: ) > Fix Scheduler Status on failure after multiple attempts > --- > > Key: STORM-3575 > URL: https://issues.apache.org/jira/browse/STORM-3575 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Priority: Minor > Labels: pull-request-available > > When the RAS fails to schedule a topology after multiple attempts, it > overwrites the status with {color:#FF}_Failed to schedule within 5 > attempts_{color} > It should instead append this message to the existing reason/status on the > topology. > [https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/scheduler/resource/ResourceAwareScheduler.java#L239] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3574) Rebalance should re-computeExecutors so metrics consumers can be added
[ https://issues.apache.org/jira/browse/STORM-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3574: -- Labels: pull-request-available (was: ) > Rebalance should re-computeExecutors so metrics consumers can be added > -- > > Key: STORM-3574 > URL: https://issues.apache.org/jira/browse/STORM-3574 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.1.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Minor > Labels: pull-request-available > > Metrics consumers are a system component, so adding them dynamically > should be possible with a simple change to the configuration > _topology.metrics.consumer.register_. Currently _computeExecutors_, or > more precisely _StormCommon#startTaskInfo_, which evaluates this setting, is > only invoked during _startTopology_, so the rebalance functionality does not > start the new tasks. > We need to ensure rebalance recalculates idToExecutors, ideally > only adding new tasks for added executors without removing old ids. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3572) Topology visualization can fail if executor is not up
[ https://issues.apache.org/jira/browse/STORM-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3572: -- Labels: pull-request-available (was: ) > Topology visualization can fail if executor is not up > - > > Key: STORM-3572 > URL: https://issues.apache.org/jira/browse/STORM-3572 > Project: Apache Storm > Issue Type: Bug >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3571) Add topology info to slot warning messages
[ https://issues.apache.org/jira/browse/STORM-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3571: -- Labels: pull-request-available (was: ) > Add topology info to slot warning messages > -- > > Key: STORM-3571 > URL: https://issues.apache.org/jira/browse/STORM-3571 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > {code:java} > 2020-01-30 10:57:56.639 o.a.s.d.s.Slot SLOT_6714 [WARN] SLOT 6714: HB is too > old 32000 > 3 > 2020-01-30 11:31:44.107 o.a.s.d.s.Slot SLOT_6723 [WARN] SLOT 6723: HB is too > old 31000 > 3 > 2020-01-30 11:39:15.256 o.a.s.d.s.Slot SLOT_6728 [WARN] SLOT 6728: HB is too > old 32000 > 3 > {code} > Stale HB is one of the most common situations we need to deal with. It could > be helpful to see topology id in such logs when debugging. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3570) add config name when validation fails with ClassNotFoundException
[ https://issues.apache.org/jira/browse/STORM-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3570: -- Labels: pull-request-available (was: ) > add config name when validation fails with ClassNotFoundException > - > > Key: STORM-3570 > URL: https://issues.apache.org/jira/browse/STORM-3570 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3568) Topology UI page "Change Log Level" should not allow empty logger name
[ https://issues.apache.org/jira/browse/STORM-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3568: -- Labels: pull-request-available (was: ) > Topology UI page "Change Log Level" should not allow empty logger name > -- > > Key: STORM-3568 > URL: https://issues.apache.org/jira/browse/STORM-3568 > Project: Apache Storm > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Major > Labels: pull-request-available > > A 500 Server Error is shown at the bottom of the page. Users have complained that the call > stack is not informative. We should prevent them from leaving the logger field > empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3567) Topology UI page is showing total resources for each component if not scheduled
[ https://issues.apache.org/jira/browse/STORM-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3567: -- Labels: pull-request-available (was: ) > Topology UI page is showing total resources for each component if not > scheduled > --- > > Key: STORM-3567 > URL: https://issues.apache.org/jira/browse/STORM-3567 > Project: Apache Storm > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > Attachments: Screen Shot 2020-01-13 at 8.43.40 AM.png > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3566) add serialVersionUID field to class which implement Serializable interface.
[ https://issues.apache.org/jira/browse/STORM-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3566: -- Labels: pull-request-available (was: ) > add serialVersionUID field to class which implement Serializable interface. > --- > > Key: STORM-3566 > URL: https://issues.apache.org/jira/browse/STORM-3566 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Howard Liu >Priority: Critical > Labels: pull-request-available > > add serialVersionUID field to class which implement Serializable interface -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3565) Allow users to add dimensionsfor storm metrics
[ https://issues.apache.org/jira/browse/STORM-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3565: -- Labels: pull-request-available (was: ) > Allow users to add dimensionsfor storm metrics > -- > > Key: STORM-3565 > URL: https://issues.apache.org/jira/browse/STORM-3565 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > We have encountered a use case where users want better control over metrics. > > For example, a Kafka spout may stream in data from multiple Kafka partitions, > and the partition list could be dynamic. For a simple count metric, our users > might want a more fine-grained grouping, such as grouping the counts by Kafka > partition id. > > It would be more convenient if Storm allowed users to add dimensions to > individual yamas metrics -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3544) Ability to find topologies with low worker uptime on UI
[ https://issues.apache.org/jira/browse/STORM-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3544: -- Labels: pull-request-available (was: ) > Ability to find topologies with low worker uptime on UI > --- > > Key: STORM-3544 > URL: https://issues.apache.org/jira/browse/STORM-3544 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > > useful for finding bad topologies / nodes -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3563) Travis fails because of missing maven package from the mirror
[ https://issues.apache.org/jira/browse/STORM-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3563: -- Labels: pull-request-available (was: ) > Travis fails because of missing maven package from the mirror > - > > Key: STORM-3563 > URL: https://issues.apache.org/jira/browse/STORM-3563 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Critical > Labels: pull-request-available > > Travis fails because of the following error: > {code:java} > 0.39s$ wget > http://mirrors.rackhosting.com/apache/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz > -P $HOME > --2020-01-03 06:21:16-- > http://mirrors.rackhosting.com/apache/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz > Resolving mirrors.rackhosting.com (mirrors.rackhosting.com)... 77.247.64.34, > 2a02:4de0:21::2 > Connecting to mirrors.rackhosting.com > (mirrors.rackhosting.com)|77.247.64.34|:80... connected. > HTTP request sent, awaiting response... 404 Not Found > 2020-01-03 06:21:17 ERROR 404: Not Found. > {code} > The latest maven version is 3.6.3. 3.6.1 is removed from the mirror. > We should use https://archive.apache.org/dist/maven/maven-3/3.6.1/ to prevent > this from happening every time there is a new maven minor release. > Also we should cache maven to avoid re-download every time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3557) allow health checks to pass on timeout
[ https://issues.apache.org/jira/browse/STORM-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3557: -- Labels: pull-request-available (was: ) > allow health checks to pass on timeout > -- > > Key: STORM-3557 > URL: https://issues.apache.org/jira/browse/STORM-3557 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > We've seen nodes under high load whose health checks time out periodically. > This leads to killing workers unnecessarily. > > I'd like an option to not fail when timeouts occur, and to have a metric to > track when these occur. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
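The behavior requested in STORM-3557 can be sketched as follows. This is only an illustration under stated assumptions: the class HealthCheckTimeoutSketch, the Outcome enum, and the failOnTimeout flag are hypothetical names and are not taken from Storm's code or configuration. The sketch shows an option that lets a timed-out check pass, together with a counter that records each timeout.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names are illustrative and not Storm's actual API.
final class HealthCheckTimeoutSketch {
    enum Outcome { SUCCESS, FAILED, TIMED_OUT }

    // Assumed option: when false, a timeout does not mark the node unhealthy.
    private final boolean failOnTimeout;
    // Counts timeouts so they can still be tracked and alerted on.
    private final AtomicLong timeouts = new AtomicLong();

    HealthCheckTimeoutSketch(boolean failOnTimeout) {
        this.failOnTimeout = failOnTimeout;
    }

    boolean isHealthy(Outcome outcome) {
        switch (outcome) {
            case SUCCESS:
                return true;
            case TIMED_OUT:
                timeouts.incrementAndGet();
                return !failOnTimeout;
            default:
                return false;
        }
    }

    long timeoutCount() {
        return timeouts.get();
    }

    public static void main(String[] args) {
        HealthCheckTimeoutSketch lenient = new HealthCheckTimeoutSketch(false);
        System.out.println(lenient.isHealthy(Outcome.TIMED_OUT));
        System.out.println(lenient.timeoutCount());
    }
}
```

Keeping the counter separate from the pass/fail decision means timeouts stay visible even when they no longer kill workers.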
[jira] [Updated] (STORM-3555) Add meter for tracking errors killing workers
[ https://issues.apache.org/jira/browse/STORM-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3555: -- Labels: pull-request-available (was: ) > Add meter for tracking errors killing workers > - > > Key: STORM-3555 > URL: https://issues.apache.org/jira/browse/STORM-3555 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > We've seen nodes fail to kill workers when the processes end up in a > defunct state. I would like a meter to track these failures for alerting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3552) Storm CLI set_log_level no longer updates the log level
[ https://issues.apache.org/jira/browse/STORM-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3552: -- Labels: pull-request-available (was: ) > Storm CLI set_log_level no longer updates the log level > --- > > Key: STORM-3552 > URL: https://issues.apache.org/jira/browse/STORM-3552 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 2.0.0, 2.1.0 >Reporter: Luke Marjoram >Priority: Major > Labels: pull-request-available > > Using the example StatefulWindowingTopology, when trying to update the log > level via command line with the following command a NullPointer is thrown in > the worker log and the log level is not updated. > {code:java} > storm set_log_level -l ROOT=DEBUG:0 test{code} > {code:java} > 2019-12-09 17:16:02.600+0100 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl > main-EventThread [ERROR] Event listener threw exception > java.lang.NullPointerException: null > at > java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) > ~[?:1.8.0_131] > at org.apache.logging.log4j.Level.getLevel(Level.java:261) > ~[log4j-api-2.11.2.jar:2.11.2] > at > org.apache.storm.daemon.worker.LogConfigManager.setLoggerLevel(LogConfigManager.java:145) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.daemon.worker.LogConfigManager.processLogConfigChange(LogConfigManager.java:98) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.daemon.worker.Worker.checkLogConfigChanged(Worker.java:422) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.cluster.StormClusterStateImpl.issueMapCallback(StormClusterStateImpl.java:177) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.cluster.StormClusterStateImpl$1.changed(StormClusterStateImpl.java:122) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.cluster.ZKStateStorage$ZkWatcherCallBack.execute(ZKStateStorage.java:243) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > 
org.apache.storm.zookeeper.ClientZookeeper.lambda$mkClientImpl$0(ClientZookeeper.java:314) > ~[storm-client-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl$7.apply(CuratorFrameworkImpl.java:1048) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl$7.apply(CuratorFrameworkImpl.java:1041) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.shaded.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:92) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl.processEvent(CuratorFrameworkImpl.java:1040) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl.access$000(CuratorFrameworkImpl.java:66) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl$1.process(CuratorFrameworkImpl.java:126) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.curator.ConnectionState.process(ConnectionState.java:185) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:533) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT] > at > org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) > [storm-shaded-deps-2.1.0.jar:2.1.1-SNAPSHOT]{code} > This appears to be a 
regression from the migration from Clojure to Java in > STORM-1267 -- This message was sent by Atlassian Jira (v8.3.4#803005)
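The trace in STORM-3552 shows the NullPointerException being thrown by ConcurrentHashMap.get inside Level.getLevel, which is consistent with a null level name reaching the lookup. The sketch below reproduces that failure mode and shows a null guard. It is an assumption-laden illustration: LevelLookupSketch and its registry are hypothetical stand-ins, not Storm's or Log4j's code, and the guard is not necessarily the fix that was applied.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a level registry; not Storm's or Log4j's actual code.
final class LevelLookupSketch {
    private static final Map<String, Integer> LEVELS = new ConcurrentHashMap<>();

    static {
        // Illustrative entries only.
        LEVELS.put("DEBUG", 500);
        LEVELS.put("INFO", 400);
    }

    // Unguarded lookup: ConcurrentHashMap.get(null) throws NullPointerException.
    static Integer unsafeLookup(String name) {
        return LEVELS.get(name);
    }

    // Guarded lookup: a missing name is treated as "no level requested".
    static Integer safeLookup(String name) {
        return name == null ? null : LEVELS.get(name);
    }

    public static void main(String[] args) {
        System.out.println(safeLookup("DEBUG"));
        System.out.println(safeLookup(null));
        try {
            unsafeLookup(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
    }
}
```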
[jira] [Updated] (STORM-3511) Nimbus logs got flood with TTransportException Error messages (because of thrift 0.12.0)
[ https://issues.apache.org/jira/browse/STORM-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3511: -- Labels: pull-request-available (was: ) > Nimbus logs got flood with TTransportException Error messages (because of > thrift 0.12.0) > > > Key: STORM-3511 > URL: https://issues.apache.org/jira/browse/STORM-3511 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Ethan Li >Priority: Major > Labels: pull-request-available > > Submitting a wordCountTopology works in secure cluster. But the following > {code:java} > 2019-09-25 13:53:46.560 o.a.s.t.s.TThreadPoolServer pool-15-thread-1 [ERROR] > Thrift error occurred during processing of message. > org.apache.storm.thrift.transport.TTransportException: null > at > org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TSaslTransport.read(TSaslTransport.java:433) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:43) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > 
org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.security.auth.sasl.SaslTransportPlugin$TUGIWrapProcessor.process(SaslTransportPlugin.java:147) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:310) > [shaded-deps-2.0.1.y.jar:2.0.1.y] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_181] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_181] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] > {code} > flood the nimbus log. (2.0.1.y is our internal version. The code is basically > community 2.0.0) > This is similar to the issue found in Thrift community and got fixed in > 0.13.0 but it's not released yet > https://issues.apache.org/jira/browse/THRIFT-4805 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3551) Fix LocalAssignment Equivalency in Slot for Generice Resource Aware Scheduler
[ https://issues.apache.org/jira/browse/STORM-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3551: -- Labels: pull-request-available (was: ) > Fix LocalAssignment Equivalency in Slot for Generice Resource Aware Scheduler > - > > Key: STORM-3551 > URL: https://issues.apache.org/jira/browse/STORM-3551 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.1.0 >Reporter: Kishor Patil >Assignee: Kishor Patil >Priority: Major > Labels: pull-request-available > > If the supervisor defines a generic resource, it needs to exclude that resource from the > comparison when the assignment is not using it. > > {code} > 2019-12-03 21:02:10.635 o.a.s.d.s.Slot SLOT_6726 [INFO] SLOT 6726: Assignment > Changed from LocalAssignment(topology_id:TEST-WordCount-281-1570127542, > executors:[ExecutorInfo(task_start:261, task_end:261), > ExecutorInfo(task_start:188, task_end:188)], > resources:WorkerResources(mem_on_heap:1.0, mem_off_heap:0.0, cpu:400.0, > shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, > resources:\{offheap.memory.mb=0.0, onheap.memory.mb=1.0, > cpu.pcore.percent=400.0}, shared_resources:{}), owner:bud_storm) to > LocalAssignment(topology_id:TEST-WordCount-281-1570127542, > executors:[ExecutorInfo(task_start:261, task_end:261), > ExecutorInfo(task_start:188, task_end:188)], > resources:WorkerResources(mem_on_heap:1.0, mem_off_heap:0.0, cpu:400.0, > shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, > resources:\{offheap.memory.mb=0.0, network.resource.units=0.0, > onheap.memory.mb=1.0, cpu.pcore.percent=400.0}, shared_resources:{}), > owner:bud_storm) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
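In the STORM-3551 log, the two assignments differ only by the entry network.resource.units=0.0. One possible way to compare resource maps so that an unused (zero-valued) generic resource does not count as a change is sketched below. ResourceEquivalenceSketch and its methods are hypothetical, and the approach of dropping zero entries is an assumption, not necessarily what the actual patch does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Storm's Slot or LocalAssignment code.
final class ResourceEquivalenceSketch {
    // Drop entries whose value is 0.0 so an unused resource does not affect equality.
    static Map<String, Double> withoutZeroEntries(Map<String, Double> resources) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : resources.entrySet()) {
            if (e.getValue() != null && e.getValue() != 0.0) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Two resource maps are equivalent when their non-zero entries match.
    static boolean equivalent(Map<String, Double> a, Map<String, Double> b) {
        return withoutZeroEntries(a).equals(withoutZeroEntries(b));
    }

    public static void main(String[] args) {
        Map<String, Double> before = new HashMap<>();
        before.put("offheap.memory.mb", 0.0);
        before.put("onheap.memory.mb", 1.0);
        before.put("cpu.pcore.percent", 400.0);

        Map<String, Double> after = new HashMap<>(before);
        after.put("network.resource.units", 0.0);

        System.out.println(equivalent(before, after));
    }
}
```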
[jira] [Updated] (STORM-3549) use of topology specific jaas conf doesn't work with kafka
[ https://issues.apache.org/jira/browse/STORM-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3549: -- Labels: pull-request-available (was: ) > use of topology specific jaas conf doesn't work with kafka > -- > > Key: STORM-3549 > URL: https://issues.apache.org/jira/browse/STORM-3549 > Project: Apache Storm > Issue Type: Bug >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > {code:java} > 2019-09-17 19:22:23.006 o.a.s.u.Utils Thread-22-line-reader-spout-executor[4, > 4] [ERROR] Async loop died! > org.apache.kafka.common.KafkaException: Failed to construct kafka consumer > at > org.apache.kafka.clients.consumer.KafkaConsumer.(KafkaConsumer.java:702) > ~[stormjar.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer. (KafkaConsumer.java:557) > ~[stormjar.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer. (KafkaConsumer.java:540) > ~[stormjar.jar:?] > at > org.apache.storm.kafka.spout.internal.ConsumerFactoryDefault.createConsumer(ConsumerFactoryDefault.java:26) > ~[stormjar.jar:?] > at > org.apache.storm.kafka.spout.internal.ConsumerFactoryDefault.createConsumer(ConsumerFactoryDefault.java:22) > ~[stormjar.jar:?] > at org.apache.storm.kafka.spout.KafkaSpout.open(KafkaSpout.java:147) > ~[stormjar.jar:?] 
> at > org.apache.storm.executor.spout.SpoutExecutor.init(SpoutExecutor.java:148) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.executor.spout.SpoutExecutor.call(SpoutExecutor.java:158) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.executor.spout.SpoutExecutor.call(SpoutExecutor.java:55) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at org.apache.storm.utils.Utils$1.run(Utils.java:425) > [storm-client-2.0.1.y.jar:2.0.1.y] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] > Caused by: org.apache.kafka.common.KafkaException: > javax.security.auth.login.LoginException: Could not login: the client is > being asked for a password, but the Kafka client code does not currently > support obtaining a password from the user. not available to garner > authentication information from the user > at > org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86) > ~[stormjar.jar:?] > at > org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:70) > ~[stormjar.jar:?] > at > org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:83) > ~[stormjar.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer. (KafkaConsumer.java:623) > ~[stormjar.jar:?] > ... 10 more > Caused by: javax.security.auth.login.LoginException: Could not login: the > client is being asked for a password, but the Kafka client code does not > currently support obtaining a password from the user. 
not available to garner > authentication information from the user > at > com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:940) > ~[?:1.8.0_181] > at > com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) > ~[?:1.8.0_181] > at > com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) > ~[?:1.8.0_181] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_181] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_181] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_181] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755) > ~[?:1.8.0_181] > at > javax.security.auth.login.LoginContext.access$000(LoginContext.java:195) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680) > ~[?:1.8.0_181] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_181] > at > javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680) > ~[?:1.8.0_181] > at javax.security.auth.login.LoginContext.login(LoginContext.java:587) > ~[?:1.8.0_181] > at > org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:69) > ~[stormjar.jar:?] > at > org.apache.kafka.common.security.kerberos.KerberosLogin.login(KerberosLogin.java:110) > ~[stormjar.jar:?] > at > org.apache.kafka.common.security.authenticator.LoginManager.
[jira] [Updated] (STORM-3548) Remove iterator from Task.sendUnanchored
[ https://issues.apache.org/jira/browse/STORM-3548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3548: -- Labels: pull-request-available (was: ) > Remove iterator from Task.sendUnanchored > > > Key: STORM-3548 > URL: https://issues.apache.org/jira/browse/STORM-3548 > Project: Apache Storm > Issue Type: Improvement > Components: storm-client >Affects Versions: 2.1.0 >Reporter: Christopher Johnson >Assignee: Christopher Johnson >Priority: Minor > Labels: pull-request-available > > Storm 2.x aims to remove iterators from the critical path to reduce garbage. > > This method is called to send acking tuples, so it should use an indexed for loop instead > of a list iterator. -- This message was sent by Atlassian Jira (v8.3.4#803005)
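The difference between the two loop forms mentioned in STORM-3548 can be sketched as follows. IndexedLoopSketch is a hypothetical class written for illustration and is not the code of Task.sendUnanchored. The enhanced for loop obtains an Iterator from the list on every call, which may be allocated on the heap, while the indexed loop does not create one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual Task.sendUnanchored implementation.
final class IndexedLoopSketch {
    // Enhanced for: calls values.iterator(), which may allocate an Iterator per call.
    static int sumWithIterator(List<Integer> values) {
        int sum = 0;
        for (Integer v : values) {
            sum += v;
        }
        return sum;
    }

    // Indexed loop: no Iterator object; efficient for RandomAccess lists such as ArrayList.
    static int sumIndexed(List<Integer> values) {
        int sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += values.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        values.add(1);
        values.add(2);
        values.add(3);
        System.out.println(sumWithIterator(values));
        System.out.println(sumIndexed(values));
    }
}
```

The indexed form is only a good trade for lists with constant-time get, such as ArrayList; on a LinkedList each get(i) walks the list.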
[jira] [Updated] (STORM-3545) blob update spews errors until cleanup occurs after topology killed
[ https://issues.apache.org/jira/browse/STORM-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3545: -- Labels: pull-request-available (was: ) > blob update spews errors until cleanup occurs after topology killed > --- > > Key: STORM-3545 > URL: https://issues.apache.org/jira/browse/STORM-3545 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3541) allow reporting of v2 metrics api using metrics tick
[ https://issues.apache.org/jira/browse/STORM-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3541: -- Labels: pull-request-available (was: ) > allow reporting of v2 metrics api using metrics tick > > > Key: STORM-3541 > URL: https://issues.apache.org/jira/browse/STORM-3541 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > We would like to be able to report v2 metrics api through the metrics tick, > like v1 metrics. > > The main reason (besides keeping one reporter) is to reduce connection load > on our openTSDB monitoring service. Having one node doing reporting for a > topology decreases the number of connections. > > This also allows an easy migration path for users to the new v2 API. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3540) Pacemaker race condition can cause continual reconnection
[ https://issues.apache.org/jira/browse/STORM-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3540: -- Labels: pull-request-available (was: ) > Pacemaker race condition can cause continual reconnection > - > > Key: STORM-3540 > URL: https://issues.apache.org/jira/browse/STORM-3540 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > Seeing issues with connections to pacemaker with some workers despite > pacemaker being up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3539) Add metric for worker start time out
[ https://issues.apache.org/jira/browse/STORM-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3539: -- Labels: pull-request-available (was: ) > Add metric for worker start time out > > > Key: STORM-3539 > URL: https://issues.apache.org/jira/browse/STORM-3539 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3538) Add Meter for sendSupervisorAssignments exception
[ https://issues.apache.org/jira/browse/STORM-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3538: -- Labels: pull-request-available (was: ) > Add Meter for sendSupervisorAssignments exception > - > > Key: STORM-3538 > URL: https://issues.apache.org/jira/browse/STORM-3538 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > > We've observed exceptions which are only logged. Adding a meter allows > improved tracking. > [https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/nimbus/AssignmentDistributionService.java#L292-L293] > 2019-11-12 18:39:47.466 o.a.s.n.AssignmentDistributionService > pool-25-thread-9 [ERROR] Exception when trying to send assignments to node > 21b6cb54-da09-4e5e-a4ca-d0dc3a0df5bd-10.209.156.143-numa-1: SASL > authentication not complete > 2019-11-12 18:41:54.833 o.a.s.n.AssignmentDistributionService > pool-25-thread-2 [ERROR] Exception when trying to send assignments to node > 21b6cb54-da09-4e5e-a4ca-d0dc3a0df5bd-10.209.156.143-numa-1: > java.net.SocketTimeoutException: Read timed out > 2019-11-12 18:43:59.525 o.a.s.n.AssignmentDistributionService pool-25-thread-6 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3536) Add Generic-resources.md
[ https://issues.apache.org/jira/browse/STORM-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3536: -- Labels: pull-request-available (was: ) > Add Generic-resources.md > > > Key: STORM-3536 > URL: https://issues.apache.org/jira/browse/STORM-3536 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3529) Catch and log RetriableException in KafkaOffsetMetric
[ https://issues.apache.org/jira/browse/STORM-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3529: -- Labels: newbie pull-request-available (was: newbie) > Catch and log RetriableException in KafkaOffsetMetric > - > > Key: STORM-3529 > URL: https://issues.apache.org/jira/browse/STORM-3529 > Project: Apache Storm > Issue Type: Improvement > Components: storm-kafka-client >Affects Versions: 2.0.0 >Reporter: Stig Rohde Døssing >Priority: Major > Labels: newbie, pull-request-available > > When the KafkaOffsetMetric.getValueAndReset method calls the KafkaClient > methods, exceptions may be thrown. When these exceptions are retriable, we > should not crash the worker by letting them escape the method. We should > instead catch and log the exception. > An example of the desired behavior can be seen at > https://github.com/apache/storm/blob/7b1a98fc10fad516ef9ed0b3afc53a1d7be8a169/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L295 -- This message was sent by Atlassian Jira (v8.3.4#803005)
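The catch-and-log behavior described in STORM-3529 can be sketched without the Kafka dependency as follows. Everything here is an assumption made for illustration: RetriableFailure is a hypothetical stand-in for Kafka's RetriableException, the Supplier stands in for the KafkaClient calls, and returning null for a skipped reading is a design choice of this sketch rather than something the issue specifies.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the actual KafkaOffsetMetric implementation.
final class RetriableMetricSketch {
    private static final Logger LOG = Logger.getLogger(RetriableMetricSketch.class.getName());

    // Hypothetical stand-in for Kafka's RetriableException.
    static class RetriableFailure extends RuntimeException {
        RetriableFailure(String message) {
            super(message);
        }
    }

    // Returns the metric value, or null when a retriable failure occurs.
    // Non-retriable exceptions still propagate to the caller.
    static Object getValueAndReset(Supplier<Object> fetchOffsets) {
        try {
            return fetchOffsets.get();
        } catch (RetriableFailure e) {
            LOG.log(Level.WARNING, "Failed to fetch offsets; skipping this metrics tick", e);
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(getValueAndReset(() -> 42));
        System.out.println(getValueAndReset(() -> {
            throw new RetriableFailure("simulated timeout");
        }));
    }
}
```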
[jira] [Updated] (STORM-3534) Add generic resources to UI
[ https://issues.apache.org/jira/browse/STORM-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3534: -- Labels: pull-request-available (was: ) > Add generic resources to UI > --- > > Key: STORM-3534 > URL: https://issues.apache.org/jira/browse/STORM-3534 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3530) Improve Scheduling Failure Message
[ https://issues.apache.org/jira/browse/STORM-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3530: -- Labels: pull-request-available (was: ) > Improve Scheduling Failure Message > -- > > Key: STORM-3530 > URL: https://issues.apache.org/jira/browse/STORM-3530 > Project: Apache Storm > Issue Type: Improvement >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > > Users find it difficult to determine the cause of a topology scheduling > failure. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3211) WindowedBoltExecutor NPE if wrapped bolt returns null from getComponentConfiguration
[ https://issues.apache.org/jira/browse/STORM-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3211: -- Labels: newbie pull-request-available (was: newbie) > WindowedBoltExecutor NPE if wrapped bolt returns null from > getComponentConfiguration > > > Key: STORM-3211 > URL: https://issues.apache.org/jira/browse/STORM-3211 > Project: Apache Storm > Issue Type: Task > Components: storm-client, storm-core >Affects Versions: 2.0.0, 1.2.2 >Reporter: Stig Rohde Døssing >Priority: Major > Labels: newbie, pull-request-available > > {code} > Exception in thread "main" java.lang.NullPointerException > at > org.apache.storm.topology.WindowedBoltExecutor.declareOutputFields(WindowedBoltExecutor.java:309) > at > org.apache.storm.topology.TopologyBuilder.getComponentCommon(TopologyBuilder.java:432) > at > org.apache.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:120) > at Main.main(Main.java:23) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
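A null guard of the kind STORM-3211 calls for can be sketched as follows. The Component interface and NullConfigSketch class are hypothetical stand-ins, not Storm's bolt interfaces or WindowedBoltExecutor, and treating a null configuration as an empty map is an assumption about the desired behavior.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch; not Storm's WindowedBoltExecutor.
final class NullConfigSketch {
    // Stand-in for a component whose configuration may legitimately be null.
    interface Component {
        Map<String, Object> getComponentConfiguration();
    }

    // Treat a null configuration as an empty one so later lookups cannot NPE.
    static Map<String, Object> configOrEmpty(Component component) {
        Map<String, Object> conf = component.getComponentConfiguration();
        return conf == null ? Collections.<String, Object>emptyMap() : conf;
    }

    static Object lookup(Component component, String key) {
        return configOrEmpty(component).get(key);
    }

    public static void main(String[] args) {
        // "example.key" is an illustrative key, not a real Storm setting.
        System.out.println(lookup(() -> null, "example.key"));
    }
}
```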
[jira] [Updated] (STORM-3066) Storm Flux variable substitution
[ https://issues.apache.org/jira/browse/STORM-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3066: -- Labels: newbie pull-request-available (was: newbie) > Storm Flux variable substitution > > > Key: STORM-3066 > URL: https://issues.apache.org/jira/browse/STORM-3066 > Project: Apache Storm > Issue Type: Improvement >Reporter: Calvin Chen >Priority: Minor > Labels: newbie, pull-request-available > > We are using Flux to submit Storm topologies, and we need to substitute > variables in the topology YAML file. According to > [https://github.com/apache/storm/tree/master/flux], a variable can be > substituted using the format ${variable.name}. In my case the > variable is a list, and I only want the first element of the list. How can > I do that? > I tried ${variable.0} and ${variable}[0], but neither works: after > substitution, the result is element_name.0 or element_name[0], which is not what I > expected (the expected result is element_name). Please let me know how to > specify the first element of a list in Flux. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3512) Nimbus failing on startup with `GLIBC_2.12' not found
[ https://issues.apache.org/jira/browse/STORM-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3512: -- Labels: pull-request-available (was: ) > Nimbus failing on startup with `GLIBC_2.12' not found > - > > Key: STORM-3512 > URL: https://issues.apache.org/jira/browse/STORM-3512 > Project: Apache Storm > Issue Type: Bug > Components: storm-metrics >Affects Versions: 2.0.0 >Reporter: Sergey Titov >Priority: Major > Labels: pull-request-available > > Nimbus is failing to start with an exception (see below). > > {code:java} > 2019-09-25 17:21:56.013 o.a.s.u.Utils main [ERROR] Received error in thread > main.. terminating server... > java.lang.Error: java.lang.UnsatisfiedLinkError: > /tmp/librocksdbjni3787537456845796855.so: /lib64/libpthread.so.0: version > `GLIBC_2.12' not found (required by /tmp/librocksdbjni3787537456845796855.so) > at > org.apache.storm.utils.Utils.handleUncaughtException(Utils.java:647) > ~[storm-client-2.0.0.jar:2.0.0] > at > org.apache.storm.utils.Utils.handleUncaughtException(Utils.java:626) > ~[storm-client-2.0.0.jar:2.0.0] > at > org.apache.storm.utils.Utils.lambda$createDefaultUncaughtExceptionHandler$2(Utils.java:982) > ~[storm-client-2.0.0.jar:2.0.0] > at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1057) > [?:1.8.0_211] > at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:1052) > [?:1.8.0_211] > at java.lang.Thread.dispatchUncaughtException(Thread.java:1959) > [?:1.8.0_211] > Caused by: java.lang.UnsatisfiedLinkError: > /tmp/librocksdbjni3787537456845796855.so: /lib64/libpthread.so.0: version > `GLIBC_2.12' not found (required by /tmp/librocksdbjni3787537456845796855.so) > at java.lang.ClassLoader$NativeLibrary.load(Native Method) > ~[?:1.8.0_211] > at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941) > ~[?:1.8.0_211] > at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824) > ~[?:1.8.0_211] > at java.lang.Runtime.load0(Runtime.java:809) ~[?:1.8.0_211] > 
at java.lang.System.load(System.java:1086) ~[?:1.8.0_211] > at > org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78) > ~[rocksdbjni-5.8.6.jar:?] > at > org.rocksdb.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:56) > ~[rocksdbjni-5.8.6.jar:?] > at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:64) > ~[rocksdbjni-5.8.6.jar:?] > at org.rocksdb.RocksDB.(RocksDB.java:35) > ~[rocksdbjni-5.8.6.jar:?] > at > org.apache.storm.metricstore.rocksdb.RocksDbStore.prepare(RocksDbStore.java:67) > ~[storm-server-2.0.0.jar:2.0.0] > at > org.apache.storm.metricstore.MetricStoreConfig.configure(MetricStoreConfig.java:33) > ~[storm-server-2.0.0.jar:2.0.0] > at org.apache.storm.daemon.nimbus.Nimbus.(Nimbus.java:528) > ~[storm-server-2.0.0.jar:2.0.0] > at org.apache.storm.daemon.nimbus.Nimbus.(Nimbus.java:471) > ~[storm-server-2.0.0.jar:2.0.0] > at org.apache.storm.daemon.nimbus.Nimbus.(Nimbus.java:465) > ~[storm-server-2.0.0.jar:2.0.0] > at > org.apache.storm.daemon.nimbus.Nimbus.launchServer(Nimbus.java:1282) > ~[storm-server-2.0.0.jar:2.0.0] > at org.apache.storm.daemon.nimbus.Nimbus.launch(Nimbus.java:1307) > ~[storm-server-2.0.0.jar:2.0.0] > at org.apache.storm.daemon.nimbus.Nimbus.main(Nimbus.java:1312) > ~[storm-server-2.0.0.jar:2.0.0] > {code} > > Environment: > {code:java} > >>> uname -a > Linux gctdwp03 3.0.101-108.98-default #1 SMP Mon Jul 15 13:58:06 UTC 2019 > (262a94d) x86_64 x86_64 x86_64 GNU/Linux > > >>> cat /etc/SuSE-release > SUSE Linux Enterprise Server 11 (x86_64) > VERSION = 11 > PATCHLEVEL = 4{code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3528) Allow users to provide their own custom TriggerPolicy/EvictionPolicy in BaseWindowBolt
[ https://issues.apache.org/jira/browse/STORM-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3528: -- Labels: pull-request-available (was: ) > Allow users to provide their own custom TriggerPolicy/EvictionPolicy in > BaseWindowBolt > -- > > Key: STORM-3528 > URL: https://issues.apache.org/jira/browse/STORM-3528 > Project: Apache Storm > Issue Type: Improvement >Reporter: Aparna Ravindra >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3527) Container.getWorkerUser() should check if the user name is empty
[ https://issues.apache.org/jira/browse/STORM-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3527: -- Labels: pull-request-available (was: ) > Container.getWorkerUser() should check if the user name is empty > > > Key: STORM-3527 > URL: https://issues.apache.org/jira/browse/STORM-3527 > Project: Apache Storm > Issue Type: Bug >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Minor > Labels: pull-request-available > > Sometimes the supervisor is terminated or dies while writing the username to the > workers-users file. When that happens, the file can be left empty. When the > supervisor later recovers, it cannot get the correct username > because the workers-users file is present but empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
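The check described in STORM-3527 can be sketched as follows: a user file that exists but is empty is treated the same as a missing one, so the caller can fall back to another source. WorkerUserFileSketch, parseUser, and readUser are hypothetical names for illustration and are not the code of Container.getWorkerUser(); what the caller falls back to is left open.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical sketch; not the actual Container.getWorkerUser() implementation.
final class WorkerUserFileSketch {
    // An empty or whitespace-only file yields no user instead of an empty name.
    static Optional<String> parseUser(String fileContents) {
        String user = fileContents == null ? "" : fileContents.trim();
        return user.isEmpty() ? Optional.<String>empty() : Optional.of(user);
    }

    // Reads the user file; a missing file and an empty file are handled alike.
    static Optional<String> readUser(Path file) throws IOException {
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        return parseUser(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("worker-user", ".txt");
        System.out.println(readUser(file).isPresent());
        Files.write(file, "alice\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUser(file).isPresent());
        Files.delete(file);
    }
}
```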
[jira] [Updated] (STORM-3526) Always override Login Config from System property
[ https://issues.apache.org/jira/browse/STORM-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3526: -- Labels: pull-request-available (was: ) > Always override Login Config from System property > - > > Key: STORM-3526 > URL: https://issues.apache.org/jira/browse/STORM-3526 > Project: Apache Storm > Issue Type: Bug > Components: storm-client >Reporter: Kishor Patil >Assignee: Kishor Patil >Priority: Major > Labels: pull-request-available > > Daemons should always be able to read the system property passed as a JVM > argument to override the default config file setting for the login config. -- This message was sent by Atlassian Jira (v8.3.4#803005)
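The precedence described in STORM-3526 can be sketched as follows. The issue does not name the property, so this sketch assumes the standard JAAS property java.security.auth.login.config; LoginConfigSketch and its resolve methods are hypothetical and are not Storm's code.

```java
// Hypothetical sketch; the property choice is an assumption, not confirmed by the issue.
final class LoginConfigSketch {
    // Standard JAAS system property, set with -Djava.security.auth.login.config=...
    static final String JAAS_PROPERTY = "java.security.auth.login.config";

    // A non-blank system property value always wins over the config file value.
    static String resolve(String systemPropertyValue, String configFileValue) {
        if (systemPropertyValue != null && !systemPropertyValue.trim().isEmpty()) {
            return systemPropertyValue;
        }
        return configFileValue;
    }

    // Convenience overload that reads the property from the running JVM.
    static String resolve(String configFileValue) {
        return resolve(System.getProperty(JAAS_PROPERTY), configFileValue);
    }

    public static void main(String[] args) {
        // The path below is illustrative only.
        System.out.println(resolve("/etc/storm/default_jaas.conf"));
    }
}
```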
[jira] [Updated] (STORM-3521) Storm CLI jar command doesn't handle topology arguments correctly
[ https://issues.apache.org/jira/browse/STORM-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3521: -- Labels: pull-request-available (was: ) > Storm CLI jar command doesn't handle topology arguments correctly > - > > Key: STORM-3521 > URL: https://issues.apache.org/jira/browse/STORM-3521 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Govind Menon >Assignee: Govind Menon >Priority: Major > Labels: pull-request-available > Fix For: 2.0.1, 2.1.0, 2.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3525) Large Contraint Solver test fails on some VM
[ https://issues.apache.org/jira/browse/STORM-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3525: -- Labels: pull-request-available (was: ) > Large Contraint Solver test fails on some VM > > > Key: STORM-3525 > URL: https://issues.apache.org/jira/browse/STORM-3525 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.2.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Major > Labels: pull-request-available > > The TestConstraintSolverStrategy.testScheduleLargeExecutorConstraintCount() > method fails on some VMs when the parallelismMultiplier is set to 20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3524) worker fails to launch due to missing parent directory for localized resource
[ https://issues.apache.org/jira/browse/STORM-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3524: -- Labels: pull-request-available (was: ) > worker fails to launch due to missing parent directory for localized resource > - > > Key: STORM-3524 > URL: https://issues.apache.org/jira/browse/STORM-3524 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Major > Labels: pull-request-available > > {code:java} > 2019-10-14 14:59:29.839 o.a.s.l.LocalizedResource AsyncLocalizer Executor - 2 > [WARN] Nothing to cleanup with badeDir > /home/y/var/storm/supervisor/usercache/xxx/filecache/files even though we > expected there to be something there 2019-10-14 14:59:29.839 > o.a.s.l.AsyncLocalizer AsyncLocalizer Executor - 2 [WARN] Failed to download > blob xxx:xxx.topology.yaml will try again in 100 ms > java.nio.file.NoSuchFileException: > /home/y/var/storm/supervisor/usercache/xxx/filecache/files at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > ~[?:1.8.0_181] at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > ~[?:1.8.0_181] at > sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > ~[?:1.8.0_181] at > sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384) > ~[?:1.8.0_181] at java.nio.file.Files.createDirectory(Files.java:674) > ~[?:1.8.0_181] at > org.apache.storm.localizer.LocalizedResource.lambda$fetchUnzipToTemp$4(LocalizedResource.java:257) > ~[storm-server-2.0.1.y.jar:2.0.1.y] at > org.apache.storm.localizer.LocallyCachedBlob.fetch(LocallyCachedBlob.java:92) > ~[storm-server-2.0.1.y.jar:2.0.1.y] at > org.apache.storm.localizer.LocalizedResource.fetchUnzipToTemp(LocalizedResource.java:250) > ~[storm-server-2.0.1.y.jar:2.0.1.y] at > org.apache.storm.localizer.AsyncLocalizer.lambda$downloadOrUpdate$10(AsyncLocalizer.java:277) > 
~[storm-server-2.0.1.y.jar:2.0.1.y] at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > [?:1.8.0_181] at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [?:1.8.0_181] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [?:1.8.0_181] at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [?:1.8.0_181] at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [?:1.8.0_181] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_181] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] > {code} > > A worker on a supervisor was failing to come up with this error continually > presenting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
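The trace above shows `Files.createDirectory` throwing `NoSuchFileException`, which is what that call does when the parent directory is missing. As a hedged illustration (not necessarily how the issue was fixed in Storm), `Files.createDirectories` creates any missing parents and is a no-op when the directory already exists. The helper name below is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; not part of Storm.
class ParentDirSketch {

    /**
     * Creates dir and all missing parent directories.
     * Unlike Files.createDirectory, this does not fail when a parent is absent
     * or when dir already exists as a directory.
     */
    static Path ensureDirectory(Path dir) {
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```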
[jira] [Updated] (STORM-2749) Remove state spout since it's never supported by storm
[ https://issues.apache.org/jira/browse/STORM-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-2749: -- Labels: pull-request-available (was: ) > Remove state spout since it's never supported by storm > -- > > Key: STORM-2749 > URL: https://issues.apache.org/jira/browse/STORM-2749 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Priority: Major > Labels: pull-request-available > > We think we probably want to get rid of the state spout code since it was never > implemented. > The related code can be traced back to the very early days of Storm, e.g. > [TopologyBuilder#setStateSpout|https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/topology/TopologyBuilder.java#L436-L443] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3523) supervisor restarts when releasing slot with missing file
[ https://issues.apache.org/jira/browse/STORM-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3523: -- Labels: pull-request-available (was: ) > supervisor restarts when releasing slot with missing file > - > > Key: STORM-3523 > URL: https://issues.apache.org/jira/browse/STORM-3523 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > {code:java} > 2019-10-03 16:25:32.809 o.a.s.d.s.Slot SLOT_6719 [ERROR] Error when > processing event > java.io.FileNotFoundException: File > 'x/storm/supervisor/stormdist/xxx-190213-004131-001-209-1550018519/stormconf.ser' > does not exist > at > org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851) > ~[shaded-deps-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:308) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:469) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:303) > ~[storm-client-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:359) > ~[storm-server-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.localizer.AsyncLocalizer.releaseSlotFor(AsyncLocalizer.java:460) > ~[storm-server-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.daemon.supervisor.Slot.handleWaitingForBlobLocalization(Slot.java:435) > ~[storm-server-2.0.1.y.jar:2.0.1.y] > at > org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:229) > ~[storm-server-2.0.1.y.jar:2.0.1.y] > at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:900) > 
[storm-server-2.0.1.y.jar:2.0.1.y] > 2019-10-03 16:25:32.810 o.a.s.u.Utils SLOT_6719 [ERROR] Halting process: > Error when processing an event > java.lang.RuntimeException: Halting process: Error when processing an event > at org.apache.storm.utils.Utils.exitProcess(Utils.java:550) > [storm-client-2.0.1.y.jar:2.0.1.y] > at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:947) > [storm-server-2.0.1.y.jar:2.0.1.y] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3518) configs for favored nodes and unfavored nodes should support range of numbers
[ https://issues.apache.org/jira/browse/STORM-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3518: -- Labels: pull-request-available (was: ) > configs for favored nodes and unfavored nodes should support range of numbers > - > > Key: STORM-3518 > URL: https://issues.apache.org/jira/browse/STORM-3518 > Project: Apache Storm > Issue Type: Improvement >Reporter: Ethan Li >Assignee: Daisy Chen >Priority: Minor > Labels: pull-request-available > > Apache Storm has two configs for topologies to choose favored nodes and > unfavored nodes. > https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/Config.java#L351 > {code:java} > /** > * A list of host names that this topology would prefer to be scheduled > on (no guarantee is given though). This is intended for > * debugging only. > */ > @IsStringList > public static final String TOPOLOGY_SCHEDULER_FAVORED_NODES = > "topology.scheduler.favored.nodes"; > /** > * A list of host names that this topology would prefer to NOT be > scheduled on (no guarantee is given though). This is intended for > * debugging only. > */ > @IsStringList > public static final String TOPOLOGY_SCHEDULER_UNFAVORED_NODES = > "topology.scheduler.unfavored.nodes"; > {code} > It currently supports only plain text, for example > {code:java} > host1.yahoo.com > host2.yahoo.com > hostA3.yahoo.com > hostA4.yahoo.com > {code} > It would be nice to be able to support ranges like > {code:java} > host[1-2].yahoo.com > hostA[3-4].yahoo.com > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
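A rough sketch of how the range syntax proposed in STORM-3518 could be expanded into plain host names. The class name, method name, and the exact `[start-end]` grammar are assumptions made for illustration; the issue itself only gives the two example patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Storm.
class HostRangeSketch {
    // Matches a numeric range such as "[1-2]" (assumed syntax).
    private static final Pattern RANGE = Pattern.compile("\\[(\\d+)-(\\d+)\\]");

    /** Expands "host[1-2].yahoo.com" into host1.yahoo.com and host2.yahoo.com. */
    static List<String> expand(String pattern) {
        List<String> hosts = new ArrayList<>();
        Matcher m = RANGE.matcher(pattern);
        if (!m.find()) {
            hosts.add(pattern); // plain host name, nothing to expand
            return hosts;
        }
        int start = Integer.parseInt(m.group(1));
        int end = Integer.parseInt(m.group(2));
        if (start > end) {
            throw new IllegalArgumentException("Invalid range in: " + pattern);
        }
        String prefix = pattern.substring(0, m.start());
        // Recurse on the remainder so that several ranges in one pattern also expand.
        List<String> suffixes = expand(pattern.substring(m.end()));
        for (int i = start; i <= end; i++) {
            for (String suffix : suffixes) {
                hosts.add(prefix + i + suffix);
            }
        }
        return hosts;
    }
}
```

Plain host names pass through unchanged, so existing configuration values would keep their current meaning under this scheme.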
[jira] [Updated] (STORM-3188) Removing try-catch block from getAndResetWorkerHeartbeats
[ https://issues.apache.org/jira/browse/STORM-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3188: -- Labels: pull-request-available (was: ) > Removing try-catch block from getAndResetWorkerHeartbeats > - > > Key: STORM-3188 > URL: https://issues.apache.org/jira/browse/STORM-3188 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Priority: Minor > Labels: pull-request-available > > After refactoring, SupervisorUtils.readWorkerHeartbeats no longer throws > checked Exceptions. I'm wondering if we still want to keep the try-catch > block to wrap around its invocation in getAndResetWorkerHeartbeats in > ReportWorkerHeartbeats.java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3231) TopologyBySubmissionTimeComparator does not consider priority
[ https://issues.apache.org/jira/browse/STORM-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3231: -- Labels: pull-request-available (was: ) > TopologyBySubmissionTimeComparator does not consider priority > - > > Key: STORM-3231 > URL: https://issues.apache.org/jira/browse/STORM-3231 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > TopologyBySubmissionTimeComparator indicates "Comparator that sorts > topologies by priority and then by submission time", but the code only > considers uptime. > > I am not sure what the intent should be. Either the code or comment should > be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
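If the comment is the intended behaviour, the ordering could look like the sketch below. This is an illustration only: the `Topo` class is a hypothetical stand-in for the real topology details, and both the assumption that a lower priority number means higher priority and the assumption that longer uptime means earlier submission would need to be confirmed against the scheduler code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical types, for illustration only; not the Storm classes.
class TopologyOrderSketch {

    static final class Topo {
        final String name;
        final int priority;   // assumption: lower number = higher priority
        final int upTimeSecs; // assumption: longer uptime = submitted earlier

        Topo(String name, int priority, int upTimeSecs) {
            this.name = name;
            this.priority = priority;
            this.upTimeSecs = upTimeSecs;
        }
    }

    /** Sorts by priority first, then by submission time (oldest submission first). */
    static final Comparator<Topo> BY_PRIORITY_THEN_SUBMISSION =
        Comparator.comparingInt((Topo t) -> t.priority)
            .thenComparing(Comparator.comparingInt((Topo t) -> t.upTimeSecs).reversed());

    static List<String> sortedNames(List<Topo> topologies) {
        List<Topo> copy = new ArrayList<>(topologies);
        copy.sort(BY_PRIORITY_THEN_SUBMISSION);
        List<String> names = new ArrayList<>();
        for (Topo t : copy) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames(Arrays.asList(
            new Topo("young-high", 1, 10),
            new Topo("old-low", 5, 900),
            new Topo("old-high", 1, 500))));
    }
}
```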
[jira] [Updated] (STORM-3520) Storm CLI drpc-client incorrectly validating function args
[ https://issues.apache.org/jira/browse/STORM-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3520: -- Labels: pull-request-available (was: ) > Storm CLI drpc-client incorrectly validating function args > -- > > Key: STORM-3520 > URL: https://issues.apache.org/jira/browse/STORM-3520 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Govind Menon >Assignee: Govind Menon >Priority: Major > Labels: pull-request-available > Fix For: 2.0.1, 2.1.0, 2.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3519) Change ConstraintSolverStrategy::backtrackSearch to avoid StackOverflowException
[ https://issues.apache.org/jira/browse/STORM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3519: -- Labels: pull-request-available (was: ) > Change ConstraintSolverStrategy::backtrackSearch to avoid > StackOverflowException > > > Key: STORM-3519 > URL: https://issues.apache.org/jira/browse/STORM-3519 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.0.0 >Reporter: Bipin Prasad >Assignee: Bipin Prasad >Priority: Major > Labels: pull-request-available > > When ConstraintSolverStrategy::backtrackSearch recursively calls itself, a > StackOverflowException occurs after approximately 2 calls. This can > be replicated by running > TestConstraintSolverStrategy::testScheduleLargeExecutorConstraintCount. -- This message was sent by Atlassian Jira (v8.3.4#803005)
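One general way to avoid stack overflow in a deep backtracking search is to replace recursion with explicit per-depth state held on the heap. The sketch below shows that technique on a toy problem; it is not the ConstraintSolverStrategy code, and the constraint (neighbouring items may not share a slot) is an assumption chosen only to keep the example small.

```java
import java.util.Arrays;

// Toy example of iterative backtracking, for illustration only; not the Storm solver.
class IterativeBacktrackSketch {

    /** Toy constraint (an assumption): an item may not use the same slot as the previous item. */
    static boolean allowed(int[] assignment, int index, int slot) {
        return index == 0 || assignment[index - 1] != slot;
    }

    /** Returns a slot per item, or null if no valid assignment exists. Uses no recursion. */
    static int[] solve(int items, int slots) {
        int[] assignment = new int[items];
        int[] nextSlot = new int[items]; // next candidate slot to try at each depth
        int depth = 0;
        while (depth >= 0 && depth < items) {
            boolean placed = false;
            while (nextSlot[depth] < slots) {
                int slot = nextSlot[depth]++;
                if (allowed(assignment, depth, slot)) {
                    assignment[depth] = slot;
                    placed = true;
                    break;
                }
            }
            if (placed) {
                depth++;
                if (depth < items) {
                    nextSlot[depth] = 0; // start fresh at the new depth
                }
            } else {
                depth--; // backtrack: resume the remaining candidates one level up
            }
        }
        return depth == items ? assignment : null;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(solve(4, 2)));
        // A depth that would be risky for a recursive version is fine here.
        System.out.println(solve(200_000, 2).length);
    }
}
```

The search depth is then bounded by heap size rather than by the thread stack size, which varies between JVMs and machines.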
[jira] [Updated] (STORM-3515) Storm CLI config options are passed directly to underlying JAVA cli
[ https://issues.apache.org/jira/browse/STORM-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3515: -- Labels: pull-request-available (was: ) > Storm CLI config options are passed directly to underlying JAVA cli > --- > > Key: STORM-3515 > URL: https://issues.apache.org/jira/browse/STORM-3515 > Project: Apache Storm > Issue Type: Bug > Components: storm-client >Affects Versions: 2.0.0 >Reporter: Govind Menon >Assignee: Govind Menon >Priority: Major > Labels: pull-request-available > Fix For: 2.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3516) Kill or Rebalance Topology not processed on Nimbus restart
[ https://issues.apache.org/jira/browse/STORM-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3516: -- Labels: pull-request-available (was: ) > Kill or Rebalance Topology not processed on Nimbus restart > -- > > Key: STORM-3516 > URL: https://issues.apache.org/jira/browse/STORM-3516 > Project: Apache Storm > Issue Type: Bug >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (STORM-3186) Customizable configuration for metric reporting interval
[ https://issues.apache.org/jira/browse/STORM-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3186: -- Labels: newbie pull-request-available (was: newbie) > Customizable configuration for metric reporting interval > > > Key: STORM-3186 > URL: https://issues.apache.org/jira/browse/STORM-3186 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server, storm-webapp >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Rishabh Jain >Priority: Major > Labels: newbie, pull-request-available > > In the current implementation, all subclasses of ScheduledReporter have a hard-coded > report interval of 10 seconds. However, I think it would make sense to make > this a configuration item so users can change the reporting frequency to > fit their needs. > See discussion https://github.com/apache/storm/pull/2764#discussion_r203726617 -- This message was sent by Atlassian Jira (v8.3.4#803005)
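A small sketch of reading the interval from configuration with the current 10-second value as the default. The config key below is hypothetical (the issue does not name one), as are the class and method names; the resulting value would then be passed to the reporter's `start(period, unit)` call instead of the hard-coded constant.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Storm.
class ReportIntervalSketch {
    // Hypothetical key name, invented for this example.
    static final String INTERVAL_KEY = "example.metrics.report.interval.secs";
    // Matches the hard-coded interval described in the issue.
    static final long DEFAULT_INTERVAL_SECS = 10L;

    /** Returns the configured interval in seconds, or the default when the key is absent. */
    static long reportIntervalSecs(Map<String, Object> conf) {
        Object raw = conf.get(INTERVAL_KEY);
        if (raw == null) {
            return DEFAULT_INTERVAL_SECS;
        }
        long secs = (raw instanceof Number)
            ? ((Number) raw).longValue()
            : Long.parseLong(raw.toString().trim());
        if (secs <= 0) {
            throw new IllegalArgumentException(INTERVAL_KEY + " must be positive, got " + secs);
        }
        return secs;
    }

    public static void main(String[] args) {
        System.out.println(reportIntervalSecs(Collections.<String, Object>emptyMap()));
    }
}
```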
[jira] [Updated] (STORM-3510) WorkerState.transferLocalBatch backpressure resend logic fix
[ https://issues.apache.org/jira/browse/STORM-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3510: -- Labels: pull-request-available (was: ) > WorkerState.transferLocalBatch backpressure resend logic fix > > > Key: STORM-3510 > URL: https://issues.apache.org/jira/browse/STORM-3510 > Project: Apache Storm > Issue Type: Bug > Components: storm-client >Affects Versions: 2.2.0 >Reporter: Christopher Johnson >Priority: Minor > Labels: pull-request-available > > WorkerState.transferLocalBatch uses an int lastOverflowCount to track the > size of the overflow queue, and periodically resend the backpressure status > to remote workers if the queue continues to grow. > > The current implementation has two problems: > * The single variable tracks the receive queue of every executor in the > worker, meaning it will be overwritten as tuples are sent to different > executors. > * The variable is locally scoped, and so is not carried over between > mini-batches. > > This only comes in to effect when the overflow queue grows beyond 1, > which shouldn't happen unless a backpressure signal isn't received by an > upstream worker, but if it does happen then a backpressure signal is going to > be sent for every mini-batch processed. I do not know if this is the > intended behaviour, but the way the code is written seems to indicate that it > isn't. > > I have thought of two redesigns to fix these problems and make the behaviour > align with how one would interpret the code: > > # *Change the lastOverflowCount variable to a map of taskId to overflow > count* - This will retain the behaviour of resending the backpressure update > every mini-batch once over the threshold, if that behaviour is intended. > However, it will increase garbage by creating a new map every time > WorkerState.transferLocalBatch is called by the NettyWorker thread. 
> # *Change the lastOverflowCount variable to a map of taskId to overflow > count* *and move it to the BackPressureTracker class* - This will retain the > counter between batches, and so only resend backpressure status every 1 > received tuples per task. > > My preference is for the second option: if the intended behaviour is to > resend every mini-batch, the code should be rewritten so that the intent is explicit. > > It is also possible that doing it the second way could run into concurrency > issues I didn't think of, but as far as I can tell the > topology.worker.receiver.thread.count config option isn't used at all? If > that's the case and there is only one NettyWorker thread per worker then it > should be fine. > > I have implemented both methods and attempted to benchmark them with > [https://github.com/yahoo/storm-perf-test] but as I am running all workers on > one machine I couldn't get it to the point that the relevant code was ever > called. -- This message was sent by Atlassian Jira (v8.3.4#803005)
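The second option described above can be sketched roughly as follows. This is an illustration of the idea only, not the submitted patch: the class name is hypothetical, the resend threshold is a constructor parameter because its real value is not recoverable from this message, and the sketch assumes a single calling thread (the single NettyWorker thread case discussed above).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker, for illustration only; not the Storm BackPressureTracker.
class OverflowTrackerSketch {
    private final int resendThreshold;
    // Kept as a field so the count survives between mini-batches, one entry per task.
    private final Map<Integer, Integer> lastOverflowCount = new HashMap<>();

    OverflowTrackerSketch(int resendThreshold) {
        this.resendThreshold = resendThreshold;
    }

    /**
     * Returns true when the task's overflow queue has grown by at least the threshold
     * since the last time a resend was signalled for that task.
     * Not thread-safe: assumes one receiver thread per worker.
     */
    boolean shouldResend(int taskId, int currentOverflowCount) {
        int last = lastOverflowCount.getOrDefault(taskId, 0);
        if (currentOverflowCount - last >= resendThreshold) {
            lastOverflowCount.put(taskId, currentOverflowCount);
            return true;
        }
        if (currentOverflowCount < last) {
            // The queue drained; lower the baseline so later growth is measured from here.
            lastOverflowCount.put(taskId, currentOverflowCount);
        }
        return false;
    }
}
```

Keeping one counter per task avoids the overwrite problem described in the first bullet of the report, and holding the map in a long-lived object avoids allocating a new map on every call.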
[jira] [Updated] (STORM-3508) The links to download in setting up environmtn page are broken
[ https://issues.apache.org/jira/browse/STORM-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3508: -- Labels: pull-request-available (was: ) > The links to download in setting up environmtn page are broken > -- > > Key: STORM-3508 > URL: https://issues.apache.org/jira/browse/STORM-3508 > Project: Apache Storm > Issue Type: Documentation >Reporter: p shirish reddy >Priority: Trivial > Labels: pull-request-available > > Navigate to > [https://storm.apache.org/releases/2.0.0/Setting-up-development-environment.html]; > clicking on the downloads page link results in a 404 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3509) Improved RAS scheduling
[ https://issues.apache.org/jira/browse/STORM-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3509: -- Labels: pull-request-available (was: ) > Improved RAS scheduling > --- > > Key: STORM-3509 > URL: https://issues.apache.org/jira/browse/STORM-3509 > Project: Apache Storm > Issue Type: New Feature >Reporter: David Andsager >Assignee: David Andsager >Priority: Minor > Labels: pull-request-available > > Don't schedule a topology if its resource requirements exceed the total > available on the cluster. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3507) Need feedback from blacklisting to scheduling
[ https://issues.apache.org/jira/browse/STORM-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3507: -- Labels: pull-request-available (was: ) > Need feedback from blacklisting to scheduling > - > > Key: STORM-3507 > URL: https://issues.apache.org/jira/browse/STORM-3507 > Project: Apache Storm > Issue Type: Improvement >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > It would be really nice if the scheduler could know which nodes would have > been blacklisted but were released because the cluster was full. Then the > scheduler could give good nodes to high priority jobs, and low priority ones > would get the possibly bad nodes. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3504) AsyncLocalizerTest is stubbing file system operations
[ https://issues.apache.org/jira/browse/STORM-3504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3504: -- Labels: pull-request-available (was: ) > AsyncLocalizerTest is stubbing file system operations > - > > Key: STORM-3504 > URL: https://issues.apache.org/jira/browse/STORM-3504 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Stig Rohde Døssing >Assignee: Diogo Monteiro >Priority: Major > Labels: pull-request-available > > AsyncLocalizerTest mocks AdvancedFSOps in order to avoid interacting with the > real file system. This is most likely unnecessary, and could be replaced with > using temporary files/directories. If possible, we should rewrite the tests > to use temporary files, and use the real AdvancedFSOps. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3506) prevent topology from overriding STORM_CGROUP_HIERARCHY_DIR and WORKER_METRICS
[ https://issues.apache.org/jira/browse/STORM-3506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3506: -- Labels: pull-request-available (was: ) > prevent topology from overriding STORM_CGROUP_HIERARCHY_DIR and WORKER_METRICS > -- > > Key: STORM-3506 > URL: https://issues.apache.org/jira/browse/STORM-3506 > Project: Apache Storm > Issue Type: Improvement >Affects Versions: 2.0.1 >Reporter: Aaron Gresch >Assignee: Aaron Gresch >Priority: Minor > Labels: pull-request-available > > We had an issue where users were using older versions of Storm that set these > values differently from what the cluster supported. These parameters don't make > sense for topologies to override. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3503) Create unit tests for blacklistOnBadSlot option
[ https://issues.apache.org/jira/browse/STORM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3503: -- Labels: pull-request-available (was: ) > Create unit tests for blacklistOnBadSlot option > --- > > Key: STORM-3503 > URL: https://issues.apache.org/jira/browse/STORM-3503 > Project: Apache Storm > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Labels: pull-request-available > > follow up on STORM-3492 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3479) HB timeout configurable on a topology level
[ https://issues.apache.org/jira/browse/STORM-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3479: -- Labels: pull-request-available (was: ) > HB timeout configurable on a topology level > --- > > Key: STORM-3479 > URL: https://issues.apache.org/jira/browse/STORM-3479 > Project: Apache Storm > Issue Type: New Feature >Reporter: Ethan Li >Priority: Major > Labels: pull-request-available > > A cluster-level HB timeout may not be suitable for some GC/memory-heavy > bolts. They can easily hit the HB timeout, which causes the worker to be killed. > HB timeout on a topology level should be supported and users should be able > to configure it. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3501) Local Cluster worker restarts
[ https://issues.apache.org/jira/browse/STORM-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3501: -- Labels: pull-request-available (was: ) > Local Cluster worker restarts > - > > Key: STORM-3501 > URL: https://issues.apache.org/jira/browse/STORM-3501 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.0.0, 2.1.0 > Environment: Linux >Reporter: Diogo Monteiro >Priority: Minor > Labels: pull-request-available > > I was trying to launch a topology that I'm developing (in 2.0.0) and noticed > that the worker was getting restarted each ~30 seconds. > I placed a breakpoint in the _kill_ method of _LocalContainer_ > ([https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/LocalContainer.java#L66]) > to try and understand why the worker was getting restarted. > > The call stack was: > _kill:66, LocalContainer (org.apache.storm.daemon.supervisor) > killContainerFor:269, Slot (org.apache.storm.daemon.supervisor) > handleRunning:724, Slot (org.apache.storm.daemon.supervisor) > stateMachineStep:218, Slot (org.apache.storm.daemon.supervisor) > run:931, Slot (org.apache.storm.daemon.supervisor) _ > > With this I can understand that the worker is killed because a blob has > changed > ([https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java#L724]). > In fact, there's a changing blob in the _dynamicState_ at that point. > > I checked the _AsyncLocalizer_ which downloads, caches blobs locally, and > notifies the Slot state machine of a changing blob. 
> > I noticed this: > * > [https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L339] > * > [https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L265] > * > [https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L142] > * > [https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L192] > > Which tells me that (correct me if I'm wrong): > * Supervisor tries to update blobs every 30 seconds. > * The topology jar blob requires extraction of the resources directory > (either from a jar or directly in a classpath URL). It does so in > _fetchUnzipToTemp_ and its existence is checked in _isFullyDownloaded_. > * The Slot is notified of a changing blob if: > * the remote version is different from the local version (the code has > changed). > * OR the blob is not fully downloaded (fully downloaded meaning that the jar exists and the extracted > resources directory exists). > > Well, I did not have a resources folder under the root of the classpath, and > that's why the worker was being restarted every ~30 seconds, as the Slot was > being notified of a changing blob every time _updateBlobs_ ran. > I created a resources folder (with dummy files) under the root of the > classpath and the problem is now solved. > > However, if I understand correctly, the resources folder is only required > for _multilang_. Our topologies do not use _multilang_ and this does not happen > in Storm 1.1.3, for instance. > > Happy to submit an MR. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3500) Spelling issue in storm.blobstore.dependency.jar.upload.chuck.size.bytes
[ https://issues.apache.org/jira/browse/STORM-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3500: -- Labels: pull-request-available (was: ) > Spelling issue in storm.blobstore.dependency.jar.upload.chuck.size.bytes > > > Key: STORM-3500 > URL: https://issues.apache.org/jira/browse/STORM-3500 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Major > Labels: pull-request-available > > It should be "chunk". As the property hasn't been in a release yet, we can > safely fix it without worrying about backward compatibility -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3498) Fix missing cases of invoking bash directly without /bin/env
[ https://issues.apache.org/jira/browse/STORM-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3498: -- Labels: portability pull-request-available (was: portability) > Fix missing cases of invoking bash directly without /bin/env > > > Key: STORM-3498 > URL: https://issues.apache.org/jira/browse/STORM-3498 > Project: Apache Storm > Issue Type: Bug >Reporter: Radim Kolar >Priority: Major > Labels: portability, pull-request-available > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3495) TestConstraintSolverStrategy is not stable on travis
[ https://issues.apache.org/jira/browse/STORM-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3495: -- Labels: pull-request-available (was: ) > TestConstraintSolverStrategy is not stable on travis > > > Key: STORM-3495 > URL: https://issues.apache.org/jira/browse/STORM-3495 > Project: Apache Storm > Issue Type: Test >Reporter: David Andsager >Assignee: David Andsager >Priority: Major > Labels: pull-request-available > > This test fails occasionally, hitting a stack overflow with a > parallelismMultiplier of 5. This test requires recursing 3000 times, but > overflow has occurred with 1000. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (STORM-3490) Add checkstyle rule RedundantModifier
[ https://issues.apache.org/jira/browse/STORM-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated STORM-3490: -- Labels: pull-request-available (was: ) > Add checkstyle rule RedundantModifier > - > > Key: STORM-3490 > URL: https://issues.apache.org/jira/browse/STORM-3490 > Project: Apache Storm > Issue Type: Improvement >Reporter: Karl Richter >Assignee: Karl Richter >Priority: Major > Labels: pull-request-available > > The rule is already followed in most cases, and enforcing it increases clarity. -- This message was sent by Atlassian Jira (v8.3.2#803003)