[jira] [Created] (STORM-3292) Trident HiveState must flush writers when the batch commits
Arun Mahadevan created STORM-3292: - Summary: Trident HiveState must flush writers when the batch commits Key: STORM-3292 URL: https://issues.apache.org/jira/browse/STORM-3292 Project: Apache Storm Issue Type: Improvement Reporter: Arun Mahadevan For Trident, the Hive writer is flushed only after it hits the batch size; see https://github.com/apache/storm/blob/master/external/storm-hive/src/main/java/org/apache/storm/hive/trident/HiveState.java#L108 Trident HiveState does not flush during the batch commit, which appears to be an oversight. Without this, the Trident state cannot guarantee at-least-once delivery (e.g. if a transaction is open but Trident moves on to the next txid and later fails, the data in the open transaction is lost). So, for at-least-once delivery, the HiveState must flush all the writers, irrespective of the batch sizes, when Trident invokes "commit(txid)". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
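A minimal sketch of the proposed behavior, using a hypothetical Writer abstraction in place of the actual storm-hive HiveWriter API (all names here are illustrative, not the real classes): commit(txid) flushes every writer regardless of how many records are buffered, so no open transaction outlives the Trident batch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for storm-hive's writer abstraction.
interface Writer {
    void write(String record);
    void flush();
    int buffered();
}

class InMemoryWriter implements Writer {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();
    public void write(String record) { buffer.add(record); }
    public void flush() { committed.addAll(buffer); buffer.clear(); }
    public int buffered() { return buffer.size(); }
    public List<String> committed() { return committed; }
}

// Sketch of the state: flush on commit irrespective of batch size.
class HiveStateSketch {
    private final int batchSize;
    private final List<Writer> writers = new ArrayList<>();
    private int pending = 0;

    HiveStateSketch(int batchSize) { this.batchSize = batchSize; }
    void addWriter(Writer w) { writers.add(w); }

    void updateState(String record) {
        writers.get(0).write(record);
        // Existing behavior: flush only once the batch size is hit.
        if (++pending >= batchSize) { flushAll(); }
    }

    // Proposed: Trident's commit(txid) must flush every writer,
    // even if the batch size has not been reached.
    void commit(long txId) { flushAll(); }

    private void flushAll() {
        for (Writer w : writers) { w.flush(); }
        pending = 0;
    }
}
```

With this shape, data buffered below the batch size is persisted at every batch boundary, which is what the at-least-once argument above requires.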
[jira] [Commented] (STORM-3123) Storm Kafka Monitor does not work with Kafka over two-way SSL
[ https://issues.apache.org/jira/browse/STORM-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1664#comment-1664 ] Arun Mahadevan commented on STORM-3123: --- [~kabhwan] raised https://github.com/apache/storm/pull/2909 for the 1.x branch > Storm Kafka Monitor does not work with Kafka over two-way SSL > - > > Key: STORM-3123 > URL: https://issues.apache.org/jira/browse/STORM-3123 > Project: Apache Storm > Issue Type: Bug > Components: storm-kafka-monitor >Affects Versions: 1.2.2 >Reporter: Vipin Rathor >Assignee: Vipin Rathor >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > Storm Kafka Monitor has no option to read/parse the SSL truststore/keystore > properties which are required to connect to Kafka running over two-way SSL. > As a fix, it needs to understand the following additional Kafka properties: > {code:java} > ssl.truststore.location= > ssl.truststore.password= > ssl.keystore.location= > ssl.keystore.password= > ssl.key.password= > {code} > Since the JVM has a fallback mechanism for loading the SSL truststore, Storm Kafka > Monitor would always end up using some truststore and would eventually work > with one-way SSL (which is also the default for a Kafka setup). 
> Since there is no such fallback for the SSL keystore, Storm Kafka Monitor would > start without a keystore and would eventually throw this error (in SSL debug > mode): > {code:java} > Warning: no suitable certificate found - continuing without client > authentication > *** Certificate chain > > *** > {code} > At this point, the Kafka broker would complain about the above like this: > {code:java} > kafka-network-thread-1002-SSL-7, READ: TLSv1.2 Handshake, length = 141 > *** Certificate chain > > *** > kafka-network-thread-1002-SSL-7, fatal error: 42: null cert chain > javax.net.ssl.SSLHandshakeException: null cert chain > {code} > Therefore, in the absence of this fix, the only available workaround is to > stick to one-way SSL in Kafka (i.e. keep ssl.client.auth=none in Kafka). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
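The properties listed in the ticket are standard Kafka client configuration keys. A minimal sketch of assembling a two-way SSL client config (the paths and passwords below are placeholders, not real credentials): the truststore verifies the broker's certificate, and the keystore supplies the client certificate the broker demands when ssl.client.auth=required.

```java
import java.util.Properties;

public class TwoWaySslConfig {
    // Builds the Kafka client properties needed for two-way (mutual) SSL.
    static Properties sslProperties() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // Truststore: verifies the broker's certificate (enough for one-way SSL).
        props.put("ssl.truststore.location", "/etc/security/kafka.client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        // Keystore: presents the client certificate; without it the broker
        // fails the handshake with "null cert chain" when client auth is required.
        props.put("ssl.keystore.location", "/etc/security/kafka.client.keystore.jks");
        props.put("ssl.keystore.password", "changeit");
        props.put("ssl.key.password", "changeit");
        return props;
    }
}
```

A fixed storm-kafka-monitor would need to accept and pass these properties through to its internal Kafka consumer rather than relying on the JVM truststore fallback.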
[jira] [Updated] (STORM-3281) ZK ACL check breaks nimbus HA with LocalFsBlobStore
[ https://issues.apache.org/jira/browse/STORM-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-3281: -- Description: A new check [1] was introduced to verify the ZK ACLs before Nimbus starts up. However, this check can fail when Nimbus HA is enabled with LocalFsBlobStore, causing Nimbus to not start up.
1. Set up a cluster with 3 Nimbus nodes (blobstore replication factor = 2). It might be reproducible with two Nimbus nodes, but I haven't checked.
2. Deploy a topology.
3. Once the topology is activated and running, kill the leader Nimbus.
4. That Nimbus goes offline and one of the other two is elected as leader.
5. Now try to bring up the killed Nimbus. It will fail with the following exception:
{noformat} 2018-11-06 10:22:05.195 o.a.s.t.t.TSaslTransport main [DEBUG] CLIENT: reading data length: 37 2018-11-06 10:22:05.204 o.a.s.b.BlobStoreUtils main [DEBUG] Updating state inside zookeeper for an update 2018-11-06 10:22:05.209 o.a.s.u.StormBoundedExponentialBackoffRetry main [DEBUG] The baseSleepTimeMs [2000] the maxSleepTimeMs [6] the maxRetries [5] 2018-11-06 10:22:05.224 o.a.s.m.n.Login main [INFO] successfully logged in. 
2018-11-06 10:22:05.224 o.a.s.s.a.k.KerberosSaslTransportPlugin main [DEBUG] SASL GSSAPI client transport is being established 2018-11-06 10:22:05.238 o.a.s.s.a.k.KerberosSaslTransportPlugin main [DEBUG] do as:storm_componentsrandyptkmqbymmt...@example.com 2018-11-06 10:22:05.240 o.a.s.t.t.TSaslTransport main [DEBUG] opening transport org.apache.storm.thrift.transport.TSaslClientTransport@75dc1c1c 2018-11-06 10:22:05.241 o.a.s.s.a.k.KerberosSaslTransportPlugin main [ERROR] Client failed to open SaslClientTransport to interact with a server during session initiation: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.thrift.transport.TSaslTransport.open(TSaslTransport.java:266) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin$1.run(KerberosSaslTransportPlugin.java:145) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin$1.run(KerberosSaslTransportPlugin.java:141) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_151] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_151] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin.connect(KerberosSaslTransportPlugin.java:140) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) 
~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.ThriftClient.(ThriftClient.java:73) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.utils.NimbusClient.(NimbusClient.java:131) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:233) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:269) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:344) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:274) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStore.readBlobTo(BlobStore.java:271) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStore.readBlob(BlobStore.java:300) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.zookeeper.AclEnforcement.verifyAcls(AclEnforcement.java:111) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.daemon.nimbus$_launch.invoke(nimbus.clj:2588) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.daemon.nimbus$_main.invoke(nimbus.clj:2612)
[jira] [Created] (STORM-3281) ZK ACL check breaks nimbus HA with LocalFsBlobStore
Arun Mahadevan created STORM-3281: - Summary: ZK ACL check breaks nimbus HA with LocalFsBlobStore Key: STORM-3281 URL: https://issues.apache.org/jira/browse/STORM-3281 Project: Apache Storm Issue Type: Improvement Affects Versions: 1.2.2 Reporter: Arun Mahadevan A new check [1] was introduced to verify the ZK ACLs before Nimbus starts up. However, this check can fail when Nimbus HA is enabled with LocalFsBlobStore.
1. Set up a cluster with 3 Nimbus nodes (blobstore replication factor = 2). It might be reproducible with two Nimbus nodes, but I haven't checked.
2. Deploy a topology.
3. Once the topology is activated and running, kill the leader Nimbus.
4. That Nimbus goes offline and one of the other two is elected as leader.
5. Now try to bring up the killed Nimbus. It will fail with the following exception:
{noformat} 2018-11-06 10:22:05.195 o.a.s.t.t.TSaslTransport main [DEBUG] CLIENT: reading data length: 37 2018-11-06 10:22:05.204 o.a.s.b.BlobStoreUtils main [DEBUG] Updating state inside zookeeper for an update 2018-11-06 10:22:05.209 o.a.s.u.StormBoundedExponentialBackoffRetry main [DEBUG] The baseSleepTimeMs [2000] the maxSleepTimeMs [6] the maxRetries [5] 2018-11-06 10:22:05.224 o.a.s.m.n.Login main [INFO] successfully logged in. 
2018-11-06 10:22:05.224 o.a.s.s.a.k.KerberosSaslTransportPlugin main [DEBUG] SASL GSSAPI client transport is being established 2018-11-06 10:22:05.238 o.a.s.s.a.k.KerberosSaslTransportPlugin main [DEBUG] do as:storm_componentsrandyptkmqbymmt...@example.com 2018-11-06 10:22:05.240 o.a.s.t.t.TSaslTransport main [DEBUG] opening transport org.apache.storm.thrift.transport.TSaslClientTransport@75dc1c1c 2018-11-06 10:22:05.241 o.a.s.s.a.k.KerberosSaslTransportPlugin main [ERROR] Client failed to open SaslClientTransport to interact with a server during session initiation: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.thrift.transport.TSaslTransport.open(TSaslTransport.java:266) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin$1.run(KerberosSaslTransportPlugin.java:145) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin$1.run(KerberosSaslTransportPlugin.java:141) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_151] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_151] at org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin.connect(KerberosSaslTransportPlugin.java:140) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) 
~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.security.auth.ThriftClient.(ThriftClient.java:73) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.utils.NimbusClient.(NimbusClient.java:131) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:233) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:269) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:344) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:274) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStore.readBlobTo(BlobStore.java:271) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.blobstore.BlobStore.readBlob(BlobStore.java:300) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.zookeeper.AclEnforcement.verifyAcls(AclEnforcement.java:111) ~[storm-core-1.2.1.3.3.0.0-154.jar:1.2.1.3.3.0.0-154] at org.apache.storm.daemon.nimbus$_launch.invoke(nimbus.clj:2588)
[jira] [Created] (STORM-3252) Blobstore sync bug fix
Arun Mahadevan created STORM-3252: - Summary: Blobstore sync bug fix Key: STORM-3252 URL: https://issues.apache.org/jira/browse/STORM-3252 Project: Apache Storm Issue Type: Improvement Affects Versions: 1.2.2 Reporter: Arun Mahadevan Fix For: 2.0.0, 1.x -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-3222) Fix KafkaSpout internals to use LinkedList instead of ArrayList
Arun Mahadevan created STORM-3222: - Summary: Fix KafkaSpout internals to use LinkedList instead of ArrayList Key: STORM-3222 URL: https://issues.apache.org/jira/browse/STORM-3222 Project: Apache Storm Issue Type: Improvement Reporter: Arun Mahadevan Assignee: Arun Mahadevan KafkaSpout internally maintains a waitingToEmit list per topic partition and removes the first item to emit on each nextTuple call. The implementation uses an ArrayList, which results in an unnecessary traversal and copy for each tuple. Also, I am not sure why nextTuple emits only a single tuple, whereas ideally it should emit whatever it can in a single nextTuple call, which is more efficient. However, the logic appears too complicated to refactor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
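The cost difference is inherent to the data structures rather than to the KafkaSpout code itself: ArrayList.remove(0) shifts every remaining element left, so draining n items from the front is O(n^2), while LinkedList unlinks the head in constant time, making the same drain O(n). A small standalone sketch of the front-drain pattern (the variable name waitingToEmit mirrors the ticket; the rest is illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class RemoveFirstDemo {
    // Drains a pre-filled ArrayList from the front: each remove(0)
    // shifts the remaining elements, so draining n items is O(n^2) overall.
    static int drainArrayList(int n) {
        List<Integer> waitingToEmit = new ArrayList<>();
        for (int i = 0; i < n; i++) waitingToEmit.add(i);
        int emitted = 0;
        while (!waitingToEmit.isEmpty()) {
            waitingToEmit.remove(0); // O(n) per call
            emitted++;
        }
        return emitted;
    }

    // The same drain with a LinkedList: pollFirst() unlinks the head
    // in O(1), so draining n items is O(n) overall.
    static int drainLinkedList(int n) {
        LinkedList<Integer> waitingToEmit = new LinkedList<>();
        for (int i = 0; i < n; i++) waitingToEmit.add(i);
        int emitted = 0;
        while (!waitingToEmit.isEmpty()) {
            waitingToEmit.pollFirst(); // O(1) per call
            emitted++;
        }
        return emitted;
    }
}
```

Both drains emit the same elements in the same order; only the per-call cost differs.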
[jira] [Updated] (STORM-3110) Supervisor does not kill all worker processes in secure mode in case of user mismatch
[ https://issues.apache.org/jira/browse/STORM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-3110: -- Description: While running in secure mode, supervisor sets the worker user (in workers local state) as the user that launched the topology. {code:java} SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo {code} However the worker OS process does not actually run as the user "foo" (instead runs as storm user) unless {{supervisor.run.worker.as.user}} is also set. If the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user". Here if the worker is running as a different user (say storm) the supervisor wrongly assumes that the worker process is dead. Later when supervisor tries to launch a worker at that same port, it throws a bind exception o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] was: While running in secure mode, supervisor sets the worker user (in workers local state) as the user that launched the topology. {code:java} SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo {code} However the OS process does not actually run as the user (e.g foo) unless "supervisor.run.worker.as.user" is also set. if the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user" name. Here if the worker is running as a different user (say storm) the supervisor wrongly assumes that the worker process is dead. 
Later when supervisor tries to launch a worker at that same port, it throws a bind exception o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] > Supervisor does not kill all worker processes in secure mode in case of user > mismatch > - > > Key: STORM-3110 > URL: https://issues.apache.org/jira/browse/STORM-3110 > Project: Apache Storm > Issue Type: Improvement >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > While running in secure mode, supervisor sets the worker user (in workers > local state) as the user that launched the topology. > > {code:java} > SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo > {code} > > However the worker OS process does not actually run as the user "foo" > (instead runs as storm user) unless {{supervisor.run.worker.as.user}} is also > set. > If the supervisor's assignment changes, the supervisor in some cases checks > if all processes are dead by matching the "pid+user". Here if the worker is > running as a different user (say storm) the supervisor wrongly assumes that > the worker process is dead. 
> Later when supervisor tries to launch a worker at that same port, it throws a > bind exception > o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, > buffer_size: 5242880, maxWorkers: 1 > o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker > org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to > bind to: 0.0.0.0/0.0.0.0:6700 > at > org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) > ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3110) Supervisor does not kill all worker processes in secure mode in case of user mismatch
[ https://issues.apache.org/jira/browse/STORM-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-3110: -- Description: While running in secure mode, supervisor sets the worker user (in workers local state) as the user that launched the topology. {code:java} SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo {code} However the OS process does not actually run as the user (e.g foo) unless "supervisor.run.worker.as.user" is also set. if the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user" name. Here if the worker is running as a different user (say storm) the supervisor wrongly assumes that the worker process is dead. Later when supervisor tries to launch a worker at that same port, it throws a bind exception o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] was: While running in secure mode, supervisor sets the worker user (in workers local state) as the user that launched the topology. {code:java} SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo {code} However the OS process does not actually run as the user (e.g hrt_qa) unless "supervisor.run.worker.as.user" is also set. if the supervisor's assignment changes, the supervisor in some cases checks if all processes are dead by matching the "pid+user" name. Here if the worker is running as a different user (say storm) the supervisor wrongly assumes that the worker process is dead. 
Later when supervisor tries to launch a worker at that same port, it throws a bind exception o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] > Supervisor does not kill all worker processes in secure mode in case of user > mismatch > - > > Key: STORM-3110 > URL: https://issues.apache.org/jira/browse/STORM-3110 > Project: Apache Storm > Issue Type: Improvement >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > While running in secure mode, supervisor sets the worker user (in workers > local state) as the user that launched the topology. > > {code:java} > SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo > {code} > > However the OS process does not actually run as the user (e.g foo) unless > "supervisor.run.worker.as.user" is also set. > > if the supervisor's assignment changes, the supervisor in some cases checks > if all processes are dead by matching the "pid+user" name. Here if the worker > is running as a different user (say storm) the supervisor wrongly assumes > that the worker process is dead. 
> > Later when supervisor tries to launch a worker at that same port, it throws a > bind exception > > o.a.s.m.n.Server main [INFO] Create Netty Server > Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 > o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker > org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to > bind to: 0.0.0.0/0.0.0.0:6700 > at > org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) > ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-3110) Supervisor does not kill all worker processes in secure mode in case of user mismatch
Arun Mahadevan created STORM-3110: - Summary: Supervisor does not kill all worker processes in secure mode in case of user mismatch Key: STORM-3110 URL: https://issues.apache.org/jira/browse/STORM-3110 Project: Apache Storm Issue Type: Improvement Reporter: Arun Mahadevan Assignee: Arun Mahadevan While running in secure mode, the supervisor sets the worker user (in the worker's local state) as the user that launched the topology. {code:java} SET worker-user 4d67a6be-4c80-4622-96af-f94706d58553 foo {code} However, the OS process does not actually run as that user (e.g. hrt_qa) unless "supervisor.run.worker.as.user" is also set. If the supervisor's assignment changes, the supervisor in some cases checks whether all processes are dead by matching the "pid+user" name. Here, if the worker is running as a different user (say storm), the supervisor wrongly assumes that the worker process is dead. Later, when the supervisor tries to launch a worker on that same port, it throws a bind exception: o.a.s.m.n.Server main [INFO] Create Netty Server Netty-server-localhost-6700, buffer_size: 5242880, maxWorkers: 1 o.a.s.d.worker main [ERROR] Error on initialization of server mk-worker org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:6700 at org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) ~[storm-core-1.2.0.3.1.0.0-501.jar:1.2.0.3.1.0.0-501] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
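The failure mode can be modeled with a toy liveness check (all names here are hypothetical, not the actual supervisor code): matching on pid+user reports a worker as dead whenever the user recorded in local state differs from the OS user the process actually runs as, which is exactly the mismatch described above.

```java
import java.util.Map;
import java.util.Set;

public class WorkerLivenessSketch {
    // Toy model of the OS process table: a set of "pid:osUser" pairs.
    static boolean processRunning(Set<String> runningPidUser, long pid, String osUser) {
        return runningPidUser.contains(pid + ":" + osUser);
    }

    // Flawed check: matches each pid against the user recorded in local
    // state (the topology submitter). That user only equals the OS user
    // when supervisor.run.worker.as.user is enabled, so a live worker
    // running as the storm user is reported dead.
    static boolean allDeadByPidAndUser(Set<String> running, Map<Long, String> recordedUsers) {
        for (Map.Entry<Long, String> e : recordedUsers.entrySet()) {
            if (processRunning(running, e.getKey(), e.getValue())) {
                return false;
            }
        }
        return true; // possibly wrong: the worker may be alive under another user
    }
}
```

The downstream symptom follows directly: the supervisor believes the port is free, launches a new worker there, and the still-running old worker causes the bind exception in the log above.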
[jira] [Updated] (STORM-3035) JMS Spout ack method causes failure in some cases
[ https://issues.apache.org/jira/browse/STORM-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-3035: -- Description: JMS Spout ack method assumes that the set "toCommit" is always non-empty but if a fail is invoked (that clears the "toCommit") followed by an ack, it can cause failure. {noformat} 2018-03-09 08:43:03,220 GMT-0500 MCO-432882-L2 [Thread-36-inboundSpout-executor[5 5]] 7.0.0 ERROR logging$eval1$fn__7.invoke Async loop died! java.lang.RuntimeException: java.util.NoSuchElementException at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) ~[storm-core-1.1.0.2.6.3.0- 235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) ~[storm-core- 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:441) ~[storm-core-1.1.0.2.6.3.0- 235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$consume_batch.invoke(disruptor.clj:69) ~[storm-core- 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$fn__6871$fn__6902.invoke(executor.clj:627) ~[storm-core-1.1.0.2.6.3.0- 235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.util$async_loop$fn__555.invoke(util.clj:484) [storm-core- 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111] Caused by: java.util.NoSuchElementException at java.util.TreeMap.key(TreeMap.java:1327) ~[?:1.8.0_111] at java.util.TreeMap.firstKey(TreeMap.java:290) ~ [?:1.8.0_111] at java.util.TreeSet.first(TreeSet.java:394) ~[?:1.8.0_111] at org.apache.storm.jms.spout.JmsSpout.ack(JmsSpout.java:251) ~[classes/:?] 
at org.apache.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:446) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$tuple_action_fn__6862.invoke(executor.clj:535) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$mk_task_receiver$fn__6845.invoke(executor.clj:462) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$clojure_handler$reify__6558.onEvent(disruptor.clj:40) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] ... 7 more {noformat} was: JMS Spout ack method assumes that the set "toCommit" is always non-empty but if a fail is invoked (that clears the "toCommit") followed by an ack, it can cause failure. {noformat} 2018-03-09 08:43:03,220 GMT-0500 MCO-432882-L2 [Thread-36-inboundSpout-executor[5 5]] 7.0.0 ERROR logging$eval1$fn__7.invoke Async loop died! java.lang.RuntimeException: java.util.NoSuchElementException at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:441) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$consume_batch.invoke(disruptor.clj:69) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$fn__6871$fn__6902.invoke(executor.clj:627) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.util$async_loop$fn__555.invoke(util.clj:484) [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?] 
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111] Caused by: java.util.NoSuchElementException at java.util.TreeMap.key(TreeMap.java:1327) ~[?:1.8.0_111] at java.util.TreeMap.firstKey(TreeMap.java:290) ~[?:1.8.0_111] at java.util.TreeSet.first(TreeSet.java:394) ~[?:1.8.0_111] at org.apache.storm.jms.spout.JmsSpout.ack(JmsSpout.java:251) ~[classes/:?] at org.apache.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:446) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$tuple_action_fn__6862.invoke(executor.clj:535) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$mk_task_receiver$fn__6845.invoke(executor.clj:462) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$clojure_handler$reify__6558.onEvent(disruptor.clj:40) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] ... 7 more{noformat} > JMS Spout ack method causes failure in some cases >
[jira] [Created] (STORM-3035) JMS Spout ack method causes failure in some cases
Arun Mahadevan created STORM-3035: - Summary: JMS Spout ack method causes failure in some cases Key: STORM-3035 URL: https://issues.apache.org/jira/browse/STORM-3035 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Fix For: 2.0.0, 1.2.2 JMS Spout ack method assumes that the set "toCommit" is always non-empty but if a fail is invoked (that clears the "toCommit") followed by an ack, it can cause failure. {noformat} 2018-03-09 08:43:03,220 GMT-0500 MCO-432882-L2 [Thread-36-inboundSpout-executor[5 5]] 7.0.0 ERROR logging$eval1$fn__7.invoke Async loop died! java.lang.RuntimeException: java.util.NoSuchElementException at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:441) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$consume_batch.invoke(disruptor.clj:69) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$fn__6871$fn__6902.invoke(executor.clj:627) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.util$async_loop$fn__555.invoke(util.clj:484) [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111] Caused by: java.util.NoSuchElementException at java.util.TreeMap.key(TreeMap.java:1327) ~[?:1.8.0_111] at java.util.TreeMap.firstKey(TreeMap.java:290) ~[?:1.8.0_111] at java.util.TreeSet.first(TreeSet.java:394) ~[?:1.8.0_111] at org.apache.storm.jms.spout.JmsSpout.ack(JmsSpout.java:251) ~[classes/:?] 
at org.apache.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:446) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$fn__6856$tuple_action_fn__6862.invoke(executor.clj:535) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.daemon.executor$mk_task_receiver$fn__6845.invoke(executor.clj:462) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.disruptor$clojure_handler$reify__6558.onEvent(disruptor.clj:40) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] ... 7 more{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
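The crash comes from calling TreeSet.first() on an empty set. A minimal sketch of the defensive pattern using a toy spout (not the actual JmsSpout code; the return convention is illustrative): guard with isEmpty() before taking the first pending id, so an ack arriving right after a fail cleared the set cannot throw NoSuchElementException.

```java
import java.util.TreeSet;

public class JmsAckSketch {
    private final TreeSet<Long> toCommit = new TreeSet<>();

    void emitted(long msgId) { toCommit.add(msgId); }

    // The fail path clears all pending ids, mirroring the JMS spout's
    // recovery behavior described in the ticket.
    void fail(long msgId) { toCommit.clear(); }

    // Guarded ack: the buggy code called toCommit.first() unconditionally,
    // which throws NoSuchElementException when a fail has just emptied the
    // set. Here -1 stands in for "nothing pending".
    long ack(long msgId) {
        toCommit.remove(msgId);
        return toCommit.isEmpty() ? -1L : toCommit.first();
    }
}
```

The interleaving from the stack trace (fail clears toCommit, then an ack for an in-flight tuple arrives) now completes without an exception.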
[jira] [Created] (STORM-2993) Storm HDFS bolt throws ClosedChannelException when Time rotation policy is used
Arun Mahadevan created STORM-2993: - Summary: Storm HDFS bolt throws ClosedChannelException when Time rotation policy is used Key: STORM-2993 URL: https://issues.apache.org/jira/browse/STORM-2993 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan The Storm HDFS connector throws the below error in the worker logs. 2018-03-12 18:14:58.123 o.a.s.h.c.r.MoveFileAction Timer-3 [INFO] Moving file hdfs://ctr-e138-1518143905142-85179-01-04.hwx.site:8020/tmp/foo/my-bolt-3-0-1520878438104.txt to /tmp/dest2/my-bolt-3-0-1520878438104.txt 2018-03-12 18:14:58.123 o.a.s.h.c.r.MoveFileAction Timer-0 [INFO] Moving file hdfs://ctr-e138-1518143905142-85179-01-04.hwx.site:8020/tmp/foo/my-bolt-6-0-1520878438104.txt to /tmp/dest2/my-bolt-6-0-1520878438104.txt 2018-03-12 18:14:58.123 o.a.s.h.c.r.MoveFileAction Timer-1 [INFO] Moving file hdfs://ctr-e138-1518143905142-85179-01-04.hwx.site:8020/tmp/foo/my-bolt-5-0-1520878438104.txt to /tmp/dest2/my-bolt-5-0-1520878438104.txt 2018-03-12 18:14:58.124 o.a.s.h.c.r.MoveFileAction Timer-2 [INFO] Moving file hdfs://ctr-e138-1518143905142-85179-01-04.hwx.site:8020/tmp/foo/my-bolt-4-0-1520878438104.txt to /tmp/dest2/my-bolt-4-0-1520878438104.txt 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Timer-2 [INFO] File rotation took 28 ms. 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Timer-0 [INFO] File rotation took 29 ms. 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Timer-3 [INFO] File rotation took 28 ms. 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Timer-1 [INFO] File rotation took 28 ms. 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Thread-12-my-bolt-executor[6 6] [INFO] Tuple failed to write, forcing a flush of existing data. 2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Thread-8-my-bolt-executor[3 3] [INFO] Tuple failed to write, forcing a flush of existing data. 
2018-03-12 18:14:58.132 o.a.s.h.b.AbstractHdfsBolt Thread-16-my-bolt-executor[5 5] [INFO] Tuple failed to write, forcing a flush of existing data. 2018-03-12 18:14:58.132 o.a.s.d.executor Thread-8-my-bolt-executor[3 3] [ERROR] java.nio.channels.ClosedChannelException: null at org.apache.hadoop.hdfs.ExceptionLastSeen.throwException4Close(ExceptionLastSeen.java:73) ~[stormjar.jar:?] at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:153) ~[stormjar.jar:?] at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:105) ~[stormjar.jar:?] at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57) ~[stormjar.jar:?] at java.io.DataOutputStream.write(DataOutputStream.java:107) ~[?:1.8.0_161] at java.io.FilterOutputStream.write(FilterOutputStream.java:97) ~[?:1.8.0_161] at org.apache.storm.hdfs.common.HDFSWriter.doWrite(HDFSWriter.java:48) ~[stormjar.jar:?] at org.apache.storm.hdfs.common.AbstractHDFSWriter.write(AbstractHDFSWriter.java:40) ~[stormjar.jar:?] at org.apache.storm.hdfs.bolt.AbstractHdfsBolt.execute(AbstractHdfsBolt.java:158) [stormjar.jar:?] 
at org.apache.storm.daemon.executor$fn__10189$tuple_action_fn__10191.invoke(executor.clj:745) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.daemon.executor$mk_task_receiver$fn__10108.invoke(executor.clj:473) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.disruptor$clojure_handler$reify__4115.onEvent(disruptor.clj:41) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:509) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:487) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:74) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.daemon.executor$fn__10189$fn__10202$fn__10257.invoke(executor.clj:868) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at org.apache.storm.util$async_loop$fn__1221.invoke(util.clj:484) [storm-core-1.2.1.3.0.0.0-1013.jar:1.2.1.3.0.0.0-1013] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161] 2018-03-12 18:14:58.133 o.a.s.d.executor Thread-16-my-bolt-executor[5 5] Apparently the Timed rotation policy does not synchronize properly so its possible that the HDFS bolt code can attempt to write to a closed writer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
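The failure above stems from the rotation timer thread closing the HDFS writer while the executor thread is still writing to it. The sketch below is a minimal illustration of serializing writer access with a shared lock so a write after a timed close triggers a reopen instead of a ClosedChannelException. This is not the actual AbstractHdfsBolt code; the class and method names are hypothetical and the HDFS stream is stubbed out.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: guard rotation and writes with one lock so the
// timer-driven close cannot race with execute()-driven writes.
class RotatingWriter {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean closed = false;
    private final StringBuilder sink = new StringBuilder(); // stands in for the HDFS output stream

    void write(String record) {
        lock.lock();
        try {
            if (closed) {
                rotate(); // reopen instead of writing to a closed channel
            }
            sink.append(record);
        } finally {
            lock.unlock();
        }
    }

    void rotate() { // close the old file and open a new one (elided)
        lock.lock();
        try {
            closed = false;
        } finally {
            lock.unlock();
        }
    }

    void timedClose() { // invoked from the rotation timer thread
        lock.lock();
        try {
            closed = true;
        } finally {
            lock.unlock();
        }
    }

    String contents() {
        return sink.toString();
    }
}
```

Because ReentrantLock is reentrant, `write` can safely call `rotate` while already holding the lock.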
[jira] [Updated] (STORM-2985) Add jackson-annotations to dependency management
[ https://issues.apache.org/jira/browse/STORM-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2985: -- Description: We recently upgraded to jackson version 2.9.4. However different versions of jackson-annotation dependencies are inherited via transitive dependencies of other jars. Its best to keep it in sync. was: Storm 1.x branch uses jackson 2.6.3 which has some known vulnerabilities. Upgrade to the latest jackson version 2.9.4 in 1.x and master branch. > Add jackson-annotations to dependency management > > > Key: STORM-2985 > URL: https://issues.apache.org/jira/browse/STORM-2985 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Fix For: 1.2.2 > > > > We recently upgraded to jackson version 2.9.4. However different versions of > jackson-annotation dependencies are inherited via transitive dependencies of > other jars. Its best to keep it in sync. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2985) Add jackson-annotations to dependency management
[ https://issues.apache.org/jira/browse/STORM-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2985: -- Fix Version/s: (was: 2.0.0) > Add jackson-annotations to dependency management > > > Key: STORM-2985 > URL: https://issues.apache.org/jira/browse/STORM-2985 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Fix For: 1.2.2 > > > > We recently upgraded to jackson version 2.9.4. However different versions of > jackson-annotation dependencies are inherited via transitive dependencies of > other jars. Its best to keep it in sync. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-2985) Add jackson-annotations to dependency management
Arun Mahadevan created STORM-2985: - Summary: Add jackson-annotations to dependency management Key: STORM-2985 URL: https://issues.apache.org/jira/browse/STORM-2985 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Fix For: 2.0.0, 1.2.2 Storm 1.x branch uses jackson 2.6.3 which has some known vulnerabilities. Upgrade to the latest jackson version 2.9.4 in 1.x and master branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
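A minimal sketch of the fix described above: pinning jackson-annotations in the parent pom's dependencyManagement so transitive dependencies cannot pull in a mismatched version. The coordinates and version come from the issue; the exact placement in Storm's pom is an assumption.

```xml
<!-- Illustrative only: keep jackson-annotations in sync with the other
     Jackson artifacts via dependencyManagement in the parent pom. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-annotations</artifactId>
      <version>2.9.4</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```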
[jira] [Resolved] (STORM-2967) Upgrade jackson to latest version 2.9.4
[ https://issues.apache.org/jira/browse/STORM-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2967. --- Resolution: Fixed > Upgrade jackson to latest version 2.9.4 > --- > > Key: STORM-2967 > URL: https://issues.apache.org/jira/browse/STORM-2967 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.2.2 > > Time Spent: 50m > Remaining Estimate: 0h > > Storm 1.x branch uses jackson 2.6.3 which has some known vulnerabilities. > > Upgrade to the latest jackson version 2.9.4 in 1.x and master branch. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2967) Upgrade jackson to latest version 2.9.4
[ https://issues.apache.org/jira/browse/STORM-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2967: -- Fix Version/s: 1.2.2 2.0.0 > Upgrade jackson to latest version 2.9.4 > --- > > Key: STORM-2967 > URL: https://issues.apache.org/jira/browse/STORM-2967 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.2.2 > > Time Spent: 40m > Remaining Estimate: 0h > > Storm 1.x branch uses jackson 2.6.3 which has some known vulnerabilities. > > Upgrade to the latest jackson version 2.9.4 in 1.x and master branch. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-2968) Exclude a few unwanted jars from storm-autocreds
Arun Mahadevan created STORM-2968: - Summary: Exclude a few unwanted jars from storm-autocreds Key: STORM-2968 URL: https://issues.apache.org/jira/browse/STORM-2968 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-2967) Upgrade jackson to latest version 2.9.4
Arun Mahadevan created STORM-2967: - Summary: Upgrade jackson to latest version 2.9.4 Key: STORM-2967 URL: https://issues.apache.org/jira/browse/STORM-2967 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Storm 1.x branch uses jackson 2.6.3 which has some known vulnerabilities. Upgrade to the latest jackson version 2.9.4 in 1.x and master branch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (STORM-2951) Storm binaries packages oncrpc jar which is LGPL
[ https://issues.apache.org/jira/browse/STORM-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367812#comment-16367812 ] Arun Mahadevan commented on STORM-2951: --- Hi [~ptgoetz], are the changes already merged to 1.x branch ? Checking since the Jira is marked as resolved. > Storm binaries packages oncrpc jar which is LGPL > - > > Key: STORM-2951 > URL: https://issues.apache.org/jira/browse/STORM-2951 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Arun Mahadevan >Assignee: P. Taylor Goetz >Priority: Major > Fix For: 1.2.1 > > > With the recent storm metrics changes storm packages oncrpc-1.0.7.jar which > is LGPL licence. > > [https://mvnrepository.com/artifact/org.acplt/oncrpc/1.0.7] > > I am not sure if its ok to package libraries with LGPL license in storm > distribution. > > Its coming from metrics-ganglia dependency in storm-core. > [~ptgoetz], can you provide inputs ? If this needs to be excluded, I can > craft a patch and push it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (STORM-2951) Storm binaries packages oncrpc jar which is LGPL
Arun Mahadevan created STORM-2951: - Summary: Storm binaries packages oncrpc jar which is LGPL Key: STORM-2951 URL: https://issues.apache.org/jira/browse/STORM-2951 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan With the recent storm metrics changes, storm packages oncrpc-1.0.7.jar, which is under the LGPL license. [https://mvnrepository.com/artifact/org.acplt/oncrpc/1.0.7] I am not sure if it's ok to package libraries with an LGPL license in the storm distribution. It's coming from the metrics-ganglia dependency in storm-core. [~ptgoetz], can you provide inputs? If this needs to be excluded, I can craft a patch and push it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (STORM-2900) Possible NPE while populating credentials in nimbus.
[ https://issues.apache.org/jira/browse/STORM-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331264#comment-16331264 ] Arun Mahadevan commented on STORM-2900: --- thanks [~satish.duggana] for the patch. Merged to master and 1.x branch. > Possible NPE while populating credentials in nimbus. > > > Key: STORM-2900 > URL: https://issues.apache.org/jira/browse/STORM-2900 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Satish Duggana >Assignee: Satish Duggana >Priority: Critical > Labels: pull-request-available > Fix For: 2.0.0, 1.2.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Nimbus Auto Creds[1.x, 2.0] may get into NPE while populating credentials > when there is no config for the given key. > 1.x - > [https://github.com/apache/storm/blob/1.x-branch/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractAutoCreds.java#L75] > 2.0 - > [https://github.com/apache/storm/blob/master/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractHadoopNimbusPluginAutoCreds.java#L55] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
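The NPE arises when a configured credential key has no corresponding entry in the topology conf. A minimal, hypothetical sketch of the null-safe lookup (not the actual AbstractAutoCreds code; names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: skip config keys that have no value in the topology
// conf instead of dereferencing a null entry (the cause of the NPE).
class CredentialKeys {
    static List<String> resolve(Map<String, Object> topoConf, List<String> configKeys) {
        List<String> resolved = new ArrayList<>();
        for (String key : configKeys) {
            Object value = topoConf.get(key);
            if (value != null) { // guard: absent keys previously caused the NPE
                resolved.add(value.toString());
            }
        }
        return resolved;
    }
}
```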
[jira] [Resolved] (STORM-2900) Possible NPE while populating credentials in nimbus.
[ https://issues.apache.org/jira/browse/STORM-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2900. --- Resolution: Fixed > Possible NPE while populating credentials in nimbus. > > > Key: STORM-2900 > URL: https://issues.apache.org/jira/browse/STORM-2900 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Satish Duggana >Assignee: Satish Duggana >Priority: Critical > Labels: pull-request-available > Fix For: 2.0.0, 1.2.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Nimbus Auto Creds[1.x, 2.0] may get into NPE while populating credentials > when there is no config for the given key. > 1.x - > [https://github.com/apache/storm/blob/1.x-branch/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractAutoCreds.java#L75] > 2.0 - > [https://github.com/apache/storm/blob/master/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractHadoopNimbusPluginAutoCreds.java#L55] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2900) Possible NPE while populating credentials in nimbus.
[ https://issues.apache.org/jira/browse/STORM-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2900: -- Fix Version/s: 1.2.0 2.0.0 > Possible NPE while populating credentials in nimbus. > > > Key: STORM-2900 > URL: https://issues.apache.org/jira/browse/STORM-2900 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Satish Duggana >Assignee: Satish Duggana >Priority: Critical > Labels: pull-request-available > Fix For: 2.0.0, 1.2.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Nimbus Auto Creds[1.x, 2.0] may get into NPE while populating credentials > when there is no config for the given key. > 1.x - > [https://github.com/apache/storm/blob/1.x-branch/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractAutoCreds.java#L75] > 2.0 - > [https://github.com/apache/storm/blob/master/external/storm-autocreds/src/main/java/org/apache/storm/common/AbstractHadoopNimbusPluginAutoCreds.java#L55] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2881) Storm-druid topologies fail with NoSuchMethodError
[ https://issues.apache.org/jira/browse/STORM-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2881. --- Resolution: Fixed > Storm-druid topologies fail with NoSuchMethodError > -- > > Key: STORM-2881 > URL: https://issues.apache.org/jira/browse/STORM-2881 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Labels: pull-request-available > Fix For: 1.2.0 > > Time Spent: 3h > Remaining Estimate: 0h > > Deploy a sample topology with storm-druid (E.g. the SampleDruidBoltTopology > available in the storm git repo). The worker crashes with below error: > {noformat} > 2017-12-28 15:47:36.382 o.a.s.util > Thread-11-113-Violation-Events-Cube-executor9 9 ERROR Async loop died! > java.lang.NoSuchMethodError: > org.apache.curator.framework.api.CreateBuilder.creatingParentsIfNeeded()Lorg/apache/curator/framework/api/ProtectACLCreateModePathAndBytesable; > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:125) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:122) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at com.metamx.common.scala.Predef$EffectOps.withEffect(Predef.scala:44) > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > {noformat} > storm-druid has dependency on curator 2.6.0, but the storm parent pom has > defined 4.0.0 version in the dependency Management. > Due to this storm-druid is inheriting the 4.0.0 version of curator and > including that version in the jar. > If we explicitly mention the curator dependency in the storm-druid pom.xml, > this can be addressed. 
[jira] [Commented] (STORM-2881) Storm-druid topologies fail with NoSuchMethodError
[ https://issues.apache.org/jira/browse/STORM-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313339#comment-16313339 ] Arun Mahadevan commented on STORM-2881: --- [~kabhwan], clone for 2.0 - https://issues.apache.org/jira/browse/STORM-2884 so that we can resolve this bug. > Storm-druid topologies fail with NoSuchMethodError > -- > > Key: STORM-2881 > URL: https://issues.apache.org/jira/browse/STORM-2881 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Labels: pull-request-available > Fix For: 1.2.0 > > Time Spent: 3h > Remaining Estimate: 0h > > Deploy a sample topology with storm-druid (E.g. the SampleDruidBoltTopology > available in the storm git repo). The worker crashes with below error: > {noformat} > 2017-12-28 15:47:36.382 o.a.s.util > Thread-11-113-Violation-Events-Cube-executor9 9 ERROR Async loop died! > java.lang.NoSuchMethodError: > org.apache.curator.framework.api.CreateBuilder.creatingParentsIfNeeded()Lorg/apache/curator/framework/api/ProtectACLCreateModePathAndBytesable; > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:125) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:122) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at com.metamx.common.scala.Predef$EffectOps.withEffect(Predef.scala:44) > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > {noformat} > storm-druid has dependency on curator 2.6.0, but the storm parent pom has > defined 4.0.0 version in the dependency Management. > Due to this storm-druid is inheriting the 4.0.0 version of curator and > including that version in the jar. 
> If we explicitly mention the curator dependency in the storm-druid pom.xml, > this can be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (STORM-2884) Storm-druid topologies fail with NoSuchMethodError (Storm 2.0)
[ https://issues.apache.org/jira/browse/STORM-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2884: -- Fix Version/s: (was: 1.2.0) 2.0.0 > Storm-druid topologies fail with NoSuchMethodError (Storm 2.0) > -- > > Key: STORM-2884 > URL: https://issues.apache.org/jira/browse/STORM-2884 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Labels: pull-request-available > Fix For: 2.0.0 > > > Deploy a sample topology with storm-druid (E.g. the SampleDruidBoltTopology > available in the storm git repo). The worker crashes with below error: > {noformat} > 2017-12-28 15:47:36.382 o.a.s.util > Thread-11-113-Violation-Events-Cube-executor9 9 ERROR Async loop died! > java.lang.NoSuchMethodError: > org.apache.curator.framework.api.CreateBuilder.creatingParentsIfNeeded()Lorg/apache/curator/framework/api/ProtectACLCreateModePathAndBytesable; > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:125) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:122) > > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at com.metamx.common.scala.Predef$EffectOps.withEffect(Predef.scala:44) > ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 > at > {noformat} > storm-druid has dependency on curator 2.6.0, but the storm parent pom has > defined 4.0.0 version in the dependency Management. > Due to this storm-druid is inheriting the 4.0.0 version of curator and > including that version in the jar. > If we explicitly mention the curator dependency in the storm-druid pom.xml, > this can be addressed. 
[jira] [Created] (STORM-2884) Storm-druid topologies fail with NoSuchMethodError (Storm 2.0)
Arun Mahadevan created STORM-2884: - Summary: Storm-druid topologies fail with NoSuchMethodError (Storm 2.0) Key: STORM-2884 URL: https://issues.apache.org/jira/browse/STORM-2884 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Fix For: 1.2.0 Deploy a sample topology with storm-druid (E.g. the SampleDruidBoltTopology available in the storm git repo). The worker crashes with below error: {noformat} 2017-12-28 15:47:36.382 o.a.s.util Thread-11-113-Violation-Events-Cube-executor9 9 ERROR Async loop died! java.lang.NoSuchMethodError: org.apache.curator.framework.api.CreateBuilder.creatingParentsIfNeeded()Lorg/apache/curator/framework/api/ProtectACLCreateModePathAndBytesable; at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:125) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:122) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at com.metamx.common.scala.Predef$EffectOps.withEffect(Predef.scala:44) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at {noformat} storm-druid has dependency on curator 2.6.0, but the storm parent pom has defined 4.0.0 version in the dependency Management. Due to this storm-druid is inheriting the 4.0.0 version of curator and including that version in the jar. If we explicitly mention the curator dependency in the storm-druid pom.xml, this can be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (STORM-2881) Storm-druid topologies fail with NoSuchMethodError
Arun Mahadevan created STORM-2881: - Summary: Storm-druid topologies fail with NoSuchMethodError Key: STORM-2881 URL: https://issues.apache.org/jira/browse/STORM-2881 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Deploy a sample topology with storm-druid (E.g. the SampleDruidBoltTopology available in the storm git repo). The worker crashes with below error: {noformat} 2017-12-28 15:47:36.382 o.a.s.util Thread-11-113-Violation-Events-Cube-executor9 9 ERROR Async loop died! java.lang.NoSuchMethodError: org.apache.curator.framework.api.CreateBuilder.creatingParentsIfNeeded()Lorg/apache/curator/framework/api/ProtectACLCreateModePathAndBytesable; at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:125) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at com.metamx.tranquility.beam.ClusteredBeam$$anonfun$com$metamx$tranquility$beam$ClusteredBeam$$zpathWithDefault$1.apply(ClusteredBeam.scala:122) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at com.metamx.common.scala.Predef$EffectOps.withEffect(Predef.scala:44) ~dep-org.apache.storm-storm-druid-jar-1.2.0.3.1.0.0-420.jar.1514384427000:1.2.0.3.1.0.0-420 at {noformat} storm-druid has dependency on curator 2.6.0, but the storm parent pom has defined 4.0.0 version in the dependency Management. Due to this storm-druid is inheriting the 4.0.0 version of curator and including that version in the jar. If we explicitly mention the curator dependency in the storm-druid pom.xml, this can be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
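A minimal sketch of the fix described above: declaring curator explicitly in storm-druid's pom so the module keeps the 2.6.0 API that tranquility was compiled against, instead of inheriting 4.0.0 from the parent's dependencyManagement. The curator-framework artifact is an assumption; the actual fix may need to pin additional curator modules.

```xml
<!-- Illustrative only: an explicit version in the module pom overrides the
     version inherited from the parent's dependencyManagement. -->
<dependency>
  <groupId>org.apache.curator</groupId>
  <artifactId>curator-framework</artifactId>
  <version>2.6.0</version>
</dependency>
```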
[jira] [Commented] (STORM-2761) JoinBolt.java 's paradigm is new model of stream join?
[ https://issues.apache.org/jira/browse/STORM-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181944#comment-16181944 ] Arun Mahadevan commented on STORM-2761: --- As per my understanding, the tuples are buffered in both streams and joined only once, when the window triggers. E.g. with a 1 min window, all tuples that arrived in the last 1 min in "stream1" are joined with all the tuples that arrived in the last 1 min in "stream2" when the 1 min completes. If it does not work that way there might be a bug. cc [~roshan_naik] > JoinBolt.java 's paradigm is new model of stream join? > -- > > Key: STORM-2761 > URL: https://issues.apache.org/jira/browse/STORM-2761 > Project: Apache Storm > Issue Type: Question > Components: storm-client >Reporter: Fei Pan >Priority: Critical > > Hi, I am a researcher from University of Toronto and I am studying > acceleration on stream processing platform. I have a question about the model > of window-based stream join used in the JoinBolt.java. From my understanding, > when a new tuple arrived, we join this new tuple with all the tuples in the > window of the opposite stream. However, in the JoinBolt.java, not only the > new tuple, but the tuples in the entire local window will join with the > window of the opposite stream. This actually produces a lot of duplicated > results, since most of the old tuples in the local window have joined before. > I don't know if this is a new paradigm or the storm's team misunderstood the > model of stream join. Can someone help me to clarify this question? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
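The semantics described in the comment can be sketched as follows: both sides buffer tuples, and the join is computed exactly once per window trigger, after which the buffers are cleared, so no pair is emitted twice. This is an idealized, hypothetical model of the behavior, not JoinBolt's actual implementation.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hedged sketch: join-on-window-trigger, emitting each cross-window pair once.
class WindowedJoin<A, B> {
    private final List<A> left = new ArrayList<>();
    private final List<B> right = new ArrayList<>();

    void onLeft(A a)  { left.add(a); }
    void onRight(B b) { right.add(b); }

    // Called once when the window triggers; joins everything buffered so far
    // and clears the buffers so nothing is re-joined in the next window.
    List<Map.Entry<A, B>> onTrigger() {
        List<Map.Entry<A, B>> out = new ArrayList<>();
        for (A a : left) {
            for (B b : right) {
                out.add(new SimpleEntry<>(a, b));
            }
        }
        left.clear();
        right.clear();
        return out;
    }
}
```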
[jira] [Commented] (STORM-2154) Prototype beam runner using unified streams api
[ https://issues.apache.org/jira/browse/STORM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167429#comment-16167429 ] Arun Mahadevan commented on STORM-2154: --- You can make your changes in the beam-runner branch: https://github.com/apache/storm/tree/beam-runner Rebase this branch on master first and then make your changes. > Prototype beam runner using unified streams api > --- > > Key: STORM-2154 > URL: https://issues.apache.org/jira/browse/STORM-2154 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > > This is mostly to identify any gaps and validate the proposed apis. It will > be just a prototype runner using the apis proposed in STORM-1961. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (STORM-2154) Prototype beam runner using unified streams api
[ https://issues.apache.org/jira/browse/STORM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152100#comment-16152100 ] Arun Mahadevan commented on STORM-2154: --- It will be good to document what we plan to do in the prototype, what the gaps are, how much of the Beam SDK we can use to address some of these, and so on. >As a first step, are you ok if I update the beam prototype you mentioned with >latest beam api against storm master branch. Do you mean upgrading to the latest Beam api in the beam-runner branch and getting it working with the core spouts/bolts? Right now I think the implementation is just the bare bones and may not even work end-to-end. Yes, it will be a good first exercise to get it working end-to-end with the latest Beam APIs, and then we can look at rewriting it with Storm's stream APIs. > Prototype beam runner using unified streams api > --- > > Key: STORM-2154 > URL: https://issues.apache.org/jira/browse/STORM-2154 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > > This is mostly to identify any gaps and validate the proposed apis. It will > be just a prototype runner using the apis proposed in STORM-1961. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (STORM-2154) Prototype beam runner using unified streams api
[ https://issues.apache.org/jira/browse/STORM-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145487#comment-16145487 ] Arun Mahadevan commented on STORM-2154: --- [~bkamal] The idea is to prototype a beam runner using the underlying streams API. There was some initial investigation around this, but it's not a trivial task. I will try to revisit and put up a plan and a skeleton in the next few weeks or so, and we can build upon that. Meanwhile you can take a look at the beam prototype that was done using the storm core apis here - https://github.com/apache/storm/tree/beam-runner The beam apis themselves might have changed since, but you can use it as a reference. You should also take a look at the beam runner APIs and the other runners, identify the gaps, and come up with a high-level plan. > Prototype beam runner using unified streams api > --- > > Key: STORM-2154 > URL: https://issues.apache.org/jira/browse/STORM-2154 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > > This is mostly to identify any gaps and validate the proposed apis. It will > be just a prototype runner using the apis proposed in STORM-1961. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (STORM-2258) Streams api - support CoGroupByKey
[ https://issues.apache.org/jira/browse/STORM-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2258. --- Resolution: Fixed Assignee: Kamal Fix Version/s: 2.0.0 Thanks for the patch, merged to master. > Streams api - support CoGroupByKey > -- > > Key: STORM-2258 > URL: https://issues.apache.org/jira/browse/STORM-2258 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Kamal > Fix For: 2.0.0 > > > Group together values with same key from both streams. Similar constructs are > supported in beam, spark and flink. > When called on a Stream of (K, V) and (K, W) pairs, return a new Stream of > (K, Seq[V], Seq[W]) tuples > See also - https://cloud.google.com/dataflow/model/group-by-key -- This message was sent by Atlassian JIRA (v6.4.14#64029)
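The CoGroupByKey semantics described in the issue — turning streams of (K, V) and (K, W) pairs into (K, Seq[V], Seq[W]) — can be sketched over in-memory lists as below. This is a standalone illustration of the transform's contract, not Storm's Streams API implementation; the class and method names are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: group values with the same key from both streams into
// (key, leftValues, rightValues).
class CoGroupSketch {
    static <K, V, W> Map<K, Map.Entry<List<V>, List<W>>> coGroupByKey(
            List<Map.Entry<K, V>> s1, List<Map.Entry<K, W>> s2) {
        Map<K, Map.Entry<List<V>, List<W>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : s1) {
            out.computeIfAbsent(e.getKey(),
                    k -> new SimpleEntry<List<V>, List<W>>(new ArrayList<V>(), new ArrayList<W>()))
               .getKey().add(e.getValue());
        }
        for (Map.Entry<K, W> e : s2) {
            out.computeIfAbsent(e.getKey(),
                    k -> new SimpleEntry<List<V>, List<W>>(new ArrayList<V>(), new ArrayList<W>()))
               .getValue().add(e.getValue());
        }
        return out;
    }
}
```

Note that a key appearing in only one stream still produces a result, with an empty sequence on the other side.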
[jira] [Created] (STORM-2692) Load only configs specific to the topology in populateCredentials
Arun Mahadevan created STORM-2692: - Summary: Load only configs specific to the topology in populateCredentials Key: STORM-2692 URL: https://issues.apache.org/jira/browse/STORM-2692 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan There's a single instance of the AutoCredentials plugin in Nimbus, and right now we load all the config keys in "populateCredentials". This can cause issues when multiple topologies are submitted: the second one tries to load the keys of the first topology. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
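A minimal sketch of the direction the summary suggests: since the plugin instance is shared across topologies, it should derive credential keys from the conf passed in on each call rather than from instance state populated by an earlier submission. Everything here is hypothetical, including the config key name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: stateless per call, so one topology's keys can never
// leak into another topology's populateCredentials invocation.
class TopologyScopedCreds {
    private static final String CONF_KEYS = "credential.keys"; // illustrative config name

    @SuppressWarnings("unchecked")
    static List<String> keysFor(Map<String, Object> topoConf) {
        Object keys = topoConf.get(CONF_KEYS);
        return keys instanceof List ? (List<String>) keys : new ArrayList<>();
    }
}
```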
[jira] [Resolved] (STORM-2614) Enhance stateful windowing to persist the window state
[ https://issues.apache.org/jira/browse/STORM-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2614. --- Resolution: Fixed Merged to master > Enhance stateful windowing to persist the window state > -- > > Key: STORM-2614 > URL: https://issues.apache.org/jira/browse/STORM-2614 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Fix For: 2.0.0 > > Time Spent: 19h 40m > Remaining Estimate: 0h > > Right now the tuples in window are stored in memory. This limits the usage to > windows that fit in memory and the source tuples cannot be acked until the > window expiry. By persisting the window transparently in the state backend > and caching/iterating them as need, we could support larger windows and also > windowed bolts with user/application state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (STORM-2614) Enhance stateful windowing to persist the window state
[ https://issues.apache.org/jira/browse/STORM-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2614: -- Component/s: storm-core > Enhance stateful windowing to persist the window state > -- > > Key: STORM-2614 > URL: https://issues.apache.org/jira/browse/STORM-2614 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Fix For: 2.0.0 > > Time Spent: 19h 40m > Remaining Estimate: 0h > > Right now the tuples in window are stored in memory. This limits the usage to > windows that fit in memory and the source tuples cannot be acked until the > window expiry. By persisting the window transparently in the state backend > and caching/iterating them as need, we could support larger windows and also > windowed bolts with user/application state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (STORM-2614) Enhance stateful windowing to persist the window state
[ https://issues.apache.org/jira/browse/STORM-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2614: -- Fix Version/s: 2.0.0 > Enhance stateful windowing to persist the window state > -- > > Key: STORM-2614 > URL: https://issues.apache.org/jira/browse/STORM-2614 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Fix For: 2.0.0 > > Time Spent: 19h 40m > Remaining Estimate: 0h > > Right now the tuples in window are stored in memory. This limits the usage to > windows that fit in memory and the source tuples cannot be acked until the > window expiry. By persisting the window transparently in the state backend > and caching/iterating them as needed, we could support larger windows and also > windowed bolts with user/application state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (STORM-2614) Enhance stateful windowing to persist the window state
[ https://issues.apache.org/jira/browse/STORM-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098049#comment-16098049 ] Arun Mahadevan commented on STORM-2614: --- High level design: This builds on top of the existing state checkpointing mechanism (documented here - https://github.com/apache/storm/blob/master/docs/State-checkpointing.md). There's nothing extra added to the underlying checkpointing mechanism itself and it's pretty straightforward. The tuples in a window (think a FIFO queue) are split into multiple partitions so that they are more manageable and can be distributed/sharded via the underlying key-value state (Redis/HBase etc.). The modified partitions are saved during a checkpoint. During iteration the partitions are loaded on demand from the underlying state backend as they are accessed. A subset of the partitions that are most likely to be used again are cached in memory. During checkpoint, the following are saved: 1. Any modified or newly created window partitions. 2. Any state needed to recover the Trigger/Eviction policies. 3. State that's exposed to the user, where the user may have saved some values. Since the KV state does not guarantee any specific ordering of the keys during iteration, a separate structure is maintained to store the ordered partition IDs, which is used during iteration to retrieve the partitions in order. This is also saved during the checkpoint. The above mechanism kicks in only if the user chooses to use windowed state persistence; otherwise the current behavior (keeping the tuples in an in-memory queue) is retained. > Enhance stateful windowing to persist the window state > -- > > Key: STORM-2614 > URL: https://issues.apache.org/jira/browse/STORM-2614 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Time Spent: 1.5h > Remaining Estimate: 0h > > Right now the tuples in window are stored in memory. 
This limits the usage to > windows that fit in memory and the source tuples cannot be acked until the > window expiry. By persisting the window transparently in the state backend > and caching/iterating them as needed, we could support larger windows and also > windowed bolts with user/application state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
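The partitioning scheme in the design comment above can be sketched with a toy in-memory key-value backend. All names below (`WindowPartitionStore`, the `wp-` key prefix, `checkpoint`, `iterate`) are hypothetical illustrations, not Storm's actual classes: the window's FIFO queue is split into fixed-size partitions, only modified ("dirty") partitions are written at checkpoint, and the ordered partition-id list is saved separately so iteration can load partitions on demand, in order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of partitioned window state (hypothetical names, not Storm code).
public class WindowPartitionStore {
    public static final int PARTITION_SIZE = 3;
    public final Map<String, List<String>> backend = new HashMap<>();     // stand-in KV state
    public final List<String> partitionIds = new ArrayList<>();           // ordered partition ids
    public final Map<String, List<String>> dirty = new LinkedHashMap<>(); // modified, unsaved
    private int nextId = 0;

    public void add(String tuple) {
        String id = partitionIds.isEmpty() ? newPartition()
                : partitionIds.get(partitionIds.size() - 1);
        List<String> p = load(id);
        if (p.size() >= PARTITION_SIZE) {   // last partition full: start a new one
            id = newPartition();
            p = load(id);
        }
        p.add(tuple);
        dirty.put(id, p);
    }

    private String newPartition() {
        String id = "wp-" + nextId++;
        partitionIds.add(id);
        return id;
    }

    // A dirty in-memory copy wins over the backend copy.
    public List<String> load(String id) {
        if (dirty.containsKey(id)) return dirty.get(id);
        return new ArrayList<>(backend.getOrDefault(id, new ArrayList<>()));
    }

    // Checkpoint: save only the modified partitions plus the ordered id list.
    public void checkpoint() {
        backend.putAll(dirty);
        backend.put("wp-ids", new ArrayList<>(partitionIds));
        dirty.clear();
    }

    // Iterate tuples in insertion order, loading partitions as accessed.
    public List<String> iterate() {
        List<String> all = new ArrayList<>();
        for (String id : partitionIds) all.addAll(load(id));
        return all;
    }
}
```

A real backend (Redis/HBase) would replace the `backend` map, and a small LRU cache would sit in front of `load` as the comment describes.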
[jira] [Created] (STORM-2614) Enhance stateful windowing to persist the window state
Arun Mahadevan created STORM-2614: - Summary: Enhance stateful windowing to persist the window state Key: STORM-2614 URL: https://issues.apache.org/jira/browse/STORM-2614 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Right now the tuples in window are stored in memory. This limits the usage to windows that fit in memory and the source tuples cannot be acked until the window expiry. By persisting the window transparently in the state backend and caching/iterating them as needed, we could support larger windows and also windowed bolts with user/application state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (STORM-2258) Streams api - support CoGroupByKey
[ https://issues.apache.org/jira/browse/STORM-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064194#comment-16064194 ] Arun Mahadevan commented on STORM-2258: --- We already have a groupByKey implementation. What we are looking for is a coGroupByKey, which is more like a join, but instead of returning the cross product of the matching keys, it groups together the values for the same key from the joined streams. For example, say stream1 has values - (k1, v1), (k2, v2), (k2, v3) and stream2 has values - (k1, x1), (k1, x2), (k3, x3). Then the co-grouped stream would contain - {noformat} (k1, ([v1], [x1, x2])), (k2, ([v2, v3], [])), (k3, ([], [x3])) {noformat} Since you are co-grouping two streams containing key-value pairs, you would define the operation on the PairStream class, with a method signature something like, {noformat} public PairStream<K, Pair<Iterable<V>, Iterable<W>>> coGroupByKey(PairStream<K, W> otherStream) { // ... } {noformat} To get some hints on how to go about implementing this, you can take a look at the implementation of the join operation (see JoinProcessor.java). > Streams api - support CoGroupByKey > -- > > Key: STORM-2258 > URL: https://issues.apache.org/jira/browse/STORM-2258 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan > > Group together values with same key from both streams. Similar constructs are > supported in beam, spark and flink. > When called on a Stream of (K, V) and (K, W) pairs, return a new Stream of > (K, Seq[V], Seq[W]) tuples > See also - https://cloud.google.com/dataflow/model/group-by-key -- This message was sent by Atlassian JIRA (v6.4.14#64029)
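The co-group semantics described in the comment above can be sketched in plain Java over in-memory key-value pairs. The helper below is illustrative only (it is not the Storm PairStream API); it reproduces the stream1/stream2 example from the comment:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch (NOT the Storm PairStream API): co-group collects the
// values for each key from both sides into a pair of lists, instead of
// emitting the cross product the way a join does.
public class CoGroupSketch {

    // Returns key -> [values-from-left, values-from-right].
    public static <K, V, W> Map<K, List<List<Object>>> coGroupByKey(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        Map<K, List<List<Object>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : left) {
            out.computeIfAbsent(e.getKey(),
                    k -> Arrays.asList(new ArrayList<>(), new ArrayList<>()))
               .get(0).add(e.getValue());
        }
        for (Map.Entry<K, W> e : right) {
            out.computeIfAbsent(e.getKey(),
                    k -> Arrays.asList(new ArrayList<>(), new ArrayList<>()))
               .get(1).add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> stream1 = Arrays.asList(
                Map.entry("k1", "v1"), Map.entry("k2", "v2"), Map.entry("k2", "v3"));
        List<Map.Entry<String, String>> stream2 = Arrays.asList(
                Map.entry("k1", "x1"), Map.entry("k1", "x2"), Map.entry("k3", "x3"));
        // prints {k1=[[v1], [x1, x2]], k2=[[v2, v3], []], k3=[[], [x3]]}
        System.out.println(coGroupByKey(stream1, stream2));
    }
}
```

Note how k2 and k3, which appear on only one side, still produce an entry with an empty list for the other side, unlike a join which would drop them.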
[jira] [Resolved] (STORM-2562) Use stronger key size for blow fish key generator and get rid of stack trace
[ https://issues.apache.org/jira/browse/STORM-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2562. --- Resolution: Fixed Thanks [~pshah] , merged to master and 1.x-branch > Use stronger key size for blow fish key generator and get rid of stack trace > > > Key: STORM-2562 > URL: https://issues.apache.org/jira/browse/STORM-2562 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Priyank Shah >Assignee: Priyank Shah > Fix For: 2.0.0, 1.x > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (STORM-2563) Remove the workaround to handle missing UGI.loginUserFromSubject
Arun Mahadevan created STORM-2563: - Summary: Remove the workaround to handle missing UGI.loginUserFromSubject Key: STORM-2563 URL: https://issues.apache.org/jira/browse/STORM-2563 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/security/auth/kerberos/AutoTGT.java#L225 The "userCons.setAccessible(true)" invokes the constructor of a package-private class, bypassing the Java access control checks and raising red flags in our internal security scans. The "loginUserFromSubject(Subject subject)" method has been added to UGI (https://issues.apache.org/jira/browse/HADOOP-10164) and has been available since Hadoop version 2.3, released over three years ago (http://hadoop.apache.org/releases.html). I think the workaround is no longer required since the case will not happen when using hadoop-common versions >= 2.3. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (STORM-2516) WindowedBoltExecutorTest.testExecuteWithLateTupleStream is flaky
[ https://issues.apache.org/jira/browse/STORM-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2516. --- Resolution: Fixed Fix Version/s: 1.1.1 2.0.0 Thanks [~Srdo], merged to master and 1.x-branch > WindowedBoltExecutorTest.testExecuteWithLateTupleStream is flaky > > > Key: STORM-2516 > URL: https://issues.apache.org/jira/browse/STORM-2516 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing > Fix For: 2.0.0, 1.1.1 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > See https://travis-ci.org/apache/storm/jobs/232571820. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2489. --- Resolution: Fixed Fix Version/s: 1.1.1 2.0.0 > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Fix For: 2.0.0, 1.1.1 > > Attachments: TumblingWindowIssue.java > > Time Spent: 3h > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2520) AutoHDFS should prefer cluster-wise hdfs kerberos principal to global hdfs kerberos principal
[ https://issues.apache.org/jira/browse/STORM-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2520. --- Resolution: Fixed Fix Version/s: 1.1.1 2.0.0 Thanks [~kabhwan], merged to master and 1.x branch > AutoHDFS should prefer cluster-wise hdfs kerberos principal to global hdfs > kerberos principal > - > > Key: STORM-2520 > URL: https://issues.apache.org/jira/browse/STORM-2520 > Project: Apache Storm > Issue Type: Bug > Components: storm-autocreds >Affects Versions: 2.0.0, 1.1.1 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim > Fix For: 2.0.0, 1.1.1 > > Time Spent: 50m > Remaining Estimate: 0h > > After STORM-2482, we can set cluster-wise principal and keytab (and > configurations) instead of setting global principal and keytab for HDFS and > HBase. > (Hive will be supported via STORM-2501.) > In AutoHDFS there's a missed spot which always uses global principal, and it > throws some errors when global principal is not set. > It should prefer cluster-wise principal to global principal. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2519) AbstractAutoCreds should look for configKeys in both nimbus and topology configs
[ https://issues.apache.org/jira/browse/STORM-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2519. --- Resolution: Fixed Fix Version/s: 1.1.1 2.0.0 > AbstractAutoCreds should look for configKeys in both nimbus and topology > configs > > > Key: STORM-2519 > URL: https://issues.apache.org/jira/browse/STORM-2519 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Fix For: 2.0.0, 1.1.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (STORM-2519) AbstractAutoCreds should look for configKeys in both nimbus and topology configs
[ https://issues.apache.org/jira/browse/STORM-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan reassigned STORM-2519: - Assignee: Arun Mahadevan > AbstractAutoCreds should look for configKeys in both nimbus and topology > configs > > > Key: STORM-2519 > URL: https://issues.apache.org/jira/browse/STORM-2519 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2519) AbstractAutoCreds should look for configKeys in both nimbus and topology configs
Arun Mahadevan created STORM-2519: - Summary: AbstractAutoCreds should look for configKeys in both nimbus and topology configs Key: STORM-2519 URL: https://issues.apache.org/jira/browse/STORM-2519 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2498) Download Full File link broken in 1.x branch
Arun Mahadevan created STORM-2498: - Summary: Download Full File link broken in 1.x branch Key: STORM-2498 URL: https://issues.apache.org/jira/browse/STORM-2498 Project: Apache Storm Issue Type: Bug Affects Versions: 1.x Reporter: Arun Mahadevan Assignee: Arun Mahadevan The download link points to "download?file=%5BLjava.lang.Object%3B%406d6db3b" instead of something like "download?file=wordcount-1-1493298799%2F6701%2Fworker.log" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992758#comment-15992758 ] Arun Mahadevan edited comment on STORM-2489 at 5/2/17 11:43 AM: [~wangkui], the initial tuples expired because the trigger was not fired exactly after the window interval but after a delay. When I tested in local mode with spout emitting without a delay, the trigger happened after 6s (for a 4s tumbling window). This may be because the system is overwhelmed with data and not able to schedule the trigger thread on time. In this case the initial tuples (0 - 2s) will not be considered in the first window. Typically the window duration should be such that all the tuples within a window can be processed before the next window trigger, otherwise the next window trigger will be delayed and it will lead to incorrect results. You should use a real cluster with multiple hosts/workers and split the data among these workers to handle such high data rates. Another option would be to use event time windows where each event contains a "timestamp" field and the window calculations are done based on the actual event time instead of system time. was (Author: arunmahadevan): [~wangkui], the initial tuples expired because the trigger was not fired exactly after the window interval but after a delay. When I tested in local mode with spout emitting without a delay, the trigger happened after 6s (for a 4s tumbling window). This may be because the system is overwhelmed with data and not able to schedule the trigger thread on time. In this case the initial tuples (0 - 2s) will not be considered in the first window. Typically the window duration should be such that all the tuples within a window can be processed before the next window trigger, otherwise the next window trigger will be delayed and it will lead to incorrect results. 
You should use a real cluster with multiple hosts/workers and split the data among these workers to handle such high data rates. > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992758#comment-15992758 ] Arun Mahadevan commented on STORM-2489: --- [~wangkui], the initial tuples expired because the trigger was not fired exactly after the window interval but after a delay. When I tested in local mode with spout emitting without a delay, the trigger happened after 6s (for a 4s tumbling window). This may be because the system is overwhelmed with data and not able to schedule the trigger thread on time. In this case the initial tuples (0 - 2s) will not be considered in the first window. Typically the window duration should be such that all the tuples within a window can be processed before the next window trigger, otherwise the next window trigger will be delayed and it will lead to incorrect results. You should use a real cluster with multiple hosts/workers and split the data among these workers to handle such high data rates. > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. 
> The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984426#comment-15984426 ] Arun Mahadevan commented on STORM-2489: --- [~wangkui], I made some minor tweak to process the tuples with some time offset, please pull the current changes and test. > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984369#comment-15984369 ] Arun Mahadevan commented on STORM-2489: --- [~wangkui] thanks for the update. I made some minor changes to the patch today. If you could pull the latest changes and run your tests a few times will be great. > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982487#comment-15982487 ] Arun Mahadevan commented on STORM-2489: --- [~wangkui], posted a potential fix - https://github.com/apache/storm/pull/2090 You might want to try your tests with the patch (you can apply the patch or clone https://github.com/arunmahadevan/storm/tree/STORM-2489) > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > Time Spent: 10m > Remaining Estimate: 0h > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan reassigned STORM-2489: - Assignee: Arun Mahadevan > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui >Assignee: Arun Mahadevan > Attachments: TumblingWindowIssue.java > > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (STORM-2489) Overlap and data loss on WindowedBolt based on Duration
[ https://issues.apache.org/jira/browse/STORM-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15980736#comment-15980736 ] Arun Mahadevan commented on STORM-2489: --- [~wangkui] thanks for reporting the issue. Processing-time based windows internally use a ScheduledExecutor. It may be that the tasks are not triggered exactly after the tumbling window interval, causing some overlap. To debug further, can you also print the window end timestamp (https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/windowing/Window.java#L55) each time the execute is invoked? "getEndTimestamp" returns the reference processing time which is used to calculate the window boundaries. You will need to build Storm from the latest master and run your examples against that since "getEndTimestamp" is not available in any of the released versions of Storm. > Overlap and data loss on WindowedBolt based on Duration > --- > > Key: STORM-2489 > URL: https://issues.apache.org/jira/browse/STORM-2489 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.2 > Environment: windows 10, eclipse, jdk1.7 >Reporter: wangkui > Attachments: TumblingWindowIssue.java > > > The attachment is my test script, one of my test results is: > ``` > expired=1...55 > get=56...4024 > new=56...4024 > Recived=3969,RecivedTotal=3969 > expired=56...4020 > get=4021...8191 > new=4025...8191 > Recived=4171,RecivedTotal=8140 > SendTotal=12175 > expired=4021...8188 > get=8189...12175 > new=8192...12175 > Recived=3987,RecivedTotal=12127 > ``` > This test result shows that some tuples appear in the expired list directly, > we lost these data if we just use get() to get tuples, this is the first bug. > The second: the tuples of get() has overlap, the getNew() seems alright. > The problem not happen definitely, may need to try several times. > Actually, I'm newbie about storm, so I'm not sure this is a bug indeed, or, I > use it in wrong way? 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2482) Refactor the Storm auto credential plugins to be more usable
Arun Mahadevan created STORM-2482: - Summary: Refactor the Storm auto credential plugins to be more usable Key: STORM-2482 URL: https://issues.apache.org/jira/browse/STORM-2482 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Currently, the auto credential plugins are part of the respective external modules like storm-hdfs, storm-hbase etc. If users want to use them, they need to place the jars (storm-hdfs, storm-hbase) and their dependencies into ext lib. Currently these plugins do not accept any Hadoop configuration programmatically. These are set by placing config files like hdfs-site.xml in the class path, and this does not scale well nor does it allow users to connect and fetch tokens from different clusters (say two different name nodes) with a single topology. To make the auto cred plugins more usable: 1. Refactor the AutoHdfs, AutoHbase etc. into a separate storm external module (say storm-autocreds). These jars along with their dependencies can be packaged and extracted to a folder like lib-autocreds which can be loaded into the class path when storm runs in secure mode (e.g. by setting STORM_EXT_CLASSPATH). The required plugins would be loaded by nimbus/workers based on the user configuration in storm.yaml. 2. Modify the plugins to accept "configKeys" via topology config. "configKeys" would be a list of string "keys" that the user would pass in the topology config. {noformat} // for hdfs topoConf.set("hdfsCredentialsConfigKeys", Arrays.asList(new String[] {"cluster1Key", "cluster2Key"})); // put respective config map for the config keys, topoConf.set("cluster1Key", configMap1); topoConf.set("cluster2Key", configMap2); {noformat} This way we can support credentials from multiple clusters. 3. During topology submission, nimbus invokes "populateCredentials". 
If "configKeys" are specified, the plugins will log in to Hadoop for each config key, fetch the credentials (delegation tokens) and store them under the respective keys in the storm cluster state. Cluster state already stores the credentials as a Map, so no changes are needed there. The workers will download the credentials and invoke "populateSubject". The plugin would populate the credentials for all the configured "configKeys" into the subject. Similar steps would be performed during "updateSubject". 4. Nimbus periodically invokes "renew" credentials. At this time the plugin will fetch the credentials for the configured "configKeys" (i.e. for the users from the different clusters) and renew the respective credentials. 5. The user could specify a different principal and keytab within each config key map so that the plugin will use the appropriate user for logging into the respective cluster. We also need to enhance the auto creds by adding more plugins, e.g. for HBase and Kafka delegation tokens which are missing currently (this could be separate JIRAs). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
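The "configKeys" lookup from step 2 can be sketched in plain Java. The entry name and map shapes below follow the {noformat} snippet in the issue description, but the helper itself (`ConfigKeysSketch.clusterConfigs`) is illustrative only and not the actual plugin API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the "configKeys" resolution described above: the topology conf
// holds a list of config keys under a well-known entry, and each key in turn
// maps to a per-cluster configuration map. Names/types here are illustrative.
public class ConfigKeysSketch {

    @SuppressWarnings("unchecked")
    public static List<Map<String, String>> clusterConfigs(
            Map<String, Object> topoConf, String configKeysEntry) {
        List<Map<String, String>> configs = new ArrayList<>();
        Object keys = topoConf.get(configKeysEntry);
        if (keys instanceof List) {
            for (Object key : (List<?>) keys) {
                Object conf = topoConf.get((String) key);  // per-cluster config map
                if (conf instanceof Map) {
                    configs.add((Map<String, String>) conf);
                }
            }
        }
        return configs;
    }
}
```

A plugin along these lines would then log in once per returned config map (each carrying its own principal/keytab, as in step 5) and fetch that cluster's delegation tokens.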
[jira] [Created] (STORM-2427) Event logger enable/disable UI is not working as expected in master branch
Arun Mahadevan created STORM-2427: - Summary: Event logger enable/disable UI is not working as expected in master branch Key: STORM-2427 URL: https://issues.apache.org/jira/browse/STORM-2427 Project: Apache Storm Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun Mahadevan Assignee: Arun Mahadevan Need to pull missing commits from 1.x branch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2425) Storm Hive Bolt not closing open transactions
Arun Mahadevan created STORM-2425: - Summary: Storm Hive Bolt not closing open transactions Key: STORM-2425 URL: https://issues.apache.org/jira/browse/STORM-2425 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Hive bolt closes a connection only if the parameter "max_connections" is exceeded or the bolt dies. So if we open a connection to Hive via Hive bolt and some time later stop producing messages to the bolt, the connection is kept alive and the corresponding transactions remain open. This can be a problem if we launch two topologies: one of them maintains open transactions while doing nothing, and the other keeps writing messages to Hive. At some point Hive will launch compactions to collapse the small delta files generated by Hive Bolt into one base file. But compaction won't launch while there are open transactions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
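One way to sketch the kind of fix this issue implies is an idle-writer reaper: track when each connection last wrote a batch and, on a timer, close the ones idle past a timeout so they do not hold transactions open and block compaction. Everything below (`IdleReaper`, `touch`, `reap`) is a hypothetical illustration, not the actual HiveBolt code:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: close Hive connections that have been idle longer than
// a timeout, so their transactions are not left open indefinitely.
public class IdleReaper {
    public final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    public final List<String> closed = new CopyOnWriteArrayList<>();
    private final long idleTimeoutMs;

    public IdleReaper(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    // Called whenever a connection writes a batch.
    public void touch(String connection) {
        lastUsed.put(connection, System.currentTimeMillis());
    }

    // Called periodically (e.g. from a tick-tuple or scheduled task):
    // close (here, just record) connections idle past the timeout.
    public void reap() {
        long now = System.currentTimeMillis();
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            if (now - e.getValue() > idleTimeoutMs) {
                closed.add(e.getKey());   // real code would flush + close the writer
                lastUsed.remove(e.getKey());
            }
        }
    }
}
```

In a real bolt the reaper would run from the bolt's timer thread and the "close" step would commit or abort the writer's open transaction before releasing the connection.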
[jira] [Updated] (STORM-2334) Bolt for Joining streams
[ https://issues.apache.org/jira/browse/STORM-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-2334: -- Fix Version/s: 1.1.1 2.0.0 > Bolt for Joining streams > > > Key: STORM-2334 > URL: https://issues.apache.org/jira/browse/STORM-2334 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: 2.0.0, 1.1.1 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Create a general purpose windowed bolt that performs Joins on multiple data > streams. > Since, depending on the topo config, the bolt could be receiving data either > on 'default' streams or on named streams, the join bolt should be able to > differentiate the incoming data based on names of upstream components as well > as stream names. > *Example:* > The following SQL style join involving 4 tables : > {code} > select userId, key4, key2, key3 > from stream1 > join stream2 on stream2.userId = stream1.key1 > join stream3 on stream3.key3 = stream2.userId > left join stream4 on stream4.key4 = stream3.key3 > {code} > Could be expressed using the Join Bolt over 4 named streams as : > {code} > new JoinBolt(STREAM, "stream1", "key1") //'STREAM' arg indicates that > stream1/2/3/4 are names of streams. 'key1' is the key on which stream1 is joined > .join ("stream2", "userId", "stream1") //join stream2 on > stream2.userId=stream1.key1 > .join ("stream3", "key3","stream2") //join stream3 on > stream3.key3=stream2.userId > .leftjoin ("stream4", "key4","stream3") //left join stream4 on > stream4.key4=stream3.key3 > .select("userId, key4, key2, key3") // choose output fields > .withWindowLength(..)
> .withSlidingInterval(..); > {code} > Or based on named source components : > {code} > new JoinBolt(SOURCE, "kafkaSpout1", "key1") //'SOURCE' arg indicates that > kafkaSpout1, hdfsSpout3 etc are names of upstream components > .join ("kafkaSpout2", "userId","kafkaSpout1" ) > .join ("hdfsSpout3", "key3", "kafkaSpout2") > .leftjoin ("mqttSpout1", "key4", "hdfsSpout3") > .select ("userId, key4, key2, key3") > .withWindowLength(..) > .withSlidingInterval(..); > {code} > In order for the tuples to be joined correctly, 'fields grouping' should be > employed on the incoming streams. Each stream should be grouped on the same > key that will be used to join it against the other streams. This is a > restriction compared to SQL, which allows joining a table with others on any key > and any number of keys. > *For example:* If a 'Stream1' is Fields Grouped on 'key1', we cannot use a > different 'key2' on 'Stream1' to join it with other streams. However, > 'Stream1' can be joined using the same key with multiple other streams as > shown in this SQL. > {code} > select > from stream1 > join stream2 on stream2.userId = stream1.key1 > join stream3 on stream3.key3 = stream1.key2 // not supportable in Join > Bolt > {code} > Consequently, the join bolt's syntax is a bit simplified compared to SQL. The > key name for any given stream only appears once, as soon as the stream is > introduced for the first time in the join. Thereafter that key is implicitly > used for joining. See the case of 'stream3' being joined with both 'stream2' > and 'stream4' in the first example. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2367) Documentation for streams API
Arun Mahadevan created STORM-2367: - Summary: Documentation for streams API Key: STORM-2367 URL: https://issues.apache.org/jira/browse/STORM-2367 Project: Apache Storm Issue Type: Sub-task Reporter: Arun Mahadevan Assignee: Arun Mahadevan Add Readme/documentation for the streams API. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (STORM-2365) Support for specifying output stream in event hubs spout
Arun Mahadevan created STORM-2365: - Summary: Support for specifying output stream in event hubs spout Key: STORM-2365 URL: https://issues.apache.org/jira/browse/STORM-2365 Project: Apache Storm Issue Type: Improvement Reporter: Arun Mahadevan Assignee: Arun Mahadevan -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (STORM-2282) Streams api - provide options for handling errors
[ https://issues.apache.org/jira/browse/STORM-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820222#comment-15820222 ] Arun Mahadevan edited comment on STORM-2282 at 1/12/17 5:31 AM: [~roshan_naik] bq. Its ok to keep retry-ing errors that are considered retry-worthy. The non-retry worthy are the ones we must fail fast and move them out as they will cause a jam and prevent good tuples from flowing. Also no good way to recover from that. bq.One approach to do this... bq.For non-retry worthy error conditions (like bad data), any processing element in the pipeline can throw a specific exception. This can then be handled by the runtime to send that tuple to a configurable dead letter queue. Ideally this needs a new kind of Fail() notification in the spout to avoid re-emit. Alternatively, we can send the spout an ACK instead of FAIL to avoid retry. The DeadLetterQ bolt's metrics will capture these failure metrics. Better not to have each spout/bolt explicitly deal with dead letter queue ... that will complicate the topology definition...as every spout bolt will need to be configured and wired up. bq. For retry-worthy errors (timeouts, destination unavailable, etc)... The existing retry mechanism can kick in. However, in today's core API, there is one pain point for spout writers. Each spout needs to implement the logic to track inflight tuples and attempt retry on fail(). The implementation is moderately complicated as ACKs/Fails can come in any order. All the spouts have to do the same thing but end up doing slightly differently. Some have retry limits, some don't. This retry logic should ideally be lifted out of the Spout and handled in the API. This new API is a good opportunity to address this issue. Yes the idea is good if we can expose the right api for users and also keep the implementation simple. At a high level there could be a global api at StreamBuilder or a more granular one at a specific stage in the pipeline. 
Something like, {code:java} // globally at stream builder streamBuilder.setRetryPolicy(...); Stream errors = streamBuilder.deadLetterQueue(); // at stream api level Stream[] streams = stream.map(..).branchErrors(); Stream success = streams[0]; Stream errors = streams[1]; {code} We also need to see how it will work with Storm's current timeout based replay mechanism and so on. We can discuss further and come up with the right approach. > Streams api - provide options for handling errors > - > > Key: STORM-2282 > URL: https://issues.apache.org/jira/browse/STORM-2282 >
[jira] [Closed] (STORM-2203) Add a getAll method to KeyValueState interface
[ https://issues.apache.org/jira/browse/STORM-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan closed STORM-2203. - > Add a getAll method to KeyValueState interface > -- > > Key: STORM-2203 > URL: https://issues.apache.org/jira/browse/STORM-2203 > Project: Apache Storm > Issue Type: Improvement > Components: storm-core >Reporter: Abhishek > Fix For: 2.0.0, 1.1.0 > > Time Spent: 10m > Remaining Estimate: 0h > > A getAll method which would return all the key value pairs present in the > state could be really useful in Stateful bolts. Example use case - Loop over > all key value pairs in state on receiving a tick tuple and store all values > satisfying a given criterion to database. > I'll be happy to provide a patch if you guys think it's a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-2203) Add a getAll method to KeyValueState interface
[ https://issues.apache.org/jira/browse/STORM-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-2203. --- Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 > Add a getAll method to KeyValueState interface > -- > > Key: STORM-2203 > URL: https://issues.apache.org/jira/browse/STORM-2203 > Project: Apache Storm > Issue Type: Improvement > Components: storm-core >Reporter: Abhishek > Fix For: 2.0.0, 1.1.0 > > Time Spent: 10m > Remaining Estimate: 0h > > A getAll method which would return all the key value pairs present in the > state could be really useful in Stateful bolts. Example use case - Loop over > all key value pairs in state on receiving a tick tuple and store all values > satisfying a given criterion to database. > I'll be happy to provide a patch if you guys think it's a good idea. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
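The proposed getAll could look roughly like this against a minimal in-memory state. This is only a sketch: the real KeyValueState interface lives in storm-core and carries more methods (snapshot, restore, etc.):

```java
import java.util.*;

// Sketch of the proposed getAll addition on a minimal key/value state.
interface KeyValueState<K, V> {
    void put(K key, V value);
    V get(K key);
    Map<K, V> getAll();   // proposed: snapshot of all key/value pairs
}

class InMemoryKeyValueState<K, V> implements KeyValueState<K, V> {
    private final Map<K, V> state = new HashMap<>();

    public void put(K key, V value) { state.put(key, value); }

    public V get(K key) { return state.get(key); }

    // Return a copy so callers can iterate (e.g. on a tick tuple) without
    // exposing or mutating the internal state.
    public Map<K, V> getAll() { return new HashMap<>(state); }
}
```

For the use case in the description, a stateful bolt would call getAll() when it receives a tick tuple, filter the entries by the criterion, and persist the matches to the database.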
[jira] [Closed] (STORM-1886) Extend KeyValueState interface with delete method
[ https://issues.apache.org/jira/browse/STORM-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan closed STORM-1886. - > Extend KeyValueState interface with delete method > - > > Key: STORM-1886 > URL: https://issues.apache.org/jira/browse/STORM-1886 > Project: Apache Storm > Issue Type: Improvement >Reporter: Balazs Kossovics >Assignee: Balazs Kossovics > Fix For: 2.0.0, 1.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Even if the implementation of checkpointing only uses the get/put methods of > the KeyValueState interface, the existence of a delete method could be really > useful in the general case. > I made a first implementation; what do you think about it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1886) Extend KeyValueState interface with delete method
[ https://issues.apache.org/jira/browse/STORM-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan resolved STORM-1886. --- Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 > Extend KeyValueState interface with delete method > - > > Key: STORM-1886 > URL: https://issues.apache.org/jira/browse/STORM-1886 > Project: Apache Storm > Issue Type: Improvement >Reporter: Balazs Kossovics >Assignee: Balazs Kossovics > Fix For: 2.0.0, 1.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Even if the implementation of checkpointing only uses the get/put methods of > the KeyValueState interface, the existence of a delete method could be really > useful in the general case. > I made a first implementation; what do you think about it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
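A sketch of what the delete extension might look like over a minimal in-memory state. The names are illustrative; the actual interface and its Redis/in-memory implementations live in storm-core:

```java
import java.util.*;

// Sketch: a delete method alongside the existing get/put, returning the
// removed value (or null if the key was absent), mirroring Map.remove.
interface DeletableKeyValueState<K, V> {
    void put(K key, V value);
    V get(K key);
    V delete(K key);   // proposed extension
}

class InMemoryDeletableState<K, V> implements DeletableKeyValueState<K, V> {
    private final Map<K, V> state = new HashMap<>();

    public void put(K key, V value) { state.put(key, value); }

    public V get(K key) { return state.get(key); }

    // Remove the entry and hand back the old value so callers can act on it.
    public V delete(K key) { return state.remove(key); }
}
```

Returning the removed value keeps delete symmetric with get and lets stateful bolts log or emit the evicted entry without an extra lookup.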
[jira] [Created] (STORM-2262) Streams api - add option to access late tuple stream
Arun Mahadevan created STORM-2262: - Summary: Streams api - add option to access late tuple stream Key: STORM-2262 URL: https://issues.apache.org/jira/browse/STORM-2262 Project: Apache Storm Issue Type: Sub-task Reporter: Arun Mahadevan Could be an additional api like windowWithLatetupleStream(...) which returns an array of two streams (windowed and late tuple stream) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
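The proposed split could behave roughly as below. This is a simplified sketch that classifies events by timestamp against the window end; the real windowWithLatetupleStream(...) would hand back two Stream objects wired into the topology, and lateness would be judged against the watermark rather than a fixed boundary:

```java
import java.util.*;

// Sketch: partition event timestamps into [windowed, late] relative to a
// window end. Index 0 holds the on-time events, index 1 the late ones.
public class LateSplitSketch {
    public static List<List<Long>> split(List<Long> eventTimestamps, long windowEnd) {
        List<Long> onTime = new ArrayList<>();
        List<Long> late = new ArrayList<>();
        for (long ts : eventTimestamps) {
            if (ts <= windowEnd) {
                onTime.add(ts);
            } else {
                late.add(ts);
            }
        }
        return Arrays.asList(onTime, late);
    }
}
```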
[jira] [Created] (STORM-2261) Streams api - support Union
Arun Mahadevan created STORM-2261: - Summary: Streams api - support Union Key: STORM-2261 URL: https://issues.apache.org/jira/browse/STORM-2261 Project: Apache Storm Issue Type: Sub-task Reporter: Arun Mahadevan Union returns a new stream that contains the union of the elements in the source stream and the other stream. Similar constructs are supported in Spark Streaming and Flink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
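Matching the Spark/Flink semantics mentioned above, union would simply contain every element of both streams, duplicates included (it is a bag union, not a set union). A minimal sketch over lists standing in for streams:

```java
import java.util.*;

// Sketch: union as plain concatenation of two inputs, no deduplication,
// mirroring the union semantics of Spark Streaming and Flink.
public class UnionSketch {
    public static <T> List<T> union(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }
}
```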
[jira] [Created] (STORM-2260) Streams api - support side inputs
Arun Mahadevan created STORM-2260: - Summary: Streams api - support side inputs Key: STORM-2260 URL: https://issues.apache.org/jira/browse/STORM-2260 Project: Apache Storm Issue Type: Sub-task Reporter: Arun Mahadevan Side inputs are additional inputs computed at runtime that can be used within the main computation. E.g. Joining an IP address stream with a blacklist (side-input) which is periodically updated. Also see - https://cloud.google.com/dataflow/model/par-do#side-inputs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
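A side input could be modeled as a mutable reference that the main computation consults on every element and that is refreshed out of band. A minimal sketch of the blacklist example (names are illustrative; in a real topology the refresh would happen periodically at runtime):

```java
import java.util.*;

// Sketch: a periodically refreshed blacklist (the side input) consulted by
// the main IP-address stream. "volatile" lets a refresher thread swap in a
// new snapshot that the processing thread sees atomically.
public class SideInputSketch {
    private volatile Set<String> blacklist = new HashSet<>();

    // Called from the side-input refresh path with the latest snapshot.
    public void refresh(Set<String> latest) {
        blacklist = latest;
    }

    // Main computation: keep only the IPs that appear in the blacklist.
    public List<String> filterBlacklisted(List<String> ips) {
        List<String> hits = new ArrayList<>();
        for (String ip : ips) {
            if (blacklist.contains(ip)) {
                hits.add(ip);
            }
        }
        return hits;
    }
}
```

Swapping the whole set on refresh (rather than mutating it in place) keeps each element's lookup consistent with exactly one snapshot of the side input.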
[jira] [Created] (STORM-2257) Add built in support for sum function with different types in storm-sql standalone mode
Arun Mahadevan created STORM-2257: - Summary: Add built in support for sum function with different types in storm-sql standalone mode Key: STORM-2257 URL: https://issues.apache.org/jira/browse/STORM-2257 Project: Apache Storm Issue Type: Bug Reporter: Arun Mahadevan Assignee: Arun Mahadevan Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-2154) Prototype beam runner using unified streams api
Arun Mahadevan created STORM-2154: - Summary: Prototype beam runner using unified streams api Key: STORM-2154 URL: https://issues.apache.org/jira/browse/STORM-2154 Project: Apache Storm Issue Type: Sub-task Reporter: Arun Mahadevan Assignee: Arun Mahadevan This is mostly to identify any gaps and validate the proposed apis. It will be just a prototype runner using the apis proposed in STORM-1961. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (STORM-1961) Come up with streams api for storm core use cases
[ https://issues.apache.org/jira/browse/STORM-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Mahadevan updated STORM-1961: -- Attachment: UnifiedStreamapiforStorm_v2.pdf Updated doc with more details on the current implementation and possible approaches for the next phase to provide stronger guarantees. The doc and the example in the PR should be a good starting point to get a good understanding of the current implementation and provide feedback. The PR - https://github.com/apache/storm/pull/1693 > Come up with streams api for storm core use cases > - > > Key: STORM-1961 > URL: https://issues.apache.org/jira/browse/STORM-1961 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > Attachments: UnifiedStreamapiforStorm.pdf, > UnifiedStreamapiforStorm_v2.pdf > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)