[jira] [Reopened] (STORM-1347) ui changes to display the topology version.
[ https://issues.apache.org/jira/browse/STORM-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt reopened STORM-1347: - Accidentally closed. > ui changes to display the topology version. > > > Key: STORM-1347 > URL: https://issues.apache.org/jira/browse/STORM-1347 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1347) ui changes to display the topology version.
[ https://issues.apache.org/jira/browse/STORM-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-1347. - Resolution: Fixed > ui changes to display the topology version. > > > Key: STORM-1347 > URL: https://issues.apache.org/jira/browse/STORM-1347 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1604) Delayed transition should handle NotALeaderException
Parth Brahmbhatt created STORM-1604: --- Summary: Delayed transition should handle NotALeaderException Key: STORM-1604 URL: https://issues.apache.org/jira/browse/STORM-1604 Project: Apache Storm Issue Type: Bug Components: storm-core Affects Versions: 1.0.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 1.0.0 Currently, if an action (kill, rebalance) is scheduled with a delay, nimbus stores the state in zookeeper and then schedules a delayed event to do the final transition. If during this wait time the leader nimbus loses leadership, the delayed operation receives a NotALeaderException when executed, which it does not handle, causing nimbus to die. We should catch the exception and ignore it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
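The fix described above amounts to wrapping the delayed transition in a try/catch. A minimal sketch, with class and method names that are illustrative stand-ins rather than Storm's actual nimbus internals:

```java
public class DelayedTransitionSketch {
    /** Hypothetical stand-in for the NotALeaderException mentioned in the ticket. */
    public static class NotALeaderException extends RuntimeException {
        public NotALeaderException(String msg) { super(msg); }
    }

    public interface Transition {
        void run() throws NotALeaderException;
    }

    /**
     * Runs the final transition of a delayed action (kill, rebalance).
     * Returns true if it ran, false if this node lost leadership in the
     * meantime; the exception is swallowed instead of killing nimbus.
     */
    public static boolean runDelayed(Transition transition) {
        try {
            transition.run();
            return true;
        } catch (NotALeaderException e) {
            // The new leader will pick the action up from the state stored
            // in ZooKeeper, so dropping this event here is safe.
            return false;
        }
    }
}
```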
[jira] [Created] (STORM-1569) Allowing users to specify the nimbus thrift server queue size.
Parth Brahmbhatt created STORM-1569: --- Summary: Allowing users to specify the nimbus thrift server queue size. Key: STORM-1569 URL: https://issues.apache.org/jira/browse/STORM-1569 Project: Apache Storm Issue Type: Improvement Components: storm-core Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 1.0.0 Currently the nimbus server in secure mode uses a ThreadPoolExecutor (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html) backed by a SynchronousQueue (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/SynchronousQueue.html); see https://github.com/apache/thrift/blob/0.9.2/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java#L132. This means that if all executor threads are busy serving requests and new requests come in, we will see RejectedExecutionExceptions in the logs once they have reached the retry limit. Instead we should allow the requests to be queued. This patch allows requests to be queued by replacing the SynchronousQueue with an ArrayBlockingQueue (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ArrayBlockingQueue.html) with a default size of 10 requests, which should be large enough for most applications. Applications can modify this default by adding the config nimbus.queue.size to their storm.yaml and bouncing nimbus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
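The change can be pictured with plain java.util.concurrent types. This is a sketch of the idea only, not Storm's actual thrift-server wiring, and the pool sizes are arbitrary:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class NimbusQueueSketch {
    /** Pool backed by a SynchronousQueue: rejects a task the moment all threads are busy. */
    public static ThreadPoolExecutor rejectingPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Pool backed by an ArrayBlockingQueue: buffers up to queueSize waiting requests. */
    public static ThreadPoolExecutor queueingPool(int threads, int queueSize) {
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }
}
```

With the queueing variant, bursts beyond the thread count wait in the queue instead of triggering RejectedExecutionException; only once the queue itself is full does rejection kick in.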
[jira] [Resolved] (STORM-1147) Storm JDBCBolt should add validation to ensure either insertQuery or table name is specified and not both.
[ https://issues.apache.org/jira/browse/STORM-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-1147. - Resolution: Fixed > Storm JDBCBolt should add validation to ensure either insertQuery or table > name is specified and not both. > -- > > Key: STORM-1147 > URL: https://issues.apache.org/jira/browse/STORM-1147 > Project: Apache Storm > Issue Type: Bug > Components: storm-jdbc >Affects Versions: 0.10.0 >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt >Priority: Trivial > Fix For: 1.0.0 > > > The JDBCBolt takes either an insert query or a table name but does not do any > validation check to ensure only one of the two options is provided. We should > add a validation check and throw an exception with proper messaging to avoid > confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1521) When using Kerberos login from keytab with multiple bolts/executors ticket is not renewed
[ https://issues.apache.org/jira/browse/STORM-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-1521. - Resolution: Fixed Fix Version/s: 2.0.0 > When using Kerberos login from keytab with multiple bolts/executors ticket is > not renewed > - > > Key: STORM-1521 > URL: https://issues.apache.org/jira/browse/STORM-1521 > Project: Apache Storm > Issue Type: Bug > Components: storm-hbase >Affects Versions: 0.10.0, 0.9.5 >Reporter: Dan Bahir >Assignee: Dan Bahir > Fix For: 2.0.0 > > > When logging in with a keytab, if the topology has more than one instance of > an HBase bolt then the ticket will not be automatically renewed. > Expected: The ticket will be automatically renewed and the bolt will be able > to write to the database. > Actual: The ticket is not renewed and the bolt loses access to HBase. > Note: when there is only one bolt with one executor, it renews correctly. > Exception in bolt is: > 2015-12-18T09:41:13.862-0500 o.a.h.s.UserGroupInformation [ERROR] > PriviledgedActionException as:u...@somewhere.com > cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any > Kerberos tgt)] > 2015-12-18T09:41:13.862-0500 o.a.h.i.RpcClient [WARN] Exception encountered > while connecting to the server : javax.security.sasl.SaslException: GSS > initiate > failed [Caused by GSSException: No valid credentials provided (Mechanism > level: > Failed to find any Kerberos tgt)] > 2015-12-18T09:41:13.863-0500 o.a.h.i.RpcClient [ERROR] SASL authentication > failed. The most likely cause is missing or invalid credentials. Consider > 'kinit'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1199) Create HDFS Spout
[ https://issues.apache.org/jira/browse/STORM-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-1199. - Resolution: Fixed Fix Version/s: 1.0.0 > Create HDFS Spout > - > > Key: STORM-1199 > URL: https://issues.apache.org/jira/browse/STORM-1199 > Project: Apache Storm > Issue Type: New Feature >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: 1.0.0 > > Attachments: HDFSSpoutforStorm v2.pdf, HDFSSpoutforStorm.pdf, > hdfs-spout.1.patch > > > Create an HDFS spout so that Storm can ingest data from files in an HDFS > directory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1423) storm UI in a secure env shows error even when credentials are present
Parth Brahmbhatt created STORM-1423: --- Summary: storm UI in a secure env shows error even when credentials are present Key: STORM-1423 URL: https://issues.apache.org/jira/browse/STORM-1423 Project: Apache Storm Issue Type: Bug Components: storm-core Affects Versions: 0.10.1 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 0.11.0 storm UI in a secure env shows error even when credentials are present -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1381) Client side topology submission hook.
Parth Brahmbhatt created STORM-1381: --- Summary: Client side topology submission hook. Key: STORM-1381 URL: https://issues.apache.org/jira/browse/STORM-1381 Project: Apache Storm Issue Type: New Feature Components: storm-core Affects Versions: 0.11.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Trivial Fix For: 0.11.0 A client-side hook is supposed to be invoked when a user submits a topology using TopologySubmitter. We already have nimbus-side hooks for all the topology actions; however, those only work if users don't need to inspect the topology being submitted or the classes that make up the topology (spouts and bolts), since on the nimbus side these classes are not available on the classpath. As a concrete example, at Hortonworks we wanted to integrate Storm with Atlas to provide complete lineage of data even when it passes through a Storm topology. Atlas needed to look inside the topology components (e.g. the Kafka spout, to figure out which topic the data is being pulled from, or the HBase bolt, to figure out which cluster and what table the data is being pushed into) to give meaningful lineage. We originally proposed that they use the server-side hook, but with that they had to download the user-uploaded jar and add it to the classpath dynamically, or spin up a new JVM whose output would then be read by the Atlas integration hook. The client-side hook makes this easy when the topology itself needs to be examined. We are using this in our internal repo for Atlas integration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
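The essential property of such a hook is that it runs in the submitting JVM, where the spout and bolt classes are on the classpath. A sketch with entirely hypothetical names (the ticket does not spell out the final interface here):

```java
import java.util.Map;

/** Hypothetical sketch of a client-side submission hook; names are illustrative. */
public class SubmitterHookSketch {
    /** Stand-in for the fully-constructed topology object available on the client. */
    public static class Topology {
        public final String name;
        public Topology(String name) { this.name = name; }
    }

    public interface SubmitterHook {
        /** Runs in the submitting JVM, so spout/bolt classes can be inspected directly. */
        void notify(Topology topology, Map<String, Object> conf);
    }

    /** What a submitter would do right after uploading the topology jar. */
    public static void invokeHooks(Iterable<SubmitterHook> hooks,
                                   Topology topology, Map<String, Object> conf) {
        for (SubmitterHook hook : hooks) {
            hook.notify(topology, conf);  // e.g. Atlas walks spouts/bolts here
        }
    }
}
```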
[jira] [Commented] (STORM-1346) upgrade topology CLI tool
[ https://issues.apache.org/jira/browse/STORM-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15040780#comment-15040780 ] Parth Brahmbhatt commented on STORM-1346: - Code available in https://github.com/Parth-Brahmbhatt/incubator-storm/tree/STORM-1346 but is blocked until https://github.com/apache/storm/pull/922 gets merged. > upgrade topology CLI tool > -- > > Key: STORM-1346 > URL: https://issues.apache.org/jira/browse/STORM-1346 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1346) upgrade topology CLI tool
Parth Brahmbhatt created STORM-1346: --- Summary: upgrade topology CLI tool Key: STORM-1346 URL: https://issues.apache.org/jira/browse/STORM-1346 Project: Apache Storm Issue Type: Sub-task Components: storm-core Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1345) Thrift, nimbus, zookeeper, supervisor and worker changes to support update topology.
Parth Brahmbhatt created STORM-1345: --- Summary: Thrift, nimbus, zookeeper, supervisor and worker changes to support update topology. Key: STORM-1345 URL: https://issues.apache.org/jira/browse/STORM-1345 Project: Apache Storm Issue Type: Sub-task Components: storm-core Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1347) ui changes to display the topology version.
Parth Brahmbhatt created STORM-1347: --- Summary: ui changes to display the topology version. Key: STORM-1347 URL: https://issues.apache.org/jira/browse/STORM-1347 Project: Apache Storm Issue Type: Sub-task Components: storm-core Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1187) Support for late and out of order events in time based windows
[ https://issues.apache.org/jira/browse/STORM-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011207#comment-15011207 ] Parth Brahmbhatt commented on STORM-1187: - [~arunmahadevan] Do you want to post your design doc for review? > Support for late and out of order events in time based windows > -- > > Key: STORM-1187 > URL: https://issues.apache.org/jira/browse/STORM-1187 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan > > Right now the time-based windows use the timestamp at which the tuple is > received by the bolt. > However, there are use cases where tuples should be processed based on the > time when they were actually generated rather than the time when they were received. So > we need to add support for processing events with a time lag and also have > some way to specify and read tuple timestamps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1098) Storm Nimbus Hook
[ https://issues.apache.org/jira/browse/STORM-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-1098. - Resolution: Fixed Fix Version/s: 0.11.0 > Storm Nimbus Hook > - > > Key: STORM-1098 > URL: https://issues.apache.org/jira/browse/STORM-1098 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Sriharsha Chintalapani >Assignee: Parth Brahmbhatt > Fix For: 0.11.0 > > > Apache Atlas provides governance services and also lineage. It would be great > if we could capture the topology changes as part of Apache Atlas so that users > can see how the topology changed over time. > Storm has ITaskHook, but this is for topology components like spouts & bolts. > Similar to ITaskHook we should provide a nimbus hook that allows pluggable > implementations to run; nimbus will execute these on any > topology operation like upload jar, download, activate, deactivate, kill > etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1098) Storm Nimbus Hook
[ https://issues.apache.org/jira/browse/STORM-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992692#comment-14992692 ] Parth Brahmbhatt commented on STORM-1098: - [~svenkat] Given you are the reporter for ATLAS-183 and ATLAS-181, please review the interface and let me know if this will suffice for Atlas integration. > Storm Nimbus Hook > - > > Key: STORM-1098 > URL: https://issues.apache.org/jira/browse/STORM-1098 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Sriharsha Chintalapani >Assignee: Parth Brahmbhatt > > Apache Atlas provides governance services and also lineage. It would be great > if we could capture the topology changes as part of Apache Atlas so that users > can see how the topology changed over time. > Storm has ITaskHook, but this is for topology components like spouts & bolts. > Similar to ITaskHook we should provide a nimbus hook that allows pluggable > implementations to run; nimbus will execute these on any > topology operation like upload jar, download, activate, deactivate, kill > etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (STORM-1098) Storm Nimbus Hook
[ https://issues.apache.org/jira/browse/STORM-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt reassigned STORM-1098: --- Assignee: Parth Brahmbhatt (was: Sriharsha Chintalapani) > Storm Nimbus Hook > - > > Key: STORM-1098 > URL: https://issues.apache.org/jira/browse/STORM-1098 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Sriharsha Chintalapani >Assignee: Parth Brahmbhatt > > Apache Atlas provides governance services and also lineage. It would be great > if we could capture the topology changes as part of Apache Atlas so that users > can see how the topology changed over time. > Storm has ITaskHook, but this is for topology components like spouts & bolts. > Similar to ITaskHook we should provide a nimbus hook that allows pluggable > implementations to run; nimbus will execute these on any > topology operation like upload jar, download, activate, deactivate, kill > etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1147) Storm JDBCBolt should add validation to ensure either insertQuery or table name is specified and not both.
Parth Brahmbhatt created STORM-1147: --- Summary: Storm JDBCBolt should add validation to ensure either insertQuery or table name is specified and not both. Key: STORM-1147 URL: https://issues.apache.org/jira/browse/STORM-1147 Project: Apache Storm Issue Type: Bug Components: storm-jdbc Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Trivial Fix For: 0.11.0 The JDBCBolt takes either an insert query or a table name but does not do any validation check to ensure only one of the two options is provided. We should add a validation check and throw an exception with proper messaging to avoid confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
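The requested check boils down to "exactly one of the two options". A sketch, with a hypothetical helper name rather than the bolt's real constructor:

```java
public class JdbcBoltValidationSketch {
    /**
     * Illustrative version of the requested check: exactly one of tableName
     * or insertQuery must be supplied, with a clear message either way.
     */
    public static void validate(String tableName, String insertQuery) {
        boolean hasTable = tableName != null && !tableName.trim().isEmpty();
        boolean hasQuery = insertQuery != null && !insertQuery.trim().isEmpty();
        if (hasTable && hasQuery) {
            throw new IllegalArgumentException(
                    "Specify either insertQuery or tableName, not both.");
        }
        if (!hasTable && !hasQuery) {
            throw new IllegalArgumentException(
                    "One of insertQuery or tableName must be specified.");
        }
    }
}
```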
[jira] [Commented] (STORM-1139) Issues regarding storm-postgresql interface
[ https://issues.apache.org/jira/browse/STORM-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981895#comment-14981895 ] Parth Brahmbhatt commented on STORM-1139: - You can post your question to u...@storm.apache.org; see https://storm.apache.org/community.html for details on how to subscribe. If I understand correctly, you want to write a storm topology where one of the components writes to a PostgreSQL DB. We have a JDBC connector that you can try out: https://github.com/apache/storm/tree/master/external/storm-jdbc. For an example topology see https://github.com/apache/storm/blob/master/external/storm-jdbc/src/test/java/org/apache/storm/jdbc/topology/UserPersistanceTopology.java > Issues regarding storm-postgresql interface > --- > > Key: STORM-1139 > URL: https://issues.apache.org/jira/browse/STORM-1139 > Project: Apache Storm > Issue Type: Bug >Reporter: hima >Assignee: hima > > Hi > I am trying to write a storm bolt to insert data in a PostgreSQL DB, but I am > facing issues like > java.io.NotSerializableException: org.postgresql.jdbc4.Jdbc4Connection. > Can anyone provide me full code for a storm bolt that can insert data into a > postgres database. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-726) Adding nimbus.host config for backward compatibility of client config
[ https://issues.apache.org/jira/browse/STORM-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-726. Resolution: Fixed > Adding nimbus.host config for backward compatibility of client config > - > > Key: STORM-726 > URL: https://issues.apache.org/jira/browse/STORM-726 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > > As part of the Nimbus HA initiative we added nimbus discovery for clients based on > a new config called nimbus.seeds, where users can specify a list of nimbus > hosts that clients can contact to figure out the leader nimbus address. We > deleted the nimbus.host config, which is one value that all users modify in > their cluster setup. Deleting this config is a backward-incompatible change > and will pretty much force everyone to update their client config even if > they don't want nimbus HA. For backward compatibility it is better to fall back > to nimbus.host when the nimbus.seeds config has no value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
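The fallback described above is small in spirit: prefer nimbus.seeds, fall back to the legacy nimbus.host. The config keys come from the ticket; the helper itself is a sketch, not Storm's actual config-reading code:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class NimbusSeedsFallbackSketch {
    /**
     * Resolve the nimbus seed list: use nimbus.seeds when present and
     * non-empty, otherwise fall back to the legacy nimbus.host so existing
     * client configs keep working.
     */
    @SuppressWarnings("unchecked")
    public static List<String> nimbusSeeds(Map<String, Object> conf) {
        Object seeds = conf.get("nimbus.seeds");
        if (seeds instanceof List && !((List<?>) seeds).isEmpty()) {
            return (List<String>) seeds;
        }
        Object host = conf.get("nimbus.host");
        return host == null ? Collections.<String>emptyList()
                            : Collections.singletonList((String) host);
    }
}
```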
[jira] [Resolved] (STORM-654) Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper.
[ https://issues.apache.org/jira/browse/STORM-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-654. Resolution: Fixed > Create a thrift API to discover nimbus so all the clients are not forced to > contact zookeeper. > -- > > Key: STORM-654 > URL: https://issues.apache.org/jira/browse/STORM-654 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > > The current implementation of Nimbus-HA requires each nimbus client to discover > nimbus hosts by contacting zookeeper. In order to reduce the load on > zookeeper we could expose a thrift API as described in the future improvement > section of the Nimbus HA design doc. > We will add an extra field in the ClusterSummary structure called nimbuses. > struct ClusterSummary { > 1: required list<SupervisorSummary> supervisors; > 2: required i32 nimbus_uptime_secs; > 3: required list<TopologySummary> topologies; > 4: required list<NimbusSummary> nimbuses; > } > struct NimbusSummary { > 1: required string host; > 2: required i32 port; > 3: required i32 uptimeSecs; > 4: required bool isLeader; > 5: required string version; > 6: optional list<string> local_storm_ids; // need a better name, but this is the list of storm-ids for which this nimbus host has the code available locally. > } > We will create a nimbus.hosts configuration which will serve as the seed list > of nimbus hosts. Any nimbus host can serve read requests, so any client > can issue a getClusterSummary call and extract the leader nimbus > summary from the list of nimbuses. All nimbus hosts will cache this > information to reduce the load on zookeeper. > In addition we can add a RedirectException. When a request that can only be > served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, activate) > is issued against a non-leader nimbus, the non-leader nimbus will throw a > RedirectException and the client will handle the exception by refreshing > its leader nimbus host and contacting that host as part of the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
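Client-side, the proposed RedirectException handling reduces to a refresh-and-retry loop. A sketch where the exception and the leader lookup are stand-ins for the proposed (not final) API:

```java
public class RedirectRetrySketch {
    /** Stand-in for the proposed RedirectException. */
    public static class RedirectException extends RuntimeException {}

    public interface LeaderLookup { String currentLeader(); }

    public interface NimbusCall<T> { T invoke(String host) throws RedirectException; }

    /**
     * Issues a leader-only call (submit, kill, rebalance, ...). On a redirect,
     * re-resolve the current leader and retry, up to maxRetries extra attempts.
     */
    public static <T> T callLeader(LeaderLookup lookup, NimbusCall<T> call, int maxRetries) {
        String host = lookup.currentLeader();
        for (int attempt = 0; ; attempt++) {
            try {
                return call.invoke(host);
            } catch (RedirectException e) {
                if (attempt >= maxRetries) throw e;
                host = lookup.currentLeader();  // leadership may have moved
            }
        }
    }
}
```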
[jira] [Resolved] (STORM-655) Add replication count as part of topology summary.
[ https://issues.apache.org/jira/browse/STORM-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-655. Resolution: Fixed > Add replication count as part of topology summary. > - > > Key: STORM-655 > URL: https://issues.apache.org/jira/browse/STORM-655 > Project: Apache Storm > Issue Type: Sub-task > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Parth Brahmbhatt > > With Nimbus HA each topology is replicated across multiple nimbus hosts. We > want to modify the UI/REST/Thrift APIs so we can expose the replication count > of a topology. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-123) NImbus BCP
[ https://issues.apache.org/jira/browse/STORM-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908427#comment-14908427 ] Parth Brahmbhatt commented on STORM-123: Given storm-166 is checked [~xumingming] do you think this is ok to resolve? > NImbus BCP > -- > > Key: STORM-123 > URL: https://issues.apache.org/jira/browse/STORM-123 > Project: Apache Storm > Issue Type: Wish >Reporter: James Xu >Priority: Minor > > https://github.com/nathanmarz/storm/issues/737 > Hi, > We are building a system where we need to have a BCP for nimbus box. > Topologies will already be deployed on nimbus1 box while nimbus2 is our BCP. > If nimbus1 goes down due to hardware failure how do get nimbus2 in so that we > can control the start/kill of the topologies which were already deployed on > nimbus1. I understand that topologies deployed earlier will continue to run > since the jars have already been distributed to the supervisors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (STORM-123) NImbus BCP
[ https://issues.apache.org/jira/browse/STORM-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908427#comment-14908427 ] Parth Brahmbhatt edited comment on STORM-123 at 9/25/15 6:05 PM: - Given storm-166 is merged [~xumingming] do you think this is ok to resolve? was (Author: parth.brahmbhatt): Given storm-166 is checked [~xumingming] do you think this is ok to resolve? > NImbus BCP > -- > > Key: STORM-123 > URL: https://issues.apache.org/jira/browse/STORM-123 > Project: Apache Storm > Issue Type: Wish >Reporter: James Xu >Priority: Minor > > https://github.com/nathanmarz/storm/issues/737 > Hi, > We are building a system where we need to have a BCP for nimbus box. > Topologies will already be deployed on nimbus1 box while nimbus2 is our BCP. > If nimbus1 goes down due to hardware failure how do get nimbus2 in so that we > can control the start/kill of the topologies which were already deployed on > nimbus1. I understand that topologies deployed earlier will continue to run > since the jars have already been distributed to the supervisors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1011) HBaseBolt default mapper should handle null values
Parth Brahmbhatt created STORM-1011: --- Summary: HBaseBolt default mapper should handle null values Key: STORM-1011 URL: https://issues.apache.org/jira/browse/STORM-1011 Project: Apache Storm Issue Type: Bug Components: storm-hbase Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 0.11.0 The HBase bolt's default mapper currently cannot handle null values. We should support insertion of null values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
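One null-tolerant behavior is simply to skip null tuple values instead of throwing (another option is writing an explicit empty marker). This sketch is independent of storm-hbase's real mapper API; the class and method names are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class NullSafeColumnsSketch {
    /**
     * Turn tuple fields into column values, silently omitting null values
     * rather than failing, which is the null-handling the ticket asks the
     * default mapper to provide.
     */
    public static Map<String, byte[]> toColumns(Map<String, Object> fields) {
        Map<String, byte[]> columns = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (e.getValue() == null) {
                continue;  // null value: skip the column instead of throwing
            }
            columns.put(e.getKey(),
                    e.getValue().toString().getBytes(StandardCharsets.UTF_8));
        }
        return columns;
    }
}
```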
[jira] [Created] (STORM-1012) Shade Jackson dependency
Parth Brahmbhatt created STORM-1012: --- Summary: Shade Jackson dependency Key: STORM-1012 URL: https://issues.apache.org/jira/browse/STORM-1012 Project: Apache Storm Issue Type: Bug Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 0.11.0 Shade the Jackson dependency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-996) netty-unit-tests/test-batch demonstrates out-of-order delivery
[ https://issues.apache.org/jira/browse/STORM-996?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14703522#comment-14703522 ] Parth Brahmbhatt commented on STORM-996: [~dagit] How are you reproducing this? I just ran this test in a loop 50 times and it passed every time for me. I am on the following Java version: ➜ incubator-storm git:(ha-merge) ✗ java -version java version 1.8.0_31 Java(TM) SE Runtime Environment (build 1.8.0_31-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode) netty-unit-tests/test-batch demonstrates out-of-order delivery -- Key: STORM-996 URL: https://issues.apache.org/jira/browse/STORM-996 Project: Apache Storm Issue Type: Bug Affects Versions: 0.10.0 Reporter: Derek Dagit Assignee: Derek Dagit Priority: Blocker backtype.storm.messaging.netty-unit-test/test-batch One example of output. Similar things happen sporadically and vary widely by number of failed assertions. Tuples are not just skewed, but actually seem to come in out-of-order. {quote} actual: (not (= 66040 66041)) at: test_runner.clj:105 expected: (= req_msg resp_msg) actual: (not (= 66041 66042)) at: test_runner.clj:105 expected: (= req_msg resp_msg) actual: (not (= 66042 66040)) at: test_runner.clj:105 {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14680283#comment-14680283 ] Parth Brahmbhatt commented on STORM-167: Sorry, I have been busy with some other stuff at work. Let me see if I can finish the work in the coming 2 weeks. proposal for storm topology online update - Key: STORM-167 URL: https://issues.apache.org/jira/browse/STORM-167 Project: Apache Storm Issue Type: New Feature Reporter: James Xu Assignee: Parth Brahmbhatt Priority: Minor https://github.com/nathanmarz/storm/issues/540 Right now, updating topology code can only be done by killing the topology and re-submitting a new one. During the kill and re-submit process some requests may be delayed or fail, which is not good for an online service, so we recently considered adding topology online update. Mission: update running topology code gracefully, one worker after another, without totally interrupting the service. Just update the topology code, not the topology DAG structure (components, streams and task numbers). Proposal * client uses storm update topology-name new-jar-file to submit a new-jar-file update request * nimbus updates the stormdist dir, linking topology-dir to the new one * nimbus updates the topology version on zk * the supervisors running this topology update it ** check the topology version on zk; if it is not the same as the local version, a topology update begins ** each supervisor schedules the topology's worker updates at a rand(expect-max-update-time) time point ** sync-supervisor downloads the latest code from nimbus ** sync-process checks the local worker heartbeat version (to be added); if it is not the same as the version sync-supervisor downloaded, it kills the worker ** sync-process restarts the killed worker ** the new worker heartbeats to zk with its version (to be added), which can be displayed on the web UI to check update progress. This feature is deployed in our production clusters. It's really useful for topologies handling online requests waiting for responses. Topology jars can be updated without taking the entire service offline. We hope that this feature is useful for others too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-821) storm-jdbc create a connection provider interface to decouple from hikariCP being the only connection pool implementation that can be used.
[ https://issues.apache.org/jira/browse/STORM-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt resolved STORM-821. Resolution: Fixed storm-jdbc create a connection provider interface to decouple from hikariCP being the only connection pool implementation that can be used. --- Key: STORM-821 URL: https://issues.apache.org/jira/browse/STORM-821 Project: Apache Storm Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor Fix For: 0.10.0 The current implementation of storm-jdbc is coupled with the HikariCP configuration. We propose to remove this coupling by introducing a connectionProvider interface with a default HikariCP implementation. This will allow users to do their own connection pool management or choose a different connection pooling library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
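The decoupling amounts to one small interface, with HikariCP demoted to just one implementation of it. A sketch; the prepare/getConnection/cleanup shape mirrors what the ticket describes, but treat the exact signatures as illustrative rather than storm-jdbc's final API:

```java
import java.io.Serializable;
import java.sql.Connection;

public class ConnectionProviderSketch {
    /** Pluggable source of JDBC connections; a HikariCP-backed pool is one implementation. */
    public interface ConnectionProvider extends Serializable {
        void prepare();             // e.g. build the pool, once per worker before first use
        Connection getConnection(); // hand out a (possibly pooled) connection
        void cleanup();             // release pool resources on shutdown
    }
}
```

A bolt that depends only on this interface can be handed a HikariCP provider, a DriverManager-backed provider, or a test stub, without any code change.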
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanelfocusedCommentId=14565030#comment-14565030 ] Parth Brahmbhatt commented on STORM-167: Hey, I haven't had time to look into this yet and I am busy with some other stuff, so I probably won't get to it in the next 2 months. If someone else wants to take it up, please feel free to do so. proposal for storm topology online update - Key: STORM-167 URL: https://issues.apache.org/jira/browse/STORM-167 Project: Apache Storm Issue Type: New Feature Reporter: James Xu Assignee: Parth Brahmbhatt Priority: Minor https://github.com/nathanmarz/storm/issues/540 Right now, updating topology code can only be done by killing the topology and re-submitting a new one. During the kill and re-submit process some requests may be delayed or fail, which is not good for an online service, so we recently considered adding topology online update. Mission: update running topology code gracefully, one worker after another, without totally interrupting the service. Just update the topology code, not the topology DAG structure (components, streams and task numbers). Proposal * client uses storm update topology-name new-jar-file to submit a new-jar-file update request * nimbus updates the stormdist dir, linking topology-dir to the new one * nimbus updates the topology version on zk * the supervisors running this topology update it ** check the topology version on zk; if it is not the same as the local version, a topology update begins ** each supervisor schedules the topology's worker updates at a rand(expect-max-update-time) time point ** sync-supervisor downloads the latest code from nimbus ** sync-process checks the local worker heartbeat version (to be added); if it is not the same as the version sync-supervisor downloaded, it kills the worker ** sync-process restarts the killed worker ** the new worker heartbeats to zk with its version (to be added), which can be displayed on the web UI to check update progress. This feature is deployed in our production clusters. It's really useful for topologies handling online requests waiting for responses. Topology jars can be updated without taking the entire service offline. We hope that this feature is useful for others too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-827) storm-hdfs requires you to ship a keytab to access secure HDFS
[ https://issues.apache.org/jira/browse/STORM-827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550867#comment-14550867 ] Parth Brahmbhatt commented on STORM-827: I think if you have AutoHDFS configured you do not need to specify keytabs because of this condition https://github.com/apache/storm/blob/master/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/common/security/HdfsSecurityUtil.java#L44 I never had a chance to test AutoTGT, but if we can confirm it works as expected this jira is still valid. storm-hdfs requires you to ship a keytab to access secure HDFS -- Key: STORM-827 URL: https://issues.apache.org/jira/browse/STORM-827 Project: Apache Storm Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Priority: Critical storm-hdfs assumes that you have to use a keytab to access secure HDFS. It should be able to work with either AutoTGT or AutoHDFS too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-821) storm-jdbc create a connection provider interface to decouple from hikariCP being the only connection pool implementation that can be used.
Parth Brahmbhatt created STORM-821: -- Summary: storm-jdbc create a connection provider interface to decouple from hikariCP being the only connection pool implementation that can be used. Key: STORM-821 URL: https://issues.apache.org/jira/browse/STORM-821 Project: Apache Storm Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor Fix For: 0.10.0 The current implementation of storm-jdbc is coupled to the HikariCP configuration. We propose to remove this coupling by introducing a connectionProvider interface with a default HikariCP implementation. This will allow users to do their own connection pool management or choose a different connection pooling library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
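A minimal sketch of what such a seam could look like, assuming the lifecycle of a storm-jdbc bolt (prepare/cleanup); the method names are illustrative and the interface actually shipped may differ:

```java
import java.io.Serializable;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical sketch of the proposed seam: storm-jdbc bolts/states would
// code against this interface, with HikariCP as just one implementation.
interface ConnectionProvider extends Serializable {
    // Called once from the bolt's prepare() so the pool is created on the worker,
    // not serialized with the topology.
    void prepare();

    // Hand out a connection; pooling strategy is entirely up to the implementation.
    Connection getConnection() throws SQLException;

    // Called from the bolt's cleanup() to release pool resources.
    void cleanup();
}
```

Extending Serializable matters here: the provider is created on the client when the topology is built, shipped with the topology, and only opens real connections in prepare() on the worker.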
[jira] [Commented] (STORM-766) Supervisor summary should include the version.
[ https://issues.apache.org/jira/browse/STORM-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495168#comment-14495168 ] Parth Brahmbhatt commented on STORM-766: Hi [~redsanket], welcome to the storm community and thanks for your interest in contributing. Sorry for not providing enough details on the jira in the first place. Storm will start supporting rolling upgrades in either version 0.10 or the next version, as most of our serialization has moved from java to thrift. Previous versions used java serialization, which meant the entire cluster had to be running the same version, so it made sense to have a single cluster version. With rolling upgrade this will change, and different hosts in the cluster might be running different versions. It will be useful for admins to see which hosts are on which versions. To expose this information, this jira proposes to include the storm version in the supervisor summary. This means adding a version string to this structure https://github.com/apache/storm/blob/master/storm-core/src/storm.thrift#L150 and re-generating the thrift structures using https://github.com/apache/storm/blob/master/storm-core/src/genthrift.sh. Supervisors will also have to store this information in zookeeper so nimbus can read it, so you will probably need to look into modifying https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/supervisor.clj#L505 and https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/cluster.clj#L397. Once this data is stored in zookeeper, nimbus needs to read it and populate the SupervisorSummary instances with the version https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/nimbus.clj#L1250.
Finally, the ui changes can be made here https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/ui/core.clj#L571 and https://github.com/apache/storm/blob/master/storm-core/src/ui/public/templates/index-page-template.html#L145. Let me know if you need any more information, and once again thanks for considering contributing. Supervisor summary should include the version. -- Key: STORM-766 URL: https://issues.apache.org/jira/browse/STORM-766 Project: Apache Storm Issue Type: Bug Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor Fix For: 0.10.0 With the support for rolling upgrade, different nodes in the cluster can run different versions of storm. We should include the version in SupervisorSummary, just like NimbusSummary, so admins can identify nodes that need upgrading/downgrading from the UI. As part of this change I will also add a supervisor/log link in the ui. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
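The thrift change described above would amount to adding an optional string field to the SupervisorSummary struct in storm.thrift, roughly like this. The existing fields and the field id shown are illustrative (check the current struct and pick the next free id before regenerating with genthrift.sh):

```thrift
struct SupervisorSummary {
  1: required string host;
  2: required i32 uptime_secs;
  3: required i32 num_workers;
  4: required i32 num_used_workers;
  5: required string supervisor_id;
  // New field: optional, so summaries from older supervisors that do not
  // report a version still deserialize cleanly during a rolling upgrade.
  6: optional string version;
}
```

Making the field optional is the key design point: during a rolling upgrade, nimbus must tolerate supervisors that have not yet been upgraded and therefore report no version.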
[jira] [Commented] (STORM-760) use JSON for conf serialization
[ https://issues.apache.org/jira/browse/STORM-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486044#comment-14486044 ] Parth Brahmbhatt commented on STORM-760: When I tested the rolling upgrade feature I did not upgrade clojure. I built and deployed a topology on a cluster, rebuilt the storm jar and replaced the old jar with new jar, bounced nimbus/ui/workers and supervisors and the same topology was working fine. But I agree that any kind of dependency change should be rolling upgradable. use JSON for conf serialization --- Key: STORM-760 URL: https://issues.apache.org/jira/browse/STORM-760 Project: Apache Storm Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans Now that STORM-634 has gone in, the only real place we still use java serialization that is not required by the contract with the end user is when nimbus writes out the topology conf, and the worker/supervisor tries to read it back in. We already write it out using JSON when submitting a topology, we should do the same here to avoid rolling upgrade issues, ideally compressed to save space too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-760) use JSON for conf serialization
[ https://issues.apache.org/jira/browse/STORM-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483831#comment-14483831 ] Parth Brahmbhatt commented on STORM-760: There is one other place: backtype.storm.utils.LocalState, which currently writes its state to local disk using java serialization. It is OK because it writes java's Hashtable, so I haven't seen it cause any issues, but it might be better to convert that to JSON as well. use JSON for conf serialization --- Key: STORM-760 URL: https://issues.apache.org/jira/browse/STORM-760 Project: Apache Storm Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Robert Joseph Evans -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-749) Remove CSRF check from rest API
Parth Brahmbhatt created STORM-749: -- Summary: Remove CSRF check from rest API Key: STORM-749 URL: https://issues.apache.org/jira/browse/STORM-749 Project: Apache Storm Issue Type: Task Affects Versions: 0.9.3 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 0.10.0 I think we can safely get rid of the whole CSRF code. CSRF vulnerability is only exposed when websites use session based authentication. In our case we only use http authentication so we are not really vulnerable to CSRF attacks. Currently the CSRF check only hinders non browser clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-741) Allow users to have a config value doAsUser that will be picked up by nimbus client.
Parth Brahmbhatt created STORM-741: -- Summary: Allow users to have a config value doAsUser that will be picked up by nimbus client. Key: STORM-741 URL: https://issues.apache.org/jira/browse/STORM-741 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Currently the only way users can impersonate another user is by creating a NimbusClient explicitly by passing a doAsUser Param. We should allow a config value doAsUser to ensure users can submit this config as part of storm jar command. This will be useful for storm-rest/ui component which now has the ability to submit topology. Using this config value the ui server can submit a storm topology on behalf of another user and the topology will still execute as the submitter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (STORM-741) Allow users to have a config value doAsUser that will be picked up by nimbus client.
[ https://issues.apache.org/jira/browse/STORM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt reassigned STORM-741: -- Assignee: Parth Brahmbhatt Allow users to have a config value doAsUser that will be picked up by nimbus client. -- Key: STORM-741 URL: https://issues.apache.org/jira/browse/STORM-741 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-727) Storm tests should succeed even if a storm process is running locally.
Parth Brahmbhatt created STORM-727: -- Summary: Storm tests should succeed even if a storm process is running locally. Key: STORM-727 URL: https://issues.apache.org/jira/browse/STORM-727 Project: Apache Storm Issue Type: Bug Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Trivial Currently the nimbus_auth_test tries to kick off a local nimbus process on the default port 6627, so if a developer is running storm locally this test always fails due to a port conflict. It's just annoying; we should use a non-default port for test cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-726) Adding nimbus.host config for backward compatibility of client config
[ https://issues.apache.org/jira/browse/STORM-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382398#comment-14382398 ] Parth Brahmbhatt commented on STORM-726: Can't really post a PR against the nimbus-ha branch as it is not upmerged to master. Here is a PR against my own na branch, which is upmerged: https://github.com/Parth-Brahmbhatt/incubator-storm/pull/4 Adding nimbus.host config for backward compatibility of client config - Key: STORM-726 URL: https://issues.apache.org/jira/browse/STORM-726 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt As part of the Nimbus HA initiative we added nimbus discovery for clients, based on a new config called nimbus.seeds where users can specify a list of nimbus hosts that clients can contact to figure out the leader nimbus address. We deleted the nimbus.host config, which is one value that all users modify in their cluster setup. Deleting this config is a backward incompatible change and will pretty much force everyone to update their client config even if they don't want nimbus HA. For backward compatibility it is better to fall back to nimbus.host when the nimbus.seeds config has no value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-719) Trident name space for spout in zookeeper should be separated by topology name
[ https://issues.apache.org/jira/browse/STORM-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376110#comment-14376110 ] Parth Brahmbhatt commented on STORM-719: Possible duplicate of https://issues.apache.org/jira/browse/STORM-587 Trident name space for spout in zookeeper should be separated by topology name -- Key: STORM-719 URL: https://issues.apache.org/jira/browse/STORM-719 Project: Apache Storm Issue Type: Bug Reporter: Sriharsha Chintalapani Assignee: Sriharsha Chintalapani The trident name space for a spout in zk doesn't use the topology name; this can result in conflicts if another topology uses the same spout name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-587) trident transactional state in zk should be namespaced with topology id
[ https://issues.apache.org/jira/browse/STORM-587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376805#comment-14376805 ] Parth Brahmbhatt commented on STORM-587: I agree with Sriharsha. I don't see why we should impose a constraint like this when we can easily avoid it. trident transactional state in zk should be namespaced with topology id --- Key: STORM-587 URL: https://issues.apache.org/jira/browse/STORM-587 Project: Apache Storm Issue Type: Improvement Reporter: Parth Brahmbhatt Assignee: Sriharsha Chintalapani Currently when a trident transactional spout is initialized it creates a node in zk under /transactional with the spout name as the node's name. This is pretty dangerous, as another topology can be submitted with the same spout name and then these 2 spouts will be overwriting each other's states. I believe it is better to namespace this with the topologyId, just like all other zk entries under /storm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-167) proposal for storm topology online update
[ https://issues.apache.org/jira/browse/STORM-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14369823#comment-14369823 ] Parth Brahmbhatt commented on STORM-167: This is a useful feature and I see a lot of user interest. [~xiaokang] Thanks for the original patch and I am not sure why it was not reviewed. Do you think you can upmerge this with storm/master? If not, do you mind if I take over this task? proposal for storm topology online update - Key: STORM-167 URL: https://issues.apache.org/jira/browse/STORM-167 Project: Apache Storm Issue Type: New Feature Reporter: James Xu Priority: Minor https://github.com/nathanmarz/storm/issues/540 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-711) All connectors should use collector.reportError and tuple anchoring.
Parth Brahmbhatt created STORM-711: -- Summary: All connectors should use collector.reportError and tuple anchoring. Key: STORM-711 URL: https://issues.apache.org/jira/browse/STORM-711 Project: Apache Storm Issue Type: Bug Components: external Affects Versions: 0.9.3 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Fix For: 0.10.0 Currently most of our connectors log an error when a tuple fails to process during the execute method. We should change the connectors to use collector.reportError instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-700) Add a hint about auto-generation in java source
[ https://issues.apache.org/jira/browse/STORM-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14353230#comment-14353230 ] Parth Brahmbhatt commented on STORM-700: Isn't the fact that they are placed under a package called generated enough? Do you think we should just have some custom annotation @Generated at the class level to indicate this? You would probably have to modify the thrift compiler to achieve this and I am not sure if it's worth the effort. Add a hint about auto-generation in java source --- Key: STORM-700 URL: https://issues.apache.org/jira/browse/STORM-700 Project: Apache Storm Issue Type: Improvement Reporter: Karl Richter Apparently some java classes in `storm-core` are created by a code generator (see https://github.com/apache/storm/pull/460/), but there's no hint in the generated code about that, which would be extremely helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-699) storm-jdbc should support custom insert queries.
Parth Brahmbhatt created STORM-699: -- Summary: storm-jdbc should support custom insert queries. Key: STORM-699 URL: https://issues.apache.org/jira/browse/STORM-699 Project: Apache Storm Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor Currently the storm-jdbc insert bolt/state only supports specifying a table name, and it constructs a query of the form insert into tablename values(?,?,?) based on the table's schema. This fails to support use cases like insert-into-select, or special cases like Phoenix, which has a jdbc driver but only supports upsert into. We should add a way for users to specify their own custom query for the insert bolt. This was already pointed out by [~revans2] during the PR review, and we now have concrete cases that will benefit from this feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
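The default query construction described above can be sketched as follows; a user-supplied custom query (e.g. a Phoenix upsert) would simply replace the generated string. The class and method names are illustrative, not storm-jdbc's actual API:

```java
import java.util.Collections;

// Illustrative sketch of the default query generation described above;
// not storm-jdbc's actual implementation.
public class InsertQueryBuilder {

    // Default behavior: "insert into <table> values (?,?,...)" derived from
    // the table schema's column count.
    public static String defaultInsertQuery(String table, int columnCount) {
        String placeholders = String.join(",", Collections.nCopies(columnCount, "?"));
        return "insert into " + table + " values (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(defaultInsertQuery("user_details", 3));
        // A user-supplied override for Phoenix might instead be:
        // "upsert into user_details values (?,?,?)"
    }
}
```

Allowing a raw custom query string is the smallest change that covers both the insert-into-select case and dialects like Phoenix, at the cost of the user keeping the placeholder count in sync with the tuple mapping.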
[jira] [Commented] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337102#comment-14337102 ] Parth Brahmbhatt commented on STORM-446: [~revans2] Ok, I will go ahead with the second approach, but I am running into another SASL API detail. I only get the authenticatedId and authorizationId when the AuthorizeCallback occurs; once the callback returns, the server seems to only record the authorizedId, which is what it returns when we call *saslServer.getAuthorizationID()*, and there is no *saslServer.getAuthenticationID()* API. I also considered doing the impersonation authorization as part of the AuthorizeCallback itself, but there is no way to access the client ip/hostName, as the callback only gets the authenticationId and authorizationId and no socket information, and this information is not known at the time of callback initialization. If you know a workaround off the top of your head let me know. secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security Storm security adds features of authenticating with kerberos and then uses that principal and TGT as a way to authorize user and topology operations. Currently the Storm UI user needs to be part of nimbus.admins to get details on user-submitted topologies. Ideally storm ui should use the authenticated user's principal to submit requests to nimbus, which will then authorize that user rather than the storm UI user. This feature will also benefit superusers impersonating other users to submit topologies in a secured way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335105#comment-14335105 ] Parth Brahmbhatt commented on STORM-446: Thanks [~revans2], that would be helpful. I knew about ReqContext and TransportPlugin. I actually tested the doAs behavior with API changes by adding a method addProxyUser to ReqContext which adds a ProxyUser principal to reqContext's subject, overriding the principal added during the topLevel process (which is obtained by calling *saslServer.getAuthorizationID()*), and returns that principal when reqContext.principal() is called. The missing part right now is how the client sends this principal to the server in our thrift setup. secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335105#comment-14335105 ] Parth Brahmbhatt edited comment on STORM-446 at 2/24/15 5:16 PM: - Thanks [~revans2], that would be helpful. I knew about ReqContext and TransportPlugin. I actually tested the doAs behavior with API changes by adding a method addProxyUser to ReqContext which adds a ProxyUser principal to reqContext's subject and returns that principal when reqContext.principal() is called. The missing part right now is how does the client send this principal to server in our thrift setup. was (Author: parth.brahmbhatt): Thanks [~revans2], that would be helpful. I knew about ReqContext and TransportPlugin. I actually tested the doAs behavior with API changes by adding a method addProxyUser to ReqContext which adds a ProxyUser principal to reqContext's subject , overriding the principal added during the topLevel process which is obtained by calling *saslServer.getAuthorizationID()* and returns that principal when reqContext.principal() is called. The missing part right now is how does the client send this principal to server in our thrift setup. secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335650#comment-14335650 ] Parth Brahmbhatt commented on STORM-446: [~revans2] Thanks a lot for the pointer; I tried it and it works as expected. As far as authZ for impersonation goes, we have 2 options. We already have a list of admin users, so as part of impersonation I can check that the user trying to impersonate is in the admin user list. Alternatively, I can follow the hadoop/hbase config and add the following 2 configs: storm.impersonation.userX.groups: [list of groups userX is allowed to impersonate] storm.impersonation.userX.hosts: [list of hosts from which userX is allowed to impersonate] I like the second option because its finer granularity provides more security; however, it also requires extra configuration. Let me know what you guys think. secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
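The second option could be enforced with a small ACL check along these lines. The class and config loading are hypothetical; only the two config key shapes come from the comment above:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the group/host-based impersonation ACL proposed
// above; not Storm's actual implementation.
public class ImpersonationAcl {
    // Populated from storm.impersonation.<user>.groups
    private final Map<String, Set<String>> allowedGroups = new HashMap<>();
    // Populated from storm.impersonation.<user>.hosts
    private final Map<String, Set<String>> allowedHosts = new HashMap<>();

    public void allow(String user, Set<String> groups, Set<String> hosts) {
        allowedGroups.put(user, groups);
        allowedHosts.put(user, hosts);
    }

    // An impersonation request passes only when both the target user's group
    // and the request's source host are on the impersonator's allow lists.
    public boolean isAllowed(String impersonator, String targetGroup, String sourceHost) {
        return allowedGroups.getOrDefault(impersonator, Collections.emptySet()).contains(targetGroup)
            && allowedHosts.getOrDefault(impersonator, Collections.emptySet()).contains(sourceHost);
    }
}
```

Requiring both dimensions to match is what gives this option its finer granularity: compromising a superuser's credentials is not enough unless the request also originates from an allowed host.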
[jira] [Assigned] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt reassigned STORM-446: -- Assignee: Parth Brahmbhatt secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-650) Storm-Kafka Refactoring and Improvements
[ https://issues.apache.org/jira/browse/STORM-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14333726#comment-14333726 ] Parth Brahmbhatt commented on STORM-650: I don't think we are proposing to use the high level consumer. Currently storm uses kafka's internal zk details to figure out a topic's partition-to-broker-list and leader broker mapping. STORM-650 is about changing that so we can use Kafka's admin API to get that information. The partition to consumer assignment is still handled by storm's own PartitionManager. Storm-Kafka Refactoring and Improvements Key: STORM-650 URL: https://issues.apache.org/jira/browse/STORM-650 Project: Apache Storm Issue Type: Improvement Components: storm-kafka Reporter: P. Taylor Goetz This is intended to be a parent/umbrella JIRA covering a number of efforts/suggestions aimed at improving the storm-kafka module. The goal is to facilitate communication and collaboration by providing a central point for discussion and coordination. The first phase should be to identify and agree upon a list of high-level points we would like to address. Once that is complete, we can move on to implementation/design discussions, followed by an implementation plan, division of labor, etc. A non-exhaustive, initial list of items follows. New/additional thoughts can be proposed in the comments. * Improve API for Specifying the Kafka Starting Point Configuring the kafka spout's starting position (e.g. forceFromStart=true) is a common source of confusion. This should be refactored to provide an easy to understand, unambiguous API for configuring this property. * Use Kafka APIs Instead of Internal ZK Metadata (STORM-590) Currently the Kafka spout relies on reading Kafka's internal metadata from zookeeper. This should be refactored to use the Kafka Consumer API to protect against changes to the internal metadata format stored in ZK.
* Improve Error Handling There are a number of failure scenarios with the kafka spout that users may want to react to differently based on their use case. Add a failure handler API that allows users to implement and/or plug in alternative failure handling implementations. It is assumed that default (sane) implementations would be included and configured by default. * Configuration/General Refactoring (BrokerHosts, etc.) (STORM-631) (need to flesh this out better) Reduce unnecessary marker interfaces/instance of checks. Unify configuration of core storm/trident spout implementations. * Kafka Spout doesn't pick up from the beginning of the queue unless forceFromStart specified (STORM-563) Discussion Items: * How important is backward compatibility? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334441#comment-14334441 ] Parth Brahmbhatt commented on STORM-446: [~harsha_ch] I haven't thought this through, but my initial impression is that it will be a pretty fat API taking way too many optional params, and it will be hard to use. We don't really have to pass the credentials of the other user, just the principal name. The reason I was hoping to find a way to have another principal submitted was because I stumbled upon http://docs.oracle.com/javase/7/docs/api/javax/security/sasl/AuthorizeCallback.html which talks about authenticated and authorized ids and also has a method isAuthorized() that determines if the authenticatedId can act on behalf of the authorizationId. I could not find any useful examples or any documentation other than the javadoc. If there is indeed no way to pass any additional info, I can add the API you suggested or add the optional doAs param to all APIs. [~revans2] [~ptgoetz] any thoughts? secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security -- This message was sent by Atlassian JIRA (v6.3.4#6332)
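The AuthorizeCallback discussed above is delivered to the server's SASL callback handler, which is where an authenticated-id vs. authorization-id decision can be wired in. A minimal sketch follows; the handler class is hypothetical, and the placeholder policy (only allow acting as yourself) stands in for a real impersonation ACL check:

```java
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.AuthorizeCallback;

// Hypothetical server-side handler: SASL hands us both ids in an
// AuthorizeCallback and we decide whether the authenticated user may act
// as the requested authorization (doAs) user.
public class ImpersonationCallbackHandler implements CallbackHandler {
    @Override
    public void handle(Callback[] callbacks) {
        for (Callback cb : callbacks) {
            if (cb instanceof AuthorizeCallback) {
                AuthorizeCallback ac = (AuthorizeCallback) cb;
                // Placeholder policy: only allow acting as yourself. A real
                // implementation would consult an impersonation ACL here.
                ac.setAuthorized(ac.getAuthenticationID().equals(ac.getAuthorizationID()));
            }
        }
    }
}
```

This also illustrates the limitation raised in the comment: the callback carries only the two ids, so host-based checks need information that is not available at this point.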
[jira] [Commented] (STORM-446) secure Impersonation in storm
[ https://issues.apache.org/jira/browse/STORM-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334154#comment-14334154 ] Parth Brahmbhatt commented on STORM-446: [~revans2] [~harsha_ch] I wanted to check with you guys whether the following approach makes sense to support this feature. I tried the following: * Added an API in StormSubmitter, *submitTopologyAs*, which takes all the usual params + String doAsUser. * Authenticate using the keytab in the jaas.conf. * Create a new subject using the doAsUser as the principal, and make the server-side call as a privileged action with this subject.
{code:java}
Nimbus.Client client = NimbusClient.getConfiguredClient(conf).getClient();
User proxyUser = new User(doAsUser);
Subject subject = new Subject();
subject.getPrincipals().add(proxyUser);
Subject.doAs(subject, new PrivilegedAction<Object>() {
    @Override
    public Object run() {
        client.submitTopology(args);
        return null;
    }
});
{code}
I originally thought SASL would forward the principal from the current thread context's subject to the server, but on the server side *String authId = saslServer.getAuthorizationID();* still returns the original authenticated id that was sent as part of connection establishment. I don't want to modify all the APIs to include a UserGroupInformation look-alike param, but looking at the hadoop implementation it seems hadoop also passes the UGI as part of the RPC calls that it makes. Do you guys have any other alternative ideas that do not involve changing all the thrift APIs? secure Impersonation in storm - Key: STORM-446 URL: https://issues.apache.org/jira/browse/STORM-446 Project: Apache Storm Issue Type: Improvement Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Labels: Security Storm security adds features of authenticating with kerberos and then uses that principal and TGT as a way to authorize user operations and topology operations.
Currently the Storm UI user needs to be part of nimbus.admins to get details on user-submitted topologies. Ideally the storm ui needs to pass the authenticated user principal when submitting requests to nimbus, which will then authorize that user rather than the storm UI user. This feature will also benefit superusers who want to impersonate other users to submit topologies in a secured way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-682) Supervisor local worker state corrupted and failing to start.
Parth Brahmbhatt created STORM-682: -- Summary: Supervisor local worker state corrupted and failing to start. Key: STORM-682 URL: https://issues.apache.org/jira/browse/STORM-682 Project: Apache Storm Issue Type: Bug Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt If the supervisor's cleanup of a worker fails to delete some heartbeat files, the local state of the supervisor gets corrupted. The only way to recover the supervisor from this state is to delete the local state folder where the supervisor stores all worker information. This fix can get very cumbersome if it happens on multiple worker nodes. The root cause of the issue is the order in which the worker heartbeat versioned-store files are created vs the order in which those files are deleted. LocalState.put first creates a data file X and then marks success by creating a file X.version. During get, it first checks for all *.version files, tries to find the largest value of X, and then issues a read against X. See the pseudo code below:
{code:java}
start_supervisor() {
    workerIds = `ls local-state/workers`
    for each workerId in workerIds:
        versions = `ls local-state/workers/workerId/heartbeats/*.version`
        latest_version = max(versions)
        read local-state/workers/workerId/heartbeats/latest_version  // note: no .version extension
}
{code}
During cleanup it first tries to delete file X and then X.version. If X gets deleted but X.version fails to delete, the supervisor fails to start with a FileNotFoundException in the code above. We propose to change the deletion order so the .version files get deleted before the data file, and to catch any IOException when reading worker heartbeats to avoid supervisor failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
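The proposed deletion order can be sketched as follows (the helper name is hypothetical, not the actual LocalState code): removing the `.version` marker before the data file means a crash between the two deletes leaves a data file with no marker, which the reader simply ignores, rather than a dangling marker that triggers the FileNotFoundException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class VersionedDelete {
    // Hypothetical helper illustrating the proposed fix: delete the .version
    // marker first and the data file second, so a partial cleanup never leaves
    // a marker pointing at a missing data file.
    public static void delete(Path dataFile) throws IOException {
        Path versionFile = dataFile.resolveSibling(dataFile.getFileName() + ".version");
        Files.deleteIfExists(versionFile); // marker first
        Files.deleteIfExists(dataFile);    // data second
    }
}
```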
[jira] [Commented] (STORM-654) Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper.
[ https://issues.apache.org/jira/browse/STORM-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319303#comment-14319303 ] Parth Brahmbhatt commented on STORM-654: https://github.com/Parth-Brahmbhatt/incubator-storm/pull/3 posted pull request against my own branch. I haven't implemented the caching yet. I think we should just cache ClusterInfo. The ui right now makes 4 requests (cluster summary, supervisor summary, topology summary and now nimbus summary) to getClusterInfo for the index page. GetClusterInfo does not have any filters, so we end up reading everything from zookeeper even though only one of the 4 params is used by a single request. I don't think consistency is really important in this case, and caching this will both improve ui performance and reduce load on zookeeper. Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper. -- Key: STORM-654 URL: https://issues.apache.org/jira/browse/STORM-654 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Current implementation of Nimbus-HA requires each nimbus client to discover nimbus hosts by contacting zookeeper. In order to reduce the load on zookeeper we could expose a thrift API as described in the future improvement section of the Nimbus HA design doc. We will add an extra field in the ClusterSummary structure called nimbuses. struct ClusterSummary { 1: required list<SupervisorSummary> supervisors; 2: required i32 nimbus_uptime_secs; 3: required list<TopologySummary> topologies; 4: required list<NimbusSummary> nimbuses; } struct NimbusSummary { 1: required string host; 2: required i32 port; 3: required i32 uptimeSecs; 4: required bool isLeader; 5: required string version; 6: optional list<string> local_storm_ids; // need a better name, but these are the storm-ids for which this nimbus host has the code available locally.
} We will create a nimbus.hosts configuration which will serve as the seed list of nimbus hosts. Any nimbus host can serve read requests, so any client can issue a getClusterSummary call and extract the leader nimbus summary from the list of nimbuses. All nimbus hosts will cache this information to reduce the load on zookeeper. In addition we can add a RedirectException. When a request that can only be served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, activate) is issued against a non-leader nimbus, the non-leader nimbus will throw a RedirectException, and the client will handle the exception by refreshing its leader nimbus host and contacting that host as part of the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
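The retry protocol described above could look roughly like this on the client side (all names below are illustrative stand-ins, not the actual thrift-generated API): walk the seed list, treat a RedirectException as "not the leader", and move on until the leader accepts the request.

```java
import java.util.List;

public class RedirectSketch {
    // Hypothetical stand-ins for the proposed thrift exception and client API.
    static class RedirectException extends Exception {}

    interface Nimbus {
        String submitTopology(String name) throws RedirectException;
    }

    // Try each seed host in turn; a non-leader answers with RedirectException,
    // so refresh and try the next host until the leader accepts the request.
    static String submitToLeader(List<Nimbus> seeds, String name) throws Exception {
        for (Nimbus n : seeds) {
            try {
                return n.submitTopology(name);
            } catch (RedirectException e) {
                // not the leader; fall through to the next seed host
            }
        }
        throw new IllegalStateException("no leader nimbus reachable");
    }
}
```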
[jira] [Commented] (STORM-649) Storm HDFS test topologies should write to /tmp dir
[ https://issues.apache.org/jira/browse/STORM-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314975#comment-14314975 ] Parth Brahmbhatt commented on STORM-649: PR posted: https://github.com/apache/storm/pull/426 Storm HDFS test topologies should write to /tmp dir --- Key: STORM-649 URL: https://issues.apache.org/jira/browse/STORM-649 Project: Apache Storm Issue Type: Bug Reporter: Sriharsha Chintalapani Assignee: Parth Brahmbhatt Priority: Trivial storm hdfs test topologies write to /foo. This makes it difficult to run them as a system test, as the test user needs privileges on /. Instead we should write to /tmp. FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/foo/").withExtension(".txt"); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-654) Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper.
Parth Brahmbhatt created STORM-654: -- Summary: Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper. Key: STORM-654 URL: https://issues.apache.org/jira/browse/STORM-654 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Current implementation of Nimbus-HA requires each nimbus client to discover nimbus hosts by contacting zookeeper. In order to reduce the load on zookeeper we could expose a thrift API as described in the future improvement section of the Nimbus HA design doc. We will add an extra field in the ClusterSummary structure called nimbuses. struct ClusterSummary { 1: required list<SupervisorSummary> supervisors; 2: required i32 nimbus_uptime_secs; 3: required list<TopologySummary> topologies; 4: required list<NimbusSummary> nimbuses; } struct NimbusSummary { 1: required string host; 2: required i32 port; 3: required i32 uptimeSecs; 4: required bool isLeader; 5: required string version; 6: optional list<string> local_storm_ids; // need a better name, but these are the storm-ids for which this nimbus host has the code available locally. } We will create a nimbus.hosts configuration which will serve as the seed list of nimbus hosts. Any nimbus host can serve read requests, so any client can issue a getClusterSummary call and extract the leader nimbus summary from the list of nimbuses. All nimbus hosts will cache this information to reduce the load on zookeeper. In addition we can add a RedirectException. When a request that can only be served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, activate) is issued against a non-leader nimbus, the non-leader nimbus will throw a RedirectException, and the client will handle the exception by refreshing its leader nimbus host and contacting that host as part of the retry. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-655) Add replication count as part of topology summary.
Parth Brahmbhatt created STORM-655: -- Summary: Add replication count as part of topology summary. Key: STORM-655 URL: https://issues.apache.org/jira/browse/STORM-655 Project: Apache Storm Issue Type: Sub-task Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt With Nimbus HA each topology is replicated across multiple nimbus hosts. We want to modify the UI/REST/Thrift APIs so we can expose the replication count of a topology. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-590) KafkaSpout should use kafka consumer api
[ https://issues.apache.org/jira/browse/STORM-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301491#comment-14301491 ] Parth Brahmbhatt commented on STORM-590: Thanks for assigning it to me, I have already posted a PR here: https://github.com/apache/storm/pull/387 KafkaSpout should use kafka consumer api Key: STORM-590 URL: https://issues.apache.org/jira/browse/STORM-590 Project: Apache Storm Issue Type: Improvement Components: storm-kafka Affects Versions: 0.9.2-incubating Reporter: Kai Sasaki Assignee: Parth Brahmbhatt Following the ticket below: https://github.com/apache/storm/pull/338 KafkaSpout uses kafka-internal data, including zk nodes. However, it should be changed to get this data from the consumer api provided by the kafka project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-634) Storm should support rolling upgrade/downgrade of storm cluster.
[ https://issues.apache.org/jira/browse/STORM-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295776#comment-14295776 ] Parth Brahmbhatt commented on STORM-634: I have made some progress on this. I converted SupervisorInfo, Assignment and StormBase to thrift structures, and I am now working on converting the ZkWorkerHeartBeat to thrift. While thriftifying these structures I have realized that the defrecords for all of them are pretty meaningless. The code does not really adhere to the fields defined in the defrecord and does not take advantage of the defrecord structure in any way (no defaults, no validations, no protocol or interface implementations, no use in multimethods). It treats these structures as open maps, and that is actually very confusing. For example, StormBase does not define delay_secs or previous_state as fields, but when we do a rebalance or kill transition we still store this information as part of the StormBase instance. The simplest way to address this jira would be to just get rid of these defrecords and instead create mk-X functions for each one of them that just return a map representing the structure. The reason serialization currently breaks is that defrecords are treated as java classes and serialized accordingly. If we just make all these structures open maps, that is no longer an issue. If the code starts assuming that certain keys will always exist in the map, that is equivalent to adding a new required field to a thrift structure, which is not going to be backward compatible anyway. You would think we would lose some readability if things were maps instead of predefined types, but I think we would actually gain readability, since we are not respecting the defined types anyway, which again is super confusing. The only advantage I see for moving to thrift at this point would be for future projects like moving the heartbeat from zookeeper to nimbus.
[~revans2] [~ptgoetz] Would like to hear your opinion. Storm should support rolling upgrade/downgrade of storm cluster. Key: STORM-634 URL: https://issues.apache.org/jira/browse/STORM-634 Project: Apache Storm Issue Type: Improvement Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Currently, when a new version of storm is released, in order to upgrade existing storm clusters users need to back up their existing topologies, kill all the topologies, perform the upgrade and resubmit all the topologies. This is painful and results in downtime that may not be acceptable for always-alive production systems. Storm should support a rolling upgrade/downgrade deployment process to avoid these downtimes and to make the transition to a different version effortless. Based on my initial attempt, the primary issue seems to be the java serialization used to serialize java classes like StormBase, Assignment and WorkerHeartbeat, which are then stored in zookeeper. When deserializing, if the serial versions do not match, deserialization fails, resulting in processes getting killed indefinitely. We need to change Utils/serialize and Utils/deserialize so they can support a non-java serialization mechanism like json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-634) Storm should support rolling upgrade/downgrade of storm cluster.
[ https://issues.apache.org/jira/browse/STORM-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287898#comment-14287898 ] Parth Brahmbhatt commented on STORM-634: [~revans2] Makes sense, I am going to take a stab at Thrift. Storm should support rolling upgrade/downgrade of storm cluster. Key: STORM-634 URL: https://issues.apache.org/jira/browse/STORM-634 Project: Apache Storm Issue Type: Improvement Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Currently, when a new version of storm is released, in order to upgrade existing storm clusters users need to back up their existing topologies, kill all the topologies, perform the upgrade and resubmit all the topologies. This is painful and results in downtime that may not be acceptable for always-alive production systems. Storm should support a rolling upgrade/downgrade deployment process to avoid these downtimes and to make the transition to a different version effortless. Based on my initial attempt, the primary issue seems to be the java serialization used to serialize java classes like StormBase, Assignment and WorkerHeartbeat, which are then stored in zookeeper. When deserializing, if the serial versions do not match, deserialization fails, resulting in processes getting killed indefinitely. We need to change Utils/serialize and Utils/deserialize so they can support a non-java serialization mechanism like json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-634) Storm should support rolling upgrade/downgrade of storm cluster.
Parth Brahmbhatt created STORM-634: -- Summary: Storm should support rolling upgrade/downgrade of storm cluster. Key: STORM-634 URL: https://issues.apache.org/jira/browse/STORM-634 Project: Apache Storm Issue Type: Improvement Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Currently, when a new version of storm is released, in order to upgrade existing storm clusters users need to back up their existing topologies, kill all the topologies, perform the upgrade and resubmit all the topologies. This is painful and results in downtime that may not be acceptable for always-alive production systems. Storm should support a rolling upgrade/downgrade deployment process to avoid these downtimes and to make the transition to a different version effortless. Based on my initial attempt, the primary issue seems to be the java serialization used to serialize java classes like StormBase, Assignment and WorkerHeartbeat, which are then stored in zookeeper. When deserializing, if the serial versions do not match, deserialization fails, resulting in processes getting killed indefinitely. We need to change Utils/serialize and Utils/deserialize so they can support a non-java serialization mechanism like json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-393) Add topic to KafkaSpout output
[ https://issues.apache.org/jira/browse/STORM-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277604#comment-14277604 ] Parth Brahmbhatt commented on STORM-393: [~ptgoetz] No response in 4 months, and as I mentioned above this is supported even right now. I think we should just close this jira. Add topic to KafkaSpout output -- Key: STORM-393 URL: https://issues.apache.org/jira/browse/STORM-393 Project: Apache Storm Issue Type: Improvement Affects Versions: 0.9.2-incubating Reporter: Alexey Raga Labels: features It would be beneficial to have the topic as a tuple value emitted from KafkaSpout. Not only is it useful if STORM-392 is implemented, but also in the case when we have more than one KafkaSpout in a system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (STORM-560) ZkHosts in README should use 2181 as port
[ https://issues.apache.org/jira/browse/STORM-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt closed STORM-560. -- Resolution: Duplicate ZkHosts in README should use 2181 as port - Key: STORM-560 URL: https://issues.apache.org/jira/browse/STORM-560 Project: Apache Storm Issue Type: Documentation Components: storm-kafka Affects Versions: 0.9.2-incubating Reporter: Anyi Li Priority: Minor Labels: easyfix Fix For: 0.9.2-incubating Recently I have been using the kafka spout to consume messages from a kafka server. While reading the documents at https://github.com/apache/storm/tree/master/external/storm-kafka, the instructions on creating ZkHosts are really confusing. It seems like it should point to the broker instead of zookeeper, because the example uses port 9092. But after a couple of tests, I realized it should point to zookeeper and use 2181. The document is not clear on that point. It would be great if we could change that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-616) add storm-jdbc to list of external connectors.
Parth Brahmbhatt created STORM-616: -- Summary: add storm-jdbc to list of external connectors. Key: STORM-616 URL: https://issues.apache.org/jira/browse/STORM-616 Project: Apache Storm Issue Type: New JIRA Project Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor There have been several questions in the apache mailing list around how to use storm to write tuples to a relational database. Storm should add a jdbc connector to its list of external connectors that has a general implementation to insert data into relational dbs as part of a topology. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-166) Highly available Nimbus
[ https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256067#comment-14256067 ] Parth Brahmbhatt commented on STORM-166: Link to pull request: https://github.com/apache/storm/pull/354/files Highly available Nimbus --- Key: STORM-166 URL: https://issues.apache.org/jira/browse/STORM-166 Project: Apache Storm Issue Type: New Feature Reporter: James Xu Assignee: Parth Brahmbhatt Priority: Minor https://github.com/nathanmarz/storm/issues/360 The goal of this feature is to be able to run multiple Nimbus servers so that if one goes down another one will transparently take over. Here's what needs to happen to implement this: 1. Everything currently stored on local disk on Nimbus needs to be stored in a distributed and reliable fashion. A DFS is perfect for this. However, as we do not want to make a DFS a mandatory requirement to run Storm, the storage of these artifacts should be pluggable (default to local filesystem, but the interface should support DFS). You would only be able to run multiple Nimbus instances if you use the right storage, and the storage interface chosen should have a flag indicating whether it's suitable for HA mode or not. If you choose local storage and try to run multiple Nimbus instances, one of them should fail to launch. 2. Nimbus instances should register themselves in Zookeeper. They should use a leader election protocol to decide which one is currently responsible for launching and monitoring topologies. 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case the leader changes during submission, it should use a retry protocol to try reconnecting to the new leader and attempting submission again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-586) Trident kafka spout fails instead of updating offset when kafka offset is out of range.
Parth Brahmbhatt created STORM-586: -- Summary: Trident kafka spout fails instead of updating offset when kafka offset is out of range. Key: STORM-586 URL: https://issues.apache.org/jira/browse/STORM-586 Project: Apache Storm Issue Type: Bug Affects Versions: 0.9.3 Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Critical Trident KafkaEmitter does not catch the newly added UpdateOffsetException which results in the spout failing repeatedly instead of automatically updating the offset to earliest time. PROBLEM: Exception while using the Trident Kafka Spout. 2014-12-04 18:38:03 b.s.util ERROR Async loop died! java.lang.RuntimeException: storm.kafka.UpdateOffsetException at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.daemon.executor$fn_4195$fn4207$fn_4254.invoke(executor.clj:745) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.util$async_loop$fn__442.invoke(util.clj:436) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at clojure.lang.AFn.run(AFn.java:24) clojure-1.4.0.jar:na at java.lang.Thread.run(Thread.java:745) na:1.7.0_71 Caused by: storm.kafka.UpdateOffsetException: null at storm.kafka.KafkaUtils.fetchMessages(KafkaUtils.java:186) ~stormjar.jar:na at storm.kafka.trident.TridentKafkaEmitter.fetchMessages(TridentKafkaEmitter.java:132) ~stormjar.jar:na at storm.kafka.trident.TridentKafkaEmitter.doEmitNewPartitionBatch(TridentKafkaEmitter.java:113) ~stormjar.jar:na at storm.kafka.trident.TridentKafkaEmitter.failFastEmitNewPartitionBatch(TridentKafkaEmitter.java:72) ~stormjar.jar:na at 
storm.kafka.trident.TridentKafkaEmitter.access$400(TridentKafkaEmitter.java:46) ~stormjar.jar:na at storm.kafka.trident.TridentKafkaEmitter$2.emitPartitionBatchNew(TridentKafkaEmitter.java:233) ~stormjar.jar:na at storm.kafka.trident.TridentKafkaEmitter$2.emitPartitionBatchNew(TridentKafkaEmitter.java:225) ~stormjar.jar:na at storm.trident.spout.PartitionedTridentSpoutExecutor$Emitter$1.init(PartitionedTridentSpoutExecutor.java:125) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at storm.trident.topology.state.RotatingTransactionalState.getState(RotatingTransactionalState.java:83) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at storm.trident.topology.state.RotatingTransactionalState.getStateOrCreate(RotatingTransactionalState.java:110) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at storm.trident.spout.PartitionedTridentSpoutExecutor$Emitter.emitBatch(PartitionedTridentSpoutExecutor.java:121) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at storm.trident.spout.TridentSpoutExecutor.execute(TridentSpoutExecutor.java:82) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:369) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.daemon.executor$fn_4195$tuple_action_fn_4197.invoke(executor.clj:630) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.daemon.executor$mk_task_receiver$fn__4118.invoke(executor.clj:398) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.disruptor$clojure_handler$reify__723.onEvent(disruptor.clj:58) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:99) ~storm-core-0.9.1.2.1.7.0-784.jar:0.9.1.2.1.7.0-784 ... 
6 common frames omitted 2014-12-04 18:38:03 b.s.d.executor ERROR java.lang.RuntimeException: storm.kafka.UpdateOffsetException at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~[storm-core-0.9.1.2.1.7.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-587) trident transactional state in zk should be namespaced with topology id
Parth Brahmbhatt created STORM-587: -- Summary: trident transactional state in zk should be namespaced with topology id Key: STORM-587 URL: https://issues.apache.org/jira/browse/STORM-587 Project: Apache Storm Issue Type: Improvement Reporter: Parth Brahmbhatt Currently, when a trident transactional spout is initialized, it creates a node in zk under /transactional with the spout name as the node's name. This is pretty dangerous, as any other topology can be submitted with the same spout name, and then these 2 spouts will overwrite each other's state. I believe it is better to namespace this with the topologyId, just like all other zk entries under /storm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
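The proposed layout amounts to a simple path change, sketched below (the path shapes and helper are illustrative, not the actual trident code):

```java
public class TxPath {
    // Current layout: /transactional/<spoutName> -- collides when two topologies
    // use the same spout name.
    // Proposed layout: namespaced under the topology id, like other /storm entries.
    static String txStatePath(String topologyId, String spoutName) {
        return "/transactional/" + topologyId + "/" + spoutName;
    }
}
```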
[jira] [Updated] (STORM-555) Storm Json response encoding should be UTF-8
[ https://issues.apache.org/jira/browse/STORM-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt updated STORM-555: --- Description: Currently the storm Rest API responses have charset=iso-8859-1 set by ring/jetty by default. However, on html pages we have the meta tag set to UTF-8. The response charset should be set to UTF-8 to support GB18030. To reproduce: submit a storm topology with non-ascii characters like '䶴ㄩ鼾丄狜〇'. In the UI the topology name will be displayed incorrectly as '??'. This is applicable to bolt and spout names or anything in the response that might have non-ascii characters. In addition, because of the incorrect charset, no topology actions can be performed on topologies with non-ascii characters. was: Currently the storm Rest API responses have charset=iso-8859-1 set by ring/jetty by default. However, on html pages we have the meta tag set to UTF-8. The response charset should be set to UTF-8 to support GB18030. To reproduce: submit a storm topology with non-ascii characters like '䶴ㄩ鼾丄狜〇'. In the UI the topology name will be displayed incorrectly as '??'. This is applicable to bolt and spout names or anything in the response that might have non-ascii characters. Storm Json response encoding should be UTF-8 Key: STORM-555 URL: https://issues.apache.org/jira/browse/STORM-555 Project: Apache Storm Issue Type: Bug Reporter: Parth Brahmbhatt Assignee: Parth Brahmbhatt Priority: Minor Currently the storm Rest API responses have charset=iso-8859-1 set by ring/jetty by default. However, on html pages we have the meta tag set to UTF-8. The response charset should be set to UTF-8 to support GB18030. To reproduce: submit a storm topology with non-ascii characters like '䶴ㄩ鼾丄狜〇'. In the UI the topology name will be displayed incorrectly as '??'. This is applicable to bolt and spout names or anything in the response that might have non-ascii characters.
In addition, because of the incorrect charset, no topology actions can be performed on topologies with non-ascii characters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
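The '??' rendering can be reproduced directly in plain Java (a minimal illustration of the charset mismatch, not the actual ring/jetty code): encoding with ISO-8859-1 replaces every character outside Latin-1 with '?', while a UTF-8 round trip preserves the name.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        String name = "䶴ㄩ鼾"; // topology name with characters outside ISO-8859-1
        // Encoding with the wrong charset silently substitutes '?' per character,
        // which is exactly what the UI ends up displaying.
        String latin1 = new String(name.getBytes(StandardCharsets.ISO_8859_1),
                                   StandardCharsets.ISO_8859_1);
        // A UTF-8 round trip is lossless.
        String utf8 = new String(name.getBytes(StandardCharsets.UTF_8),
                                 StandardCharsets.UTF_8);
        System.out.println(latin1 + " vs " + utf8); // prints "??? vs 䶴ㄩ鼾"
    }
}
```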