Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread Nathan Marz
+1

On Thu, Feb 12, 2015 at 5:57 PM, P. Taylor Goetz  wrote:

> Pull request updated.
>
> Here’s a link to the latest commit:
> https://github.com/ptgoetz/storm/commit/18a68a074570db01fc6377a269feb90ecda898ab
>
> - Taylor
>
> On Feb 12, 2015, at 8:41 PM, P. Taylor Goetz  wrote:
>
> > Great to hear. I will update the pull request accordingly.
> >
> > -Taylor
> >
> >
> >> On Feb 12, 2015, at 5:24 PM, Derek Dagit 
> wrote:
> >>
> >> I am OK with codifying the retroactive -1 as proposed by Nathan, and I
> >> am otherwise OK with the proposed bylaws.
> >> --
> >> Derek
> >>
> >>
> >>
> >> - Original Message -
> >> From: Bobby Evans 
> >> To: "dev@storm.apache.org" 
> >> Cc:
> >> Sent: Thursday, February 12, 2015 8:12 AM
> >> Subject: Re: [DISCUSS] Adopt Apache Storm Bylaws
> >>
> >> That seems fine to me.  Most other projects I have worked on follow a
> >> similar procedure, and a retroactive -1 can be applied even without it
> >> being codified, but making it official seems fine to me.
> >> I am +1 for those changes.
> >> - Bobby
> >>
> >>
> >>
> >> On Thursday, February 12, 2015 2:23 AM, Nathan Marz <
> >> nat...@nathanmarz.com> wrote:
> >>
> >>
> >> Yes, I would like to codify it. It's not about there being a bug with a
> >> patch – it's about realizing that particular patch does not fit in with a
> >> coherent vision of Storm, or that functionality could be achieved in a
> >> completely different way. So basically, preventing bloat. With that change
> >> I'm +1 to the bylaws and I believe we would have a consensus.
> >>
> >>> On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz 
> >>> wrote:
> >>>
> >>> I have no problem with your proposal. Actually I never even considered
> >>> setting a timeline for a revert. I've always felt that if there was any
> >>> problem with a patch/modification, it could be reverted at any time --
> >>> no deadline. If we find a problem, we fix it. We've reverted changes in
> >>> the past, and lived to tell about it :).
> >>>
> >>> So I would think we don't even have to mention any revert timeline. If
> >>> we feel the need to codify that, I'm okay with it.
> >>>
> >>> -Taylor
> >>>
>  On Feb 11, 2015, at 9:06 PM, Nathan Marz 
>  wrote:
> 
>  I'm -1 on these bylaws. This commit process encourages merging as
>  fast as possible and does not give adequate time for dissenting
>  opinions to veto a patch. I'm concerned about two things:
> 
>  1. Regressions - Having too lax of a merge process will lead to
>  unforeseen regressions. We all saw this first hand with ZeroMQ: I had
>  to freeze the version of ZeroMQ used by Storm because subsequent
>  versions would regress in numerous ways.
>  2. Bloat – All software projects have a tendency to become bloated
>  and build complexity because things were added piecemeal without a
>  coherent vision.
> 
>  These are very serious issues, and I've seen too many projects become
>  messes because of them. The only way to control these problems is
>  with -1's. Trust isn't even the issue here – one committer may very
>  well think a new feature "looks fine" and "why not let it in", while
>  another will recognize that the feature is unnecessary, adds
>  complexity, and/or can be addressed via better means. As is, the
>  proposed bylaws are attempting to make vetoing very difficult.
> 
>  I have a proposal which I believe gets the best of all worlds:
>  allowing for fast responsiveness on contributions while allowing for
>  regressions and bloat to be controlled. It is just a slight
>  modification of the current bylaws:
> 
>  "A minimum of one +1 from a Committer other than the one who authored
>  the patch, and no -1s. The code can be committed after the first +1.
>  If a -1 is received to the patch within 7 days after the patch was
>  posted, it may be reverted immediately if it was already merged."
> 
>  To be clear, if a patch was posted on the 7th and merged on the 10th,
>  it may be -1'd and reverted until the 14th.
> 
>  With this process patches can be merged just as fast as before, but
>  it also allows for committers with a more holistic or deeper
>  understanding of a part of Storm to prevent unnecessary complexity.
> 
> 
>  On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans
>  wrote:
> 
> > I am fine with this. I mostly want a starting point, and we can
> > adjust things from there if need be.
> > - Bobby
> >
> >
> > On Sunday, February 8, 2015 8:39 PM, Harsha 
> > wrote:
> >
> >
> >
> > Thanks for putting this together. The proposed bylaws look good to
> > me. -Harsha
> >
> >
> >> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
> >> Associated pull request can be found here:
> >> https://github.com/apache/storm/pull/419

[GitHub] storm pull request: Storm-539. Storm hive bolt and trident state.

2015-02-12 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/350#issuecomment-74202499
  
Thanks @ptgoetz  for the review. Added sponsors and included you.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (STORM-441) Remove bootstrap macro from Clojure codebase

2015-02-12 Thread Kyle Nusbaum (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Nusbaum reassigned STORM-441:
--

Assignee: Kyle Nusbaum

> Remove bootstrap macro from Clojure codebase
> 
>
> Key: STORM-441
> URL: https://issues.apache.org/jira/browse/STORM-441
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Dane Hammer
>Assignee: Kyle Nusbaum
>Priority: Trivial
>
> The bootstrap macro in backtype.storm.bootstrap is purely a convenience for 
> importing/using/requiring a large number of dependencies, but it's not used 
> for anything else. It removes those imports/uses/requires from the namespace 
> form, making it harder to track down where a definition is coming from, which 
> defeats some IDE tools.
> I propose removing it entirely, making the Clojure part of the codebase more 
> readable and in line with current conventions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (STORM-433) Give users visibility to the depth of queues at each bolt

2015-02-12 Thread Kyle Nusbaum (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Nusbaum reassigned STORM-433:
--

Assignee: Kyle Nusbaum

> Give users visibility to the depth of queues at each bolt
> -
>
> Key: STORM-433
> URL: https://issues.apache.org/jira/browse/STORM-433
> Project: Apache Storm
>  Issue Type: Wish
>Reporter: Dane Hammer
>Assignee: Kyle Nusbaum
>Priority: Minor
>
> I envision being able to browse the Storm UI and see where queues of tuples 
> are backing up.
> Today if I see latencies increasing at a bolt, it may not be due to anything 
> specific to that bolt, but rather that it is backed up behind an overwhelmed 
> bolt (one with too little parallelism or too much latency).
> I would expect this could use sampling like the metrics reported to the UI 
> today, and just retrieve data from Netty about the state of the queues. I 
> wouldn't imagine supporting ZeroMQ on the first pass.





Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread P. Taylor Goetz
Pull request updated.

Here’s a link to the latest commit: 
https://github.com/ptgoetz/storm/commit/18a68a074570db01fc6377a269feb90ecda898ab

- Taylor

On Feb 12, 2015, at 8:41 PM, P. Taylor Goetz  wrote:

> Great to hear. I will update the pull request accordingly.
> 
> -Taylor
> 
> 
>> On Feb 12, 2015, at 5:24 PM, Derek Dagit  
>> wrote:
>> 
>> I am OK with codifying the retroactive -1 as proposed by Nathan, and I
>> am otherwise OK with the proposed bylaws.
>> -- 
>> Derek 
>> 
>> 
>> 
>> - Original Message -
>> From: Bobby Evans 
>> To: "dev@storm.apache.org" 
>> Cc: 
>> Sent: Thursday, February 12, 2015 8:12 AM
>> Subject: Re: [DISCUSS] Adopt Apache Storm Bylaws
>> 
>> That seems fine to me.  Most other projects I have worked on follow a 
>> similar procedure, and a retroactive -1 can be applied even without it 
>> being codified, but making it official seems fine to me.
>> I am +1 for those changes.
>> - Bobby
>> 
>> 
>> 
>> On Thursday, February 12, 2015 2:23 AM, Nathan Marz 
>>  wrote:
>> 
>> 
>> Yes, I would like to codify it. It's not about there being a bug with a
>> patch – it's about realizing that particular patch does not fit in with a
>> coherent vision of Storm, or that functionality could be achieved in a
>> completely different way. So basically, preventing bloat. With that change
>> I'm +1 to the bylaws and I believe we would have a consensus.
>> 
>>> On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz  wrote:
>>> 
>>> I have no problem with your proposal. Actually I never even considered
>>> setting a timeline for a revert. I've always felt that if there was any
>>> problem with a patch/modification, it could be reverted at any time -- no
>>> deadline. If we find a problem, we fix it. We've reverted changes in the
>>> past, and lived to tell about it :).
>>> 
>>> So I would think we don't even have to mention any revert timeline. If we
>>> feel the need to codify that, I'm okay with it.
>>> 
>>> -Taylor
>>> 
 On Feb 11, 2015, at 9:06 PM, Nathan Marz  wrote:
 
 I'm -1 on these bylaws. This commit process encourages merging as fast as
 possible and does not give adequate time for dissenting opinions to veto
>>> a
 patch. I'm concerned about two things:
 
 1. Regressions - Having too lax of a merge process will lead to
>>> unforeseen
 regressions. We all saw this first hand with ZeroMQ: I had to freeze the
 version of ZeroMQ used by Storm because subsequent versions would regress
 in numerous ways.
 2. Bloat – All software projects have a tendency to become bloated and
 build complexity because things were added piecemeal without a coherent
 vision.
 
 These are very serious issues, and I've seen too many projects become
 messes because of them. The only way to control these problems is with
 -1's. Trust isn't even the issue here – one committer may very well
>>> think a
 new feature "looks fine" and "why not let it in", while another will
 recognize that the feature is unnecessary, adds complexity, and/or can be
 addressed via better means. As is, the proposed bylaws are attempting to
 make vetoing very difficult.
 
 I have a proposal which I believe gets the best of all worlds: allowing
>>> for
 fast responsiveness on contributions while allowing for regressions and
 bloat to be controlled. It is just a slight modification of the current
 bylaws:
 
 "A minimum of one +1 from a Committer other than the one who authored the
 patch, and no -1s. The code can be committed after the first +1. If a -1
>>> is
 received to the patch within 7 days after the patch was posted, it may be
 reverted immediately if it was already merged."
 
 To be clear, if a patch was posted on the 7th and merged on the 10th, it
 may be -1'd and reverted until the 14th.
 
 With this process patches can be merged just as fast as before, but it
>>> also
 allows for committers with a more holistic or deeper understanding of a
 part of Storm to prevent unnecessary complexity.
 
 
 On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans >>> 
 wrote:
 
> I am fine with this. I mostly want a starting point, and we can adjust
> things from there if need be.
> - Bobby
> 
> 
>On Sunday, February 8, 2015 8:39 PM, Harsha 
>>> wrote:
> 
> 
> 
> Thanks for putting this together. The proposed bylaws look good to
> me. -Harsha
> 
> 
>> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
>> Associated pull request can be found here:
>> https://github.com/apache/storm/pull/419
>> 
>> 
>> This is another attempt at gaining consensus regarding adopting
>> official bylaws for the Apache Storm project. The changes are minor
>> and should be apparent in the pull request diff.
>> 
>> In earlier discussions, there were concerns raised about certain
>> actions requiring approval ty

Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread P. Taylor Goetz
Great to hear. I will update the pull request accordingly.

-Taylor


> On Feb 12, 2015, at 5:24 PM, Derek Dagit  wrote:
> 
> I am OK with codifying the retroactive -1 as proposed by Nathan, and I
> am otherwise OK with the proposed bylaws.
> -- 
> Derek 
> 
> 
> 
> - Original Message -
> From: Bobby Evans 
> To: "dev@storm.apache.org" 
> Cc: 
> Sent: Thursday, February 12, 2015 8:12 AM
> Subject: Re: [DISCUSS] Adopt Apache Storm Bylaws
> 
> That seems fine to me.  Most other projects I have worked on follow a similar 
> procedure, and a retroactive -1 can be applied even without it being codified, 
> but making it official seems fine to me.
> I am +1 for those changes.
> - Bobby
> 
> 
> 
>  On Thursday, February 12, 2015 2:23 AM, Nathan Marz 
>  wrote:
>   
> 
> Yes, I would like to codify it. It's not about there being a bug with a
> patch – it's about realizing that particular patch does not fit in with a
> coherent vision of Storm, or that functionality could be achieved in a
> completely different way. So basically, preventing bloat. With that change
> I'm +1 to the bylaws and I believe we would have a consensus.
> 
>> On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz  wrote:
>> 
>> I have no problem with your proposal. Actually I never even considered
>> setting a timeline for a revert. I've always felt that if there was any
>> problem with a patch/modification, it could be reverted at any time -- no
>> deadline. If we find a problem, we fix it. We've reverted changes in the
>> past, and lived to tell about it :).
>> 
>> So I would think we don't even have to mention any revert timeline. If we
>> feel the need to codify that, I'm okay with it.
>> 
>> -Taylor
>> 
>>> On Feb 11, 2015, at 9:06 PM, Nathan Marz  wrote:
>>> 
>>> I'm -1 on these bylaws. This commit process encourages merging as fast as
>>> possible and does not give adequate time for dissenting opinions to veto
>> a
>>> patch. I'm concerned about two things:
>>> 
>>> 1. Regressions - Having too lax of a merge process will lead to
>> unforeseen
>>> regressions. We all saw this first hand with ZeroMQ: I had to freeze the
>>> version of ZeroMQ used by Storm because subsequent versions would regress
>>> in numerous ways.
>>> 2. Bloat – All software projects have a tendency to become bloated and
>>> build complexity because things were added piecemeal without a coherent
>>> vision.
>>> 
>>> These are very serious issues, and I've seen too many projects become
>>> messes because of them. The only way to control these problems is with
>>> -1's. Trust isn't even the issue here – one committer may very well
>> think a
>>> new feature "looks fine" and "why not let it in", while another will
>>> recognize that the feature is unnecessary, adds complexity, and/or can be
>>> addressed via better means. As is, the proposed bylaws are attempting to
>>> make vetoing very difficult.
>>> 
>>> I have a proposal which I believe gets the best of all worlds: allowing
>> for
>>> fast responsiveness on contributions while allowing for regressions and
>>> bloat to be controlled. It is just a slight modification of the current
>>> bylaws:
>>> 
>>> "A minimum of one +1 from a Committer other than the one who authored the
>>> patch, and no -1s. The code can be committed after the first +1. If a -1
>> is
>>> received to the patch within 7 days after the patch was posted, it may be
>>> reverted immediately if it was already merged."
>>> 
>>> To be clear, if a patch was posted on the 7th and merged on the 10th, it
>>> may be -1'd and reverted until the 14th.
>>> 
>>> With this process patches can be merged just as fast as before, but it
>> also
>>> allows for committers with a more holistic or deeper understanding of a
>>> part of Storm to prevent unnecessary complexity.
>>> 
>>> 
>>> On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans >> 
>>> wrote:
>>> 
 I am fine with this. I mostly want a starting point, and we can adjust
 things from there if need be.
 - Bobby
 
 
 On Sunday, February 8, 2015 8:39 PM, Harsha 
>> wrote:
 
 
 
 Thanks for putting this together. The proposed bylaws look good to
 me. -Harsha
 
 
> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
> Associated pull request can be found here:
> https://github.com/apache/storm/pull/419
> 
> 
> This is another attempt at gaining consensus regarding adopting
> official bylaws for the Apache Storm project. The changes are minor
> and should be apparent in the pull request diff.
> 
> In earlier discussions, there were concerns raised about certain
> actions requiring approval types that were too strict. In retrospect,
> and after reviewing the bylaws of other projects (Apache Drill [1],
> Apache Hadoop [2]) as well as the official Glossary of Apache-Related
> Terms [3], it seems that some of those concerns were somewhat
> unfounded, and stemmed from the fact that different projects use
> dif

[jira] [Commented] (STORM-561) Add ability to create topologies dynamically

2015-02-12 Thread P. Taylor Goetz (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319381#comment-14319381
 ] 

P. Taylor Goetz commented on STORM-561:
---

[~marz] Maybe the context here (at least the JIRA title) betrays what 
I have in mind...

Assume the title of this JIRA is something like "Slick GUI for Topology 
Editing/Management".

If someone created a great GUI for defining/editing and submitting topologies 
(no coding or Maven packaging required), and under the covers the app used 
JSON/YAML to store topology definitions (in a modular/reusable way), would you 
be opposed to adding it as a module under `external`?

When I think of "Dynamic Topologies" like the title of this JIRA implies, I 
think of something more along the lines of what you are alluding to. And I 
would be supportive of both.

> Add ability to create topologies dynamically
> 
>
> Key: STORM-561
> URL: https://issues.apache.org/jira/browse/STORM-561
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Nathan Leung
>Assignee: Nathan Leung
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> It would be nice if a Storm topology could be built dynamically, instead of 
> requiring a recompile to change parameters (e.g. number of workers, number of 
> tasks, layout, etc).
> I would propose the following data structures for building core Storm 
> topologies.  I haven't done a design for Trident yet, but the intention would 
> be to add Trident support when core Storm support is complete (or in parallel 
> if other people are working on it):
> {code}
> // fields value and arguments are mutually exclusive
> class Argument {
>     String argumentType; // Class used to look up arguments in method/constructor
>     String implementationType; // Class used to create this argument
>     String value; // String used to construct this argument
>     List<Argument> arguments; // arguments used to build this argument
> }
> class Dependency {
>     String upstreamComponent; // name of upstream component
>     String grouping;
>     List<Argument> arguments; // arguments for the grouping
> }
> class StormSpout {
>     String name;
>     String klazz; // Class of this spout
>     List<Argument> arguments;
>     int numTasks;
>     int numExecutors;
> }
> class StormBolt {
>     String name;
>     String klazz; // Class of this bolt
>     List<Argument> arguments;
>     int numTasks;
>     int numExecutors;
>     List<Dependency> dependencies;
> }
> class StormTopologyRepresentation {
>     String name;
>     List<StormSpout> spouts;
>     List<StormBolt> bolts;
>     Map<String, Object> config;
>     int numWorkers;
> }
> {code}
> Topology creation will be built on top of the data structures above.  The 
> benefits:
> * Dependency free.  Code to unmarshal from JSON, XML, etc. can be kept in 
> extensions, or as examples, and users can write a different unmarshaller if 
> they want to use a different text representation.
> * support for arbitrary spout and bolt types
> * support for all groupings and streams via reflection
> * ability to specify the configuration map via a config file
> * reification of spout / bolt / dependency arguments
> ** recursive argument reification for complex objects
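As a rough illustration of the "recursive argument reification" bullet, the sketch below reflectively builds an object from an Argument tree. Class and field names follow the proposal above; this is not a Storm API, and a real implementation would need parameter-type matching beyond simple constructor arity:

```java
import java.util.List;

public class ArgumentReifier {
    // Mirrors the proposed Argument structure: value and arguments are
    // mutually exclusive.
    public static class Argument {
        public String argumentType;        // class used to look up the parameter type
        public String implementationType;  // class used to instantiate the argument
        public String value;               // literal used to construct a leaf argument
        public List<Argument> arguments;   // nested arguments for complex objects
    }

    // Recursively turn an Argument tree into a live object via reflection.
    public static Object reify(Argument arg) {
        try {
            String cls = arg.implementationType != null
                    ? arg.implementationType : arg.argumentType;
            Class<?> impl = Class.forName(cls);
            if (arg.value != null) {
                // Leaf: construct from the string literal, e.g. new Integer("5").
                return impl.getConstructor(String.class).newInstance(arg.value);
            }
            // Interior node: reify children first, then pick a constructor by arity.
            Object[] children = arg.arguments == null ? new Object[0]
                    : arg.arguments.stream().map(ArgumentReifier::reify).toArray();
            for (java.lang.reflect.Constructor<?> ctor : impl.getConstructors()) {
                if (ctor.getParameterCount() == children.length) {
                    return ctor.newInstance(children);
                }
            }
            throw new IllegalArgumentException("no matching constructor for " + cls);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

An unmarshaller (JSON, XML, or otherwise) would populate the Argument tree and then call `reify` while wiring spouts and bolts.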





[GitHub] storm pull request: Storm-166: Nimbus HA design doc and implementa...

2015-02-12 Thread Parth-Brahmbhatt
Github user Parth-Brahmbhatt commented on the pull request:

https://github.com/apache/storm/pull/354#issuecomment-74185884
  
upmerged to master and added the discovery using a thrift API. 




[jira] [Commented] (STORM-654) Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper.

2015-02-12 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319303#comment-14319303
 ] 

Parth Brahmbhatt commented on STORM-654:


Posted a pull request against my own branch: 
https://github.com/Parth-Brahmbhatt/incubator-storm/pull/3

I haven't implemented the caching yet. I think we should just cache 
ClusterInfo. The UI right now makes 4 requests (cluster summary, supervisor 
summary, topology summary, and now nimbus summary) to getClusterInfo for the 
index page. getClusterInfo does not have any filters, so we end up reading 
everything from ZooKeeper even though only one of the 4 parts is used by a 
single request. I don't think consistency is really important in this case, and 
caching this will both improve UI performance and reduce load on ZooKeeper. 



> Create a thrift API to discover nimbus so all the clients are not forced to 
> contact zookeeper.
> --
>
> Key: STORM-654
> URL: https://issues.apache.org/jira/browse/STORM-654
> Project: Apache Storm
>  Issue Type: Sub-task
>Reporter: Parth Brahmbhatt
>Assignee: Parth Brahmbhatt
>
> The current implementation of Nimbus HA requires each nimbus client to discover 
> nimbus hosts by contacting ZooKeeper. In order to reduce the load on 
> ZooKeeper, we could expose a thrift API as described in the future improvement 
> section of the Nimbus HA design doc. 
> We will add an extra field to the ClusterSummary structure, called nimbuses.
> struct ClusterSummary {
>   1: required list<SupervisorSummary> supervisors;
>   2: required i32 nimbus_uptime_secs;
>   3: required list<TopologySummary> topologies;
>   4: required list<NimbusSummary> nimbuses;
> }
> struct NimbusSummary {
>   1: required string host;
>   2: required i32 port;
>   3: required i32 uptimeSecs;
>   4: required bool isLeader;
>   5: required string version;
>   6: optional list<string> local_storm_ids; // need a better name, but this is 
>   the list of storm-ids for which this nimbus host has the code available 
>   locally.
> }
> We will create a nimbus.hosts configuration which will serve as the seed list 
> of nimbus hosts. Any nimbus host can serve read requests, so any client 
> can issue a getClusterSummary call and extract the leader nimbus 
> summary from the list of nimbuses. All nimbus hosts will cache this 
> information to reduce the load on ZooKeeper. 
> In addition, we can add a RedirectException. When a request that can only be 
> served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, 
> activate) is issued against a non-leader nimbus, the non-leader nimbus will 
> throw a RedirectException, and the client will handle the exception by 
> refreshing its leader nimbus host and contacting that host as part of the retry. 





[jira] [Created] (STORM-671) Measure tuple serialization/deserialization latency.

2015-02-12 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created STORM-671:
-

 Summary: Measure tuple serialization/deserialization latency.
 Key: STORM-671
 URL: https://issues.apache.org/jira/browse/STORM-671
 Project: Apache Storm
  Issue Type: New Feature
Reporter: Robert Joseph Evans


Sometimes the serialization/deserialization cost can be very high, and it is 
not currently measured anywhere in Storm.  We should measure it, at least in a 
way similar to how we measure execute and process latency.
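A minimal sketch of the measurement pattern, using java.io serialization as a stand-in for Storm's Kryo-based tuple serialization; only the timing approach is the point, not the serializer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerdeLatency {
    // Returns {serializeNanos, deserializeNanos} for one round trip, sampled
    // the same way execute/process latency would be.
    public static long[] roundTripNanos(Serializable tuple) {
        try {
            long t0 = System.nanoTime();
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(tuple);
            oos.flush();
            byte[] bytes = bos.toByteArray();
            long t1 = System.nanoTime();
            Object back = new ObjectInputStream(
                    new ByteArrayInputStream(bytes)).readObject();
            long t2 = System.nanoTime();
            if (!tuple.equals(back)) {
                throw new IllegalStateException("round trip mismatch");
            }
            return new long[] { t1 - t0, t2 - t1 };
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```

In practice the two durations would be fed into the same sampled metrics that back execute and process latency, rather than timing every tuple.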





Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread Derek Dagit
I am OK with codifying the retroactive -1 as proposed by Nathan, and I
am otherwise OK with the proposed bylaws.
-- 
Derek 



- Original Message -
From: Bobby Evans 
To: "dev@storm.apache.org" 
Cc: 
Sent: Thursday, February 12, 2015 8:12 AM
Subject: Re: [DISCUSS] Adopt Apache Storm Bylaws

That seems fine to me.  Most other projects I have worked on follow a similar 
procedure, and a retroactive -1 can be applied even without it being codified, 
but making it official seems fine to me.
I am +1 for those changes.
 - Bobby



     On Thursday, February 12, 2015 2:23 AM, Nathan Marz 
 wrote:
  

Yes, I would like to codify it. It's not about there being a bug with a
patch – it's about realizing that particular patch does not fit in with a
coherent vision of Storm, or that functionality could be achieved in a
completely different way. So basically, preventing bloat. With that change
I'm +1 to the bylaws and I believe we would have a consensus.

On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz  wrote:

> I have no problem with your proposal. Actually I never even considered
> setting a timeline for a revert. I've always felt that if there was any
> problem with a patch/modification, it could be reverted at any time -- no
> deadline. If we find a problem, we fix it. We've reverted changes in the
> past, and lived to tell about it :).
>
> So I would think we don't even have to mention any revert timeline. If we
> feel the need to codify that, I'm okay with it.
>
> -Taylor
>
> > On Feb 11, 2015, at 9:06 PM, Nathan Marz  wrote:
> >
> > I'm -1 on these bylaws. This commit process encourages merging as fast as
> > possible and does not give adequate time for dissenting opinions to veto
> a
> > patch. I'm concerned about two things:
> >
> > 1. Regressions - Having too lax of a merge process will lead to
> unforeseen
> > regressions. We all saw this first hand with ZeroMQ: I had to freeze the
> > version of ZeroMQ used by Storm because subsequent versions would regress
> > in numerous ways.
> > 2. Bloat – All software projects have a tendency to become bloated and
> > build complexity because things were added piecemeal without a coherent
> > vision.
> >
> > These are very serious issues, and I've seen too many projects become
> > messes because of them. The only way to control these problems is with
> > -1's. Trust isn't even the issue here – one committer may very well
> think a
> > new feature "looks fine" and "why not let it in", while another will
> > recognize that the feature is unnecessary, adds complexity, and/or can be
> > addressed via better means. As is, the proposed bylaws are attempting to
> > make vetoing very difficult.
> >
> > I have a proposal which I believe gets the best of all worlds: allowing
> for
> > fast responsiveness on contributions while allowing for regressions and
> > bloat to be controlled. It is just a slight modification of the current
> > bylaws:
> >
> > "A minimum of one +1 from a Committer other than the one who authored the
> > patch, and no -1s. The code can be committed after the first +1. If a -1
> is
> > received to the patch within 7 days after the patch was posted, it may be
> > reverted immediately if it was already merged."
> >
> > To be clear, if a patch was posted on the 7th and merged on the 10th, it
> > may be -1'd and reverted until the 14th.
> >
> > With this process patches can be merged just as fast as before, but it
> also
> > allows for committers with a more holistic or deeper understanding of a
> > part of Storm to prevent unnecessary complexity.
> >
> >
> > On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans  >
> > wrote:
> >
> >> I am fine with this. I mostly want a starting point, and we can adjust
> >> things from there if need be.
> >> - Bobby
> >>
> >>
> >>    On Sunday, February 8, 2015 8:39 PM, Harsha 
> wrote:
> >>
> >>
> >>
> >> Thanks for putting this together. The proposed bylaws look good to
> >> me. -Harsha
> >>
> >>
> >>> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
> >>> Associated pull request can be found here:
> >>> https://github.com/apache/storm/pull/419
> >>>
> >>>
> >>> This is another attempt at gaining consensus regarding adopting
> >>> official bylaws for the Apache Storm project. The changes are minor
> >>> and should be apparent in the pull request diff.
> >>>
> >>> In earlier discussions, there were concerns raised about certain
> >>> actions requiring approval types that were too strict. In retrospect,
> >>> and after reviewing the bylaws of other projects (Apache Drill [1],
> >>> Apache Hadoop [2]) as well as the official Glossary of Apache-Related
> >>> Terms [3], it seems that some of those concerns were somewhat
> >>> unfounded, and stemmed from the fact that different projects use
> >>> different and inconsistent names for various approval types.
> >>>
> >>> In an effort to remedy the situation, I have modified the “Approvals”
> >>> table to use the same names as the Glossary of Apache-Related Terms
> >>> [3]. The tabl

[jira] [Commented] (STORM-633) Nimbus - HTTP Error 413 full HEAD if using kerberos authentication

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319058#comment-14319058
 ] 

ASF GitHub Bot commented on STORM-633:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/393#issuecomment-74160282
  
+1 (since this is just a doc update, we probably don't need to wait for 
additional review)


> Nimbus - HTTP Error 413 full HEAD if using kerberos authentication
> --
>
> Key: STORM-633
> URL: https://issues.apache.org/jira/browse/STORM-633
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 0.9.3
>Reporter: Kevin Risden
>Assignee: Sriharsha Chintalapani
>
> When trying to access a kerberized Nimbus, an HTTP 413 full HEAD error 
> is received. This seems related to the issue outlined in HADOOP-8816.
> Setting the Jetty header buffer size with ring-jetty is outlined on 
> Stackoverflow here: 
> http://stackoverflow.com/questions/9285096/clojure-ring-using-the-ring-jetty-adapter-large-requests-give-me-a-413-full-h
> The setting could be exposed in the same way the host was in STORM-575.





[GitHub] storm pull request: STORM-633. Nimbus - HTTP Error 413 full HEAD i...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/393#issuecomment-74160282
  
+1 (since this is just a doc update, we probably don't need to wait for 
additional review)




[GitHub] storm pull request: Storm-539. Storm hive bolt and trident state.

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/350#issuecomment-74159664
  
+1

You should probably list yourself as a committer sponsor. ;) (You can add 
me as well if you like.)




[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318945#comment-14318945
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user miguno commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74147926
  
If you need at-least-once processing you must use an acking topology, which 
will allow Storm to replay lost messages.  If instead you go with an unacking 
topology (= no guaranteed message processing) then you may run into data loss.  
There are pros and cons to each variant; in our case, for example, we use both 
depending on the use case.

Also: The semantics described above have been in Storm right from the 
beginning.  None of these have been changed by this pull request.
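As a toy illustration of what the acking path described above provides (plain Java, not the Storm spout API; the class and method names below are hypothetical): emitted tuples stay pending until acked, and a fail puts them back on the queue for replay, giving at-least-once delivery.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class ReplayQueue {
    private final Deque<String> ready = new ArrayDeque<>();
    private final Map<Long, String> pending = new HashMap<>();
    private long nextId = 0;

    public void offer(String tuple) { ready.addLast(tuple); }

    /** Emit the next tuple, remembering it under a message id until acked. */
    public Long emit() {
        String t = ready.pollFirst();
        if (t == null) return null;
        long id = nextId++;
        pending.put(id, t);
        return id;
    }

    /** Acked tuples are done and forgotten. */
    public void ack(long id) { pending.remove(id); }

    /** A failed (or lost) tuple goes back on the queue for replay. */
    public void fail(long id) {
        String t = pending.remove(id);
        if (t != null) ready.addFirst(t);
    }

    public String peekReady() { return ready.peekFirst(); }
}
```

With a spout that never calls fail() (the "unacking" variant), nothing is ever re-queued, which is exactly where the data loss discussed here comes from.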


> On 12.02.2015, at 20:01, Daniel Schonfeld  
wrote:
> 
> Doesn't dropping the messages coming from a non ack/fail caring spout 
negate the 'at least once' attempt of storm? I mean doesn't that kinda force 
you to make all your spouts ack/fail aware where before you could have gotten 
away without it?
> 
> In other words. There is a chance that if the worker that died is the one 
containing the spout and if the first bolt is located on another worker, that 
technically at-least once wasn't tried but rather fell to the floor right away.
> 
> —
> Reply to this email directly or view it on GitHub.
> 



> Add Option to Config Message handling strategy when connection timeout
> --
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 0.9.2-incubating
>Reporter: Sean Zhong
>Priority: Minor
>  Labels: Netty
> Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern around the blocking is in the case of a worker 
> crashing. If a single worker crashes this can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something that you would want. In other cases I can see speed being the 
> primary concern and some users would like to get partial data fast, rather 
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max 
> limit to the buffering that is allowed, before we block, or throw data away 
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to it?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is restored?
> 3. Should we configure a buffer limit, buffer messages up to that limit, and 
> then block?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism? 
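Option 3 (with option 4 as the fallback) can be sketched in isolation with a bounded queue: buffer up to a limit, block for a bounded time when full, then drop so the failover/replay mechanism can recover the message. The class below is illustrative only, not Storm code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedSendBuffer {
    private final BlockingQueue<String> buffer;
    private final long blockMillis;

    public BoundedSendBuffer(int limit, long blockMillis) {
        this.buffer = new ArrayBlockingQueue<>(limit);
        this.blockMillis = blockMillis;
    }

    /**
     * Buffer the message, blocking up to blockMillis if the buffer is full.
     * Returns true if buffered, false if dropped after the bounded wait.
     */
    public boolean send(String msg) throws InterruptedException {
        return buffer.offer(msg, blockMillis, TimeUnit.MILLISECONDS);
    }

    public int size() { return buffer.size(); }
}
```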





[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread miguno
Github user miguno commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74147926
  
If you need at-least-once processing you must use an acking topology, which 
will allow Storm to replay lost messages.  If instead you go with an unacking 
topology (= no guaranteed message processing) then you may run into data loss.  
There are pros and cons to each variant; in our case, for example, we use both 
depending on the use case.

Also: The semantics described above have been in Storm right from the 
beginning.  None of these have been changed by this pull request.


> On 12.02.2015, at 20:01, Daniel Schonfeld  
wrote:
> 
> Doesn't dropping the messages coming from a non ack/fail caring spout 
negate the 'at least once' attempt of storm? I mean doesn't that kinda force 
you to make all your spouts ack/fail aware where before you could have gotten 
away without it?
> 
> In other words. There is a chance that if the worker that died is the one 
containing the spout and if the first bolt is located on another worker, that 
technically at-least once wasn't tried but rather fell to the floor right away.
> 
> —
> Reply to this email directly or view it on GitHub.
> 





[jira] [Commented] (STORM-652) Use latest junit 4.11

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318923#comment-14318923
 ] 

ASF GitHub Bot commented on STORM-652:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/409#issuecomment-74145938
  
+1 The upgrade does not seem to cause any issues with tests.


> Use latest junit 4.11
> -
>
> Key: STORM-652
> URL: https://issues.apache.org/jira/browse/STORM-652
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>Priority: Trivial
>






[GitHub] storm pull request: STORM-652. Use latest junit 4.11 .

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/409#issuecomment-74145938
  
+1 The upgrade does not seem to cause any issues with tests.




[jira] [Commented] (STORM-641) Add total number of topologies to api/v1/cluster/summary.

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318921#comment-14318921
 ] 

ASF GitHub Bot commented on STORM-641:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/411#issuecomment-74145467
  
+1


> Add total number of topologies to api/v1/cluster/summary.
> -
>
> Key: STORM-641
> URL: https://issues.apache.org/jira/browse/STORM-641
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>Priority: Trivial
>






[GitHub] storm pull request: STORM-641. Add total number of topologies to a...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/411#issuecomment-74145467
  
+1




[jira] [Commented] (STORM-640) Storm UI vulnerable to poodle attack

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318919#comment-14318919
 ] 

ASF GitHub Bot commented on STORM-640:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/412#issuecomment-74145215
  
+1


> Storm UI vulnerable to poodle attack
> 
>
> Key: STORM-640
> URL: https://issues.apache.org/jira/browse/STORM-640
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>Priority: Trivial
>
> More info on this page http://en.wikipedia.org/wiki/POODLE . 
> Steps to verify:
> 1. Enable the Storm UI or logviewer to listen over SSL.
> 2. openssl s_client -connect host:port | grep Protocol
> 3. If SSLv3 shows up, you have the vulnerability; TLS protocol versions are OK.





[GitHub] storm pull request: STORM-640. Storm UI vulnerable to poodle attac...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/412#issuecomment-74145215
  
+1




Re: ShellSpout hangs on reportError?

2015-02-12 Thread William Oberman
Sorry for the cross post to dev, but I think this thread has veered into
actual dev questions.  I still don't know if there is something
fundamentally wrong about my use case, or if this is a bug.  For a dev
reading this for the first time, the main correction I'd make is to my
email subject.  reportError isn't hanging, it's throwing a runtime
exception (wrapping an interrupted exception).  As for what is throwing the
interrupted exception, I think it's Zookeeper itself.

Both ShellSpout and ShellBolt's die() has a
"_collector.reportError(exception);" line.  I changed both to:

try {
    _collector.reportError(exception);
} catch (RuntimeException e) {
    if (e.getCause() instanceof InterruptedException) {
        // zookeeper.clj wraps ZooKeeper's InterruptedException in a
        // RuntimeException
    } else {
        throw e;
    }
}
==
and now everything starts to work as I expected.

Does this patch make any sense?  Or is it a bandaid over a deeper issue?
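The unwrap-and-swallow pattern in the patch above can be exercised in isolation (self-contained sketch, not Storm code; all names here are hypothetical): a RuntimeException whose cause is an InterruptedException is treated as a benign shutdown signal, while any other RuntimeException still propagates.

```java
public class UnwrapDemo {
    /**
     * Runs the given action. Returns true if a RuntimeException wrapping an
     * InterruptedException was swallowed; rethrows any other RuntimeException.
     */
    static boolean reportErrorSafely(Runnable reportError) {
        try {
            reportError.run();
            return false; // completed without throwing
        } catch (RuntimeException e) {
            if (e.getCause() instanceof InterruptedException) {
                // mirrors the patch: zookeeper.clj wraps ZooKeeper's
                // InterruptedException in a RuntimeException on shutdown
                return true;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        boolean swallowed = reportErrorSafely(() -> {
            throw new RuntimeException(new InterruptedException());
        });
        System.out.println(swallowed); // prints "true"
    }
}
```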

will

On Thu, Feb 12, 2015 at 2:15 PM, William Oberman 
wrote:

> Ok, I realized that I did NOT check if ShellSpout.die() was throwing a
> RuntimeException.   I added a try/catch block, and it is!   The
> RuntimeException is preventing _process.destroy and System.exit() from
> happening, both of which need to happen to make topology recovery happen.
>
> But, I'm not sure *why* this exception is happening yet, since it's an
> interrupted exception and I don't think the exception tells me *who*
> interrupted my thread...
>
> 2015-02-12T14:12:35.581-0500 b.s.s.ShellSpout [ERROR] die exception!
> java.lang.RuntimeException: java.lang.InterruptedException
> at backtype.storm.util$wrap_in_runtime.invoke(util.clj:44)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.zookeeper$exists_node_QMARK_$fn__3279.invoke(zookeeper.clj:102)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:98)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:114)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.cluster$mk_distributed_cluster_state$reify__3533.mkdirs(cluster.clj:119)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.cluster$mk_storm_cluster_state$reify__3990.report_error(cluster.clj:400)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.daemon.executor$throttled_report_error_fn$fn__5565.invoke(executor.clj:180)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.daemon.executor$fn__5717$fn$reify__5759.reportError(executor.clj:533)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.spout.SpoutOutputCollector.reportError(SpoutOutputCollector.java:132)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at backtype.storm.spout.ShellSpout.die(ShellSpout.java:235)
> [storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at backtype.storm.spout.ShellSpout.access$200(ShellSpout.java:42)
> [storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> backtype.storm.spout.ShellSpout$SpoutHeartbeatTimerTask.run(ShellSpout.java:261)
> [storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [na:1.7.0_71]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> [na:1.7.0_71]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> [na:1.7.0_71]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> [na:1.7.0_71]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_71]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.InterruptedException: null
> at java.lang.Object.wait(Native Method) ~[na:1.7.0_71]
> at java.lang.Object.wait(Object.java:503) ~[na:1.7.0_71]
> at
> org.apache.storm.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at org.apache.storm.zookeeper.ZooKeeper.exists(ZooKeeper.java:1040)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> org.apache.storm.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:172)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> org.apache.storm.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:161)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> org.apache.storm.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:157)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> org.apache.storm.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:148)
> ~[storm-core-0.9.3.jar:0.9.4-SNAPSHOT]
> at
> org.apache.storm.curator.f

[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread danielschonfeld
Github user danielschonfeld commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74130344
  
Doesn't dropping the messages coming from a non ack/fail caring spout 
negate the 'at least once' attempt of storm? I mean doesn't that kinda force 
you to make all your spouts ack/fail aware where before you could have gotten 
away without it?

In other words.  There is a chance that if the worker that died is the one 
containing the spout and if the first bolt is located on another worker, that 
technically at-least once wasn't tried but rather fell to the floor right away.




[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318764#comment-14318764
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user danielschonfeld commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74130344
  
Doesn't dropping the messages coming from a non ack/fail caring spout 
negate the 'at least once' attempt of storm? I mean doesn't that kinda force 
you to make all your spouts ack/fail aware where before you could have gotten 
away without it?

In other words.  There is a chance that if the worker that died is the one 
containing the spout and if the first bolt is located on another worker, that 
technically at-least once wasn't tried but rather fell to the floor right away.


> Add Option to Config Message handling strategy when connection timeout
> --
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 0.9.2-incubating
>Reporter: Sean Zhong
>Priority: Minor
>  Labels: Netty
> Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern around the blocking is in the case of a worker 
> crashing. If a single worker crashes this can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something that you would want. In other cases I can see speed being the 
> primary concern and some users would like to get partial data fast, rather 
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max 
> limit to the buffering that is allowed, before we block, or throw data away 
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to it?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is restored?
> 3. Should we configure a buffer limit, buffer messages up to that limit, and 
> then block?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism? 





[jira] [Commented] (STORM-658) config topology.acker.executors default value is null and then should not start acker bolts

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318750#comment-14318750
 ] 

ASF GitHub Bot commented on STORM-658:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/417#issuecomment-74128755
  
-1. This breaks trident functionality if `topology.acker.executors` is 
`null` in `storm.yaml` and not overridden in the topology conf.

I've not dug too far into the root cause, but the situation above results 
in `StackOverflowError`s in the MasterBatchCoordinator spout. See stack trace 
below.
```
java.lang.StackOverflowError: null
at clojure.lang.Numbers.multiply(Numbers.java:3663) 
~[clojure-1.6.0.jar:na]
at backtype.storm.stats$curr_time_bucket.invoke(stats.clj:29) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.stats$update_rolling_window.doInvoke(stats.clj:41) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.RestFn.applyTo(RestFn.java:142) 
~[clojure-1.6.0.jar:na]
at clojure.core$apply.invoke(core.clj:628) ~[clojure-1.6.0.jar:na]
at 
backtype.storm.stats$update_rolling_window_set$iter__2980__2984$fn__2985$fn__2986.invoke(stats.clj:77)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.stats$update_rolling_window_set$iter__2980__2984$fn__2985.invoke(stats.clj:76)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.LazySeq.sval(LazySeq.java:40) 
~[clojure-1.6.0.jar:na]
at clojure.lang.LazySeq.seq(LazySeq.java:49) ~[clojure-1.6.0.jar:na]
at clojure.lang.RT.seq(RT.java:484) ~[clojure-1.6.0.jar:na]
at clojure.core$seq.invoke(core.clj:133) ~[clojure-1.6.0.jar:na]
at clojure.core$dorun.invoke(core.clj:2855) ~[clojure-1.6.0.jar:na]
at clojure.core$doall.invoke(core.clj:2871) ~[clojure-1.6.0.jar:na]
at 
backtype.storm.stats$update_rolling_window_set.doInvoke(stats.clj:76) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.RestFn.invoke(RestFn.java:439) 
~[clojure-1.6.0.jar:na]
at clojure.lang.Atom.swap(Atom.java:65) ~[clojure-1.6.0.jar:na]
at clojure.core$swap_BANG_.invoke(core.clj:2234) 
~[clojure-1.6.0.jar:na]
at backtype.storm.stats$emitted_tuple_BANG_.invoke(stats.clj:215) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.task$mk_tasks_fn$fn__4329.invoke(task.clj:160) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:502)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBatchCoordinator.java:176)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.ack(MasterBatchCoordinator.java:145)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:399) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:532)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBatchCoordinator.java:200)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.ack(MasterBatchCoordinator.java:145)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:399) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:532)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBa

[GitHub] storm pull request: STORM-658:when config topology.acker.executors...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/417#issuecomment-74128755
  
-1. This breaks trident functionality if `topology.acker.executors` is 
`null` in `storm.yaml` and not overridden in the topology conf.

I've not dug too far into the root cause, but the situation above results 
in `StackOverflowError`s in the MasterBatchCoordinator spout. See stack trace 
below.
```
java.lang.StackOverflowError: null
at clojure.lang.Numbers.multiply(Numbers.java:3663) 
~[clojure-1.6.0.jar:na]
at backtype.storm.stats$curr_time_bucket.invoke(stats.clj:29) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.stats$update_rolling_window.doInvoke(stats.clj:41) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.RestFn.applyTo(RestFn.java:142) 
~[clojure-1.6.0.jar:na]
at clojure.core$apply.invoke(core.clj:628) ~[clojure-1.6.0.jar:na]
at 
backtype.storm.stats$update_rolling_window_set$iter__2980__2984$fn__2985$fn__2986.invoke(stats.clj:77)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.stats$update_rolling_window_set$iter__2980__2984$fn__2985.invoke(stats.clj:76)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.LazySeq.sval(LazySeq.java:40) 
~[clojure-1.6.0.jar:na]
at clojure.lang.LazySeq.seq(LazySeq.java:49) ~[clojure-1.6.0.jar:na]
at clojure.lang.RT.seq(RT.java:484) ~[clojure-1.6.0.jar:na]
at clojure.core$seq.invoke(core.clj:133) ~[clojure-1.6.0.jar:na]
at clojure.core$dorun.invoke(core.clj:2855) ~[clojure-1.6.0.jar:na]
at clojure.core$doall.invoke(core.clj:2871) ~[clojure-1.6.0.jar:na]
at 
backtype.storm.stats$update_rolling_window_set.doInvoke(stats.clj:76) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at clojure.lang.RestFn.invoke(RestFn.java:439) 
~[clojure-1.6.0.jar:na]
at clojure.lang.Atom.swap(Atom.java:65) ~[clojure-1.6.0.jar:na]
at clojure.core$swap_BANG_.invoke(core.clj:2234) 
~[clojure-1.6.0.jar:na]
at backtype.storm.stats$emitted_tuple_BANG_.invoke(stats.clj:215) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.task$mk_tasks_fn$fn__4329.invoke(task.clj:160) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:502)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBatchCoordinator.java:176)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.ack(MasterBatchCoordinator.java:145)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:399) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:532)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBatchCoordinator.java:200)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.ack(MasterBatchCoordinator.java:145)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:399) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn__4630$send_spout_msg__4648.invoke(executor.clj:532)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.daemon.executor$fn__4615$fn$reify__4657.emit(executor.clj:547) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.storm.spout.SpoutOutputCollector.emit(SpoutOutputCollector.java:49) 
~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.sync(MasterBatchCoordinator.java:176)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
storm.trident.topology.MasterBatchCoordinator.ack(MasterBatchCoordinator.java:145)
 ~[storm-core-0.10.0-SNAPSHOT.jar:0.10.0-SNAPSHOT]
at 
backtype.st

[jira] [Commented] (STORM-581) Add rebalance params to Storm REST API

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318630#comment-14318630
 ] 

ASF GitHub Bot commented on STORM-581:
--

Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74118053
  
fixed the doc. Thanks.


> Add rebalance params to Storm REST API
> --
>
> Key: STORM-581
> URL: https://issues.apache.org/jira/browse/STORM-581
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>
> Improve the REST API endpoint /api/v1/topology/:id/rebalance/:wait-time to 
> accept params such as number of workers and component parallelism





[GitHub] storm pull request: STORM-581. Add rebalance params to Storm REST ...

2015-02-12 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74118053
  
fixed the doc. Thanks.




[GitHub] storm pull request: Storm-539. Storm hive bolt and trident state.

2015-02-12 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/350#issuecomment-74117405
  
@revans2 @ptgoetz  can you please review this. Thanks.




[jira] [Commented] (STORM-581) Add rebalance params to Storm REST API

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318586#comment-14318586
 ] 

ASF GitHub Bot commented on STORM-581:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74114183
  
+1 The typo fix is a non-code change so I'm fine with including that.


> Add rebalance params to Storm REST API
> --
>
> Key: STORM-581
> URL: https://issues.apache.org/jira/browse/STORM-581
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>
> Improve the REST API endpoint /api/v1/topology/:id/rebalance/:wait-time to 
> accept params such as number of workers and component parallelism





[GitHub] storm pull request: STORM-581. Add rebalance params to Storm REST ...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74114183
  
+1 The typo fix is a non-code change so I'm fine with including that.




[GitHub] storm pull request: STORM-670: restore java 1.6 compatibility (sto...

2015-02-12 Thread ptgoetz
GitHub user ptgoetz opened a pull request:

https://github.com/apache/storm/pull/431

STORM-670: restore java 1.6 compatibility (storm-kafka)

Not sure how long we want to cling to 1.6 compatibility, but this is a very 
small change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ptgoetz/storm STORM-670

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #431


commit d9568f99841e8449f41353e409eac73aeb3b6dc7
Author: P. Taylor Goetz 
Date:   2015-02-12T17:15:04Z

STORM-670: restore java 1.6 compatibility






[jira] [Commented] (STORM-670) [storm-kafka] Restore Java 1.6 compatibility

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318579#comment-14318579
 ] 

ASF GitHub Bot commented on STORM-670:
--

GitHub user ptgoetz opened a pull request:

https://github.com/apache/storm/pull/431

STORM-670: restore java 1.6 compatibility (storm-kafka)

Not sure how long we want to cling to 1.6 compatibility, but this is a very 
small change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ptgoetz/storm STORM-670

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #431


commit d9568f99841e8449f41353e409eac73aeb3b6dc7
Author: P. Taylor Goetz 
Date:   2015-02-12T17:15:04Z

STORM-670: restore java 1.6 compatibility




> [storm-kafka] Restore Java 1.6 compatibility
> 
>
> Key: STORM-670
> URL: https://issues.apache.org/jira/browse/STORM-670
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-kafka
>Affects Versions: 0.10.0
>Reporter: P. Taylor Goetz
>Assignee: P. Taylor Goetz
>
> java.lang.Long.compare(Long, Long) is only available in Java 1.7





[jira] [Created] (STORM-670) [storm-kafka] Restore Java 1.6 compatibility

2015-02-12 Thread P. Taylor Goetz (JIRA)
P. Taylor Goetz created STORM-670:
-

 Summary: [storm-kafka] Restore Java 1.6 compatibility
 Key: STORM-670
 URL: https://issues.apache.org/jira/browse/STORM-670
 Project: Apache Storm
  Issue Type: Bug
  Components: storm-kafka
Affects Versions: 0.10.0
Reporter: P. Taylor Goetz
Assignee: P. Taylor Goetz


java.lang.Long.compare(Long, Long) is only available in Java 1.7
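Since java.lang.Long.compare(long, long) was only added in Java 1.7, restoring 1.6 compatibility means replacing the call with an explicit comparison. A minimal sketch of the usual replacement (the helper class name is illustrative, not taken from the actual patch):

```java
// Java 1.6 has no java.lang.Long.compare(long, long); this is the
// standard 1.6-compatible equivalent. Class name is illustrative only.
public class LongCompat {
    public static int compare(long a, long b) {
        // Same contract as Long.compare: negative, zero, or positive.
        return (a < b) ? -1 : ((a == b) ? 0 : 1);
    }
}
```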





[jira] [Commented] (STORM-581) Add rebalance params to Storm REST API

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318558#comment-14318558
 ] 

ASF GitHub Bot commented on STORM-581:
--

Github user revans2 commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74109714
  
One really minor spelling fix.  I am +1 for the code so unless the code 
changes you don't need another +1 from me.


> Add rebalance params to Storm REST API
> --
>
> Key: STORM-581
> URL: https://issues.apache.org/jira/browse/STORM-581
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>
> Improve the REST API /api/v1/topology/:id/rebalance/:wait-time to accept 
> params such as the number of workers and component parallelism





[GitHub] storm pull request: STORM-581. Add rebalance params to Storm REST ...

2015-02-12 Thread revans2
Github user revans2 commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74109714
  
One really minor spelling fix.  I am +1 for the code so unless the code 
changes you don't need another +1 from me.




[jira] [Commented] (STORM-581) Add rebalance params to Storm REST API

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318554#comment-14318554
 ] 

ASF GitHub Bot commented on STORM-581:
--

Github user revans2 commented on a diff in the pull request:

https://github.com/apache/storm/pull/415#discussion_r24597644
  
--- Diff: STORM-UI-REST-API.md ---
@@ -630,6 +642,31 @@ Rebalances a topology.
 |--||-|
 |id   |String (required)| Topology Id  |
 |wait-time |String (required)| Wait time before rebalance happens |
+|rebalanceOptions| Json (optional) | topology rebalance options |
+
+
+Sample rebalancOptions json:
--- End diff --

really minor s/rebalancOptions/rebalanceOptions/ (missing an e)


> Add rebalance params to Storm REST API
> --
>
> Key: STORM-581
> URL: https://issues.apache.org/jira/browse/STORM-581
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>
> Improve the REST API /api/v1/topology/:id/rebalance/:wait-time to accept 
> params such as the number of workers and component parallelism





[GitHub] storm pull request: STORM-581. Add rebalance params to Storm REST ...

2015-02-12 Thread revans2
Github user revans2 commented on a diff in the pull request:

https://github.com/apache/storm/pull/415#discussion_r24597644
  
--- Diff: STORM-UI-REST-API.md ---
@@ -630,6 +642,31 @@ Rebalances a topology.
 |--||-|
 |id   |String (required)| Topology Id  |
 |wait-time |String (required)| Wait time before rebalance happens |
+|rebalanceOptions| Json (optional) | topology rebalance options |
+
+
+Sample rebalancOptions json:
--- End diff --

really minor s/rebalancOptions/rebalanceOptions/ (missing an e)




[jira] [Commented] (STORM-581) Add rebalance params to Storm REST API

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318545#comment-14318545
 ] 

ASF GitHub Bot commented on STORM-581:
--

Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74108012
  
@ptgoetz @revans2 Bobby already gave a +1 on the previous PR. This new PR is 
an upmerge without any code changes. Can you please take a look?


> Add rebalance params to Storm REST API
> --
>
> Key: STORM-581
> URL: https://issues.apache.org/jira/browse/STORM-581
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>
> Improve the REST API /api/v1/topology/:id/rebalance/:wait-time to accept 
> params such as the number of workers and component parallelism





[GitHub] storm pull request: STORM-581. Add rebalance params to Storm REST ...

2015-02-12 Thread harshach
Github user harshach commented on the pull request:

https://github.com/apache/storm/pull/415#issuecomment-74108012
  
@ptgoetz @revans2 Bobby already gave a +1 on the previous PR. This new PR is 
an upmerge without any code changes. Can you please take a look?




[jira] [Commented] (STORM-130) [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does not exist

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318467#comment-14318467
 ] 

ASF GitHub Bot commented on STORM-130:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/418#issuecomment-74099551
  
+1

I'll presume +1 approvals carry over from #401, but allow others time to 
comment.


> [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does 
> not exist
> 
>
> Key: STORM-130
> URL: https://issues.apache.org/jira/browse/STORM-130
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: James Xu
>Assignee: Sriharsha Chintalapani
>Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/438
> Hi developers,
> We met critical issue with deploying storm topology to our prod cluster.
> After deploying topology we got trace on workers (Storm 
> 0.8.2/zookeeper-3.3.6) :
> 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, 
> connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181
>  sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@254ba9a2
> 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server 
> zookeeper1.company.com/10.72.209.112:2181
> 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to 
> zookeeper1.company.com/10.72.209.112:2181, initiating session
> 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on 
> server zookeeper1.company.com/10.72.209.112:2181, sessionid = 
> 0x13b3e4b5c780239, negotiated timeout = 2
> 2013-01-14 10:57:39 zookeeper [INFO] Zookeeper state update: :connected:none
> 2013-01-14 10:57:39 ZooKeeper [INFO] Session: 0x13b3e4b5c780239 closed
> 2013-01-14 10:57:39 ClientCnxn [INFO] EventThread shut down
> 2013-01-14 10:57:39 CuratorFrameworkImpl [INFO] Starting
> 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, 
> connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181/storm
>  sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@33a998c7
> 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server 
> zookeeper1.company.com/10.72.209.112:2181
> 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to 
> zookeeper1.company.com/10.72.209.112:2181, initiating session
> 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on 
> server zookeeper1.company.com/10.72.209.112:2181, sessionid = 
> 0x13b3e4b5c78023a, negotiated timeout = 2
> 2013-01-14 10:57:39 worker [ERROR] Error on initialization of server mk-worker
> java.io.FileNotFoundException: File 
> '/tmp/storm/supervisor/stormdist/normalization-prod-1-1358161053/stormconf.ser'
>  does not exist
> at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137)
> at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135)
> at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:138)
> at backtype.storm.daemon.worker$worker_data.invoke(worker.clj:146)
> at 
> backtype.storm.daemon.worker$fn__4348$exec_fn__1228__auto4349.invoke(worker.clj:332)
> at clojure.lang.AFn.applyToHelper(AFn.java:185)
> at clojure.lang.AFn.applyTo(AFn.java:151)
> at clojure.core$apply.invoke(core.clj:601)
> at 
> backtype.storm.daemon.worker$fn__4348$mk_worker__4404.doInvoke(worker.clj:323)
> at clojure.lang.RestFn.invoke(RestFn.java:512)
> at backtype.storm.daemon.worker$_main.invoke(worker.clj:433)
> at clojure.lang.AFn.applyToHelper(AFn.java:172)
> at clojure.lang.AFn.applyTo(AFn.java:151)
> at backtype.storm.daemon.worker.main(Unknown Source)
> 2013-01-14 10:57:39 util [INFO] Halting process: ("Error on initialization")
> Supervisor trace:
> 2013-01-14 10:59:01 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
> still hasn't started
> 2013-01-14 10:59:02 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
> still hasn't starte
> ...
> 2013-01-14 10:59:34 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
> still hasn't started
> 2013-01-14 10:59:35 supervisor [INFO] Worker 
> d6735377-f0d6-4247-9f35-c8620e2b0e26 failed to start
> 2013-01-14 10:59:35 supervisor [INFO] Worker 
> 234264c6-d9d6-4e8a-ab0a-8926bdd6b536 failed to start
> 2013-01-14 10:59:35 supervisor [INFO] Shutting down and clearing state for id 
> 234264c6-d9d6-4e8a-ab0a-8926bdd6b536. Current supervisor time: 1358161175. 
> State: :disallowed, Heartbeat: nil
> 2013-01-14 10:59:35 supervisor [INFO] Shutting down 
> d5c3235f-5880-4be8-a759-5654b3df6a27:234264c6-d9d6-4e8a-ab0a-8926bdd6b536
> 2013-01-14 10:59:35 util [INFO] Error when trying to kill 4819. Process is 
> probably already dead.
> 2013-01-14 10:59:35 supe

[GitHub] storm pull request: STORM-130: Supervisor getting killed due to ja...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/418#issuecomment-74099551
  
+1

I'll presume +1 approvals carry over from #401, but allow others time to 
comment.




[jira] [Commented] (STORM-651) improvements to storm.cmd

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318453#comment-14318453
 ] 

ASF GitHub Bot commented on STORM-651:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/427#issuecomment-74098372
  
+1


> improvements to storm.cmd
> -
>
> Key: STORM-651
> URL: https://issues.apache.org/jira/browse/STORM-651
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Parth Brahmbhatt
>Priority: Minor
>






[GitHub] storm pull request: Storm-649: Storm HDFS test topologies should w...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/426#issuecomment-74098599
  
+1, looks fine.




[GitHub] storm pull request: STORM-651: Rename "ui" service to "storm ui" a...

2015-02-12 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/427#issuecomment-74098372
  
+1




[GitHub] storm pull request: SimpleTransportPlugin TThreadPoolServer

2015-02-12 Thread darionyaphet
GitHub user darionyaphet opened a pull request:

https://github.com/apache/storm/pull/430

SimpleTransportPlugin TThreadPoolServer

I'm reading Storm's source code and found that it is quite different from 
version 0.9.3. Storm Nimbus's ThriftServer is created by SimpleTransportPlugin, 
whose getServer() function builds a THsHaServer to process supervisors' 
requests. 
I believe TThreadPoolServer may support higher throughput and lower 
latency, so I am making this pull request. 
Also, some config options such as queueSize could be removed.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/darionyaphet/storm master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #430


commit 4f0f9f4b3031137a8ad971f78f1dc6a210778516
Author: yaphet 
Date:   2015-02-10T13:40:03Z

Merge pull request #1 from apache/master

merge update

commit 357f23abaf4abd09193b6446b797eddc2258375f
Author: darionyaphet 
Date:   2015-02-12T15:36:50Z

update transport plugin






[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318317#comment-14318317
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user clockfly commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74084548
  
+1

On Thu, Feb 12, 2015 at 6:44 PM, Michael G. Noll 
wrote:

> Thanks for your feedback, Nathan.
>
> As far as I understand, this patch does not enable backpressure. But because
> there is no backpressure (yet) that we can rely on, this patch at least
> improves the situation during the startup phase of a topology, ensuring that
> a) an unacked topo does not lose messages during startup, and b) we do not
> unnecessarily replay messages for acked topos during their startup. This is
> achieved by checking that all worker connections are ready before the
> topology starts processing data.
>
> So backpressure is still an open feature. Backpressure was IIRC mentioned
> in the initial PR because there was a deficiency (dating back to a ZMQ
> related TODO) that caused problems related to this PR/Storm tickets (327,
> 404, and one more). However, this patch does make the best of the current
> situation even in the absence of backpressure. But first and foremost this
> patch fixes a (critical) cascading failure that can bring Storm clusters to
> a halt.
>
> Please correct me if I'm mistaken in my summary.
>
> —
> Reply to this email directly or view it on GitHub
> .
>



> Add Option to Config Message handling strategy when connection timeout
> --
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 0.9.2-incubating
>Reporter: Sean Zhong
>Priority: Minor
>  Labels: Netty
> Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern around the blocking is in the case of a worker 
> crashing. If a single worker crashes, this can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something that you would want. In other cases I can see speed being the 
> primary concern, and some users would like to get partial data fast rather 
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max 
> limit to the buffering that is allowed, before we block, or throw data away 
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to it?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is restored?
> 3. Should we configure a buffer limit, buffer messages first, and block once 
> the limit is reached?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism? 
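Option 3 above, buffering up to a limit and then blocking, can be sketched with a bounded queue. Class and method names here are hypothetical and are not Storm's actual implementation:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of option 3: buffer outgoing messages up to a fixed limit,
// then block the sender until the connection drains the queue.
// Names are illustrative, not taken from Storm's code.
public class BoundedMessageBuffer {
    private final BlockingQueue<byte[]> pending;

    public BoundedMessageBuffer(int limit) {
        this.pending = new ArrayBlockingQueue<byte[]>(limit);
    }

    // Blocks once `limit` messages are buffered, applying backpressure
    // to the caller instead of buffering indefinitely.
    public void send(byte[] msg) throws InterruptedException {
        pending.put(msg);
    }

    // Called by the connection thread once the remote worker is reachable.
    public byte[] drainOne() throws InterruptedException {
        return pending.take();
    }

    public int buffered() {
        return pending.size();
    }
}
```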





[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread clockfly
Github user clockfly commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74084548
  
+1

On Thu, Feb 12, 2015 at 6:44 PM, Michael G. Noll 
wrote:

> Thanks for your feedback, Nathan.
>
> As far as I understand, this patch does not enable backpressure. But because
> there is no backpressure (yet) that we can rely on, this patch at least
> improves the situation during the startup phase of a topology, ensuring that
> a) an unacked topo does not lose messages during startup, and b) we do not
> unnecessarily replay messages for acked topos during their startup. This is
> achieved by checking that all worker connections are ready before the
> topology starts processing data.
>
> So backpressure is still an open feature. Backpressure was IIRC mentioned
> in the initial PR because there was a deficiency (dating back to a ZMQ
> related TODO) that caused problems related to this PR/Storm tickets (327,
> 404, and one more). However, this patch does make the best of the current
> situation even in the absence of backpressure. But first and foremost this
> patch fixes a (critical) cascading failure that can bring Storm clusters to
> a halt.
>
> Please correct me if I'm mistaken in my summary.
>
> —
> Reply to this email directly or view it on GitHub
> .
>





Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread Bobby Evans
That seems fine to me. Most other projects I have worked on follow a similar 
procedure, and a retroactive -1 can be applied without being codified, but 
making it official seems fine to me.
I am +1 for those changes.
 - Bobby
 

 On Thursday, February 12, 2015 2:23 AM, Nathan Marz 
 wrote:
   

 Yes, I would like to codify it. It's not about there being a bug with a
patch – it's about realizing that a particular patch does not fit in with a
coherent vision of Storm, or that the functionality could be achieved in a
completely different way. So basically, preventing bloat. With that change
I'm +1 to the bylaws, and I believe we would have a consensus.

On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz  wrote:

> I have no problem with your proposal. Actually I never even considered
> setting a timeline for a revert. I've always felt that if there was any
> problem with a patch/modification, it could be reverted at any time -- no
> deadline. If we find a problem, we fix it. We've reverted changes in the
> past, and lived to tell about it :).
>
> So I would think we don't even have to mention any revert timeline. If we
> feel the need to codify that, I'm okay with it.
>
> -Taylor
>
> > On Feb 11, 2015, at 9:06 PM, Nathan Marz  wrote:
> >
> > I'm -1 on these bylaws. This commit process encourages merging as fast as
> > possible and does not give adequate time for dissenting opinions to veto
> a
> > patch. I'm concerned about two things:
> >
> > 1. Regressions - Having too lax of a merge process will lead to
> unforeseen
> > regressions. We all saw this first hand with ZeroMQ: I had to freeze the
> > version of ZeroMQ used by Storm because subsequent versions would regress
> > in numerous ways.
> > 2. Bloat – All software projects have a tendency to become bloated and
> > build complexity because things were added piecemeal without a coherent
> > vision.
> >
> > These are very serious issues, and I've seen too many projects become
> > messes because of them. The only way to control these problems is with
> > -1's. Trust isn't even the issue here – one committer may very well
> think a
> > new feature "looks fine" and "why not let it in", while another will
> > recognize that the feature is unnecessary, adds complexity, and/or can be
> > addressed via better means. As is, the proposed bylaws are attempting to
> > make vetoing very difficult.
> >
> > I have a proposal which I believe gets the best of all worlds: allowing
> for
> > fast responsiveness on contributions while allowing for regressions and
> > bloat to be controlled. It is just a slight modification of the current
> > bylaws:
> >
> > "A minimum of one +1 from a Committer other than the one who authored the
> > patch, and no -1s. The code can be committed after the first +1. If a -1
> is
> > received to the patch within 7 days after the patch was posted, it may be
> > reverted immediately if it was already merged."
> >
> > To be clear, if a patch was posted on the 7th and merged on the 10th, it
> > may be -1'd and reverted until the 14th.
> >
> > With this process patches can be merged just as fast as before, but it
> also
> > allows for committers with a more holistic or deeper understanding of a
> > part of Storm to prevent unnecessary complexity.
> >
> >
> > On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans  >
> > wrote:
> >
> >> I am fine with this. I mostly want a starting point, and we can adjust
> >> things from there if need be.
> >> - Bobby
> >>
> >>
> >>    On Sunday, February 8, 2015 8:39 PM, Harsha 
> wrote:
> >>
> >>
> >>
> >> Thanks for putting this together. The proposed bylaws look good to
> >> me. -Harsha
> >>
> >>
> >>> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
> >>> Associated pull request can be found here:
> >>> https://github.com/apache/storm/pull/419
> >>>
> >>>
> >>> This is another attempt at gaining consensus regarding adopting
> >>> official bylaws for the Apache Storm project. The changes are minor
> >>> and should be apparent in the pull request diff.
> >>>
> >>> In earlier discussions, there were concerns raised about certain
> >>> actions requiring approval types that were too strict. In retrospect,
> >>> and after reviewing the bylaws of other projects (Apache Drill [1],
> >>> Apache Hadoop [2]) as well as the official Glossary of Apache-Related
> >>> Terms [3], it seems that some of those concerns were somewhat
> >>> unfounded, and stemmed from the fact that different projects use
> >>> different and inconsistent names for various approval types.
> >>>
> >>> In an effort to remedy the situation, I have modified the “Approvals”
> >>> table to use the same names as the Glossary of Apache-Related Terms
> >>> [3]. The table below provides a mapping between the terms used in this
> >>> proposed update to the Apache Storm bylaws, the Apache Glossary, the
> >>> Apache Drill bylaws, and the Apache Hadoop bylaws.
> >>>
> >>>
> >>> | Proposed Storm Bylaws | Apache Glossary | Apache Drill | Apache
> >>> | Hadoop | Definit

[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317985#comment-14317985
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user miguno commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74050954
  
Thanks for your feedback, Nathan.

As far as I understand, this patch does not enable backpressure. But because 
there is no backpressure (yet) that we can rely on, this patch at least 
improves the situation during the startup phase of a topology, ensuring that 
a) an unacked topo does not lose messages during startup, and b) we do not 
unnecessarily replay messages for acked topos during their startup. This is 
achieved by checking that all worker connections are ready before the topology 
starts processing data.

So backpressure is still an open feature. Backpressure was IIRC mentioned 
in the initial PR because there was a deficiency (dating back to a ZMQ related 
TODO) that caused problems related to this PR/Storm tickets (327, 404, and one 
more).  However, this patch does make the best of the current situation even in 
the absence of backpressure.  But first and foremost this patch fixes a 
(critical) cascading failure that can bring Storm clusters to a halt.

Please correct me if I'm mistaken in my summary.
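The startup check described here, waiting until every worker connection is ready before processing begins, can be sketched as follows. The Connection interface and class names are illustrative, not Storm's actual connection API:

```java
import java.util.List;

// Illustrative sketch of the startup gate described above: block until
// every worker connection reports ready before the topology starts
// processing. The Connection interface is hypothetical, not Storm's API.
public class StartupGate {

    public interface Connection {
        boolean isReady();
    }

    public static void awaitAllReady(List<Connection> conns, long retryMillis)
            throws InterruptedException {
        while (true) {
            boolean allReady = true;
            for (Connection c : conns) {
                if (!c.isReady()) {
                    allReady = false;
                    break;
                }
            }
            if (allReady) {
                return; // safe to start emitting tuples
            }
            Thread.sleep(retryMillis); // retry until every peer is up
        }
    }
}
```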



> Add Option to Config Message handling strategy when connection timeout
> --
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 0.9.2-incubating
>Reporter: Sean Zhong
>Priority: Minor
>  Labels: Netty
> Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern around the blocking is in the case of a worker 
> crashing. If a single worker crashes, this can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something that you would want. In other cases I can see speed being the 
> primary concern, and some users would like to get partial data fast rather 
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max 
> limit to the buffering that is allowed, before we block, or throw data away 
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to it?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is restored?
> 3. Should we configure a buffer limit, buffer messages first, and block once 
> the limit is reached?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism? 





[GitHub] storm pull request: STORM-329: fix cascading Storm failure by impr...

2015-02-12 Thread miguno
Github user miguno commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74050954
  
Thanks for your feedback, Nathan.

As far as I understand, this patch does not enable backpressure. But because 
there is no backpressure (yet) that we can rely on, this patch at least 
improves the situation during the startup phase of a topology, ensuring that 
a) an unacked topo does not lose messages during startup, and b) we do not 
unnecessarily replay messages for acked topos during their startup. This is 
achieved by checking that all worker connections are ready before the topology 
starts processing data.

So backpressure is still an open feature. Backpressure was IIRC mentioned 
in the initial PR because there was a deficiency (dating back to a ZMQ related 
TODO) that caused problems related to this PR/Storm tickets (327, 404, and one 
more).  However, this patch does make the best of the current situation even in 
the absence of backpressure.  But first and foremost this patch fixes a 
(critical) cascading failure that can bring Storm clusters to a halt.

Please correct me if I'm mistaken in my summary.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317820#comment-14317820
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user nathanmarz commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74036235
  
I retract my earlier -1. It was mentioned that this enables backpressure 
for unacked topologies. Is this the case? If so, this is a great new feature 
for Storm, and tests should be added for this behavior. Namely, they should 
verify that:

- Spouts stop emitting when all buffers fill up
- Topology recovers on worker death and only a subset of messages are 
dropped in that case


> Add Option to Config Message handling strategy when connection timeout
> --
>
> Key: STORM-329
> URL: https://issues.apache.org/jira/browse/STORM-329
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 0.9.2-incubating
>Reporter: Sean Zhong
>Priority: Minor
>  Labels: Netty
> Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern with the blocking is the case of a worker 
> crashing. If a single worker crashes, it can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something you would want. In other cases I can see speed being the 
> primary concern, and some users would like to get partial data fast rather 
> than accurate data later.
> Could we make this configurable in a follow-up JIRA, with a max limit on the 
> buffering that is allowed before we block or throw data away (which is what 
> ZeroMQ does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to that worker?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is restored?
> 3. Should we configure a buffer limit, try to buffer messages first, and 
> block once the limit is reached?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism?






Re: [DISCUSS] Adopt Apache Storm Bylaws

2015-02-12 Thread Nathan Marz
Yes, I would like to codify it. It's not about there being a bug with a
patch – it's about realizing that particular patch does not fit in with a
coherent vision of Storm, or that functionality could be achieved in a
completely different way. So basically, preventing bloat. With that change
I'm +1 to the bylaws and I believe we would have a consensus.

On Wed, Feb 11, 2015 at 7:34 PM, P. Taylor Goetz  wrote:

> I have no problem with your proposal. Actually I never even considered
> setting a timeline for a revert. I've always felt that if there was any
> problem with a patch/modification, it could be reverted at any time -- no
> deadline. If we find a problem, we fix it. We've reverted changes in the
> past, and lived to tell about it :).
>
> So I would think we don't even have to mention any revert timeline. If we
> feel the need to codify that, I'm okay with it.
>
> -Taylor
>
> > On Feb 11, 2015, at 9:06 PM, Nathan Marz  wrote:
> >
> > I'm -1 on these bylaws. This commit process encourages merging as fast as
> > possible and does not give adequate time for dissenting opinions to veto
> a
> > patch. I'm concerned about two things:
> >
> > 1. Regressions - Having too lax of a merge process will lead to
> unforeseen
> > regressions. We all saw this first hand with ZeroMQ: I had to freeze the
> > version of ZeroMQ used by Storm because subsequent versions would regress
> > in numerous ways.
> > 2. Bloat – All software projects have a tendency to become bloated and
> > build complexity because things were added piecemeal without a coherent
> > vision.
> >
> > These are very serious issues, and I've seen too many projects become
> > messes because of them. The only way to control these problems is with
> > -1's. Trust isn't even the issue here – one committer may very well
> think a
> > new feature "looks fine" and "why not let it in", while another will
> > recognize that the feature is unnecessary, adds complexity, and/or can be
> > addressed via better means. As is, the proposed bylaws are attempting to
> > make vetoing very difficult.
> >
> > I have a proposal which I believe gets the best of all worlds: allowing
> for
> > fast responsiveness on contributions while allowing for regressions and
> > bloat to be controlled. It is just a slight modification of the current
> > bylaws:
> >
> > "A minimum of one +1 from a Committer other than the one who authored the
> > patch, and no -1s. The code can be committed after the first +1. If a -1
> is
> > received to the patch within 7 days after the patch was posted, it may be
> > reverted immediately if it was already merged."
> >
> > To be clear, if a patch was posted on the 7th and merged on the 10th, it
> > may be -1'd and reverted until the 14th.
> >
> > With this process patches can be merged just as fast as before, but it
> also
> > allows for committers with a more holistic or deeper understanding of a
> > part of Storm to prevent unnecessary complexity.
> >
> >
> > On Tue, Feb 10, 2015 at 7:48 AM, Bobby Evans  >
> > wrote:
> >
> >> I am fine with this. I mostly want a starting point, and we can adjust
> >> things from there if need be.
> >> - Bobby
> >>
> >>
> >> On Sunday, February 8, 2015 8:39 PM, Harsha 
> wrote:
> >>
> >>
> >>
> >> Thanks for putting this together. The proposed bylaws look good to
> >> me. -Harsha
> >>
> >>
> >>> On Thu, Feb 5, 2015, at 02:10 PM, P. Taylor Goetz wrote:
> >>> Associated pull request can be found here:
> >>> https://github.com/apache/storm/pull/419
> >>>
> >>>
> >>> This is another attempt at gaining consensus regarding adopting
> >>> official bylaws for the Apache Storm project. The changes are minor
> >>> and should be apparent in the pull request diff.
> >>>
> >>> In earlier discussions, there were concerns raised about certain
> >>> actions requiring approval types that were too strict. In retrospect,
> >>> and after reviewing the bylaws of other projects (Apache Drill [1],
> >>> Apache Hadoop [2]) as well as the official Glossary of Apache-Related
> >>> Terms [3], it seems that some of those concerns were somewhat
> >>> unfounded, and stemmed from the fact that different projects use
> >>> different and inconsistent names for various approval types.
> >>>
> >>> In an effort to remedy the situation, I have modified the “Approvals”
> >>> table to use the same names as the Glossary of Apache-Related Terms
> >>> [3]. The table below provides a mapping between the terms used in this
> >>> proposed update to the Apache Storm bylaws, the Apache Glossary, the
> >>> Apache Drill bylaws, and the Apache Hadoop bylaws.
> >>>
> >>>
> >>> | Proposed Storm Bylaws | Apache Glossary | Apache Drill | Apache Hadoop | Definition |
> >>> |---|---|---|---|---|
> >>> | Consensus Approval | Consensus Approval | Lazy Consensus | Consensus Approval | 3 binding +1 votes and no binding -1 votes |

[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

2015-02-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317779#comment-14317779
 ] 

ASF GitHub Bot commented on STORM-329:
--

Github user miguno commented on the pull request:

https://github.com/apache/storm/pull/429#issuecomment-74032874
  
This patch allows a worker to properly detect that the connection to a peer 
has become unavailable -- for whatever reason (the remote worker is dead or 
restarting, there was a network glitch, etc.). Also, any reconnection attempts 
are now asynchronous, so reconnecting will not block the worker's other 
activities (such as sending messages to other workers it is still connected to).
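
The async reconnection idea can be sketched as follows, using a background
scheduler so the send path never waits on a dead peer. `AsyncReconnector` and
its parameters are hypothetical names for illustration, not the patch's code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run reconnect attempts on a background thread so the
// worker's other activities (e.g. sends to healthy peers) are never blocked.
class AsyncReconnector {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Retries connectAttempt with a fixed delay until it succeeds or retries run out. */
    void reconnect(Runnable connectAttempt, int remainingRetries, long delayMillis) {
        scheduler.schedule(() -> {
            try {
                connectAttempt.run();            // try to (re)establish the channel
            } catch (RuntimeException e) {
                if (remainingRetries > 0) {      // reschedule instead of blocking callers
                    reconnect(connectAttempt, remainingRetries - 1, delayMillis);
                }
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdown();
    }
}
```

The key design point matches the comment above: the caller returns immediately
after scheduling, and retries happen off the worker's critical path.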

So to your question: this patch includes the case of the worker that 
remained alive trying to (re)connect to a dead peer.

Does that help?








