Re: Importing Apache Storm to Intellij

2016-07-19 Thread Cody Innowhere
The storm-core dependency in the storm-starter module's pom.xml has "provided"
scope by default; you may want to change it to "compile" so that storm-core is
on the runtime classpath when you run topologies from IntelliJ.

On Tue, Jul 19, 2016 at 9:17 PM, Walid Aljoby <
walid_alj...@yahoo.com.invalid> wrote:

>
> Hi everyone,
> I am trying to run WordCountTopology from IntelliJ and I ran into the following
> runtime error. Can anyone please see how it can be fixed?
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/storm/topology/IRichSpout
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:195)
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:122)
> Caused by: java.lang.ClassNotFoundException: org.apache.storm.topology.IRichSpout
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     ... 3 more
>
> Thank You
> --
> Regards
> WA


Re: [DISCUSS] Maintaining example topologies for external components

2016-06-30 Thread Cody Innowhere
We can keep the examples in their respective modules and gather the example
binaries (jars) together into one directory (each in a sub-directory) by adding
a Maven task.

On Thu, Jun 30, 2016 at 10:12 PM, Jungtaek Lim  wrote:

> Thanks Abhishek for bringing up a good topic to discuss.
> Personally I'm fine either way. The thing I'd just like to see is having a
> general rule across all external modules.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>
> On Thu, Jun 30, 2016 at 11:09 PM, Satish Duggana wrote:
>
> > Hi Xin,
> > I am fine with having multiple submodules in samples/examples module.
> >
> > Thanks,
> > Satish.
> >
> > On 6/30/16, 6:51 PM, "Xin Wang"  wrote:
> >
> > Hi Satish,
> > Putting them in one `examples` directory doesn't mean they must be in the
> > same Java Maven project. We can create several projects, e.g.
> > examples/storm-opentsdb, examples/storm-redis.
> >
> > 2016-06-30 21:04 GMT+08:00 Xin Wang :
> >
> > > +1 for the former.
> > >
> > > * Examples having the same entry point will be very convenient for users
> > > looking for them.
> > > * Referring to other projects like Spark, it has an `examples` module
> > > including `streaming`, `sql`, and `ml`.
> > >
> > > Also, I'm happy to take part.
> > >
> > > Thanks.
> > > Xin
> > >
> > > 2016-06-30 16:39 GMT+08:00 Abhishek Agarwal :
> > >
> > >> Hi all,
> > >> Right now the example topologies/classes for some external modules are
> > >> being put up in the test folder itself. The problem I see is that,
> > >> -> the example code isn't really test code so test folder isn't the
> > right
> > >> fit.
> > >> -> people, who are looking for example code, may not find the example
> > >> code.
> > >>
> > >> I can see two solutions -
> > >>  -> Have module level example directories e.g.
> examples/storm-opentsdb,
> > >> examples/storm-redis
> > >> -> Have an examples folder within module itself and example topologies
> > are
> > >> put there.
> > >>
> > >> In any case, it is better to document github links of example code in
> > the
> > >> documentation of any external module.
> > >>
> > >> Let me know what you guys think.
> > >>
> > >> --
> > >> Regards,
> > >> Abhishek Agarwal
> > >>
> > >
> > >
> >
> >
> >
>


Re: New Committers/PMC Members: John Fang and Abhishek Agarwal

2016-06-07 Thread Cody Innowhere
Congrats to John & Abhishek!

On Wed, Jun 8, 2016 at 6:20 AM, Harsha  wrote:

> Congrats John & Abhishek !
>
> -Harsha
>
> On Tue, Jun 7, 2016, at 02:14 PM, Aaron.Dossett wrote:
> > I second that!
> >
> > On 6/7/16, 3:12 PM, "Bobby Evans"  wrote:
> >
> > >Congratulations to both of you.  Well deserved.
> > > - Bobby
> > >
> > >On Tuesday, June 7, 2016 3:02 PM, P. Taylor Goetz <
> ptgo...@gmail.com>
> > >wrote:
> > >
> > >
> > > Please join me in welcoming John Fang and Abhishek Agarwal as new
> > >Apache Storm Committers and PMC members.
> > >
> > >John and Abhishek have demonstrated a strong commitment to the Apache
> > >Storm community through active participation and mentoring on the Storm
> > >mailing lists. They have also authored many enhancements and bug fixes
> > >spanning both Storm's core codebase and numerous integration
> > >components.
> > >
> > >Welcome John and Abhishek!
> > >
> > >-Taylor
> > >
> > >
> >
>


Re: [DISCUSSION] opinions on breaking changes on metrics for 1.x

2016-05-23 Thread Cody Innowhere
 example, IMetricsConsumer could be changed to no
> > longer receive built-in metrics on Storm), it will not break backward
> > compatibility from the API side anyway.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > On Fri, May 20, 2016 at 12:57 AM, Abhishek Agarwal wrote:
> >
> > > Sounds good. Having two separate metric reporters may be confusing but
> it
> > > is better than breaking the client code.
> > >
> > > Codahale library allows user to specify frequency per reporter
> instance.
> > > Storm on the other hand allows different reporting frequency for each
> > > metric. How will that mapping work? I am ok to drop the support for
> > custom
> > > frequency for each metric. Internal metrics in storm anyway use same
> > > frequency of reporting.
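
To make the frequency mismatch described above concrete, here is a small sketch
(assuming dropwizard/codahale metrics 3.x and the Storm 1.x metrics API; the
class itself is illustrative): codahale fixes the reporting period per reporter
instance, while Storm's registerMetric takes a time bucket per metric.

```java
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.ConsoleReporter;
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

import org.apache.storm.metric.api.CountMetric;
import org.apache.storm.task.TopologyContext;

public class FrequencyComparison {

    // Codahale: the reporting period is a property of the reporter instance,
    // so every metric in the registry is flushed on the same schedule.
    static void codahaleStyle() {
        MetricRegistry registry = new MetricRegistry();
        Counter emitted = registry.counter("emitted");
        ConsoleReporter reporter = ConsoleReporter.forRegistry(registry).build();
        reporter.start(60, TimeUnit.SECONDS);   // one period for all metrics
        emitted.inc();
    }

    // Storm 1.x: each metric is registered with its own time bucket,
    // so different metrics can be published at different intervals.
    static void stormStyle(TopologyContext context) {
        CountMetric emitted = context.registerMetric("emitted", new CountMetric(), 60);
        CountMetric failed  = context.registerMetric("failed",  new CountMetric(), 10);
        emitted.incr();
        failed.incr();
    }
}
```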
> > >
> > > On Thu, May 19, 2016 at 9:04 PM, Bobby Evans
>  > >
> > > wrote:
> > >
> > > > I personally would like to see that change happen differently for the
> > two
> > > > branches.
> > > > On 1.x we add in a new API for both reporting metrics and collecting
> in
> > > > parallel to the old API.  We leave IMetric and IMetricsConsumer in
> > place
> > > > but deprecated.  As we move internal metrics over from the old
> > interface
> > > to
> > > > the new one, we either keep versions of the old ones in place or we
> > > provide
> > > > a translation shim going from the new to the old.
> > > >
> > > > In 2.x either the old way is gone completely or it is off by default.
> > I
> > > > prefer gone completely.
> > > >
> > > > If we go off of dropwizard/codahale metrics or a layer around them
> like
> > > > was discussed previously it seems fairly straight forward to take
> some
> > of
> > > > our current metrics that all trigger at the same interval and setup a
> > > > reporter that can translate them into the format that was reported
> > > > previously.
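
To make the translation-shim idea quoted above concrete, here is a rough sketch
assuming dropwizard/codahale metrics 3.x and the Storm 1.x DataPoint class; the
reporter class and the way the converted points are delivered downstream are
hypothetical, not an agreed design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.Counter;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricFilter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.ScheduledReporter;
import com.codahale.metrics.Timer;

import org.apache.storm.metric.api.IMetricsConsumer.DataPoint;

/** Hypothetical bridge from new (codahale-based) metrics to the old data-point format. */
public class LegacyFormatReporter extends ScheduledReporter {

    public LegacyFormatReporter(MetricRegistry registry) {
        super(registry, "legacy-format", MetricFilter.ALL, TimeUnit.SECONDS, TimeUnit.MILLISECONDS);
    }

    @Override
    public void report(SortedMap<String, Gauge> gauges,
                       SortedMap<String, Counter> counters,
                       SortedMap<String, Histogram> histograms,
                       SortedMap<String, Meter> meters,
                       SortedMap<String, Timer> timers) {
        List<DataPoint> points = new ArrayList<>();
        gauges.forEach((name, g) -> points.add(new DataPoint(name, g.getValue())));
        counters.forEach((name, c) -> points.add(new DataPoint(name, c.getCount())));
        meters.forEach((name, m) -> points.add(new DataPoint(name, m.getOneMinuteRate())));
        // Hand the converted points to whatever consumed the old-style metrics;
        // the delivery mechanism is left out of this sketch.
        deliverToOldConsumers(points);
    }

    private void deliverToOldConsumers(List<DataPoint> points) {
        points.forEach(p -> System.out.println(p.name + " = " + p.value));
    }
}
```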
> > > > In 1.x to get a full picture of what is happening if your topology
> you
> > > may
> > > > need two separate reporters.  One for the new metrics and one for the
> > > old,
> > > > but it should only be for a short period of time. - Bobby
> > > >
> > > > On Thursday, May 19, 2016 1:00 AM, Cody Innowhere <
> > > e.neve...@gmail.com>
> > > > wrote:
> > > >
> > > >
> > > >  If we want to refactor the metrics system, I think we may have to
> > incur
> > > > breaking changes. We can make it backward compatible but this means
> we
> > > may
> > > > build an adapt layer on top of metrics, or a lot of "if...else..."
> > which
> > > > might be ugly, either way, it might be a pain to maintain the code.
> > > > So I prefer to making breaking changes if we want to build a new
> > metrics
> > > > system, and I'm OK to move JStorm metrics migration phase forward to
> > 1.x,
> > > > and I'm happy to share our design & experiences.
> > > >
> > > > On Thu, May 19, 2016 at 11:12 AM, Jungtaek Lim 
> > > wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > I'd like to see our opinions on breaking changes on metrics for
> 1.x.
> > > > >
> > > > > Some backgrounds here:
> > > > >
> > > > > - As you may have seen, I'm trying to address some places to
> improve
> > > > > metrics without breaking backward compatibility, but it's limited
> due
> > > to
> > > > > interface IMetric which is opened to public.
> > > > > - We're working on Storm 2.0.0, and evaluation / adoption for
> metrics
> > > > > feature of JStorm is planned to phase 2 but we all don't know
> > estimated
> > > > > release date, and I feel it's not near future.
> > > > > - We've just released Storm 1.0.x, so I expected the lifetime of
> > Storm
> > > > 1.0
> > > > > (even 0.9) months or even years.
> > > > >
> > > > > If someone wants to know what exactly things current metrics
> feature
> > > > > matter, please let me know so that I will summarize.
> > > > >
> > > > > I have other ideas on mind to relieve some problems with current
> > > metrics,
> > > > > so I'm also OK to postpone renewal of metrics to 2.0.0 with
> applying
> > > > those
> > > > > workaround ideas. But if we're having willingness to address
> metrics
> > on
> > > > > 1.x, IMO we can consider breaking backward compatibility from 1.x
> for
> > > > once.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Thanks,
> > > > > Jungtaek Lim (HeartSaVioR)
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Abhishek Agarwal
> > >
> >
>
>
>
> --
> Regards,
> Abhishek Agarwal
>
>
>
>
>


Re: [DISCUSSION] opinions on breaking changes on metrics for 1.x

2016-05-18 Thread Cody Innowhere
If we want to refactor the metrics system, I think we may have to incur
breaking changes. We can keep it backward compatible, but that means building
an adaptation layer on top of metrics, or a lot of "if...else..." branches,
which might be ugly; either way, it might be a pain to maintain the code.
So I prefer making breaking changes if we want to build a new metrics system.
I'm OK with moving the JStorm metrics migration phase forward to 1.x,
and I'm happy to share our design & experiences.
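
To illustrate the kind of adaptation layer mentioned above (assuming codahale
metrics 3.x and Storm 1.x; the wrapper class is a made-up sketch, not an actual
design), an existing IMetric could be exposed through the new registry as a Gauge:

```java
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;

import org.apache.storm.metric.api.CountMetric;
import org.apache.storm.metric.api.IMetric;

/** Hypothetical shim: expose a legacy IMetric through the new registry as a Gauge. */
public class IMetricGauge implements Gauge<Object> {

    private final IMetric legacy;

    public IMetricGauge(IMetric legacy) {
        this.legacy = legacy;
    }

    @Override
    public Object getValue() {
        // Note: getValueAndReset() has reset semantics, so the gauge's read
        // interval effectively becomes the legacy metric's time bucket.
        return legacy.getValueAndReset();
    }

    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();
        CountMetric acked = new CountMetric();      // a legacy-style metric
        registry.register("acked", new IMetricGauge(acked));
        acked.incr();
        System.out.println(registry.getGauges().get("acked").getValue()); // -> 1
    }
}
```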

On Thu, May 19, 2016 at 11:12 AM, Jungtaek Lim  wrote:

> Hi devs,
>
> I'd like to gather our opinions on breaking changes on metrics for 1.x.
>
> Some background here:
>
> - As you may have seen, I'm trying to address some places to improve
> metrics without breaking backward compatibility, but it's limited due to
> the IMetric interface, which is exposed to the public.
> - We're working on Storm 2.0.0, and evaluation / adoption of JStorm's
> metrics feature is planned for phase 2, but we don't know the estimated
> release date, and I feel it's not in the near future.
> - We've just released Storm 1.0.x, so I expect the lifetime of Storm 1.0
> (even 0.9) to be months or even years.
>
> If someone wants to know exactly which aspects of the current metrics
> feature matter, please let me know and I will summarize.
>
> I have other ideas in mind to relieve some problems with the current metrics,
> so I'm also OK with postponing the renewal of metrics to 2.0.0 and applying
> those workaround ideas. But if we're willing to address metrics on
> 1.x, IMO we can consider breaking backward compatibility on 1.x this once.
>
> What do you think?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>


Re: [DISCUSSION] Version lines of 1.x

2016-05-09 Thread Cody Innowhere
I'm also +1 for maintaining 1.x branch & master and not maintaining 0.10.x
branch.

On Mon, May 9, 2016 at 1:04 PM, Abhishek Agarwal 
wrote:

> +1. There is lot development effort pending against 1.x branch which will
> get unblocked with 1.1.0 branch. I am assuming, we will not introduce any
> backward incompatible changes in the new branch. But what will be the
> release timeline of 1.1.0? Many of the PRs affect small portion of code.
> Back porting these minor improvements as well as bugs into three branches
> will be counter productive. We might as well work with 1.0.x and keep
> pushing the changes there.
>
> On Mon, May 9, 2016 at 8:50 AM, Jungtaek Lim  wrote:
>
> > What a coincidence! :)
> >
> > My feeling is that this issue would be another representation of 'drop
> > further releases of 0.x'.
> >
> > If we want to have minor and bugfix version separated, we would have at
> > least 3 branches, master (for 2.0), 1.1.x, 1.0.x. I'm seeing that not all
> > bugfixes are applied to 0.10.x when we're pointing 1.x-branch as next
> > release, which means even maintaining 3 branches are not easy. (It should
> > be addressed if we maintain two 1.x version lines.)
> > Moreover, package name change makes us a bit bothering to backport into
> > 0.10.x.
> >
> > So, I'm sorry for 0.x users but I'm in favor of not maintaining 0.10.x
> > branch.
> > I'm curious what we all think about this, too.
> >
> > 2016년 5월 9일 (월) 오전 11:10, P. Taylor Goetz 님이 작성:
> >
> > > Perfect timing as I was thinking about similar things.
> > >
> > > The new metrics APIs being proposed against the 1.x branch would be an
> > API
> > > addition, and IMO should bump the minor version when added. I'd be +1
> for
> > > that.
> > >
> > > I guess it comes down to how many version branches do we want to
> support?
> > > We may need to divide and conquer to support that.
> > >
> > > -Taylor
> > >
> > > > On May 8, 2016, at 9:51 PM, Jungtaek Lim  wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > I have a feeling that we recently try to respect semantic versioning,
> > at
> > > > least separating feature updates and bugfixes.
> > > >
> > > > Recently we released 1.0.0 and 1.0.1 continuously, which was OK since
> > it
> > > > addressed performance regressions and critical bugs. I'm curious that
> > we
> > > > want to maintain minor version line and bugfix version line for 1.x
> > > version
> > > > lines. (meaning two version lines for 1.x)
> > > >
> > > > In fact, we discussed to freeze the feature during releasing 2.0.0,
> but
> > > we
> > > > don't have timeframe for 2.0.0 and phase 1 is not completed yet, so I
> > > don't
> > > > think we can freeze developing or improving the features for 1.x
> lines.
> > > >
> > > > There're many pending pull requests for 1.x (and master, maybe) but
> not
> > > > sure I can merge them into 1.x-branch. In order to address them we
> > should
> > > > settle this.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Jungtaek Lim (HeartSaVioR)
> > >
> >
>
>
>
> --
> Regards,
> Abhishek Agarwal
>


Re: Thought on complete latency

2016-04-29 Thread Cody Innowhere
@Jungtaek,
Yes you're right. Since most of our use cases are multi-thread spouts, this
is not a problem for us.

As for Storm, I think we can use your second workaround for now, and
re-evaluate this when we finish porting the multi-thread spout in Phase 2.

On Fri, Apr 29, 2016 at 6:10 PM, Jungtaek Lim  wrote:

> FYI: Issue STORM-1742 <https://issues.apache.org/jira/browse/STORM-1742>
> and
> related pull request (WIP) #1379
> <https://github.com/apache/storm/pull/1379> are
> available.
>
> I've also done with functional test so that you can easily see what I'm
> claiming.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Fri, Apr 29, 2016 at 4:59 PM, Jungtaek Lim wrote:
>
> > sorry some correction which may confuse someone:
> >
> > With this circumstance there's no issue to keep it as is, since *users
> > normally don't implemented ack() / fail() as long blocking method*.
> >
> > On Fri, Apr 29, 2016 at 4:56 PM, Jungtaek Lim wrote:
> >
> >> Cody,
> >> Thanks for join the conversation.
> >>
> >> If my understanding is right, the way JStorm handles complete latency is
> >> same to what Apache Storm currently does.
> >> Please refer
> >>
> https://github.com/apache/storm/blob/master/storm-core/src/clj/org/apache/storm/daemon/executor.clj#L545
> >>
> >> What I really want to address is when/which component will decide
> >> complete timestamp.
> >>
> >> If my understanding is right, JStorm separates the threads in Spout
> which
> >> one is responsible for outgoing tuples, and other one is responsible for
> >> receiving / handling incoming tuple. With this circumstance there's no
> >> issue to keep it as is, since normally ack() / fail() are not
> implemented
> >> as long blocking method.
> >>
> >> But many users are implementing nextTuple() to sleep long amount of time
> >> to throttle theirselves (yes, I was one of them) and that decision makes
> >> tuples from Acker also waiting amount of time. There're some exceptions:
> >> throttling is on (via backpressure), count of pending tuples are over
> max
> >> spout pending.
> >>
> >> So I guess same issue persists on JStorm with single thread mode Spout.
> >>
> >> Please let me know if I'm missing something.
> >>
> >> Thanks!
> >> Jungtaek Lim (HeartSaVioR)
> >>
> >>
> >>
> >> On Fri, Apr 29, 2016 at 4:32 PM, Cody Innowhere wrote:
> >>
> >>> What we do in JStorm is to set timestamp of a tuple before a spout
> sends
> >>> it
> >>> to downstream bolts, then in spout's ack/fail method, we get current
> >>> timestamp, by subtracting the original ts, we get process latency, note
> >>> this delta time includes network cost from spouts to bolts, ser/deser
> >>> time, bolt process time, network cost between acker to the original
> >>> spout,
> >>> i.e., it's almost the time of tuple's life cycle.
> >>>
> >>> I'm adding this on porting executor.clj. In such a way, we don't need
> to
> >>> care about time sync problem.
> >>>
> >>> On Fri, Apr 29, 2016 at 11:18 AM, Jungtaek Lim 
> >>> wrote:
> >>>
> >>> > One way to confirm my assumption is valid, we could use
> sojourn_time_ms
> >>> > currently provided to queue metrics.
> >>> >
> >>> > We could see sojourn_time_ms in '__receive' metrics of Spout
> component
> >>> to
> >>> > verify how long messages from Acker wait from receive queue in Spout.
> >>> >
> >>> > And we also could estimate "waiting time in transfer queue in Spout"
> by
> >>> > seeing sojourn_time_ms in '__send' metrics of Spout component, and
> >>> estimate
> >>> > "waiting time for ACK_INIT in receive queue in Acker" by seeing
> >>> > sojourn_time_ms in '__receive' metrics of Acker component.
> >>> >
> >>> > Since I don't have clusters/topologies for normal use case I'm not
> sure
> >>> > what normally the values are, but at least, metrics from
> >>> > ThroughtputVsLatency, sojourn_time_ms in '__send' of Spout is often
> >>> close
> >>> > to 0, and sojourn_time_ms in '__receive' of Acker is less than 2ms.
> >>> > If message transfer latency of ACK_INIT message is tiny, su

Re: Thought on complete latency

2016-04-29 Thread Cody Innowhere
What we do in JStorm is to set a timestamp on a tuple before the spout sends it
to downstream bolts; then, in the spout's ack/fail method, we get the current
timestamp and subtract the original one to get the process latency. Note that
this delta includes the network cost from spouts to bolts, ser/deser time, bolt
processing time, and the network cost from the acker back to the original spout,
i.e., it covers almost the whole life cycle of the tuple.

I'm adding this while porting executor.clj. This way, we don't need to care
about the clock synchronization problem.
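
A minimal sketch of that timing scheme on the Storm 1.x spout API; the spout,
its in-memory message-id map, and the emitted values are illustrative only, not
JStorm's actual code.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

/** Illustrative spout that measures a tuple's full life cycle on ack/fail. */
public class TimedSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;
    // msgId -> timestamp taken just before emit
    private final Map<Object, Long> startTimes = new ConcurrentHashMap<>();

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        Object msgId = UUID.randomUUID().toString();
        startTimes.put(msgId, System.currentTimeMillis());  // stamp before sending downstream
        collector.emit(new Values("hello"), msgId);
    }

    @Override
    public void ack(Object msgId) {
        Long start = startTimes.remove(msgId);
        if (start != null) {
            long latencyMs = System.currentTimeMillis() - start;  // covers the whole tuple tree
            System.out.println("complete latency ~ " + latencyMs + " ms");
        }
    }

    @Override
    public void fail(Object msgId) {
        startTimes.remove(msgId);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}
```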

On Fri, Apr 29, 2016 at 11:18 AM, Jungtaek Lim  wrote:

> One way to confirm my assumption is valid, we could use sojourn_time_ms
> currently provided to queue metrics.
>
> We could see sojourn_time_ms in '__receive' metrics of Spout component to
> verify how long messages from Acker wait from receive queue in Spout.
>
> And we also could estimate "waiting time in transfer queue in Spout" by
> seeing sojourn_time_ms in '__send' metrics of Spout component, and estimate
> "waiting time for ACK_INIT in receive queue in Acker" by seeing
> sojourn_time_ms in '__receive' metrics of Acker component.
>
> Since I don't have clusters/topologies for normal use case I'm not sure
> what normally the values are, but at least, metrics from
> ThroughtputVsLatency, sojourn_time_ms in '__send' of Spout is often close
> to 0, and sojourn_time_ms in '__receive' of Acker is less than 2ms.
> If message transfer latency of ACK_INIT message is tiny, sum of latencies
> on option 2 would be also tiny, maybe less than 5ms (just an assumption).
>
> I really like to see those metrics values (including sojourn_time_ms in
> '__receive' of Bolts) from various live topologies which handles normal use
> cases to make my assumption solid. Please share if you're logging those
> metrics.
>
> I'll try to go on 2) first, but still open to any ideas / opinions /
> objections.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Fri, Apr 29, 2016 at 9:38 AM, Jungtaek Lim wrote:
>
> > Roshan,
> >
> > Thanks for sharing your thought.
> > About your thoughts I'm in favor of 1), that's what my sketch is trying
> to
> > achieve.
> >
> > If we agree to go on 1), IMO the options I stated are clear. Let me
> > elaborate more.
> >
> > Root tuple has been made from "Spout" and on definition of 'complete
> > latency' tuple tree is considered as complete from "Acker". Since start
> > point and end point are occurring different components, we should
> > tolerate either "latency of handling ACK_INIT between Spout and Acker"
> > (which changes start point to Acker) or "time variation between machine
> > which Spout is running on and machine which Acker is running on". I think
> > there's no way to avoid both of two, so we should just choose which is
> > smaller to be easier to ignore. I agree it could feel tricky for us.
> >
> > I found some answers / articles claiming there could be sub-millisecond
> > precision within same LAN if machines are syncing from same ntp server,
> and
> > other articles claiming hundreds of millisecond precision which is not
> > acceptable to tolerate.
> > I guess Storm doesn't require machines to be synched with same time, so
> it
> > will be new requirement to set up cluster.
> >
> > And latency of handling ACK_INIT between Spout and Acker is up to
> hardware
> > cluster configurations, but normally we place machines to same rack or
> same
> > switch, or at least group to same LAN which shows low latency.
> > So it's up to "waiting time in transfer queue in Spout" and "waiting time
> > for ACK_INIT in receive queue in Acker". But if we don't want to get into
> > too deeply, I guess this would be fine for normal situation, since Acker
> is
> > lightweight and should be keep up the traffic.
> >
> > - Jungtaek Lim (HeartSaVioR)
> >
> >
> > On Fri, Apr 29, 2016 at 5:41 AM, Roshan Naik wrote:
> >
> >> IMO, avoiding the time variation on machines makes total sense. But I
> feel
> >> that this is a tricky question.
> >>
> >>
> >> Couple more thoughts:
> >>
> >> 1)  As per
> >>
> >>
> http://storm.apache.org/releases/current/Guaranteeing-message-processing.ht
> >> ml
> >> <
> http://storm.apache.org/releases/current/Guaranteeing-message-processing.html
> >
> >>
> >> "Storm can detect when the tree of tuples is fully processed and can ack
> >> or fail the spout tuple appropriately."
> >>
> >>
> >> That seems to indicate that when the ACKer has received all the
> necessary
> >> acks, then it considers the tuple fully processed. If we go by that, and
> >> we define complete latency as the time taken to fully process a tuple,
> >> then it is not necessary to include the time it takes for the ACK to be
> >> delivered to spout.
> >>
> >>
> >> 2) If you include the time it takes to deliver the ACK to the spout,
> then
> >> we also need to wonder if we should include the time that the spout
> takes
> >> to process the ACK() call. I am unclear if the spout.ack() throws an
> >> exception what that means to the idea of 'fully processed'. Here you can
> >> compu

Re: Publishing cluster stats on Apache Storm

2016-04-18 Thread Cody Innowhere
Jungtaek,
Thanks for your explanation; still +1 for your proposal.
Moreover, I'd like to hear from Storm users about this, especially their use
cases for metrics, so that we can see what's best to do.

On Tue, Apr 19, 2016 at 11:19 AM, Jungtaek Lim  wrote:

> Cody,
> Thanks for giving an opinion.
>
> I guess there was a miscommunication between you and me.
> This shouldn't be related to topology. Plugin should be set up via cluster
> configuration, and singular daemon like Nimbus (master) or UI should
> publish cluster metrics to the plugin.
> I agree Nimbus should have this feature, but only master should publish the
> cluster metrics so we should handle it.
>
> Btw, since we don't have built-in metrics storage, we can't provide query
> API with time-series fashion.
> I guess publishing (pushing) seems the better way what we can address
> without huge modification.
> (Did you mean Nimbus has query API which queries the external storage?)
>
> Please note that it's an idea for improvement against 1.x, not 2.0.
> In phase 2 we should evaluate and replace metrics feature with JStorm or
> another after porting to Java.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Tue, Apr 19, 2016 at 11:59 AM, Cody Innowhere wrote:
>
> > I'm +1 for publishing cluster stats, but don't quite understand your
> > implementation.
> > Do you mean that a topology which wants cluster stats can get cluster
> stats
> > by implementing the pluggable consumer/reporter?
> > Personally I would prefer to put this role into nimbus (can be pluggable
> > too) since it's the responsibility of nimbus to take care of all
> > topologies, naturally all topology stats. Then we expose API's from
> nimbus
> > for external queries.
> >
> > On Tue, Apr 19, 2016 at 9:47 AM, Jungtaek Lim  wrote:
> >
> > > Hi devs,
> > >
> > > While Storm publishes topology metrics by metrics consumers, Storm
> > doesn't
> > > publish cluster metrics any way.
> > >
> > > Therefore, I'd like to introduce the feature of publishing cluster
> stats
> > to
> > > pluggable consumers so that users can also push to external storages
> and
> > > query on it or even configure dashboard.
> > > (For topology stats it's supported via MetricsConsumer.)
> > >
> > > I'm seeing some workarounds like having their own reporter process
> > polling
> > > cluster information from Nimbus or REST API.
> > > (For example, Ambari has cluster metrics reporter for Storm
> > > <
> > >
> >
> https://github.com/apache/ambari/blob/trunk/ambari-metrics/ambari-metrics-storm-sink/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsReporter.java
> > > >
> > > but it relies on custom build of Storm.)
> > > Yes it works anyway but if Storm provides the feature naturally, with
> > > pluggable way like MetricsConsumer it would be great for users who want
> > to
> > > configure dashboard regarding this.
> > >
> > > I also saw that STORM-1158 <
> > > http://issues.apache.org/jira/browse/STORM-1158>
> > > adds metrics reporter for internal actions, but it has some limitation
> on
> > > it.
> > >
> > > - It focuses how many requests the daemon receives, not cluster stats.
> > > - It requires reporter to use codahale metrics. Moreover it shades
> > > codahale-metrics so other implementations of reporter may not work
> > properly
> > > if it uses other codahale-metrics plugin like ganglia.
> > >
> > > So I'm planning to add metrics feature of cluster stat by not relying
> on
> > > daemon metrics reporter, but introduce new pluggable consumer/reporter.
> > >
> > > What do you think? Do you have opinion that we need to add cluster
> stats
> > to
> > > daemon metrics reporter?
> > >
> > > Thanks,
> > > Jungtaek Lim (HeartSaVioR)
> > >
> >
>


Re: Publishing cluster stats on Apache Storm

2016-04-18 Thread Cody Innowhere
I'm +1 for publishing cluster stats, but I don't quite understand your
implementation.
Do you mean that a topology which wants cluster stats can get them by
implementing the pluggable consumer/reporter?
Personally I would prefer to put this role into nimbus (it can be pluggable
too), since it's nimbus's responsibility to take care of all topologies and,
naturally, all topology stats. Then we expose APIs from nimbus for external
queries.
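
Below is a hypothetical sketch of what the pluggable consumer/reporter side of
this proposal could look like; the ClusterStatsConsumer interface and the stats
map are made up for illustration and are not an existing Storm API.

```java
import java.util.Map;

/**
 * Hypothetical plugin interface for the proposal above -- not an existing Storm API.
 * A singular daemon (nimbus or UI) would call report() periodically with
 * cluster-level numbers such as supervisor count, total/used slots, and uptime.
 */
public interface ClusterStatsConsumer {

    /** Called once when the daemon loads the plugin from cluster configuration. */
    void prepare(Map<String, Object> daemonConf);

    /** Called on every publish interval with a snapshot of cluster-level stats. */
    void report(long timestampMs, Map<String, Number> clusterStats);

    /** Called when the daemon shuts down. */
    void cleanup();
}

/** Trivial implementation that just logs the snapshot. */
class LoggingClusterStatsConsumer implements ClusterStatsConsumer {
    public void prepare(Map<String, Object> daemonConf) { }

    public void report(long timestampMs, Map<String, Number> clusterStats) {
        clusterStats.forEach((k, v) -> System.out.println(timestampMs + " " + k + "=" + v));
    }

    public void cleanup() { }
}
```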

On Tue, Apr 19, 2016 at 9:47 AM, Jungtaek Lim  wrote:

> Hi devs,
>
> While Storm publishes topology metrics by metrics consumers, Storm doesn't
> publish cluster metrics any way.
>
> Therefore, I'd like to introduce the feature of publishing cluster stats to
> pluggable consumers so that users can also push to external storages and
> query on it or even configure dashboard.
> (For topology stats it's supported via MetricsConsumer.)
>
> I'm seeing some workarounds like having their own reporter process polling
> cluster information from Nimbus or REST API.
> (For example, Ambari has cluster metrics reporter for Storm
> <
> https://github.com/apache/ambari/blob/trunk/ambari-metrics/ambari-metrics-storm-sink/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsReporter.java
> >
> but it relies on custom build of Storm.)
> Yes it works anyway but if Storm provides the feature naturally, with
> pluggable way like MetricsConsumer it would be great for users who want to
> configure dashboard regarding this.
>
> I also saw that STORM-1158 <
> http://issues.apache.org/jira/browse/STORM-1158>
> adds metrics reporter for internal actions, but it has some limitation on
> it.
>
> - It focuses how many requests the daemon receives, not cluster stats.
> - It requires reporter to use codahale metrics. Moreover it shades
> codahale-metrics so other implementations of reporter may not work properly
> if it uses other codahale-metrics plugin like ganglia.
>
> So I'm planning to add metrics feature of cluster stat by not relying on
> daemon metrics reporter, but introduce new pluggable consumer/reporter.
>
> What do you think? Do you have opinion that we need to add cluster stats to
> daemon metrics reporter?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>


Re: RE: Question on Metrics Server to Alibaba team

2016-03-29 Thread Cody Innowhere
@Harsha,
Currently we already use RocksDB to store time series data rather than just
the latest window values.

@Bobby,
I will think about HA and post a detailed document for review (together
with MetricUploader interface) later.

On Wed, Mar 30, 2016 at 9:35 AM, Harsha  wrote:

> Another thing to consider is to store time series data, rather than the current
> approach where we store 1min, 10min, and 3hrs windows, and definitely not to
> depend on external storage such as HDFS.
>
> On Fri, Mar 25, 2016, at 06:43 AM, Bobby Evans wrote:
> > My concern is really around how much time/effort it is to get to a final
> > solution, and to ultimately maintain/support that solution.  If I was
> > doing this from scratch I would probably pull something off of the shelf
> > that is tested and has an entire community supporting it instead of
> > writing something ourselves from scratch.  But in this case we have a
> > solution from JStorm, that we know works.  Because this is the backend
> > that we are talking about we can switch things out later on if we need
> > to.  Like I said before I am fine with using the JStorm code initially.
> > I mostly want to be sure of a few things.
> > 1. The metrics interface we expose to end users is well thought out and
> > can be extended in the future.2. The interfaces that connect this front
> > end to the back end are though out and we could replace the back end if
> > needed.3. The solution offers some level of high availability.  If Nimbus
> > a worker, etc. crash it is OK to lose some data, but we don't want to
> >  - Bobby
> >
> > On Friday, March 25, 2016 6:26 AM, Cody Innowhere
> >  wrote:
> >
> >
> >  Bobby,
> > I understand your concern. Still, I think our metrics design in JStorm
> > can
> > work without any external service, as I mentioned above, we can store
> > metrics in rocksdb on nimbus server. A rough thought will be: we store
> > the
> > latest 1 hour of 1-min window data, 10 hours of 10-min window data, 5
> > days
> > of 2-hour window data, 30 days of 1-day window, etc. And if there's the
> > need to sync metrics data between nimbus servers, we can add a sync
> > thread
> > to handle nimbus fail-over, since it's just metrics data that don't
> > really
> > matter too much, we can use a plain simple sync model.
> >
> > The external service is another option to end users, if users feel it's
> > important (or maybe their business built on top of storm is very
> > important), they can use this external service to build their own monitor
> > system which can be more useful than the original solution shipped with
> > storm.
> >
> > On Fri, Mar 25, 2016 at 2:09 AM, Bobby Evans
> > 
> > wrote:
> >
> > > The problem is that we want something for storm that can work out of
> the
> > > box, ideally without some other complicated external service (except
> > > zookeeper which we already have, and is not actually that complex to
> setup
> > > and run).
> > > If we feel that we must have some external state store that is required
> > > for storm to run, then we need to make the decision carefully and
> > > deliberately.
> > >  - Bobby
> > >
> > >On Wednesday, March 23, 2016 8:37 AM, John Fang <
> > > xiaojian@alibaba-inc.com> wrote:
> > >
> > >
> > >  Sorry , I misunderstand it. We will make H/A for TopologyMaster. And
> > > metric meta will store at HDFS,  So the metrics meta won't rely on the
> > > nimbus. It can enhance the stability of the metric system.
> > >
> > > -----Original Message-----
> > > From: Cody Innowhere [mailto:e.neve...@gmail.com]
> > > Sent: March 23, 2016 19:59
> > > To: dev@storm.apache.org
> > > Subject: Re: Question on Metrics Server to Alibaba team
> > >
> > > If we don't rely on any external system, our metrics system is still
> > > available but will store metrics meta/data in rocksdb on nimbus
> servers.
> > > There will be limits though, for example, we cannot store metrics data
> all
> > > through the topology lifecycle, because rocksdb is only a KV storage,
> it
> > > may not support efficient scan operations and too much data in local
> disk
> > > may bring in extra IO overhead, so we may have to store latest 1hour
> of m1
> > > data, 6 hours of m10 data as such (currently not implemented in
> JStorm, but
> > > quite easy to do this).
> > >
> > > TopologyMaster is merely a channel for registering/computing/uploading

Re: RE: Question on Metrics Server to Alibaba team

2016-03-25 Thread Cody Innowhere
Bobby,
I understand your concern. Still, I think our metrics design in JStorm can
work without any external service; as I mentioned above, we can store
metrics in RocksDB on the nimbus server. A rough thought would be: we store the
latest 1 hour of 1-min window data, 10 hours of 10-min window data, 5 days
of 2-hour window data, 30 days of 1-day window data, etc. And if there's a
need to sync metrics data between nimbus servers, we can add a sync thread
to handle nimbus fail-over; since it's just metrics data that doesn't really
matter too much, we can use a plain, simple sync model.

The external service is another option for end users. If users feel it's
important (or maybe the business they've built on top of Storm is very
important), they can use this external service to build their own monitoring
system, which can be more useful than the original solution shipped with
Storm.
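
A rough sketch of that kind of local storage, assuming the rocksdbjni library
on the classpath; the key layout, zero-padding, and class name are illustrative
guesses, not JStorm's actual schema, and retention/rollups are omitted.

```java
import java.nio.charset.StandardCharsets;

import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

/** Illustrative time-series store: one key per (metric, minute), scanned by prefix. */
public class LocalMetricsStore implements AutoCloseable {

    private final RocksDB db;

    public LocalMetricsStore(String path) throws RocksDBException {
        RocksDB.loadLibrary();
        this.db = RocksDB.open(new Options().setCreateIfMissing(true), path);
    }

    /** Key = "<metricName>|<epochMinute>" (zero-padded so keys sort by time). */
    public void put(String metric, long epochMinute, double value) throws RocksDBException {
        byte[] key = (metric + "|" + String.format("%012d", epochMinute))
                .getBytes(StandardCharsets.UTF_8);
        byte[] val = Double.toString(value).getBytes(StandardCharsets.UTF_8);
        db.put(key, val);
    }

    /** Scan the last hour of 1-min points for a metric. */
    public void printLastHour(String metric, long nowMinute) {
        byte[] from = (metric + "|" + String.format("%012d", nowMinute - 60))
                .getBytes(StandardCharsets.UTF_8);
        RocksIterator it = db.newIterator();
        for (it.seek(from); it.isValid(); it.next()) {
            String key = new String(it.key(), StandardCharsets.UTF_8);
            if (!key.startsWith(metric + "|")) {
                break;  // moved past this metric's key range
            }
            System.out.println(key + " -> " + new String(it.value(), StandardCharsets.UTF_8));
        }
    }

    @Override
    public void close() {
        db.close();
    }
}
```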

On Fri, Mar 25, 2016 at 2:09 AM, Bobby Evans 
wrote:

> The problem is that we want something for storm that can work out of the
> box, ideally without some other complicated external service (except
> zookeeper which we already have, and is not actually that complex to setup
> and run).
> If we feel that we must have some external state store that is required
> for storm to run, then we need to make the decision carefully and
> deliberately.
>  - Bobby
>
> On Wednesday, March 23, 2016 8:37 AM, John Fang <
> xiaojian@alibaba-inc.com> wrote:
>
>
>  Sorry, I misunderstood it. We will make H/A for TopologyMaster, and the
> metric meta will be stored in HDFS, so the metrics meta won't rely on the
> nimbus. It can enhance the stability of the metric system.
>
> -----Original Message-----
> From: Cody Innowhere [mailto:e.neve...@gmail.com]
> Sent: March 23, 2016 19:59
> To: dev@storm.apache.org
> Subject: Re: Question on Metrics Server to Alibaba team
>
> If we don't rely on any external system, our metrics system is still
> available but will store metrics meta/data in rocksdb on nimbus servers.
> There will be limits though, for example, we cannot store metrics data all
> through the topology lifecycle, because rocksdb is only a KV storage, it
> may not support efficient scan operations and too much data in local disk
> may bring in extra IO overhead, so we may have to store latest 1hour of m1
> data, 6 hours of m10 data as such (currently not implemented in JStorm, but
> quite easy to do this).
>
> TopologyMaster is merely a channel for registering/computing/uploading
> metrics to nimbus, so if a TM goes down, the topology metrics will be
> unavailable for a while before it gets pulled up somewhere else(for a
> normal failover case, this should be very fast), while supervisor/nimbus
> metrics are unaffected as they're sent to nimbus via thrift interface. As
> long as TM is back, the topology metrics will be available again.
>
> Currently JStorm does sync metrics meta but metrics data between multiple
> nimbus serers is not synced. So under a nimbus failure, possibly we may
> lose some metrics data.
>
>
> On Wed, Mar 23, 2016 at 3:19 PM, Jungtaek Lim  wrote:
>
> > John,
> >
> > My concern is H/A of metrics on Storm by default. (I'm not 100% sure
> > Bobby pointed out same things.)
> >
> > Since Apache Storm has been used by various users so that we can't
> > assume that users have knowledges of external systems (including
> > Hadoop ecosystem, personal opinion) and operate them smoothly.
> > It reminds me about the importance to keep in mind about default.
> >
> > Therefore, I'm curious that new metrics feature of JStom can work
> > smoothly without external system (HBase / OTS). And love to see it
> > supports H/A without other systems, or users have to tolerate lost of
> > metrics for some scenarios.
> >
> > I guess this may be valid questions on H/A (as far as my understanding
> > of design doc is right): How metrics work when TopologyMaster is down?
> > And how metrics work when failover of Nimbus occurs?
> >
> > Personally I don't mind losing metrics for short durations (just want
> > to check availability of H/A), but failure shouldn't mess up whole
> metrics.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > On Wed, Mar 23, 2016 at 3:39 PM, John Fang wrote:
> >
> > > @ Bobby Evans Jstorm code has experienced a lot of tests over the
> > > past
> > few
> > > years, espatially HA and scalability. We have done a lot of
> > > optimization about Metrics. The performance is better than Flink in
> > > my tests. In my personal opinion, the metric in jstorm offers very
> > > much informations. And the metric can tell us where is the bottleneck
> whe

Re: Question on Metrics Server to Alibaba team

2016-03-23 Thread Cody Innowhere
trics are passed to Nimbus and Nimbus cached metrics, which implies
> > we can treat all metrics as same, and we can also provide built-in
> metrics
> > (including custom metrics) to users via REST API, too.
> >
> > I thought about standalone metrics server process which handles whole
> > metric works (maybe TopologyMaster + Nimbus on design doc), but if
> current
> > implementation of metric feature on JStorm can take care of what I'm
> > assuming, I guess it's great enough.
> >
> > Since I don't know about TopologyMaster, I just wonder that there're any
> > SPOFs (including soft) and how metrics work when if component of SPOF
> goes
> > down.
> > Since Cody gives digging point to take a look at, we can evaluate that
> > feature before phase 2.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > On Tue, Mar 22, 2016 at 1:36 AM, Harsha wrote:
> >
> > > One of the goals of this work and probably can be addressed in
> > > separate jira is how the topology metrics reporter works. Today its a
> > > bolt thats part of a topology graph that means its another node in the
> > > Topology DAG that needs be tuned for better performance. Some of our
> > > users took performance hits by deploying topology metrics reporter
> > > that can send metrics to Ganglia. Ideally this collection should be
> > > asynchronous and not be a node in topology DAG.
> > >
> > > Shipping default metrics server and along with pluggable option for
> > > users who wants to graphite or other timeline servers should be the
> > > goal.
> > >
> > > --Harsha
> > >
> > >
> > > On Mon, Mar 21, 2016, at 08:49 AM, Abhishek Agarwal wrote:
> > > > @Cody - The design looks good. Does the design allow to aggregate
> > > > metrics at the task/executor level? Basically, number of distinct
> > > > metrics is proportional to the number of distinct tasks, did you
> > > > ever run into such a use case?
> > > >
> > > >
> > > > On Mon, Mar 21, 2016 at 8:46 PM, Cody Innowhere
> > > > 
> > > > wrote:
> > > >
> > > > > Also, you can read the code from our latest release JStorm 2.1.1.
> > > > >
> > > > > On Mon, Mar 21, 2016 at 11:10 PM, Cody Innowhere
> > > > > 
> > > > > wrote:
> > > > >
> > > > > > @Jungtaek,
> > > > > > We did some tests on codahale metrics, compared to
> > > > > > meters/histograms, counters are quite fast. So we mainly focused
> > > > > > on the optimization of
> > > > > meters
> > > > > > and histograms (they are indeed very slow) including double
> > > > > > sampling, changing the clock from ns (System.nanoTime) to ms,
> etc.
> > > > > > You can take a look at the
> > > > > > "com.alipay.dw.jstorm.example.sequence.bolt.TotalCount" class of
> > > > > > our sequence-split-merge example code, as the client code entry
> > > > > > to
> > > metrics.
> > > > > > After that, you may dig to TopologyMaster class, which is still
> > > > > > part
> > > of a
> > > > > > topology, and then to TopologyMetricsRunnable, which is a part
> > > > > > of
> > > nimbus
> > > > > > server, finally to MetricUploader plugin, this is where the
> > > > > > metrics interfere with our "metrics server". Still, there're
> > > > > > some nits in the
> > > > > code,
> > > > > > but I think that should be no big problem.
> > > > > >
> > > > > > I'd also like to point out that our "metrics server" is not
> > > > > > strictly
> > > a
> > > > > > real metrics server, since most of the duty lies on nimbus
> > > > > > server and topology master, it's more appropriate to call it
> > metrics storage.
> > > The
> > > > > main
> > > > > > reason for this is that we don't want to make a heavy-weight
> > > > > > metrics
> > > > > server
> > > > > > out of JStorm, and this makes us very easy to maintain (we have
> > > > > > teams
> > > > > that
> > > > > > specifically maintain HBase/OTS in Alibaba since they're so
> > >

Re: Question on Metrics Server to Alibaba team

2016-03-21 Thread Cody Innowhere
Also, you can read the code from our latest release JStorm 2.1.1.

On Mon, Mar 21, 2016 at 11:10 PM, Cody Innowhere 
wrote:

> @Jungtaek,
> We did some tests on codahale metrics, compared to meters/histograms,
> counters are quite fast. So we mainly focused on the optimization of meters
> and histograms (they are indeed very slow) including double sampling,
> changing the clock from ns (System.nanoTime) to ms, etc.
> You can take a look at the
> "com.alipay.dw.jstorm.example.sequence.bolt.TotalCount" class of our
> sequence-split-merge example code, as the client code entry to metrics.
> After that, you may dig to TopologyMaster class, which is still part of a
> topology, and then to TopologyMetricsRunnable, which is a part of nimbus
> server, finally to MetricUploader plugin, this is where the metrics
> interfere with our "metrics server". Still, there're some nits in the code,
> but I think that should be no big problem.
>
> I'd also like to point out that our "metrics server" is not strictly a
> real metrics server, since most of the duty lies on nimbus server and
> topology master, it's more appropriate to call it metrics storage. The main
> reason for this is that we don't want to make a heavy-weight metrics server
> out of JStorm, and this makes us very easy to maintain (we have teams that
> specifically maintain HBase/OTS in Alibaba since they're so commonly used
> in production).
>
> On Mon, Mar 21, 2016 at 10:54 PM, Jungtaek Lim  wrote:
>
>> Thanks Cody and Bobby for the explanation.
>>
>> Cody,
>> I took a look at design doc and looks promising, especially it doesn't do
>> sampling when metric type is 'counter'. As far as I heard (I didn't try
>> it)
>> it becomes huge performance hit in Apache Storm when we change sample rate
>> to 1.0.
>> Could you guide the entry point of metric feature in JStorm to dig into?
>>
>> And just a curiosity, did you consider extracting metric feature (which is
>> done with TopologyMasters and Nimbuses) into separate component?
>> I understood your mention to 'metrics server' as separate component, but
>> after seeing design doc, feature seems to be implemented on Nimbus.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Sat, Mar 19, 2016 at 1:25 AM, Cody Innowhere wrote:
>>
>> > JStorm has provided a MetricUploader interface, which is similar to
>> > IMetricsConsumer in storm, and the underlying implementation is
>> pluggable,
>> > you can use HBase, or any other KV store that supports timeline queries
>> or
>> > even a database(maybe for it's a small cluster). We provide model
>> classes
>> > in jstorm-core, as to what kinds of metrics data need to be stored, it's
>> > totally up to the detailed implementation. Our internal implementation
>> uses
>> > OTS, which is a product of aliyun (https://www.aliyun.com/product/ots/
>> ),
>> > but it's easy to adapt to other implementations.
>> >
>> > On Fri, Mar 18, 2016 at 11:52 PM, Bobby Evans
>> > > >
>> > wrote:
>> >
>> > > Yes we originally wanted to try and use the Hadoop Timeline Server for
>> > > storm metrics feedback to nimbus + UI + history like server.  But it
>> was
>> > > not stable at the time, so we stopped.  For the sake of playing nicely
>> > with
>> > > the rest of the big data ecosystem I would like to see us support it
>> as
>> > an
>> > > option for metrics collection/query, but until the timeline server v2
>> is
>> > > ready and released.  For me the important thing is that we have a
>> decent
>> > > time series DB that comes with storm by default and is pluggable so we
>> > can
>> > > replace it with something else that has similar capabilities in the
>> > future.
>> > >  - Bobby
>> > >
>> > > On Friday, March 18, 2016 10:39 AM, Cody Innowhere <
>> > > e.neve...@gmail.com> wrote:
>> > >
>> > >
>> > >  It's actually in Phase 2 of porting JStorm, but I'm absolutely ok to
>> > > discuss this in advance.
>> > >
>> > > On Fri, Mar 18, 2016 at 11:31 PM, Cody Innowhere > >
>> > > wrote:
>> > >
>> > > > Yes it's already in production.
>> > > > The implementation basically follows the design document in
>> > > > https://issues.apache.org/jira/browse/STORM-1329, you can take a
>> look
>> > &

Re: Question on Metrics Server to Alibaba team

2016-03-21 Thread Cody Innowhere
@Jungtaek,
We did some tests on codahale metrics: compared to meters/histograms,
counters are quite fast. So we mainly focused on optimizing meters
and histograms (they are indeed very slow), including double sampling,
changing the clock from ns (System.nanoTime) to ms, etc.
You can take a look at the
"com.alipay.dw.jstorm.example.sequence.bolt.TotalCount" class of our
sequence-split-merge example code as the client-code entry point to metrics.
After that, you may dig into the TopologyMaster class, which is still part of a
topology, then TopologyMetricsRunnable, which is part of the nimbus
server, and finally the MetricUploader plugin, which is where the metrics
interface with our "metrics server". There are still some nits in the code,
but I think that should be no big problem.

I'd also like to point out that our "metrics server" is not strictly a real
metrics server; since most of the duty lies on the nimbus server and topology
master, it's more appropriate to call it metrics storage. The main reason
for this is that we don't want to make a heavyweight metrics server out of
JStorm, and this makes it very easy for us to maintain (we have teams that
specifically maintain HBase/OTS in Alibaba, since they're so commonly used
in production).
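
A small sketch of the ns-to-ms clock change mentioned above, written against
the codahale metrics 3.x API; it only illustrates the idea and is not JStorm's
actual optimization code.

```java
import com.codahale.metrics.Clock;
import com.codahale.metrics.ExponentiallyDecayingReservoir;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;

/** Coarser millisecond-based clock: cheaper reads than System.nanoTime() on some platforms. */
public class MillisClock extends Clock {

    @Override
    public long getTick() {
        // Clock.getTick() is expected in nanoseconds, so scale the millisecond reading up.
        return System.currentTimeMillis() * 1_000_000L;
    }

    public static void main(String[] args) {
        Clock clock = new MillisClock();
        // Meter and the decaying reservoir both accept a custom clock.
        Meter meter = new Meter(clock);
        Histogram histogram = new Histogram(new ExponentiallyDecayingReservoir(1028, 0.015, clock));
        meter.mark();
        histogram.update(42);
        System.out.println(meter.getCount() + " / " + histogram.getCount());
    }
}
```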

On Mon, Mar 21, 2016 at 10:54 PM, Jungtaek Lim  wrote:

> Thanks Cody and Bobby for the explanation.
>
> Cody,
> I took a look at design doc and looks promising, especially it doesn't do
> sampling when metric type is 'counter'. As far as I heard (I didn't try it)
> it becomes huge performance hit in Apache Storm when we change sample rate
> to 1.0.
> Could you guide the entry point of metric feature in JStorm to dig into?
>
> And just a curiosity, did you consider extracting metric feature (which is
> done with TopologyMasters and Nimbuses) into separate component?
> I understood your mention to 'metrics server' as separate component, but
> after seeing design doc, feature seems to be implemented on Nimbus.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Sat, Mar 19, 2016 at 1:25 AM, Cody Innowhere wrote:
>
> > JStorm has provided a MetricUploader interface, which is similar to
> > IMetricsConsumer in storm, and the underlying implementation is
> pluggable,
> > you can use HBase, or any other KV store that supports timeline queries
> or
> > even a database(maybe for it's a small cluster). We provide model classes
> > in jstorm-core, as to what kinds of metrics data need to be stored, it's
> > totally up to the detailed implementation. Our internal implementation
> uses
> > OTS, which is a product of aliyun (https://www.aliyun.com/product/ots/),
> > but it's easy to adapt to other implementations.
> >
> > On Fri, Mar 18, 2016 at 11:52 PM, Bobby Evans
>  > >
> > wrote:
> >
> > > Yes we originally wanted to try and use the Hadoop Timeline Server for
> > > storm metrics feedback to nimbus + UI + history like server.  But it
> was
> > > not stable at the time, so we stopped.  For the sake of playing nicely
> > with
> > > the rest of the big data ecosystem I would like to see us support it as
> > an
> > > option for metrics collection/query, but until the timeline server v2
> is
> > > ready and released.  For me the important thing is that we have a
> decent
> > > time series DB that comes with storm by default and is pluggable so we
> > can
> > > replace it with something else that has similar capabilities in the
> > future.
> > >  - Bobby
> > >
> > > On Friday, March 18, 2016 10:39 AM, Cody Innowhere <
> > > e.neve...@gmail.com> wrote:
> > >
> > >
> > >  It's actually in Phase 2 of porting JStorm, but I'm absolutely ok to
> > > discuss this in advance.
> > >
> > > On Fri, Mar 18, 2016 at 11:31 PM, Cody Innowhere 
> > > wrote:
> > >
> > > > Yes it's already in production.
> > > > The implementation basically follows the design document in
> > > > https://issues.apache.org/jira/browse/STORM-1329, you can take a
> look
> > > > first and feel free to ask questions.
> > > >
> > > > On Fri, Mar 18, 2016 at 10:19 PM, Jungtaek Lim 
> > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I got something to do with metrics so I'm seeking the pull requests
> > > which
> > > >> addresses metrics.
> > > >> And at #753 <https://github.com/apache/storm/pull/753> I found Cody
> > > said
> > > >> we
> > > >&

Re: Question on Metrics Server to Alibaba team

2016-03-19 Thread Cody Innowhere
Yes it's already in production.
The implementation basically follows the design document in
https://issues.apache.org/jira/browse/STORM-1329, you can take a look first
and feel free to ask questions.

On Fri, Mar 18, 2016 at 10:19 PM, Jungtaek Lim  wrote:

> Hi,
>
> I got something to do with metrics so I'm seeking the pull requests which
> addresses metrics.
> And at #753  I found Cody said
> we
> (maybe it means Alibaba team) are currently working on Metrics Server.
> (I also found comment which said there was some talk while ago around
> integrating Hadoop timeline server. Seems like no one came up with the
> result, and I prefer to avoid big dependency so I'm in favor of Metrics
> Server for now.)
>
> I think that would improve metrics feature of Storm much better, so I'd
> like to see how the work is going. Sure it's only when there's no issue for
> you to work transparently. I just would like to prevent duplication of
> work, and would like to help if needed and possible.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>


Re: Question on Metrics Server to Alibaba team

2016-03-19 Thread Cody Innowhere
It's actually in Phase 2 of porting JStorm, but I'm absolutely ok to
discuss this in advance.

On Fri, Mar 18, 2016 at 11:31 PM, Cody Innowhere 
wrote:

> Yes it's already in production.
> The implementation basically follows the design document in
> https://issues.apache.org/jira/browse/STORM-1329, you can take a look
> first and feel free to ask questions.
>
> On Fri, Mar 18, 2016 at 10:19 PM, Jungtaek Lim  wrote:
>
>> Hi,
>>
>> I got something to do with metrics so I'm seeking the pull requests which
>> addresses metrics.
>> And at #753 <https://github.com/apache/storm/pull/753> I found Cody said
>> we
>> (maybe it means Alibaba team) are currently working on Metrics Server.
>> (I also found comment which said there was some talk while ago around
>> integrating Hadoop timeline server. Seems like no one came up with the
>> result, and I prefer to avoid big dependency so I'm in favor of Metrics
>> Server for now.)
>>
>> I think that would improve metrics feature of Storm much better, so I'd
>> like to see how the work is going. Sure it's only when there's no issue
>> for
>> you to work transparently. I just would like to prevent duplication of
>> work, and would like to help if needed and possible.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>
>


Re: Question on Metrics Server to Alibaba team

2016-03-18 Thread Cody Innowhere
JStorm provides a MetricUploader interface, which is similar to
IMetricsConsumer in Storm, and the underlying implementation is pluggable:
you can use HBase, any other KV store that supports timeline queries, or
even a relational database (for a small cluster, say). We provide the model
classes in jstorm-core; which kinds of metrics data get stored is entirely
up to the concrete implementation. Our internal implementation uses OTS, a
product of Aliyun (https://www.aliyun.com/product/ots/), but it's easy to
adapt to other implementations.
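
To make the plug-in idea concrete, here is a rough sketch; the names and
signatures below are illustrative assumptions, not the actual MetricUploader
API:

    // Illustrative only: not the real JStorm MetricUploader interface.
    // It just shows the plug-in idea: the core hands over metric snapshots,
    // and the implementation decides where to persist them (HBase, OTS, JDBC, ...).
    import java.util.Map;

    interface MetricStore {                                // hypothetical sink abstraction
        void prepare(Map<String, Object> conf);            // e.g. open an HBase/OTS/JDBC connection
        void store(String topologyId, long timestampMs, Map<String, Double> metrics);
        void close();
    }

    // A trivial implementation that just prints; a real one would write to a KV store.
    class StdoutMetricStore implements MetricStore {
        public void prepare(Map<String, Object> conf) { }
        public void store(String topologyId, long timestampMs, Map<String, Double> metrics) {
            System.out.println(topologyId + " @" + timestampMs + " -> " + metrics);
        }
        public void close() { }
    }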

On Fri, Mar 18, 2016 at 11:52 PM, Bobby Evans 
wrote:

> Yes we originally wanted to try and use the Hadoop Timeline Server for
> storm metrics feedback to nimbus + UI + history like server.  But it was
> not stable at the time, so we stopped.  For the sake of playing nicely with
> the rest of the big data ecosystem I would like to see us support it as an
> option for metrics collection/query, but until the timeline server v2 is
> ready and released.  For me the important thing is that we have a decent
> time series DB that comes with storm by default and is pluggable so we can
> replace it with something else that has similar capabilities in the future.
>  - Bobby
>
> On Friday, March 18, 2016 10:39 AM, Cody Innowhere <
> e.neve...@gmail.com> wrote:
>
>
>  It's actually in Phase 2 of porting JStorm, but I'm absolutely ok to
> discuss this in advance.
>
> On Fri, Mar 18, 2016 at 11:31 PM, Cody Innowhere 
> wrote:
>
> > Yes it's already in production.
> > The implementation basically follows the design document in
> > https://issues.apache.org/jira/browse/STORM-1329, you can take a look
> > first and feel free to ask questions.
> >
> > On Fri, Mar 18, 2016 at 10:19 PM, Jungtaek Lim 
> wrote:
> >
> >> Hi,
> >>
> >> I got something to do with metrics so I'm seeking the pull requests
> which
> >> addresses metrics.
> >> And at #753 <https://github.com/apache/storm/pull/753> I found Cody
> said
> >> we
> >> (maybe it means Alibaba team) are currently working on Metrics Server.
> >> (I also found comment which said there was some talk while ago around
> >> integrating Hadoop timeline server. Seems like no one came up with the
> >> result, and I prefer to avoid big dependency so I'm in favor of Metrics
> >> Server for now.)
> >>
> >> I think that would improve metrics feature of Storm much better, so I'd
> >> like to see how the work is going. Sure it's only when there's no issue
> >> for
> >> you to work transparently. I just would like to prevent duplication of
> >> work, and would like to help if needed and possible.
> >>
> >> Thanks,
> >> Jungtaek Lim (HeartSaVioR)
> >>
> >
> >
>
>
>
>


Re: CPU utilization with Trident

2016-03-08 Thread Cody Innowhere
@sam,
CPU usage is tightly coupled to your application logic, i.e., it depends on
whether your topology is CPU-intensive or not.
It is also related to the throughput of your topologies, so it's really hard
to tell whether a given CPU usage figure such as 0.5 is good or bad.

In your case of comparing two topologies, say topology A and topology B, I
can give some rough hints (see the sketch below):
1. If A's CPU usage is higher than B's but its throughput is also higher,
the extra CPU usage is probably worth it for the extra throughput;
2. If A's throughput and B's throughput are almost the same, I'd prefer the
topology with the lower CPU usage.
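
A rough sketch of those two rules in code; the 5% tolerance is an arbitrary
assumption of mine, not a Storm setting:

    // Illustrative only: compare two topologies by throughput and CPU usage.
    public class TopologyComparison {
        /** Returns "A" or "B", following the two rough rules above. */
        static String pickBetter(double throughputA, double cpuA,
                                 double throughputB, double cpuB) {
            boolean sameThroughput =
                Math.abs(throughputA - throughputB) <= 0.05 * Math.max(throughputA, throughputB);
            if (sameThroughput) {
                return cpuA <= cpuB ? "A" : "B";              // rule 2: prefer the lower CPU usage
            }
            return throughputA > throughputB ? "A" : "B";     // rule 1: extra throughput is usually worth extra CPU
        }
    }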

Hope this helps.

On Wed, Mar 9, 2016 at 12:28 PM, sam mohel  wrote:

> What is the importance of CPU utilization after submitted topology as I
> need to compare between two topologies which  is better in performance
>
> And numbers which close to .5 is better or not ?
> How can I know ?
> Thanks
>


Re: [jira] [Commented] (STORM-1579) Got java.lang.ClassCastException when running tests in storm-core

2016-02-25 Thread Cody Innowhere
Running mvn clean install I get the same exceptions. Note that the
exceptions don't affect the final test results; they are just unexpected.

On Fri, Feb 26, 2016 at 11:30 AM, Cody Innowhere 
wrote:

> I was running mvn clean package and it's the latest code from master. I've
> never met such exceptions before pulling the latest code today.
>
> On Fri, Feb 26, 2016 at 11:06 AM, Jungtaek Lim (JIRA) 
> wrote:
>
>>
>> [
>> https://issues.apache.org/jira/browse/STORM-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168334#comment-15168334
>> ]
>>
>> Jungtaek Lim commented on STORM-1579:
>> -
>>
>> Did you run the integration test?
>> `mvn clean install` on current master doesn't reproduce this behavior.
>> Could you pull the latest and try again?
>>
>> > Got java.lang.ClassCastException when running tests in storm-core
>> > -
>> >
>> > Key: STORM-1579
>> > URL: https://issues.apache.org/jira/browse/STORM-1579
>> > Project: Apache Storm
>> >  Issue Type: Bug
>> >  Components: storm-core
>> >Affects Versions: 2.0.0
>> > Environment: Mac OS X, jdk1.7
>> >Reporter: Cody
>> > Fix For: 2.0.0
>> >
>> >   Original Estimate: 72h
>> >  Remaining Estimate: 72h
>> >
>> > Stacktrace:
>> > 125277 [Thread-1736-__eventlogger-executor[4 4]] ERROR
>> o.a.s.m.FileBasedEventLogger - Error setting up FileBasedEventLogger.
>> > java.nio.file.NoSuchFileException:
>> /logs/workers-artifacts/metrics-tester-1-0/1024/events.log
>> > at
>> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>> ~[?:1.7.0_75]
>> > at
>> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>> ~[?:1.7.0_75]
>> > at
>> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>> ~[?:1.7.0_75]
>> > at
>> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>> ~[?:1.7.0_75]
>> > at
>> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:430)
>> ~[?:1.7.0_75]
>> > at java.nio.file.Files.newOutputStream(Files.java:172) ~[?:1.7.0_75]
>> > at java.nio.file.Files.newBufferedWriter(Files.java:2722)
>> ~[?:1.7.0_75]
>> > at
>> org.apache.storm.metric.FileBasedEventLogger.initLogWriter(FileBasedEventLogger.java:51)
>> [classes/:?]
>> > at
>> org.apache.storm.metric.FileBasedEventLogger.prepare(FileBasedEventLogger.java:97)
>> [classes/:?]
>> > at
>> org.apache.storm.metric.EventLoggerBolt.prepare(EventLoggerBolt.java:48)
>> [classes/:?]
>> > at
>> org.apache.storm.daemon.executor$fn__6507$bolt_transfer_fn__6522.invoke(executor.clj:792)
>> [classes/:?]
>> > at clojure.lang.AFn.call(AFn.java:18) [clojure-1.7.0.jar:?]
>> > at org.apache.storm.utils.Utils$6.run(Utils.java:2177) [classes/:?]
>> > at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.3.4#6332)
>>
>
>


Re: [jira] [Commented] (STORM-1579) Got java.lang.ClassCastException when running tests in storm-core

2016-02-25 Thread Cody Innowhere
I was running mvn clean package, and it's the latest code from master. I had
never seen such exceptions before pulling the latest code today.

On Fri, Feb 26, 2016 at 11:06 AM, Jungtaek Lim (JIRA) 
wrote:

>
> [
> https://issues.apache.org/jira/browse/STORM-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168334#comment-15168334
> ]
>
> Jungtaek Lim commented on STORM-1579:
> -
>
> Did you run the integration test?
> `mvn clean install` on current master doesn't reproduce this behavior.
> Could you pull the latest and try again?
>
> > Got java.lang.ClassCastException when running tests in storm-core
> > -
> >
> > Key: STORM-1579
> > URL: https://issues.apache.org/jira/browse/STORM-1579
> > Project: Apache Storm
> >  Issue Type: Bug
> >  Components: storm-core
> >Affects Versions: 2.0.0
> > Environment: Mac OS X, jdk1.7
> >Reporter: Cody
> > Fix For: 2.0.0
> >
> >   Original Estimate: 72h
> >  Remaining Estimate: 72h
> >
> > Stacktrace:
> > 125277 [Thread-1736-__eventlogger-executor[4 4]] ERROR
> o.a.s.m.FileBasedEventLogger - Error setting up FileBasedEventLogger.
> > java.nio.file.NoSuchFileException:
> /logs/workers-artifacts/metrics-tester-1-0/1024/events.log
> > at
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
> ~[?:1.7.0_75]
> > at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> ~[?:1.7.0_75]
> > at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> ~[?:1.7.0_75]
> > at
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
> ~[?:1.7.0_75]
> > at
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:430)
> ~[?:1.7.0_75]
> > at java.nio.file.Files.newOutputStream(Files.java:172) ~[?:1.7.0_75]
> > at java.nio.file.Files.newBufferedWriter(Files.java:2722)
> ~[?:1.7.0_75]
> > at
> org.apache.storm.metric.FileBasedEventLogger.initLogWriter(FileBasedEventLogger.java:51)
> [classes/:?]
> > at
> org.apache.storm.metric.FileBasedEventLogger.prepare(FileBasedEventLogger.java:97)
> [classes/:?]
> > at
> org.apache.storm.metric.EventLoggerBolt.prepare(EventLoggerBolt.java:48)
> [classes/:?]
> > at
> org.apache.storm.daemon.executor$fn__6507$bolt_transfer_fn__6522.invoke(executor.clj:792)
> [classes/:?]
> > at clojure.lang.AFn.call(AFn.java:18) [clojure-1.7.0.jar:?]
> > at org.apache.storm.utils.Utils$6.run(Utils.java:2177) [classes/:?]
> > at java.lang.Thread.run(Thread.java:745) [?:1.7.0_75]
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: How can i monitor topology in storm ui ?

2016-01-12 Thread Cody Innowhere
I'm not sure whether I understand you correctly; if you mean a Storm metrics
monitor that lets you see historical metrics, you may refer to this article:
http://www.michael-noll.com/blog/2013/11/06/sending-metrics-from-storm-to-graphite/
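
If you just need a built-in starting point, registering a metrics consumer
in the topology config is enough to get per-component metrics flowing. A
minimal sketch, assuming Storm 1.x package names (older releases use
backtype.storm); a Graphite consumer like the one in the article is
registered the same way:

    import org.apache.storm.Config;
    import org.apache.storm.metric.LoggingMetricsConsumer;

    public class MetricsConfigExample {
        public static Config buildConfig() {
            Config conf = new Config();
            // Writes the sampled metrics to the metrics log; parallelism hint 1.
            conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
            return conf;
        }
    }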

On Wed, Jan 13, 2016 at 11:21 AM, master researcher 
wrote:

> is there another tool i can monitor topology ?
>
> On Tue, Jan 12, 2016 at 2:40 AM, master researcher 
> wrote:
>
> > Thanks in advance for any help , i have topology submitted it but need to
> > know if i made change int the code an need to submitted new one how can i
> > compare between old and new topology
> >
> > should i compare between it through Executed latency only or what ?
> >
> > if someone has links or vidoes for someone illustrate monitioring well ,
> > i'll apperciate it
> >
> > Waiting Your Reply ...
> >
>


Re: Problem with storm since 4 months

2015-12-17 Thread Cody Innowhere
sam,
Try 'lsof -i4 | grep 6706' to see which process is binding this port.
Also, your setting says that any process that asks for a random port may get
one between 1024 and 65000. If possible, change this range to 10240-65000 so
that no randomly-binding process takes a port below 10240.

On Thu, Dec 17, 2015 at 5:07 PM, sam mohel  wrote:

> i edited to make topology.debug true and got in the supervisor log file
> still
> hasn't start and in the worker log file
>
> 2015-12-17 07:52:23 task [INFO] Emitting: b-7 __system ["startup"]
> 2015-12-17 07:52:23 executor [INFO] Loaded executor tasks b-7:[33 33]
> 2015-12-17 07:52:23 executor [INFO] Preparing bolt b-7:(33)
> 2015-12-17 07:52:23 executor [INFO] Finished loading executor b-7:[33 33]
> 2015-12-17 07:52:23 worker [INFO] Launching receive-thread for
> 5587bcc1-05d4-4d92-ae3d-2a8503cef259:6706
> 2015-12-17 07:52:23 executor [INFO] Prepared bolt b-7:(33)
>
> after finished loading got alot of this lines
>
> 2015-12-17 07:52:27 executor [INFO] Processing received message
> source: __system:-1, stream: __tick, id: {}, [5]
> 2015-12-17 07:52:27 executor [INFO] Processing received message
> source: __system:-1, stream: __tick, id: {}, [5]
>
> Got in the storm ui zeros in emitted and transfered
>
> i executed the command that launch worker and supervisor got
>
> 2015-12-17 07:59:04 executor [INFO] Prepared bolt b-7:(33)
> 2015-12-17 07:59:04 util [ERROR] Async loop died!
>  org.zeromq.ZMQException: Address already in use(0x62)
> at org.zeromq.ZMQ$Socket.bind(Native Method)
> at zilch.mq$bind.invoke(mq.clj:69)
> at backtype.storm.messaging.zmq.ZMQContext.bind(zmq.clj:57)
> at
> backtype.storm.messaging.loader$launch_receive_thread_BANG_$fn__1629.invoke(loader.clj:26)
> at backtype.storm.util$async_loop$fn__465.invoke(util.clj:375)
> at clojure.lang.AFn.run(AFn.java:24)
> at java.lang.Thread.run(Thread.java:701)
> 2015-12-17 07:59:04 util [INFO] Halting process:
>
>
> i read that " It was indeed a port conflict, but not
> with another ZMQ process. It turns out our ephemeral port range was messed
> up on the machines:
>
>
> $ cat /proc/sys/net/ipv4/ip_local_port_range 1024 65000" i'm on ubuntu
> 14.04 i tried it to put 6706 in reserved port but problem still
>
>
> On Thu, Dec 17, 2015 at 9:21 AM, 刘键(Basti Liu) 
> wrote:
>
> > Hi Sam,
> >
> > If the worker(pid=2621) belongs to the topology you just submitted, it
> > means the worker has bound the port "6703" successfully.
> > So there should not be any "binding error". Is there any other problems?
> >
> > Regards
> > Basti
> >
> > -Original Message-
> > From: sam mohel [mailto:sammoh...@gmail.com]
> > Sent: Thursday, December 17, 2015 2:11 PM
> > To: dev@storm.apache.org
> > Subject: Re: Problem with storm since 4 months
> >
> > can i find help ?
> >
> > On Fri, Dec 11, 2015 at 6:32 AM, sam mohel  wrote:
> >
> > > this topology that has problem , i mean this i'm now submitted it
> > >
> > > On Fri, Dec 11, 2015 at 5:45 AM, 刘键(Basti Liu)
> > > 
> > > wrote:
> > >
> > >> This worker (pid=2621) belongs to topology " fsd-1-1449794574".
> > >> Please check if this topology has already been killed. If so, just
> > >> kill this process.
> > >>
> > >> Regards
> > >> Basti
> > >> -Original Message-
> > >> From: sam mohel [mailto:sammoh...@gmail.com]
> > >> Sent: Friday, December 11, 2015 11:18 AM
> > >> To: dev@storm.apache.org
> > >> Subject: Re: Problem with storm since 4 months
> > >>
> > >> is that right command ps aux |grep 2621
> > >> user  2621  7.8  2.7 3444276 108056 pts/12 Sl+  02:42  12:01 java
> > >> -server -Djava.net.preferIPv4Stack=true
> > >> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> > >> -Dlogfile.name=worker-6703.log -Dstorm.home=/home/user/storm-0.8.2
> > >> -Dlog4j.configuration=storm.log.properties -cp
> > >> /home/user/storm-0.8.2/storm-0.8.2.jar:/home/user/storm-0.8.2/lib/com
> > >> mons-exec-1.1.jar:/home/user/storm-0.8.2/lib/carbonite-1.5.0.jar:/hom
> > >> e/user/storm-0.8.2/lib/ring-jetty-adapter-0.3.11.jar:/home/user/storm
> > >> -0.8.2/lib/minlog-1.2.jar:/home/user/storm-0.8.2/lib/hiccup-0.3.6.jar
> > >> :/home/user/storm-0.8.2/lib/commons-lang-2.5.jar:/home/user/storm-0.8
> > >> .2/lib/math.numeric-tower-0.0.1.jar:/home/user/storm-0.8.2/lib/servle
> > >> t-api-2.5-20081211.jar:/home/user/storm-0.8.2/lib/slf4j-log4j12-1.5.8
> > >> .jar:/home/user/storm-0.8.2/lib/commons-logging-1.1.1.jar:/home/user/
> > >> storm-0.8.2/lib/tools.logging-0.2.3.jar:/home/user/storm-0.8.2/lib/lo
> > >> g4j-1.2.16.jar:/home/user/storm-0.8.2/lib/clout-1.0.1.jar:/home/user/
> > >> storm-0.8.2/lib/httpcore-4.1.jar:/home/user/storm-0.8.2/lib/servlet-a
> > >> pi-2.5.jar:/home/user/storm-0.8.2/lib/objenesis-1.2.jar:/home/user/st
> > >> orm-0.8.2/lib/clojure-1.4.0.jar:/home/user/storm-0.8.2/lib/json-simpl
> > >> e-1.1.jar:/home/user/storm-0.8.2/lib/First-Story-Detection-1.0-SNAPSH
> > >> OT.jar:/home/user/storm-0.8.2/lib/httpclient-4.1.1.jar:/home/user/sto
> > >> rm-0.8.2/lib/jzmq-2

Re: How to get number of tasks from TopologyContext

2015-12-16 Thread Cody Innowhere
I've posted the answer on Stack Overflow too, if you don't mind.

On Thu, Dec 17, 2015 at 12:01 AM, Cody Innowhere 
wrote:

> Hi Matthias,
> In JStorm, there's no executor, so 
> TopologyContext.getComponentTasks()
> returns task ids within this component.
>
> As to your questions:
> - do tasks actually have an ID?
> *[Cody] *in JStorm, each task has an taskId of Integer type. Usually,
> each component is assigned a range of task id's (the num is equal to
> component parallelism)
>
>  - if yes, can those IDs be retrieved?
> *[Cody] *Yes, use TopologyContext.getThisTaskId() method
>
>  - can we get at least the number of tasks per operator somehow?
> *[Cody] *Yes, use TopologyContext.getComponentTasks().size()
>
>  - should the above method get renamed?
> *[Cody] *We may discuss this when merging phase starts. You may refer to
> related jira later.
>
>
> On Wed, Dec 16, 2015 at 11:37 PM, Matthias J. Sax 
> wrote:
>
>> Thanks for your feedback.
>>
>> Turns out, the question was related to JStorm... I guess this should be
>> consider for the merge process.
>>
>> > sorry , i find i use the jstorm. storm is no problem. but when i use
>> jstorm,this problem arise
>>
>> -Matthias
>>
>>
>> On 12/16/2015 03:25 PM, Arun Iyer wrote:
>> > TopologyContext.getComponentTasks returns the list of task ids for the
>> component (not executor ids).
>> >
>> > You could just try printing the output of getComponentTasks in the
>> prepare method after doing 'setNumTasks’ (with  task > parallelism)
>> > while building the topology. Worked for me.
>> >
>> > - Arun
>> >
>> >
>> >
>> > On 12/16/15, 6:21 PM, "Matthias J. Sax"  wrote:
>> >
>> >> Hi,
>> >>
>> >> today, the above question appeared on SO:
>> >>
>> https://stackoverflow.com/questions/34309189/how-to-get-the-task-number-and-id-not-the-executor-in-storm
>> >>
>> >> The problem is, that
>> >>
>> >> TopologyContext.getComponentTasks()
>> >>
>> >> returns the IDs of the executors (and not the tasks). The name of the
>> >> method is not chooses very good -- I guess this dates back to the time
>> >> before the separation of tasks and executors...
>> >>
>> >> My question is now:
>> >>
>> >> - do tasks actually have an ID?
>> >> - if yes, can those IDs be retrieved?
>> >> - can we get at least the number of tasks per operator somehow?
>> >> - should the above method get renamed?
>> >>
>> >> As the number of tasks is fix, one could of course collect this
>> >> information an pass it via the Config to
>> >> StormSubmitter.submitTopology(...). However, this is quite a
>> work-around.
>> >>
>> >> Please let me know what you think about it.
>> >>
>> >>
>> >> -Matthias
>> >>
>>
>>
>


Re: How to get number of tasks from TopologyContext

2015-12-16 Thread Cody Innowhere
Hi Matthias,
In JStorm there is no executor, so TopologyContext.getComponentTasks()
returns the task ids within this component.

As to your questions:
- do tasks actually have an ID?
[Cody] In JStorm, each task has a taskId of Integer type. Usually, each
component is assigned a range of task ids (the count equals the component
parallelism).

 - if yes, can those IDs be retrieved?
[Cody] Yes, use the TopologyContext.getThisTaskId() method.

 - can we get at least the number of tasks per operator somehow?
[Cody] Yes, use TopologyContext.getComponentTasks().size().

 - should the above method get renamed?
[Cody] We can discuss this when the merging phase starts; you may refer to
the related JIRA later.
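
A minimal bolt sketch showing those calls; it assumes org.apache.storm
package names (older releases use backtype.storm), and getComponentTasks()
is given the current component id explicitly, as in the Storm API:

    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    public class TaskInfoBolt extends BaseRichBolt {
        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            int myTaskId = context.getThisTaskId();                    // this task's id
            int tasksInComponent =
                context.getComponentTasks(context.getThisComponentId()).size();
            System.out.println("task " + myTaskId + " is one of " + tasksInComponent
                    + " tasks in component " + context.getThisComponentId());
        }

        @Override
        public void execute(Tuple input) { }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) { }
    }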


On Wed, Dec 16, 2015 at 11:37 PM, Matthias J. Sax  wrote:

> Thanks for your feedback.
>
> Turns out, the question was related to JStorm... I guess this should be
> consider for the merge process.
>
> > sorry , i find i use the jstorm. storm is no problem. but when i use
> jstorm,this problem arise
>
> -Matthias
>
>
> On 12/16/2015 03:25 PM, Arun Iyer wrote:
> > TopologyContext.getComponentTasks returns the list of task ids for the
> component (not executor ids).
> >
> > You could just try printing the output of getComponentTasks in the
> prepare method after doing 'setNumTasks’ (with  task > parallelism)
> > while building the topology. Worked for me.
> >
> > - Arun
> >
> >
> >
> > On 12/16/15, 6:21 PM, "Matthias J. Sax"  wrote:
> >
> >> Hi,
> >>
> >> today, the above question appeared on SO:
> >>
> https://stackoverflow.com/questions/34309189/how-to-get-the-task-number-and-id-not-the-executor-in-storm
> >>
> >> The problem is, that
> >>
> >> TopologyContext.getComponentTasks()
> >>
> >> returns the IDs of the executors (and not the tasks). The name of the
> >> method is not chooses very good -- I guess this dates back to the time
> >> before the separation of tasks and executors...
> >>
> >> My question is now:
> >>
> >> - do tasks actually have an ID?
> >> - if yes, can those IDs be retrieved?
> >> - can we get at least the number of tasks per operator somehow?
> >> - should the above method get renamed?
> >>
> >> As the number of tasks is fix, one could of course collect this
> >> information an pass it via the Config to
> >> StormSubmitter.submitTopology(...). However, this is quite a
> work-around.
> >>
> >> Please let me know what you think about it.
> >>
> >>
> >> -Matthias
> >>
>
>


Re: exception in finding file when submit

2015-12-16 Thread Cody Innowhere
The config file doesn't necessarily have to be in the target directory; it
just needs to be somewhere your main class can find it.
For example:
storm jar /home/your_home/your_jar_file.jar your_main_class
/home/your_home/documents/config.properties
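
The main class then reads the path from its arguments and loads the
properties itself. A minimal sketch (the class and file names are just
placeholders):

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    public class MainClass {
        public static void main(String[] args) throws IOException {
            if (args.length < 1) {
                System.err.println("usage: MainClass <config.properties>");
                System.exit(1);
            }
            Properties props = new Properties();
            try (FileInputStream in = new FileInputStream(args[0])) {
                props.load(in);                  // no hard-coded path, so it works anywhere
            }
            // ... build and submit the topology using values from props ...
        }
    }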

On Wed, Dec 16, 2015 at 2:53 PM, sam mohel  wrote:

> thanks but do you mean that this file should be in the target folder with
> jar files ?
>
> On Wed, Dec 16, 2015 at 4:04 AM, Cody Innowhere 
> wrote:
>
> > Hi Sam, "config.properties" is supposed to be an argument after your main
> > class parameter, i.e.,  storm jar your_jar_file.jar your_main_class
> > config.properties. So when submitting, you can place this file in the
> same
> > directory with your jar file so it can find it.
> >
> > On Wed, Dec 16, 2015 at 9:14 AM, sam mohel  wrote:
> >
> > > i need help in the exception
> > >
> > > i imported project worked well in local but in distributed when
> submitted
> > > the topology found this
> > >
> > > java.io.FileNotFoundException: config.properties (No such file or
> > >  directory)
> > >  then it submitted the topology !!!
> > >
> > > now the true path of this file already in the code
> > >  what other solution can i try it ?
> > >
> >
>


Re: exception in finding file when submit

2015-12-15 Thread Cody Innowhere
Hi Sam, "config.properties" is supposed to be an argument after your main
class parameter, i.e.,  storm jar your_jar_file.jar your_main_class
config.properties. So when submitting, you can place this file in the same
directory with your jar file so it can find it.

On Wed, Dec 16, 2015 at 9:14 AM, sam mohel  wrote:

> i need help in the exception
>
> i imported project worked well in local but in distributed when submitted
> the topology found this
>
> java.io.FileNotFoundException: config.properties (No such file or
>  directory)
>  then it submitted the topology !!!
>
> now the true path of this file already in the code
>  what other solution can i try it ?
>


Re: [GitHub] storm pull request: [STORM-1359] change kryo links from google cod...

2015-12-01 Thread Cody Innowhere
@vesense,
I see all 3 files are in trunk; maybe you can up-merge your local
repository first?

On Wed, Dec 2, 2015 at 10:10 AM, darionyaphet  wrote:

> Github user darionyaphet commented on the pull request:
>
> https://github.com/apache/storm/pull/909#issuecomment-161156921
>
> it's same with [#908](https://github.com/apache/storm/pull/908) ?  If
> they are the same, please close this PR  .
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: [jira] [Commented] (STORM-973) Netty-Client Connection Failed

2015-11-29 Thread Cody Innowhere
Hi darion,
Have you checked whether some worker in your topology dies before this
happens? It's pretty normal to get a connection reset if a worker dies,
since the dead worker's ports will be released.

On Sun, Nov 29, 2015 at 7:14 PM, darion yaphets (JIRA) 
wrote:

>
> [
> https://issues.apache.org/jira/browse/STORM-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030888#comment-15030888
> ]
>
> darion yaphets commented on STORM-973:
> --
>
> Hello  Alex ~  It's seems difficulty because this is dependent on network
> envirnment .
>
> > Netty-Client Connection Failed
> > --
> >
> > Key: STORM-973
> > URL: https://issues.apache.org/jira/browse/STORM-973
> > Project: Apache Storm
> >  Issue Type: Bug
> >  Components: storm-core
> >Affects Versions: 0.9.4
> > Environment: apache-storm-0.9.4 JDK 1.7.0_75
> >Reporter: darion yaphets
> >
> > When Storm Topology startup in a distribution cluster I found netty
> connection will failed and messages will be droped by client itself.
> > worker log info as following :
> > ```
> > 2015-08-07T11:43:18.903+0800 b.s.m.n.StormClientErrorHandler [INFO]
> Connection failed Netty-Client-storm-01
> > java.io.IOException: Connection reset by peer
> >   at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> ~[na:1.7.0_75]
> >   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> ~[na:1.7.0_75]
> >   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> ~[na:1.7.0_75]
> >   at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.7.0_75]
> >   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> ~[na:1.7.0_75]
> >   at
> org.apache.storm.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> org.apache.storm.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> [storm-core-0.9.4.jar:0.9.4]
> >   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_75]
> >   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_75]
> >   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> > 2015-08-07T11:43:19.426+0800 b.s.m.n.Client [INFO] connection attempt 1
> to Netty-Client-syq-storm-01.meilishuo.com/172.16.7.25:8711 scheduled to
> run in 0 ms
> > 2015-08-07T11:43:19.427+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-storm-01 is unavailable
> > 2015-08-07T11:43:19.427+0800 b.s.m.n.Client [ERROR] dropping 1
> message(s) destined for Netty-Client-storm-01
> > 2015-08-07T11:43:19.428+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-storm-01 is unavailable
> > 2015-08-07T11:43:19.428+0800 b.s.m.n.Client [ERROR] dropping 103
> message(s) destined for Netty-Client-storm-01
> > 2015-08-07T11:43:19.428+0800 b.s.m.n.Client [ERROR] connection to
> Netty-Client-storm-01 is unavailable
> > 2015-08-07T11:43:19.428+0800 b.s.m.n.Client [ERROR] dropping 35
> message(s) destined for Netty-Client-storm-01
> > ```
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: [GitHub] storm pull request: STORM-1218: Use markdown for JavaDoc

2015-11-25 Thread Cody Innowhere
+1. Cool.
(However the pegdown doclet plugin for IDEA doesn't work -_-)

On Thu, Nov 26, 2015 at 1:01 AM, harshach  wrote:

> Github user harshach commented on the pull request:
>
> https://github.com/apache/storm/pull/891#issuecomment-159672448
>
> +1. Thanks this will make doc writing easier.
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: [DISCUSS] Plan for Merging JStorm Code

2015-11-24 Thread Cody Innowhere
Sorry, I had mistaken it for the Apache issues account; it's OK now. My id
is: cody.

On Tue, Nov 24, 2015 at 11:54 PM, P. Taylor Goetz  wrote:

> Hi Cody,
>
> I wasn’t able to find your username. Are you sure you have an account on
> cwiki.apache.org?
>
> -Taylor
>
> > On Nov 22, 2015, at 8:46 AM, Cody Innowhere  wrote:
> >
> > Hi Taylor,
> > I'd like to help too, could you add me in? my id is: Cody
> >
> > On Sat, Nov 21, 2015 at 11:51 AM, 刘键(Basti Liu) <
> basti...@alibaba-inc.com>
> > wrote:
> >
> >> Hi Taylor,
> >>
> >> Sorry for the late response.
> >> I'd like to help on this. Could you please help to give me the
> permission?
> >> Thanks.
> >> UserName: basti.lj
> >>
> >> Regards
> >> Basti
> >>
> >> -Original Message-
> >> From: P. Taylor Goetz [mailto:ptgo...@gmail.com]
> >> Sent: Thursday, November 19, 2015 6:24 AM
> >> To: dev@storm.apache.org; Bobby Evans
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >>
> >> All I have at this point is a placeholder wiki entry [1], and a lot of
> >> local notes that likely would only make sense to me.
> >>
> >> Let me know your wiki username and I’ll give you permissions. The same
> >> goes for anyone else who wants to help.
> >>
> >> -Taylor
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> >>
> >>> On Nov 18, 2015, at 2:08 PM, Bobby Evans 
> >> wrote:
> >>>
> >>> Taylor and others I was hoping to get started filing JIRA and planning
> >>> on how we are going to do the java migration + JStorm merger.  Is
> >>> anyone else starting to do this?  If not would anyone object to me
> >>> starting on it? - Bobby
> >>>
> >>>
> >>>   On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> >> ptgo...@gmail.com> wrote:
> >>>
> >>>
> >>> Thanks for putting this together Basti, that comparison helps a lot.
> >>>
> >>> And thanks Bobby for converting it into markdown. I was going to just
> >> attach the spreadsheet to JIRA, but markdown is a much better solution.
> >>>
> >>> -Taylor
> >>>
> >>>> On Nov 12, 2015, at 12:03 PM, Bobby Evans  >
> >> wrote:
> >>>>
> >>>> I translated the excel spreadsheet into a markdown file and put up a
> >> pull request for it.
> >>>> https://github.com/apache/storm/pull/877
> >>>> I did a few edits to it to make it work with Markdown, and to add in a
> >> few of my own comments.  I also put in a field for JIRAs to be able to
> >> track the migration.
> >>>> Overall I think your evaluation was very good.  We have a fair amount
> >> of work ahead of us to decide what version of various features we want
> to
> >> go forward with.
> >>>>  - Bobby
> >>>>
> >>>>
> >>>>On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> >> basti...@alibaba-inc.com> wrote:
> >>>>
> >>>>
> >>>> Hi Bobby & Jungtaek,
> >>>>
> >>>> Thanks for your replay.
> >>>> I totally agree that compatibility is the most important thing.
> >> Actually, JStorm has been compatible with the user API of Storm.
> >>>> As you mentioned below, we indeed still have some features different
> >> between Storm and JStorm. I have tried to list them (minor update or
> >> improvements are not included).
> >>>> Please refer to attachment for details. If any missing, please help
> >>>> to point out. (The current working features are probably missing
> here.)
> >> Just have a look at these differences. For the missing features in
> JStorm,
> >> I did not see any obstacle which will block the merge to JStorm.
> >>>> For the features which has different solution between Storm and
> JStorm,
> >> we can evaluate the solution one by one to decision which one is
> >> appropriate.
> >>>> After the finalization of evaluation, I think JStorm team can take the
> >> merging job and publish a stable release in 2 months.
> >>>> But anyway, the detailed implementation for these features with
> >> different solution is transparent to user. So, from user's point

Re: [VOTE] Storm 2.0 plan

2015-11-24 Thread Cody Innowhere
+1 for this plan.

Also +1 for evaluating solutions for the major feature differences early,
before the migration starts, to avoid unnecessary rework.

On Tue, Nov 24, 2015 at 3:26 PM, 刘键(Basti Liu) 
wrote:

> +1 for this plan.
>
> @Taylor
> Could you help to update the reference branch "jstorm-import" with latest
> JStorm version(2.1.0), since some JStorm features
> listed in the plan of phase 2 are not included? Thanks.
>
> @Bobby and the devs who are interested in this migration
> We will try to add some design documents for the features related to
> "JStorm evaluate/port" JIAR of phase 2 by the end of week.
> If the JIRA might cause much rework during phase 2, we will also comment
> the solution on relative JIRA of phase 1. Hope we can
> have an evaluation together before starting migration, to avoid some
> unnecessary rework.
> If any comments for "JStorm" part, please feel free to tell us.
>
> Regards
> Basti
>
> -Original Message-
> From: Bobby Evans [mailto:ev...@yahoo-inc.com.INVALID]
> Sent: Tuesday, November 24, 2015 5:40 AM
> To: Dev
> Subject: [VOTE] Storm 2.0 plan
>
> Sorry for spaming everyone with all the JIRA creations today.  I have
> filed all of the JIRA corresponding to the plan for JStorm merger listed
> here.
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> The bylaws
> https://github.com/apache/storm/blob/master/docs/documentation/BYLAWS.md
> don't cover a vote on the direction of the project like this.  They cover
> the merger of each pull request that would be made, but not a direction
> change. As such this vote is more symbolic than anything, and I would love
> to hear from everyone involved.
>
> The current plan is to finish merging in features for a 1.0 release.
>
> https://cwiki.apache.org/confluence/display/STORM/Storm+Release+1.0
> is supposed to cover this, but I think it is missing some features others
> want, so please let us know if you really want to get your feature in
> before this happens.  As such the time frame is a bit flexible but I would
> like to shoot for doing a storm-1.0 release before mid December.
> After that we would begin merging in the clojure->java transition JIRA.
> Once those are complete the feature freeze would be lifted and JStorm
> features would be merged in along with other features.  Hopefully we would
> have a Storm 2.0 release by mid February to mid March, depending on how
> things go.
>
> I am +1 on this plan (if you couldn't tell)
>  - Bobby
>
>


Re: [DISCUSS] Plan for Merging JStorm Code

2015-11-22 Thread Cody Innowhere
Hi Taylor,
I'd like to help too; could you add me in? My id is: Cody.

On Sat, Nov 21, 2015 at 11:51 AM, 刘键(Basti Liu) 
wrote:

> Hi Taylor,
>
> Sorry for the late response.
> I'd like to help on this. Could you please help to give me the permission?
> Thanks.
> UserName: basti.lj
>
> Regards
> Basti
>
> -Original Message-
> From: P. Taylor Goetz [mailto:ptgo...@gmail.com]
> Sent: Thursday, November 19, 2015 6:24 AM
> To: dev@storm.apache.org; Bobby Evans
> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>
> All I have at this point is a placeholder wiki entry [1], and a lot of
> local notes that likely would only make sense to me.
>
> Let me know your wiki username and I’ll give you permissions. The same
> goes for anyone else who wants to help.
>
> -Taylor
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
>
> > On Nov 18, 2015, at 2:08 PM, Bobby Evans 
> wrote:
> >
> > Taylor and others I was hoping to get started filing JIRA and planning
> > on how we are going to do the java migration + JStorm merger.  Is
> > anyone else starting to do this?  If not would anyone object to me
> > starting on it? - Bobby
> >
> >
> >On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> ptgo...@gmail.com> wrote:
> >
> >
> > Thanks for putting this together Basti, that comparison helps a lot.
> >
> > And thanks Bobby for converting it into markdown. I was going to just
> attach the spreadsheet to JIRA, but markdown is a much better solution.
> >
> > -Taylor
> >
> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans 
> wrote:
> >>
> >> I translated the excel spreadsheet into a markdown file and put up a
> pull request for it.
> >> https://github.com/apache/storm/pull/877
> >> I did a few edits to it to make it work with Markdown, and to add in a
> few of my own comments.  I also put in a field for JIRAs to be able to
> track the migration.
> >> Overall I think your evaluation was very good.  We have a fair amount
> of work ahead of us to decide what version of various features we want to
> go forward with.
> >>   - Bobby
> >>
> >>
> >> On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> basti...@alibaba-inc.com> wrote:
> >>
> >>
> >> Hi Bobby & Jungtaek,
> >>
> >> Thanks for your replay.
> >> I totally agree that compatibility is the most important thing.
> Actually, JStorm has been compatible with the user API of Storm.
> >> As you mentioned below, we indeed still have some features different
> between Storm and JStorm. I have tried to list them (minor update or
> improvements are not included).
> >> Please refer to attachment for details. If any missing, please help
> >> to point out. (The current working features are probably missing here.)
> Just have a look at these differences. For the missing features in JStorm,
> I did not see any obstacle which will block the merge to JStorm.
> >> For the features which has different solution between Storm and JStorm,
> we can evaluate the solution one by one to decision which one is
> appropriate.
> >> After the finalization of evaluation, I think JStorm team can take the
> merging job and publish a stable release in 2 months.
> >> But anyway, the detailed implementation for these features with
> different solution is transparent to user. So, from user's point of view,
> there is not any compatibility problem.
> >>
> >> Besides compatibility, by our experience, stability is also important
> and is not an easy job. 4 people in JStorm team took almost one year to
> finish the porting from "clojure core"
> >> to "java core", and to make it stable. Of course, we have many devs in
> community to make the porting job faster. But it still needs a long time to
> run many online complex topologys to find bugs and fix them. So, that is
> the reason why I proposed to do merging and build on a stable "java core".
> >>
> >> -Original Message-
> >> From: Bobby Evans [mailto:ev...@yahoo-inc.com.INVALID]
> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> To: dev@storm.apache.org
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >>
> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> Migrating the APIs to org.apache.storm is a big non-backwards compatible
> move, and a major version bump to 2.x seems like a good move there.
> >> +1 for the release plan
> >>
> >> I would like the move for user facing APIs to org.apache to be one of
> the last things we do.  Translating clojure code into java and moving it to
> org.apache I am not too concerned about.
> >>
> >> Basti,
> >> We have two code bases that have diverged significantly from one
> another in terms of functionality.  The storm code now or soon will have A
> Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> Scheduling, a distributed cache like API, log searching, security, massive
> performance improvements, shaded almost all of our dependencies, a REST API
> for programtically accessing everything on the UI, and I am sure I a

Re: [jira] [Commented] (STORM-1216) button to kill all topologies in Storm UI

2015-11-18 Thread Cody Innowhere
And I'd like to see another button to restart all topologies :-)
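
For what it's worth, even without a UI button, killing every running
topology can already be scripted against the Nimbus Thrift client along the
lines below; this is only a sketch, assuming org.apache.storm package names
(backtype.storm in pre-1.0 releases):

    import java.util.Map;
    import org.apache.storm.generated.KillOptions;
    import org.apache.storm.generated.Nimbus;
    import org.apache.storm.generated.TopologySummary;
    import org.apache.storm.utils.NimbusClient;
    import org.apache.storm.utils.Utils;

    public class KillAllTopologies {
        public static void main(String[] args) throws Exception {
            Map conf = Utils.readStormConfig();
            Nimbus.Client client = NimbusClient.getConfiguredClient(conf).getClient();
            KillOptions opts = new KillOptions();
            opts.set_wait_secs(0);                         // kill immediately
            for (TopologySummary t : client.getClusterInfo().get_topologies()) {
                client.killTopologyWithOpts(t.get_name(), opts);
            }
        }
    }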

On Wed, Nov 18, 2015 at 6:55 PM, Longda Feng (JIRA)  wrote:

>
> [
> https://issues.apache.org/jira/browse/STORM-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010754#comment-15010754
> ]
>
> Longda Feng commented on STORM-1216:
> 
>
> Assign this issue to me, But I won't resolve it in 0.11. Current UI is
> hard to do this. But JStorm UI can do this.
> So please wait until JStorm merge into Storm. We will start this.
>
> We have met several times migrate one cluster's topology to another
> cluster. so this is useful for this case.
>
> > button to kill all topologies in Storm UI
> > -
> >
> > Key: STORM-1216
> > URL: https://issues.apache.org/jira/browse/STORM-1216
> > Project: Apache Storm
> >  Issue Type: Wish
> >  Components: storm-core
> >Affects Versions: 0.11.0
> >Reporter: Erik Weathers
> >Assignee: Longda Feng
> >Priority: Minor
> >
> > In the Storm-on-Mesos project we had a [request to have an ability to
> "shut down the storm cluster" via a UI button|
> https://github.com/mesos/storm/issues/46].   That could be accomplished
> via a button in the Storm UI to kill all of the topologies.
> > I understand if this is viewed as an undesirable feature, but I just
> wanted to document the request.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: [GitHub] storm pull request: add storm-id to worker log filename

2015-11-17 Thread Cody Innowhere
@jessicasco,
According to STORM-901, the worker log is already under a storm-id
directory, so is it necessary to make this change?

On Wed, Nov 18, 2015 at 5:25 AM, kishorvpatil  wrote:

> Github user kishorvpatil commented on the pull request:
>
> https://github.com/apache/storm/pull/886#issuecomment-157511940
>
> @jessicasco change worker log, requires changes to core ui, logviewer,
> worker in order to make sure the links and other search functionality don't
> break.
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: [GitHub] storm pull request: (STORM-956) When the execute() or nextTuple() ...

2015-11-17 Thread Cody Innowhere
Agree with bastiliu & kishorvpatil.
In production we have a lot of cases where a spout just waits for something
to happen (e.g., an incoming tuple). We have no idea about the user's
intention, so stopping the heartbeats and killing the worker doesn't seem
like a good idea.

-1
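
For reference, a spout that "just waits for something to happen" is usually
written to poll with a short timeout in nextTuple() rather than block
indefinitely, which is the pattern Kishor mentions below. A minimal sketch,
assuming org.apache.storm package names; the queue stands in for whatever
the spout actually waits on:

    import java.util.Map;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;
    import org.apache.storm.spout.SpoutOutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichSpout;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Values;

    public class PollingSpout extends BaseRichSpout {
        private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        private SpoutOutputCollector collector;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            try {
                // Wait briefly instead of blocking forever, so nextTuple() returns promptly.
                String msg = queue.poll(10, TimeUnit.MILLISECONDS);
                if (msg != null) {
                    collector.emit(new Values(msg));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("msg"));
        }
    }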


On Wed, Nov 18, 2015 at 8:41 AM, kishorvpatil  wrote:

> Github user kishorvpatil commented on the pull request:
>
> https://github.com/apache/storm/pull/647#issuecomment-157558858
>
> I think the spout and bolt should take care of handling hangs ( or use
> timeouts instead of making blocking calls). Also, the spout/bolt code
> should guard against creating threads that can cause unhandled
> exceptions/hang-ups. Forcing worker to not send heart-beats would make
> killing other components running on that worker - which is not desired.
> Secondly, worker should not be killed unless it is certain that is the
> process issue and not external service issue - e.g.  if kafka spout hangs -
> killing worker might force it to be relaunched or scheduled may not solve
> the problem - new worker process still make another blocking call and
> hang-up.
>
> Thirdly, killing worker will force relaunch/reschedule/ - forcing
> topology to be un-stabie as all other workers in loop have to reconnect to
> this new worker. In large topologies that might become a bigger problem and
> lead to domino effects and take longer to settle the topology.
>
> -1
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: [jira] [Commented] (STORM-1206) Reduce logviewer memory usage

2015-11-14 Thread Cody Innowhere
Zhuo Liu, thanks for the explanation, I see.
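
For reference, the DirectoryStream approach mentioned in the JIRA below
iterates directory entries lazily instead of materializing the whole
listing. A small sketch, not the actual logviewer code:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class LogDirScan {
        public static long countLogFiles(String dir) throws IOException {
            long count = 0;
            // Entries are visited one at a time; file names are not all held in memory.
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(Paths.get(dir), "*.log")) {
                for (Path entry : stream) {
                    count++;
                }
            }
            return count;
        }
    }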

On Sun, Nov 15, 2015 at 1:18 AM, Zhuo Liu (JIRA)  wrote:

>
> [
> https://issues.apache.org/jira/browse/STORM-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005491#comment-15005491
> ]
>
> Zhuo Liu commented on STORM-1206:
> -
>
> Hello Cody, we already have a hierarchical log file structure as
> topology/worker/all files, please check storm-901. So in default case, each
> worker directory will only have very limited number of log files and gc
> files(10). We met the above problem in a very extreme case where we set the
> gc log file name with pid, causing the automatic file number limit not
> working and crazy restarting creates a huge number of gc files for a single
> worker.
>
> > Reduce logviewer memory usage
> > -
> >
> > Key: STORM-1206
> > URL: https://issues.apache.org/jira/browse/STORM-1206
> > Project: Apache Storm
> >  Issue Type: Improvement
> >  Components: storm-core
> >Reporter: Zhuo Liu
> >Assignee: Zhuo Liu
> >
> > In production, we ran into an issue with logviewers bouncing with out of
> memory errors. Note that this happens very rarely, we met this in some
> extreme case when super frequently restarting of workers generates a huge
> number of gc files (~1M files).
> > What was happening is that if there are lots of log files (~1 M files)
> for a particular headless user, we would have so many strings resident in
> memory that logviewer would run out of heap space.
> > We were able to work around this by increasing the heap space, but we
> should consider putting some sort of an upper bound on the number of files
> so that we don't run in to this issue, even with the bigger heap.
> > Using the java DirectoryStream can avoid holding all file names in
> memory during file listing. Also, a multi-round directory cleaner can be
> introduced to delete files while disk quota is exceeded.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: [GitHub] storm pull request: Added in feature diff

2015-11-14 Thread Cody Innowhere
@revans2,
Thanks for the compliment.
We've done a lot of work on JStorm's new metrics system, which we think is a
very important part of JStorm/Storm. We've been helping users within our
company troubleshoot almost every day, so we know which metrics are needed
for troubleshooting and how to monitor them.

But I think it's not the metrics themselves that matter most; what really
matters is that users learn instantly about a sudden or sharp change in a
metric value, i.e., metrics should help users discover and troubleshoot
problems. I look forward to working with the community to give Storm that
ability.


On Fri, Nov 13, 2015 at 10:32 PM, revans2  wrote:

> Github user revans2 commented on the pull request:
>
> https://github.com/apache/storm/pull/877#issuecomment-156448057
>
> @wuchong and @bastiliu thanks for the feedback and comments.  I
> updated the pull request based off of them.  Any more feedback is
> definitely welcome.  By the way great work on jstorm.  My has spent way too
> much time playing around with http://storm.taobao.org/ being jealous of
> the features there.
>
> Very excited to see what all of us can accomplish together.
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: [DISCUSS] Plan for Merging JStorm Code

2015-11-10 Thread Cody Innowhere
Thanks Taylor.

The plan looks good.

> During migration, it's probably easiest to operate with two local clones
> of the Apache Storm repo: one for working (i.e. checked out to working
> branch) and one for reference/copying (i.e. checked out to
> "jstorm-import").
Do you mean two Storm branches (both Clojure core), or one Storm branch
plus one JStorm-imported branch?

Tong.Wang is responsible for JStorm's integration tests. @Tong.Wang, could
you see whether you can take care of the performance test?


On Wed, Nov 11, 2015 at 8:00 AM, Suresh Srinivas 
wrote:

> +1 for 1.0.0 release.
>
> I also like Hackathon to encourage bigger participation. It will be fun.
>
> On 11/10/15, 3:38 PM, "Harsha"  wrote:
>
> >Thanks Taylor for putting together plan on merging JStorm code.
> >Few things
> >
> >1. we should call 0.11.0 as 1.0.0 release since storm has security and
> >   nimbus ha . Quite a lot of features and improvements added this is
> >   going to be big release and its about time we call 1.0.0
> >
> >1.1 "align package names (e.g backtype.storm --> org.apache.storm /
> >com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin" I
> >propose we do this as next release itself instead of pushing it
> >onto another .
> >
> >"Phase 3 - Migrate Clojure --> Java"
> >
> >I would like to propose a Hackathon among storm community along with
> >jstorm. We need to come up with plan of action on what code needs to be
> >merged into storm-core. I am thinking it will help better to have
> >everyone over video chat or something to go over the code and get it
> >migrated to java.
> >
> >
> >Thanks, Harsha
> >
> >On Tue, Nov 10, 2015, at 01:32 PM, P. Taylor Goetz wrote:
> >> Based on a number of discussions regarding merging the JStorm code,
> >> I've tried to distill the ideas presented and inserted some of my own.
> >> The result is below.
> >>
> >> I've divided the plan into three phases, though they are not
> >> necessarily sequential -- obviously some tasks can take place in
> >> parallel.
> >>
> >> None of this is set in stone, just presented for discussion. Any and
> >> all comments are welcome.
> >>
> >> ---
> >>
> >> Phase 1 - Plan for 0.11.x Release
> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >> 2. Announce feature-freeze for 0.11.x
> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >>
> >>
> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >> 1. Determine/document unique features in JStorm (e.g. classpath
> >>isolation, cgroups, etc.) and create JIRA for migrating the
> >>feature.
> >> 2. Create JIRA for migrating each clojure component (or logical group
> >>of components) to Java. Assumes tests will be ported as well.
> >> 3. Discuss/establish style guide for Java coding conventions. Consider
> >>using Oracle's or Google's Java conventions as a base -- they are
> >>both pretty solid.
> >> 4. align package names (e.g backtype.storm --> org.apache.storm /
> >>com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >>
> >>
> >> Phase 3 - Migrate Clojure --> Java
> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> >>possible (core functionality only, features distinct to JStorm
> >>migrated separately).
> >> 2. Port JStorm-specific features.
> >> 3. Begin releasing preview/beta versions.
> >> 4. Code cleanup (across the board) and refactoring using established
> >>coding conventions, and leveraging PMD/Checkstyle reports for
> >>reference. (Note: good oportunity for new contributors.)
> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> >>feature freeze.
> >>
> >>
> >> Notes: We should consider bumping up to version 1.0 sometime soon and
> >> then switching to semantic versioning [3] from then on.
> >>
> >>
> >> With the exception of package name alignment, the "jstorm-import"
> >> branch will largely be read-only throughout the process.
> >>
> >> During migration, it's probably easiest to operate with two local
> >> clones of the Apache Storm repo: one for working (i.e. checked out to
> >> working branch) and one for reference/copying (i.e. checked out to
> >>"jstorm-
> >> import").
> >>
> >> Feature-freeze probably only needs to be enforced against core
> >> functionality. Components under "external" can likely be exempt, but
> >> we should figure out a process for accepting and releasing new
> >> features during the migration.
> >>
> >> Performance testing should be continuous throughout the process. Since
> >> we don't really have ASF infrastructure for performance testing, we
> >> will need a volunteer(s) to host and run the performance tests.
> >> Performance test results can be posted to the wiki [2]. It would
> >> probably be a good idea to establish a baseline with the 0.10.0
> >> release.
> >>
> >> I've attached an analysis document Sean 

Re: Re: [VOTE] Accept Alibaba JStorm Code Donation

2015-10-29 Thread Cody Innowhere
+1
(sorry for the previous mis-post; reposting to this thread)

I think the gap between programming languages such as Clojure and Java is
not a problem for most developers, but it can be for industrial users, who
may not be willing to, or have the time to, dive into the details of the
Clojure core. That is the great advantage of a Java-core Storm: it will
definitely attract more contributors to the community and speed up Storm's
development.
Also, as JStorm describes, it has been tested in a vast range of industrial
use cases for stability and efficiency; if JStorm were merged into Storm, it
would surely make Storm better and stronger.

On Fri, Oct 30, 2015 at 9:19 AM, 方孝健(玄弟) 
wrote:

> +1
>
> -----Original Message-----
> From: Derek Dagit [mailto:der...@yahoo-inc.com.INVALID]
> Sent: October 28, 2015 1:56
> To: dev@storm.apache.org
> Subject: Re: [VOTE] Accept Alibaba JStorm Code Donation
>
> +1
>
> --
> Derek
>
>
> 
> From: P. Taylor Goetz 
> To: dev@storm.apache.org
> Sent: Tuesday, October 27, 2015 12:48 PM
> Subject: [VOTE] Accept Alibaba JStorm Code Donation
>
>
>
> All,
>
> The IP Clearance process for the Alibaba JStorm code donation has
> completed.
>
> The IP Clearance Status document can be found here:
>
> http://incubator.apache.org/ip-clearance/storm-jstorm.html
>
> The source code can be found at https://github.com/alibaba/jstorm with
> the following git commit SHA: e935da91a897797dad56e24c4ffa57860ac91878
>
> This is a VOTE to accept the code donation, and import the donated code
> into the Apache Storm git repository. Discussion regarding how to proceed
> with merging the codebases can take place in separate thread.
>
> [ ] +1 Accept the Alibaba JStorm code donation.
> [ ] +0 Indifferent
> [ ] -1 Do not accept the code donation because…
>
> This VOTE will be open for at least 72 hours.
>
> -Taylor
>
>


Re: Re: [VOTE] Release Apache Storm 0.10.0 (rc1)

2015-10-29 Thread Cody Innowhere
+1

I think the gap between programming languages such as Clojure and Java is
not a problem for most developers, but it can be for industrial users, who
may not be willing to, or have the time to, dive into the details of the
Clojure core. That is the great advantage of a Java-core Storm: it will
definitely attract more contributors to the community and speed up Storm's
development.
Also, as JStorm describes, it has been tested in a vast range of industrial
use cases for stability and efficiency; if JStorm were merged into Storm, it
would surely make Storm better and stronger.

On Thu, Oct 29, 2015 at 8:19 PM, 方孝健(玄弟) 
wrote:

> +1, I think it is great!
>
> -----Original Message-----
> From: Bobby Evans [mailto:ev...@yahoo-inc.com.INVALID]
> Sent: October 26, 2015 22:24
> To: dev@storm.apache.org
> Subject: Re: [VOTE] Release Apache Storm 0.10.0 (rc1)
>
> +1 looks good, built from the git tag and ran some simple test.
>  - Bobby
>
>
>  On Friday, October 23, 2015 9:47 PM, 임정택  wrote:
>
>
> - test passed : OK
>   -- extract source tar
>   -- build via "mvn clean install"
> - deploy to docker cluster using wurstmeister/storm-docker : OK
> - topology tests via storm-starter : OK
>   -- run WordCounts and RollingTopWords, local mode and remote mode
> - test REST APIs (retrieve / activate / deactivate / rebalance / kill) : OK
>
>
> Looks good overall. +1.
>
> 2015-10-24 5:26 GMT+09:00 P. Taylor Goetz :
>
> > This is a call to vote on releasing Apache Storm 0.10.0 (rc1)
> >
> > Full list of changes in this release:
> >
> >
> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHA
> > NGELOG.md;hb=d02f94268dec229d1125a24fdf53fa303cbc2b29
> >
> > The tag/commit to be voted upon is v0.10.0:
> >
> >
> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=45b1b1484
> > 01fd05f0f79cc7abdf6b5c7fc43df20;hb=d02f94268dec229d1125a24fdf53fa303cb
> > c2b29
> >
> > The source archive being voted upon can be found here:
> >
> >
> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-0.10.0/apach
> > e-storm-0.10.0-src.tar.gz
> >
> > Other release files, signatures and digests can be found here:
> >
> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-0.10.0/
> >
> > The release artifacts are signed with the following key:
> >
> >
> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEY
> > S;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
> >
> > The Nexus staging repository for this release is:
> >
> > https://repository.apache.org/content/repositories/orgapachestorm-1025
> >
> > Please vote on releasing this package as Apache Storm 0.10.0.
> >
> > When voting, please list the actions taken to verify the release.
> >
> > This vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this package as Apache Storm 0.10.0 [ ]  0 No opinion [
> > ] -1 Do not release this package because...
> >
> > Thanks to everyone who contributed to this release.
> >
> > -Taylor
> >
>
>
>
> --
> Name : 임 정택
> Blog : http://www.heartsavior.net / http://dev.heartsavior.net Twitter :
> http://twitter.com/heartsavior LinkedIn :
> http://www.linkedin.com/in/heartsavior
>
>
>
>