Re: [VOTE] Release Apache Storm 2.0.0 RC7

2019-04-30 Thread Arun Mahadevan
+1
- Downloaded binaries
- Brought up a local cluster
- Ran a few topologies and checked the output and logs.

If I just click on the "Owner" link from the UI, it displays a red banner
with a warning.
"This user's topologies are in danger of being unscheduled due to the
owner's over-use of cluster resources.
Please keep this user's resource consumption within guaranteed bounds to
ensure topologies for this user will continue to run."

It just looks odd, and I am not sure what the criteria are for deciding
whether there is overuse. It looks like it is displayed irrespective of the
topology.

Anyway, I don't think it's a blocker; it is something we could fix in the
next release.

Thanks,
Arun



On Tue, 30 Apr 2019 at 14:49, Roshan Naik 
wrote:

>  Yes, you need to rebuild your topology jars against 2.0.
> If you have some settings to tweak perf with 1.x,  refer to
> https://github.com/apache/storm/blob/master/docs/Performance.md for the
> 2.x configs.
>
> On Tuesday, April 30, 2019, 2:41:49 PM PDT, Alexandre Vermeerbergen <
> avermeerber...@gmail.com> wrote:
>
>  Hello,
>
> I'm eager to test Storm 2.0.0 with my complex topologies, but first of
> all: do I need to rebuild all my topologies' Big Jars with Storm
> 2.0.0, or may I try my existing Storm 1.2.3 (recent snapshot)-based
> Big Jars ?
>
> Kind regards,
> Alexandre Vermeerbergen
>
> On Tue, 30 Apr 2019 at 00:49, P. Taylor Goetz wrote:
> >
> > This is a call to vote on releasing Apache Storm 2.0.0 (rc7)
> >
> > Full list of changes in this release:
> >
> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc7/RELEASE_NOTES.html
> >
> > The tag/commit to be voted upon is v2.0.0:
> >
> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=007863edd95e838b3df414928c6fa3f28244ab49;hb=2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae
> >
> > The source archive being voted upon can be found here:
> >
> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc7/apache-storm-2.0.0-src.tar.gz
> >
> > Other release files, signatures and digests can be found here:
> >
> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc7/
> >
> > The release artifacts are signed with the following key:
> >
> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
> >
> > The Nexus staging repository for this release is:
> >
> > https://repository.apache.org/content/repositories/orgapachestorm-1079
> >
> > Please vote on releasing this package as Apache Storm 2.0.0.
> >
> > When voting, please list the actions taken to verify the release.
> >
> > This vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this package as Apache Storm 2.0.0
> > [ ]  0 No opinion
> > [ ] -1 Do not release this package because...
> >
> > Thanks to everyone who contributed to this release.
> >
> > -Taylor


Re: [VOTE] Release Apache Storm 2.0.0 (rc4)

2019-01-29 Thread Arun Mahadevan
Agree with continuing with the RC if this is not a blocker.

On Tue, 29 Jan 2019 at 12:29, Roshan Naik 
wrote:

>  Correct me if I am wrong, but this seems to be a bug with a workaround,
> not an exploitable security hole. If this is not a security hole, and the
> workaround is realistic, then we should go ahead with the current RC, IMO.
> -roshan
> On Tuesday, January 29, 2019, 10:26:53 AM PST, Kishorkumar Patil <
> kishorvpa...@apache.org> wrote:
>
>  Aaron,
> Thank you for the patch and for suggesting the workaround in the meantime.
> The PR for STORM-3317 is merged into master now.
> Considering that a workaround exists for STORM-3317, I am open to either
> going ahead with the current RC or creating a new one.
>
> Thanks,
> Kishor
>
> On Tue, Jan 29, 2019 at 11:19 AM Aaron Gresch  wrote:
>
> > The workaround for STORM-3317 is to force your
> > java.security.auth.login.config file on the launcher box to remain in the
> > same location as where it is hosted on the supervisors.
> >
> >
> > On Mon, Jan 28, 2019 at 10:10 AM Aaron Gresch  wrote:
> >
> > >
> > > > Not sure if it affects the release, but STORM-3317 is a new bug in 2.0
> > > > where if your launcher box has the java.security.auth.login.config
> > > > file in a different location than the supervisors, uploading
> > > > credentials will not work.
> > >
> > > A PR is available that fixes the issue.
> > >
> > >
> > >
> > > On Tue, Jan 8, 2019 at 1:03 PM P. Taylor Goetz 
> > wrote:
> > >
> > >> This is a call to vote on releasing Apache Storm 2.0.0 (rc4)
> > >>
> > >> Full list of changes in this release:
> > >>
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/RELEASE_NOTES.html
> > >>
> > >> The tag/commit to be voted upon is v2.0.0:
> > >>
> > >>
> > >>
> >
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=1eece73e8c9ed7f41d2f20f727bc7f644c499360;hb=ddee8decac57d1a4a0aa23cc76066609a2abc8d2
> > >>
> > >> The source archive being voted upon can be found here:
> > >>
> > >>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/apache-storm-2.0.0-src.tar.gz
> > >>
> > >> Other release files, signatures and digests can be found here:
> > >>
> > >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/
> > >>
> > >> The release artifacts are signed with the following key:
> > >>
> > >>
> > >>
> >
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
> > >>
> > >> The Nexus staging repository for this release is:
> > >>
> > >>
> https://repository.apache.org/content/repositories/orgapachestorm-1073
> > >>
> > >> Please vote on releasing this package as Apache Storm 2.0.0.
> > >>
> > >> When voting, please list the actions taken to verify the release.
> > >>
> > >> This vote will be open for at least 72 hours.
> > >>
> > >> [ ] +1 Release this package as Apache Storm 2.0.0
> > >> [ ]  0 No opinion
> > >> [ ] -1 Do not release this package because...
> > >>
> > >> Thanks to everyone who contributed to this release.
> > >>
> > >> -Taylor
> > >
> > >
> >
>


Re: Storm 2.0 blogs ?

2019-01-22 Thread Arun Mahadevan
Nice suggestions Roshan and Taylor.

I think we can start with a release announcement (maybe with pointers to
the docs) and follow it up with blog posts that dive deep into the
different features and improvements.

A few from my side for the release announcement would be:

- Streams API - A typed API that lets users express their streaming
computations more easily and supports functional-style operations. [1]
- Windowing enhancements and support for window state checkpointing [2]

I would also try to write up some blog posts around these once we have
figured out how to go about the blog series.

- Arun

[1] https://github.com/apache/storm/blob/master/docs/Stream-API.md
[2]
https://github.com/apache/storm/blob/master/docs/Windowing.md#stateful-windowing



On Tue, 22 Jan 2019 at 09:57, P. Taylor Goetz  wrote:

> Awesome. I’ve been thinking about this as well.
>
> The release notes don’t really tell the story of everything in this
> release, and some of the new features and improvements elude my memory.
>
> If others could point out new features that they’d like to be pointed out
> in the release announcement, please list them in this thread. Better yet
> would be a blurb describing the feature/improvement and what benefits it
> brings users. Then I can stitch them together into a release announcement.
> Basically crowd/dev-sourcing the release announcement.
>
> Or, instead of one big announcement, do we want to do a series of blog
> posts? If we go the latter route, the first post should probably cover the
> most significant features. What would those be? Roshan has a pretty good
> start.
>
> What would others add?
>
> If anyone wants to help out, let me know.
>
> -Taylor
>
> > On Jan 22, 2019, at 10:37 AM, Bobby Evans  wrote:
> >
> > I totally agree, especially with the performance improvements in it.
> >
> > Thanks,
> >
> > Bobby
> >
> > On Tue, Jan 22, 2019 at 12:40 AM Roshan Naik
> 
> > wrote:
> >
> >> Now that  2.0 has all the votes it needs to move forward, maybe a good
> >> time to think of some blogs to go with this long awaited release.
> >> Some potential topics that come to mind are:
> >> 1- Overview of new features and major changes since 1.x
> >> 2- Re-architecture (messaging, threading, back pressure)
> >> 3- Micro benchmarks
> >> 4- Revisit the famous Yahoo benchmark
> >> 5- Window state persistence
> >> 6- SQL enhancements
> >> 7- new Metrics stuff
> >> 8- Kafka related changes
> >> 9- Security
> >> 10- An area you have worked on ?
> >> 11- Other ideas ?
> >> Anyone interested in contributing blogs ?
> >> FYI: I am working on content for topics 2 & 3.
> >>
> >> -roshan
>
>


Re: [VOTE] Release Apache Storm 2.0.0 (rc4)

2019-01-10 Thread Arun Mahadevan
This is for users to use the "auto credentials" mechanism (delegation
tokens) with HDFS/Hive/HBase.

We have been shipping it since 1.x (I think since the 1.2.0 release) so that
users can just add that directory to the classpath rather than building it
separately to get the right dependencies. We could consider removing it
from the main binary and shipping it separately, but that would need changes
to the build, release, and documentation, and users would need to download
and install it separately.


Thanks,
Arun

On Thu, 10 Jan 2019 at 10:28, Stig Rohde Døssing 
wrote:

> I think this was remarked on by Roshan in the last RC, but the binary
> distribution has become significantly larger since 1.x. It looks like this
> is down to storm-autocreds not being added to the exclusion list in
> storm-dist/binary/final-package/src/main/assembly/binary.xml.
>
> Since the module isn't excluded, external/storm-autocreds contains the
> module jar, plus all dependency jars. Is this an accident, or do we want to
> include these jars in the distribution?
>
> On Wed, 9 Jan 2019 at 19:48, Ethan Li wrote:
>
> > +1
> >
> > - Built from the src, ran all the unit tests and integration tests.
> > - Set up a single-node cluster and submit ThroughputVsLatency topology.
> > - Checked the UI.
> > They look good.
> >
> > Thanks
> > Ethan
> >
> > > On Jan 9, 2019, at 8:48 AM, Bobby Evans  wrote:
> > >
> > > > +1 built from the git tag.  Ran all of the unit tests and ran some
> > > > manual tests; they all passed.
> > >
> > > Thanks,
> > >
> > > Bobby
> > >
> > > On Tue, Jan 8, 2019 at 6:30 PM Xin Wang 
> wrote:
> > >
> > >> +1
> > >>
> > >> Built it and ran all of the tests.  Everything passed.
> > >>
> > >> -Xin
> > >>
> > > >> On Wed, 9 Jan 2019 at 5:08 AM, Kishorkumar Patil wrote:
> > >>
> > >>> +1
> > >>>
> > >>> - built from source code and deployment works.
> > >>> -  Ran some of the tests for UI, DRPC, ThroughputVsLatency
> > > >>> -  Validated UI bugs reported in the recent past are fixed in this
> > > >>>    version
> > >>>
> > >>> -Kishor
> > >>>
> > >>>
> > >>> On Tue, Jan 8, 2019 at 2:29 PM Arun Mahadevan 
> > wrote:
> > >>>
> > >>>> +1
> > >>>>
> > >>>> - Downloaded the binaries and validated signatures.
> > > >>>> - Deployed the binaries, ran some sample topologies and checked the
> > > >>>>   UI.
> > >>>> - Ran top level build using the source zip.
> > >>>>
> > >>>> Thanks,
> > >>>> Arun
> > >>>>
> > >>>>
> > >>>> On Tue, 8 Jan 2019 at 11:03, P. Taylor Goetz 
> > >> wrote:
> > >>>>
> > >>>>> This is a call to vote on releasing Apache Storm 2.0.0 (rc4)
> > >>>>>
> > >>>>> Full list of changes in this release:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/RELEASE_NOTES.html
> > >>>>>
> > >>>>> The tag/commit to be voted upon is v2.0.0:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=1eece73e8c9ed7f41d2f20f727bc7f644c499360;hb=ddee8decac57d1a4a0aa23cc76066609a2abc8d2
> > >>>>>
> > >>>>> The source archive being voted upon can be found here:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/apache-storm-2.0.0-src.tar.gz
> > >>>>>
> > >>>>> Other release files, signatures and digests can be found here:
> > >>>>>
> > >>>>>
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/
> > >>>>>
> > >>>>> The release artifacts are signed with the following key:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
> > >>>>>
> > >>>>> The Nexus staging repository for this release is:
> > >>>>>
> > >>>>>
> > >>
> https://repository.apache.org/content/repositories/orgapachestorm-1073
> > >>>>>
> > >>>>> Please vote on releasing this package as Apache Storm 2.0.0.
> > >>>>>
> > >>>>> When voting, please list the actions taken to verify the release.
> > >>>>>
> > >>>>> This vote will be open for at least 72 hours.
> > >>>>>
> > >>>>> [ ] +1 Release this package as Apache Storm 2.0.0
> > >>>>> [ ]  0 No opinion
> > >>>>> [ ] -1 Do not release this package because...
> > >>>>>
> > >>>>> Thanks to everyone who contributed to this release.
> > >>>>>
> > >>>>> -Taylor
> > >>>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Thanks,
> > >> Xin
> > >>
> >
> >
>


Re: New Storm Committer and PMC Member

2019-01-09 Thread Arun Mahadevan
Congratulations Govind, keep up the good work.

- Arun

On Wed, 9 Jan 2019 at 11:13, Kishorkumar Patil 
wrote:

> Congratulations Govind!
>
> -Kishor
>
>
> On Wed, Jan 9, 2019 at 1:00 PM Stig Rohde Døssing 
> wrote:
>
> > Congratulations.
> >
> > On Wed, 9 Jan 2019 at 19:55, Roshan Naik wrote:
> >
> > > Congratulations Govind. Roshan
> > >
> > >
> > > On Wednesday, January 9, 2019, 10:47 AM, Ethan Li <
> > > ethanopensou...@gmail.com> wrote:
> > >
> > > Congratulations! Govind. Well deserved!
> > >
> > > Ethan
> > >
> > > > On Jan 9, 2019, at 12:40 PM, Hugo Louro  wrote:
> > > >
> > > > Congratulations Govind. Very well deserved. Thank you for all your
> > > > contributions and dedication to the Storm project.
> > > >
> > > > Best,
> > > > Hugo
> > > >
> > > > On Wed, Jan 9, 2019 at 10:36 AM Bobby Evans 
> wrote:
> > > >
> > > > >> I am happy to announce that Govind Menon has just been added as the
> > > > >> latest Committer and PMC member to the Apache Storm Project.  Please
> > > > >> join me in congratulating him on this and thanking him for his
> > > > >> contributions so far.
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Bobby Evans
> > > >>
> > >
> > >
> > >
> > >
> > >
> >
>


Re: [VOTE] Release Apache Storm 2.0.0 (rc4)

2019-01-08 Thread Arun Mahadevan
+1

- Downloaded the binaries and validated signatures.
- Deployed the binaries, ran some sample topologies and checked the UI.
- Ran top level build using the source zip.

Thanks,
Arun


On Tue, 8 Jan 2019 at 11:03, P. Taylor Goetz  wrote:

> This is a call to vote on releasing Apache Storm 2.0.0 (rc4)
>
> Full list of changes in this release:
>
>
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/RELEASE_NOTES.html
>
> The tag/commit to be voted upon is v2.0.0:
>
>
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=1eece73e8c9ed7f41d2f20f727bc7f644c499360;hb=ddee8decac57d1a4a0aa23cc76066609a2abc8d2
>
> The source archive being voted upon can be found here:
>
>
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/apache-storm-2.0.0-src.tar.gz
>
> Other release files, signatures and digests can be found here:
>
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-2.0.0-rc4/
>
> The release artifacts are signed with the following key:
>
>
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>
> The Nexus staging repository for this release is:
>
> https://repository.apache.org/content/repositories/orgapachestorm-1073
>
> Please vote on releasing this package as Apache Storm 2.0.0.
>
> When voting, please list the actions taken to verify the release.
>
> This vote will be open for at least 72 hours.
>
> [ ] +1 Release this package as Apache Storm 2.0.0
> [ ]  0 No opinion
> [ ] -1 Do not release this package because...
>
> Thanks to everyone who contributed to this release.
>
> -Taylor


Re: Regarding releasing Apache Storm 2.0.0

2018-09-11 Thread Arun Mahadevan
out before merging it in, so we can
> > > > > > set up the branches properly at that time.
> > > > >
> > > > >
> > > > > On Wed, Jul 18, 2018 at 10:47 PM Jungtaek Lim 
> > wrote:
> > > > >
> > > > > >> I'd like to say first, thanks to Stig for taking up the remaining
> > > > > >> issues. Thanks to his efforts, according to the epic, we have only
> > > > > >> one major issue left: porting the UI to Java [1], and a pull
> > > > > >> request [2] is available for that.
> > > > > >> There are other issues [3] [4] targeting 2.0.0 (since they are
> > > > > >> backward incompatible), but they are all about removing deprecated
> > > > > >> things, so they are easier to review and decide on.
> > > > > >>
> > > > > >> Since we have a patch for that now, IMHO it would be good to review
> > > > > >> and ship it in 2.0.0 if it wouldn't take a month or so. We could do
> > > > > >> some sanity tests in parallel, so waiting for the UI port would not
> > > > > >> block much time on releasing Storm 2.0.0.
> > > > >>
> > > > >> - Jungtaek Lim (HeartSaVioR)
> > > > >>
> > > > >> 1. https://issues.apache.org/jira/browse/STORM-1311
> > > > >> 2. https://github.com/apache/storm/pull/2752
> > > > >> 3. https://issues.apache.org/jira/browse/STORM-2947
> > > > >> 4. https://issues.apache.org/jira/browse/STORM-3156
> > > > >>
> > > > >>
> > > > >> 2018년 7월 11일 (수) 오전 5:12, Alexandre Vermeerbergen <
> > > > >> avermeerber...@gmail.com>님이
> > > > >> 작성:
> > > > >>
> > > > >>> +1 would love to try it when an RC is avail!
> > > > >>>
> > > > >>> Alexandre Vermeerbergen
> > > > >>>
> > > > >>> 2018-07-10 21:15 GMT+02:00 Arun Mahadevan :
> > > > >>>> +1 to get it out soon.
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On 7/10/18, 11:52 AM, "P. Taylor Goetz" 
> > wrote:
> > > > >>>>
> > > > >>>>> +1 Sounds good to me.
> > > > >>>>>
> > > > >>>>> -Taylor
> > > > >>>>>
> > > > >>>>>> On Jul 10, 2018, at 2:18 AM, Jungtaek Lim 
> > > > wrote:
> > > > >>>>>>
> > > > >>>>>> Hi devs,
> > > > >>>>>>
> > > > > >>>>>> I hopefully have a time to sort out issues regarding Storm
> > > > > >>>>>> 2.0.0 and link to epic issue.
> > > > > >>>>>>
> > > > > >>>>>> https://issues.apache.org/jira/browse/STORM-2714
> > > > > >>>>>> (require login to Apache JIRA to see issues in epic)
> > > > > >>>>>>
> > > > > >>>>>> I guess we are close to the release, mostly left reviewing
> > > > > >>>>>> some pending pull requests, and some manual sanity tests.
> > > > > >>>>>>
> > > > > >>>>>> Given that master branch is relatively stabilized for Travis
> > > > > >>>>>> CI build, as well as style check and Java port make codebase
> > > > > >>>>>> better (at least for me), I would really want to make Storm
> > > > > >>>>>> 2.0.0 released sooner than later, and rely majorly on 2.x
> > > > > >>>>>> version line.
> > > > > >>>>>>
> > > > > >>>>>> So I would propose dev folks to concentrate on remaining
> > > > > >>>>>> tasks for Storm 2.0.0 till we announce release. WDYT?
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>> Jungtaek Lim (HeartSaVioR)
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> >
>


Re: Regarding releasing Apache Storm 2.0.0

2018-07-10 Thread Arun Mahadevan
+1 to get it out soon.




On 7/10/18, 11:52 AM, "P. Taylor Goetz"  wrote:

>+1 Sounds good to me.
>
>-Taylor
>
>> On Jul 10, 2018, at 2:18 AM, Jungtaek Lim  wrote:
>> 
>> Hi devs,
>> 
>> I hopefully have a time to sort out issues regarding Storm 2.0.0 and link
>> to epic issue.
>> 
>> https://issues.apache.org/jira/browse/STORM-2714
>> (require login to Apache JIRA to see issues in epic)
>> 
>> I guess we are close to the release, mostly left reviewing some pending
>> pull requests, and some manual sanity tests.
>> 
>> Given that master branch is relatively stabilized for Travis CI build, as
>> well as style check and Java port make codebase better (at least for me), I
>> would really want to make Storm 2.0.0 released sooner than later, and rely
>> majorly on 2.x version line.
>> 
>> So I would propose dev folks to concentrate on remaining tasks for Storm
>> 2.0.0 till we announce release. WDYT?
>> 
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>



Re: Verification about my observation for current State implementation

2018-02-20 Thread Arun Mahadevan
o many flexibility and being stuck from that.)
>
>Windowed state is another thing we should deal with. I still haven't had
>time to dive deep into it, but to support resharding, window state (and
>effectively any state) should be grouped by key so that it can be placed on
>the right (possibly new) task after relaunching the topology. I'm asking
>whether the current window/partition/windowsystem state is capable of that.
>
>> IMO before we do this, we can introduce stateful exactly once processing
>via the streams API as the first step.
>
>While I definitely agree that introducing stateful exactly-once processing
>is major feature and the thing we should support sooner than later, I'm not
>100% sure about which thing to do first. We're supporting at-least-once
>stateful processing in 1.x version line and haven't provided "state
>reshard" feature for a long time which makes actual users directly suffer,
>so that's another major feature to me.
>
>- Jungtaek Lim (HeartSaVioR)
>
>1. http://storm.apache.org/releases/1.2.1/Command-line-client.html
>
>
>On Wed, 21 Feb 2018 at 4:22 AM, Arun Iyer <ai...@hortonworks.com> wrote:
>
>> correction: task count remains the same (executor count can vary).
>>
>>
>>
>>
>> On 2/20/18, 11:20 AM, "Arun Iyer on behalf of Arun Mahadevan" <
>> ai...@hortonworks.com on behalf of ar...@apache.org> wrote:
>>
>> >Hi Jungtaek,
>> >
>> >
>> >1. Right now users need to vary the database/table name in the state
>> >provider config per topology. Agree, its better to include topology in the
>> >namespace.
>> >
>> >2. IMO, users should have the flexibility to store multiple key-values in
>> >the state in the core API. As of now storm only supports rebalancing the
>> >tasks (executor count remains the same). So users need not worry about
>> >re-sharding their state as long as they use the right grouping. I think
>> >this flexibility also helps us to provide useful abstractions on top.
>> >
>> >3. Re-sharding would be based on the keys (re-hashing). The component id,
>> >task-id would be used to map to the namespace. So I am not clear about the
>> >concern you have raised.
>> >
>> >We could support dynamic state re-sharding via underlying higher level
>> >abstractions (e.g. Streams API) where we control the namespaces, keys etc
>> >and would be more manageable. IMO before we do this, we can introduce
>> >stateful exactly once processing via the streams API as the first step.
>> >
>> >Thanks,
>> >Arun
>> >
>> >
>> >
>> >On 2/19/18, 4:59 PM, "Jungtaek Lim" <kabh...@gmail.com> wrote:
>> >
>> >>Hi,
>> >>
>> >>I'd like to verify my observation on current State implementation is
>> >>correct, so that we could fix them if necessary and make plan for
>> >>improvement.
>> >>
>> >>1. State is stored with namespace prefix which typically composes to
>> >>(component id, task id) pair and it doesn't look like having classification
>> >>for topology. Is this correct observation? If then I think that's worth to
>> >>call it as 'critical' and it must be fixed.
>> >>
>> >>2. We're allowing end-users to put key of state, and also no restriction
>> >>for grouping on stateful component. I feel such flexibility breaks the
>> >>possibility to reshard state and end-users are required to implement their
>> >>own reshard tool according to their topology state key distribution logic.
>> >>I expect it will not happen on streams API (since it should be done with
>> >>keyed stream) but wouldn't it better to also restrict such flexibility also
>> >>for core API?
>> >>
>> >>3. Suppose we are going to support state resharding (for allowing change of
>> >>parallelism) and we restrict to apply field grouping with key while
>> >>connecting stateful component.
>> >>Then key-value can be moved based on key (though finding and replacing task
>> >>id may not be trivial if component name has '-'... we have same issue on
>> >>metric name, so maybe time to restrict characters on topology name as well
>> >>as component name?).
>> >>Is it also true for window/partition/windowsystem state? I didn't take a
>> >>deep look on window state (I would find a time) but it would be great if
>> >>someone knowing the detail makes it clear.
>> >>
>> >>Thanks in advance,
>> >>Jungtaek Lim (HeartSaVioR)
>> >
>>
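
Point 1 in the quoted message (state namespaced only by component id and task id) can be illustrated with a small sketch. This is plain Python, not Storm code: the function names, the separator characters, and the dictionary standing in for a shared state store are all assumptions made for illustration, not Storm's actual key layout. It only shows why two topologies that share a store and happen to use the same component name and task id would overwrite each other, and how a topology identifier in the namespace keeps them apart.

```python
from typing import Dict, Tuple

# Hypothetical shared key-value store, keyed by (namespace, key).
Store = Dict[Tuple[str, str], int]


def namespace_without_topology(component_id: str, task_id: int) -> str:
    """Namespace built only from (component id, task id), as in point 1."""
    return f"{component_id}-{task_id}"


def namespace_with_topology(topology_id: str, component_id: str, task_id: int) -> str:
    """The same namespace prefixed with a topology identifier (assumed format)."""
    return f"{topology_id}/{component_id}-{task_id}"


def demo() -> Tuple[Store, Store]:
    shared: Store = {}
    isolated: Store = {}
    # Two made-up topologies that both have a component "counter" with task 3.
    for topology_id, count in (("orders-topology", 10), ("clicks-topology", 99)):
        shared[(namespace_without_topology("counter", 3), "apple")] = count
        isolated[(namespace_with_topology(topology_id, "counter", 3), "apple")] = count
    return shared, isolated


shared_store, isolated_store = demo()
print(len(shared_store))    # entries left when the topology is not in the namespace
print(len(isolated_store))  # entries left when it is
```

Without the topology in the namespace the second write replaces the first, which matches the workaround mentioned in the reply: varying the database or table name per topology in the state provider config.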



Re: Verification about my observation for current State implementation

2018-02-20 Thread Arun Mahadevan
Hi Jungtaek,


1. Right now users need to vary the database/table name in the state provider
config per topology. Agreed, it's better to include the topology in the
namespace.

2. IMO, users should have the flexibility to store multiple key-values in the
state in the core API. As of now Storm only supports rebalancing the executors
(the task count remains the same). So users need not worry about re-sharding
their state as long as they use the right grouping. I think this flexibility
also helps us to provide useful abstractions on top.

3. Re-sharding would be based on the keys (re-hashing). The component id and
task id would be used to map to the namespace. So I am not clear about the
concern you have raised.

We could support dynamic state re-sharding via higher-level abstractions
(e.g. the Streams API) where we control the namespaces, keys, etc., which
would be more manageable. IMO before we do this, we can introduce stateful
exactly-once processing via the Streams API as the first step.

Thanks,
Arun



On 2/19/18, 4:59 PM, "Jungtaek Lim"  wrote:

>Hi,
>
>I'd like to verify my observation on current State implementation is
>correct, so that we could fix them if necessary and make plan for
>improvement.
>
>1. State is stored with namespace prefix which typically composes to
>(component id, task id) pair and it doesn't look like having classification
>for topology. Is this correct observation? If then I think that's worth to
>call it as 'critical' and it must be fixed.
>
>2. We're allowing end-users to put key of state, and also no restriction
>for grouping on stateful component. I feel such flexibility breaks the
>possibility to reshard state and end-users are required to implement their
>own reshard tool according to their topology state key distribution logic.
>I expect it will not happen on streams API (since it should be done with
>keyed stream) but wouldn't it better to also restrict such flexibility also
>for core API?
>
>3. Suppose we are going to support state resharding (for allowing change of
>parallelism) and we restrict to apply field grouping with key while
>connecting stateful component.
>Then key-value can be moved based on key (though finding and replacing task
>id may not be trivial if component name has '-'... we have same issue on
>metric name, so maybe time to restrict characters on topology name as well
>as component name?).
>Is it also true for window/partition/windowsystem state? I didn't take a
>deep look on window state (I would find a time) but it would be great if
>someone knowing the detail makes it clear.
>
>Thanks in advance,
>Jungtaek Lim (HeartSaVioR)
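
The key-based re-sharding discussed in points 2 and 3 can be sketched as follows. This is a toy model in plain Python under stated assumptions: `task_for_key` uses a CRC32 hash modulo the task count as a stand-in for a fields grouping (Storm's own hashing is not reproduced here), and the per-task dictionaries stand in for persisted state. It shows that state can be moved mechanically when every key lives on exactly one task, and that the move becomes ambiguous when the same key was written by more than one task.

```python
import zlib
from typing import Dict

State = Dict[int, Dict[str, int]]  # task id -> {key: value}


def task_for_key(key: str, num_tasks: int) -> int:
    """Stand-in for a fields grouping: stable hash of the key modulo task count."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def grouped_by_key(values: Dict[str, int], num_tasks: int) -> State:
    """Build per-task state the way a key-based grouping would distribute it."""
    state: State = {task: {} for task in range(num_tasks)}
    for key, value in values.items():
        state[task_for_key(key, num_tasks)][key] = value
    return state


def reshard(state_by_task: State, new_num_tasks: int) -> State:
    """Move every key to the task that owns it under the new task count."""
    new_state: State = {task: {} for task in range(new_num_tasks)}
    for task_state in state_by_task.values():
        for key, value in task_state.items():
            target = new_state[task_for_key(key, new_num_tasks)]
            if key in target:
                # The same key was held by two old tasks, so the state was not
                # grouped by key and there is no single value to carry over.
                raise ValueError(f"key {key!r} is held by more than one task")
            target[key] = value
    return new_state


counts = {"apple": 3, "banana": 5, "cherry": 7, "date": 11}
resharded = reshard(grouped_by_key(counts, 4), 6)
```

The `ValueError` branch corresponds to the concern in point 2: if users may pick arbitrary state keys and arbitrary groupings, a generic tool cannot decide where a key belongs or how to merge it.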



Re: [VOTE] Release Apache Storm 1.2.1 (rc1)

2018-02-16 Thread Arun Mahadevan
+1 (binding)

- Downloaded .zip/.tar.gz and verified md5.

- Verified the oncrpc jar is not included in the distribution

- Ran a few sample topologies, checked log viewer.

- Built the source code from .zip file


Thanks,
Arun



On 2/16/18, 3:27 PM, "Satish Duggana"  wrote:

>+1 (binding)
>
>On Sat, Feb 17, 2018 at 3:03 AM, P. Taylor Goetz  wrote:
>
>> +1 (binding)
>>
>> Verified that the LGPL dependency is no longer included in the binary
>> distribution.
>>
>> -Taylor
>>
>> > On Feb 16, 2018, at 4:01 PM, P. Taylor Goetz  wrote:
>> >
>> > This is a call to vote on releasing Apache Storm 1.2.1 (rc1)
>> >
>> > NOTE: The primary purpose of this release is to remove an LGPL-licensed
>> > binary that was inadvertently included in the 1.2.0 binary release.
>> >
>> > Full list of changes in this release:
>> >
>> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.1-rc1/RELEASE_NOTES.html
>> >
>> > The tag/commit to be voted upon is v1.2.1:
>> >
>> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=89646c86c667e0a35aed45c8063d777ae1f32b30;hb=d156d25d991311eaa1f5131d3dc34787f87ce684
>> >
>> > The source archive being voted upon can be found here:
>> >
>> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.1-rc1/apache-storm-1.2.1-src.tar.gz
>> >
>> > Other release files, signatures and digests can be found here:
>> >
>> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.1-rc1/
>> >
>> > The release artifacts are signed with the following key:
>> >
>> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> >
>> > The Nexus staging repository for this release is:
>> >
>> > https://repository.apache.org/content/repositories/orgapachestorm-1061
>> >
>> > Please vote on releasing this package as Apache Storm 1.2.1.
>> >
>> > When voting, please list the actions taken to verify the release.
>> >
>> > This vote will be open for 72 hours or until at least 3 PMC members vote
>> > +1.
>> >
>> > [ ] +1 Release this package as Apache Storm 1.2.1
>> > [ ]  0 No opinion
>> > [ ] -1 Do not release this package because...
>> >
>> > Thanks to everyone who contributed to this release.
>> >
>> > -Taylor
>>
>>
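
The "downloaded and verified the digest" step listed in the votes above could be scripted roughly like this. It is a minimal sketch using only Python's standard `hashlib`; the function names are made up for illustration, the default algorithm is an assumption (pass `"md5"` or another name to match whichever digest file is published), and it assumes the caller passes in just the hex digest text. It does not cover signature validation, which needs a GPG tool and the project's KEYS file.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha512", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large release archives need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, published_hex: str, algorithm: str = "sha512") -> bool:
    """Compare against a published hex digest, ignoring whitespace and letter case.

    Assumption: published_hex holds only the hex digest (possibly split across
    lines or groups), not a file name or other decoration.
    """
    expected = "".join(published_hex.split()).lower()
    return file_digest(path, algorithm) == expected
```

A voter would call `matches_published_digest(Path(archive), digest_text)` once per downloaded archive, where `archive` and `digest_text` are whatever was fetched from the release directory.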



Re: [DISCUSS] consider EOL for version lines

2018-02-13 Thread Arun Mahadevan
+1 to maintain 3 version lines.

I think the next focus should be 2.0.0 rather than 1.3.0.




On 2/12/18, 11:40 PM, "Jungtaek Lim"  wrote:

>Hi devs,
>
>I've noticed that we are providing 4 different version lines (1.1.x, 1.0.x,
>0.10.x, 0.9.x) in download page, and I expect we will add one more for
>1.2.0. Moreover, we have one more develop version line (2.0.0 - master)
>which most of development happens there.
>
>Recently we're releasing 3 version lines (1.0.6 / 1.1.2 / 1.2.0)
>simultaneously and it took heavy effort to track all the RCs and verify all
>of them. I guess release manager would take more overhead of releasing, and
>it doesn't make sense for me if we continue maintaining all of them.
>
>Ideally I'd like to propose maintaining three version lines: 2.0.0 (next
>major) / 1.3.0 (next minor - may not happen) / 1.2.1 (next bugfix) and
>making others EOL (that respects semantic versioning and even other
>projects tend to maintain only two version lines), but if someone feels too
>aggressive, I propose at least we explicitly announce EOL to 0.x version
>lines and get rid of any supports (downloads) for them.
>
>Would like to hear your opinion.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)



Re: [CANCELED] [VOTE] Release Apache Storm 1.2.0 (rc2)

2018-02-05 Thread Arun Mahadevan
STORM-2918 has been merged to 1.x branch.


Now it looks like we are waiting for the 1.x versions of

https://github.com/apache/storm/pull/2538 and
https://github.com/apache/storm/pull/2537 ?

Can we get the next RC as soon as the above two are merged to 1.x?

Thanks,
Arun



On 2/1/18, 12:55 AM, "dbis...@gmail.com on behalf of Artem Ervits" 
<dbis...@gmail.com on behalf of artemerv...@gmail.com> wrote:

>-1
>Please include https://issues.apache.org/jira/browse/STORM-2918
>
>On Jan 31, 2018 1:59 PM, "Stig Rohde Døssing" <stigdoess...@gmail.com>
>wrote:
>
>> The log indicates a bug. We can remove the WARN messages pretty easily, but
>> we'd still be throwing and catching an exception for each processed record.
>>
>> We're storing some data in Kafka alongside committed offsets that lets us
>> only apply the EARLIEST and LATEST strategies for where the consumer should
>> start if the topology is redeployed, rather than applying them every time
>> the worker restarts. This was added as a fix to a bug I introduced in
>> https://issues.apache.org/jira/browse/STORM-2666, where we error if the
>> spout tries to emit offsets that have already been committed (that never
>> happens during normal operation, but if the spout is restarted and using
>> EARLIEST it can happen). The new behavior of EARLIEST/LATEST is also much
>> more useful than the old one IMO.
>>
>> The bug is that we're only storing the metadata if the spout is configured
>> for at-least-once. We also support at-most-once and an "anything goes"
>> setting, and when either of those are used, the spout tries to read the
>> metadata that isn't there and complains about it. If we just remove the
>> log, EARLIEST and LATEST will behave differently for at-least-once and
>> at-most-once/no-guarantee.
>>
>> The patch makes changes to ensure that we store metadata in all cases.
>>
>> 2018-01-31 19:47 GMT+01:00 Arun Mahadevan <ar...@apache.org>:
>>
>> > Are we waiting for https://github.com/apache/storm/pull/2538 to start the
>> > next RC ?
>> >
>> > This patch seems to contain more changes than just fixing a logging issue.
>> > Can we only address the concern raised about logs filling with WARN
>> > messages and address the rest in the next release, or do the logs indicate
>> > some bug which is a blocker for 1.2.0 ?
>> >
>> > - Arun
>> >
>> >
>> >
>> >
>> > On 1/29/18, 12:30 PM, "P. Taylor Goetz" <ptgo...@gmail.com> wrote:
>> >
>> > >Cancelling this RC in order to address Stig’s concerns.
>> > >
>> > >-Taylor
>> > >
>> > >> On Jan 26, 2018, at 4:11 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>> > >>
>> > >> This is a call to vote on releasing Apache Storm 1.2.0 (rc2)
>> > >>
>> > >> Full list of changes in this release:
>> > >>
>> > >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/RELEASE_NOTES.html
>> > >>
>> > >> The tag/commit to be voted upon is v1.2.0:
>> > >>
>> > >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=17a4645d7d65f5a7a08a50b5185c0fc52e82692f;hb=458aa1cb696097cf07d4466aa7417c7b89662221
>> > >>
>> > >> The source archive being voted upon can be found here:
>> > >>
>> > >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/apache-storm-1.2.0-src.tar.gz
>> > >>
>> > >> Other release files, signatures and digests can be found here:
>> > >>
>> > >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/
>> > >>
>> > >> The release artifacts are signed with the following key:
>> > >>
>> > >> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> > >>
>> > >> The Nexus staging repository for this release is:
>> > >>
>> > >> https://repository.apache.org/content/repositories/orgapachestorm-1056
>> > >>
>> > >> Please vote on releasing this package as Apache Storm 1.2.0.
>> > >>
>> > >> When voting, please list the actions taken to verify the release.
>> > >>
>> > >> This vote will be open for at least 72 hours.
>> > >>
>> > >> [ ] +1 Release this package as Apache Storm 1.2.0
>> > >> [ ]  0 No opinion
>> > >> [ ] -1 Do not release this package because...
>> > >>
>> > >> Thanks to everyone who contributed to this release.
>> > >>
>> > >> -Taylor
>> > >
>> >
>> >
>>



Re: [CANCELED] [VOTE] Release Apache Storm 1.2.0 (rc2)

2018-01-31 Thread Arun Mahadevan
Are we waiting for https://github.com/apache/storm/pull/2538 to start the next 
RC ?

This patch seems to contain more changes than just fixing a logging issue. Can 
we only address the concern raised about logs filling with WARN messages and 
address the rest in the next release, or do the logs indicate some bug which 
is a blocker for 1.2.0 ?

- Arun




On 1/29/18, 12:30 PM, "P. Taylor Goetz"  wrote:

>Cancelling this RC in order to address Stig’s concerns.
>
>-Taylor
>
>> On Jan 26, 2018, at 4:11 PM, P. Taylor Goetz  wrote:
>> 
>> This is a call to vote on releasing Apache Storm 1.2.0 (rc2)
>> 
>> Full list of changes in this release:
>> 
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/RELEASE_NOTES.html
>> 
>> The tag/commit to be voted upon is v1.2.0:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=17a4645d7d65f5a7a08a50b5185c0fc52e82692f;hb=458aa1cb696097cf07d4466aa7417c7b89662221
>> 
>> The source archive being voted upon can be found here:
>> 
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/apache-storm-1.2.0-src.tar.gz
>> 
>> Other release files, signatures and digests can be found here:
>> 
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc2/
>> 
>> The release artifacts are signed with the following key:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> 
>> The Nexus staging repository for this release is:
>> 
>> https://repository.apache.org/content/repositories/orgapachestorm-1056
>> 
>> Please vote on releasing this package as Apache Storm 1.2.0.
>> 
>> When voting, please list the actions taken to verify the release.
>> 
>> This vote will be open for at least 72 hours.
>> 
>> [ ] +1 Release this package as Apache Storm 1.2.0
>> [ ]  0 No opinion
>> [ ] -1 Do not release this package because...
>> 
>> Thanks to everyone who contributed to this release.
>> 
>> -Taylor
>



Re: [DISCUSS] Decouple Storm core and connectors

2018-01-29 Thread Arun Mahadevan
If the storm-kafka-client is fairly stable with the changes we made in the 1.2 
release, then would we just want to continue the current process ? 

If we want to decouple, I think option 2 may be better than option 1 to start 
with.

When you say storm-kafka-client-vX.X.X, is it going to be completely 
independent of the Storm version and will work across Storm 1.0.x, 1.x and 2.0 
(future) storm releases?

Thanks,
Arun 


On 1/29/18, 12:27 PM, "P. Taylor Goetz"  wrote:

>To give some background information, the Spark PMC decided to remove a number 
>of connectors from their repo (and thus releases). Some members of the 
>community wanted to see some sort of official community support for those 
>connectors, thus the Apache Bahir project was created. Flink also decided to 
>follow that model.
>
>I don’t feel we’ve reached the same conclusion (to remove connectors from our 
>distribution), or, perhaps, not yet. I think where we are right now is wanting 
>to decouple storm-kafka-client from Storm’s release cycle so updates can be 
>released more often.
>
>As I mentioned in the vote thread, there are two approaches we could take:
>
>1. Move storm-kafka-client to a new git repo under the purview of the Storm 
>PMC.
>2. Leave storm-kafka-client in place, but decouple it from the main 
>build/release process so it can be released independently of storm proper.
>
>I lean toward option 2. One downside as Stig pointed out is that it would make 
>tagging awkward, but I don’t see that as much of a problem — we could simply 
>keep the top-level tag convention “vX.X.X” and add another along the lines of 
>“storm-kafka-client-vX.X.X”. Yes, it adds another tag/tag convention, but even 
>if we moved all the connectors to another repo and versioned them all 
>independently we would have to do the same in that repo.
>
>If we start with option 2, it would make it easier to roll back if for some 
>reason we later decided it wasn’t a good idea. It could also represent a “try 
>before you buy” option toward moving to option 1.
>
>The argument I see for option 1 would be a better IDE experience — i.e. you 
>would have two separate IDE projects that might make development/testing with 
>various versions easier.
>
>I’m open to either approach and would volunteer to do the work to make it 
>happen.
>
>-Taylor
>
>> On Jan 29, 2018, at 12:45 AM, Jungtaek Lim  wrote:
>> 
>> My idea basically came from Apache Bahir (http://bahir.apache.org/). It
>> was for Apache Spark, but Flink decided to migrate their connectors to
>> Bahir, so it is also for Apache Flink. They're also maintaining some
>> connectors (I'd say with first-class support) in their repositories, but not
>> all. I think we could select some connectors to support as first class,
>> and move the others out to Bahir or another storm repository (storm-connectors?
>> storm-externals?).
>> 
>> - Jungtaek Lim (HeartSavioR)
>> 
>> On Mon, Jan 29, 2018 at 2:30 PM, Jungtaek Lim wrote:
>> 
>>> Hi devs,
>>> 
>>> This is the initial post to separate the discussion topic out from the vote
>>> thread and continue discussing.
>>> 
>>> Background of the topic:
>>> 1. Releasing Storm requires huge bootstrapping, and it normally takes several
>>> months to release a bugfix version. Note that this is not a minor version...
>>> A minor version is released about once a year. Connectors are maintained on
>>> the same release cadence, which means connectors also take a long time to
>>> release, whether they are (implicitly) beta or not.
>>> 2. Most of the changes to connectors are not related to Storm core. They
>>> tend to be compatible with all releases that share the same major version.
>>> 3. (IMHO) We have too many connectors, some of which we can't even maintain
>>> actively. For example, the ES connector couldn't support ES higher than 1.x.
>>> 4. Connectors have the same release version as Storm core, hence a newly
>>> added connector will have at least a 1.x version, which no one would think
>>> of as beta.
>>> 
>>> Downside:
>>> 1. Detached connectors can easily be forgotten (more easily than now).
>>> 2. Connectors may have a hard time if we bring a backward-incompatible change
>>> to Storm core. We may remedy this by declaring a supported Storm version
>>> range for each connector version.
>>> 
>>> Please share your opinion on this topic. You're encouraged to copy your
>>> previous post from the vote thread, which helps centralize opinions in the
>>> current thread.
>>> 
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>> 
>



Re: [VOTE] Release Apache Storm 1.2.0 (rc1)

2018-01-24 Thread Arun Mahadevan
+1 (binding)

- Downloaded and deployed the tar.gz and .zip binary distribution.
- Verified MD5.
- Built the source with JDK 1.8
- Ran a few sample topologies and observed the output.
- Viewed the worker logs via log viewer and did some basic sanity on the 
metrics.
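
For anyone repeating the checksum step above, here is a minimal Java sketch of one way to do it. It is not the procedure actually used for this vote. The class name `ChecksumCheck` and its methods are hypothetical, and it assumes the published digest is a plain hex string, possibly with whitespace or upper-case letters; digest files in other layouts would need extra parsing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper for checking a downloaded artifact against a published digest.
public class ChecksumCheck {

    // Computes the lower-case hex digest of a file, e.g. with "MD5" or "SHA-512".
    public static String hexDigest(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Assumption: the expected value is plain hex; whitespace and case are normalized.
    public static boolean matches(Path file, String algorithm, String expected)
            throws IOException, NoSuchAlgorithmException {
        String normalized = expected.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return hexDigest(file, algorithm).equals(normalized);
    }
}
```

A call such as `ChecksumCheck.matches(path, "SHA-512", publishedHex)` would then be run once per downloaded archive; signature (GPG) verification is a separate step that this sketch does not cover.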

Thanks,
Arun




On 1/23/18, 10:40 PM, "Jungtaek Lim"  wrote:

>Let's get back to verifying the release and voting.
>
>+1 (binding)
>
>> source
>
>- verify file (signature, MD5, SHA)
>-- source, tar.gz : OK
>-- source, zip : OK
>
>- extract file
>-- source, tar.gz : OK
>-- source, zip : OK
>
>- diff-ing extracted files between tar.gz and zip : OK
>
>- build source with JDK 7
>-- source, tar.gz : integration-test failed, others are OK
>
>- build source dist
>-- source, tar.gz : OK
>
>- build binary dist
>-- source, tar.gz : OK
>
>> binary
>
>- verify file (signature, MD5, SHA)
>-- binary, tar.gz : OK
>-- binary, zip : OK
>
>- extract file
>-- binary, tar.gz : OK
>-- binary, zip : OK
>
>- diff-ing extracted files between tar.gz and zip : OK
>
>- launch daemons : OK
>
>- run RollingTopWords (local) : OK
>
>- run RollingTopWords (remote) : OK
>  - activate / deactivate / rebalance / kill : OK
>  - logviewer (worker dir, daemon dir) : OK
>  - change log level : OK
>  - thread dump, heap dump, restart worker : OK
>  - log search : OK
>
>I don't see odd numbers while testing, but I don't have a stage/production
>level cluster or use case, so someone else might be able to see the behavior
>that Alexandre encountered.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Wed, Jan 24, 2018 at 10:19 AM, Jungtaek Lim wrote:
>
>> Alexandre,
>>
>> Please file an issue with a screenshot and steps to reproduce (if at all
>> possible). It would be much appreciated if you could spend time diving
>> into the codebase to find the cause, then fix it and submit a patch (only if
>> you can).
>> The open source community can't live without contributors. I think reporting
>> an issue is itself a great contribution, but I feel we don't have enough code
>> contributors who could help drive the community.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Jan 24, 2018 at 9:57 AM, P. Taylor Goetz wrote:
>>
>>> Yes, that’s the same error I got, and I think we both just shaved the
>>> same yak. ;)
>>>
>>> I imagine infra is enforcing TLS > 1.0 now.
>>>
>>> -Taylor
>>>
>>> > On Jan 23, 2018, at 7:46 PM, Jungtaek Lim  wrote:
>>> >
>>> > Stig, the script doesn't work for me either, but that's not because of a
>>> > script or jira module error.
>>> > I've encountered a TLSV1_ALERT_PROTOCOL_VERSION error, and my python2.7 is
>>> > unfortunately coupled with OpenSSL 0.9.8zh, which doesn't support TLSv1.2.
>>> > My python3.6 is coupled with OpenSSL 1.0.2l, but the script is not
>>> > compatible with python 3. Maybe I need to modify the script to be
>>> > compatible with python3.6.
>>> >
>>> > cc. to Taylor, assuming that we are getting the same error.
>>> >
>>> > - Jungtaek Lim (HeartSaVioR)
>>> >
>>> > On Wed, Jan 24, 2018 at 8:21 AM, Stig Rohde Døssing wrote:
>>> >
>>> >> Taylor,
>>> >>
>>> >> The release notes script appears to work fine for me. There are a couple of
>>> >> issues with fix version 1.2.0 that are not resolved, which we should fix.
>>> >> Note that 2710 is the release 1.2.0 epic, we might want to not mark that
>>> >> with a fix version so it isn't included in the release notes.
>>> >>
>>> >> dev-tools/release_notes.py 1.2.0
>>> >> The release is not completed since unresolved issues or improperly resolved
>>> >> issues were found still tagged with this release as the fix version:
>>> >> Unresolved issue:  STORM-2904 None
>>> >> https://issues.apache.org/jira/browse/STORM-2904
>>> >> Unresolved issue:  STORM-2710 None
>>> >> https://issues.apache.org/jira/browse/STORM-2710
>>> >> Unresolved issue:  STORM-2153 None
>>> >> https://issues.apache.org/jira/browse/STORM-2153
>>> >>
>>> >> If I ignore the unresolved issues check, I get the expected release notes
>>> >>
>>> >> dev-tools/release_notes.py 1.2.0 > release-1.2.0.html produces
>>> >> https://pste.eu/p/ZvbF.html
>>> >>
>>> >> 2018-01-24 0:09 GMT+01:00 Alexandre Vermeerbergen <avermeerber...@gmail.com>:
>>> >>
>>> >>> Hello,
>>> >>>
>>> >>> I'm afraid my vote on 1.2.0 RC1 is a -1:
>>> >>>
>>> >>> Indeed metrics displayed in Storm UI from 1.2.0 RC1 are obviously wrong.
>>> >>>
>>> >>> See for example attached picture showing "Assigned Mem (MB)" for one
>>> >>> of our topologies:
>>> >>> -  On the left hand side we have Storm 1.1.0 showing 2112 MB on each
>>> >>> host, which sounds "normal" to us (in line with what we had with
>>> >>> previous Storm 1.0.3 version)
>>> >>> -  On the right hand side we have Storm 1.2.0 RC1 showing 65 MB on
>>> >>> each host, which sounds completely wrong !
>>> >>>
>>> >>> And I have similar concerns on the statistics on 

Re: [VOTE] Release Apache Storm 1.2.0 (rc1)

2018-01-23 Thread Arun Mahadevan
Looks weird. At least the storm-starter exclamation and word count topologies 
show Assigned Mem: 2496 MB and capacity numbers in the range of “0.005 - 
0.015” on a 16 GB MacBook.

Thanks,
Arun 


On 1/23/18, 3:09 PM, "Alexandre Vermeerbergen"  wrote:

>Hello,
>
>I'm afraid my vote on 1.2.0 RC1 is a -1:
>
>Indeed metrics displayed in Storm UI from 1.2.0 RC1 are obviously wrong.
>
>See for example attached picture showing "Assigned Mem (MB)" for one
>of our topologies:
>-  On the left hand side we have Storm 1.1.0 showing 2112 MB on each
>host, which sounds "normal" to us (in line with what we had with
>previous Storm 1.0.3 version)
>-  On the right hand side we have Storm 1.2.0 RC1 showing 65 MB on
>each host, which sounds completely wrong !
>
>And I have similar concerns on the statistics on bolts, for example on
>a bolt of our topology in charge of writing logs into HBase, we have:
>
>With Storm 1.1.0, capacity (last 10 min): 0.090 ; Execute Latency (ms): 0.029
>With Storm 1.2.0, capacity (last 10 min): 438.956 ; Execute Latency
>(ms): 197.840
>
>Am I the only one to find weird numbers in Storm UI 1.2.0 ?
>
>Best regards,
>Alexandre Vermeerbergen



Re: [Discuss] Release Storm 1.2.0 (cont.)

2018-01-08 Thread Arun Mahadevan
+1 to start the release process once STORM-2153 is merged. 

If STORM-2860 can be merged soon we can include that as well.

Thanks,
Arun




On 1/7/18, 4:14 PM, "Jungtaek Lim" <kabh...@gmail.com> wrote:

>We have now merged STORM-2867 and STORM-2869.
>
>The remaining issues are STORM-2153 and STORM-2860. STORM-2860 doesn't seem
>to bring any benefit to the 1.x version line, so unless I'm missing something,
>we just need to make sure STORM-2153 is resolved so that we can start the
>release phase for Storm 1.2.0.
>
>- Jungtaek Lim (HeartSaVioR)
>
>On Fri, Dec 29, 2017 at 4:51 AM, Arun Mahadevan <ar...@apache.org> wrote:
>
>> >STORM-2869: KafkaSpout discards all pending record when adjusting the
>> >consumer position after a commit [1]
>>
>> Hope we could get it merged this week or early next week.
>>
>>
>> >>New Feature
>> >STORM-2153: New Metrics Reporting API [2]
>>
>> I think this is waiting for a final +1 from revans2.
>>
>>
>> >STORM-2867: Add consumer lag metrics to KafkaSpout [3]
>>
>>
>> If required we can call out the kafka dependency since it's a minor version
>> change. It may not be an issue if we use the reflection workaround proposed
>> in the PR ?
>>
>> IMO, it will be ideal to start the release process for 1.2.0 in the first
>> week of Jan after the above three are addressed.
>>
>> Thanks,
>> Arun
>>
>>
>>
>> On 12/27/17, 11:46 PM, "Jungtaek Lim" <kabh...@gmail.com> wrote:
>>
>> >Looks like we lost the chance to start the release phase in 2017,
>> >but I think we are really close to being sure we can start the process in
>> >early Jan. 2018.
>> >
>> >We haven't had a "feature freeze" before releasing, so typically we still
>> >have a chance to get more features into Storm 1.2.0.
>> >
>> >So far, these are the remaining issues for Storm 1.2.0:
>> >
>> >> Bug
>> >STORM-2869: KafkaSpout discards all pending record when adjusting the
>> >consumer position after a commit [1]
>> >
>> >The PR for master got a +1, so once we have a PR for 1.x-branch, we can go
>> >ahead and merge. I expect this will be done in several days, unless Stig is
>> >going on a long vacation.
>> >> New Feature
>> >STORM-2153: New Metrics Reporting API [2]
>> >
>> >This is likely waiting for final review, so I expect this patch to be
>> >finished within a couple of weeks (early Jan. 2018), and if not I'd like to
>> >propose moving it out of 1.2.0.
>> >
>> >STORM-2867: Add consumer lag metrics to KafkaSpout [3]
>> >
>> >The patch looks good, but it requires the Kafka dependency to be updated from
>> >0.10.0.0 to 0.10.1.0, which might make Kafka 0.10.0.x users unable to use
>> >storm-kafka-client in Storm 1.2.0. Do we want to have a poll, or is it
>> >not a big deal?
>> >
>> >STORM-2860: Add Kerberos support to Solr bolt [4]
>> >
>> >This patch breaks backward compatibility, and we are discussing an
>> >alternative that does not break backward compatibility for 1.2.0. If we can
>> >get an alternative in time, we could bring it to Storm 1.2.0, but if not, it
>> >should not block the release.
>> >
>> >Please participate in reviewing, report any missing issues for Storm
>> >1.2.0, or give opinions on Storm 1.2.0.
>> >
>> >Thanks,
>> >Jungtaek Lim (HeartSaVioR)
>> >
>> >1. https://issues.apache.org/jira/browse/STORM-2869
>> >2. https://issues.apache.org/jira/browse/STORM-2153
>> >3. https://issues.apache.org/jira/browse/STORM-2867
>> >4. https://issues.apache.org/jira/browse/STORM-2860
>> >
>> >On Mon, Dec 18, 2017 at 3:50 AM, Stig Rohde Døssing <stigdoess...@gmail.com> wrote:
>> >
>> >> Alexandre,
>> >>
>> >> There are a couple more issues pending, see
>> >> https://issues.apache.org/jira/browse/STORM-2710. It might be easier if
>> >> you build the code yourself. There's a guide at
>> >> https://github.com/apache/storm/blob/master/DEVELOPER.md#create-a-storm-distribution-packaging
>> >> .
>> >> The only change you should need to make is to run "mvn package -Dgpg.skip"
>> >> instead of "mvn package" in the storm-dist directory to skip the GPG
>> >> signature part.
>> >>
>> >> 2017-12-17 15:09 GMT+0

Re: [Discuss] Release Storm 1.2.0 (cont.)

2017-12-28 Thread Arun Mahadevan
>STORM-2869: KafkaSpout discards all pending record when adjusting the
>consumer position after a commit [1]

Hope we could get it merged this week or early next week.


>>New Feature
>STORM-2153: New Metrics Reporting API [2]

I think this is waiting for a final +1 from revans2.


>STORM-2867: Add consumer lag metrics to KafkaSpout [3]


If required we can call out the kafka dependency since it's a minor version 
change. It may not be an issue if we use the reflection workaround proposed in 
the PR ?
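
To make the reflection idea concrete, here is a generic sketch of the pattern. It is not the code from the PR, and `OptionalApi` and `invokeIfPresent` are hypothetical names. The assumption is that a method which exists only in the newer library version is looked up at runtime, so the calling code still loads and runs against the older version and simply skips the feature there.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical helper: call a no-arg public method only if the running
// library version actually provides it.
public class OptionalApi {

    public static Optional<Object> invokeIfPresent(Object target, String methodName) {
        try {
            // Runtime lookup, so there is no compile-time dependency on the method.
            Method m = target.getClass().getMethod(methodName);
            return Optional.ofNullable(m.invoke(target));
        } catch (NoSuchMethodException e) {
            // Older library version: the feature is unavailable, the caller decides what to do.
            return Optional.empty();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call to " + methodName + " failed", e);
        }
    }
}
```

The trade-off is that mistakes in the method name or signature only show up at runtime, so a pattern like this usually wants a test against each supported library version.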

IMO, it will be ideal to start the release process for 1.2.0 in the first week 
of Jan after the above three are addressed.

Thanks,
Arun



On 12/27/17, 11:46 PM, "Jungtaek Lim"  wrote:

>Looks like we lost the chance to start the release phase in 2017,
>but I think we are really close to being sure we can start the process in
>early Jan. 2018.
>
>We haven't had a "feature freeze" before releasing, so typically we still
>have a chance to get more features into Storm 1.2.0.
>
>So far, these are the remaining issues for Storm 1.2.0:
>
>> Bug
>STORM-2869: KafkaSpout discards all pending record when adjusting the
>consumer position after a commit [1]
>
>The PR for master got a +1, so once we have a PR for 1.x-branch, we can go
>ahead and merge. I expect this will be done in several days, unless Stig is
>going on a long vacation.
>
>> New Feature
>STORM-2153: New Metrics Reporting API [2]
>
>This is likely waiting for final review, so I expect this patch to be
>finished within a couple of weeks (early Jan. 2018), and if not I'd like to
>propose moving it out of 1.2.0.
>
>STORM-2867: Add consumer lag metrics to KafkaSpout [3]
>
>The patch looks good, but it requires the Kafka dependency to be updated from
>0.10.0.0 to 0.10.1.0, which might make Kafka 0.10.0.x users unable to use
>storm-kafka-client in Storm 1.2.0. Do we want to have a poll, or is it
>not a big deal?
>
>STORM-2860: Add Kerberos support to Solr bolt [4]
>
>This patch breaks backward compatibility, and we are discussing an
>alternative that does not break backward compatibility for 1.2.0. If we can
>get an alternative in time, we could bring it to Storm 1.2.0, but if not, it
>should not block the release.
>
>Please participate in reviewing, report any missing issues for Storm
>1.2.0, or give opinions on Storm 1.2.0.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>1. https://issues.apache.org/jira/browse/STORM-2869
>2. https://issues.apache.org/jira/browse/STORM-2153
>3. https://issues.apache.org/jira/browse/STORM-2867
>4. https://issues.apache.org/jira/browse/STORM-2860
>
>On Mon, Dec 18, 2017 at 3:50 AM, Stig Rohde Døssing wrote:
>
>> Alexandre,
>>
>> There are a couple more issues pending, see
>> https://issues.apache.org/jira/browse/STORM-2710. It might be easier if
>> you build the code yourself. There's a guide at
>>
>> https://github.com/apache/storm/blob/master/DEVELOPER.md#create-a-storm-distribution-packaging
>> .
>> The only change you should need to make is to run "mvn package -Dgpg.skip"
>> instead of "mvn package" in the storm-dist directory to skip the GPG
>> signature part.
>>
>> 2017-12-17 15:09 GMT+01:00 Alexandre Vermeerbergen <avermeerber...@gmail.com>:
>>
>> > Hello Storm developers,
>> >
>> > Now that I see that everything planned for the Storm 1.2.0 release is done
>> > (as I see at https://issues.apache.org/jira/projects/STORM/versions/12341047),
>> > would it be possible to have new binaries so that we can assess this new
>> > release for regressions?
>> >
>> > In particular, I would like to check whether or not the bizarre capacity
>> > metrics I get with the 1+ month-old Storm 1.2.0 preview are still there.
>> >
>> > Best regards,
>> > Alexandre
>> >
>> >
>> > 2017-12-09 10:38 GMT+01:00 Alexandre Vermeerbergen <avermeerber...@gmail.com>:
>> >
>> > > Hello Storm developers,
>> > >
>> > > I have been running the Storm 1.2.0 preview for about 2 weeks with 15
>> > > topologies, 6 Supervisors, 1 Nimbus, 3 Zookeepers, and Kafka 0.10 libs,
>> > > all with the Storm Kafka client spout instead of our own Kafka 0.10 spout.
>> > >
>> > > I noticed that statistics are going a bit nuts on bolts, with capacities
>> > > reaching hundreds or more while everything seems to be running fine.
>> > > It looks like this is tied to topologies relying on tick tuples, but I am
>> > > not sure; it is just a guess.
>> > >
>> > > See what kind of ridiculous capacities I can reach:
>> > >
>> > > Id | Executors | Tasks | Emitted | Transferred | Capacity (last 10m) | Execute latency (ms) | Executed | Process latency (ms) | Acked | Failed | Error Host | Error Port | Last error | Error Time
>> > > aggregate221306601306600.0000.027130620
>> > > 0.0311305800
>> > > alertsToKafka33000.0100.0542238580
>> > > 337.08022384200
>> > > checkUnknown2020145840145840120.39152060.445
>> > > 1906015062.337   

Re: [Discuss] Release Storm 1.2.0

2017-12-05 Thread Arun Mahadevan
Looks like we are now only waiting on the Kafka spout issues below:

https://github.com/apache/storm/pull/2428
https://github.com/apache/storm/pull/2438

Maybe we should include the metrics changes as well?
https://github.com/apache/storm/pull/2203  


Can we try to get the above merged ASAP and start the 1.2.0 release process ?

Thanks,
Arun



On 11/21/17, 3:18 AM, "generalbas@gmail.com on behalf of Stig Rohde 
Døssing"  wrote:

>Alexandre,
>
>It's a bug in the way I tried to fix the NPE you had a few days ago in
>https://github.com/apache/storm/pull/2428. I missed that using
>setKey/setValue actually builds a new KafkaSpoutConfig.Builder instead of
>just setting a field, and the change I made to the copy constructor means
>that if the value deserializer is set in kafkaProps (which it is when using
>KafkaSpoutConfig.builder), using setKey/Value is ignored.
>
>I've amended the fix to STORM-2826 and added a few more tests. The new jar
>is at
>https://drive.google.com/file/d/1DgJWjhWwczYgZS82YGd63V3GT2G_v9fd/view?usp=sharing
>.
>
>There is not as far as I know a way for you to get the subscribed topics
>from the subscription.
>
>2017-11-21 11:04 GMT+01:00 Alexandre Vermeerbergen:
>
>> Hello Stig,
>>
>> Here's an update of my tests with storm 1.2.0 preview:
>> - I accept the limitation on the stability of the string format returned by
>> getTopicsString(), as I have adapted our code to detect both 1.1.0-style &
>> 1.2.0-style. Isn't there a clean way to get the list of topics other than
>> our fragile parsing?
>> - My ~15 topologies have been running for 24 hours with storm 1.2.0 preview
>> + our own Kafka spout deriving from the Storm kafka client 1.2.0 preview
>> setting. I have seen no stability or performance issues (but that's not yet
>> a large-scale test).
>> - When I tried to switch one of our topologies to your storm-kafka-client,
>> I was surprised to get no stats on the topology.
>>   Then I noticed exceptions for all messages read by the spout:
>>
>> java.lang.String cannot be cast to
>> com.dassault_systemes.infra.monitoring.model.Event
>> java.lang.ClassCastException:
>> com.acme_systemes.infra.monitoring.model.Event
>> at com.dassault_systemes.storm.eval
>>
>>
>> And also:
>>
>> 2017-11-21 08:28:40.958 o.a.s.k.s.KafkaSpout
>> Thread-5-eventFromAdminTopic-executor[12 12] [INFO] Kafka Spout opened
>> with
>> the following configuration:
>> KafkaSpoutConfig{kafkaProps={key.deserializer=class
>> org.apache.kafka.common.serialization.StringDeserializer,
>> value.deserializer=class
>> org.apache.kafka.common.serialization.StringDeserializer,
>> enable.auto.commit=false, request.timeout.ms=120,
>> group.id=Storm_RealTimeSupervision_9XkvRUExS2GFNAZNcBjQug_
>> defaultStormTopic_alerting_administration,
>> bootstrap.servers=ows-171-33-69-118.eu-west-2.compute.outscale.com:9092,
>> auto.commit.interval.ms=6, session.timeout.ms=12,
>> auto.offset.reset=earliest},
>> key=org.apache.kafka.common.serialization.StringDeserializer@61dc4a48,
>> value=com.acme_systemes.storm.evaluator.spout.EventKafkaDeserializer@
>> 5d6e1916,
>> pollTimeoutMs=200, offsetCommitPeriodMs=3,
>> maxUncommittedOffsets=1000, firstPollOffsetStrategy=LATEST,
>> subscription=org.apache.storm.kafka.spout.ManualPartitionSubscription@
>> 4ff512c9,
>> translator=com.acme_systemes.storm.evaluator.spout.
>> EventKafkaRecordTranslator@593c16f5,
>> retryService=KafkaSpoutRetryExponentialBackoff{delay=TimeInterval{length=
>> 0,
>> timeUnit=SECONDS}, ratio=TimeInterval{length=2, timeUnit=MILLISECONDS},
>> maxRetries=2147483647, maxRetryDelay=TimeInterval{length=10,
>> timeUnit=SECONDS}}, tupleListener=EmptyKafkaTupleListener}
>>
>> This latter output is very strange: it shows that our custom deserializer was
>> indeed taken into account in a field called "value=..." but not as the
>> key.deserializer which remained set to StringDeserializer.
>>
>> Our Kafka spout initialization code is the following one:
>>
>> KafkaSpoutConfig spoutConfigForMainTopic =
>>     KafkaSpoutConfig
>>         .builder(elasticKafkaBrokers, KafkaTopics.MAIN)
>>         .setValue(EventKafkaDeserializer.class)
>>         .setGroupId(consumerId + "_" + KafkaTopics.MAIN)
>>         .setFirstPollOffsetStrategy(strategy)
>>         .setProp(kafkaConsumerProp)
>>         .setRecordTranslator(
>>             new EventKafkaRecordTranslator(true))
>>         .build();
>>
>> We then noticed the following discussion:
>>
>> http://mail-archives.apache.org/mod_mbox/storm-user/201709.mbox/%
>> 

Re: [DISCUSS] Drop standalone mode of Storm SQL

2017-12-05 Thread Arun Mahadevan
+1, I don’t see much use for standalone mode other than for testing. 

I assume we can still use storm-sql in local mode to run topologies locally 
without deploying to a cluster ?

Thanks,
Arun






On 12/4/17, 10:53 PM, "Jungtaek Lim"  wrote:

>Hi devs,
>
>We have been exposing a "standalone mode" of Storm SQL, which runs Storm
>SQL in a JVM process rather than composing a topology and running it.
>At the start we implemented both standalone and trident modes with the same
>approach, but while we improved Storm SQL by leveraging more features of
>Calcite, we addressed only trident mode, and the two have now diverged.
>
>I guess there are likely no actual users of standalone mode, since its classes
>are exposed but we didn't document it. I know of one case, but there the
>standalone mode source code was migrated into the project (and modified to
>conform to the project), and the project no longer depends on Storm SQL.
>
>If none of us knows of any other case, how about dropping it and
>concentrating only on trident mode?
>(Btw, I'm trying to replace the backend of Storm SQL from Trident to the
>Streams API, which may make the mode name obsolete, but after dropping
>standalone mode we won't even need a name for the mode since there will be
>only one mode.)
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)



[Discuss] Release Storm 1.2.0

2017-11-14 Thread Arun Mahadevan
Hi,

Looks like we are only waiting on 
https://issues.apache.org/jira/browse/STORM-2546 . 

Are there any other issues which are blockers for Storm 1.2.0? Would be great 
to see the 1.2.0 release out soon as it has a lot of critical fixes.

Thanks,
Arun




Re: Reducing the number of threads in Storm Windowing

2017-10-24 Thread Arun Mahadevan
As you already noted, tick tuples provide only second-level granularity, so that 
was one of the reasons for having separate threads for triggering processing time 
windows. Also, the tick tuples are published to the receive queue of the 
executor, so if there are messages in front of a tick tuple, its delivery 
will be delayed. So we may not be able to guarantee processing time 
windows (that process, say, events in the last x secs).

Perhaps it would work if we could deliver the tick tuples at a higher priority 
via a different queue. Also, tick tuples themselves require one thread 
per executor, so we would still need a thread. However, this would 
simplify the windowing logic, since it would no longer need to be 
thread safe (currently it must be, because of the triggers).
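
To make the queueing point concrete, here is a minimal, self-contained Java 
sketch. It is only a model of the idea, not Storm's executor code: the class 
TickPriorityDemo, its method names and the "tick" / "msg-N" strings are all 
made up for illustration. It compares a single FIFO receive queue with a 
hypothetical separate tick queue that is always drained first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of an executor's receive path; not Storm's actual code. */
public class TickPriorityDemo {

    /** Single FIFO receive queue: a tick waits behind everything queued before it. */
    public static List<String> drainSingleQueue(List<String> arrivals) {
        Queue<String> receiveQueue = new ArrayDeque<>(arrivals);
        List<String> processed = new ArrayList<>();
        while (!receiveQueue.isEmpty()) {
            processed.add(receiveQueue.poll());
        }
        return processed;
    }

    /** Two queues (an assumption): ticks get their own queue, which is drained first. */
    public static List<String> drainWithTickQueue(List<String> arrivals) {
        Queue<String> tickQueue = new ArrayDeque<>();
        Queue<String> dataQueue = new ArrayDeque<>();
        for (String event : arrivals) {
            if (event.startsWith("tick")) {
                tickQueue.add(event);
            } else {
                dataQueue.add(event);
            }
        }
        List<String> processed = new ArrayList<>();
        while (!tickQueue.isEmpty() || !dataQueue.isEmpty()) {
            // Pending ticks always win over pending data tuples.
            processed.add(!tickQueue.isEmpty() ? tickQueue.poll() : dataQueue.poll());
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> arrivals = Arrays.asList("msg-1", "msg-2", "msg-3", "tick");
        System.out.println("single queue: " + drainSingleQueue(arrivals));
        System.out.println("tick queue:   " + drainWithTickQueue(arrivals));
    }
}
```

With arrivals msg-1, msg-2, msg-3, tick, the single queue handles the tick 
last, while the two-queue variant handles it first. A real implementation 
would still have to decide how often the tick queue is checked between tuples.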

Regarding your other solution, I didn’t quite get how you would implement the 
callbacks without additional threads per executor.

Thanks,
Arun



On 10/23/17, 7:42 AM, "Jerry Peng"  wrote:

>Hello all,
>
>I just realized that in the windowing implementation, we use an
>additional thread as a timer for processing time triggered windows:
>
>https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/windowing/TimeTriggerPolicy.java#L71
>
>We also do the same thing in:
>
>https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/windowing/WaterMarkEventGenerator.java#L71
>
>to periodically generate watermarks.
>
>Perhaps, instead of additional threads we can just use tick tuples for
>such triggering of events?  Or we can allow registration of timers and
>callback functions in topologies so that additional timer threads
>would be not be needed?  I know that the tick tuple frequency is in
>seconds and windowing supports millisecond granularity but the tick
>tuple frequency can easily be changed to support millisecond
>frequency.
>
>On the other hand, a more eloquent solution would be having an API
>that allows users to register arbitrary timer events in their
>topologies. Something like this:
>
>conf.registerTimer(long frequency, Object callback)
>
>Anyone have any thoughts on this?
>
>Best,
>
>Jerry
>



Re: [DISCUSS] Release Storm 1.0.5 / 1.1.2

2017-10-13 Thread Arun Mahadevan
I was hoping we would get 1.2.0 out along with 1.1.2. The pending issues in the 
epic https://issues.apache.org/jira/browse/STORM-2710 seem to have been 
addressed. Can you add the new issue to the epic?

If it's not something critical, we can do it in a minor release after 1.2.0.

Thanks,
Arun


On 10/14/17, 3:50 AM, "Hugo Da Cruz Louro" <hlo...@hortonworks.com> wrote:

>I am +1 to releasing 1.1.2 right away. I am in the middle of one review but I 
>will finish it in the next day, such that we can get this merged soon.
>
>However, we need to hold onto releasing 1.2.0 until some of the changes for 
>ProcessingGuarantee that got in this 
>patch<https://github.com/apache/storm/commit/48f6969027e7b02a5b9220577189d3911aa2226d>
> are fixed. I briefly discussed [1] this issue with @Stig on Gitter, I will 
>submit a patch with the change.
>
>Thanks,
>Hugo
>[1] - We did not have a technical discussion. I just asked a couple of 
>clarifying questions and then the idea surged that we should improve some of 
>the changes in this  
>patch<https://github.com/apache/storm/commit/48f6969027e7b02a5b9220577189d3911aa2226d>.
> I will create a JIRA, and all the discussion go through either JIRA or dev 
>email list.
>
>On Oct 10, 2017, at 12:48 PM, Stig Rohde Døssing 
><stigdoess...@gmail.com<mailto:stigdoess...@gmail.com>> wrote:
>
>Thanks Jungtaek, that sounds like a good plan. Here's the new PR for 2607
>https://github.com/apache/storm/pull/2367.
>
>Beginning release next week sounds good to me.
>
>2017-10-10 17:42 GMT+02:00 Arun Mahadevan 
><ar...@apache.org<mailto:ar...@apache.org>>:
>
>+1 for addressing the pending reviews and getting 1.2.0 out soon.
>
>
>
>
>On 10/10/17, 6:14 AM, "Jungtaek Lim" 
><kabh...@gmail.com<mailto:kabh...@gmail.com>> wrote:
>
>Stig,
>
>Let's just handle all the issues pending Storm 1.1.2. For pending issues
>on
>Storm 1.2.0, I already handled all the things.
>
>For STORM-2607, could you just take over and craft a new pull request? We
>are waiting more than 2 months after requesting simple rebase (sadly it is
>not done yet), which I don't think it's acceptable. That issue relates a
>bug which we should handle it in time.
>(The patch includes your work indeed.)
>
>For STORM-2549, let's see someone could review in this week. I'll try to
>get it too.
>
>Then I think we can start release phase for Storm 1.1.2 and 1.2.0 at next
>week. Opinions anyone?
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Tue, Oct 10, 2017 at 4:02 AM, Stig Rohde Døssing 
><stigdoess...@gmail.com<mailto:stigdoess...@gmail.com>> wrote:
>
>Maybe we would be better off releasing 1.1.2 as is, and postponing the
>other issues to 1.2.0? I don't think we should delay the fix for
>https://issues.apache.org/jira/browse/STORM-2682 for much longer.
>
>2017-09-22 14:50 GMT+02:00 Alexandre Vermeerbergen <
>avermeerber...@gmail.com<mailto:avermeerber...@gmail.com>
>:
>
>Hello,
>
>I don't know if that help, but we're still waiting with lots of
>expectations https://issues.apache.org/jira/browse/STORM-2648 with
>Storm
>1.2.0 !
>
>Best regards,
>Alexandre Vermeerbergen
>
>
>2017-09-22 12:24 GMT+02:00 Jungtaek Lim 
><kabh...@gmail.com<mailto:kabh...@gmail.com>>:
>
>Looks like three weeks went by from initiating the thread.
>
>I'm seeing some issues pending for review and all of them are
>regarding
>storm-kafka-client.
>
>Remaining issues are below:
>
>Storm 1.1.2
>
>https://issues.apache.org/jira/browse/STORM-2549
>https://issues.apache.org/jira/browse/STORM-2607
>https://issues.apache.org/jira/browse/STORM-2666
>
>Storm 1.2.0
>
>https://issues.apache.org/jira/browse/STORM-2648
>
>Please note that above issues are 'effectively' blocker for
>releases.
>Like
>I said Storm 1.1.1 has critical issue which is fixed and will be
>available
>at Storm 1.1.2, so at least I'd like to see the progress on Storm
>1.1.2,
>and ideally with Storm 1.2.0 since there's only one issue left on
>epic.
>
>Please finish reviewing if you are in reviewing one or more of them.
>I'll
>try to start reviewing them but take some times since I'm not
>familiar
>with
>that module.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Wed, Aug 30, 2017 at 2:45 AM, P. Taylor Goetz 
><ptgo...@gmail.com<mailto:ptgo...@gmail.com>> wrote:
>
>It looks to me like 1.0.5 is ready for a release candidate (still
>some
>ongoing work for 1.1.2, but likely soon).
>
>Is there anything else we would want to include in 1.0.5 or
>should we
>go
>ahead with a release?
>
>-Taylor
>
>On Aug 25, 2017, at 3:26

Re: [DISCUSS] Release Storm 1.0.5 / 1.1.2

2017-10-10 Thread Arun Mahadevan
+1 for addressing the pending reviews and getting 1.2.0 out soon.




On 10/10/17, 6:14 AM, "Jungtaek Lim"  wrote:

>Stig,
>
>Let's just handle all the issues pending Storm 1.1.2. For pending issues on
>Storm 1.2.0, I already handled all the things.
>
>For STORM-2607, could you just take over and craft a new pull request? We
>are waiting more than 2 months after requesting simple rebase (sadly it is
>not done yet), which I don't think it's acceptable. That issue relates a
>bug which we should handle it in time.
>(The patch includes your work indeed.)
>
>For STORM-2549, let's see someone could review in this week. I'll try to
>get it too.
>
>Then I think we can start release phase for Storm 1.1.2 and 1.2.0 at next
>week. Opinions anyone?
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Tue, Oct 10, 2017 at 4:02 AM, Stig Rohde Døssing wrote:
>
>> Maybe we would be better off releasing 1.1.2 as is, and postponing the
>> other issues to 1.2.0? I don't think we should delay the fix for
>> https://issues.apache.org/jira/browse/STORM-2682 for much longer.
>>
>> 2017-09-22 14:50 GMT+02:00 Alexandre Vermeerbergen <
>> avermeerber...@gmail.com
>> >:
>>
>> > Hello,
>> >
>> > I don't know if that help, but we're still waiting with lots of
>> > expectations https://issues.apache.org/jira/browse/STORM-2648 with Storm
>> > 1.2.0 !
>> >
>> > Best regards,
>> > Alexandre Vermeerbergen
>> >
>> >
>> > 2017-09-22 12:24 GMT+02:00 Jungtaek Lim :
>> >
>> > > Looks like three weeks went by from initiating the thread.
>> > >
>> > > I'm seeing some issues pending for review and all of them are regarding
>> > > storm-kafka-client.
>> > >
>> > > Remaining issues are below:
>> > >
>> > > > Storm 1.1.2
>> > >
>> > > https://issues.apache.org/jira/browse/STORM-2549
>> > > https://issues.apache.org/jira/browse/STORM-2607
>> > > https://issues.apache.org/jira/browse/STORM-2666
>> > >
>> > > > Storm 1.2.0
>> > >
>> > > https://issues.apache.org/jira/browse/STORM-2648
>> > >
>> > > Please note that above issues are 'effectively' blocker for releases.
>> > Like
>> > > I said Storm 1.1.1 has critical issue which is fixed and will be
>> > available
>> > > at Storm 1.1.2, so at least I'd like to see the progress on Storm
>> 1.1.2,
>> > > and ideally with Storm 1.2.0 since there's only one issue left on epic.
>> > >
>> > > Please finish reviewing if you are in reviewing one or more of them.
>> I'll
>> > > try to start reviewing them but take some times since I'm not familiar
>> > with
>> > > that module.
>> > >
>> > > Thanks,
>> > > Jungtaek Lim (HeartSaVioR)
>> > >
>> > > On Wed, Aug 30, 2017 at 2:45 AM, P. Taylor Goetz wrote:
>> > >
>> > > > It looks to me like 1.0.5 is ready for a release candidate (still
>> some
>> > > > ongoing work for 1.1.2, but likely soon).
>> > > >
>> > > > Is there anything else we would want to include in 1.0.5 or should we
>> > go
>> > > > ahead with a release?
>> > > >
>> > > > -Taylor
>> > > >
>> > > > > On Aug 25, 2017, at 3:26 AM, Jungtaek Lim 
>> wrote:
>> > > > >
>> > > > > Hi devs,
>> > > > >
>> > > > > We received a bug report (STORM-2682) on Storm 1.0.4 and
>> > > > > 1.1.1 which prevents Storm cluster from update. Personally it looks
>> > > > > like pretty critical, and hopefully it is fixed now.
>> > > > > So maybe we would like to have another bug fix releases quickly for
>> > > > > affected 1.x version lines. What do you think?
>> > > > >
>> > > > > Also please enumerate the issues if you would want to include any
>> bug
>> > > fix
>> > > > > issues to the new bug fix releases, so that we can create epic
>> issues
>> > > and
>> > > > > track them to make releases happening sooner.
>> > > > >
>> > > > > Thanks,
>> > > > > Jungtaek Lim (HeartSaVioR)
>> > > >
>> > > >
>> > >
>> >
>>



Re: [VOTE] Java Code Style Standard for Apache Storm.

2017-05-02 Thread Arun Mahadevan
+1 for Google Java Style Guide (with 4 spaces indentation instead of 2 and line 
wrap set to 120 instead of 100)

Thanks,
Arun

On 5/2/17, 12:32 AM, "Bobby Evans"  wrote:

>Just a reminder to everyone that voting ends on Wednesday.
>
>
>- Bobby
>
>On Friday, April 28, 2017, 1:33:37 PM CDT, Kishorkumar Patil 
> wrote:
>[1] Google Java Style Guide
>
>[2] Sun Java Code Conventions
>
>[3] HBase style (sorry all I could find was an eclipse XML config)
>
>[4] Hadoop style which is described as java but with 2 spaces instead of 4.
>
>-Kishor
>
>
>On Thursday, April 27, 2017 10:01 PM, Satish Duggana  
> wrote:
> 
>
> [1] Google Java Style Guide
>
>[2] Hadoop style which
>is described as java but with 2 spaces instead of 4.
>
>[3] Sun Java Code Conventions
>
>[4] HBase style (sorry all I could find was an eclipse
>XML config)
>
>Thanks,
>~Satish.
>
>On Thu, Apr 27, 2017 at 7:50 PM, Kyle Nusbaum <
>knusb...@yahoo-inc.com.invalid> wrote:
>
>> [1] Google Java Style Guide
>>
>> [2] Sun Java Code Conventions
>>
>> [3] HBase style (sorry all I could find was an
>> eclipse XML config)
>>
>> [4] Hadoop style
>> which is described as java but with 2 spaces instead of 4.
>> Thanks,
>> -- Kyle
>>
>> On Thursday, April 27, 2017, 3:51:25 AM CDT, Julien Nioche <
>> lists.digitalpeb...@gmail.com> wrote:
>>
>> Non-binding: [1] Google Java Style Guide
>>
>> On 26 April 2017 at 19:50, Bobby Evans 
>> wrote:
>>
>> > We would like to adopt a code style standard for Apache Storm.  Please
>> > rank the following with 1 being the most desired and 4 being the least
>> > desired (5 if you have a write in choice).  This is not an official vote
>> as
>> > per the ByLaws, but we will probably go with whichever wins the most 1
>> > votes (unless it is really close and then we may go with some STeVe like
>> > ranking, but I want to avoid it because it is hard and I might get the
>> math
>> > wrong) This is open to everyone so please vote.
>> >
>> >
>> > [ ] Google Java Style Guide
>> >
>> > [ ] Sun Java Code Conventions
>> >
>> > [ ] HBase style (sorry all I could find was an
>> > eclipse XML config)
>> >
>> > [ ] Hadoop style
>> > which is described as java but with 2 spaces instead of 4.
>> >
>> > [ ] Other (Please specify)
>> >
>> >
>> > I apologize if the formatting is bad I am forced to use a corporate
>> > sponsored mail client that somehow messed up plain text formatting.  I
>> > don't know how.
>> >
>> >
>> > - Bobby
>>
>>
>>
>>
>> --
>>
>> *Open Source Solutions for Text Engineering*
>>
>> http://www.digitalpebble.com
>> http://digitalpebble.blogspot.com/
>> #digitalpebble 
>>
>
>
>  




Re: Need Help

2017-05-01 Thread Arun Mahadevan
Hi Kamal,

If you are interested in working on the streaming API, there are a few pending 
tasks here (https://issues.apache.org/jira/browse/STORM-1843) that you could 
pick up. CoGroupByKey and union might be simpler ones to start with. If you are 
planning to add support for a Beam runner, you could also try to prototype one 
using the streams API, but you would have to add the missing primitives 
first before you could build a full-fledged Beam runner.

Thanks,
Arun

On 4/28/17, 12:02 AM, "Kamal Bhatt"  wrote:

>Hello,
>
>I am a developer interested to work on Storm tasks based on  Java Streaming
>API and/or  supporting storm as a runtime for apache beam.
>
>I have some experience working in Storm project like translating  clojure
>tests to java and Admin Command.
>
>Could someone from the group working on these tasks help me get started. I
>appreciate any help/direction on this.
>
>Thanks,
>Kamal




Re: [DISCUSS] Storm 2.0 Roadmap

2017-03-24 Thread Arun Mahadevan
+1 to release with the porting completed. I think it’s mainly the UI server and 
log viewer that are pending. 

We can start doing the regression and performance tests for whatever is already 
ported.

If anyone is running the master branch in their pre-prod / prod environments, 
it would be good to know, and it would give us more confidence.

The other features can be added in follow-up releases.

Regards,
Arun


On 3/24/17, 11:47 AM, "Satish Duggana"  wrote:

>+1 to have 2.0 with porting and performance(it should be at least as good
>as 1.x release) issues addressed
>
>We can target other tasks(mentioned by Taylor and Jungtaek) for 2.x-branch.
>
>
>Exactly-once support:
>While thinking through the exactlyonce support design, it is realized
>better to avoid acking tuples and implement exactly once by snapshotting
>barriers. It seems JStorm folks followed similar design, they claim it
>gives better performance. This feature is essential for beam runner and we
>can decide on respective approaches though.
>
>Beam Runner
>Lets hold on this for now and keep it in Storm till 2.x. We should avoid
>having a minimal beam runner in haste. It is better to address STORM-2284,
>exactly-once and other windowing enhancements to enable beam runner.
>
>JStorm
>Agree with Jungtaek on looking at the latest JStorm and align/scope with
>the features for 2.x.
>
>STORM-2284
>We may want to look at JStorm worker before working on respective
>components in this epic to pull appropriate enhancements.
>
>YARN/MESOS
>Supporting Storm on YARN/Mesos for 2.x.
>
>Thanks,
>Satish.
>
>
>On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim  wrote:
>
>> First of all, +1 to complete only port work and do sanity check (including
>> performance regression), and release.
>>
>> If we can get STORM-2284 within deterministic time frame (say 2~3 months)
>> that should be great, but if not I'd in favor of postponing that to later
>> 2.x release.
>>
>> JStorm released their new versions after code donation. So there're more
>> things we could get ideas from, or even adopt from.
>> https://github.com/alibaba/jstorm/blob/master/history.md
>> As you noticed from release note link, we also need to update phase 2 since
>> they already changed what we're planning to do in phase 2. For example,
>> they changed backpressure to end-to-end, and changed to use snapshot rather
>> than acker.
>> May be sure, JStorm pulled many features from today's Storm, like Flux,
>> Windowing, more shuffle groupings, log search, log level change, and so on.
>>
>> STORM-2426 is due to the
>> limitation of Spout lifecycle (all the things are done in single thread),
>> and STORM-1358 (JStorm's
>> multi-thread Spout) can remedy this (despite that Spout implementation may
>> need to guarantee thread-safety later). It's not a just improvement but
>> close to design concern so would like to address sooner than other things
>> in phase 2.
>>
>> For Storm SQL side, I've lost progress but major work would be adopting
>> group by with windowing. It was not available from Calcite but will be
>> available at next release (1.12.0).
>> I've filed this to STORM-2405, but windowing & micro
>> batch is not intuitive, so I would like to change the underlying API to
>> stream API in SQL. Also filed this to STORM-2406.
>>
>> Just 2 cents btw, hopefully I would like to see metrics V2 sooner since we
>> lost metrics even when doing normal operation like restarting worker,
>> rebalancing, and so on. Eventually we need to fight with dynamic scaling,
>> and then metrics will be broken often.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani wrote:
>>
>> > Storm 2.0 migration to java in itself is a big win and would attract
>> wider
>> > community and adoption. So my vote would be to resolve the first 3 items
>> to
>> > get a release out.
>> > All the other featured mentioned are great to have but shouldn't be
>> > blockers for 2.0 release.
>> >
>> > -Harsha
>> >
>> > On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz 
>> > wrote:
>> >
>> > > With the 1.1.0 release nearing completion, I’d like to turn our
>> attention
>> > > to 2.0 and develop a plan for what features, etc. to include.
>> > >
>> > > The following 3 are what I feel are the minimum for a 2.0 release.
>> These
>> > > could likely be resolved relatively quickly:
>> > >
>> > > * Performance — I’ve not benchmarked the master branch vs. 1.0.x or
>> 1.1.x
>> > > in a while, but I feel it will be important to make sure there are no
>> > > performance regressions, and would hope that we actually have a
>> > performance
>> > > improvement over previous versions. To that end (e.g. if there is in
>> > fact a
>> > > performance 

Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-24 Thread Arun Mahadevan
>Could you cast your vote? If you are still not satisfied with excluding
>jars you can cast -0 or even -1.

I am not fully convinced by the current binary distribution.

1. Why do we expect users to build the examples from source out of the binary 
distribution? These examples are most likely to be used by new users, who will 
find it difficult if the build breaks. I checked other distributions such as 
Spark, and they ship the example jar inside the binary, where its size is 
pretty small. If we remove the shading and keep only storm-starter.jar, the 
size will be pretty small. Another option is to release a separate binary (like 
apache-storm-examples-xyz.jar) with just the example jars.
2. Keeping the connectors out of the binary is good. We should also remove the 
directories that contain only a README.md. Users can find this info on the 
website.
3. The storm-sql jars are kept in the binary. Is there a reason? Maybe these 
should also be removed from the binary to be consistent.

Thanks,
Arun

On 3/23/17, 7:34 PM, "Jungtaek Lim" <kabh...@gmail.com> wrote:

>+1 to the latter.
>
>I'm in favor of documenting the change to release note, and also docs so
>that website can be reflected. The users who are affected to the change
>wouldn't be much, since using dependency management tool (Maven, Gradle,
>and so on) has been recommended for creating topology jar.
>
>For me it's not a blocker for release.
>
>Arun, I initiated another thread to discuss moving non-connectors to the
>top directory.
>Could you cast your vote? If you are still not satisfied with excluding
>jars you can cast -0 or even -1.
>
>- Jungtaek Lim (HeartSaVioR)
>
>On Thu, Mar 23, 2017 at 10:43 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>
>> Do we want to cancel this RC in order to better document the changes, or
>> will documenting it in the release announcement suffice for now (provided
>> documentation is added for subsequent releases)?
>>
>> I’m partial to the latter, but am open to others’ opinions.
>>
>> -Taylor
>>
>>
>> > On Mar 22, 2017, at 9:49 AM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
>> wrote:
>> >
>> > +1 I built form the tag and ran using a single node cluster.
>> > The examples and external components are excluded because they are
>> huge.  Because of shading they we distribute the same copy of them multiple
>> times.
>> > I agree with Alexandre.  We should document this change better, because
>> it is confusing for people to get a release that used to have these in it,
>> but does not any more.
>> >
>> >
>> > - Bobby
>> >
>> > On Tuesday, March 21, 2017, 10:46:38 PM CDT, Arun Mahadevan <
>> ar...@apache.org> wrote:
>> >
>> > Verified the artifacts. Compiled examples and ran some sample topologies.
>> > Looks good.
>> >
>> > BTW, why are the external modules excluded from the binaries (the .zip
>> and .tar.gz). Isn’t it better if the binary distribution includes them?
>> Maybe it was already discussed but I am missing it. The sql directory
>> however seems to include the jars so it looks inconsistent.
>> >
>> > - Arun
>> >
>> >
>> > On 3/22/17, 12:56 AM, "P. Taylor Goetz" <ptgo...@apache.org> wrote:
>> >
>> >> This is a call to vote on releasing Apache Storm 1.1.0 (rc3)
>> >>
>> >> Full list of changes in this release:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;h=68fbab3c4f91359bd397d93a157830542839b002;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>> >>
>> >> The tag/commit to be voted upon is v1.1.0:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=7fa62404feb6b86b3143c851b46237580720eb6b;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>> >>
>> >> The source archive being voted upon can be found here:
>> >>
>> >>
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/apache-storm-1.1.0-src.tar.gz
>> >>
>> >> Other release files, signatures and digests can be found here:
>> >>
>> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/
>> >>
>> >> The release artifacts are signed with the following key:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> >>
>> >> The Nexus staging repository for this release is:
>> >>
>> >> https://repository.apache.org/content/repositories/orgapachestorm-1047
>> >>
>> >> Please vote on releasing this package as Apache Storm 1.1.0.
>> >>
>> >> When voting, please list the actions taken to verify the release.
>> >>
>> >> This vote will be open for at least 72 hours.
>> >>
>> >> [ ] +1 Release this package as Apache Storm 1.1.0
>> >> [ ]  0 No opinion
>> >> [ ] -1 Do not release this package because...
>> >>
>> >> Thanks to everyone who contributed to this release.
>> >>
>> >> -Taylor
>> >
>>
>>




Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-22 Thread Arun Mahadevan
Verified the artifacts. Compiled examples and ran some sample topologies. Looks 
good.

BTW, why are the external modules excluded from the binaries (the .zip and 
.tar.gz)? Isn’t it better if the binary distribution includes them? Maybe it 
was already discussed and I missed it. The sql directory, however, seems to 
include the jars, so it looks inconsistent.

- Arun


On 3/22/17, 12:56 AM, "P. Taylor Goetz"  wrote:

>This is a call to vote on releasing Apache Storm 1.1.0 (rc3)
>
>Full list of changes in this release:
>
>https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;h=68fbab3c4f91359bd397d93a157830542839b002;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>
>The tag/commit to be voted upon is v1.1.0:
>
>https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=7fa62404feb6b86b3143c851b46237580720eb6b;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>
>The source archive being voted upon can be found here:
>
>https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/apache-storm-1.1.0-src.tar.gz
>
>Other release files, signatures and digests can be found here:
>
>https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/
>
>The release artifacts are signed with the following key:
>
>https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>
>The Nexus staging repository for this release is:
>
>https://repository.apache.org/content/repositories/orgapachestorm-1047
>
>Please vote on releasing this package as Apache Storm 1.1.0.
>
>When voting, please list the actions taken to verify the release.
>
>This vote will be open for at least 72 hours.
>
>[ ] +1 Release this package as Apache Storm 1.1.0
>[ ]  0 No opinion
>[ ] -1 Do not release this package because...
>
>Thanks to everyone who contributed to this release.
>
>-Taylor




Re: Storm 1.x-branch won't build in IntelliJ anymore

2017-02-28 Thread Arun Mahadevan
>If we really need long time to discuss above, I'm even OK to revert DRPC
>port and start 2.0.0 with webservices unported, (DRPC, UI, Logviewer) and
>address them at 2.1 or other minor versions.
>
>Any other opinions?
>

Yes, it may be better to move to 2.0 sooner and migrate the pending components 
in 2.1. Maybe what we need is more rigorous testing of the master branch 
before we release 2.0.

We could also consider moving to JDK 8 for the next 1.x release, which would 
make porting patches to the 1.x branch much easier (unless we want to do a JDK 
upgrade only with 2.0).

Thanks,
Arun

On 3/1/17, 7:40 AM, "Jungtaek Lim"  wrote:

>Thanks Roshan for bring this up.
>
>For me moving toward 2.0.0 makes more sense.
>
>I know master branch has similar issue (DRPC) and it also has
>not-yet-ported things but if we are going to struggle with 1.x branch issue
>again and again, Storm 2.0.0 will never come. Another recent headache issue
>is JDK 7 vs JDK 8. Patch for master easily breaks for 1.x branch due to
>this, and we just had to cancel another 1.1.0 RC vote.
>
>We might want to discuss how to handle webservice like DRPC (My feeling is
>that current approach is somewhat kinda hacky.), but we can initiate
>different thread for more details.
>
>If we really need long time to discuss above, I'm even OK to revert DRPC
>port and start 2.0.0 with webservices unported, (DRPC, UI, Logviewer) and
>address them at 2.1 or other minor versions.
>
>Any other opinions?
>
>- Jungtaek Lim (HeartSaVioR)
>
>On Wed, Mar 1, 2017 at 9:27 AM, Roshan Naik wrote:
>
>Lately (about a week and half maybe) it has not been possible to get the
>1.x-branch to build inside IntelliJ. None of the modules are able to locate
>the  LocalCluster class (which is a clojure class). Previously, every once
>in a while, I used to get this problem in the storm-starter module and was
>able to get around it by doing a mvn clean install –DskipTests on the cmd
>line and then doing a full rebuild of the project in IntelliJ.
>
>Now the problem has become a lot more endemic (modules like storm-sql,
>storm-*-examples, etc.). And the above workaround doesn’t help. Spent many
>futile hours trying to work around the build issue within Intellij (both
>2015 and 2016 versions and using different Clojure plugins).
>
>There seems to possible ways to move forward:
>
>
>-  Somebody here knows what magic to do work address this. And we
>can all use that.
>
>-  Bring in the java port of LocalCluster.clj from master branch to
>1.x (STORM-1281). I spend a little time to see if this was easy to do, but
>appears kind of complicated due to number of files involved and
>dependencies on prior patches. Somebody familiar with the original porting
>effort may be better person to take this up.
>
>Thoughts ?
>
>-Roshan




Re: [Discuss] Storm hdfs spout improvements

2017-02-14 Thread Arun Mahadevan
Can you please raise a pull request with your proposal? That way it will be 
easier to review and comment.

Thanks,
Arun


On 2/15/17, 9:04 AM, "Sachin Pasalkar"  wrote:

>Can any one take a look at this? I have attached my code in JIRA.
>
>On 14/02/17, 7:38 AM, "Sachin Pasalkar" 
>wrote:
>
>>I have created JIRA for this
>>https://issues.apache.org/jira/browse/STORM-2358.
>>For point 1:
>>
>>Its specific use case just to support why it needs to be public
>>
>>For point 2:
>>We are limiting code to be very specific to these 2 implementations we
>>should have generic implementation. I see there is another check-in
>>happening for ZippedTextFileReader.I have attached my code changes in
>>JIRA, please take a look, where you need to provide class.
>>
>>For point 3:
>>
>>Lets assume I have multiple topologies with different readers. So I
>>defined the a base topology class with HDFSSpout in it. Now I always needs
>>to pass the outputFields as separate array. This actually can be part of
>>every reader class as its very specific to it.
>>
>>Thanks,
>>Sachin
>>
>>On 14/02/17, 4:52 AM, "Roshan Naik"  wrote:
>>
>>>
>>>
>>>On 2/13/17, 12:14 PM, "Sachin Pasalkar" 
>>>wrote:
>>>
>>>>I have attached updated source code of HDFSSpout for more reference. I
>>>>have updated respective classes (not attached)
>>>
>>>
>>>Don't see any attachment. Answers are below. Better to do this discussion
>>>on a JIRA.
>>>
>>>
>>>On 2/13/17, 8:32 AM, "Sachin Pasalkar" 
>>>wrote:
>>>
>>>>Hi,
>>>>
>>>>I was looking at storm hdfs spout code in 1.x branch, I found below
>>>>improvements can be made in below code.
>>>>
>>>>  1.  Make org.apache.storm.hdfs.spout.AbstractFileReader as public so
>>>>that it can be used in generics.
>>>
>>>Java generics and making a class public are unrelated to my knowledge.
>>>But making it public sounds ok to me if its useful for "user defined"
>>>readers ... although it doesn't really have that much going on in it. For
>>>future built-in reader types it is immaterial as they can derive from it
>>>anyway just like the existing ones. HdfsSpout class itself doesn't care
>>>about the 'AbstractFileReader' type. For that there is the 'FileReader'
>>>interface.
>>>
>>>
>>>
>>>>  2.  org.apache.storm.hdfs.spout.HdfsSpout requires readerType as
>>>>String. It will be great to have class
>>>>readerType; So we will not use Class.forName at multiple places also it
>>>>will help in below point.
>>>
>>>The reason it is a string, is that, for built-in readers,  we wanted to
>>>support 'short aliases' like 'text' and 'seq' instead of FQCN.
>>>
>>>
>>>>  3.  HdfsSpout also needs to provide outFields which are declared as
>>>>constants in each reader(e.g.SequenceFileReader). We can have abstract
>>>>API AbstractFileReader in which return them to user to make it generic.
>>>
>>>
>>>These consts can't go into the AbstractFileReader as they are reader
>>>specific.
>>>
>>>They are there just for convenience.  Users can call withOutputFields()
>>>on
>>>the spout and set it to these predefined names or anything else.
>>>
>>>
>>>-Roshan
>>>
>>
>
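
For readers following the readerType discussion above, here is a minimal Java 
sketch of how a single string setting can accept both short aliases and fully 
qualified class names. It is an illustration only: the class 
ReaderTypeResolver and its methods are hypothetical, and the alias-to-class 
mapping is an assumption rather than a description of how HdfsSpout resolves 
readers internally.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper; not part of storm-hdfs. */
public class ReaderTypeResolver {

    // Assumed alias table; the target class names are placeholders for illustration.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("text", "org.apache.storm.hdfs.spout.TextFileReader");
        ALIASES.put("seq", "org.apache.storm.hdfs.spout.SequenceFileReader");
    }

    /** Maps a short alias to a class name; anything else is treated as an FQCN. */
    public static String resolveClassName(String readerType) {
        String trimmed = readerType.trim();
        return ALIASES.getOrDefault(trimmed.toLowerCase(Locale.ROOT), trimmed);
    }

    /** Loads the reader class in one place, so callers never call Class.forName themselves. */
    public static Class<?> loadReaderClass(String readerType) {
        String className = resolveClassName(readerType);
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown reader type: " + readerType, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveClassName("text"));
        System.out.println(resolveClassName("com.example.MyReader"));
        // java.lang.String stands in for a user-defined reader on the classpath.
        System.out.println(loadReaderClass("java.lang.String"));
    }
}
```

Keeping the lookup in one helper would address the concern about Class.forName 
being scattered across multiple places while still allowing the short aliases.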




Re: simple question about grouping

2017-01-23 Thread Arun Mahadevan
 

Grouping makes sense only when you have more than one task for a bolt. If your 
bolt has more than one task, then the grouping will decide how the tuples from 
the spout are distributed to the individual tasks of the bolt. (shuffle = 
random, fields = keyed on some field and so on). 
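
To make the difference concrete, here is a minimal sketch in plain Java (no Storm dependency). It is only an illustration of the idea: the class and method names are made up for this example, and Storm's actual task selection may well be implemented differently.

```java
import java.util.Objects;
import java.util.Random;

/** Illustrative only: not Storm's implementation of groupings. */
final class GroupingSketch {
    private GroupingSketch() {}

    /** Shuffle grouping: any of the bolt's tasks may receive the tuple. */
    static int shuffleTask(Random random, int numTasks) {
        return random.nextInt(numTasks);
    }

    /** Fields grouping: equal key values always map to the same task. */
    static int fieldsTask(Object key, int numTasks) {
        return Math.floorMod(Objects.hashCode(key), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4;
        Random random = new Random();
        for (String word : new String[] {"storm", "bolt", "storm"}) {
            System.out.println(word
                    + ": shuffle -> task " + shuffleTask(random, numTasks)
                    + ", fields -> task " + fieldsTask(word, numTasks));
        }
    }
}
```

With a single task both methods can only return 0, which is why the choice of grouping makes no practical difference for a one-task bolt.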

 

See http://storm.apache.org/releases/current/Concepts.html 

 

Thanks,

Arun

 

 

From: sam mohel 
Reply-To: "u...@storm.apache.org" 
Date: Monday, January 23, 2017 at 3:09 PM
To: "u...@storm.apache.org" , "dev@storm.apache.org" 

Subject: simple question about grouping

 

I have a text file that contains data. The size of this file is 3.5 MB. My 
topology consists of one spout and one bolt, so is it possible to do all the 
processing in one bolt, and in that case what is the role of grouping here? 

Thanks in advance 



Re: [DISCUSS] Prioritizing works in progress

2016-12-26 Thread Arun Mahadevan
The streams API implementation has limited usage of 1.8 features and can be 
easily ported to 1.7 if required. The examples are written in 1.8, the thought 
being users would stick to the Java 8 style usage (lambdas) from the beginning. 
If there is consensus we could also consider moving the 1.x branch to JDK 8. 

Anyways would like interested folks to start reviewing the changes so that we 
can take it forward.

Thanks,
Arun


On 12/23/16, 10:09 AM, "Jungtaek Lim"  wrote:

>FYI, I've realized that the internals of the Stream API (pull request) rely
>on JDK 8 (what I've found is 'static method in interface' and maybe more), so
>for now the Stream API is expected to land in Storm 2.0.0 at the earliest
>unless the PR is modified to fit JDK 7.
>
>- Jungtaek Lim (HeartSaVioR)
>
>On Wed, Dec 21, 2016 at 9:40 AM, Jungtaek Lim wrote:
>
>> Thanks Manu and Taylor for giving your opinions.
>>
>> - Storm SQL improvement
>>
>> There are some huge PRs available but they are all improvements, which
>> shouldn't be blockers for releasing 1.1.0. (I'd also like to include them in
>> 1.1.0 but I'm not sure that can happen really soon.)
>> I'll send a request for review of the pending Storm SQL PRs.
>>
>> Only one issue (STORM-2200) is linked to release 1.1.0 epic which is
>> blocker for me.
>>
>> - Java port
>>
>> I have also had some developers saying 'If the core of Storm were written
>> in Java, I could experiment and even contribute something'. I was one of
>> them, and to be honest, I'm still a beginner at Clojure. Moving to Java 8
>> also gives us great functionality, so the Java port is what I think is the
>> most important thing among the huge works now in progress. Ideally, and
>> hopefully, I'd like to see us focus on this and make it happen very early
>> next year.
>> (Yes we should do some manual tests and maybe some refactoring too.)
>>
>> - Metrics V2
>>
>> I'm not sure when we plan to release Storm 1.2.0, but given that there are
>> only two things left (logviewer / ui) to complete the port work (except
>> tests), I guess Storm 2.0.0 might happen earlier.
>> Taylor, when do you expect metrics V2 will be available for reviewing?
>>
>> - Stream API
>>
>> By labeling it as experimental or annotating it as evolving, we could include
>> the first version in the next minor release after 1.1.0. (We could even
>> include it in 1.1.0 if we start reviewing it very soon.)
>>
>> I'd like to hear others' opinions as well.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Dec 21, 2016 at 7:33 AM, P. Taylor Goetz wrote:
>>
>> Hi Jungtaek,
>>
>> > - Beam runner
>>
>> There’s not been much activity around this, and I haven’t had much time to
>> work on it recently, but there’s a decent foundation to build upon. So it
>> would be fairly easy for others to start contributing to that effort.
>> There’s also interest from the Beam community in that runner, so one
>> possibility is to move that effort to the Apache Beam project.
>>
>> This is very preliminary work, so I don’t have a good handle on what the
>> target release would be.
>>
>> > - Metrics renewal
>>
>>
>> This is what I’ve been referring to as “metrics_v2”. This is progressing
>> fairly well with support for multiple reporters (e.g. Graphite, Ganglia,
>> console, etc.), worker metrics, disruptor metrics, etc.
>>
>> I would like to target this work for 1.2.0.
>>
>> > - Java port
>>
>> This effort seems to have picked up (for example Bobby’s conversion of
>> Nimbus, etc.) and is progressing steadily. It’s taken a lot longer than
>> initially thought, but a lot of that can be attributed to the ebb and flow
>> of people’s availability to do the work.
>>
>> > - Storm SQL improvement (Streaming SQL in future)
>>
>> You’ve been spearheading most of the work here, so I’d delegate to you for
>> your opinion on where it stands. If you need additional reviews, just ask
>> on list or via GitHub (e.g. “[REVIEW REQUEST]” in the subject line might
>> help get attention).
>>
>> My thinking has been that this could be included in the 1.1.0 release. Is
>> there a set of JIRA issues you would like to include in order to make that
>> happen?
>>
>> > - Stream API
>>
>> This seems to have stalled a bit, though there seems to be a lot of
>> interest around it. I think we all would agree that when introducing a new
>> API for building topologies, it’s important that we get it right from the
>> start and have strong buy-in from the development community. I would
>> encourage anyone interested in the Streams API to review the proposal and
>> initial code.
>>
>> I think it is close, but I’m not sure what release to target. Possibly the
>> 2.0 release?
>>
>> Re: 1.1.0 Release
>>
>> STORM-2176 is a fairly big concern of mine since the feature it involves
>> was introduced in 1.0.0 and did not work then nor in any subsequent or
>> future releases (may not be a problem in 2.0). Unfortunately, as you’ve
>> seen, finding the root cause is elusive. That issue 

Re: [DISCUSS] Feature Branch for Apache Beam Runner

2016-10-19 Thread Arun Mahadevan
+1

On 10/19/16, 8:58 PM, "P. Taylor Goetz"  wrote:

>If there are no objections, I’d like to create the feature branch and push 
>what I have so far. I’ve not had too much time lately to work on it, but 
>others have expressed interest in contributing so I’d like to make it 
>available.
>
>-Taylor
>
>
>> On Sep 19, 2016, at 11:15 AM, Bobby Evans  
>> wrote:
>> 
>> +1 on the idea.  I would love to contribute, but I doubt I will find time to 
>> do it any time soon. - Bobby
>> 
>>On Friday, September 16, 2016 12:05 AM, Satish Duggana 
>>  wrote:
>> 
>> 
>> Taylor,
>> I am interested in contributing to this effort. Gone through Beam APIs
>> earlier and had some initial thoughts on Storm runner. We can start with
>> existing core storm constructs but it is better to design in such a way
>> that these can be replaced with new APIs.
>> 
>> Thanks,
>> Satish.
>> 
>> On Fri, Sep 16, 2016 at 3:35 AM, P. Taylor Goetz  wrote:
>> 
>>> I'm open to change, but yes, I started with core storm since it offers the
>>> most flexibility wrt how Beam constructs are translated.
>>> 
>>> -Taylor
>>> 
 On Sep 15, 2016, at 5:51 PM, Roshan Naik  wrote:
 
 Good idea. Will the Beam API be implemented to run on top of Storm Core
 primitives ?
 -roshan
 
 
> On 9/15/16, 2:00 PM, "P. Taylor Goetz"  wrote:
> 
> I've been tinkering with implementing an Apache Beam runner on top of
> Storm and would like to open it up so others in the community can
> contribute. To that end I'd like to propose creating a feature branch for
> that work if there are others who are interested in getting involved. We
> did that a while back when storm-sql was originally developed.
> 
> Basically, review requirements for that branch would be relaxed during
> development, with a final, strict review before merging back to one of
> our main branches.
> 
> I'd like to document what I have and future improvements in a proposal
> document, and follow that with pushing the code to the feature branch for
> group collaboration.
> 
> Any thoughts? Anyone interested in contributing to such an effort?
> 
> -Taylor
 
>>> 
>> 
>




Re: [VOTE] Accept storm-jms Code Donation

2016-09-08 Thread Arun Mahadevan
+1 (binding)


On 9/9/16, 7:20 AM, "Jungtaek Lim"  wrote:

>+1 (binding)
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Friday, September 9, 2016, S G  wrote:
>
>> +1
>>
>> On Thu, Sep 8, 2016 at 1:50 PM, P. Taylor Goetz wrote:
>>
>> > Following an earlier discussion thread, I’d like to start a VOTE on
>> > whether to accept the storm-jms code donation.
>> >
>> > The codebase being donated can be found here [1].
>> >
>> > [ ] +1 Accept the code donation.
>> > [ ] 0 No opinion
>> > [ ] -1 Do not accept the code donation because…
>> >
>> > Everyone is encouraged to vote. PMC member votes are binding.
>> >
>> > -Taylor
>> >
>> > [1] https://github.com/ptgoetz/storm-jms
>> >
>>
>
>
>-- 
>Name : Jungtaek Lim
>Blog : http://medium.com/@heartsavior
>Twitter : http://twitter.com/heartsavior
>LinkedIn : http://www.linkedin.com/in/heartsavior




[jira] [Updated] (STORM-1961) Come up with streams api for storm core use cases

2016-09-08 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-1961:
--
Attachment: UnifiedStreamapiforStorm.pdf

The high level design and api doc. I have been making progress on an initial 
implementation and working to get it to a reviewable state.

> Come up with streams api for storm core use cases
> -
>
> Key: STORM-1961
> URL: https://issues.apache.org/jira/browse/STORM-1961
> Project: Apache Storm
>  Issue Type: Sub-task
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
> Attachments: UnifiedStreamapiforStorm.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (STORM-2047) In secure setup the log page can't be viewed

2016-08-18 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427634#comment-15427634
 ] 

Arun Mahadevan commented on STORM-2047:
---

[~raghavgautam] The storm security doc 
https://github.com/apache/storm/blob/master/docs/SECURITY.md already explains 
whitelisting the UI servers in the browser. Maybe we can add a note to 
explicitly mention that the hosts where the log viewer runs need to be 
whitelisted as well.

> In secure setup the log page can't be viewed
> 
>
> Key: STORM-2047
> URL: https://issues.apache.org/jira/browse/STORM-2047
> Project: Apache Storm
>  Issue Type: Bug
>  Components: documentation
>Reporter: Raghav Kumar Gautam
>    Assignee: Arun Mahadevan
> Attachments: screenshot-1.png
>
>
> This is about the topology inspector feature. When we click the events button 
> on the bolt page, we expect to get to a log page which will show tuples. 
> Instead we get an "authentication required" error; see the attached image.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (STORM-2042) Nimbus client connections not closed properly causing connection leaks

2016-08-17 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-2042:
--
Fix Version/s: 1.x
   2.0.0

> Nimbus client connections not closed properly causing connection leaks
> --
>
> Key: STORM-2042
> URL: https://issues.apache.org/jira/browse/STORM-2042
> Project: Apache Storm
>  Issue Type: Bug
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
> Fix For: 2.0.0, 1.x
>
>
> The nimbus client connections are not closed properly, causing connection 
> leaks. After the number of connections exceeds nimbus.thrift.threads, a 
> RejectedExecutionException is thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (STORM-1434) Support the GROUP BY clause in StormSQL

2016-08-16 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422735#comment-15422735
 ] 

Arun Mahadevan commented on STORM-1434:
---

[~kabhwan] 

Trident supports two kinds of aggregates on grouped streams,

aggregate - aggregates on each group within a batch
persistentAggregate - aggregates across batches (using the underlying state)

Maybe you can do an aggregate per batch (window being the batch boundary), 
which fits with how you would run a group-by query on a regular table. 

I am not sure how you would represent a persistentAggregate operation in SQL. 
Maybe something along the lines of INSERT into state(key, count) select key, 
count(key) from stream group by key ON DUPLICATE KEY set key = key + count 
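
To illustrate the distinction between the two kinds of aggregates, here is a small sketch in plain Java. It does not use the Trident API; the class, the method names and the in-memory map standing in for the "underlying state" are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the two aggregate styles without Trident. */
final class AggregateSketch {
    private AggregateSketch() {}

    /** Per-batch aggregate: counts start from zero for every batch. */
    static Map<String, Long> aggregateBatch(List<String> batch) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : batch) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Persistent aggregate: each batch's counts are folded into shared state. */
    static void persistentAggregate(Map<String, Long> state, List<String> batch) {
        aggregateBatch(batch).forEach((key, count) -> state.merge(key, count, Long::sum));
    }
}
```

Calling aggregateBatch on two batches gives two independent results, while calling persistentAggregate on the same two batches leaves the running totals in the state map, which is roughly what the INSERT ... ON DUPLICATE KEY idea above is trying to express.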


> Support the GROUP BY clause in StormSQL
> ---
>
> Key: STORM-1434
> URL: https://issues.apache.org/jira/browse/STORM-1434
> Project: Apache Storm
>  Issue Type: New Feature
>  Components: storm-sql
>Reporter: Haohui Mai
>
> This jira tracks the effort to implement support for the `GROUP BY` clause in 
> StormSQL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (STORM-2037) debug operation should be whitelisted in SimpleAclAuthorizer

2016-08-11 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-2037:
-

 Summary: debug operation should be whitelisted in 
SimpleAclAuthorizer
 Key: STORM-2037
 URL: https://issues.apache.org/jira/browse/STORM-2037
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.x, 2.0.0
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


For topology event logging to work in secure mode, the "debug" operation should 
be whitelisted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (STORM-2027) Possible Race Condition issue in SlidingWindow

2016-08-08 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412986#comment-15412986
 ] 

Arun Mahadevan commented on STORM-2027:
---

[~kabhwan] I haven't looked deeper, but it looks like the RollingCount and slot 
based counter could be replaced with something like,

{code:java}
private static class RollingCountBolt extends BaseWindowedBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context,
                        OutputCollector collector) {
        super.prepare(stormConf, context, collector);
        this.collector = collector;
    }

    @Override
    public void execute(TupleWindow inputWindow) {
        // Count occurrences of each value within the current window.
        Map<Object, Long> counts = new HashMap<>();
        for (Tuple tuple : inputWindow.get()) {
            Object obj = tuple.getValue(0);
            Long count;
            if ((count = counts.get(obj)) != null) {
                counts.put(obj, count + 1);
            } else {
                counts.put(obj, 1L);
            }
        }

        // Emit one (obj, count) pair per distinct value in the window.
        for (Map.Entry<Object, Long> count : counts.entrySet()) {
            collector.emit(new Values(count.getKey(), count.getValue()));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("obj", "count"));
    }
}
{code}

And,

{code:java}
builder.setBolt(counterId,
        new RollingCountBolt().withWindow(Duration.seconds(9), Duration.seconds(3)), 4)
    .fieldsGrouping(spoutId, new Fields("word"));
{code}
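
For readers who want to see what a 9 second window sliding every 3 seconds computes, here is a standalone sketch in plain Java with no Storm classes. The Event type, the method name and the boundary convention (start exclusive, end inclusive) are assumptions for illustration and are not taken from Storm's window manager.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: recomputes word counts for one sliding window. */
final class SlidingCountSketch {
    private SlidingCountSketch() {}

    /** Hypothetical event type: a word observed at a given time. */
    static final class Event {
        final long timestampMs;
        final String word;

        Event(long timestampMs, String word) {
            this.timestampMs = timestampMs;
            this.word = word;
        }
    }

    /** Counts words whose timestamp lies in (windowEndMs - windowLengthMs, windowEndMs]. */
    static Map<String, Long> countWindow(List<Event> events, long windowEndMs,
                                         long windowLengthMs) {
        Map<String, Long> counts = new HashMap<>();
        long windowStartMs = windowEndMs - windowLengthMs;
        for (Event e : events) {
            if (e.timestampMs > windowStartMs && e.timestampMs <= windowEndMs) {
                counts.merge(e.word, 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

Evaluating countWindow at window ends of 9000, 12000, 15000 ms and so on with a length of 9000 ms mimics the sliding behaviour: each event contributes to three consecutive windows before it expires. Because every window is counted from scratch inside a single call, there is no shared counter for concurrent threads to race on.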

> Possible Race Condition issue in SlidingWindow
> --
>
> Key: STORM-2027
> URL: https://issues.apache.org/jira/browse/STORM-2027
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: TestSlotBasedCounter.java
>
>
> The function SlotBasedCounter#incrementCount() presents a bug. If 2 
> concurrent threads want to update the same counter, the result is different 
> from the expected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Storm 1.0.2 (rc4)

2016-08-03 Thread Arun Mahadevan

+1 (binding)

Deployed 1.0.2 and ran sample topologies.

Thanks,
Arun

On 8/3/16, 5:57 AM, "Jungtaek Lim"  wrote:

>Reminder: the vote has been open for a week and has 1 binding and 2
>non-binding votes.
>
>Please participate in testing 1.0.2 RC4 and vote to make the release happen.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
>
>On Thu, Jul 28, 2016 at 5:56 PM, Xin Wang wrote:
>
>> +1 (non binding)
>>  Deployed 3-node cluster and example topologies.
>>
>>  Thanks,
>>  Xin.
>>
>> 2016-07-27 18:18 GMT+08:00 Satish Duggana :
>>
>> > +1 (Non binding)
>> >
>> > src distribution
>> >   - Retrieved source archive and built using 'mvn clean install -P
>> > all-tests'
>> >   - Built the binary package from the above source archive
>> >
>> > bin distribution
>> >   - Ran different topologies in local cluster
>> >   - Created a 3 node cluster with worker slots.
>> >   - Deployed few topologies
>> >   - Checked various options (like deactivate/kill/activate topology view
>> > etc) and monitoring stats in the UI for those topologies.
>> >   - Ran storm commands on those topologies like
>> > deactivate/rebalance/activate/kill with respective options.
>> >   - Killed some of the workers to check failover etc.
>> >   - Checked change log level settings for topologies.
>> >
>> > Thanks,
>> > Satish.
>> >
>> > On 7/27/16, 1:41 PM, "Jungtaek Lim"  wrote:
>> >
>> > +1 (binding)
>> >
>> > Detailed tests are done from RC3, and critical fixes for RC4 are tested
>> > manually while fixing them.
>> >
>> > - source extracted and build passed
>> > - binary extracted and daemons launched
>> > - running RollingTopWords in remote mode succeeded
>> >
>> > Thanks,
>> > Jungtaek Lim (HeartSaVioR)
>> >
>> > On Wed, Jul 27, 2016 at 5:04 AM, P. Taylor Goetz wrote:
>> >
>> > > This is a call to vote on releasing Apache Storm 1.0.2 (rc4)
>> > >
>> > > Full list of changes in this release:
>> > >
>> > >
>> > >
>> >
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;hb=c9768c154166b6217ec7bffc4a9aa73e90f2339d
>> > >
>> > > The tag/commit to be voted upon is v1.0.2:
>> > >
>> > >
>> > >
>> >
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;hb=54f319fd56c33437364c550a800ac9e6fe058b95
>> > >
>> > > The source archive being voted upon can be found here:
>> > >
>> > >
>> > >
>> >
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.2-rc4/apache-storm-1.0.2-src.tar.gz
>> > >
>> > > Other release files, signatures and digests can be found here:
>> > >
>> > >
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.2-rc4/
>> > >
>> > > The release artifacts are signed with the following key:
>> > >
>> > >
>> > >
>> >
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> > >
>> > > The Nexus staging repository for this release is:
>> > >
>> > >
>> > https://repository.apache.org/content/repositories/orgapachestorm-1038
>> > >
>> > > Please vote on releasing this package as Apache Storm 1.0.2.
>> > >
>> > > When voting, please list the actions taken to verify the release.
>> > >
>> > > This vote will be open for at least 72 hours.
>> > >
>> > > [ ] +1 Release this package as Apache Storm 1.0.2
>> > > [ ]  0 No opinion
>> > > [ ] -1 Do not release this package because...
>> > >
>> > > Thanks to everyone who contributed to this release.
>> > >
>> > > -Taylor
>> > >
>> >
>> >
>> >
>>




Re: [DISCUSSION] Policy of resolving dependencies for non storm-core modules

2016-07-21 Thread Arun Mahadevan
Shading and relocating the external modules sounds OK as a short-term solution. 

For the long term we should consider something like the second option to add 
external modules without shipping uber jars.

Thanks,
Arun

On 7/22/16, 6:07 AM, "Jungtaek Lim"  wrote:

>Hi devs,
>
>AFAIK, we have been struggling to resolve dependency issues for storm-core.
>As we all know, the strategy we have been using is shading & relocating.
>
>Now State and Storm SQL require that some external modules be included in
>extlib, which is on the classpath that workers refer to.
>
>http://issues.apache.org/jira/browse/STORM-1881
>https://issues.apache.org/jira/browse/STORM-1435
>
>There are two issues here:
>- We don't make uber jars for external modules, so users need to find and
>copy dependency jars to extlib manually.
>- External modules also use Guava, Jackson and so on, which are a source of
>version conflict issues.
>
>So we should apply the shading & relocating strategy to every external
>module (at least storm-redis, storm-kafka, storm-sql-core,
>storm-sql-kafka), or introduce a way to add the dependencies without adding
>them to extlib. (like --packages and --jars for Spark)
>
>Please express your opinions about this.
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)




[jira] [Created] (STORM-1987) Fix TridentKafkaWordCount arg handling

2016-07-19 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1987:
-

 Summary: Fix TridentKafkaWordCount arg handling 
 Key: STORM-1987
 URL: https://issues.apache.org/jira/browse/STORM-1987
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan
Priority: Minor


zkUrl and brokerUrl are not set correctly in distributed mode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (STORM-1964) Unexpected behavior when using count window together with timestamp extraction

2016-07-15 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan reassigned STORM-1964:
-

Assignee: Arun Mahadevan

> Unexpected behavior when using count window together with timestamp extraction
> --
>
> Key: STORM-1964
> URL: https://issues.apache.org/jira/browse/STORM-1964
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 1.0.1
>Reporter: Lorenzo Affetti
>    Assignee: Arun Mahadevan
>Priority: Minor
>  Labels: timestamp, windowing
>
> I launched a topology applying a tumbling count window of size 2 (watermark 
> interval 200ms, lag 1s) with the following input (timestamp,value):
> {noformat}
> (10,10)
> (10,20)
> (11,30)
> (12,40)
> (12,50)
> (12,60)
> (12,70)
> (13,80)
> (14,90)
> (15,100)
> {noformat}
> And I got these windows as output:
> {noformat}
> [(10,10), (10,20)]
> [(12,60), (12,70)]
> [(12,60), (12,70)]// why (60, 70) twice?
> [(13,80), (14,90)]
> {noformat}
> I would expect something like:
> {noformat}
> [(10,10), (10,20)]
> [(11,30), (12,40)]
> [(12,50), (12,60)]
> [(12,70), (13,80)]
> [(14,90), (15,100)]
> {noformat}
> It seems that timestamp extraction and count windows do not fit each 
> other.
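
For reference, the output the reporter expects corresponds to a plain tumbling count window that ignores timestamps altogether. A minimal sketch of that expectation in plain Java follows; it is not Storm's windowing code, and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: tumbling count windows over an in-memory list. */
final class TumblingCountSketch {
    private TumblingCountSketch() {}

    /**
     * Splits the input into consecutive, non-overlapping windows of the given
     * size. A trailing partial window (fewer than size elements) is not emitted.
     */
    static <T> List<List<T>> tumble(List<T> input, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> windows = new ArrayList<>();
        for (int start = 0; start + size <= input.size(); start += size) {
            windows.add(new ArrayList<>(input.subList(start, start + size)));
        }
        return windows;
    }
}
```

Applied to the ten values 10 through 100 with size 2, this yields the five windows listed in the expected output above, with each value appearing exactly once.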



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (STORM-1968) Storm logviewer does not work for nimbus.log in secure cluster

2016-07-13 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1968:
-

 Summary: Storm logviewer does not work for nimbus.log in secure 
cluster
 Key: STORM-1968
 URL: https://issues.apache.org/jira/browse/STORM-1968
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


logviewer invokes "get-log-user-group-whitelist" which tries to get the worker 
metadata file by invoking "get-log-metadata-file". In the case of nimbus.log 
clojure-from-yaml-file  returns nil and the authorization fails.

Modify clojure-from-yaml-file to return an empty map in case of failures so 
that the authorization can continue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (STORM-1873) Reemit late tuples in windowed mode

2016-06-09 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1873.
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   2.0.0

> Reemit late tuples in windowed mode
> ---
>
> Key: STORM-1873
> URL: https://issues.apache.org/jira/browse/STORM-1873
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-core
>Reporter: Balazs Kossovics
> Fix For: 2.0.0, 1.1.0
>
>
> Currently late tuples are just logged (and acknowledged in the coming 1.0.2), 
> but in our  use-case it would be desirable to emit them on a different stream 
> than the default.
> I implemented a first version, where every windowed bolt is going to have a 
> '_late' stream by default, and with a component-specific parameter 
> (Config.TOPOLOGY_BOLTS_EMIT_LATE_TUPLE) the definer of the bolt could turn on 
> or off the emission of the late tuples on this stream. 
> One could turn on the emission of late tuples with a builder method like this:
> {code:title=MyWindowedBolt.java|borderStyle=solid}
> new MyWindowedBolt()
>     .withTimestampField("timestamp")
>     .withLateTupleEmission(true)
>     .withWindow(
>         new BaseWindowedBolt.Duration(1, TimeUnit.MINUTES),
>         new BaseWindowedBolt.Duration(1, TimeUnit.SECONDS)
>     );
> {code}
> What do you think about it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (STORM-1878) Flux does not handle stateful bolts

2016-06-07 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1878.
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   1.0.2
   2.0.0

> Flux does not handle stateful bolts
> ---
>
> Key: STORM-1878
> URL: https://issues.apache.org/jira/browse/STORM-1878
> Project: Apache Storm
>  Issue Type: Bug
>  Components: Flux
>Affects Versions: 1.0.1
>Reporter: Daniel Klessing
> Fix For: 2.0.0, 1.0.2, 1.1.0
>
>
> We noticed that it is not possible at the moment to create a topology with 
> Flux which contains stateful bolts (based on IStatefulBolt). Those bolts will 
> not be instantiated.
> Pull request upcoming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: reemit late tuples in windowed mode

2016-05-30 Thread Arun Mahadevan
Hi Balázs,

The idea sounds good. 

Only whitelisted configs can be overridden at component level. The Java config 
names are converted to Clojure variables by replacing _ (underscore) with - 
(hyphen), and you need to add an entry like [1] in executor.clj to get it working.

- Arun

[1] 
https://github.com/apache/storm/blob/master/storm-core/src/clj/org/apache/storm/daemon/executor.clj#L114
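
The naming rule is simple enough to show in a couple of lines. This is only a sketch of the rule as described above; the helper and the constant name TOPOLOGY_BOLTS_LATE_TUPLE_STREAM are hypothetical and used purely as an example input.

```java
/** Illustrative only: maps a Java config constant name to its Clojure-style name. */
final class ConfigNameSketch {
    private ConfigNameSketch() {}

    /** Replaces every underscore with a hyphen, as described above. */
    static String toClojureName(String javaConstantName) {
        return javaConstantName.replace('_', '-');
    }

    public static void main(String[] args) {
        // Hypothetical constant name, chosen only to demonstrate the conversion.
        System.out.println(toClojureName("TOPOLOGY_BOLTS_LATE_TUPLE_STREAM"));
    }
}
```

So a new Java constant would be referred to in executor.clj by the hyphenated form of the same name.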




On 5/30/16, 11:26 PM, "Balázs Kossovics"  wrote:

>Hey,
>
>I'm trying to implement an alternative behaviour in case of late tuples 
>for windowing. Currently late tuples are just logged (and acknowledged 
>in the coming 1.0.2), but in my use case it would be desirable to emit 
>them onto a user-defined stream. One could define a bolt with a stream 
>for late tuples like this:
>
>new MyWindowedBolt()
>    .withTimestampField("timestamp")
>    .withLateTupleStream("late_tuples")
>    .withWindow(
>        new BaseWindowedBolt.Duration(1, TimeUnit.MINUTES),
>        new BaseWindowedBolt.Duration(1, TimeUnit.SECONDS)
>    )
>
>I made a quick patch 
>(https://github.com/kosii/storm/commit/216c991da3c5b6c6cac1b25182b86507c3fb5e9e)
>to test the idea, where the intended behaviour would be something like this:
>1. the withLateTupleStream builder method puts a new key into the 
>windowConfiguration Map,
>2. in WindowedBoltExecutor.declareOutputFields a new stream gets declared,
>3. in WindowedBoltExecutor.prepare we store the stream's name in a 
>private variable,
>4. if this variable is not null, then in the execute method we emit each 
>late tuple onto the new stream.
>
>The problem is in the 3rd step 
>(https://github.com/kosii/storm/commit/216c991da3c5b6c6cac1b25182b86507c3fb5e9e#diff-32eff130ad25f4b7b6c069e8d42245acR170),
>where the stormConf map no longer contains my key, which was still 
>present in the 2nd step. I'm out of ideas, so I'd really appreciate it if 
>someone could explain to me what's happening here.
>
>I'm also interested in your ideas concerning the feature, and what you 
>would eventually like to see upstream.
>
>Best regards,
>Balazs
>
>



[jira] [Created] (STORM-1868) Modify TridentKafkaWordCount to run in distributed mode

2016-05-26 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1868:
-

 Summary: Modify TridentKafkaWordCount to run in distributed mode
 Key: STORM-1868
 URL: https://issues.apache.org/jira/browse/STORM-1868
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (STORM-1859) Late tuples are not acked in windowed mode

2016-05-25 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1859.
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   1.0.2
   2.0.0

> Late tuples are not acked in windowed mode
> --
>
> Key: STORM-1859
> URL: https://issues.apache.org/jira/browse/STORM-1859
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Balazs Kossovics
> Fix For: 2.0.0, 1.0.2, 1.1.0
>
>
> The current implementation simply ignores late tuples without acking
> them, which causes timeouts and replays after
> TOPOLOGY_MESSAGE_TIMEOUT_SECS. A tuple which was late at some time
> is going to be late at any moment in the future, so there is no point
> of replaying it, because the lingering late tuples will just block the
> topology (especially if TOPOLOGY_MAX_SPOUT_PENDING is set).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (STORM-1850) State Checkpointing documentation update regarding spout state management

2016-05-23 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1850.
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   2.0.0

> State Checkpointing documentation update regarding spout state management
> -
>
> Key: STORM-1850
> URL: https://issues.apache.org/jira/browse/STORM-1850
> Project: Apache Storm
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Olivier Mallassi
>Priority: Trivial
>  Labels: newbie
> Fix For: 2.0.0, 1.1.0
>
>
> update the documentation with respect to this discussion on the Mailing list 
> https://mail-archives.apache.org/mod_mbox/storm-user/201605.mbox/%3CF9D9F747-8431-4A26-9028-BD74C09BCA84%40hortonworks.com%3E
>  
> concerning the "consistency" between spout and bolt state (during 
> checkpointing). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (STORM-1841) Address a few minor issues in windowing and doc

2016-05-20 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294011#comment-15294011
 ] 

Arun Mahadevan commented on STORM-1841:
---

Thanks [~raghavgautam] for finding and reporting this issue. The fix is merged 
to 1.x-branch.

> Address a few minor issues in windowing and doc
> ---
>
> Key: STORM-1841
> URL: https://issues.apache.org/jira/browse/STORM-1841
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>Priority: Minor
>
> 1. Do not accept negative values for window length or sliding interval in 
> BaseWindowedBolt
> 2. Added static factories for Count and Duration for ease of use.
> 3. Explicitly call out when the first window is evaluated for sliding windows 
> in the windowing doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (STORM-1851) Nimbus impersonation authorizer in defaults.yaml causes issues in secure mode

2016-05-19 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1851:
-

 Summary: Nimbus impersonation authorizer in defaults.yaml causes 
issues in secure mode
 Key: STORM-1851
 URL: https://issues.apache.org/jira/browse/STORM-1851
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan
Priority: Minor


  "nimbus.impersonation.authorizer" is set to "ImpersonationAuthorizer" by 
default, and this causes issues when a user tries to submit a topology as a 
different user in secure mode, since the "nimbus.impersonation.acl" 
configuration is not set by default. Users need to set nimbus.impersonation.acl 
first before they can submit a topology as a user other than "storm" in secure 
mode.

Removing this config allows users to submit topologies as any user in secure 
mode by default. Users can set up impersonation by providing both authorizer 
and the acls in storm.yaml.
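
For illustration, a storm.yaml fragment that sets both the authorizer and the ACL might look roughly like the sketch below. The user, host and group names are made up, and the nested layout of the ACL is an assumption based on my reading of the Storm security docs, so please verify it against the version you run.

```yaml
# storm.yaml (sketch; gateway_user, the host and the group are hypothetical)
nimbus.impersonation.authorizer: org.apache.storm.security.auth.authorizer.ImpersonationAuthorizer
nimbus.impersonation.acl:
    gateway_user:
        hosts:
            [gateway-host.example.com]
        groups:
            [analysts]
```

With neither key set, no impersonation check would be configured, which is the default behavior proposed above.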





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-17 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287098#comment-15287098
 ] 

Arun Mahadevan commented on STORM-1757:
---

> So if I ask all bolts to checkpoint and most succeed but one does not, I can 
> not longer restore everyone to a consistent place, so I can start replaying 
> again. This is what I am concerned about.

The checkpointing handles this via a prepare and a commit phase. 
- First a "prepare" message is sent. If the prepare fails (most of the bolts 
succeeded but one did not), the state is restored to the last successful 
checkpoint (rollback) and tuples are replayed.
- If prepare succeeds, a "commit" message is sent.
- If commit fails, the commit is re-attempted by sending the "commit" message 
again. The bolts that had already committed ignore this, and the bolt that had 
previously not committed now commits. They can do so because the prepared 
data is persisted in the state during the prepare phase.
- Once the commit succeeds, the txn is marked as complete. 
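
The steps above can be sketched in plain Java as follows. This is only a minimal illustration under my own assumptions, not Storm's implementation: Participant, checkpoint and the failPrepare flag are hypothetical names, and the "persisted" prepared value is just a field here.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoPhaseCheckpointSketch {

    /** Hypothetical stand-in for a stateful bolt; not Storm's API. */
    static final class Participant {
        private long committed = 0;   // last committed value
        private Long prepared = null; // value saved during prepare, pending commit
        private final boolean failPrepare;

        Participant(boolean failPrepare) {
            this.failPrepare = failPrepare;
        }

        boolean prepare(long value) {
            if (failPrepare) {
                return false;
            }
            prepared = value;
            return true;
        }

        /** Safe to repeat: a participant that already committed ignores the call. */
        void commit() {
            if (prepared != null) {
                committed = prepared;
                prepared = null;
            }
        }

        /** Discards anything prepared but not yet committed. */
        void rollback() {
            prepared = null;
        }

        long committedValue() {
            return committed;
        }
    }

    /** Returns true if the checkpoint completed, false if it was rolled back. */
    static boolean checkpoint(List<Participant> participants, long value) {
        for (Participant p : participants) {
            if (!p.prepare(value)) {
                for (Participant q : participants) {
                    q.rollback();
                }
                return false; // the caller would replay tuples from here
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        List<Participant> bolts = new ArrayList<>();
        bolts.add(new Participant(false));
        bolts.add(new Participant(true)); // this one fails to prepare
        System.out.println("completed: " + checkpoint(bolts, 42L));
    }
}
```

Because commit() is a no-op once the prepared value has been applied, sending the "commit" message again after a partial failure is harmless for the participants that already committed.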

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-17 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286896#comment-15286896
 ] 

Arun Mahadevan commented on STORM-1757:
---

The state transition doc tries to explain some of this.

https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/spout/CheckPointState.java#L25


> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-17 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286881#comment-15286881
 ] 

Arun Mahadevan commented on STORM-1757:
---

[~revans2] the current checkpointing supports at-least-once, and maybe we could 
use it as is for the initial prototype if we are ok with the guarantee. The 
current implementation does not checkpoint the state of the spout, but I think 
it can be extended to do so either by having the checkpoint spout coordinate 
with user spouts via ZooKeeper or by having the checkpoint spout act as a 
coordinator with user spouts running as bolts (similar to Trident). I will put 
more thought into this and try to come up with a prototype. 

I am not sure I correctly understand the requirement for having a commit id in 
rollback. The current restore happens by discarding any prepared (but 
uncommitted) changes or rolling forward any commits that were in progress.
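
To make the restore rule concrete, here is a small sketch of the decision as I described it. The enum and method names are hypothetical and chosen for this example only; the real state machine is the one documented in CheckPointState.java.

```java
final class RestoreDecisionSketch {

    /** Hypothetical view of the last recorded checkpoint state. */
    enum State { PREPARING, COMMITTING, COMMITTED }

    /** What a restore would do for each state. */
    enum Action { DISCARD_PREPARED, ROLL_FORWARD_COMMIT, NOTHING }

    static Action onRestore(State lastRecorded) {
        switch (lastRecorded) {
            case PREPARING:
                // Prepared but not committed: throw the changes away.
                return Action.DISCARD_PREPARED;
            case COMMITTING:
                // A commit was in progress: finish it.
                return Action.ROLL_FORWARD_COMMIT;
            default:
                // Fully committed: the state is already consistent.
                return Action.NOTHING;
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + onRestore(s));
        }
    }
}
```

Note that neither branch needs a commit id as input; the decision depends only on the recorded state.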

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Created] (STORM-1841) Address a few minor issues in windowing and doc

2016-05-16 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1841:
-

 Summary: Address a few minor issues in windowing and doc
 Key: STORM-1841
 URL: https://issues.apache.org/jira/browse/STORM-1841
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 2.0.0, 1.0.1
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan
Priority: Minor


1. Do not accept negative values for window length or sliding interval in 
BaseWindowedBolt
2. Added static factories for Count and Duration for ease of use.
3. Explicitly call out when the first window is evaluated for sliding windows 
in the windowing doc.
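
A rough sketch of items 1 and 2, using stand-in classes rather than the actual BaseWindowedBolt code: Count, Duration and validate below are written for this example, and rejecting zero as well as negative values is my assumption beyond what the issue text says.

```java
final class WindowConfigSketch {

    /** Count-based window parameter with a static factory. */
    static final class Count {
        final int value;

        private Count(int value) {
            this.value = value;
        }

        static Count of(int value) {
            return new Count(value);
        }
    }

    /** Time-based window parameter, stored in milliseconds. */
    static final class Duration {
        final long millis;

        private Duration(long millis) {
            this.millis = millis;
        }

        static Duration ofMillis(long millis) {
            return new Duration(millis);
        }

        static Duration seconds(long seconds) {
            return new Duration(seconds * 1000L);
        }

        static Duration minutes(long minutes) {
            return new Duration(minutes * 60L * 1000L);
        }
    }

    /** Rejects window lengths or sliding intervals that are not positive. */
    static void validate(long windowLength, long slidingInterval) {
        if (windowLength <= 0) {
            throw new IllegalArgumentException("Window length must be positive: " + windowLength);
        }
        if (slidingInterval <= 0) {
            throw new IllegalArgumentException("Sliding interval must be positive: " + slidingInterval);
        }
    }

    public static void main(String[] args) {
        Duration length = Duration.minutes(1);
        Duration slide = Duration.seconds(10);
        validate(length.millis, slide.millis);
        System.out.println("window=" + length.millis + "ms slide=" + slide.millis + "ms");
    }
}
```

The factories keep call sites readable (Duration.seconds(10) instead of a bare number), and the validation fails fast when the topology is built rather than at runtime.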





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-16 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284767#comment-15284767
 ] 

Arun Mahadevan commented on STORM-1757:
---

[~revans2] distributed checkpointing implemented on top of Storm's acking 
mechanism is currently used to implement Storm's stateful bolts. It 
currently provides an at-least-once guarantee, and I think it can be enhanced to 
address the checkpointing requirements.

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-16 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284266#comment-15284266
 ] 

Arun Mahadevan commented on STORM-1757:
---

The multiple active windows that [~satish.duggana] is proposing are good to 
have to address feature parity with Beam and processing of late events, but we 
should also be aware of the extra complexity they will add in terms of 
buffering and tracking multiple windows. Custom triggers and eviction will be 
useful with a single active window as well and may be much simpler to add 
first. We should also try to make the Trident implementation robust in terms 
of basic event time processing and time windows, and address follow-on JIRAs.

[~sriharsha]'s idea of a unified API based on the Java streams API sounds good, 
and if it makes sense, let's open a separate JIRA to work on it.

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Created] (STORM-1833) Add simple equi-join support in storm-sql standalone mode

2016-05-13 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1833:
-

 Summary: Add simple equi-join support in storm-sql standalone mode
 Key: STORM-1833
 URL: https://issues.apache.org/jira/browse/STORM-1833
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Provide simple equi join support in storm sql standalone mode.





[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-13 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282475#comment-15282475
 ] 

Arun Mahadevan commented on STORM-1757:
---

A unified Storm stream API is ideal but could be a bigger effort than doing a 
Beam runner PoC with the existing Storm Trident APIs, or adding minimal extra 
APIs, to get started and to identify the gaps.

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





Re: [VOTE] Release Apache Storm 0.10.1 (rc2)

2016-05-03 Thread Arun Mahadevan
+1 (binding)
- Extracted binaries
- Ran sample topologies 
- Browsed storm UI

Thanks,
Arun




On 4/28/16, 12:01 PM, "Jungtaek Lim"  wrote:

>+1 (binding)
>
>- testing with source distribution : OK
>  - unzip : OK
>  - building from source dist : OK
>- how to build: running `mvn -P all-tests clean install` on unzipped
>source dist.
>
>- testing with binary distribution (one machine) : OK
>  - launch daemons : OK
>  - run RollingTopWords (local) : OK
>  - run RollingTopWords (remote) : OK
>- activate / deactivate / rebalance / kill : OK
>
>Thanks,
>Jungtaek Lim
>
>On Thu, Apr 28, 2016 at 3:29 AM, P. Taylor Goetz wrote:
>
>> This is a call to vote on releasing Apache Storm 0.10.1 (rc2)
>>
>> Full list of changes in this release:
>>
>>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;hb=8179921a569b6cf1d97798eed8e7b03b131bc495
>>
>> The tag/commit to be voted upon is v0.10.1:
>>
>>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=ddf051149f3c386342937efdabdaf45694602dd1;hb=8179921a569b6cf1d97798eed8e7b03b131bc495;a=tree;h=26291835f22474506f0fe90b0459eab0d00bf4a9;hb=f0d3eae7395b3ee036b94b922707f74868ba6190
>>
>> The source archive being voted upon can be found here:
>>
>>
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-0.10.1-rc2/apache-storm-0.10.1-src.tar.gz
>>
>> Other release files, signatures and digests can be found here:
>>
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-0.10.1-rc2/
>>
>> The release artifacts are signed with the following key:
>>
>>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>>
>> The Nexus staging repository for this release is:
>>
>> https://repository.apache.org/content/repositories/orgapachestorm-1031/
>>
>> Please vote on releasing this package as Apache Storm 0.10.1.
>>
>> When voting, please list the actions taken to verify the release.
>>
>> This vote will be open for at least 72 hours.
>>
>> [ ] +1 Release this package as Apache Storm 0.10.1
>> [ ]  0 No opinion
>> [ ] -1 Do not release this package because...
>>
>> Thanks to everyone who contributed to this release.
>>
>> -Taylor
>>



Re: [VOTE] Release Apache Storm 1.0.1 (rc3)

2016-05-03 Thread Arun Mahadevan
+1 (binding)
- Extracted all binaries.
- Ran sample topologies.
- Verified that the Stateful bolt changes added in 1.0.1 work as expected.

Thanks,
Arun





On 5/3/16, 4:52 AM, "Harsha"  wrote:

>+1 (binding)
>  - Deployed 3-node cluster  and example topologies
>  - Verified the binaries signature.
>Thanks,
>Harsha
>
>On Sun, May 1, 2016, at 07:11 PM, Jungtaek Lim wrote:
>> +1 (binding)
>> 
>> - build succeeded from source code
>> - binary extracted well, and supervisor launches worker fine even though
>> JAVA_HOME is unset (STORM-1741) / also checked when JAVA_HOME is set
>> 
>> Thanks for all the effort everyone has put into this release.
>> 
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>> 
>> 
>> On Sat, Apr 30, 2016 at 10:08 AM, Xin Wang wrote:
>> 
>> > +1 (Non binding)
>> >
>> > 2016-04-30 5:28 GMT+08:00 P. Taylor Goetz :
>> >
>> > > If at first you don't succeed... ;)
>> > >
>> > > This is a call to vote on releasing Apache Storm 1.0.1 (rc3)
>> > >
>> > > Full list of changes in this release:
>> > >
>> > >
>> > >
>> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;hb=b5c16f919ad4099e6fb25f1095c9af8b64ac9f91
>> > >
>> > > The tag/commit to be voted upon is v1.0.1:
>> > >
>> > >
>> > >
>> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=14b2ac0fe33759a50a2392ebdc7ad3821d43b89f;hb=b5c16f919ad4099e6fb25f1095c9af8b64ac9f91
>> > >
>> > > The source archive being voted upon can be found here:
>> > >
>> > >
>> > >
>> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.1-rc3/apache-storm-1.0.1-src.tar.gz
>> > >
>> > > Other release files, signatures and digests can be found here:
>> > >
>> > > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.1-rc3/
>> > >
>> > > The release artifacts are signed with the following key:
>> > >
>> > >
>> > >
>> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> > >
>> > > The Nexus staging repository for this release is:
>> > >
>> > > https://repository.apache.org/content/repositories/orgapachestorm-1033/
>> > >
>> > > Please vote on releasing this package as Apache Storm 1.0.1.
>> > >
>> > > When voting, please list the actions taken to verify the release.
>> > >
>> > > This vote will be open for at least 72 hours.
>> > >
>> > > [ ] +1 Release this package as Apache Storm 1.0.1
>> > > [ ]  0 No opinion
>> > > [ ] -1 Do not release this package because...
>> > >
>> > > Thanks to everyone who contributed to this release.
>> > >
>> > > -Taylor
>> > >
>> >
>



[jira] [Commented] (STORM-1757) Apache Beam Runner for Storm

2016-05-02 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268155#comment-15268155
 ] 

Arun Mahadevan commented on STORM-1757:
---

Would like to collaborate in this effort.

> Apache Beam Runner for Storm
> 
>
> Key: STORM-1757
> URL: https://issues.apache.org/jira/browse/STORM-1757
> Project: Apache Storm
>  Issue Type: Brainstorming
>Reporter: P. Taylor Goetz
>Priority: Minor
>
> This is a call for interested parties to collaborate on an Apache Beam [1] 
> runner for Storm, and express their thoughts and opinions.
> Given the addition of the Windowing API to Apache Storm, we should be able to 
> map naturally to the Beam API. If not, it may be indicative of shortcomings 
> of the Storm API that should be addressed.
> [1] http://beam.incubator.apache.org





[jira] [Created] (STORM-1714) StatefulBolts ends up as normal bolts while using TopologyBuilder.setBolt without parallelism

2016-04-15 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1714:
-

 Summary: StatefulBolts ends up as normal bolts while using 
TopologyBuilder.setBolt without parallelism
 Key: STORM-1714
 URL: https://issues.apache.org/jira/browse/STORM-1714
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.0.0, 2.0.0
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


StatefulBolt inherits from IRichBolt, but the TopologyBuilder.setBolt 
overload is chosen based on the static type of the parameter, causing issues. 
See if StatefulBolt can be refactored to not directly inherit from IRichBolt.
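
The static-type effect can be reproduced outside Storm with a few stand-in types. Everything below (RichBolt, StatefulBolt, Builder) is hypothetical and only mirrors the inheritance relationship described above; it is not the TopologyBuilder code.

```java
final class OverloadSketch {

    interface RichBolt { }

    /** Mirrors the problem: the stateful interface extends the plain one. */
    interface StatefulBolt extends RichBolt { }

    static final class Builder {
        String setBolt(RichBolt bolt) {
            return "plain";
        }

        String setBolt(StatefulBolt bolt) {
            return "stateful";
        }
    }

    /** The object is stateful, but the variable's declared type is RichBolt. */
    static String declaredAsRich() {
        StatefulBolt impl = new StatefulBolt() { };
        RichBolt bolt = impl;
        return new Builder().setBolt(bolt);
    }

    /** Same kind of object, declared with the stateful type. */
    static String declaredAsStateful() {
        StatefulBolt bolt = new StatefulBolt() { };
        return new Builder().setBolt(bolt);
    }

    public static void main(String[] args) {
        System.out.println(declaredAsRich());
        System.out.println(declaredAsStateful());
    }
}
```

Java picks the overload at compile time from the declared type, so the same stateful object is wired as a plain bolt whenever only the plain overload matches the call. Breaking the direct inheritance would turn such a call into a compile error or an unambiguous match.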





[jira] [Created] (STORM-1711) Kerberos principals gets mixed up while using storm-hive

2016-04-14 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1711:
-

 Summary: Kerberos principals gets mixed up while using storm-hive
 Key: STORM-1711
 URL: https://issues.apache.org/jira/browse/STORM-1711
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.0.0, 2.0.0
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Storm-hive uses UserGroupInformation.loginUserFromKeytab which updates the 
static variable that stores current UGI.





[jira] [Created] (STORM-1709) Add group by support in storm-sql standalone mode

2016-04-13 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1709:
-

 Summary: Add group by support in storm-sql standalone mode
 Key: STORM-1709
 URL: https://issues.apache.org/jira/browse/STORM-1709
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan








Re: AW: Use only latest values

2016-04-10 Thread Arun Mahadevan
Hi Matthias, 

WindowedBolt does support event time. In Trident it is not yet exposed.

Hi Daniela,

You could solve your use case in different ways. One would be to have a 
WindowedBolt with a 1 min tumbling window, do your custom aggregation (e.g. 
sum) every time the window tumbles, and emit the results to another bolt where 
you update the count in Redis. Most of your state saving could also be 
automated by defining a stateful bolt that periodically checkpoints your 
state (sum per device). You could also combine both windowing and state in a 
StatefulWindowedBolt implementation. You can evaluate the options and decide 
based on your use cases.
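
As a rough illustration of the logic such a bolt would carry (plain Java, no Storm classes; LatestValueSumSketch, onWindow and reading are names invented for this example), the state is the last value per device and each tumbling window overwrites whatever arrived before summing:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LatestValueSumSketch {

    // State that a stateful bolt would checkpoint: last known value per device.
    private final Map<String, Double> latest = new HashMap<>();

    /** Helper to build a (device, value) pair. */
    static Map.Entry<String, Double> reading(String device, double value) {
        return new AbstractMap.SimpleEntry<>(device, value);
    }

    /**
     * Called once per tumbling window with the readings that arrived in it.
     * Devices that sent nothing keep contributing their previous value.
     */
    double onWindow(List<Map.Entry<String, Double>> readings) {
        for (Map.Entry<String, Double> r : readings) {
            latest.put(r.getKey(), r.getValue());
        }
        double sum = 0;
        for (double v : latest.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        LatestValueSumSketch sketch = new LatestValueSumSketch();
        System.out.println(sketch.onWindow(Arrays.asList(reading("dev1", 1.0), reading("dev2", 2.0))));
        // dev2 is silent in the second minute; its old value is reused.
        System.out.println(sketch.onWindow(Arrays.asList(reading("dev1", 5.0))));
    }
}
```

In a real topology the map would live in the bolt's checkpointed state and the returned sum would be emitted downstream to the bolt that writes to Redis.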

Take a look at the sample topologies (SlidingWindowTopology, 
SlidingTupleTsTopology, StatefulTopology, StatefulWindowingTopology) in 
storm-starter and the docs for more info.

https://github.com/apache/storm/blob/master/docs/Windowing.md

https://github.com/apache/storm/blob/master/docs/State-checkpointing.md


-Arun




On 4/10/16, 4:30 PM, "Matthias J. Sax"  wrote:

>A tumbling window (ie, non-overlapping window) is the right approach (a
>sliding window is overlapping).
>
>The window goes into your aggregation bolt (windowing and aggregation
>goes hand in hand, ie, when the window gets closed, the aggregation is
>triggered and the window content is handed over to the aggregation
>function).
>
>Be aware that Storm (currently) only supports processing time windows (and
>no event time windows).
>
>-Matthias
>
>
>On 04/10/2016 09:56 AM, Daniela Stoiber wrote:
>> Hi,
>> 
>> thank you for your reply.
>> 
>> How can I ensure that the latest values are pulled from Redis and the sum is
>> updated every minute? Do I need a sliding window with an interval of 1
>> minute? Where would this sliding window be located in my topology?
>> 
>> Thank you in advance.
>> 
>> Regards,
>> Daniela 
>> 
>> -Original Message-
>> From: Matthias J. Sax [mailto:mj...@apache.org] 
>> Sent: Saturday, April 9, 2016 12:13
>> To: dev@storm.apache.org
>> Subject: Re: Use only latest values
>> 
>> Sounds reasonable.
>> 
>> 
>> On 04/09/2016 08:34 AM, Daniela Stoiber wrote:
>>> Hi,
>>>
>>>  
>>>
>>> I would like to cache values and to use only the latest "valid" values 
>>> to build a sum.
>>>
>>> In more detail, I receive values from devices periodically. I would 
>>> like to add up all the valid values each minute. But not every device 
>>> sends a new value every minute. And as long as there is no new value 
>>> the old one should be used for the sum. As soon as I receive a new 
>>> value from a device I would like to overwrite the old value and to use 
>>> the new one for the sum. Would that be possible with the combination of
>> Storm and Redis?
>>>
>>>  
>>>
>>> My idea was to use the following:
>>>
>>>  
>>>
>>> - Kafka Spout
>>>
>>> - Storm Bolt for storing the tuples in Redis and for overwriting the 
>>> values as soon as a new one is delivered
>>>
>>> - Storm Bolt for reading the latest tuples from Redis
>>>
>>> - Storm Bolt for grouping (I would like to group the devices per 
>>> region)
>>>
>>> - Storm Bolt for aggregation
>>>
>>> - Storm Bolt for storing the results again in Redis
>>>
>>>  
>>>
>>> Thank you in advance.
>>>
>>>  
>>>
>>> Regards,
>>>
>>> Daniela
>>>
>>>
>> 
>> 
>



[jira] [Updated] (STORM-1692) HBaseSecurityUtil#login modifies the current UGI causing issues if two instances are running with different credentials

2016-04-07 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-1692:
--
Summary: HBaseSecurityUtil#login modifies the current UGI causing issues if 
two instances are running with different credentials  (was: 
org.apache.storm.hbase.security.HBaseSecurityUtil#login uses broken double 
checked locking. )

> HBaseSecurityUtil#login modifies the current UGI causing issues if two 
> instances are running with different credentials
> ---
>
> Key: STORM-1692
> URL: https://issues.apache.org/jira/browse/STORM-1692
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-hbase
>Reporter: Satish Duggana
> Fix For: 2.0.0
>
>






[jira] [Commented] (STORM-1692) org.apache.storm.hbase.security.HBaseSecurityUtil#login uses broken double checked locking.

2016-04-07 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229845#comment-15229845
 ] 

Arun Mahadevan commented on STORM-1692:
---

HBaseSecurityUtil uses legacyprovider.login(), which ultimately calls 
UserGroupInformation.loginUserFromKeytab(). That method modifies the static 
loginUser (the currently logged-in user), and this causes issues if two tasks 
in the same JVM log in with different credentials. 

The entire login method should instead be rewritten so that it does not affect 
the current user during login.
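
The problem and the direction of the fix can be shown with a small stand-in, free of any Hadoop classes. LoginSketch, loginStatic, loginAndReturn and Session are hypothetical names; the principals are made up.

```java
final class LoginSketch {

    // Process-wide "current user", analogous to a static login user.
    private static String currentUser;

    /** Login that overwrites shared static state. */
    static void loginStatic(String principal) {
        currentUser = principal;
    }

    static String currentUser() {
        return currentUser;
    }

    /** Per-caller handle; holding one does not touch the shared state. */
    static final class Session {
        final String principal;

        Session(String principal) {
            this.principal = principal;
        }
    }

    /** Alternative login that returns a handle instead of mutating a static. */
    static Session loginAndReturn(String principal) {
        return new Session(principal);
    }

    public static void main(String[] args) {
        loginStatic("taskA@EXAMPLE.COM");
        loginStatic("taskB@EXAMPLE.COM"); // task A's identity is now gone
        System.out.println("static user: " + currentUser());

        Session a = loginAndReturn("taskA@EXAMPLE.COM");
        Session b = loginAndReturn("taskB@EXAMPLE.COM");
        System.out.println(a.principal + " / " + b.principal);
    }
}
```

Hadoop's UserGroupInformation also has loginUserFromKeytabAndReturnUGI, which may be one way to get the second behavior; whether it fits this code path is something I have not verified here.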

> org.apache.storm.hbase.security.HBaseSecurityUtil#login uses broken double 
> checked locking. 
> 
>
> Key: STORM-1692
> URL: https://issues.apache.org/jira/browse/STORM-1692
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-hbase
>Reporter: Satish Duggana
> Fix For: 2.0.0
>
>






Re: [VOTE] Release Apache Storm 1.0.0 (rc2)

2016-04-06 Thread Arun Mahadevan
+1

Verified build from the source archive, deployed .tar.gz binary and ran a 
sample topology.

- Arun



On 4/6/16, 2:40 AM, "P. Taylor Goetz"  wrote:

>+1 (binding)
>
>- Verified build from source archive with `mvn clean install -P all-tests`
>- Checked LICENSE and NOTICE files
>- Deployed to a small cluster and tested a variety of topologies.
>
>-Taylor
>
>> On Apr 5, 2016, at 2:38 PM, P. Taylor Goetz  wrote:
>> 
>> This is a call to vote on releasing Apache Storm 1.0.0 (rc2)
>> 
>> Full list of changes in this release:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;hb=dba655a47aaad74f26b9bb9a75fa52c0eedd8b1e
>> 
>> The tag/commit to be voted upon is v1.0.0:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=91db02dbcbfc4ce6b949b1316c9ebc9ec1bcc95f;hb=dba655a47aaad74f26b9bb9a75fa52c0eedd8b1e
>> 
>> The source archive being voted upon can be found here:
>> 
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.0-rc2/apache-storm-1.0.0-src.tar.gz
>> 
>> Other release files, signatures and digests can be found here:
>> 
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.0.0-rc2/
>> 
>> The release artifacts are signed with the following key:
>> 
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> 
>> The Nexus staging repository for this release is:
>> 
>> https://repository.apache.org/content/repositories/orgapachestorm-1028/
>> 
>> Please vote on releasing this package as Apache Storm 1.0.0.
>> 
>> When voting, please list the actions taken to verify the release.
>> 
>> This vote will be open for at least 72 hours.
>> 
>> [ ] +1 Release this package as Apache Storm 1.0.0
>> [ ]  0 No opinion
>> [ ] -1 Do not release this package because...
>> 
>> Thanks to everyone who contributed to this release.
>> 
>> -Taylor
>



Re: Combining group by and time window

2016-03-29 Thread Arun Mahadevan
Hi Daniela,

For Storm core, windowed bolts would give you the tuples in the last minute, but 
you would have to do the grouping yourself. You could of course use a fields 
grouping to split the load across the windowed bolts. For Trident, you might 
want to take a look at the windowing APIs that were added recently and see if 
they fit your needs. You have to choose between Trident and core based on your 
use cases, the guarantees you need, whether you need batching or per-tuple 
processing, etc.
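
"Doing the grouping yourself" inside the windowed bolt could look roughly like this. It is a plain Java sketch with no Storm classes; Reading and sumByRegion are names invented for the example, and the list stands in for the tuples handed to the bolt for one window.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegionWindowSketch {

    /** One value from the time series, tagged with its region. */
    static final class Reading {
        final String region;
        final double value;

        Reading(String region, double value) {
            this.region = region;
            this.value = value;
        }
    }

    /** Sums the values of one window per region. */
    static Map<String, Double> sumByRegion(List<Reading> window) {
        Map<String, Double> sums = new TreeMap<>();
        for (Reading r : window) {
            sums.merge(r.region, r.value, Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Reading> lastMinute = Arrays.asList(
            new Reading("eu", 1.5),
            new Reading("us", 3.0),
            new Reading("eu", 2.5));
        System.out.println(sumByRegion(lastMinute));
    }
}
```

With a fields grouping on the region field, each windowed bolt instance would only see its own regions, so the per-window map stays small and the results can be written to Redis by a downstream bolt.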

- Arun




On 3/29/16, 8:52 PM, "Daniela Stoiber"  wrote:

>Hi,
>
> 
>
>I have a stream with time series data from different regions. I would like
>to group the stream by the different regions and to add up the values of the
>last minute (time window) per region. The sums should be persisted to Redis
>or something like this.
>
> 
>
>I already found out that Storm Trident provides a group by function to split
>the stream. I think this could be useful.
>
>Storm core provides time windows, so I could use it for the aggregation.
>
> 
>
>But how can I combine these two components? Or is this not possible?
>
> 
>
>Would it be useful to do the grouping already in Kafka (with different
>topics) or is it better to do it in Storm?
>
>Thank you in advance.
>
>Regards,
>
>Daniela
>



[jira] [Created] (STORM-1662) Reduce map lookups in send_to_eventlogger

2016-03-28 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1662:
-

 Summary: Reduce map lookups in send_to_eventlogger
 Key: STORM-1662
 URL: https://issues.apache.org/jira/browse/STORM-1662
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Reducing map lookups in send_to_eventlogger can improve performance when a 
spout emits in a tight loop.
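
The general idea, shown in Java rather than in the Clojure of the actual function: look the value up once and reuse it instead of hitting the map on every emit. CountingConf and the "debug.enabled" key are hypothetical and exist only to make the difference measurable; this is not the change made in send_to_eventlogger.

```java
import java.util.HashMap;
import java.util.Map;

final class LookupHoistSketch {

    /** Map wrapper that counts get() calls so the two variants can be compared. */
    static final class CountingConf {
        private final Map<String, Object> conf = new HashMap<>();
        int lookups = 0;

        CountingConf() {
            conf.put("debug.enabled", Boolean.FALSE); // hypothetical key
        }

        Object get(String key) {
            lookups++;
            return conf.get(key);
        }
    }

    /** One map lookup per emitted tuple. */
    static int emitLookingUpEachTime(CountingConf conf, int tuples) {
        int logged = 0;
        for (int i = 0; i < tuples; i++) {
            if (Boolean.TRUE.equals(conf.get("debug.enabled"))) {
                logged++;
            }
        }
        return logged;
    }

    /** A single lookup, hoisted out of the loop. */
    static int emitWithHoistedLookup(CountingConf conf, int tuples) {
        boolean debug = Boolean.TRUE.equals(conf.get("debug.enabled"));
        int logged = 0;
        for (int i = 0; i < tuples; i++) {
            if (debug) {
                logged++;
            }
        }
        return logged;
    }

    public static void main(String[] args) {
        CountingConf each = new CountingConf();
        emitLookingUpEachTime(each, 1000);
        CountingConf hoisted = new CountingConf();
        emitWithHoistedLookup(hoisted, 1000);
        System.out.println(each.lookups + " lookups vs " + hoisted.lookups);
    }
}
```

Hoisting is only valid if the looked-up value cannot change while the loop runs; otherwise the cached copy has to be refreshed when the setting changes.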





[jira] [Created] (STORM-1615) Update state checkpointing doc with bolt's acking contract

2016-03-09 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1615:
-

 Summary: Update state checkpointing doc with bolt's acking contract
 Key: STORM-1615
 URL: https://issues.apache.org/jira/browse/STORM-1615
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan
 Fix For: 1.0.0, 2.0.0


Update 
https://github.com/apache/storm/blob/asf-site/documentation/State-checkpointing.md





[jira] [Updated] (STORM-1608) Fix stateful topology acking behavior

2016-03-06 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-1608:
--
Summary: Fix stateful topology acking behavior  (was: Fix stateful bolt 
acking behavior)

> Fix stateful topology acking behavior
> -
>
> Key: STORM-1608
> URL: https://issues.apache.org/jira/browse/STORM-1608
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 1.0.0, 2.0.0
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>
> Right now the acking is automatically taken care of for the non-stateful 
> bolts in a stateful topology. This leads to double acking if BaseRichBolts 
> are part of the topology. For the non-stateful bolts, it's better to let the 
> bolt do the acking rather than automatically acking.





[jira] [Created] (STORM-1608) Fix stateful bolt acking behavior

2016-03-06 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1608:
-

 Summary: Fix stateful bolt acking behavior
 Key: STORM-1608
 URL: https://issues.apache.org/jira/browse/STORM-1608
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.0.0, 2.0.0
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Right now the acking is automatically taken care of for the non-stateful bolts 
in a stateful topology. This leads to double acking if BaseRichBolts are part 
of the topology. For the non-stateful bolts, it's better to let the bolt do the 
acking rather than automatically acking.
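
A small stand-in makes the double ack visible. Collector, Bolt and the two run methods are hypothetical and only mimic the situation described above, where a bolt acks explicitly and the framework then acks again on its behalf.

```java
import java.util.HashMap;
import java.util.Map;

final class AckSketch {

    /** Counts acks per tuple id. */
    static final class Collector {
        private final Map<String, Integer> acks = new HashMap<>();

        void ack(String tupleId) {
            acks.merge(tupleId, 1, Integer::sum);
        }

        int ackCount(String tupleId) {
            return acks.getOrDefault(tupleId, 0);
        }
    }

    interface Bolt {
        void execute(String tupleId, Collector collector);
    }

    /** Acks explicitly, the way a BaseRichBolt-style bolt is expected to. */
    static final class SelfAckingBolt implements Bolt {
        @Override
        public void execute(String tupleId, Collector collector) {
            collector.ack(tupleId);
        }
    }

    /** Current behavior: the wrapper acks after the bolt has run. */
    static void runWithAutoAck(Bolt bolt, String tupleId, Collector collector) {
        bolt.execute(tupleId, collector);
        collector.ack(tupleId);
    }

    /** Proposed behavior: leave acking to the bolt. */
    static void runWithoutAutoAck(Bolt bolt, String tupleId, Collector collector) {
        bolt.execute(tupleId, collector);
    }

    public static void main(String[] args) {
        Collector auto = new Collector();
        runWithAutoAck(new SelfAckingBolt(), "t1", auto);
        Collector manual = new Collector();
        runWithoutAutoAck(new SelfAckingBolt(), "t1", manual);
        System.out.println("auto: " + auto.ackCount("t1") + ", manual: " + manual.ackCount("t1"));
    }
}
```

Leaving the ack to the bolt keeps the contract for non-stateful bolts the same whether or not they run in a stateful topology.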





[jira] [Created] (STORM-1586) ExprCompiler support for UDFs in Storm-sql

2016-02-28 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1586:
-

 Summary: ExprCompiler support for UDFs in Storm-sql
 Key: STORM-1586
 URL: https://issues.apache.org/jira/browse/STORM-1586
 Project: Apache Storm
  Issue Type: Sub-task
Reporter: Arun Mahadevan








[jira] [Assigned] (STORM-1586) ExprCompiler support for UDFs in Storm-sql

2016-02-28 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan reassigned STORM-1586:
-

Assignee: Arun Mahadevan

> ExprCompiler support for UDFs in Storm-sql
> --
>
> Key: STORM-1586
> URL: https://issues.apache.org/jira/browse/STORM-1586
> Project: Apache Storm
>  Issue Type: Sub-task
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>






[jira] [Created] (STORM-1585) Add DDL support for UDFs in Storm-sql

2016-02-28 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1585:
-

 Summary: Add DDL support for UDFs in Storm-sql
 Key: STORM-1585
 URL: https://issues.apache.org/jira/browse/STORM-1585
 Project: Apache Storm
  Issue Type: Sub-task
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (STORM-1584) Support UDF in storm-sql

2016-02-28 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1584:
-

 Summary: Support UDF in storm-sql
 Key: STORM-1584
 URL: https://issues.apache.org/jira/browse/STORM-1584
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan








[jira] [Updated] (STORM-1570) Support nested field lookup in Storm sql

2016-02-28 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-1570:
--
Summary: Support nested field lookup in Storm sql  (was: Support nested 
field lookup and user defined functions in Storm sql)

> Support nested field lookup in Storm sql
> 
>
> Key: STORM-1570
> URL: https://issues.apache.org/jira/browse/STORM-1570
> Project: Apache Storm
>  Issue Type: Improvement
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>






[jira] [Created] (STORM-1576) TopologyBuilder fails with ConcurrentModification in addCheckPointInputs for stateful topologies

2016-02-25 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1576:
-

 Summary: TopologyBuilder fails with ConcurrentModification in 
addCheckPointInputs for stateful topologies
 Key: STORM-1576
 URL: https://issues.apache.org/jira/browse/STORM-1576
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


addCheckPointInputs adds to map while iterating over it, which needs to be 
fixed.
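
The failure mode described above can be reproduced outside Storm. The following is a minimal sketch, assuming a plain HashMap as a stand-in for the real structure; the class name, the map contents, and the "-checkpoint" key suffix are hypothetical and are not taken from TopologyBuilder. It shows the add-while-iterating pattern throwing, and one possible way to avoid it by collecting the additions first.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CheckpointInputsSketch {

    // Hypothetical stand-in for a component-to-input map; not Storm's actual data structure.
    static Map<String, String> sampleInputs() {
        Map<String, String> inputs = new HashMap<>();
        inputs.put("bolt1", "spout1");
        inputs.put("bolt2", "bolt1");
        return inputs;
    }

    // Buggy pattern: adding new keys to a HashMap while iterating over its key set.
    public static boolean addWhileIteratingFails() {
        Map<String, String> inputs = sampleInputs();
        try {
            for (String component : inputs.keySet()) {
                inputs.put(component + "-checkpoint", "checkpoint-spout");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the fail-fast iterator detected the structural modification
        }
    }

    // One possible fix: collect the additions first, then apply them after the loop.
    public static Map<String, String> addAfterIterating() {
        Map<String, String> inputs = sampleInputs();
        Map<String, String> additions = new HashMap<>();
        for (String component : inputs.keySet()) {
            additions.put(component + "-checkpoint", "checkpoint-spout");
        }
        inputs.putAll(additions);
        return inputs;
    }
}
```

Iterating over a copy of the key set would work as well; which variant suits the actual fix depends on the surrounding code.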





[jira] [Updated] (STORM-1576) TopologyBuilder fails with ConcurrentModification in addCheckPointInputs for stateful topologies

2016-02-25 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan updated STORM-1576:
--
Affects Version/s: 2.0.0
   1.0.0

> TopologyBuilder fails with ConcurrentModification in addCheckPointInputs for 
> stateful topologies
> 
>
> Key: STORM-1576
> URL: https://issues.apache.org/jira/browse/STORM-1576
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 1.0.0, 2.0.0
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>
> addCheckPointInputs adds to map while iterating over it, which needs to be 
> fixed.





[jira] [Created] (STORM-1570) Support nested field lookup and user defined functions in Storm sql

2016-02-23 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1570:
-

 Summary: Support nested field lookup and user defined functions in 
Storm sql
 Key: STORM-1570
 URL: https://issues.apache.org/jira/browse/STORM-1570
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan








[jira] [Resolved] (STORM-1566) Worker exits with error o.a.s.d.worker [ERROR] Error on initialization of server mk-worker java.lang.ClassCastException: java.lang.String cannot be cast to java.io.File

2016-02-23 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1566.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~satish.duggana] merged to master.

> Worker exits with error o.a.s.d.worker [ERROR] Error on initialization of 
> server mk-worker java.lang.ClassCastException: java.lang.String cannot be 
> cast to java.io.File
> 
>
> Key: STORM-1566
> URL: https://issues.apache.org/jira/browse/STORM-1566
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0
>Reporter: Satish Duggana
>Assignee: Satish Duggana
> Fix For: 2.0.0
>
>






[jira] [Resolved] (STORM-1558) Utils in java breaks component page due to illegal type cast

2016-02-23 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1558.
---
Resolution: Fixed

Thanks [~Cody] merged to master.

> Utils in java breaks component page due to illegal type cast
> 
>
> Key: STORM-1558
> URL: https://issues.apache.org/jira/browse/STORM-1558
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0
>Reporter: Cody
>Assignee: Cody
> Fix For: 2.0.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In two methods in Utils.java, logsFilename and eventLogsFilename, the 'port' 
> argument was changed to String type in the PR for #STORM-1538, but their 
> callers, event-log-link and worker-log-link in core.clj, pass an int port, 
> which results in an illegal type cast.
> This is also possibly the cause of #STORM-1545.





[jira] [Commented] (STORM-1540) Topology Debug/Sampling Breaks Trident Topologies

2016-02-16 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148380#comment-15148380
 ] 

Arun Mahadevan commented on STORM-1540:
---

Actually, this issue happens only when the trident tuples have to be 
transferred over the network (and so have to be serialized). 

To reproduce, I set numWorkers to '4' in TridentWordCount and 
topology.eventlogger.executors: 1 in storm.yaml, ran the topology, and then 
turned on debug/sampling from the UI.

> Topology Debug/Sampling Breaks Trident Topologies
> -
>
> Key: STORM-1540
> URL: https://issues.apache.org/jira/browse/STORM-1540
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core, trident
>Affects Versions: 1.0.0
>Reporter: P. Taylor Goetz
>    Assignee: Arun Mahadevan
>Priority: Blocker
>
> Steps to reproduce:
> 1. Deploy a Trident topology.
> 2. Turn on debug/sampling.
> Workers will crash with the following error:
> 2016-02-11 14:13:23.617 o.a.s.util [ERROR] Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException: 
> java.io.NotSerializableException: org.apache.storm.trident.tuple.ConsList
>   at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:448)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:414)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.disruptor$consume_loop_STAR_$fn__7651.invoke(disruptor.clj:83)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) 
> [storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>   at java.lang.Thread.run(Thread.java:745) [?:1.8.0_72]
> Caused by: java.lang.RuntimeException: java.io.NotSerializableException: 
> org.apache.storm.trident.tuple.ConsList
>   at 
> org.apache.storm.serialization.SerializableSerializer.write(SerializableSerializer.java:41)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568) 
> ~[kryo-2.21.jar:?]
>   at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:75)
>  ~[kryo-2.21.jar:?]
>   at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
>  ~[kryo-2.21.jar:?]
>   at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:486) 
> ~[kryo-2.21.jar:?]
>   at 
> org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:44)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:44)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.daemon.worker$mk_transfer_fn$transfer_fn__8346.invoke(worker.clj:186)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.daemon.executor$start_batch_transfer__GT_worker_handler_BANG_$fn__8037.invoke(executor.clj:309)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.disruptor$clojure_handler$reify__7634.onEvent(disruptor.clj:40)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   at 
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:435)
>  ~[storm-core-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
>   ... 6 more





[jira] [Created] (STORM-1552) Fix topology event sampling log directory

2016-02-15 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1552:
-

 Summary: Fix topology event sampling log directory 
 Key: STORM-1552
 URL: https://issues.apache.org/jira/browse/STORM-1552
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.0.0, 2.0.0
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Run a topology and enable event inspection by clicking "Debug" from UI. The 
events are logged under 
"storm-local/workers-artifacts/{storm-id}/port/events.log". In the spout/bolt 
details page, the "events" link does not display the log file.

The events.log should be kept under 
logs/workers-artifacts/{storm-id}/{port}/events.log so that it is viewable via 
logviewer.





[jira] [Created] (STORM-1550) Fix storm.cmd to pass url-encoded options

2016-02-15 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1550:
-

 Summary: Fix storm.cmd to pass url-encoded options
 Key: STORM-1550
 URL: https://issues.apache.org/jira/browse/STORM-1550
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan


Windows 'storm.cmd' should url-encode options passed in storm.options similar 
to storm.py.

The regex that splits the options (org.apache.storm.utils.Utils) can be 
simplified to `split(",")` once this is done. The regex can currently produce 
undesired effects if invalid JSON strings are passed via `storm.cmd` on Windows.
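
To illustrate why URL-encoding makes a plain `split(",")` sufficient, here is a minimal sketch using only the JDK. The class and method names are hypothetical and this is not the code in storm.py, storm.cmd, or Utils; it assumes each option is encoded individually before the options are joined with commas.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

public class EncodedOptionsSketch {

    // Encode each option so that commas inside a value become %2C, then join with ",".
    public static String encodeOptions(List<String> options) {
        List<String> encoded = new ArrayList<>();
        for (String option : options) {
            try {
                encoded.add(URLEncoder.encode(option, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException("UTF-8 is always supported", e);
            }
        }
        return String.join(",", encoded);
    }

    // After encoding, the only literal commas left are the separators, so split(",") is safe.
    public static List<String> decodeOptions(String joined) {
        List<String> decoded = new ArrayList<>();
        for (String part : joined.split(",")) {
            try {
                decoded.add(URLDecoder.decode(part, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException("UTF-8 is always supported", e);
            }
        }
        return decoded;
    }
}
```

With this scheme, an option such as drpc.servers=["host1","host2"] survives the round trip intact because its embedded comma never appears unencoded in the joined string.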





[jira] [Commented] (STORM-1532) Fix readCommandLineOpts to parse JSON correctly

2016-02-09 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138858#comment-15138858
 ] 

Arun Mahadevan commented on STORM-1532:
---

-c drpc.servers=[\"host1\"] works

> Fix readCommandLineOpts to parse JSON correctly
> ---
>
> Key: STORM-1532
> URL: https://issues.apache.org/jira/browse/STORM-1532
> Project: Apache Storm
>  Issue Type: Bug
>    Reporter: Arun Mahadevan
>Assignee: Arun Mahadevan
>
> Utils.readCommandLineOpts does a split on "," so it does not correctly parse 
> values passed as json object or json arrays.
> Tested by passing 
> -c drpc.servers=[\"host1\", \"host2\"]  in storm jar command and it fails 
> with an exception,
> Exception in thread "main" java.lang.IllegalArgumentException: Field 
> DRPC_SERVERS must be an Iterable but was a class java.lang.String





[jira] [Commented] (STORM-1532) Fix readCommandLineOpts to parse JSON correctly

2016-02-09 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15138898#comment-15138898
 ] 

Arun Mahadevan commented on STORM-1532:
---

Actually this seems to be an issue only with storm.cmd.

storm.py does a urlencode on storm.options, so it is able to parse JSON arrays 
correctly.

> Fix readCommandLineOpts to parse JSON correctly
> ---
>
> Key: STORM-1532
> URL: https://issues.apache.org/jira/browse/STORM-1532
> Project: Apache Storm
>  Issue Type: Bug
>    Reporter: Arun Mahadevan
>    Assignee: Arun Mahadevan
>
> Utils.readCommandLineOpts does a split on "," so it does not correctly parse 
> values passed as json object or json arrays.
> Tested by passing 
> -c drpc.servers=[\"host1\", \"host2\"]  in storm jar command and it fails 
> with an exception,
> Exception in thread "main" java.lang.IllegalArgumentException: Field 
> DRPC_SERVERS must be an Iterable but was a class java.lang.String





[jira] [Created] (STORM-1532) Fix readCommandLineOpts to parse JSON correctly

2016-02-09 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1532:
-

 Summary: Fix readCommandLineOpts to parse JSON correctly
 Key: STORM-1532
 URL: https://issues.apache.org/jira/browse/STORM-1532
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Utils.readCommandLineOpts does a split on "," so it does not correctly parse 
values passed as json object or json arrays.

Tested by passing 

-c drpc.servers=[\"host1\", \"host2\"]  in storm jar command and it fails with 
an exception,

Exception in thread "main" java.lang.IllegalArgumentException: Field 
DRPC_SERVERS must be an Iterable but was a class java.lang.String
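
The splitting problem can be seen in isolation with a few lines of Java. This is only a sketch of the general pitfall, not the actual body of Utils.readCommandLineOpts; the class name and the sample option string (including topology.workers=2) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSplitSketch {

    // Splits on every comma, including commas that sit inside a JSON array value.
    public static List<String> naiveSplit(String options) {
        return Arrays.asList(options.split(","));
    }
}
```

Calling naiveSplit on drpc.servers=["host1","host2"],topology.workers=2 yields three pieces instead of two, and the first piece, drpc.servers=["host1", is no longer valid JSON, so the value cannot be parsed as a list.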





[jira] [Commented] (STORM-1476) Filter -c options from args and add them as part of storm.options

2016-02-09 Thread Arun Mahadevan (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15140427#comment-15140427
 ] 

Arun Mahadevan commented on STORM-1476:
---

Thanks [~satish.duggana] merged to master and 1.x branch.

> Filter -c options from args and add them as part of storm.options
> -
>
> Key: STORM-1476
> URL: https://issues.apache.org/jira/browse/STORM-1476
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 1.0.0
> Environment: Windows
>Reporter: Satish Duggana
>Assignee: Satish Duggana
> Fix For: 1.0.0
>
>






[jira] [Resolved] (STORM-1476) Filter -c options from args and add them as part of storm.options

2016-02-09 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan resolved STORM-1476.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> Filter -c options from args and add them as part of storm.options
> -
>
> Key: STORM-1476
> URL: https://issues.apache.org/jira/browse/STORM-1476
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 1.0.0
> Environment: Windows
>Reporter: Satish Duggana
>Assignee: Satish Duggana
> Fix For: 1.0.0, 2.0.0
>
>






Re: Trident pipelining and transactional properties

2016-02-08 Thread Arun Mahadevan
The execute phase is pipelined and only the commits are strictly ordered. 

So a trident bolt could receive tuples from batch1, batch2 and again batch1 and 
so on. The framework internally maintains separate context for each batch and 
the execute is invoked with the respective batch’s context. The bolts could 
also emit tuples which are forwarded to the next bolt in the DAG without 
waiting for the batch to complete.

However, finishBatch/commit is ordered, i.e. the commit for batch2 is invoked 
only after the batch1 commit is successful.
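
As a rough mental model of the above (a toy sketch only, not Trident's implementation; the class, fields, and methods are all hypothetical and use no Storm API), the following keeps a separate context per batch, lets execute calls from different batches interleave, and releases commits strictly in batch order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchPipelineSketch {
    private final Map<Long, List<String>> contexts = new HashMap<>(); // per-batch state
    private final Set<Long> finished = new HashSet<>();               // finished, not yet committed
    private final List<Long> commitOrder = new ArrayList<>();
    private long nextToCommit = 1;

    // Tuples of different batches may arrive interleaved; each lands in its own context.
    public void execute(long batchId, String tuple) {
        contexts.computeIfAbsent(batchId, id -> new ArrayList<String>()).add(tuple);
    }

    // A batch may finish out of order, but it is committed only after all earlier batches.
    public void finishBatch(long batchId) {
        finished.add(batchId);
        while (finished.remove(Long.valueOf(nextToCommit))) {
            commitOrder.add(nextToCommit);
            contexts.remove(Long.valueOf(nextToCommit));
            nextToCommit++;
        }
    }

    public List<String> tuplesFor(long batchId) {
        List<String> tuples = contexts.get(Long.valueOf(batchId));
        return tuples == null ? new ArrayList<String>() : new ArrayList<String>(tuples);
    }

    public List<Long> commitOrder() {
        return new ArrayList<Long>(commitOrder);
    }
}
```

In this model, finishing batch 2 before batch 1 records nothing in the commit order until batch 1 finishes, at which point both are committed as 1 and then 2.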


Thanks,
Arun

On 2/8/16, 9:50 PM, "Felix Dreissig"  wrote:

>Hi,
>
>I’m re-posting this from Storm-User, as it didn’t get a reply there and 
>touches the internal implementation quite a bit.
>
>I am trying to understand the parallelism properties and transactional 
>semantics offered by Trident and couldn’t find an answer to these two 
>questions:
>
>1. The „Trident Spouts“ documentation [1] says that „[b]y default, Trident 
>processes a single batch at a time, waiting for the batch to succeed or fail 
>before trying another batch“.
>But do Trident bolts always wait until a batch is completed, collect the 
>results and then pass them on to the next bolt(s) as complete batches? Without 
>pipelining, this would mean that only one bolt can be active at a time, 
>effectively preventing any parallelism.
>Or are tuples entering a stream and being delivered to the next bolt(s) as 
>soon as they are emitted? This would still introduce some idle time and 
>increased latency without pipelining, but at least seems like a better 
>resource utilization.
>
>2. The idea that states will only ever have to deal with a new batch or one 
>from immediately before (assuming transactional or opaque-transactional 
>spouts) is at the core of Trident’s state model.
>On this topic, the docs section from above promises that even with pipelining, 
>„Trident will order any state updates taking place in the topology among 
>batches“. Is this some special guarantee for the built-in stateful operations 
>(i.e. partitionPersist and persistentAggregate, which afaics uses 
>partitionPersist internally), or can all bolts assume that they’ll never see 
>any batch repeated except the latest one they processed?
>
>I couldn’t find these questions covered in the docs or in previous 
>discussions. So I tried consulting the source code, but it’s not easily 
>comprehensible with regard to such issues.
>Any help would be highly appreciated.
>
>Best regards,
>Felix
>
>[1] https://storm.apache.org/documentation/Trident-spouts.html#pipelining



[jira] [Created] (STORM-1527) Trident api doc for peek

2016-02-05 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1527:
-

 Summary: Trident api doc for peek
 Key: STORM-1527
 URL: https://issues.apache.org/jira/browse/STORM-1527
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Document api added in https://issues.apache.org/jira/browse/STORM-1517





[jira] [Created] (STORM-1517) Add peek api in trident stream

2016-02-02 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1517:
-

 Summary: Add peek api in trident stream
 Key: STORM-1517
 URL: https://issues.apache.org/jira/browse/STORM-1517
 Project: Apache Storm
  Issue Type: Bug
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


peek can be used to examine tuples at a point in the trident stream pipeline. 
This is similar to the java8 stream peek api.
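
For reference, this is how peek behaves in the Java 8 Stream API that the proposal is modelled on. The sketch uses only java.util.stream; the class and method names are hypothetical, and it does not show the Trident signature, which is not specified in this issue.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PeekSketch {

    // peek observes each element as it flows past without changing the stream's contents.
    public static List<String> upperCase(List<String> words, List<String> observed) {
        return words.stream()
                .peek(observed::add)       // side effect only: record what was seen
                .map(String::toUpperCase)  // downstream stages still receive the original element
                .collect(Collectors.toList());
    }
}
```

The observed list ends up holding the elements as they were before the map step, which is the kind of mid-pipeline inspection the issue describes.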





[jira] [Created] (STORM-1513) Update trident api docs

2016-02-01 Thread Arun Mahadevan (JIRA)
Arun Mahadevan created STORM-1513:
-

 Summary: Update trident api docs
 Key: STORM-1513
 URL: https://issues.apache.org/jira/browse/STORM-1513
 Project: Apache Storm
  Issue Type: Sub-task
Reporter: Arun Mahadevan
Assignee: Arun Mahadevan


Update trident api docs with the newly added api from STORM-1505




