[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread satishd
Github user satishd commented on the issue:

https://github.com/apache/storm/pull/1693
  
@arunmahadevan I was saying we should not hold this PR until the Beam runner
API has been tried with these APIs. We should review these APIs soon; they may
remain experimental for a couple of minor releases.

I was going through the Beam features earlier, and I plan to write a doc about
them and how we can start implementing some of those APIs. I will add
details once Taylor's branch is open and the doc can be updated with that. I
will share the doc once it is ready (in 2-3 weeks). We can create subtasks based
on that doc and work on them later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1693
  
For me this stream API is not only for supporting Beam. This is what Storm
needed to have even before Beam was open-sourced. We may want to label this API
as experimental or unstable or something similar, but in any case I'd like to
see it released before too long.




[GitHub] storm issue #1736: STORM-1446 Compile the Calcite logical plan to Storm Trid...

2016-10-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1736
  
Fixed some bugs, and renamed classes to be more specific.
Unit tests pass, and I also tested manually.




[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread arunmahadevan
Github user arunmahadevan commented on the issue:

https://github.com/apache/storm/pull/1693
  
@satishd right now it's more of a prototype or proof of concept for the
proposed APIs, not the final version. Once the Beam API solidifies we can have
a separate discussion to identify the gaps, but it doesn't make sense to hold
off the prototype implementation, which is more for validating the proposed APIs.




[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread satishd
Github user satishd commented on the issue:

https://github.com/apache/storm/pull/1693
  
It is better to discuss these APIs before working on Beam runner
prototypes. Let's not jump into using these APIs to work on a runner. The Beam
API is currently changing, so we should discuss those API contracts and
evaluate what kind of enhancements we need to make. We can create subtasks for
the requirements and work on those later.




[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1693
  
@arunmahadevan 
I'm in the middle of reviewing the updated design doc again. Since I've already
read it once, there should be no outstanding comments, but I would just like to
check again.

Btw, I merged this PR into my local branch and converted one of the examples in
storm-starter (the windowing example), and it seems to run fine. After reviewing
the design doc, I'll play with the API and leave some feedback.




Re: [DISCUSS] Feature Branch for Apache Beam Runner

2016-10-19 Thread Satish Duggana
+1, waiting for that. :)
Currently, there are API changes going on in Beam. It seems they plan to get
that done by the end of 2016.

~Satish.

On Wed, Oct 19, 2016 at 9:19 PM, Bobby Evans 
wrote:

> +1 - Bobby
>
> On Wednesday, October 19, 2016 10:30 AM, Arun Mahadevan <
> ar...@apache.org> wrote:
>
>
>  +1
>
> On 10/19/16, 8:58 PM, "P. Taylor Goetz"  wrote:
>
> >If there are no objections, I’d like to create the feature branch and
> >push what I have so far. I’ve not had too much time lately to work on it,
> >but others have expressed interest in contributing so I’d like to make it
> >available.
> >
> >-Taylor
> >
> >
> >> On Sep 19, 2016, at 11:15 AM, Bobby Evans 
> wrote:
> >>
> >> +1 on the idea.  I would love to contribute, but I doubt I will find
> time to do it any time soon. - Bobby
> >>
> >>On Friday, September 16, 2016 12:05 AM, Satish Duggana <
> satish.dugg...@gmail.com> wrote:
> >>
> >>
> >> Taylor,
> >> I am interested in contributing to this effort. Gone through Beam APIs
> >> earlier and had some initial thoughts on Storm runner. We can start with
> >> existing core storm constructs but it is better to design in such a way
> >> that these can be replaced with new APIs.
> >>
> >> Thanks,
> >> Satish.
> >>
> >> On Fri, Sep 16, 2016 at 3:35 AM, P. Taylor Goetz 
> wrote:
> >>
> >>> I'm open to change, but yes, I started with core storm since it offers
> the
> >>> most flexibility wrt how Beam constructs are translated.
> >>>
> >>> -Taylor
> >>>
> >>>> On Sep 15, 2016, at 5:51 PM, Roshan Naik wrote:
> >>>>
> >>>> Good idea. Will the Beam API be implemented to run on top of Storm Core
> >>>> primitives?
> >>>> -roshan
> >>>>
> >>>>
> >>>>> On 9/15/16, 2:00 PM, "P. Taylor Goetz" wrote:
> >>>>>
> >>>>> I've been tinkering with implementing an Apache Beam runner on top of
> >>>>> Storm and would like to open it up so others in the community can
> >>>>> contribute. To that end I'd like to propose creating a feature branch
> >>>>> for that work if there are others who are interested in getting
> >>>>> involved. We did that a while back when storm-sql was originally
> >>>>> developed.
> >>>>>
> >>>>> Basically, review requirements for that branch would be relaxed
> >>>>> during development, with a final, strict review before merging back to
> >>>>> one of our main branches.
> >>>>>
> >>>>> I'd like to document what I have and future improvements in a
> >>>>> proposal document, and follow that with pushing the code to the feature
> >>>>> branch for group collaboration.
> >>>>>
> >>>>> Any thoughts? Anyone interested in contributing to such an effort?
> >>>>>
> >>>>> -Taylor
> >>>>
> >>>
> >>
> >
>
>
>
>
>


Re: [DISCUSS] Feature Branch for Apache Beam Runner

2016-10-19 Thread Bobby Evans
+1 - Bobby 


Re: [DISCUSS] Feature Branch for Apache Beam Runner

2016-10-19 Thread Arun Mahadevan
+1


Re: [DISCUSS] Feature Branch for Apache Beam Runner

2016-10-19 Thread P. Taylor Goetz
If there are no objections, I’d like to create the feature branch and push what
I have so far. I’ve not had too much time lately to work on it, but others
have expressed interest in contributing so I’d like to make it available.

-Taylor


> On Sep 19, 2016, at 11:15 AM, Bobby Evans  wrote:
> 
> +1 on the idea.  I would love to contribute, but I doubt I will find time to 
> do it any time soon. - Bobby
> 
>On Friday, September 16, 2016 12:05 AM, Satish Duggana 
>  wrote:
> 
> 
> Taylor,
> I am interested in contributing to this effort. Gone through Beam APIs
> earlier and had some initial thoughts on Storm runner. We can start with
> existing core storm constructs but it is better to design in such a way
> that these can be replaced with new APIs.
> 
> Thanks,
> Satish.
> 
> On Fri, Sep 16, 2016 at 3:35 AM, P. Taylor Goetz  wrote:
> 
>> I'm open to change, but yes, I started with core storm since it offers the
>> most flexibility wrt how Beam constructs are translated.
>> 
>> -Taylor
>> 
>>> On Sep 15, 2016, at 5:51 PM, Roshan Naik  wrote:
>>> 
>>> Good idea. Will the Beam API be implemented to run on top Storm Core
>>> primitives ?
>>> -roshan
>>> 
>>> 
>>>> On 9/15/16, 2:00 PM, "P. Taylor Goetz" wrote:
>>>>
>>>> I've been tinkering with implementing an Apache Beam runner on top of
>>>> Storm and would like to open it up so others in the community can
>>>> contribute. To that end I'd like to propose creating a feature branch for
>>>> that work if there are others who are interested in getting involved. We
>>>> did that a while back when storm-sql was originally developed.
>>>>
>>>> Basically, review requirements for that branch would be relaxed during
>>>> development, with a final, strict review before merging back to one of
>>>> our main branches.
>>>>
>>>> I'd like to document what I have and future improvements in a proposal
>>>> document, and follow that with pushing the code to the feature branch for
>>>> group collaboration.
>>>>
>>>> Any thoughts? Anyone interested in contributing to such an effort?
>>>>
>>>> -Taylor
>>> 
>> 
> 





[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread ptgoetz
Github user ptgoetz commented on the issue:

https://github.com/apache/storm/pull/1693
  
@arunmahadevan I will review. Before you start working on a prototype Beam
runner, let me push what I have to a feature branch. I have a fair amount of
the basic scaffolding you would need.




[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases

2016-10-19 Thread arunmahadevan
Github user arunmahadevan commented on the issue:

https://github.com/apache/storm/pull/1693
  
Updated doc -

https://docs.google.com/document/d/1Ew7uFF1UJ6e_zq0t4bM6A9auuEaArviAjjWYSpVFqPY/edit?usp=sharing

Pinging for reviews and feedback. It's a big patch, mostly due to the number of
different APIs; however, the doc and the examples should be a good starting
point.

@HeartSaVioR I would like input from you, especially since it could potentially
be used in storm-sql in the future.

@ptgoetz it would be great if you could take a look as well, since you were
working on a prototype Beam runner for Storm. To validate the APIs and identify
gaps, I will try to build a prototype runner using the proposed APIs.

@revans2 please take a look if you get a chance.




[GitHub] storm pull request #1739: STORM-1443 Support customizing parallelism in Stor...

2016-10-19 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request:

https://github.com/apache/storm/pull/1739

STORM-1443 Support customizing parallelism in StormSQL

* Add 'PARALLELISM' to table definition
  * default value is 1
* Set parallelism on the new stream when creating a stream with scan
  * downstream operators will also have the same parallelism unless repartitioned
  * parallelism is not applied to the output table, since that can trigger a repartition

Below is a screenshot taken while running the following SQL statements:

https://cloud.githubusercontent.com/assets/1317309/19513856/72a944c2-962c-11e6-91d0-2f6f08b7aefd.png

```
CREATE EXTERNAL TABLE APACHE_LOGS (id INT PRIMARY KEY, remote_ip VARCHAR, request_url VARCHAR, request_method VARCHAR, status VARCHAR, request_header_user_agent VARCHAR, time_received_utc_isoformat VARCHAR, time_us DOUBLE) LOCATION 'kafka://localhost:2181/brokers?topic=apachelogs-v2' PARALLELISM 5
CREATE EXTERNAL TABLE APACHE_SLOW_LOGS (dummy_id INT PRIMARY KEY, request_url VARCHAR, request_method VARCHAR, cnt INT, time_elapsed_ms_min INT, time_elapsed_ms_max INT, time_elapsed_ms_avg INT) LOCATION 'kafka://localhost:2181/brokers?topic=apacheslowlogs-v2' TBLPROPERTIES '{"producer":{"bootstrap.servers":"localhost:9092","acks":"1","key.serializer":"org.apache.storm.kafka.IntSerializer","value.serializer":"org.apache.storm.kafka.ByteBufferSerializer"}}'
INSERT INTO APACHE_SLOW_LOGS SELECT MIN(ID), REQUEST_URL, REQUEST_METHOD, COUNT(*) AS CNT, MIN(TIME_US) / 1000 AS TIME_ELAPSED_MS_MIN, MAX(TIME_US) / 1000 AS TIME_ELAPSED_MS_MAX, AVG(TIME_US) / 1000 AS TIME_ELAPSED_MS_AVG FROM APACHE_LOGS GROUP BY REQUEST_URL, REQUEST_METHOD HAVING AVG(TIME_US) / 1000 >= 300
```

Please note the task count of each component in the screenshot. Each component
has a task count of 5 unless it is repartitioned due to aggregation.
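
As a rough illustration of the rules described in this pull request (not the actual StormSQL implementation), the way a table's PARALLELISM value carries through a pipeline can be modeled in a few lines of Python. Everything here is hypothetical: the `Operator` class, the `assign_parallelism` function, and the `after_repartition` parameter are made up for the sketch. In particular, the description does not say what parallelism a stream has after a repartition, so the sketch takes that value as an input rather than guessing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

DEFAULT_PARALLELISM = 1  # per the PR description: "default value is 1"


@dataclass(frozen=True)
class Operator:
    """Hypothetical plan step; not a StormSQL or Trident class."""
    name: str
    repartitions: bool = False  # e.g. an aggregation that regroups the stream


def assign_parallelism(
    pipeline: List[Operator],
    table_parallelism: Optional[int] = None,
    after_repartition: int = DEFAULT_PARALLELISM,  # assumption: chosen by the caller
) -> Dict[str, int]:
    """Return the parallelism each step would run with under the stated rules."""
    # The scan starts with the input table's PARALLELISM (or the default).
    current = DEFAULT_PARALLELISM if table_parallelism is None else table_parallelism
    result: Dict[str, int] = {}
    for op in pipeline:
        if op.repartitions:
            # A repartition is the only point where the value changes.
            current = after_repartition
        # Every other step, including the final insert, inherits the upstream value.
        result[op.name] = current
    return result


plan = [
    Operator("scan"),
    Operator("filter"),
    Operator("aggregate", repartitions=True),
    Operator("insert"),
]
print(assign_parallelism(plan, table_parallelism=5, after_repartition=2))
# prints {'scan': 5, 'filter': 5, 'aggregate': 2, 'insert': 2}
```

The output step never consults a PARALLELISM value of its own in this model, which mirrors the point that applying one to the output table could itself trigger a repartition.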

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HeartSaVioR/storm STORM-1443-on-top-of-STORM-1446

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/1739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1739


commit a6fdf67547a4bf45b7892256e3ca8eb272dcd29c
Author: Jungtaek Lim 
Date:   2016-10-13T10:00:10Z

STORM-1446 Compile the Calcite logical plan to Storm Trident logical plan

* Port SamzaSQL implementation to Storm
  * https://github.com/milinda/samza-sql
* Apply some rules to optimize
* optimize Calc
  * merge filter and projection scripts into one
  * also applying short circuit
* Modify Trident unit tests to use new query planner
* arrange some files
  * Move some files which are only used from standalone
  * Remove some files which are no longer used
* guard the possibility of stack overflow error on explaining
  * just leave error logs, and print out empty plan and continue
  * reported this behavior to Calcite community
* leave some comments to clarify what it means

commit 319479bc7d8add43ffea0370d1762c19b705c72b
Author: Jungtaek Lim 
Date:   2016-10-19T09:25:53Z

STORM-1443 Support customizing parallelism in StormSQL

* Add 'PARALLELISM' to table definition
  * default value is 1
* Set parallelism to new stream while creating stream with scan
  * downstream operators will also have same parallelism unless repartitioned
  * not apply parallelism to output table since it can trigger repartition



