[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user satishd commented on the issue: https://github.com/apache/storm/pull/1693 @arunmahadevan I was saying we should not hold this PR until the Beam runner API is tried with these APIs. We should review these APIs soon, and they may remain experimental for a couple of minor releases. I was going through Beam features earlier, and I plan to write a doc about Beam features and how we can start implementing some of those APIs. I will add details once Taylor's branch is open and the doc can be updated with that. I will share the doc once it is ready (in 2-3 weeks). We can create subtasks based on that doc and work on them later. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1693 For me, this stream API is not only for supporting Beam. It is something Storm needed even before Beam was open-sourced. We may want to label this API as experimental or unstable, but either way I'd like to see it released before too long.
[GitHub] storm issue #1736: STORM-1446 Compile the Calcite logical plan to Storm Trid...
GitHub user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1736 Fixed some bugs and renamed classes to make their names more specific. Unit tests pass, and I also tested manually.
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1693 @satishd Right now it's more of a prototype, a proof of concept for the proposed APIs, not the final version. Once the Beam API solidifies we can have a separate discussion to identify the gaps, but it doesn't make sense to hold off the prototype implementation, which is mainly for validating the proposed APIs.
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user satishd commented on the issue: https://github.com/apache/storm/pull/1693 It is better to discuss these APIs before working on Beam runner prototypes. Let's not jump to using these APIs to work on the runner. The Beam API is currently changing, so we should discuss those API contracts and evaluate what kind of enhancements we need to make. We can create subtasks for the requirements and work on those later.
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/1693 @arunmahadevan I'm in the middle of reviewing the updated design doc again. Since I've already read it once, I don't expect any outstanding comments, but I'd like to check again. By the way, I merged this PR locally and converted one of the storm-starter examples (the windowing example), and it seems to run fine. After reviewing the design doc, I'll play with the API sets and leave some feedback.
Re: [DISCUSS] Feature Branch for Apache Beam Runner
+1, waiting for that. :)
Currently, there are API changes going on in Beam. It seems they plan to get that done by the end of 2016.

~Satish.

On Wed, Oct 19, 2016 at 9:19 PM, Bobby Evans wrote:
> +1 - Bobby
>
> On Wednesday, October 19, 2016 10:30 AM, Arun Mahadevan <ar...@apache.org> wrote:
>> +1
>>
>> On 10/19/16, 8:58 PM, "P. Taylor Goetz" wrote:
>>> If there are no objections, I’d like to create the feature branch and push what I have so far. I’ve not had too much time lately to work on it, but others have expressed interest in contributing so I’d like to make it available.
>>>
>>> -Taylor
>>>
>>> On Sep 19, 2016, at 11:15 AM, Bobby Evans wrote:
>>>> +1 on the idea. I would love to contribute, but I doubt I will find time to do it any time soon. - Bobby
>>>>
>>>> On Friday, September 16, 2016 12:05 AM, Satish Duggana <satish.dugg...@gmail.com> wrote:
>>>>> Taylor,
>>>>> I am interested in contributing to this effort. Gone through Beam APIs earlier and had some initial thoughts on Storm runner. We can start with existing core storm constructs but it is better to design in such a way that these can be replaced with new APIs.
>>>>>
>>>>> Thanks,
>>>>> Satish.
>>>>>
>>>>> On Fri, Sep 16, 2016 at 3:35 AM, P. Taylor Goetz wrote:
>>>>>> I'm open to change, but yes, I started with core storm since it offers the most flexibility wrt how Beam constructs are translated.
>>>>>>
>>>>>> -Taylor
>>>>>>
>>>>>> On Sep 15, 2016, at 5:51 PM, Roshan Naik wrote:
>>>>>>> Good idea. Will the Beam API be implemented to run on top of Storm Core primitives?
>>>>>>> -roshan
>>>>>>>
>>>>>>> On 9/15/16, 2:00 PM, "P. Taylor Goetz" wrote:
>>>>>>>> I’ve been tinkering with implementing an Apache Beam runner on top of Storm and would like to open it up so others in the community can contribute. To that end I’d like to propose creating a feature branch for that work if there are others who are interested in getting involved. We did that a while back when storm-sql was originally developed.
>>>>>>>>
>>>>>>>> Basically, review requirements for that branch would be relaxed during development, with a final, strict review before merging back to one of our main branches.
>>>>>>>>
>>>>>>>> I’d like to document what I have and future improvements in a proposal document, and follow that with pushing the code to the feature branch for group collaboration.
>>>>>>>>
>>>>>>>> Any thoughts? Anyone interested in contributing to such an effort?
>>>>>>>>
>>>>>>>> -Taylor
Re: [DISCUSS] Feature Branch for Apache Beam Runner
+1 - Bobby
Re: [DISCUSS] Feature Branch for Apache Beam Runner
+1
Re: [DISCUSS] Feature Branch for Apache Beam Runner
If there are no objections, I’d like to create the feature branch and push what I have so far. I’ve not had too much time lately to work on it, but others have expressed interest in contributing, so I’d like to make it available. -Taylor
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user ptgoetz commented on the issue: https://github.com/apache/storm/pull/1693 @arunmahadevan I will review. Before you start working on a prototype Beam runner, let me push what I have to a feature branch. I have a fair amount of the basic scaffolding you would need.
[GitHub] storm issue #1693: [STORM-1961] Stream api for storm core use cases
GitHub user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/1693 Updated doc - https://docs.google.com/document/d/1Ew7uFF1UJ6e_zq0t4bM6A9auuEaArviAjjWYSpVFqPY/edit?usp=sharing Pinging for reviews and feedback. It's a big patch, mostly due to the number of different APIs; however, the doc and the examples should be a good starting point. @HeartSaVioR I would like your input, especially since this could potentially be used in storm-sql in the future. @ptgoetz It would be great if you could take a look as well, since you were working on a prototype Beam runner for Storm. To validate the APIs and identify gaps, I will try to build a prototype runner using the proposed APIs. @revans2 Please take a look if you get a chance.
[GitHub] storm pull request #1739: STORM-1443 Support customizing parallelism in Stor...
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/storm/pull/1739

STORM-1443 Support customizing parallelism in StormSQL

* Add 'PARALLELISM' to table definition
* Default value is 1
* Set parallelism on the new stream while creating the stream with scan
* Downstream operators will also have the same parallelism unless repartitioned
* Do not apply parallelism to the output table, since it can trigger a repartition

Below is a screenshot of the topology running the SQL statements that follow: https://cloud.githubusercontent.com/assets/1317309/19513856/72a944c2-962c-11e6-91d0-2f6f08b7aefd.png

```
CREATE EXTERNAL TABLE APACHE_LOGS (id INT PRIMARY KEY, remote_ip VARCHAR, request_url VARCHAR, request_method VARCHAR, status VARCHAR, request_header_user_agent VARCHAR, time_received_utc_isoformat VARCHAR, time_us DOUBLE) LOCATION 'kafka://localhost:2181/brokers?topic=apachelogs-v2' PARALLELISM 5

CREATE EXTERNAL TABLE APACHE_SLOW_LOGS (dummy_id INT PRIMARY KEY, request_url VARCHAR, request_method VARCHAR, cnt INT, time_elapsed_ms_min INT, time_elapsed_ms_max INT, time_elapsed_ms_avg INT) LOCATION 'kafka://localhost:2181/brokers?topic=apacheslowlogs-v2' TBLPROPERTIES '{"producer":{"bootstrap.servers":"localhost:9092","acks":"1","key.serializer":"org.apache.storm.kafka.IntSerializer","value.serializer":"org.apache.storm.kafka.ByteBufferSerializer"}}'

INSERT INTO APACHE_SLOW_LOGS SELECT MIN(ID), REQUEST_URL, REQUEST_METHOD, COUNT(*) AS CNT, MIN(TIME_US) / 1000 AS TIME_ELAPSED_MS_MIN, MAX(TIME_US) / 1000 AS TIME_ELAPSED_MS_MAX, AVG(TIME_US) / 1000 AS TIME_ELAPSED_MS_AVG FROM APACHE_LOGS GROUP BY REQUEST_URL, REQUEST_METHOD HAVING AVG(TIME_US) / 1000 >= 300
```

Please refer to the task count of each component in the screenshot. The task count of each component is 5 unless it is repartitioned due to aggregation.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/storm STORM-1443-on-top-of-STORM-1446

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1739.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1739

commit a6fdf67547a4bf45b7892256e3ca8eb272dcd29c
Author: Jungtaek Lim
Date: 2016-10-13T10:00:10Z

STORM-1446 Compile the Calcite logical plan to Storm Trident logical plan

* Port SamzaSQL implementation to Storm
* https://github.com/milinda/samza-sql
* Apply some rules to optimize
* optimize Calc
* merge filter and projection scripts into one
* also applying short circuit
* Modify Trident unit tests to use new query planner
* arrange some files
* Move some files which are only used from standalone
* Remove some files which are no longer used
* guard the possibility of stack overflow error on explaining
* just leave error logs, and print out empty plan and continue
* reported this behavior to Calcite community
* leave some comments to clarify what it means

commit 319479bc7d8add43ffea0370d1762c19b705c72b
Author: Jungtaek Lim
Date: 2016-10-19T09:25:53Z

STORM-1443 Support customizing parallelism in StormSQL

* Add 'PARALLELISM' to table definition
* default value is 1
* Set parallelism to new stream while creating stream with scan
* downstream operators will also have same parallelism unless repartitioned
* not apply parallelism to output table since it can trigger repartition
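
The propagation rule described in this pull request (the scan sets the parallelism, downstream operators inherit it, and a repartition changes it) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `ParallelismSketch`, `Op`, and `assignParallelism` are hypothetical names invented for illustration and are not StormSQL or Trident classes, the model only handles a linear chain of operators, and the parallelism used after a repartition is passed in by the caller because the pull request does not say what it should be.

```java
import java.util.ArrayList;
import java.util.List;

public class ParallelismSketch {

    /** Hypothetical operator description: a name plus whether it forces a repartition. */
    static final class Op {
        final String name;
        final boolean repartitions;

        Op(String name, boolean repartitions) {
            this.name = name;
            this.repartitions = repartitions;
        }
    }

    /**
     * Walks a linear chain of operators that starts at a table scan.
     * Each operator inherits the current parallelism. An operator that
     * repartitions switches the chain to afterRepartition (an assumed,
     * caller-supplied value) from that operator onward.
     */
    static List<Integer> assignParallelism(int scanParallelism, int afterRepartition, List<Op> chain) {
        List<Integer> result = new ArrayList<>();
        int current = scanParallelism;
        for (Op op : chain) {
            if (op.repartitions) {
                current = afterRepartition;
            }
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        // Scan with PARALLELISM 5, followed by a filter and a grouped aggregation.
        List<Op> chain = new ArrayList<>();
        chain.add(new Op("scan", false));
        chain.add(new Op("filter", false));
        chain.add(new Op("groupBy", true));
        chain.add(new Op("aggregate", false));
        System.out.println(assignParallelism(5, 1, chain));
    }
}
```

In this toy model the scan and the filter keep the table's parallelism of 5, while the grouped aggregation and everything after it use the assumed post-repartition value. The real task counts are determined by the planner in the pull request, not by this sketch.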