Hi Taylor,

Glad to hear your opinion.
Since Beam was open-sourced, there has been a lot of interest in Beam from our 
internal users at Alibaba and from other companies in China, which prompted us 
to provide a JStorm runner. But since the Storm runner implementation is out of 
date, and over the past year many new features and different solutions 
(especially for exactly-once and state) were introduced in JStorm, we had to 
start separate development of a JStorm runner. 
Currently, we have finished a prototype (supporting most of Beam's PTransforms, 
windows and triggers), as Pei mentioned in another email, and full testing is 
still ongoing. Some users at Alibaba have built trial topologies on it. 
But for further improvement, we still need review help from the Beam 
community to ensure correctness, and to be notified of any breaking or 
incompatible changes as Beam evolves. That is why we decided to commit the 
JStorm runner to the Beam repository.

In my personal understanding, the JStorm runner is not a duplicated effort. The 
major part of the JStorm runner can probably be reused in Storm. Some other 
parts, like exactly-once and state, need to be propagated to Storm. When the 
Storm community plans to restart development of the Storm runner, we'd like to 
help with this, as part of the previously planned merging of JStorm features. 
At that time, we can discuss whether merging the JStorm features or 
propagating them is required.
Looking forward to better collaboration between Beam, Storm and JStorm.

Regards
Jian Liu(Basti)

-----Original Message-----
From: P. Taylor Goetz [mailto:ptgo...@apache.org] 
Sent: Tuesday, April 11, 2017 1:48 AM
To: d...@beam.apache.org; dev@storm.apache.org
Subject: Apache Storm/JStorm Runner(s) for Apache Beam

Note: cross-posting to dev@beam and dev@storm

I’ve seen at least two threads on the dev@ list discussing the JStorm runner 
and my hope is we can expand on that discussion and cross-pollinate with the 
Storm/JStorm/Beam communities as well.

A while back I created a very preliminary proof of concept of getting a Storm 
Beam runner working [1]. That was mainly an exercise for me to familiarize 
myself with the Beam API and discover what it would take to develop a Beam 
runner on top of Storm. That code is way out of date (I was targeting Beam’s 
HEAD before the 0.2.0 release, and a lot of changes have since taken place) and 
didn’t really work as Jian Liu pointed out. It was a start, that perhaps could 
be further built upon, or parts harvested, etc. I don’t have any particular 
attachment to that code and wouldn’t be upset if it were completely discarded 
in favor of a better or more extensible implementation.

What I would like to see, and I think this is a great opportunity to do so, is 
a closer collaboration between the Apache Storm and JStorm communities. For 
those who aren’t familiar with those projects’ relationship, I’ll start with a 
little history…

JStorm began at Alibaba as a fork of Storm (pre-Apache?) with Storm’s Clojure 
code reimplemented in Java. The rationale behind that move was that Alibaba had 
a large number of Java developers but very few who were proficient with 
Clojure. Moving to pure Java made sense as it would expand the base of 
potential contributors.

In late 2015 Alibaba donated the JStorm codebase to the Apache Storm project, 
and the Apache Storm PMC committed to converting its Clojure code to Java in 
order to incorporate the code donation. At the time there was one catch — 
Apache Storm had implemented comprehensive security features such as Kerberos 
authentication/authorization and multi-tenancy in its Clojure code, which 
greatly complicated the move to Java and incorporation of the JStorm code. 
JStorm did not have the same security features. A number of JStorm developers 
have also become Storm PMC members.

Fast forward to today. The Storm community has completed the bulk of the move 
to Java and the next major release (presumably 2.0, which is currently under 
discussion) will be largely Java-based. We are now in a much better position to 
begin incorporating JStorm’s features, as well as implementing new features 
necessary to support the Beam API (such as support for bounded pipelines, among 
other features).

Having separate Apache Storm and JStorm Beam runner implementations doesn’t 
feel appropriate in my personal opinion, especially since both projects have 
expressed an ongoing commitment to bringing JStorm’s additional features, and 
just as important, community, to Apache Storm.

One final note, when the Storm community initially discussed developing a Beam 
runner, the general consensus was to do so within the Storm repository. My current 
thinking is that such an effort should take place within the Beam community, 
not only since that is the development pattern followed by other runner 
implementations (Flink, Apex, etc.), but also because it would serve to 
increase collaboration between Apache projects (always a good thing!).

I would love to hear opinions from others in the Storm/JStorm/Beam communities.

-Taylor
