Hi Amit,
What do you think of the following:
- in the meantime, while you reintroduce the Spark 2 branch, what about
"extending" the Spark version supported by the current Spark runner? Still
using RDD/DStream, I think we can support Spark 2.x even if we don't yet
leverage its new features.
Thoughts?
Regards
JB
On 03/15/2017 07:39 PM, Amit Sela wrote:
Hi Cody,
I will re-introduce this branch soon as part of the work on BEAM-913
<https://issues.apache.org/jira/browse/BEAM-913>.
For now, and from previous experience with the mentioned branch, the batch
implementation should be straightforward.
The only issue is with streaming support - in the current runner (Spark 1.x) we
have experimental support for windows/triggers and we're working towards
full streaming support.
With Spark 2.x, there is no "general-purpose" stateful operator for the
Dataset API, so I was waiting to see if the new operator
<https://github.com/apache/spark/pull/17179> planned for the next version could
help with that.
To summarize, I will introduce a skeleton for the Spark 2 runner with batch
support as soon as I can as a separate branch.
Thanks,
Amit
On Wed, Mar 15, 2017 at 9:07 AM Cody Innowhere <e.neve...@gmail.com> wrote:
Hi guys,
Is there anybody who's currently working on a Spark 2.x runner? An old PR for
a Spark 2.x runner was closed a few days ago, so I wonder what the status is
now, and whether there is a roadmap for this?
Thanks~
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com