Hello JB,

Perfect! I'm already on the Beam Slack workspace; I'll contact you once I
get to the office.

Thanks!
D.

On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi David,
>
> absolutely !! Let's move forward on the preparation steps.
>
> Are you on Slack and/or hangout to plan this ?
>
> Thanks,
> Regards
> JB
>
> On 01/02/2018 05:35 PM, David Morávek wrote:
>
>> Hello JB,
>>
>> can we help in any way to move things forward?
>>
>> Thanks,
>> D.
>>
>> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>     Thanks Jan,
>>
>>     It makes sense.
>>
>>     Let me take a look at the code to understand the "interaction".
>>
>>     Regards
>>     JB
>>
>>
>>     On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>
>>         Hi JB,
>>
>>         basically you are not wrong. The project started about three or
>>         four years ago with the goal of unifying batch and streaming
>>         processing into a single portable, executor-independent API.
>>         Because of that, it is currently "close" to Beam in this sense.
>>         But we don't see much added value in keeping this as a separate
>>         project when one of the key differences is the API (not the
>>         model itself), so we would like to focus on translation from the
>>         Euphoria API to Beam's SDK. That's why we would like to see it
>>         as a DSL, so that it would be possible to use the Euphoria API
>>         with Beam's runners as natively as possible.
>>
>>         I hope I didn't make the subject even more unclear; if so, I'll
>>         be happy to explain anything in more detail. :-)
>>
>>             Jan
>>
>>
>>         On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>
>>             Hi Jan,
>>
>>             Thanks for your answers.
>>
>>             However, they confused me ;)
>>
>>             From what you replied, Euphoria seems more like a programming
>>             model/SDK "close" to Beam than a DSL on top of an existing
>>             Beam SDK.
>>
>>             Am I wrong ?
>>
>>             Regards
>>             JB
>>
>>             On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>>
>>                 Hi Ismael,
>>
>>                 basically we adopted Beam's design regarding partitioning
>>                 (https://github.com/seznam/euphoria/issues/160) and
>>                 implemented the sorting manually
>>                 (https://github.com/seznam/euphoria/issues/158). I'm not
>>                 aware of any time model differences (Euphoria supports
>>                 ingestion and event time; we don't support processing
>>                 time by decision). Regarding other differences (looking
>>                 at the Beam capability matrix), I'd say that:
>>
>>                    - we don't support stateful FlatMap (i.e. ParDo) for
>>                      now (https://github.com/seznam/euphoria/issues/192)
>>
>>                    - we don't support side inputs (by decision now, but
>>                      might be reconsidered) and outputs
>>                      (https://github.com/seznam/euphoria/issues/124)
>>
>>                    - we support complete event-time windows (non-merging,
>>                      merging, aligned, unaligned) and time control
>>
>>                    - we don't support processing time by decision (might
>>                      be reconsidered if a valid use-case is found)
>>
>>                    - we support window triggering based on both time and
>>                      data, including discarding and accumulating (without
>>                      accumulating & retracting)
>>
>>                 All our executors (runners) - Flink, Spark and Local -
>>                 implement the complete model, which we enforce using an
>>                 "operator test kit" that all executors must pass. The
>>                 Spark executor supports bounded sources only (for now).
>>                 As David said, we currently don't have a serialization
>>                 abstraction, so there is some work to be done in that
>>                 regard.
>>
>>                 Our intention is to completely supersede Euphoria. We
>>                 would like to consider the possibility of using executors
>>                 that do not rely on Beam, but that is optional for now
>>                 and should be straightforward.
>>
>>                 We'd be happy to answer any more questions you might
>>                 have, and thanks a lot!
>>
>>                 Best,
>>
>>                    Jan
>>
>>
>>                 On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>>
>>                     Hi,
>>
>>                     It is great to see that you guys have reached a level
>>                     of maturity to propose this. Congratulations on your
>>                     work and on the idea of contributing it to Beam.
>>
>>                     I remember a previous discussion with Jan about the
>>                     model mismatch between Euphoria and Beam, caused by
>>                     some design decisions in both projects. I remember
>>                     you guys had some issues with the way Beam's sources
>>                     do partitioning, as well as with Beam's lack of
>>                     sorted data (on shuffle, à la Hadoop). Also, if I
>>                     remember correctly, the 'time' model of Euphoria was
>>                     simpler than Beam's. I mention all of this because I
>>                     am curious about what parts of the Euphoria model you
>>                     guys had to sacrifice to support Beam, and what parts
>>                     of Beam's model should still be integrated into
>>                     Euphoria (and whether there is a straightforward path
>>                     to do it).
>>
>>                     If I understand correctly, if this gets merged into
>>                     Apache, it means that Euphoria's current
>>                     implementation would be superseded by this DSL? I am
>>                     curious because I would like to understand your level
>>                     of investment in supporting the future of this DSL.
>>
>>                     Thanks and congrats again !
>>                     Ismaël
>>
>>                     On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré
>>                     <j...@nanthrax.net> wrote:
>>
>>                         Depending on the donation, you would need an ICLA
>>                         for each contributor, and a CCLA in addition to
>>                         the SGA.
>>
>>                         You can sync with Davor and me on the legal stuff.
>>                         However, I would wait a little to get feedback
>>                         from the whole team and then start a formal vote.
>>
>>                         I would be happy to start the formal vote.
>>
>>                         Regards
>>                         JB
>>
>>                         On 12/18/2017 10:03 AM, David Morávek wrote:
>>
>>                             Hello,
>>
>>                             Thanks for the awesome feedback!
>>
>>                             Romain:
>>
>>                             We already use the Java Stream API in all
>>                             operators where it makes sense (e.g.
>>                             ReduceByKey). Still not sure if it was a good
>>                             choice, but it can easily be converted to an
>>                             iterator anyway.
>>
>>                             Side output support is coming soon; we have
>>                             already done some initial work on this.
>>
>>                             Side inputs are not supported in the way you
>>                             are used to from Beam, because they can be
>>                             replaced by a Join operator on the same key
>>                             (if annotated with broadcastHashJoin, it will
>>                             be turned into a map-side join).
>>
>>                             The only significant difference from Beam is
>>                             that we decided not to abstract serialization,
>>                             so we need to add support for type hints
>>                             because of type erasure.
>>
>>                             Fluent API:
>>
>>                             The API is fluent within one operator. It is
>>                             designed to "lead the programmer", which means
>>                             that they will only be offered methods that
>>                             make sense after the last method they used
>>                             (e.g. in ReduceByKey, we know that after keyBy
>>                             one of the reduceBy methods should come). It
>>                             is implemented as a series of builders.
>>
>>                             Davor:
>>
>>                             Thanks, I'll contact you and start the process
>>                             of getting all the necessary paperwork signed
>>                             on our side, so we can get things moving.
>>
>>                             On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau
>>                             <rmannibu...@gmail.com> wrote:
>>
>>                                   Hi guys
>>
>>                                   A DSL would be very welcome, in
>>                                   particular if fluent.
>>
>>                                   Open question: did you consider
>>                                   implementing the Stream API (surely
>>                                   extending it to have a BeamStream and a
>>                                   few more features like sides etc.)? It
>>                                   would be very natural, easy to integrate
>>                                   anywhere, and would avoid having to
>>                                   discover a new API.
>>
>>                                   Hazelcast Jet did it, so I don't see why
>>                                   Beam couldn't.
>>
>>                                   On Dec 18, 2017 at 07:26, "Davor Bonaci"
>>                                   <da...@apache.org> wrote:
>>
>>                                       Hi David,
>>                                       As JB noted, merging of these two
>>                                       projects is a great idea. In fact,
>>                                       some of us have had those
>>                                       discussions in the past.
>>
>>                                       Legally, nothing particular is
>>                                       strictly necessary as the code seems
>>                                       to already be Apache 2.0 licensed. We
>>                                       don't, however, want to be perceived
>>                                       as making hostile forks, so it would
>>                                       be great to file a Software Grant
>>                                       Agreement with the ASF Secretary. I
>>                                       can help with the process, as
>>                                       necessary.
>>
>>                                       Project alignment-wise, there aren't
>>                                       any particular blockers that I am
>>                                       aware of. We welcome DSLs.
>>
>>                                       Technically, the code would start in
>>                                       a feature branch. During this stage,
>>                                       we'd need to validate a few things,
>>                                       including confirming that the code
>>                                       and dependencies match the ASF
>>                                       policy, automating testing in Beam's
>>                                       tooling, etc. At that point, we'd
>>                                       take a community vote to accept the
>>                                       component into master, and consider
>>                                       the author(s) for committership in
>>                                       the overall project.
>>
>>                                       Welcome to the ASF and Beam -- we
>>                                       are thrilled to have you! Hope this
>>                                       helps, and please reach out if
>>                                       anybody on our end can help,
>>                                       including JB or myself.
>>
>>                                       Davor
>>
>>
>>                                       On Sun, Dec 17, 2017 at 10:13 AM,
>>                                       Jean-Baptiste Onofré
>>                                       <j...@nanthrax.net> wrote:
>>
>>                                           Hi David,
>>
>>                                           Generally speaking, having
>>                                           different fluent DSLs on top of
>>                                           the Beam SDK is great.
>>
>>                                           I would like to take a look at
>>                                           your wordcount examples to give
>>                                           you complete feedback. I like the
>>                                           idea, and a fluent Java DSL is
>>                                           valuable.
>>
>>                                           Let's wait for feedback from
>>                                           others. If we have a consensus,
>>                                           then I would be more than happy
>>                                           to help you with the donation (I
>>                                           worked on the Camel Java DSL a
>>                                           while ago, so I have some
>>                                           experience here).
>>
>>                                           Thanks !
>>                                           Regards
>>                                           JB
>>
>>                                           On 12/17/2017 07:00 PM, David
>>                                           Morávek wrote:
>>
>>                                               Hello,
>>
>>
>>                                               First of all, thanks for the
>>                             amazing work the Apache Beam
>>                                               community is doing!
>>
>>
>>                                               In 2014, we started development
>>                                               of a runtime-independent Java 8
>>                                               API that helps us create unified
>>                                               big-data processing flows. It has
>>                                               been used as a core building
>>                                               block of the Seznam.cz web
>>                                               crawler data infrastructure ever
>>                                               since. Its design principles and
>>                                               execution model are very similar
>>                                               to Apache Beam.
>>
>>
>>                                               This API was open sourced in 2016
>>                                               under the name Euphoria API:
>>
>>                                               https://github.com/seznam/euphoria
>>
>>
>>                                               As it is very similar to Apache
>>                                               Beam, we feel that it is not
>>                                               worth duplicating effort in terms
>>                                               of development of new runtimes
>>                                               and fine-tuning of current ones.
>>
>>
>>                                               The main blocker for us to switch
>>                                               to Apache Beam is the lack of a
>>                                               Java 8 API. We propose the
>>                                               integration of the Euphoria API
>>                                               into Apache Beam as a Java 8 DSL,
>>                                               in order to share our effort with
>>                                               the community.
>>
>>
>>                                               A simple example of Euphoria API
>>                                               usage can be found here:
>>
>>                                               https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>>
>>
>>
>>                                               If you feel that the Beam
>>                                               community could benefit from our
>>                                               work, we would love to start
>>                                               working on Euphoria integration
>>                                               into Apache Beam (we already have
>>                                               a working POC, with a few basic
>>                                               operators implemented).
>>
>>
>>                                               I look forward to hearing from
>>                                               you,
>>
>>                                               David
>>
>>
>>                                           --
>>                                           Jean-Baptiste Onofré
>>                                           jbono...@apache.org
>>                                           http://blog.nanthrax.net
>>                                           Talend - http://www.talend.com
>>
>>
>>
>>
>>                             --
>>                             Best regards,
>>
>>                             David Morávek
>>
>>
>>                         --
>>                         Jean-Baptiste Onofré
>>                         jbono...@apache.org
>>                         http://blog.nanthrax.net
>>                         Talend - http://www.talend.com
>>
>>
>>
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     jbono...@apache.org
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
>>
>> --
>> Best regards,
>>
>> David Morávek
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
