I may have missed things, but any update on the progress of this donation?

On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Great !
>
> Thanks !
> Regards
> JB
>
> On 01/03/2018 07:29 AM, David Morávek wrote:
>
>> Hello JB,
>>
>> Perfect! I'm already on the Beam Slack workspace, I'll contact you once I
>> get to the office.
>>
>> Thanks!
>> D.
>>
>> On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>     Hi David,
>>
>>     absolutely !! Let's move forward on the preparation steps.
>>
>>     Are you on Slack and/or hangout to plan this ?
>>
>>     Thanks,
>>     Regards
>>     JB
>>
>>     On 01/02/2018 05:35 PM, David Morávek wrote:
>>
>>         Hello JB,
>>
>>         can we help in any way to move things forward?
>>
>>         Thanks,
>>         D.
>>
>>         On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré
>>         <j...@nanthrax.net> wrote:
>>
>>              Thanks Jan,
>>
>>              It makes sense.
>>
>>              Let me take a look at the code to understand the "interaction".
>>
>>              Regards
>>              JB
>>
>>
>>              On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>>
>>                  Hi JB,
>>
>>                  Basically, you are not wrong. The project started about three or four
>>                  years ago with the goal of unifying batch and streaming processing in a
>>                  single portable, executor-independent API. Because of that, it is
>>                  currently "close" to Beam in this sense. But we don't see much added
>>                  value in keeping this as a separate project when one of the key
>>                  differences is the API (not the model itself), so we would like to
>>                  focus on translating the Euphoria API to Beam's SDK. That's why we
>>                  would like to see it as a DSL, so that the Euphoria API can be used
>>                  with Beam's runners as natively as possible.
>>
>>                  I hope I didn't make the subject even more unclear; if so, I'll be happy
>>                  to explain anything in more detail. :-)
>>
>>                      Jan
>>
>>
>>                  On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>>
>>                      Hi Jan,
>>
>>                      Thanks for your answers.
>>
>>                      However, they confused me ;)
>>
>>                      Based on your reply, Euphoria seems more like a programming
>>                      model/SDK "close" to Beam than a DSL on top of an existing Beam SDK.
>>
>>                      Am I wrong ?
>>
>>                      Regards
>>                      JB
>>
>>                      On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>>
>>                          Hi Ismael,
>>
>>                          basically we adopted Beam's design regarding partitioning
>>                          (https://github.com/seznam/euphoria/issues/160) and implemented
>>                          the sorting manually
>>                          (https://github.com/seznam/euphoria/issues/158). I'm not aware
>>                          of the time model differences (Euphoria supports ingestion and
>>                          event time, we don't support processing time by decision).
>>                          Regarding other differences (looking into the Beam capability
>>                          matrix, I'd say that):
>>
>>                             - we don't support stateful FlatMap (i.e. ParDo) for now
>>                               (https://github.com/seznam/euphoria/issues/192)
>>
>>                             - we don't support side inputs (by decision now, but might be
>>                               reconsidered) and outputs
>>                               (https://github.com/seznam/euphoria/issues/124)
>>
>>                             - we support complete event-time windows (non-merging,
>>                               merging, aligned, unaligned) and time control
>>
>>                             - we don't support processing time by decision (might be
>>                               reconsidered if a valid use-case is found)
>>
>>                             - we support window triggering based on both time and data,
>>                               including discarding and accumulating (without accumulating &
>>                               retracting)
>>
>>                          All our executors (runners) - Flink, Spark and Local - implement
>>                          the complete model, which we enforce using an "operator test kit"
>>                          that all executors must pass. The Spark executor supports bounded
>>                          sources only (for now). As David said, we currently don't have a
>>                          serialization abstraction, so there is some work to be done in
>>                          that regard.
>>
>>                          Our intention is to completely supersede Euphoria. We would like
>>                          to consider the possibility of using executors that do not rely on
>>                          Beam, but that is optional for now and should be straightforward.
>>
>>                          We'd be happy to answer any more questions you might have, and
>>                          thanks a lot!
>>
>>                          Best,
>>
>>                             Jan
>>
>>
>>                          On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>>
>>                              Hi,
>>
>>                              It is great to see that you guys have reached a point of
>>                              maturity to propose this. Congratulations on your work and on
>>                              the idea of contributing it to Beam.
>>
>>                              I remember from a previous discussion with Jan the model
>>                              mismatch between Euphoria and Beam, due to some design
>>                              decisions in both projects. I remember you guys had some
>>                              issues with the way Beam's sources do partitioning, as well as
>>                              Beam's lack of sorted data (on shuffle, à la Hadoop). Also, if
>>                              I remember well, the 'time' model of Euphoria was simpler than
>>                              Beam's. I mention all of this because I am curious about which
>>                              parts of the Euphoria model you guys had to sacrifice to
>>                              support Beam, and which parts of Beam's model should still be
>>                              integrated into Euphoria (and whether there is a
>>                              straightforward path to do it).
>>
>>                              If I understand well, if this gets merged into Apache, this
>>                              means that Euphoria's current implementation would be
>>                              superseded by this DSL? I am curious because I would like to
>>                              understand your level of investment in supporting the future
>>                              of this DSL.
>>
>>                              Thanks and congrats again !
>>                              Ismaël
>>
>>                              On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré
>>                              <j...@nanthrax.net> wrote:
>>
>>                                  Depending on the donation, you would need an ICLA for each
>>                                  contributor, and a CCLA in addition to the SGA.
>>
>>                                  Davor and I can sync with you on the legal stuff.
>>                                  However, I would wait a little bit just to get feedback
>>                                  from the whole team and start a formal vote.
>>
>>                                  I would be happy to start the formal vote.
>>
>>                                  Regards
>>                                  JB
>>
>>                                  On 12/18/2017 10:03 AM, David Morávek wrote:
>>
>>                                      Hello,
>>
>>                                      Thanks for the awesome feedback!
>>
>>                                      Romain:
>>
>>                                      We already use the Java Stream API in all operators
>>                                      where it makes sense (e.g. ReduceByKey). Still not sure
>>                                      if it was a good choice, but it can easily be converted
>>                                      to an iterator anyway.
>>
>>                                      Side outputs support is coming soon; we have already
>>                                      done some initial work on this.
>>
>>                                      Side inputs are not supported in the way you are used
>>                                      to from Beam, because they can be replaced by a Join
>>                                      operator on the same key (if annotated with
>>                                      broadcastHashJoin, it will be turned into a map-side
>>                                      join).
>>
>>                                      The only significant difference from Beam is that we
>>                                      decided not to abstract serialization, so we need to
>>                                      add support for Type Hints, because of type erasure.
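
For readers unfamiliar with why type erasure calls for type hints: a lambda or generic operator loses its type arguments at runtime, so a framework that needs the element type has to be told explicitly. One common way to carry that information is a "super type token". The sketch below is only an illustration of that general technique; `TypeHintSketch` is a hypothetical name and this is not Euphoria's actual Type Hints API.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical "super type token": illustrative only, not Euphoria's Type Hints API.
abstract class TypeHintSketch<T> {
    private final Type type;

    protected TypeHintSketch() {
        // An anonymous subclass records its generic superclass in class metadata,
        // so the type argument survives erasure and can be read reflectively.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException(
                "Create the hint as an anonymous subclass with a type argument");
        }
        this.type = ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }

    public static void main(String[] args) {
        // The trailing {} creates the anonymous subclass that captures List<String>.
        TypeHintSketch<List<String>> hint = new TypeHintSketch<List<String>>() {};
        System.out.println(hint.getType().getTypeName());
    }
}
```

The hint object can then be passed alongside a lambda so that a serializer for the element type can be chosen without inspecting the lambda itself.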
>>
>>                                      Fluent API:
>>
>>                                      The API is fluent within one operator. It is designed
>>                                      to "lead the programmer", which means that they will
>>                                      only be offered methods that make sense after the last
>>                                      method they used (e.g. in ReduceByKey, we know that
>>                                      after keyBy a reduceBy method should come next). It is
>>                                      implemented as a series of builders.
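
A minimal sketch of the "series of builders" idea described above, assuming hypothetical names (`ReduceByKeySketch`, `KeyByStep`, `ReduceByStep`) and a plain in-memory reduction. It is not Euphoria's actual implementation; it only illustrates how each step can expose just the methods that make sense next.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical step builder: every name here is illustrative, not Euphoria's API.
final class ReduceByKeySketch {

    private ReduceByKeySketch() {}

    // Entry point: the only thing the caller can do next is pick a key.
    static <IN> KeyByStep<IN> of(Iterable<IN> input) {
        return new KeyByStep<>(input);
    }

    // Step 1 exposes keyBy only.
    static final class KeyByStep<IN> {
        private final Iterable<IN> input;

        private KeyByStep(Iterable<IN> input) {
            this.input = input;
        }

        <K> ReduceByStep<IN, K> keyBy(Function<IN, K> keyFn) {
            return new ReduceByStep<>(input, keyFn);
        }
    }

    // Step 2 exposes reduceBy only, so any other call order fails to compile.
    static final class ReduceByStep<IN, K> {
        private final Iterable<IN> input;
        private final Function<IN, K> keyFn;

        private ReduceByStep(Iterable<IN> input, Function<IN, K> keyFn) {
            this.input = input;
            this.keyFn = keyFn;
        }

        // Folds all elements sharing a key, in memory; a real runner would distribute this.
        Map<K, IN> reduceBy(BinaryOperator<IN> reducer) {
            Map<K, IN> result = new LinkedHashMap<>();
            for (IN element : input) {
                result.merge(keyFn.apply(element), element, reducer);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> byLength = ReduceByKeySketch
            .of(Arrays.asList("a", "bb", "c", "dd"))
            .keyBy(String::length)
            .reduceBy((left, right) -> left + right);
        System.out.println(byLength); // prints {1=ac, 2=bbdd}
    }
}
```

Because `KeyByStep` has no `reduceBy` method, calling `ReduceByKeySketch.of(...).reduceBy(...)` is a compile-time error, which is the "leading" effect the paragraph describes.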
>>
>>                                      Davor:
>>
>>                                      Thanks, I'll contact you and start the process of
>>                                      having all the necessary paperwork signed on our side,
>>                                      so we can get things moving.
>>
>>                                      On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau
>>                                      <rmannibu...@gmail.com> wrote:
>>
>>                                            Hi guys
>>
>>                                            A DSL would be very welcome, in particular if
>>                                            fluent.
>>
>>                                            Open question: did you consider implementing the
>>                                            Stream API (surely extending it to have a
>>                                            BeamStream and a few more features like sides
>>                                            etc.)? It would be very natural, easy to
>>                                            integrate anywhere, and would avoid having to
>>                                            discover a new API.
>>
>>                                            Hazelcast Jet did it, so I don't see why Beam
>>                                            couldn't.
>>
>>                                            On Dec 18, 2017 at 07:26, "Davor Bonaci"
>>                                            <da...@apache.org> wrote:
>>
>>                                                Hi David,
>>                                                As JB noted, merging these two projects is a
>>                                                great idea. In fact, some of us have had
>>                                                those discussions in the past.
>>
>>                                                Legally, nothing in particular is strictly
>>                                                necessary, as the code seems to already be
>>                                                Apache 2.0 licensed. We don't, however, want
>>                                                to be perceived as making hostile forks, so
>>                                                it would be great to file a Software Grant
>>                                                Agreement with the ASF Secretary. I can help
>>                                                with the process, as necessary.
>>
>>                                                Project alignment-wise, there aren't any
>>                                                particular blockers that I am aware of. We
>>                                                welcome DSLs.
>>
>>                                                Technically, the code would start in a
>>                                                feature branch. During this stage, we'd need
>>                                                to validate a few things, including
>>                                                confirming that the code and dependencies
>>                                                match the ASF policy, automating testing in
>>                                                Beam's tooling, etc. At that point, we'd
>>                                                take a community vote to accept the
>>                                                component into master, and consider the
>>                                                author(s) for committership in the overall
>>                                                project.
>>
>>                                                Welcome to the ASF and Beam -- we are
>>                                                thrilled to have you! Hope this helps, and
>>                                                please reach out if anybody on our end can
>>                                                help, including JB or myself.
>>
>>                                                Davor
>>
>>
>>                                                On Sun, Dec 17, 2017 at 10:13 AM,
>>                                                Jean-Baptiste Onofré <j...@nanthrax.net>
>>                                                wrote:
>>
>>                                                    Hi David,
>>
>>                                                    Generally speaking, having different
>>                                                    fluent DSLs on top of the Beam SDK is
>>                                                    great.
>>
>>                                                    I would like to take a look at your
>>                                                    wordcount examples to give you complete
>>                                                    feedback. I like the idea, and a fluent
>>                                                    Java DSL is valuable.
>>
>>                                                    Let's wait for feedback from others. If
>>                                                    we have a consensus, then I would be
>>                                                    more than happy to help you with the
>>                                                    donation (I worked on the Camel Java DSL
>>                                                    a while ago, so I have some experience
>>                                                    here).
>>
>>                                                    Thanks !
>>                                                    Regards
>>                                                    JB
>>
>>                                                    On 12/17/2017 07:00 PM, David Morávek
>>                                                    wrote:
>>
>>                                                        Hello,
>>
>>
>>                                                        First of all, thanks for the amazing
>>                                                        work the Apache Beam community is
>>                                                        doing!
>>
>>
>>                                                        In 2014, we started development of a
>>                                                        runtime-independent Java 8 API that
>>                                                        helps us create unified big-data
>>                                                        processing flows. It has been used
>>                                                        as a core building block of the
>>                                                        Seznam.cz web crawler data
>>                                                        infrastructure ever since. Its
>>                                                        design principles and execution
>>                                                        model are very similar to Apache
>>                                                        Beam.
>>
>>
>>                                                        This API was open sourced in 2016,
>>                                                        under the name Euphoria API:
>>
>>                                                        https://github.com/seznam/euphoria
>>
>>
>>                                                        As it is very similar to Apache
>>                                                        Beam, we feel that it is not worth
>>                                                        duplicating effort in terms of
>>                                                        developing new runtimes and
>>                                                        fine-tuning current ones.
>>
>>
>>                                                        The main blocker for us to switch to
>>                                                        Apache Beam is the lack of a Java 8
>>                                                        API. We propose the integration of
>>                                                        the Euphoria API into Apache Beam as
>>                                                        a Java 8 DSL, in order to share our
>>                                                        effort with the community.
>>
>>
>>                                                        A simple example of Euphoria API
>>                                                        usage can be found here:
>>
>>         https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>>
>>
>>
>>                                                        If you feel that the Beam community
>>                                                        could benefit from our work, we
>>                                                        would love to start working on
>>                                                        Euphoria integration into Apache
>>                                                        Beam (we already have a working POC,
>>                                                        with a few basic operators
>>                                                        implemented).
>>
>>
>>                                                        I look forward to hearing from you,
>>
>>                                                        David
>>
>>
>>                                                    --
>>                                                    Jean-Baptiste Onofré
>>                                                    jbono...@apache.org
>>                                                    http://blog.nanthrax.net
>>                                                    Talend - http://www.talend.com
>>
>>
>>
>>
>>                                      --
>>                                      Regards,
>>
>>                                      David Morávek
>>
>>
>>                                  --
>>                                  Jean-Baptiste Onofré
>>                                  jbono...@apache.org
>>                                  http://blog.nanthrax.net
>>                                  Talend - http://www.talend.com
>>
>>
>>
>>
>>
>>              --
>>              Jean-Baptiste Onofré
>>              jbono...@apache.org
>>              http://blog.nanthrax.net
>>              Talend - http://www.talend.com
>>
>>
>>
>>
>>         --
>>         Regards,
>>
>>         David Morávek
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     jbono...@apache.org
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
