(Sounds good, thanks! We'll follow up there.)

On Tue, Feb 27, 2018 at 10:49 AM, David Morávek <[email protected]> wrote:
> Hi Davor,
>
> sorry for the delay, we were blocked by our legal department. I've sent
> both the SGA and the CCLA to [email protected]; please let me know if you
> need anything else.
>
> Regards,
> David
>
> On Mon, Feb 19, 2018 at 6:13 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
>> Hi Davor,
>>
>> We still have some discussion/paperwork on the Euphoria side (SGA, ...).
>>
>> So, it's on track, but it is taking a little more time than expected.
>>
>> Regards
>> JB
>>
>> On 02/19/2018 05:40 AM, Davor Bonaci wrote:
>> > I may have missed things, but is there any update on the progress of
>> > this donation?
>> >
>> > On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Great!
>> >
>> > Thanks!
>> > Regards
>> > JB
>> >
>> > On 01/03/2018 07:29 AM, David Morávek wrote:
>> >
>> > Hello JB,
>> >
>> > Perfect! I'm already on the Beam Slack workspace, I'll contact you once
>> > I get to the office.
>> >
>> > Thanks!
>> > D.
>> >
>> > On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Hi David,
>> >
>> > absolutely! Let's move forward on the preparation steps.
>> >
>> > Are you on Slack and/or hangout to plan this?
>> >
>> > Thanks,
>> > Regards
>> > JB
>> >
>> > On 01/02/2018 05:35 PM, David Morávek wrote:
>> >
>> > Hello JB,
>> >
>> > can we help in any way to move things forward?
>> >
>> > Thanks,
>> > D.
>> >
>> > On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Thanks Jan,
>> >
>> > It makes sense.
>> >
>> > Let me take a look at the code to understand the "interaction".
>> >
>> > Regards
>> > JB
>> >
>> > On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>> >
>> > Hi JB,
>> >
>> > basically you are not wrong. The project started about three or four
>> > years ago with the goal of unifying batch and streaming processing in a
>> > single portable, executor-independent API. Because of that, it is
>> > currently "close" to Beam in this sense. But we don't see much added
>> > value in keeping this as a separate project when one of the key
>> > differences is the API (not the model itself), so we would like to
>> > focus on translation from the Euphoria API to Beam's SDK. That's why we
>> > would like to see it as a DSL, so that it would be possible to use
>> > the Euphoria API with Beam's runners as natively as possible.
>> >
>> > I hope I didn't make the subject even more unclear; if so, I'll be happy
>> > to explain anything in more detail. :-)
>> >
>> > Jan
>> >
>> > On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>> >
>> > Hi Jan,
>> >
>> > Thanks for your answers.
>> >
>> > However, they confused me ;)
>> >
>> > Regarding what you replied, Euphoria seems more like a programming
>> > model/SDK "close" to Beam than a DSL on top of an existing Beam SDK.
>> >
>> > Am I wrong?
>> >
>> > Regards
>> > JB
>> >
>> > On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>> >
>> > Hi Ismael,
>> >
>> > basically we adopted Beam's design regarding partitioning
>> > (https://github.com/seznam/euphoria/issues/160) and implemented
>> > the sorting manually (https://github.com/seznam/euphoria/issues/158).
>> > I'm not aware of the time model differences (Euphoria supports
>> > ingestion and event time; we don't support processing time by decision).
>> > Regarding other differences (looking at the Beam capability matrix),
>> > I'd say that:
>> >
>> > - we don't support stateful FlatMap (i.e. ParDo) for now
>> > (https://github.com/seznam/euphoria/issues/192)
>> >
>> > - we don't support side inputs (by decision for now, but this might be
>> > reconsidered) and outputs
>> > (https://github.com/seznam/euphoria/issues/124)
>> >
>> > - we support complete event-time windows (non-merging, merging,
>> > aligned, unaligned) and time control
>> >
>> > - we don't support processing time by decision (might be
>> > reconsidered if a valid use-case is found)
>> >
>> > - we support window triggering based on both time and data,
>> > including discarding and accumulating (without accumulating &
>> > retracting)
>> >
>> > All our executors (runners) - Flink, Spark and Local - implement
>> > the complete model, which we enforce using an "operator test kit"
>> > that all executors must pass. The Spark executor supports bounded
>> > sources only (for now). As David said, we currently don't have
>> > a serialization abstraction, so there is some work to be done in
>> > that regard.
>> >
>> > Our intention is to completely supersede Euphoria. We would like
>> > to consider the possibility of using executors that would not rely on
>> > Beam, but that is optional now and should be straightforward.
>> >
>> > We'd be happy to answer any more questions you might have, and
>> > thanks a lot!
>> >
>> > Best,
>> >
>> > Jan
>> >
>> > On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>> >
>> > Hi,
>> >
>> > It is great to see that you guys have reached a point of maturity to
>> > propose this. Congratulations on your work and on the idea of
>> > contributing it to Beam.
>> >
>> > I remember from a previous discussion with Jan the model
>> > mismatch between Euphoria and Beam, because of some design decisions
>> > in both projects. I remember you guys had some issues with the way
>> > Beam's sources do partitioning, as well as Beam's lack of sorted data
>> > (on shuffle, à la Hadoop). Also, if I remember well, the 'time' model of
>> > Euphoria was simpler than Beam's. I bring all of this up because I
>> > am curious about what parts of the Euphoria model you guys had to
>> > sacrifice to support Beam, and what parts of Beam's model should still
>> > be integrated into Euphoria (and whether there is a straightforward
>> > path to do it).
>> >
>> > If I understand well, if this gets merged into Apache, this means that
>> > Euphoria's current implementation would be superseded by this DSL? I
>> > am curious because I would like to understand your level of investment
>> > in supporting the future of this DSL.
>> >
>> > Thanks and congrats again!
>> > Ismaël
>> >
>> > On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Depending on the donation, you would need an ICLA for each contributor,
>> > and a CCLA in addition to the SGA.
>> >
>> > You can sync with Davor and me on the legal stuff.
>> > However, I would wait a little bit just to have feedback from the
>> > whole team and start a formal vote.
>> >
>> > I would be happy to start the formal vote.
>> >
>> > Regards
>> > JB
>> >
>> > On 12/18/2017 10:03 AM, David Morávek wrote:
>> >
>> > Hello,
>> >
>> > Thanks for the awesome feedback!
>> >
>> > Romain:
>> >
>> > We already use the Java Stream API in all operators where it makes
>> > sense (e.g. ReduceByKey). Still not sure if it was a good choice, but
>> > it can easily be converted to an iterator anyway.
>> >
>> > Side outputs support is coming soon; we have already done some initial
>> > work on this.
>> >
>> > Side inputs are not supported in the way you are used to from Beam,
>> > because they can be replaced by the Join operator on the same key (if
>> > annotated with broadcastHashJoin, it will be turned into a map-side
>> > join).
>> >
>> > The only significant difference from Beam is that we decided not to
>> > abstract serialization, so we need to add support for Type Hints,
>> > because of type erasure.
>> >
>> > Fluent API:
>> >
>> > The API is fluent within one operator. It is designed to "lead the
>> > programmer", which means that they will only be offered methods that
>> > make sense after the last method they used (e.g. in ReduceByKey, we
>> > know that either reduceBy method should come after keyBy). It is
>> > implemented as a series of builders.
>> >
>> > Davor:
>> >
>> > Thanks, I'll contact you and will start the process of having all the
>> > necessary paperwork signed on our side, so we can get things moving.
>> >
>> > On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau <[email protected]> wrote:
>> >
>> > Hi guys
>> >
>> > A DSL would be very welcome, in particular if fluent.
>> >
>> > Open question: did you look into implementing the Stream API (surely
>> > extending it to have a BeamStream and a few more features like sides
>> > etc.)? It would be very natural, easy to integrate anywhere, and would
>> > avoid having to discover a new API.
>> >
>> > Hazelcast Jet did it, so I don't see why Beam couldn't.
>> >
>> > On Dec 18, 2017 at 07:26, "Davor Bonaci" <[email protected]> wrote:
>> >
>> > Hi David,
>> > As JB noted, merging these two projects is a great idea. In fact,
>> > some of us have had those discussions in the past.
>> >
>> > Legally, nothing in particular is strictly necessary, as the code seems
>> > to already be Apache 2.0 licensed. We don't, however, want to be
>> > perceived as making hostile forks, so it would be great to file a
>> > Software Grant Agreement with the ASF Secretary. I can help with the
>> > process, as necessary.
>> >
>> > Project alignment-wise, there aren't any particular blockers that I am
>> > aware of. We welcome DSLs.
>> >
>> > Technically, the code would start in a feature branch. During this
>> > stage, we'd need to validate a few things, including confirming that
>> > the code and dependencies match the ASF policy, automating testing in
>> > Beam's tooling, etc. At that point, we'd take a community vote to
>> > accept the component into master, and consider the author(s) for
>> > committership in the overall project.
>> >
>> > Welcome to the ASF and Beam -- we are thrilled to have you! Hope this
>> > helps, and please reach out if anybody on our end can help, including
>> > JB or myself.
>> >
>> > Davor
>> >
>> > On Sun, Dec 17, 2017 at 10:13 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>> >
>> > Hi David,
>> >
>> > Generally speaking, having different fluent DSLs on top of the Beam
>> > SDK is great.
>> >
>> > I would like to take a look at your wordcount examples to give you
>> > complete feedback. I like the idea, and a fluent Java DSL is valuable.
>> >
>> > Let's wait for feedback from others. If we have a consensus, then I
>> > would be more than happy to help you with the donation (I worked on
>> > the Camel Java DSL a while ago, so I have some experience here).
>> >
>> > Thanks!
>> > Regards
>> > JB
>> >
>> > On 12/17/2017 07:00 PM, David Morávek wrote:
>> >
>> > Hello,
>> >
>> > First of all, thanks for the amazing work the Apache Beam community
>> > is doing!
>> >
>> > In 2014, we started development of a runtime-independent Java 8 API
>> > that helps us create unified big-data processing flows. It has been
>> > used as a core building block of the Seznam.cz web crawler data
>> > infrastructure ever since. Its design principles and execution model
>> > are very similar to Apache Beam's.
>> >
>> > This API was open sourced in 2016, under the name Euphoria API:
>> >
>> > https://github.com/seznam/euphoria
>> >
>> > As it is very similar to Apache Beam, we feel that it is not worth
>> > duplicating effort in terms of developing new runtimes and
>> > fine-tuning current ones.
>> >
>> > The main blocker for us to switch to Apache Beam is the lack of a
>> > Java 8 API. We propose the integration of the Euphoria API into Apache
>> > Beam as a Java 8 DSL, in order to share our effort with the community.
>> >
>> > A simple example of Euphoria API usage can be found here:
>> >
>> > https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>> >
>> > If you feel that the Beam community could benefit from our work,
>> > we would love to start working on Euphoria integration into
>> > Apache Beam (we already have a working POC, with a few basic
>> > operators implemented).
>> >
>> > I look forward to hearing from you,
>> >
>> > David
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > [email protected]
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>> >
>> > --
>> > Best regards,
>> >
>> > David Morávek
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
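
For readers following the "Fluent API" point quoted above (an operator that "leads the programmer" through a series of builders, with keyBy followed by reduceBy), here is a minimal sketch of that staged-builder idea in plain Java 8. It is only an illustration under assumptions: the class StagedReduceByKey, the stage classes, and the methods of(), output() and wordCount() are hypothetical names made up for this note, the input is an in-memory List rather than a dataset, and none of this is the actual Euphoria or Beam API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical staged builder; illustrative only, not the Euphoria API. */
final class StagedReduceByKey {

  private StagedReduceByKey() {}

  /** Entry point: the returned stage exposes keyBy and nothing else. */
  static <IN> KeyByStage<IN> of(List<IN> input) {
    return new KeyByStage<>(input);
  }

  /** Stage 1: the only possible next step is keyBy. */
  static final class KeyByStage<IN> {
    private final List<IN> input;

    private KeyByStage(List<IN> input) {
      this.input = input;
    }

    <K> ReduceByStage<IN, K> keyBy(Function<IN, K> keyFn) {
      return new ReduceByStage<>(input, keyFn);
    }
  }

  /** Stage 2: after keyBy, the only possible next step is reduceBy. */
  static final class ReduceByStage<IN, K> {
    private final List<IN> input;
    private final Function<IN, K> keyFn;

    private ReduceByStage(List<IN> input, Function<IN, K> keyFn) {
      this.input = input;
      this.keyFn = keyFn;
    }

    <V> OutputStage<IN, K, V> reduceBy(Function<IN, V> valueFn, BinaryOperator<V> reducer) {
      return new OutputStage<>(input, keyFn, valueFn, reducer);
    }
  }

  /** Stage 3: everything is configured, so the result can be produced. */
  static final class OutputStage<IN, K, V> {
    private final List<IN> input;
    private final Function<IN, K> keyFn;
    private final Function<IN, V> valueFn;
    private final BinaryOperator<V> reducer;

    private OutputStage(List<IN> input, Function<IN, K> keyFn,
                        Function<IN, V> valueFn, BinaryOperator<V> reducer) {
      this.input = input;
      this.keyFn = keyFn;
      this.valueFn = valueFn;
      this.reducer = reducer;
    }

    /** Groups the input by key and folds the values of each key with the reducer. */
    Map<K, V> output() {
      Map<K, V> result = new LinkedHashMap<>();
      for (IN element : input) {
        result.merge(keyFn.apply(element), valueFn.apply(element), reducer);
      }
      return result;
    }
  }

  /** Usage: counts occurrences of each word, in first-seen order. */
  static Map<String, Integer> wordCount(List<String> words) {
    return StagedReduceByKey.of(words)
        .keyBy(w -> w)
        .reduceBy(w -> 1, Integer::sum)
        .output();
  }

  public static void main(String[] args) {
    System.out.println(wordCount(Arrays.asList("a", "b", "a"))); // prints {a=2, b=1}
  }
}
```

Because each stage is a separate type, calling reduceBy before keyBy, or output() before reduceBy, does not compile, which is one way to get the "only offered methods that make sense" behaviour from IDE completion.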
