Re: 2019 Beam Events

2018-12-04 Thread Matthias Baetens
Great stuff, Gris! Looking forward to what 2019 will bring!

The Beam meetup in London will have a new get together early next year as
well :-)
https://www.meetup.com/London-Apache-Beam-Meetup/


On Tue, 4 Dec 2018 at 23:50 Austin Bennett 
wrote:

> Already got that process kicked off with the NY and LA meetups; now that
> SF is about to be inaugurated, the goal will be to get these moving as well.
>
> For anyone that is in (or goes to) those areas:
> https://www.meetup.com/New-York-Apache-Beam/
> https://www.meetup.com/Los-Angeles-Apache-Beam/
>
> Please reach out to get involved!
>
>
>
> On Tue, Dec 4, 2018 at 3:13 PM Griselda Cuevas  wrote:
>
>> +1 to Pablo's suggestion. If there's interest in founding a Meetup group
>> in a particular city, let's create the Meetup page and start getting
>> sign-ups. Joana will be reaching out with a comprehensive list of how to get
>> started, and we're hoping to compile a high-level calendar of
>> launches/announcements to feed into your meetup.
>>
>> G
>>
>> On Tue, 4 Dec 2018 at 12:04, Daniel Salerno 
>> wrote:
>>
>>> =)
>>> What good news!
>>> Okay, I'll set up the group and try to gather interest.
>>> Thank you
>>>
>>>
>>> On Tue, 4 Dec 2018 at 17:19, Pablo Estrada
>>> wrote:
>>>
>>>> FWIW, for some of these places that have interest (e.g. Brazil,
>>>> Israel), it's possible to create a group in meetup.com, and start
>>>> gauging interest, and looking for organizers.
>>>> Once a group of people with interest exists, it's easier to get
>>>> interest / sponsorship to bring speakers.
>>>> So if you are willing to create the group in meetup, Daniel, we can
>>>> monitor it and try to plan something as it grows : )
>>>> Best
>>>> -P.
>>>>
>>>> On Tue, Dec 4, 2018 at 10:55 AM Daniel Salerno
>>>> wrote:
>>>>
>>>>>
>>>>> It's a shame that there are no events in Brazil ...
>>>>>
>>>>> =(
>>>>>
>>>>> On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
>>>>> e...@orielresearch.org> wrote:
>>>>>
>>>>>> agree
>>>>>>
>>>>>> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel wrote:
>>>>>>
>>>>>>> Israel would be nice to have one
>>>>>>> chaim
>>>>>>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi Beam Community,
>>>>>>> >
>>>>>>> > I started curating industry conferences, meetups and events that
>>>>>>> > are relevant for Beam, this initial list I came up with. I'd love
>>>>>>> > your help adding others that I might have overlooked. Once we're
>>>>>>> > satisfied with the list, let's re-share so we can coordinate
>>>>>>> > proposal submissions, attendance and community meetups there.
>>>>>>> >
>>>>>>> >
>>>>>>> > Cheers,
>>>>>>> >
>>>>>>> > G
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> Loans are funded by
>>>>>>> FinWise Bank, a Utah-chartered bank located in Sandy,
>>>>>>> Utah, member FDIC, Equal
>>>>>>> Opportunity Lender. Merchant Cash Advances are
>>>>>>> made by Behalf. For more
>>>>>>> information on ECOA, click here
>>>>>>> . For important information
>>>>>>> about
>>>>>>> opening a new
>>>>>>> account, review Patriot Act procedures here
>>>>>>> .
>>>>>>> Visit Legal
>>>>>>>  to
>>>>>>> review our comprehensive program terms,
>>>>>>> conditions, and disclosures.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Eila
>>>>>> www.orielresearch.org
>>>>>> https://www.meetup.com/Deep-Learning-In-Production/
>>>>>>
>>>>>>
>>>>>> --


Re: org.apache.beam.runners.flink.PortableTimersExecutionTest is very flakey

2018-12-04 Thread Alex Amato
Well, here is my hacky solution.
You can see the changes I made to PortableTimersExecutionTest:
https://github.com/apache/beam/pull/6786/files

I don't really understand why the pipeline never starts running when I make
the results object transient in PortableTimersExecutionTest.

So I instead continue to access a static object, but key it with the test
parameter, to prevent tests from interfering with each other.
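
A minimal sketch of the keyed-static idea described above, assuming string keys and string results. The class and method names here are hypothetical and are not taken from the PR; the sketch only shows how one static, thread-safe map can give each test parameter its own result collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper, not the code from the PR: a static registry that
// gives every test parameter its own thread-safe result collection.
final class KeyedResults {
  // Static, so it is not serialized along with the function object;
  // concurrent, so several threads may write at the same time.
  private static final Map<String, Queue<String>> RESULTS = new ConcurrentHashMap<>();

  private KeyedResults() {}

  // Called from the processing code: record a value under the test's key.
  static void add(String testKey, String value) {
    RESULTS.computeIfAbsent(testKey, k -> new ConcurrentLinkedQueue<>()).add(value);
  }

  // Called from the test: snapshot only the results for this parameter.
  static List<String> get(String testKey) {
    Queue<String> queue = RESULTS.get(testKey);
    return queue == null ? new ArrayList<>() : new ArrayList<>(queue);
  }

  // Called in setup/teardown so repeated runs start from a clean state.
  static void clear(String testKey) {
    RESULTS.remove(testKey);
  }
}
```

Because each parameterized instance reads and writes only under its own key, a "batch" run and a "streaming" run in the same JVM no longer see each other's output.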

I am not too sure how to proceed. I don't really want to check in this
hacky solution, but I am not too sure what else to do to solve the
problems. Please let me know if you have any suggestions.

On Tue, Dec 4, 2018 at 5:26 PM Alex Amato  wrote:

> Thanks for letting me know, Maximilian.
>
> By the way, I've been looking at this test over the last few days as well.
> I have found a few other concurrency issues that I hope to send a PR out for.
>
>
>    - The PortableTimersExecutionTest result variable uses a static
>    ArrayList, but it can be written to concurrently (by multiple threads AND
>    multiple parameterized test instances), which causes flakiness.
>    - But just using a ConcurrentLinkedQueue and a non-static variable
>    isn't sufficient, as that will cause the results object to be copied
>    during DoFn serialization. That makes all the assertions fail,
>    since nothing gets written to the same results object the test is using.
>    - So it should be made private transient final. However, after trying
>    this I am seeing the test time out, and I am not sure why. Continuing to
>    debug this.
>
>
> I think that my PR was increasing flakiness, which is why I saw more of
> these issues.
> Just wanted to point these out in the meantime; hopefully it helps with
> debugging for you too.
>
> On Fri, Nov 30, 2018 at 7:49 AM Maximilian Michels  wrote:
>
>> This turned out to be a tricky bug. Robert and I had a joint debugging
>> session and managed to find the culprit.
>>
>> PR pending: https://github.com/apache/beam/pull/7171
>>
>> On 27.11.18 19:35, Kenneth Knowles wrote:
>> > I actually didn't look at this one. I filed a bunch more adjacent flake
>> > bugs. I didn't find your bug but I do see that test flaking at the same
>> > time as the others. FWIW here is the list of flakes and sickbayed tests:
>> > https://issues.apache.org/jira/issues/?filter=12343195
>> >
>> > Kenn
>> >
>> > On Tue, Nov 27, 2018 at 10:25 AM Alex Amato wrote:
>> >
>> > +Ken,
>> >
>> > Did you happen to look into this test? I heard that you may have
>> > been looking into this.
>> >
>> > On Mon, Nov 26, 2018 at 3:36 PM Maximilian Michels wrote:
>> >
>> > Hi Alex,
>> >
>> > Thanks for your help! I'm quite used to debugging concurrent/distributed
>> > problems. But this one is quite tricky, especially with regards to GRPC
>> > threads. I'll try to provide more information in the following.
>> >
>> > There are two observations:
>> >
>> > 1) The problem is specifically related to how the cleanup is performed
>> > for the EmbeddedEnvironmentFactory. The environment is shut down when
>> > the SDK harness exits, but the GRPC threads continue to linger for some
>> > time and may stall state processing on the next test.
>> >
>> > If you do _not_ close DefaultJobBundleFactory, which happens during
>> > close() or dispose() in the FlinkExecutableStageFunction or
>> > ExecutableStageDoFnOperator respectively, the tests run just fine. I ran
>> > 1000 test runs without a single failure.
>> >
>> > The EmbeddedEnvironment uses direct channels, which are marked
>> > experimental in GRPC. We may have to convert them to regular socket
>> > communication.
>> >
>> > 2) Try setting a conditional breakpoint in GrpcStateService which will
>> > never break, e.g. "false". Set it here:
>> >
>> https://github.com/apache/beam/blob/6da9aa5594f96c0201d497f6dce4797c4984a2fd/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/GrpcStateService.java#L134
>> >
>> > The tests will never fail. The SDK harness is always shut down correctly
>> > at the end of the test.
>> >
>> > Thanks,
>> > Max
>> >
>> > On 26.11.18 19:15, Alex Amato wrote:
>> >  > Thanks Maximilian, let me know if you need any help. Usually I debug
>> >  > this sort of thing by pausing the IntelliJ debugger to see all the
>> >  > different threads which are waiting on various conditions. If you find
>> >  > any insights from that, please post them here and we can try to figure
>> >  > out the source of the stuckness. Perhaps it may be some concurrency
>> >  > 

Re: org.apache.beam.runners.flink.PortableTimersExecutionTest is very flakey

2018-12-04 Thread Alex Amato
Thanks for letting me know, Maximilian.

By the way, I've been looking at this test over the last few days as well.
I have found a few other concurrency issues that I hope to send a PR out for.


   - The PortableTimersExecutionTest result variable uses a static
   ArrayList, but it can be written to concurrently (by multiple threads AND
   multiple parameterized test instances), which causes flakiness.
   - But just using a ConcurrentLinkedQueue and a non-static variable isn't
   sufficient, as that will cause the results object to be copied
   during DoFn serialization. That makes all the assertions fail, since
   nothing gets written to the same results object the test is using.
   - So it should be made private transient final. However, after trying
  this I am seeing the test timeout, and I am not sure why. Continuing to
  debug this.
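
The copy-on-serialization effect in the second bullet can be reproduced without Beam. The following is only a sketch under that assumption: `FakeFn` is a hypothetical stand-in for a DoFn, and the round trip uses plain Java serialization to mimic a runner shipping the function to a worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for a DoFn; not a Beam class.
final class FakeFn implements Serializable {
  private static final long serialVersionUID = 1L;

  // Static state is shared by all instances in the JVM and never serialized.
  static final Queue<String> STATIC_RESULTS = new ConcurrentLinkedQueue<>();

  // Instance state is serialized, so the deserialized copy owns a new queue.
  final Queue<String> instanceResults = new ConcurrentLinkedQueue<>();

  void process(String element) {
    STATIC_RESULTS.add(element);
    instanceResults.add(element);
  }

  // Serializes and deserializes the function, returning the copy.
  static FakeFn roundTrip(FakeFn fn) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(fn);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (FakeFn) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("serialization round trip failed", e);
    }
  }
}
```

If the copy returned by `roundTrip` processes an element, the original instance's queue stays empty while the static queue sees the element, which matches the failing assertions described above. As a general Java fact, a `transient` field is skipped by serialization and comes back as `null` in the copy; whether that is related to the timeout in the third bullet is not established here.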


I think that my PR was increasing flakiness, which is why I saw more of
these issues.
Just wanted to point these out in the meantime; hopefully it helps with
debugging for you too.

On Fri, Nov 30, 2018 at 7:49 AM Maximilian Michels  wrote:

> This turned out to be a tricky bug. Robert and I had a joint debugging
> session and managed to find the culprit.
>
> PR pending: https://github.com/apache/beam/pull/7171
>
> On 27.11.18 19:35, Kenneth Knowles wrote:
> > I actually didn't look at this one. I filed a bunch more adjacent flake
> > bugs. I didn't find your bug but I do see that test flaking at the same
> > time as the others. FWIW here is the list of flakes and sickbayed tests:
> > https://issues.apache.org/jira/issues/?filter=12343195
> >
> > Kenn
> >
> > On Tue, Nov 27, 2018 at 10:25 AM Alex Amato wrote:
> >
> > +Ken,
> >
> > Did you happen to look into this test? I heard that you may have
> > been looking into this.
> >
> > On Mon, Nov 26, 2018 at 3:36 PM Maximilian Michels wrote:
> >
> > Hi Alex,
> >
> > Thanks for your help! I'm quite used to debugging concurrent/distributed
> > problems. But this one is quite tricky, especially with regards to GRPC
> > threads. I'll try to provide more information in the following.
> >
> > There are two observations:
> >
> > 1) The problem is specifically related to how the cleanup is performed
> > for the EmbeddedEnvironmentFactory. The environment is shut down when
> > the SDK harness exits, but the GRPC threads continue to linger for some
> > time and may stall state processing on the next test.
> >
> > If you do _not_ close DefaultJobBundleFactory, which happens during
> > close() or dispose() in the FlinkExecutableStageFunction or
> > ExecutableStageDoFnOperator respectively, the tests run just fine. I ran
> > 1000 test runs without a single failure.
> >
> > The EmbeddedEnvironment uses direct channels, which are marked
> > experimental in GRPC. We may have to convert them to regular socket
> > communication.
> >
> > 2) Try setting a conditional breakpoint in GrpcStateService which will
> > never break, e.g. "false". Set it here:
> >
> https://github.com/apache/beam/blob/6da9aa5594f96c0201d497f6dce4797c4984a2fd/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/GrpcStateService.java#L134
> >
> > The tests will never fail. The SDK harness is always shut down correctly
> > at the end of the test.
> >
> > Thanks,
> > Max
> >
> > On 26.11.18 19:15, Alex Amato wrote:
> >  > Thanks Maximilian, let me know if you need any help. Usually I debug
> >  > this sort of thing by pausing the IntelliJ debugger to see all the
> >  > different threads which are waiting on various conditions. If you find
> >  > any insights from that, please post them here and we can try to figure
> >  > out the source of the stuckness. Perhaps it may be some concurrency
> >  > issue leading to deadlock?
> >  >
> >  > On Thu, Nov 22, 2018 at 12:57 PM Maximilian Michels wrote:
> >  >
> >  > I couldn't fix it thus far. The issue does not seem to be in the Flink
> >  > Runner but in the way the tests utilize the EMBEDDED environment to
> >  > run multiple portable jobs in a row.
> >  >
> >  > When it gets stuck it is in RemoteBundle#close and it is independent of
> >  > the test type (batch and streaming have different implementations).
> >  >
> >  > Will give it another look tomorrow.
> 

Re: GSOC - Summer of Code, on Beam?

2018-12-04 Thread Pablo Estrada
Hi Austin!
Thanks a lot for surfacing this. I participated in GSOC as a student a
couple times, and loved it. This being my first time around as a committer,
I'm excited to try and help.

I think, for starters, it may be good to find issues in JIRA to label with
"gsoc", so please everyone who knows of good candidate project issues,
label them with "gsoc".

And then we can find mentors for these issues, and start helping students
in the application process.

Best
-P.

On Tue, Dec 4, 2018 at 3:46 PM Austin Bennett 
wrote:

> Would it make sense to have any GSOC students for next summer work on
> Beam?  Do we have some candidate things that would be suitable and
> sufficiently discrete projects?
>
> Initial applications for organizations do not even open for about a month,
> though I thought it worth getting a sense from the group.
>
> A bit of info:
> https://summerofcode.withgoogle.com/archive/
>
> https://opensource.googleblog.com/2018/11/google-summer-of-code-15-years-strong.html
>
>
>
>


Re: 2019 Beam Events

2018-12-04 Thread Austin Bennett
Already got that process kicked off with the NY and LA meetups; now that
SF is about to be inaugurated, the goal will be to get these moving as well.

For anyone that is in (or goes to) those areas:
https://www.meetup.com/New-York-Apache-Beam/
https://www.meetup.com/Los-Angeles-Apache-Beam/

Please reach out to get involved!



On Tue, Dec 4, 2018 at 3:13 PM Griselda Cuevas  wrote:

> +1 to Pablo's suggestion. If there's interest in founding a Meetup group
> in a particular city, let's create the Meetup page and start getting
> sign-ups. Joana will be reaching out with a comprehensive list of how to get
> started, and we're hoping to compile a high-level calendar of
> launches/announcements to feed into your meetup.
>
> G
>
> On Tue, 4 Dec 2018 at 12:04, Daniel Salerno  wrote:
>
>> =)
>> What good news!
>> Okay, I'll set up the group and try to gather interest.
>> Thank you
>>
>>
>> On Tue, 4 Dec 2018 at 17:19, Pablo Estrada
>> wrote:
>>
>>> FWIW, for some of these places that have interest (e.g. Brazil, Israel),
>>> it's possible to create a group in meetup.com, and start gauging
>>> interest, and looking for organizers.
>>> Once a group of people with interest exists, it's easier to get interest
>>> / sponsorship to bring speakers.
>>> So if you are willing to create the group in meetup, Daniel, we can
>>> monitor it and try to plan something as it grows : )
>>> Best
>>> -P.
>>>
>>> On Tue, Dec 4, 2018 at 10:55 AM Daniel Salerno 
>>> wrote:
>>>

>>>> It's a shame that there are no events in Brazil ...
>>>>
>>>> =(
>>>>
>>>> On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
>>>> e...@orielresearch.org> wrote:
>>>>
>>>>> agree
>>>>>
>>>>> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel wrote:
>>>>>
>>>>>> Israel would be nice to have one
>>>>>> chaim
>>>>>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas
>>>>>> wrote:
>>>>>> >
>>>>>> > Hi Beam Community,
>>>>>> >
>>>>>> > I started curating industry conferences, meetups and events that
>>>>>> > are relevant for Beam, this initial list I came up with. I'd love
>>>>>> > your help adding others that I might have overlooked. Once we're
>>>>>> > satisfied with the list, let's re-share so we can coordinate
>>>>>> > proposal submissions, attendance and community meetups there.
>>>>>> >
>>>>>> >
>>>>>> > Cheers,
>>>>>> >
>>>>>> > G
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Eila
>>>>> www.orielresearch.org
>>>>> https://www.meetup.com/Deep-Learning-In-Production/
>>>>>
>>>>>
>>>>>


GSOC - Summer of Code, on Beam?

2018-12-04 Thread Austin Bennett
Would it make sense to have any GSOC students for next summer work on
Beam?  Do we have some candidate things that would be suitable and
sufficiently discrete projects?

Initial applications for organizations do not even open for about a month,
though I thought it worth getting a sense from the group.

A bit of info:
https://summerofcode.withgoogle.com/archive/
https://opensource.googleblog.com/2018/11/google-summer-of-code-15-years-strong.html


Re: 2019 Beam Events

2018-12-04 Thread Griselda Cuevas
+1 to Pablo's suggestion. If there's interest in founding a Meetup group
in a particular city, let's create the Meetup page and start getting
sign-ups. Joana will be reaching out with a comprehensive list of how to get
started, and we're hoping to compile a high-level calendar of
launches/announcements to feed into your meetup.

G

On Tue, 4 Dec 2018 at 12:04, Daniel Salerno  wrote:

> =)
> What good news!
> Okay, I'll set up the group and try to gather interest.
> Thank you
>
>
> On Tue, 4 Dec 2018 at 17:19, Pablo Estrada
> wrote:
>
>> FWIW, for some of these places that have interest (e.g. Brazil, Israel),
>> it's possible to create a group in meetup.com, and start gauging
>> interest, and looking for organizers.
>> Once a group of people with interest exists, it's easier to get interest
>> / sponsorship to bring speakers.
>> So if you are willing to create the group in meetup, Daniel, we can
>> monitor it and try to plan something as it grows : )
>> Best
>> -P.
>>
>> On Tue, Dec 4, 2018 at 10:55 AM Daniel Salerno 
>> wrote:
>>
>>>
>>> It's a shame that there are no events in Brazil ...
>>>
>>> =(
>>>
>>> On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
>>> e...@orielresearch.org> wrote:
>>>
>>>> agree
>>>>
>>>> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel wrote:
>>>>
>>>>> Israel would be nice to have one
>>>>> chaim
>>>>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas
>>>>> wrote:
>>>>> >
>>>>> > Hi Beam Community,
>>>>> >
>>>>> > I started curating industry conferences, meetups and events that are
>>>>> > relevant for Beam, this initial list I came up with. I'd love your help
>>>>> > adding others that I might have overlooked. Once we're satisfied with the
>>>>> > list, let's re-share so we can coordinate proposal submissions, attendance
>>>>> > and community meetups there.
>>>>> >
>>>>> >
>>>>> > Cheers,
>>>>> >
>>>>> > G
>>>>> >
>>>>> >
>>>>> >
>>>>>


>>>> --
>>>> Eila
>>>> www.orielresearch.org
>>>> https://www.meetup.com/Deep-Learning-In-Production/
>>>>





Re: [DISCUSS] Structuring Java based DSLs

2018-12-04 Thread Rui Wang
For pure SQL users, there shouldn't be any SDK concepts. The SQL shell and
the JDBC driver should be the way to interact with Beam through SQL.


For the embedded SQL use case in all SDKs (Python, Go, etc.), even assuming
there are relational algebra operators defined in the SDKs, each SDK still has
to implement its own way to parse SQL into operators (SQL is just a string).
To avoid that overhead, I would imagine that SDKs should keep SQL queries
as they are and wait for a later but shared processing step (I don't know if
Portability should handle SQL or if it could).


-Rui

On Tue, Dec 4, 2018 at 2:04 AM Jan Lukavský  wrote:

> Hi Kenn,
>
> my intent really was not to propose any changes right now. I'm trying to
> create a clear understanding of what the relation between Euphoria and
> SQL should be in the long run. From my point of view, Euphoria should always
> be a superset of SQL, because it should support complete relational algebra
> (and I'm not saying it does so right now, it should just be our goal) plus
> more flexible UDFs (not limited to the SQL standard) and stateful processing
> (which will probably not be part of SQL any time soon). There should be some
> sort of guarantee that the semantics of SQL and Euphoria are the same, because
> that is what users would expect. This can for sure be ensured by
> introducing another layer between Euphoria and core SDK (e.g. the join
> library), but the question is - what makes this solution different from
> creating this shared library from Euphoria itself (when looking at the big
> picture)? And it is not only about implementations of joins or any other
> operators, but there are other techniques that could be beneficial for SQL
> - e.g. pipeline sampling, automatic pipeline optimizations based on
> statistics from previous runs of batch queries, etc.
>
> The other way - that relational algebra nodes will become essential part
> of (some) SDK, that is equivalent to actually creating SQL SDK, am I right?
> I understand, that this approach can bring performance benefits, but
> besides that - is the language which implements SQL really important for
> users? Do we need SQL implementing Go UDFs, Java UDFs, Python UDFs? How
> would the resulting SQL query look like? If it is about allowing using SQL
> from all other SDKs (I want to do some basic preprocessing using SQL and
> then optimize some hard part in my favorite SDK) - can this be solved by
> enabling SQL in all SDKs by mixing various SDKs harnesses in single
> pipeline instead (e.g. I want to use SQL in Go SDK, I just tell the
> portable layer to run these operators using Java SDK and these using Go)?
> That seems plausible, solving interoperability issues, while leaving the
> whole implementation of SQL as an internal detail. Generally this solves
> more issues, like ability to reuse IOs in all SDKs (I'm aware that there
> are caveats, but that is beyond scope of intended discussion topic of this
> thread).
>
>  Jan
> On 12/3/18 7:27 PM, Kenneth Knowles wrote:
>
> To be honest, I don't think there's much worth doing right now. I think
> more self-contained is better for Beam SQL, generally. Two things I have on
> my mind are (1) SQL as an inline transform in every SDK and (2) supporting
> pure SQL like the CLI and JDBC driver, where the underlying language is an
> implementation detail.
>
> Big picture / long term, I would envision pure SQL, embedded SQL
> transform, and a DataFrame-like API in ~each SDK all desugaring to
> relational algebra nodes, sharing an optimizer, sharing some amount of
> mapping the physical plan to Beam transforms. The necessarily SDK-specific
> parts are the embedded transform API and UDFs in the host language. The
> rest should remain an implementation detail that we can change.
>
>  - For example, it is easy to imagine a customized columnar element/bundle
> encoding and SDK harness that only works for SQL to remove overhead of
> being general purpose. It could be written in C/C++/Go if we wanted to
> squeeze it for perf. Such things are made harder by having an elaborate
> end-user API between SQL and the core Beam model.
>  - Conversely, for whatever is chosen to underlie SQL's execution,
> stability is paramount. Ideally the simplest and least likely to change
> transforms would be the foundation. And I wouldn't want to have to design a
> user-friendly API for Euphoria or the join library just to enable a
> different join algorithm in SQL.
>
> So my take is keep SQL flexible, implement SQL on low-level and stable
> APIs, use join library, Euphoria, etc, if it looks like a big win, but
> don't build any policy here or do big refactors right now.
>
> Kenn
>
> On Mon, Dec 3, 2018 at 9:31 AM Jan Lukavský  wrote:
>
>> Hi Robert,
>>
>> currently there is no actual proposal, I was just trying to gather
>> feedback from the community. But my original thoughts would be [1]. I
>> actually don't see much need for restructuring the code by nesting
>> directories. If the community sees that it would make sense to structure
>> 

Re: 2019 Beam Events

2018-12-04 Thread Daniel Salerno
=)
What good news!
Okay, I'll set up the group and try to gather interest.
Thank you


On Tue, 4 Dec 2018 at 17:19, Pablo Estrada
wrote:

> FWIW, for some of these places that have interest (e.g. Brazil, Israel),
> it's possible to create a group in meetup.com, and start gauging
> interest, and looking for organizers.
> Once a group of people with interest exists, it's easier to get interest /
> sponsorship to bring speakers.
> So if you are willing to create the group in meetup, Daniel, we can
> monitor it and try to plan something as it grows : )
> Best
> -P.
>
> On Tue, Dec 4, 2018 at 10:55 AM Daniel Salerno 
> wrote:
>
>>
>> It's a shame that there are no events in Brazil ...
>>
>> =(
>>
>> On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
>> e...@orielresearch.org> wrote:
>>
>>> agree 
>>>
>>> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel  wrote:
>>>
>>>> Israel would be nice to have one
>>>> chaim
>>>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas
>>>> wrote:
>>>> >
>>>> > Hi Beam Community,
>>>> >
>>>> > I started curating industry conferences, meetups and events that are
>>>> > relevant for Beam, this initial list I came up with. I'd love your help
>>>> > adding others that I might have overlooked. Once we're satisfied with the
>>>> > list, let's re-share so we can coordinate proposal submissions, attendance
>>>> > and community meetups there.
>>>> >
>>>> >
>>>> > Cheers,
>>>> >
>>>> > G
>>>> >
>>>> >
>>>> >
 --


 Loans are funded by
 FinWise Bank, a Utah-chartered bank located in Sandy,
 Utah, member FDIC, Equal
 Opportunity Lender. Merchant Cash Advances are
 made by Behalf. For more
 information on ECOA, click here
 . For important information about
 opening a new
 account, review Patriot Act procedures here
 .
 Visit Legal
  to
 review our comprehensive program terms,
 conditions, and disclosures.

>>>
>>>
>>> --
>>> Eila
>>> www.orielresearch.org
>>> https://www.meetup.com/Deep-Learning-In-Production/
>>> 
>>>
>>>
>>>


Re: 2019 Beam Events

2018-12-04 Thread Pablo Estrada
FWIW, for some of these places that have interest (e.g. Brazil, Israel),
it's possible to create a group in meetup.com, and start gauging interest,
and looking for organizers.
Once a group of people with interest exists, it's easier to get interest /
sponsorship to bring speakers.
So if you are willing to create the group in meetup, Daniel, we can monitor
it and try to plan something as it grows : )
Best
-P.

On Tue, Dec 4, 2018 at 10:55 AM Daniel Salerno 
wrote:

>
> It's a shame that there are no events in Brazil ...
>
> =(
>
> On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
> e...@orielresearch.org> wrote:
>
>> agree 
>>
>> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel  wrote:
>>
>>> Israel would be nice to have one
>>> chaim
>>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas  wrote:
>>> >
>>> > Hi Beam Community,
>>> >
>>> > I started curating industry conferences, meetups and events that are
>>> relevant for Beam, this initial list I came up with. I'd love your help
>>> adding others that I might have overlooked. Once we're satisfied with the
>>> list, let's re-share so we can coordinate proposal submissions, attendance
>>> and community meetups there.
>>> >
>>> >
>>> > Cheers,
>>> >
>>> > G
>>> >
>>> >
>>> >
>>>
>>>
>>
>>
>> --
>> Eila
>> www.orielresearch.org
>> https://www.meetup.com/Deep-Learning-In-Production/
>> 
>>
>>
>>


Re: 2019 Beam Events

2018-12-04 Thread Daniel Salerno
It's a shame that there are no events in Brazil ...

=(

On Tue, 4 Dec 2018 at 13:12, OrielResearch Eila Arich-Landkof <
e...@orielresearch.org> wrote:

> agree 
>
> On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel  wrote:
>
>> Israel would be nice to have one
>> chaim
>> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas  wrote:
>> >
>> > Hi Beam Community,
>> >
>> > I started curating industry conferences, meetups and events that are
>> relevant for Beam, this initial list I came up with. I'd love your help
>> adding others that I might have overlooked. Once we're satisfied with the
>> list, let's re-share so we can coordinate proposal submissions, attendance
>> and community meetups there.
>> >
>> >
>> > Cheers,
>> >
>> > G
>> >
>> >
>> >
>>
>>
>
>
> --
> Eila
> www.orielresearch.org
> https://www.meetup.com/Deep-Learning-In-Production/
> 
>
>
>


Re: 2019 Beam Events

2018-12-04 Thread OrielResearch Eila Arich-Landkof
agree 

On Tue, Dec 4, 2018 at 5:41 AM Chaim Turkel  wrote:

> Israel would be nice to have one
> chaim
> On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas  wrote:
> >
> > Hi Beam Community,
> >
> > I started curating industry conferences, meetups and events that are
> relevant for Beam, this initial list I came up with. I'd love your help
> adding others that I might have overlooked. Once we're satisfied with the
> list, let's re-share so we can coordinate proposal submissions, attendance
> and community meetups there.
> >
> >
> > Cheers,
> >
> > G
> >
> >
> >
>
>


-- 
Eila
www.orielresearch.org
https://www.meetup.com/Deep-Learning-In-Production/



Re: 2019 Beam Events

2018-12-04 Thread Maximilian Michels
Thanks for sharing, Gris! This list will likely never be complete, as 
there are endless conferences :)


Nevertheless, it's a great idea to coordinate the attendance for the 
major ones.


Cheers,
Max

On 03.12.18 23:33, Griselda Cuevas wrote:

Hi Beam Community,

I started curating industry conferences, meetups and events that are
relevant for Beam; this is the initial list I came up with.
*I'd love your help adding others that I might have overlooked.* Once
we're satisfied with the list, let's re-share so we can coordinate
proposal submissions, attendance and community meetups there.



Cheers,

G





Re: Graceful shutdown of long-running Beam pipeline on Flink

2018-12-04 Thread Maximilian Michels

Thank you for sharing these, Lukasz!

Great question, Wayne!

As for pipeline shutdown, Flink users typically take a snapshot and then 
cancel the pipeline with Flink tools.


The Beam tooling needs to be improved to support cancelling as well. If 
snapshotting is enabled, the Beam job could also be restored from a 
snapshot instead of explicitly taking a savepoint.


Related issue for cancelling: 
https://issues.apache.org/jira/browse/BEAM-593 I think we should address 
this soon for the next release.


Thanks,
Max


On 03.12.18 17:53, Lukasz Cwik wrote:
There are proposals for pipeline drain[1] and also for snapshot and 
update[2] for Apache Beam. We would love contributions in this space.


1: 
https://docs.google.com/document/d/1NExwHlj-2q2WUGhSO4jTu8XGhDPmm3cllSN8IMmWci8
2: 
https://docs.google.com/document/d/1UWhnYPgui0gUYOsuGcCjLuoOUlGA4QaY91n8p3wz9MY


On Mon, Dec 3, 2018 at 7:05 AM Wayne Collins wrote:


Hi JC,

Thanks for the quick response!
I had hoped for an in-pipeline solution for runner portability but
it is nice to know we're not the only ones stepping outside to
interact with runner management. :-)

Wayne


On 2018-12-03 01:23, Juan Carlos Garcia wrote:

Hi Wayne,

We have the same setup and we do daily updates to our pipeline.

The way we do it is by using the Flink CLI tool from a Jenkins job.

Basically, our deployment job does the following:

1. Detect whether the pipeline is running (it matches on the job name).

2. If found, do a flink cancel with a savepoint (we use HDFS for
checkpoints / savepoints) under a given directory.

3. Use the flink run command for the new job and specify the
savepoint from step 2.
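
The three steps above can be sketched as a small Java helper that only assembles the Flink CLI invocations. This is an illustrative sketch, not the actual Jenkins job described here: the class name FlinkRedeploy, the sample job name and paths, and the assumed layout of `flink list -r` output lines (date, time, job id, job name, state, separated by " : ") are assumptions. The command forms `flink cancel -s <dir> <jobId>` and `flink run -s <savepoint> <jar>` are those of the Flink 1.x command line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that assembles the Flink CLI calls for a redeploy. */
public class FlinkRedeploy {

    // Assumed line layout of `flink list -r` output:
    //   03.12.2018 01:23:45 : <32 hex char job id> : <job name> (RUNNING)
    private static final Pattern JOB_LINE =
        Pattern.compile("^.*? : ([0-9a-f]{32}) : (.+) \\(RUNNING\\)\\s*$");

    /** Step 1: find the id of the running job whose name matches exactly. */
    public static Optional<String> findJobId(String listOutput, String jobName) {
        for (String line : listOutput.split("\\R")) {
            Matcher m = JOB_LINE.matcher(line);
            if (m.matches() && m.group(2).equals(jobName)) {
                return Optional.of(m.group(1));
            }
        }
        return Optional.empty();
    }

    /** Step 2: cancel the job, writing a savepoint under the given directory. */
    public static List<String> cancelWithSavepoint(String jobId, String savepointDir) {
        return Arrays.asList("flink", "cancel", "-s", savepointDir, jobId);
    }

    /** Step 3: start the new jar detached, restoring state from the savepoint. */
    public static List<String> runFromSavepoint(String savepointPath, String jarPath) {
        return Arrays.asList("flink", "run", "-d", "-s", savepointPath, jarPath);
    }

    public static void main(String[] args) {
        // Sample output and paths are made up for illustration.
        String listOutput =
            "03.12.2018 01:23:45 : 5e20cb6b0f357591171dfcca2eea09de : my-beam-job (RUNNING)\n";
        Optional<String> jobId = findJobId(listOutput, "my-beam-job");
        jobId.ifPresent(id ->
            System.out.println(String.join(" ", cancelWithSavepoint(id, "hdfs:///savepoints"))));
        System.out.println(String.join(" ",
            runFromSavepoint("hdfs:///savepoints/savepoint-abc", "pipeline-new.jar")));
    }
}
```

The returned argument lists can be handed to java.lang.ProcessBuilder or printed into a Jenkins shell step. The savepoint path for step 3 has to be taken from the output of the cancel command, which is left out of this sketch.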

I don't think there is any support to achieve the same from within
the pipeline. You need to do this externally as explained above.

Best regards,
JC


On Mon, Dec 3, 2018, 00:46 Wayne Collins <wayn...@dades.ca> wrote:

Hi all,
We have a number of Beam pipelines processing unbounded
streams sourced from Kafka on the Flink runner and are very
happy with both the platform and performance!

The problem is with shutting down the pipelines...for version
upgrades, system maintenance, load management, etc. it would
be nice to be able to gracefully shut these down under
software control but haven't been able to find a way to do so.
We're in good shape on checkpointing and then cleanly
recovering but shutdowns are all destructive to Flink or the
Flink TaskManager.

Methods tried:

1) Calling cancel on FlinkRunnerResult returned from
pipeline.run()
This would be our preferred method but p.run() doesn't return
until termination and even if it did, the runner code simply
throws:
"throw new UnsupportedOperationException("FlinkRunnerResult
does not support cancel.");"
so this doesn't appear to be a near-term option.
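
As a sketch of how a launcher could cope with this today: try the in-process cancel and treat UnsupportedOperationException as the signal to fall back to external tooling. The RunResult interface below is a hypothetical stand-in for the handle returned by pipeline.run(), not Beam's actual type, and tryCancel is an illustrative name.

```java
/** Illustrative only: guard an in-process cancel that the runner may not support. */
public class GuardedCancel {

    /** Hypothetical stand-in for the result handle returned by pipeline.run(). */
    public interface RunResult {
        void cancel();
    }

    /**
     * Returns true if the runner cancelled the job itself, or false if the
     * caller has to cancel it externally (for example with the runner's CLI).
     */
    public static boolean tryCancel(RunResult result) {
        try {
            result.cancel();
            return true;
        } catch (UnsupportedOperationException e) {
            // The runner does not implement cancel; leave it to external tooling.
            return false;
        }
    }

    public static void main(String[] args) {
        // Mimics the behaviour quoted above.
        RunResult flinkLike = () -> {
            throw new UnsupportedOperationException(
                "FlinkRunnerResult does not support cancel.");
        };
        System.out.println("cancelled in-process: " + tryCancel(flinkLike));
    }
}
```

This keeps the mainline portable across runners: where cancel is supported it is used, and elsewhere the launcher knows it must take the external route.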

2) Inject a "termination" message into the pipeline via Kafka
This does get through, but calling exit() from a stage in the
pipeline also terminates the Flink TaskManager.

3) Inject a "sleep" message, then manually restart the cluster
This is our current method: we pause the data at the source,
flood all branches of the pipeline with a "we're going down"
msg so the stages can do a bit of housekeeping, then hard-stop
the entire environment and re-launch with the new version.
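
A minimal sketch of the "we're going down" handling inside a stage, assuming string elements and a reserved marker value: the stage flushes its buffered work when the marker arrives and never calls exit(), so the TaskManager is left alone. The class name, the marker constant and the in-memory lists are hypothetical and stand in for whatever housekeeping a real stage would do.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stage logic: housekeeping on a control message, no exit(). */
public class ShutdownAwareStage {

    // Assumed marker value; any reserved payload agreed on by the pipeline would do.
    public static final String GOING_DOWN = "__GOING_DOWN__";

    private final List<String> pending = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();
    private boolean goingDown = false;

    /** Handles one element; the control marker triggers a flush instead of exit(). */
    public void process(String element) {
        if (GOING_DOWN.equals(element)) {
            flush();
            goingDown = true;
            return;
        }
        pending.add(element);
        if (goingDown) {
            flush(); // anything arriving after the marker is committed immediately
        }
    }

    private void flush() {
        committed.addAll(pending);
        pending.clear();
    }

    public boolean isGoingDown() { return goingDown; }

    public List<String> committed() { return committed; }

    public int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        ShutdownAwareStage stage = new ShutdownAwareStage();
        stage.process("a");
        stage.process("b");
        stage.process(GOING_DOWN);
        System.out.println(stage.committed() + " goingDown=" + stage.isGoingDown());
    }
}
```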

Is there a "Best Practice" method for gracefully terminating
an unbounded pipeline from within the pipeline or from the
mainline that launches it?

Thanks!
Wayne

-- 
Wayne Collins

dades.ca    Inc.
mailto:wayn...@dades.ca
cell:416-898-5137



-- 
Wayne Collins

dades.ca    Inc.
mailto:wayn...@dades.ca
cell:416-898-5137



Re: 2019 Beam Events

2018-12-04 Thread Chaim Turkel
Israel would be nice to have one
chaim
On Tue, Dec 4, 2018 at 12:33 AM Griselda Cuevas  wrote:
>
> Hi Beam Community,
>
> I started curating industry conferences, meetups and events that are relevant 
> for Beam, this initial list I came up with. I'd love your help adding others 
> that I might have overlooked. Once we're satisfied with the list, let's 
> re-share so we can coordinate proposal submissions, attendance and community 
> meetups there.
>
>
> Cheers,
>
> G
>
>
>



Re: [DISCUSS] Structuring Java based DSLs

2018-12-04 Thread Jan Lukavský

Hi Kenn,

my intent really was not to propose any changes right now. I'm trying to 
create a clear understanding of what the relation between Euphoria 
and SQL should be in the long run. From my point of view, Euphoria should 
always be a superset of SQL, because it should support complete relational 
algebra (I'm not saying it does so right now; it should just be our 
goal) plus more flexible UDFs (not limited to the SQL standard) and stateful 
processing (which will probably not be part of SQL any time soon). There 
should be some sort of guarantee that the semantics of SQL and Euphoria 
are the same, because that is what users would expect. This can 
certainly be ensured by introducing another layer between Euphoria and 
the core SDK (e.g. the join library), but the question is: what makes this 
solution different from creating this shared library from Euphoria 
itself (when looking at the big picture)? And it is not only about 
implementations of joins or other operators; there are other 
techniques that could be beneficial for SQL, e.g. pipeline sampling, 
automatic pipeline optimizations based on statistics from previous runs 
of batch queries, etc.


The other way, in which relational algebra nodes become an essential part 
of (some) SDK, is equivalent to actually creating a SQL SDK, am I 
right? I understand that this approach can bring performance benefits, 
but besides that, is the language which implements SQL really important 
for users? Do we need SQL implementing Go UDFs, Java UDFs, Python UDFs? 
What would the resulting SQL query look like? If it is about allowing 
the use of SQL from all other SDKs (I want to do some basic preprocessing 
using SQL and then optimize some hard part in my favorite SDK), can 
this be solved by enabling SQL in all SDKs by mixing various SDK 
harnesses in a single pipeline instead (e.g. I want to use SQL in the Go 
SDK, so I just tell the portable layer to run these operators using the 
Java SDK and those using Go)? That seems plausible, solving 
interoperability issues while leaving the whole implementation of SQL as 
an internal detail. Generally this solves more issues, like the ability 
to reuse IOs in all SDKs (I'm aware that there are caveats, but that is 
beyond the scope of the intended discussion topic of this thread).


 Jan

On 12/3/18 7:27 PM, Kenneth Knowles wrote:
To be honest, I don't think there's much worth doing right now. I 
think more self-contained is better for Beam SQL, generally. Two 
things I have on my mind are (1) SQL as an inline transform in every 
SDK and (2) supporting pure SQL like the CLI and JDBC driver, where 
the underlying language is an implementation detail.


Big picture / long term, I would envision pure SQL, embedded SQL 
transform, and a DataFrame-like API in ~each SDK all desugaring to 
relational algebra nodes, sharing an optimizer, sharing some amount of 
mapping the physical plan to Beam transforms. The necessarily 
SDK-specific parts are the embedded transform API and UDFs in the host 
language. The rest should remain an implementation detail that we can 
change.


 - For example, it is easy to imagine a customized columnar 
element/bundle encoding and SDK harness that only works for SQL to 
remove overhead of being general purpose. It could be written in 
C/C++/Go if we wanted to squeeze it for perf. Such things are made 
harder by having an elaborate end-user API between SQL and the core 
Beam model.
 - Conversely, for whatever is chosen to underlie SQL's execution, 
stability is paramount. Ideally the simplest and least likely to 
change transforms would be the foundation. And I wouldn't want to have 
to design a user-friendly API for Euphoria or the join library just to 
enable a different join algorithm in SQL.


So my take is keep SQL flexible, implement SQL on low-level and stable 
APIs, use join library, Euphoria, etc, if it looks like a big win, but 
don't build any policy here or do big refactors right now.


Kenn

On Mon, Dec 3, 2018 at 9:31 AM Jan Lukavský wrote:


Hi Robert,

currently there is no actual proposal, I was just trying to gather
feedback from the community. But my original thoughts would be [1]. I
actually don't see much need for restructuring the code by nesting
directories. If the community sees that it would make sense to structure
the dependencies, the second step would probably be to figure out how to
accomplish this. I don't have any exact solution in mind so far; it
would probably be necessary to first identify features that are needed by
SQL and not currently supported by Euphoria. Then we can actually
identify costs and see if this still makes sense.

  Jan

On 12/3/18 6:17 PM, Robert Bradshaw wrote:
> Taking a step back, what exactly is the proposal. Looking at the
> original message, I see
>
> (1) Letting SQL take a dependency on Euphoria, sharing more code and
> taking advantage of the 

Re: contributor in the Beam

2018-12-04 Thread Chaim Turkel
done, thanks
On Mon, Dec 3, 2018 at 11:36 AM Jean-Baptiste Onofré  wrote:
>
> Can you please fix the conflict in the PR ?
>
> Thanks
> Regards
> JB
>
> On 03/12/2018 08:52, Chaim Turkel wrote:
> > it looks like there was a failure that is not due to the code, how can
> > i continue the process?
> > https://github.com/apache/beam/pull/7162
> >
> > On Thu, Nov 29, 2018 at 9:15 PM Chaim Turkel  wrote:
> >>
> >> hi,
> >>   i added another pr for the case of a self signed certificate ssl on
> >> the mongodb server
> >>
> >> https://github.com/apache/beam/pull/7162
> >> On Wed, Nov 28, 2018 at 5:16 PM Jean-Baptiste Onofré  
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I already upgraded locally. Let me push the PR.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 28/11/2018 16:02, Chaim Turkel wrote:
> >>>> is there any reason that the mongo client version is still on 3.2.2?
> >>>> can you upgrade it to 3.9.0?
> >>>> chaim
> >>>> On Tue, Nov 27, 2018 at 4:48 PM Jean-Baptiste Onofré
> >>>> wrote:
> >>>>>
> >>>>> Hi Chaim,
> >>>>>
> >>>>> The best is to create a Jira describing the new features you want to
> >>>>> add. Then, you can create a PR related to this Jira.
> >>>>>
> >>>>> As I'm the original MongoDbIO author, I would be more than happy to help
> >>>>> you and review the PR.
> >>>>>
> >>>>> Thanks !
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>> On 27/11/2018 15:37, Chaim Turkel wrote:
> >>>>>> Hi,
> >>>>>>   I have added a few features to the MongoDbIO and would like to add
> >>>>>> them to the project.
> >>>>>> I have read https://beam.apache.org/contribute/
> >>>>>> I have added a jira user, what do i need to do next?
> >>>>>>
> >>>>>> chaim
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Jean-Baptiste Onofré
> >>>>> jbono...@apache.org
> >>>>> http://blog.nanthrax.net
> >>>>> Talend - http://www.talend.com
> >>>>
> >>>
> >>> --
> >>> Jean-Baptiste Onofré
> >>> jbono...@apache.org
> >>> http://blog.nanthrax.net
> >>> Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
