Re: @RequireTimeSortedInput design draft

2019-06-06 Thread Reza Rokni
Hi Jan,

I have been working on a timeseries extension that uses many of these
techniques to join two temporal streams. It's almost ready for a PR; I will
post it here when it is, as it might be useful for you. In general, I
borrowed a lot of techniques from the CoGroupBy code.
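
To make the join semantics concrete, here is a minimal sketch, assuming the
matching rule described in the "Timer support in Flink" thread: each left
element is matched to the right element with the closest timestamp that is
not later than the left one. It is plain Java with no Beam dependency, and
TemporalJoinSketch and joinToLatestRight are hypothetical names used only for
illustration; they are not part of the extension.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Beam code and not the extension's API.
class TemporalJoinSketch {

  /**
   * For each left timestamp, returns the value of the right element with the
   * greatest timestamp that is <= the left timestamp, or null if there is none.
   */
  static List<String> joinToLatestRight(List<Long> leftTimestamps, TreeMap<Long, String> right) {
    List<String> matches = new ArrayList<>();
    for (long leftTs : leftTimestamps) {
      // floorEntry returns the entry with the greatest key <= leftTs, or null.
      Map.Entry<Long, String> match = right.floorEntry(leftTs);
      matches.add(match == null ? null : match.getValue());
    }
    return matches;
  }

  public static void main(String[] args) {
    TreeMap<Long, String> right = new TreeMap<>();
    right.put(10L, "r10");
    right.put(20L, "r20");
    System.out.println(joinToLatestRight(Arrays.asList(5L, 10L, 15L, 25L), right));
  }
}
```

A sorted structure makes each lookup logarithmic, which is presumably part of
why Sorted Map State matters for point 2 below.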

*1) need to figure out how to get Coder of input PCollection of stateful
ParDo inside StatefulDoFnRunner*
My join takes in a  , in the outer transform I use things like
leftCollection.getCoder()).getValueCoder(); Then when creating the Join
transform I can defer the StateSpec object creation until the constructor
is called.

*2) there are performance considerations, that can be solved probably only
by Sorted Map State [2]*
Sorted Map is going to be awesome. Until then, the only option is to create
a cache in the DoFn to make it more efficient. For the cache to work you
need to key on window + key and do things like clear the cache in
@StartBundle. Better to wait for Sorted Map if this is not time critical.
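
A minimal sketch of such a cache, assuming plain strings as stand-ins for the
window and the key. BundleScopedCacheSketch and its methods are hypothetical
names; in a real DoFn the body of startBundle() would sit in the method
annotated with @StartBundle.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not Beam code: a per-instance cache keyed on window + key.
class BundleScopedCacheSketch<V> {

  /** Composite key, so the same user key in two different windows does not collide. */
  private static final class WindowAndKey {
    private final String window;
    private final String key;

    WindowAndKey(String window, String key) {
      this.window = window;
      this.key = key;
    }

    @Override
    public boolean equals(Object other) {
      if (!(other instanceof WindowAndKey)) {
        return false;
      }
      WindowAndKey that = (WindowAndKey) other;
      return window.equals(that.window) && key.equals(that.key);
    }

    @Override
    public int hashCode() {
      return Objects.hash(window, key);
    }
  }

  private final Map<WindowAndKey, V> cache = new HashMap<>();

  /** Drops everything cached so far; entries from an earlier bundle are not reused. */
  void startBundle() {
    cache.clear();
  }

  /** Returns the cached value for this window and key, or null if absent. */
  V get(String window, String key) {
    return cache.get(new WindowAndKey(window, key));
  }

  void put(String window, String key, V value) {
    cache.put(new WindowAndKey(window, key), value);
  }
}
```

Keying on the pair rather than on the key alone is what keeps values from
different windows apart.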

*3) additional work is needed for allowedLateness to work correctly (and
there are at least two ways how to solve this), see the design doc [3]*
Yup, in my case I can support this by not garbage-collecting the right side
of the join for now, but that is a compromise.

*4) more tests (for batch and validatesRunner) are needed*
I just posted a question on the best way to make use of the @ValidatesRunner
annotation on this list, sounds like it might be useful to you as well :-)


On Thu, 6 Jun 2019 at 23:03, Jan Lukavský  wrote:

> Hi,
>
> I have written a PoC implementation of this in [1] and I'd like to
> discuss some implementation details. First of all, I'd appreciate any
> feedback about this. There are some known issues:
>
>   1) need to figure out how to get Coder of input PCollection of
> stateful ParDo inside StatefulDoFnRunner
>
>   2) there are performance considerations, that can be solved probably
> only by Sorted Map State [2]
>
>   3) additional work is needed for allowedLateness to work correctly
> (and there are at least two ways how to solve this), see the design doc [3]
>
>   4) more tests (for batch and validatesRunner) are needed
>
> I have come across a few bugs in DirectRunner, which I tried to solve:
>
>   a) timers seem to be broken in stateful pardo with side inputs
>
>   b) timers need to be sorted by timestamp, otherwise state might be
> cleared before it gets a chance to be flushed
>
>
> Thanks for feedback,
>
>   Jan
>
>
> [1] https://github.com/apache/beam/pull/8774
>
> [2]
>
> http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/%3ccalstk6+ldemtjmnuysn3vcufywjkhmgv1isfbdmxthoqh91...@mail.gmail.com%3e
>
> [3]
>
> https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/
>
>
> On 5/23/19 4:40 PM, Robert Bradshaw wrote:
> > Thanks for writing this up.
> >
> > I think the justification for adding this to the model needs to be
> > that it is useful (you have this covered, though some examples would
> > be nice) and that it's something that can't easily be done by users
> > themselves (specifically, though it can be (relatively) cheaply done
> > in streaming and batch, it's done in very different ways, and also
> > that it's hard to do via composition).
> >
> > On Thu, May 23, 2019 at 4:10 PM Jan Lukavský  wrote:
> >> Hi,
> >>
> >> I have written a very brief draft of how it might be possible to
> >> implement @RequireTimeSortedInput discussed in [1]. I see the document
> >> [2] as a starting point for a discussion. There are several open questions,
> >> which I believe can be resolved by this great community. :-)
> >>
> >> Jan
> >>
> >> [1]
> >> http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/browser
> >>
> >> [2]
> >> https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/
> >>
>


-- 

This email may be confidential and privileged. If you received this
communication by mistake, please don't forward it to anyone else, please
erase all copies and attachments, and please let me know that it has gone
to the wrong person.

The above terms reflect a potential business arrangement, are provided
solely as a basis for further discussion, and are not intended to be and do
not constitute a legally binding obligation. No legally binding obligations
will be created, implied, or inferred until an agreement in final form is
executed in writing by all parties involved.


Testing code in extensions against runner

2019-06-06 Thread Reza Rokni
Hi,

I would like to validate some code that I am building under
extensions against different runners. It makes use of some caches in a DoFn
which are a little off the beaten path.

I have added @ValidatesRunner to the class and by adding the right values
to the gradle file in flink_runner have got the tests to run. However it
does not feel right for me to change the flink_runner.gradle file to
achieve this, especially as this is all experimental and under extensions.

I could copy all the bits needed from that gradle file over to my
extension's gradle file, but then I would need to do that for all runners,
which also feels a bit heavyweight. Is there a way, or should there be a
way, of adding a task to my gradle file that runs the tests against all
runners for me?

Cheers
Reza



Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Anton Kedin
I don't know, haven't followed the docker images release thread. Will take
a look and see if it's feasible or is a blocker for this release.

Regards,
Anton

On Thu, Jun 6, 2019 at 12:41 PM Ismaël Mejía  wrote:

> Are you planning to release also the docker images that were postponed in
> the previous release? If so probably starting early to define that part of
> the  process will be a good idea.
>
> On Thu, Jun 6, 2019, 7:06 PM Jean-Baptiste Onofré  wrote:
>
>> +1
>>
>> Regards
>> JB
>> On 6 June 2019 at 19:02, Ankur Goenka wrote:
>>>
>>> +1
>>>
>>> On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay  wrote:
>>>
>>>> +1, thank you for keeping the cadence.
>>>>
>>>> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin wrote:
>>>>
>>>>> Hello Beam community!
>>>>>
>>>>> Beam 2.14 release branch cut date is June 19 according to the release
>>>>> calendar [1]. I would like to volunteer myself to do this release. The plan
>>>>> is to cut the branch on that date, and cherrypick fixes if needed.
>>>>>
>>>>> If you have release blocking issues for 2.14 please mark their "Fix
>>>>> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
>>>>> would like to move any non-blocking issues to that version.
>>>>>
>>>>> And if we're doing a 2.7.1 release it should probably happen
>>>>> independently and in parallel if we want to maintain the release cadence.
>>>>>
>>>>> Thoughts, comments, objections?
>>>>>
>>>>> Thanks,
>>>>> Anton
>>>>>
>>>>> [1]
>>>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>>>>> [2]
>>>>> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>>>>>



Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Ismaël Mejía
Are you also planning to release the Docker images that were postponed in
the previous release? If so, starting early to define that part of the
process would probably be a good idea.

On Thu, Jun 6, 2019, 7:06 PM Jean-Baptiste Onofré  wrote:

> +1
>
> Regards
> JB
> On 6 June 2019 at 19:02, Ankur Goenka wrote:
>>
>> +1
>>
>> On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay  wrote:
>>
>>> +1, thank you for keeping the cadence.
>>>
>>> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin  wrote:
>>>
>>>> Hello Beam community!
>>>>
>>>> Beam 2.14 release branch cut date is June 19 according to the release
>>>> calendar [1]. I would like to volunteer myself to do this release. The plan
>>>> is to cut the branch on that date, and cherrypick fixes if needed.
>>>>
>>>> If you have release blocking issues for 2.14 please mark their "Fix
>>>> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
>>>> would like to move any non-blocking issues to that version.
>>>>
>>>> And if we're doing a 2.7.1 release it should probably happen
>>>> independently and in parallel if we want to maintain the release cadence.
>>>>
>>>> Thoughts, comments, objections?
>>>>
>>>> Thanks,
>>>> Anton
>>>>
>>>> [1]
>>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>>>> [2]
>>>> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>>>>
>>>


Re: [Discuss] Ideas for Apache Beam presence in social media

2019-06-06 Thread Thomas Weise
Pinging individual PMC members doesn't work. Proposed actions need to be
visible to anyone who is interested. That would require some form of
subscribe/notification mechanism (as exists for PRs and JIRAs).


On Thu, Jun 6, 2019 at 10:33 AM Aizhamal Nurmamat kyzy 
wrote:

> With the spreadsheet in http://s.apache.org/beam-tweets, anyone can
> propose tweets. I will check it every few days, and ping/tag PMC members to
> review tweets and publish. Does that sound fine?
> If you have ideas on how to make the process better, please let me know.
>
> Thanks,
> Aizhamal
>
> On Wed, Jun 5, 2019 at 4:10 AM Thomas Weise  wrote:
>
>> +1
>>
>> What would be the mechanism to notify the PMC that there is something to
>> review?
>>
>>
>> On Tue, Jun 4, 2019 at 9:55 PM Kenneth Knowles  wrote:
>>
>>> Bringing the PMC's conclusion back to this list, we are happy to start
>>> with the following arrangement:
>>>
>>>  - Doc/spreadsheet/etc readable by dev@ (aka the public), writable by
>>> some group of contributors to set up a queue of news
>>>  - Any member of PMC approves and executes the posts, with enough time
>>> elapsing to consider it lazy consensus
>>>
>>> Any mistake transcribing this conclusion is my own. And of course
>>> nothing is permanent, but we try and iterate.
>>>
>>> Kenn
>>>
>>> On Mon, Jun 3, 2019 at 2:18 PM Aizhamal Nurmamat kyzy <
>>> aizha...@google.com> wrote:
>>>
 Hello folks,

 I have created a spreadsheet where people can suggest tweets [1]. It
 contains a couple of tweets that have been tweeted as examples. Also, there
 are a couple others that I will ask PMC members to review in the next few
 days.

 I have also created a blog post[2] to invite community members to
 participate by proposing tweets / retweets.

 Does this look OK to everyone? I’d love to try it out and see if it
 drives engagement in the community. If not we can always change the
 processes.

 Thanks,
 aizhamal

 [1] s.apache.org/beam-tweets
 [2] https://github.com/apache/beam/pull/8747

 On Fri, May 24, 2019 at 4:26 PM Kenneth Knowles 
 wrote:

> Thanks for taking on this work!
>
> Kenn
>
> On Fri, May 24, 2019 at 2:52 PM Aizhamal Nurmamat kyzy <
> aizha...@google.com> wrote:
>
>> Hi everyone,
>>
>> I'd like to pilot this if that's okay by everyone. I'll set up a
>> spreadsheet, write a blog post publicizing it, and perhaps send out a
>> tweet. We can improve the process later with tools if necessary.
>>
>> Thanks all and have a great weekend!
>> Aizhamal
>>
>> On Tue, May 21, 2019 at 8:37 PM Kenneth Knowles 
>> wrote:
>>
>>> Great idea.
>>>
>>> Austin - point well taken about whether the PMC really has to
>>> micro-manage here. The stakes are potentially very high, but so are the
>>> stakes for code and website changes.
>>>
>>> I know that comdev votes authoring privileges to people who are not
>>> committers, but they are not speaking on behalf of comdev but under 
>>> their
>>> own name.
>>>
>>> Let's definitely find a way to be effective on social media.
>>>
>>> Kenn
>>>
>>> On Tue, May 21, 2019 at 4:14 AM Maximilian Michels 
>>> wrote:
>>>
 Hi Aizhamal,

 This is a great idea. I think it would help Beam to be more
 prominent on
 social media.

 We need to discuss this also on the private@ mailing list but I
 don't
 see anything standing in the way if the PMC always gets to approve
 the
 proposed social media postings.

 I could even imagine that the PMC gives rights to a Beam community
 member to post in their name.

 Thanks,
 Max

 On 21.05.19 03:09, Austin Bennett wrote:
 > Is PMC definitely in charge of this (approving, communication
 channel,
 > etc)?
 >
 > There could even be a more concrete pull-request-like function
 even for
 > things like tweets (to minimize cut/paste operations)?
 >
 > I remember a bit of a mechanism having been proposed some time
 ago (in
 > another circumstance), though doesn't look like it made it
 terribly far:
 >
 http://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/the-barnyard/-slick-tweeting
 > (I haven't otherwise seen such functionality).
 >
 >
 >
 > On Mon, May 20, 2019 at 4:54 PM Robert Burke wrote:
 >
 > +1
 > As a twitter user, I like this idea.
 >
 > On Mon, 20 May 2019 at 15:18, Aizhamal Nurmamat kyzy
 > <aizha...@google.com> wrote:
 >
 > Hello everyone,
 >
 >
 

Re: [Discuss] Ideas for Apache Beam presence in social media

2019-06-06 Thread Aizhamal Nurmamat kyzy
With the spreadsheet in http://s.apache.org/beam-tweets, anyone can propose
tweets. I will check it every few days, and ping/tag PMC members to review
tweets and publish. Does that sound fine?
If you have ideas on how to make the process better, please let me know.

Thanks,
Aizhamal

On Wed, Jun 5, 2019 at 4:10 AM Thomas Weise  wrote:

> +1
>
> What would be the mechanism to notify the PMC that there is something to
> review?
>
>
> On Tue, Jun 4, 2019 at 9:55 PM Kenneth Knowles  wrote:
>
>> Bringing the PMC's conclusion back to this list, we are happy to start
>> with the following arrangement:
>>
>>  - Doc/spreadsheet/etc readable by dev@ (aka the public), writable by
>> some group of contributors to set up a queue of news
>>  - Any member of PMC approves and executes the posts, with enough time
>> elapsing to consider it lazy consensus
>>
>> Any mistake transcribing this conclusion is my own. And of course nothing
>> is permanent, but we try and iterate.
>>
>> Kenn
>>
>> On Mon, Jun 3, 2019 at 2:18 PM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>>
>>> Hello folks,
>>>
>>> I have created a spreadsheet where people can suggest tweets [1]. It
>>> contains a couple of tweets that have been tweeted as examples. Also, there
>>> are a couple others that I will ask PMC members to review in the next few
>>> days.
>>>
>>> I have also created a blog post[2] to invite community members to
>>> participate by proposing tweets / retweets.
>>>
>>> Does this look OK to everyone? I’d love to try it out and see if it
>>> drives engagement in the community. If not we can always change the
>>> processes.
>>>
>>> Thanks,
>>> aizhamal
>>>
>>> [1] s.apache.org/beam-tweets
>>> [2] https://github.com/apache/beam/pull/8747
>>>
>>> On Fri, May 24, 2019 at 4:26 PM Kenneth Knowles  wrote:
>>>
 Thanks for taking on this work!

 Kenn

 On Fri, May 24, 2019 at 2:52 PM Aizhamal Nurmamat kyzy <
 aizha...@google.com> wrote:

> Hi everyone,
>
> I'd like to pilot this if that's okay by everyone. I'll set up a
> spreadsheet, write a blog post publicizing it, and perhaps send out a
> tweet. We can improve the process later with tools if necessary.
>
> Thanks all and have a great weekend!
> Aizhamal
>
> On Tue, May 21, 2019 at 8:37 PM Kenneth Knowles 
> wrote:
>
>> Great idea.
>>
>> Austin - point well taken about whether the PMC really has to
>> micro-manage here. The stakes are potentially very high, but so are the
>> stakes for code and website changes.
>>
>> I know that comdev votes authoring privileges to people who are not
>> committers, but they are not speaking on behalf of comdev but under their
>> own name.
>>
>> Let's definitely find a way to be effective on social media.
>>
>> Kenn
>>
>> On Tue, May 21, 2019 at 4:14 AM Maximilian Michels 
>> wrote:
>>
>>> Hi Aizhamal,
>>>
>>> This is a great idea. I think it would help Beam to be more
>>> prominent on
>>> social media.
>>>
>>> We need to discuss this also on the private@ mailing list but I
>>> don't
>>> see anything standing in the way if the PMC always gets to approve
>>> the
>>> proposed social media postings.
>>>
>>> I could even imagine that the PMC gives rights to a Beam community
>>> member to post in their name.
>>>
>>> Thanks,
>>> Max
>>>
>>> On 21.05.19 03:09, Austin Bennett wrote:
>>> > Is PMC definitely in charge of this (approving, communication
>>> channel,
>>> > etc)?
>>> >
>>> > There could even be a more concrete pull-request-like function
>>> even for
>>> > things like tweets (to minimize cut/paste operations)?
>>> >
>>> > I remember a bit of a mechanism having been proposed some time ago
>>> (in
>>> > another circumstance), though doesn't look like it made it
>>> terribly far:
>>> >
>>> http://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/the-barnyard/-slick-tweeting
>>> > (I haven't otherwise seen such functionality).
>>> >
>>> >
>>> >
>>> > On Mon, May 20, 2019 at 4:54 PM Robert Burke wrote:
>>> >
>>> > +1
>>> > As a twitter user, I like this idea.
>>> >
>>> > On Mon, 20 May 2019 at 15:18, Aizhamal Nurmamat kyzy
>>> > <aizha...@google.com> wrote:
>>> >
>>> > Hello everyone,
>>> >
>>> >
>>> > What does the community think of making Apache Beam’s
>>> social
>>> > media presence more active and more community driven?
>>> >
>>> >
>>> > The Slack and StackOverflow for Apache Beam offer pretty
>>> nice
>>> > support, but we still could utilize Twitter & LinkedIn
>>> better to
>>> > share more interesting Beam news. For example, we could
>>> tweet to

Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Jean-Baptiste Onofré
+1

Regards
JB

On 6 June 2019 at 19:02, Ankur Goenka wrote:
> +1
>
> On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay wrote:
>
>> +1, thank you for keeping the cadence.
>>
>> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin wrote:
>>
>>> Hello Beam community!
>>>
>>> Beam 2.14 release branch cut date is June 19 according to the release
>>> calendar [1]. I would like to volunteer myself to do this release. The plan
>>> is to cut the branch on that date, and cherrypick fixes if needed.
>>>
>>> If you have release blocking issues for 2.14 please mark their "Fix
>>> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
>>> would like to move any non-blocking issues to that version.
>>>
>>> And if we're doing a 2.7.1 release it should probably happen
>>> independently and in parallel if we want to maintain the release cadence.
>>>
>>> Thoughts, comments, objections?
>>>
>>> Thanks,
>>> Anton
>>>
>>> [1]
>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>>> [2]
>>> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>>>
>>


Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Ankur Goenka
+1

On Thu, Jun 6, 2019, 9:13 AM Ahmet Altay  wrote:

> +1, thank you for keeping the cadence.
>
> On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin  wrote:
>
>> Hello Beam community!
>>
>> Beam 2.14 release branch cut date is June 19 according to the release
>> calendar [1]. I would like to volunteer myself to do this release. The plan
>> is to cut the branch on that date, and cherrypick fixes if needed.
>>
>> If you have release blocking issues for 2.14 please mark their "Fix
>> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
>> would like to move any non-blocking issues to that version.
>>
>> And if we're doing a 2.7.1 release it should probably happen
>> independently and in parallel if we want to maintain the release cadence.
>>
>> Thoughts, comments, objections?
>>
>> Thanks,
>> Anton
>>
>> [1]
>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>> [2]
>> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>>
>


Re: [PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Ahmet Altay
+1, thank you for keeping the cadence.

On Thu, Jun 6, 2019 at 9:04 AM Anton Kedin  wrote:

> Hello Beam community!
>
> Beam 2.14 release branch cut date is June 19 according to the release
> calendar [1]. I would like to volunteer myself to do this release. The plan
> is to cut the branch on that date, and cherrypick fixes if needed.
>
> If you have release blocking issues for 2.14 please mark their "Fix
> Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you
> would like to move any non-blocking issues to that version.
>
> And if we're doing a 2.7.1 release it should probably happen independently
> and in parallel if we want to maintain the release cadence.
>
> Thoughts, comments, objections?
>
> Thanks,
> Anton
>
> [1]
> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
> [2]
> https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0
>


[PROPOSAL] Preparing for Beam 2.14.0 release

2019-06-06 Thread Anton Kedin
Hello Beam community!

Beam 2.14 release branch cut date is June 19 according to the release
calendar [1]. I would like to volunteer myself to do this release. The plan
is to cut the branch on that date, and cherrypick fixes if needed.

If you have release blocking issues for 2.14 please mark their "Fix
Version" as 2.14.0 [2]. Please use 2.15.0 release in JIRA in case you would
like to move any non-blocking issues to that version.

And if we're doing a 2.7.1 release it should probably happen independently
and in parallel if we want to maintain the release cadence.

Thoughts, comments, objections?

Thanks,
Anton

[1]
https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
[2]
https://issues.apache.org/jira/browse/BEAM-7478?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%202.14.0


Re: [PROPOSAL] Prepare for LTS bugfix release 2.7.1

2019-06-06 Thread Kenneth Knowles
Hi all,

Re-raising this thread. I got busy for the last month, and also did not
want to overlap the 2.13.0 release process. Now I want to pick up 2.7.1
again.

Can everyone check on any bug they have targeted to 2.7.1 [1] and get the
backports merged to release-2.7.1 and the tickets resolved?

Kenn

[1]
https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%202.7.1%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC

On Fri, Apr 26, 2019 at 11:19 AM Ahmet Altay  wrote:

> I agree with both keeping 2.7.x going until a new LTS is declared and
> declaring LTS post-release after some use. 2.12 might actually be a good
> candidate; with multiple RCs/validations it is presumably well tested. We
> can consider that after it gets some real world use.
>
> On Fri, Apr 26, 2019 at 6:29 AM Robert Bradshaw 
> wrote:
>
>> IIRC, there was some talk on making 2.12 the next LTS, but the
>> consensus is to decide on a LTS after having had some experience with
>> it, not at or before the release itself.
>>
>>
>> On Fri, Apr 26, 2019 at 3:04 PM Alexey Romanenko
>>  wrote:
>> >
>> > Thanks for working on this, Kenn.
>> >
>> > Perhaps, I missed this but has it been already discussed/decided what
>> will be the next LTS release?
>> >
>> > On 26 Apr 2019, at 08:02, Kenneth Knowles  wrote:
>> >
>> > Since it is all trivially reversible if there is some other feeling
>> about this thread, I have gone ahead and started the work:
>> >
>> >  - I made release-2.7.1 branch point to the same commit as
>> release-2.7.0 so there is something to target PRs
>> >  - I have opened the first PR, cherry-picking the set_version script
>> and using it to set the version on the branch to 2.7.1:
>> https://github.com/apache/beam/pull/8407 (found bug in the new script
>> right away :-)
>> >
>> > Here is the release with list of issues:
>> https://issues.apache.org/jira/projects/BEAM/versions/12344458. So
>> anyone can grab a ticket and volunteer to open a backport PR to the
>> release-2.7.1 branch.
>> >
>> > I don't have a strong opinion about how long we should support the
>> 2.7.x line. I am curious about different perspectives on user / vendor
>> needs. I have two very basic thoughts: (1) we surely need to keep it going
>> until some time after we have another LTS designated, to make sure there is
>> a clear path for anyone only using LTS releases and (2) if we decide to end
>> support of 2.7.x but then someone volunteers to backport and release, of
>> course I would not expect anyone to block them, so it has no maximum
>> lifetime, but we just need consensus on a minimum. And of course that
>> consensus cannot force anyone to do the work, but is just a resolution of
>> the community.
>> >
>> > Kenn
>> >
>> > On Thu, Apr 25, 2019 at 10:29 PM Jean-Baptiste Onofré 
>> wrote:
>> >>
>> >> +1 it sounds good to me.
>> >>
>> >> Thanks !
>> >>
>> >> Regards
>> >> JB
>> >>
>> >> On 26/04/2019 02:42, Kenneth Knowles wrote:
>> >> > Hi all,
>> >> >
>> >> > Since the release of 2.7.0 we have identified some serious bugs:
>> >> >
>> >> >  - There are 8 (non-dupe) issues* tagged with Fix Version 2.7.1
>> >> >  - 2 are rated "Blocker" (aka P0) but I think the others may be
>> underrated
>> >> >  - If you know of a critical bug that is not on that list, please
>> file
>> >> > an LTS backport ticket for it
>> >> >
>> >> > If a user is on an old version and wants to move to the LTS, there
>> are
>> >> > some real blockers. I propose that we perform a 2.7.1 release
>> starting now.
>> >> >
>> >> > I volunteer to manage the release. What do you think?
>> >> >
>> >> > Kenn
>> >> >
>> >> > *Some are "resolved" but this is not accurate as the LTS 2.7.1
>> branch is
>> >> > not created yet. I suggest filing a ticket to track just the LTS
>> >> > backport when you hit a bug that merits it.
>> >> >
>> >
>> >
>>
>


Re: @RequireTimeSortedInput design draft

2019-06-06 Thread Jan Lukavský

Hi,

I have written a PoC implementation of this in [1] and I'd like to 
discuss some implementation details. First of all, I'd appreciate any 
feedback about this. There are some known issues:


 1) need to figure out how to get Coder of input PCollection of 
stateful ParDo inside StatefulDoFnRunner


 2) there are performance considerations, that can be solved probably 
only by Sorted Map State [2]


 3) additional work is needed for allowedLateness to work correctly 
(and there are at least two ways how to solve this), see the design doc [3]


 4) more tests (for batch and validatesRunner) are needed
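
As a rough illustration of what time-sorted input could mean in streaming,
here is a minimal sketch, assuming elements are buffered and released in
timestamp order once the watermark has passed them. This is only one
plausible strategy and not necessarily what the PoC in [1] does;
TimeSortedBufferSketch is a hypothetical name, the TreeMap merely stands in
for something like Sorted Map State, and allowedLateness is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Beam code: buffer elements and emit them in timestamp order.
class TimeSortedBufferSketch {

  // timestamp -> values seen with that timestamp, iterated in ascending timestamp order
  private final TreeMap<Long, List<String>> buffer = new TreeMap<>();

  void add(long timestamp, String value) {
    buffer.computeIfAbsent(timestamp, ts -> new ArrayList<>()).add(value);
  }

  /** Removes and returns, in timestamp order, every buffered value with timestamp <= watermark. */
  List<String> advanceWatermark(long watermark) {
    List<String> ready = new ArrayList<>();
    Map<Long, List<String>> due = buffer.headMap(watermark, true);
    for (List<String> values : due.values()) {
      ready.addAll(values);
    }
    // headMap returns a view, so clearing it removes the emitted entries from the buffer.
    due.clear();
    return ready;
  }
}
```

With plain bag-like state the buffer would presumably have to be read and
sorted on every flush, which is the kind of cost item 2) refers to.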

I have come across a few bugs in DirectRunner, which I tried to solve:

 a) timers seem to be broken in stateful pardo with side inputs

 b) timers need to be sorted by timestamp, otherwise state might be 
cleared before it gets a chance to be flushed
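
To illustrate b), here is a minimal sketch, assuming a flush timer and a
cleanup timer both become eligible when the watermark advances. Firing them
in timestamp order lets the flush run before the cleanup, regardless of the
order in which they were set. TimerOrderingSketch and its nested Timer class
are hypothetical and are not DirectRunner code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not DirectRunner code: fire eligible timers earliest-first.
class TimerOrderingSketch {

  static final class Timer {
    final String id;
    final long timestamp;

    Timer(String id, long timestamp) {
      this.id = id;
      this.timestamp = timestamp;
    }
  }

  /** Returns the ids of all timers with timestamp <= watermark, earliest timestamp first. */
  static List<String> fireInTimestampOrder(List<Timer> pending, long watermark) {
    PriorityQueue<Timer> queue =
        new PriorityQueue<>(Comparator.comparingLong((Timer t) -> t.timestamp));
    queue.addAll(pending);
    List<String> fired = new ArrayList<>();
    // The queue head is always the earliest remaining timer.
    while (!queue.isEmpty() && queue.peek().timestamp <= watermark) {
      fired.add(queue.poll().id);
    }
    return fired;
  }
}
```

If the cleanup timer was set first and timers fired in insertion order, the
state would be gone by the time the flush timer ran.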



Thanks for feedback,

 Jan


[1] https://github.com/apache/beam/pull/8774

[2] 
http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/%3ccalstk6+ldemtjmnuysn3vcufywjkhmgv1isfbdmxthoqh91...@mail.gmail.com%3e


[3] 
https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/



On 5/23/19 4:40 PM, Robert Bradshaw wrote:

Thanks for writing this up.

I think the justification for adding this to the model needs to be
that it is useful (you have this covered, though some examples would
be nice) and that it's something that can't easily be done by users
themselves (specifically, though it can be (relatively) cheaply done
in streaming and batch, it's done in very different ways, and also
that it's hard to do via composition).

On Thu, May 23, 2019 at 4:10 PM Jan Lukavský  wrote:

Hi,

I have written a very brief draft of how it might be possible to
implement @RequireTimeSortedInput discussed in [1]. I see the document
[2] as a starting point for a discussion. There are several open questions,
which I believe can be resolved by this great community. :-)

Jan

[1] http://mail-archives.apache.org/mod_mbox/beam-dev/201905.mbox/browser

[2]
https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/



Re: Timer support in Flink

2019-06-06 Thread Reza Rokni
I changed the font size and the wording a little; it's in this PR:

https://github.com/apache/beam/pull/8773

I started to mess around with moving it to a better position, but then I
quickly remembered why no one lets me near web page CSS :-)
If anyone else wants to enhance that PR, please go for it!

Cheers

Reza

On Tue, 4 Jun 2019 at 15:42, Robert Bradshaw  wrote:

> One issue with the fully expanded version is that it's so large it's
> hard to read.
>
> I think it would be useful to make the ~ entries (at least) clickable
> or with hover tool tips. It would be nice to be able to expand columns
> individually as well.
>
> On Tue, Jun 4, 2019 at 7:20 AM Melissa Pashniak 
> wrote:
> >
> >
> > Yeah, people's eyes likely jump to the big "What is being computed?"
> header first and skip the small font "expand details" (that's what my eyes
> did anyway!) Even just moving the expand/collapse to be AFTER the header of
> the table (or down to the next line)  and making the font bigger might help
> a lot. And maybe making the text more explicit: "Click to expand for more
> details".
> >
> > I'm traveling right now so can't take an in-depth look, but this might
> be doable by changing the order of things in [1] and the font size in [2].
> I'll add this info to the JIRA also.
> >
> > [1]
> https://github.com/apache/beam/blame/master/website/src/_includes/capability-matrix.md#L18
> > [2]
> https://github.com/apache/beam/blob/master/website/src/_sass/capability-matrix.scss#L130
> >
> >
> > On Mon, Jun 3, 2019 at 2:15 AM Maximilian Michels 
> wrote:
> >>
> >> Good point. I think I discovered the detailed view when I made changes
> >> to the source code. Classic tunnel-vision problem :)
> >>
> >> On 30.05.19 12:57, Reza Rokni wrote:
> >> > :-)
> >> >
> >> > https://issues.apache.org/jira/browse/BEAM-7456
> >> >
> >> > On Thu, 30 May 2019 at 18:41, Alex Van Boxel wrote:
> >> >
> >> > Oh... you can expand the matrix. Never saw that, this could indeed
> >> > be better. So it isn't you.
> >> >
> >> >   _/
> >> > _/ Alex Van Boxel
> >> >
> >> >
> >> > On Thu, May 30, 2019 at 12:24 PM Reza Rokni wrote:
> >> >
> >> > PS, until it was just pointed out to me by Max, I had missed
> the
> >> > (expand details) clickable link in the capability matrix.
> >> >
> >> > Probably just me, but do others think it's also easy to miss?
> If
> >> > yes I will raise a Jira for it
> >> >
> >> > On Wed, 29 May 2019 at 19:52, Reza Rokni wrote:
> >> >
> >> > Thanx Max!
> >> >
> >> > Reza
> >> >
> >> > On Wed, 29 May 2019, 16:38 Maximilian Michels,
> >> > <m...@apache.org> wrote:
> >> >
> >> > Hi Reza,
> >> >
> >> > The detailed view of the capability matrix states:
> "The
> >> > Flink Runner
> >> > supports timers in non-merging windows."
> >> >
> >> > That is still the case. Other than that, timers should
> >> > be working fine.
> >> >
> >> >  > It makes very heavy use of Event.Time timers and
> has
> >> > to do some manual DoFn cache work to get around some
> >> > O(heavy) issues.
> >> >
> >> > If you are running on Flink 1.5, timer deletion
> suffers
> >> > from O(n)
> >> > complexity which has been fixed in newer versions.
> >> >
> >> > Cheers,
> >> > Max
> >> >
> >> > On 29.05.19 03:27, Reza Rokni wrote:
> >> >  > Hi Flink experts,
> >> >  >
> >> >  > I am getting ready to push a PR around a utility
> >> > class for timeseries join
> >> >  >
> >> >  > left.timestamp match to closest right.timestamp
> where
> >> > right.timestamp <=
> >> >  > left.timestamp.
> >> >  >
> >> >  > It makes very heavy use of Event.Time timers and
> has
> >> > to do some manual
> >> >  > DoFn cache work to get around some O(heavy) issues.
> >> > Wanted to test
> >> >  > things against Flink: In the capability matrix we
> >> > have "~" for Timer
> >> >  > support in Flink:
> >> >  >
> >> >  >
> >> >
> https://beam.apache.org/documentation/runners/capability-matrix/
> >> >  >
> >> >  > Is that page outdated, if not what are the areas
> that
> >> > still need to be
> >> >  > addressed please?
> >> >  >
> >> >  > Cheers
> >> >  >
> >> >  > Reza
> >> >  >
> >> >