Re: [VOTE] Go SDK

2018-05-22 Thread Ahmet Altay
+1 (binding)

Congratulations to the team!

On Tue, May 22, 2018 at 10:13 AM, Alan Myrvold  wrote:

> +1 (non-binding)
> Nice work!
>
> On Tue, May 22, 2018 at 9:18 AM Pablo Estrada  wrote:
>
>> +1 (binding)
>> Very excited to see this!
>>
>> On Tue, May 22, 2018 at 9:09 AM Thomas Weise  wrote:
>>
>>> +1 and congrats!
>>>
>>>
>>> On Tue, May 22, 2018 at 8:48 AM, Rafael Fernandez 
>>> wrote:
>>>
 +1 !

 On Tue, May 22, 2018 at 7:54 AM Lukasz Cwik  wrote:

> +1 (binding)
>
> On Tue, May 22, 2018 at 6:16 AM Robert Burke 
> wrote:
>
>> +1 (non-binding)
>>
>> I'm looking forward to helping gophers solve their big data problems
>> in their language of choice, and runner of choice!
>>
>> Next stop, a non-java portability runner?
>>
>> On Tue, May 22, 2018, 6:08 AM Kenneth Knowles  wrote:
>>
>>> +1 (binding)
>>>
>>> This is great. Feels like a phase change in the life of Apache Beam,
>>> having three languages, with multiple portable runners on the horizon.
>>>
>>> Kenn
>>>
>>> On Tue, May 22, 2018 at 2:50 AM Ismaël Mejía 
>>> wrote:
>>>
 +1 (binding)

 Go SDK brings new language support for a community not well supported in
 the Big Data world, the Go developers, so this is great. Also, the fact
 that this is the first SDK integrated with the portability work makes it an
 interesting project to learn lessons from for future languages.

 Now is the time to start building a community around the Go SDK; this is
 the most important task now, and the only way to do it is to have the SDK
 as an official part of Beam, so +1.

 Congrats to Henning and all the other contributors for this important
 milestone.
 On Tue, May 22, 2018 at 10:21 AM Holden Karau 
 wrote:

 > +1 (non-binding), I've had a chance to work with the SDK and it's pretty
 > neat to see Beam add support for a language before most of the big data
 > ecosystem.

 > On Mon, May 21, 2018 at 10:29 PM, Jean-Baptiste Onofré <
 j...@nanthrax.net>
 wrote:

 >> Hi Henning,

 >> SGA has been filed for the entire project during the incubation period.

 >> Here, we have to check if SGA/IP donation is clean for the Go SDK.

 >> We don't have a lot to do, just checked that we are clean on this front.

 >> Regards
 >> JB

 >> On 22/05/2018 06:42, Henning Rohde wrote:

 >>> Thanks everyone!

 >>> Davor -- regarding your two comments:
 >>> * Robert mentioned that "SGA should have probably already been
 >>> filed" in the previous thread. I got the impression that nothing further
 >>> was needed. I'll follow up.
 >>> * The standard Go tooling basically always pulls directly from
 >>> github, so there is no real urgency here.

 >>> Thanks,
 >>>Henning


 >>> On Mon, May 21, 2018 at 9:30 PM Jean-Baptiste Onofré <
 j...@nanthrax.net
 > wrote:

 >>>  +1 (binding)

 >>>  I just want to check about SGA/IP/Headers.

 >>>  Thanks !
 >>>  Regards
 >>>  JB

 >>>  On 22/05/2018 03:02, Henning Rohde wrote:
 >>>   > Hi everyone,
 >>>   >
 >>>   > Now that the remaining issues have been resolved as discussed, I'd like
 >>>   > to propose a formal vote on accepting the Go SDK into master. The main
 >>>   > practical difference is that the Go SDK would be part of the Apache Beam
 >>>   > release going forward.
 >>>   >
 >>>   > Highlights of the Go SDK:
 >>>   >   * Go user experience with natively-typed DoFns with (simulated)
 >>>   > generic types
 >>>   >   * Covers most of the Beam model: ParDo, GBK, CoGBK, Flatten, Combine,
 >>>   > Windowing, ..
 >>>   >   * Includes several IO connectors: Datastore, BigQuery, PubSub,
 >>>   > extensible textio.
 >>>   >   * Supports the portability framework for both batch and streaming,
 >>>   > notably the upcoming portable Flink runner
 >>>   >   * 

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-05-17 Thread Ahmet Altay
On Thu, May 17, 2018 at 6:08 PM, Kenneth Knowles <k...@google.com> wrote:

> In case you didn't see the other thread, Andrew just discovered a problem
> in SQL's jar build. It may be a release blocker.
>

I missed Andrew's email. I only looked at the release blocking list. If it
might be a release blocker, could you please add it to the list?


>
> Just an FYI. Since the fix is likely a small change to the build file, it seems ok
> to cut the branch and cherry pick.
>
> Kenn
>
> On Thu, May 17, 2018, 17:41 Ahmet Altay <al...@google.com> wrote:
>
>> Hi JB and all,
>>
>> I wanted to follow up on my previous email. The Python streaming issue I
>> mentioned is resolved and removed from the blocker list. The blocker list is
>> empty now. You can go ahead with the release branch cut when you are ready.
>>
>> Thank you,
>> Ahmet
>>
>>
>> On Sun, May 13, 2018 at 8:43 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi guys,
>>>
>>> just to let you know that the build fully passed on my box.
>>>
>>> I'm testing the artifacts right now.
>>>
>>> Regards
>>> JB
>>>
>>> On 06/04/2018 10:48, Jean-Baptiste Onofré wrote:
>>>
>>>> Hi guys,
>>>>
>>>> Apache Beam 2.4.0 has been released on March 20th.
>>>>
>>>> According to our cycle of release (roughly 6 weeks), we should think
>>>> about 2.5.0.
>>>>
>>>> I volunteer to tackle this release.
>>>>
>>>> I'm proposing the following items:
>>>>
>>>> 1. We start the Jira triage now, up to Tuesday
>>>> 2. I would like to cut the release on Tuesday night (Europe time)
>>>> 2bis. I think it's wiser to still use Maven for this release. Do you think we
>>>> will be ready to try a release with Gradle ?
>>>>
>>>> After this release, I would like a discussion about:
>>>> 1. Gradle release (if we release 2.5.0 with Maven)
>>>> 2. Isolate release cycle per Beam part. I think it would be interesting to have
>>>> different release cycle: SDKs, DSLs, Runners, IOs. That's another discussion, I
>>>> will start a thread about that.
>>>>
>>>> Thoughts ?
>>>>
>>>> Regards
>>>> JB
>>>>
>>>>
>>


Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-05-17 Thread Ahmet Altay
Hi JB and all,

I wanted to follow up on my previous email. The Python streaming issue I
mentioned is resolved and removed from the blocker list. The blocker list is
empty now. You can go ahead with the release branch cut when you are ready.

Thank you,
Ahmet


On Sun, May 13, 2018 at 8:43 AM, Jean-Baptiste Onofré 
wrote:

> Hi guys,
>
> just to let you know that the build fully passed on my box.
>
> I'm testing the artifacts right now.
>
> Regards
> JB
>
> On 06/04/2018 10:48, Jean-Baptiste Onofré wrote:
>
>> Hi guys,
>>
>> Apache Beam 2.4.0 has been released on March 20th.
>>
>> According to our cycle of release (roughly 6 weeks), we should think
>> about 2.5.0.
>>
>> I volunteer to tackle this release.
>>
>> I'm proposing the following items:
>>
>> 1. We start the Jira triage now, up to Tuesday
>> 2. I would like to cut the release on Tuesday night (Europe time)
>> 2bis. I think it's wiser to still use Maven for this release. Do you think we
>> will be ready to try a release with Gradle ?
>>
>> After this release, I would like a discussion about:
>> 1. Gradle release (if we release 2.5.0 with Maven)
>> 2. Isolate release cycle per Beam part. I think it would be interesting to have
>> different release cycle: SDKs, DSLs, Runners, IOs. That's another discussion, I
>> will start a thread about that.
>>
>> Thoughts ?
>>
>> Regards
>> JB
>>
>>


Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-05-04 Thread Ahmet Altay
Hi JB,

We found an issue related to using side inputs in streaming mode using
python SDK. Charles is currently trying to find the root cause. Would you
be able to give him some additional time to investigate the issue?

Charles, do you have a JIRA issue on the blocker list?

Thank you everyone for understanding.

Ahmet

On Fri, May 4, 2018 at 8:52 AM, Jean-Baptiste Onofré 
wrote:

> Hi
>
> I have a couple of PRs I would like to include. I would also like to take
> the weekend for new builds and tests.
>
> If it works for everyone I propose to start the release process Tuesday.
>
> Thoughts ?
>
> Regards
> JB
> On May 4, 2018, at 17:49, Scott Wegner wrote:
>>
>> Hi JB, any idea when you will begin the release? Boyuan has a couple
>> Python PRs [1] [2] that are ready to merge, but we'd like to wait until
>> after the release branch is cut in case there is some performance
>> regression.
>>
>> [1] https://github.com/apache/beam/pull/4741
>> [2] https://github.com/apache/beam/pull/4925
>>
>> On Tue, May 1, 2018 at 9:25 AM Scott Wegner  wrote:
>>
>>> Sounds good, thanks J.B. Feel free to ping if you need anything.
>>>
>>> On Mon, Apr 30, 2018 at 10:12 PM Jean-Baptiste Onofré 
>>> wrote:
>>>
 That's a good idea ! I think using Slack to ping/ask is a good way as
 it's async.

 Regards
 JB

 On 05/01/2018 06:51 AM, Reuven Lax wrote:
 > I think it makes sense to have someone who hadn't done the Gradle migration to
 > run the release. However would it make sense for someone who did work on the
 > migration to partner with you JB? There may be issues that are simply due to
 > things that were not documented well. In that case the partner can quickly help
 > resolve, and can then be the one who makes sure that the documentation is updated.
 >
 > Reuven
 >
 > On Mon, Apr 30, 2018 at 9:36 PM Jean-Baptiste Onofré wrote:
 >
 > Hi Scott,
 >
 > Thanks for the update. The Gradle build crashed on my machine (not related to
 > Gradle). I launched a new one.
 >
 > I volunteer to cut the release: I think I know Gradle decently, and even if I
 > didn't work on the gradle "migration" during the last two weeks, I think it's
 > actually better: I have an "external" view on the latest changes.
 >
 > Thoughts ?
 >
 > Regards
 > JB
 >
 > On 05/01/2018 02:05 AM, Scott Wegner wrote:
 > > Welcome back JB!
 > >
 > > I just sent a separate update about Gradle [1]-- the build migration is complete
 > > and the release documentation has been updated.
 > >
 > > I recommend we produce the 2.5.0 release using Gradle. Having a successful
 > > release should be the final validation before declaring the Gradle migration
 > > complete. So the sooner we can have a Gradle release, the sooner we can get back
 > > to a single build system :)
 > >
 > > If it would be helpful, I suggest that somebody who's been working on the Gradle
 > > migration manage the 2.5.0 release. That way if we encounter any issues from
 > > the build system, they should have sufficient expertise to fix it.
 > >
 > >
 > [1] https://lists.apache.org/thread.html/e543b3850bfc4950d57bc18624e1d4131324c6cf691fd10034947cad@%3Cdev.beam.apache.org%3E
 > >
 > > On Mon, Apr 30, 2018 at 11:38 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
 > >
 > > On Apr 30, 2018 19:39, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
 > >
 > > Hi guys,
 > >
 > > now that I'm back from vacations, I bring back 2.5.0 release on the table ;)
 > >
 > > This is also related to the current status of build (Maven/Gradle).
 > >
 > > FYI, I'm going to start the Jira triage tomorrow and I launched a couple of
 > > builds on my machine (both Maven and Gradle) to get an update on the current
 > > status.
 > >
 > > Please, let me know if you have an opinion about Gradle vs Maven for the
 > > release.
 > >
 > >
 > > Produced artifacts are still too different to use gradle IMHO. Jira were
 > > 

Re: [PROPOSAL] Python 3 support

2018-04-18 Thread Ahmet Altay
Robbe, I added you as a contributor to our JIRA. You should be able to
assign issues to yourself. Board will auto update itself based on the
issues. Give it a try.

On Wed, Apr 18, 2018 at 1:15 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
wrote:

> Thanks!
>
> Can someone give me permission to assign issues to myself?
> And edit rights to the Kanban board?
>
> Robbe
>
> On Tue, 17 Apr 2018 at 22:56 Ahmet Altay <al...@google.com> wrote:
>
>> Kanban board for python 3:
>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245
>>
>> (Thank you Davor!)
>>
>> Ahmet
>>
>> On Fri, Apr 6, 2018 at 6:32 PM, Reuven Lax <re...@google.com> wrote:
>>
>>> I had a similar problem.
>>>
>>> On Fri, Apr 6, 2018, 6:23 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> I tried to create a shared kanban board but I failed. I think I am
>>>> lacking some permission to create a shared filter. Could someone help with
>>>> creating this?
>>>>
>>>> The filter I planned to use was "project = BEAM AND (parent = BEAM-2784
>>>> OR parent = BEAM-1251) ORDER BY Rank ASC"
>>>>
>>>> Ahmet
>>>>
>>>> On Fri, Apr 6, 2018 at 5:45 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I don't seem to have the permissions to create a Kanban board or even
>>>>> assign tasks to myself. Who could help me with this?
>>>>>
>>>>> I've updated the coders package pull request [1] and added the applied
>>>>> strategy to the proposal document [2].
>>>>> It would be great to get some feedback on this, so we can start moving
>>>>> forward with other subpackages.
>>>>>
>>>>> Kind regards,
>>>>> Robbe
>>>>>
>>>>> [1] https://github.com/apache/beam/pull/4990
>>>>> [2] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
>>>>>
>>>>> On Mon, 2 Apr 2018 at 21:07 Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>>>> wrote:
>>>>>
>>>>>> Hello Robert,
>>>>>>
>>>>>> I think a Kanban board on Jira as proposed by Ahmet can be helpful
>>>>>> for this. I'll look into setting one up tomorrow.
>>>>>>
>>>>>> In the meantime, you can find the first pull request with the updated
>>>>>> coders package here:
>>>>>> https://github.com/apache/beam/pull/4990
>>>>>>
>>>>>> Kind regards,
>>>>>> Robbe
>>>>>>
>>>>>> On Fri, 30 Mar 2018 at 18:01 Robert Bradshaw <rober...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Mar 30, 2018 at 8:39 AM Robbe Sneyders <
>>>>>>> robbe.sneyd...@ml6.eu> wrote:
>>>>>>>
>>>>>>>> Thanks Ahmet and Robert,
>>>>>>>>
>>>>>>>> I think we can work on different subpackages in parallel, but it's
>>>>>>>> important to apply the same strategy everywhere. I'm currently working on
>>>>>>>> applying step 1 (was mostly done already) and 2 of the proposal to the
>>>>>>>> coders subpackage to create a first pull request. We can then discuss the
>>>>>>>> applied strategy in detail before merging and applying it to the other
>>>>>>>> subpackages.
>>>>>>>>
>>>>>>>
>>>>>>> Sounds good. Again, could you document (in a more permanent/easy to
>>>>>>> look up state than email) when packages are started/done?
>>>>>>>
>>>>>>>
>>>>>>>> This strategy also includes the choice of automated tools. I'm
>>>>>>>> focusing on writing python 3 code with python 2 compatibility, which means
>>>>>>>> depending on the future package instead of the six package (which is
>>>>>>>> already used in some places in the current code base). I have already
>>>>>>>> noticed that this indeed requires a lot of manual work after running 
>>&

Re: Merge options in Github UI are confusing

2018-04-17 Thread Ahmet Altay
I agree with Robert. In this case one size does not fit all. There are
times when another round trip with a contributor would be frustrating to the
author, especially for new contributors. Having the option to squash and
merge is useful in those cases. (For reference, in the past we even helped
new contributors by doing small fixes at merge time.)
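
Editorial sketch: the breakage Andrew reports in the quoted message below (a squash merge breaking PRs based on the merged branch) can be illustrated with a toy commit graph. This is a minimal model, not git itself; the commit names (m0, c1, c2, d1, m1, s1) and helper functions are hypothetical and exist only for this illustration.

```python
def ancestors(commit, parents):
    """Return every commit reachable from `commit` via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def unmerged(branch_tip, master_tip, parents):
    """Commits on the branch that master does not already contain."""
    return ancestors(branch_tip, parents) - ancestors(master_tip, parents)


# Hypothetical history: a PR with commits c1, c2 on top of master commit m0,
# and a dependent branch whose commit d1 is built on top of c2.
base = {"m0": [], "c1": ["m0"], "c2": ["c1"], "d1": ["c2"]}

# "Create a merge commit": m1 has c2 as a parent, so c1 and c2 stay in history.
merge_history = dict(base, m1=["m0", "c2"])

# "Squash and merge": s1 carries the same changes but has no link to c1 or c2.
squash_history = dict(base, s1=["m0"])

print(sorted(unmerged("d1", "m1", merge_history)))   # only d1 is new
print(sorted(unmerged("d1", "s1", squash_history)))  # c1 and c2 look new again
```

After the squash, the dependent branch still carries c1 and c2 as if they were unmerged, which is where the conflicts with the squashed commit come from.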

On Tue, Apr 17, 2018 at 2:28 PM, Robert Bradshaw 
wrote:

> I think the two options are useful, because we have different kinds of
> contributors. Sophisticated users curate their own history, create
> logically useful commits, build atop it, etc. and merge is by far the
> better option. Others have a single commit followed by any number of
> "lint," "fixup," and "reviewer comments" ones that should clearly be
> squashed, and given that it takes a round trip to ask them to squash it,
> it's nice to be able to do it once there's an LGTM as part of the merge. At
> least making this fact explicit and pointing it out in the docs may be
> useful.
> On Tue, Apr 17, 2018 at 1:43 PM Mingmin Xu  wrote:
>
> > Not strongly against `Create a merge commit`, but I use `squash and
> > merge` by default. I understand the potential impact mentioned by Andrew,
> > it's still a better option IMO:
> > 1. if a PR contains several parts, it can be documented in commit message
> > instead of several commits; --If it's a big task, let's split it into
> > several PRs if possible;
> > 2. when several PRs are changing the same file, I would ask contributor
> > to fix it;
> > 3. most commits are introduced by reviewer's ask, it's not necessary to
> > do another squash(by contributors) before merge;
>
> > On Tue, Apr 17, 2018 at 1:09 PM, Robert Burke 
> wrote:
>
> >> +1 Having made a few web commits and been frustrated by the options,
> >> anything to standardize on a single option seems good to me.
>
> >> On Tue, 17 Apr 2018 at 01:49 Etienne Chauchot wrote:
>
> >>> +1 to enforce the behavior recommended in the committer guide. I
> >>> usually ask the author to manually squash before committing.
>
> >>> Etienne
>
> >>> On Monday, April 16, 2018 at 22:19 +, Robert Bradshaw wrote:
>
> >>> +1, though I'll admit I've been an occasional user of the "squash and
> >>> merge" button when a small PR has a huge number of small, fixup changes
> >>> piled on it.
>
> >>> On Mon, Apr 16, 2018 at 3:07 PM Kenneth Knowles wrote:
>
> >>> It is no secret that I agree with this. When you don't rewrite history,
> >>> distributed git "just works". I didn't realize we could mechanically
> >>> enforce it.
>
> >>> Kenn
>
> >>> On Mon, Apr 16, 2018 at 2:55 PM Andrew Pilloud wrote:
>
> >>> The Github UI provides several options for merging a PR hidden behind
> >>> the “Merge pull request” button. Only the “Create a merge commit” option
> >>> does what most users expect, which is to merge by creating a new merge
> >>> commit. This is the option recommended in the Beam committer’s guide, but
> >>> it is not necessarily the default behavior of the merge button.
>
>
> >>> A small cleanup PR I made was recently merged via the merge button
> >>> which generated a squash merge instead of a merge commit, breaking two
> >>> other PRs which were based on it. See
> >>> https://github.com/apache/beam/pull/4991
>
>
> >>> I would propose that we disable the options for both rebase and squash
> >>> merging via the Github UI. This will make the behavior of the merge button
> >>> unambiguous and consistent with our documentation, but will not prevent a
> >>> committer from performing these operations from the git cli if they desire.
>
>
> >>> Andrew
>
>
>
>
>
>
> > --
> > 
> > Mingmin
>


Re: [PROPOSAL] Python 3 support

2018-04-17 Thread Ahmet Altay
Kanban board for python 3:
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245

(Thank you Davor!)

Ahmet

On Fri, Apr 6, 2018 at 6:32 PM, Reuven Lax <re...@google.com> wrote:

> I had a similar problem.
>
> On Fri, Apr 6, 2018, 6:23 PM Ahmet Altay <al...@google.com> wrote:
>
>> I tried to create a shared kanban board but I failed. I think I am
>> lacking some permission to create a shared filter. Could someone help with
>> creating this?
>>
>> The filter I planned to use was "project = BEAM AND (parent = BEAM-2784
>> OR parent = BEAM-1251) ORDER BY Rank ASC"
>>
>> Ahmet
>>
>> On Fri, Apr 6, 2018 at 5:45 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
>> wrote:
>>
>>> Hi all,
>>>
>>> I don't seem to have the permissions to create a Kanban board or even
>>> assign tasks to myself. Who could help me with this?
>>>
>>> I've updated the coders package pull request [1] and added the applied
>>> strategy to the proposal document [2].
>>> It would be great to get some feedback on this, so we can start moving
>>> forward with other subpackages.
>>>
>>> Kind regards,
>>> Robbe
>>>
>>> [1] https://github.com/apache/beam/pull/4990
>>> [2] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
>>>
>>> On Mon, 2 Apr 2018 at 21:07 Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>> wrote:
>>>
>>>> Hello Robert,
>>>>
>>>> I think a Kanban board on Jira as proposed by Ahmet can be helpful for
>>>> this. I'll look into setting one up tomorrow.
>>>>
>>>> In the meantime, you can find the first pull request with the updated
>>>> coders package here:
>>>> https://github.com/apache/beam/pull/4990
>>>>
>>>> Kind regards,
>>>> Robbe
>>>>
>>>> On Fri, 30 Mar 2018 at 18:01 Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Mar 30, 2018 at 8:39 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>>>> wrote:
>>>>>
>>>>>> Thanks Ahmet and Robert,
>>>>>>
>>>>>> I think we can work on different subpackages in parallel, but it's
>>>>>> important to apply the same strategy everywhere. I'm currently working on
>>>>>> applying step 1 (was mostly done already) and 2 of the proposal to the
>>>>>> coders subpackage to create a first pull request. We can then discuss the
>>>>>> applied strategy in detail before merging and applying it to the other
>>>>>> subpackages.
>>>>>>
>>>>>
>>>>> Sounds good. Again, could you document (in a more permanent/easy to
>>>>> look up state than email) when packages are started/done?
>>>>>
>>>>>
>>>>>> This strategy also includes the choice of automated tools. I'm
>>>>>> focusing on writing python 3 code with python 2 compatibility, which means
>>>>>> depending on the future package instead of the six package (which is
>>>>>> already used in some places in the current code base). I have already
>>>>>> noticed that this indeed requires a lot of manual work after running the
>>>>>> automated script.
>>>>>> The future package supports python 3.3+ compatibility, so I don't
>>>>>> think there is a higher cost supporting 3.4 compared to 3.5+.
>>>>>>
>>>>>
>>>>> Sure. It may incur a higher maintenance burden long-term though.
>>>>> (Basically, if we go out the door with 3.4 it's a promise to support it for
>>>>> some time to come.)
>>>>>
>>>>>
>>>>>> I have already added a tox environment to run pylint2 with the --py3k
>>>>>> argument per updated subpackage, which should help avoid regression between
>>>>>> step 2 and step 3 of the proposal. This update will be pushed with the
>>>>>> first pull request.
>>>>>>
>>>>>> Kind regards,
>>>>>> Robbe
>>>>>>
>>>>>>
>>>>>> On Fri, 30 Mar 2018 at 02:22 Robert Bradshaw <rober...@google.com>
>>>>>> wrote:
>&

Re: Gradle Status [April 11]

2018-04-12 Thread Ahmet Altay
> Found another blocker in current artifacts creations: there is no
> pom.xml and pom.properties in META-INF. This is used by tools +
> libraries + integrations so it is quite important to not break it

Romain, is there a JIRA for this issue? If not, could you create one please?
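
Editorial sketch: one way to check a built artifact for the metadata Romain mentions is to list the jar's entries. This assumes the usual Maven layout, where the descriptor files sit under META-INF/maven/<groupId>/<artifactId>/; the function name and the jar path in the usage note are hypothetical, not part of the Beam build.

```python
import zipfile


def maven_metadata_entries(jar_path):
    """Return the pom.xml / pom.properties entries found under META-INF/maven/."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith("META-INF/maven/")
            and name.rsplit("/", 1)[-1] in ("pom.xml", "pom.properties")
        )
```

Calling `maven_metadata_entries("some-artifact.jar")` on a jar that lacks these files returns an empty list, which would reproduce the reported problem.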

On Thu, Apr 12, 2018 at 3:17 AM, Etienne Chauchot 
wrote:

> Nice !
> thanks Kenn
>
> On Wednesday, April 11, 2018 at 18:21 +, Kenneth Knowles wrote:
>
> Initial Nexmark+Gradle run is in, though a hiccup in the Spark runner +
> Netty has been introduced since yesterday. Etienne mentioned he has worked
> toward setting up periodic runs on all runners, so this should help get us
> towards that. We'll probably prefer to build standalone fat jars for
> selected runners and use those, which is pending unknown issues in the
> shadow config leaving out required dependencies.
>
> Kenn
>
> On Wed, Apr 11, 2018 at 10:25 AM Scott Wegner  wrote:
>
> Thanks everyone for the continued effort towards the Gradle migration. As
> a high-level summary of our progress since Friday: we have a viable build,
> with a number of minor issues that we're still working out. Please take a
> look at the new documentation in our contribution guide and log any bugs
> that you find.
>
> Here's a more detailed view of improvements from just the past few days..
>
> Release artifacts:
> * Pom.xml generation logic now in master [1]
> * Nightly snapshots are now produced using Gradle [2]
> * Excluded modules propagated to dependencies when generating pom.xml
> * Artifact JARs are properly shaded [3]
> * Working on fixing dependency scopes in generated pom [4]
> PreCommits / Postcommits:
> * All PreCommits and PostCommits migrated [5]; working on deflaking [6]
> [7] [8] [9]
> * Jenkins results now include JUnit test results [10] and build scan for
> easier debugging [11]
> * Spark ValidatesRunner PostCommit passes [12] [13]
> * Flink ValidatesRunner PostCommit more reliable [14]
> Documentation / IDE Setup:
> * Contribution Guide [15] is now updated with Gradle commands [16] [17]
> Performance Benchmarks:
> * Working on getting Nexmark benchmarks migrated [18]
>
> If I missed anything, please add it to this thread.
>
> We are continuing to use this general roadmap:
> (a) Publish release artifacts with Gradle (SNAPSHOT and signed releases)
> (b) Postcommits migrated to Gradle
> (c) Migrate documentation from maven to Gradle
> (d) Migrate perfkit suites to use Gradle
>
> Migration tasks are tracked as subtasks on BEAM-3249 [19]. Kenn has
> created a separate issue to track post-migration cleanup items:
> BEAM-4045 [20]. Feel free to grab any unassigned items off of either list.
>
>
> [1] https://github.com/apache/beam/pull/5054
> [2] https://github.com/apache/beam/pull/5057
> [3] https://github.com/apache/beam/pull/5087
> [4] https://github.com/apache/beam/pull/5098
> [5] https://github.com/apache/beam/pull/5047
> [6] https://github.com/apache/beam/pull/5088
> [7] https://github.com/apache/beam/pull/5086
> [8] https://github.com/apache/beam/pull/5066
> [9] https://github.com/apache/beam/pull/5059
> [10] https://github.com/apache/beam/pull/5045
> [11] https://github.com/apache/beam/pull/5091
> [12] https://github.com/apache/beam/pull/5093
> [13] https://github.com/apache/beam/pull/5069
> [14] https://github.com/apache/beam/pull/5068
> [15] https://beam.apache.org/contribute/contribution-guide/
> [16] https://github.com/apache/beam-site/pull/412
> [17] https://github.com/apache/beam-site/pull/414
> [18] https://github.com/apache/beam/pull/5051
> [19] https://issues.apache.org/jira/browse/BEAM-3249
> [20] https://issues.apache.org/jira/browse/BEAM-4045
>
> On Fri, Apr 6, 2018 at 9:32 AM Scott Wegner  wrote:
>
> I wanted to start a thread to summarize the current state of Gradle
> migration. We've made lots of good progress so far this week. Here's the
> status from what I can tell-- please add or correct anything I missed:
>
> * Release artifacts can be built and published for Snapshot and official
> releases [1]
> * Gradle-generated releases have been validated with the Apache Beam
> archetype generation quickstart; still needs additional validation.
> * Generated release pom files have correct project metadata [2]
> * The python pre-commits are now working in Gradle [3]
> * Ismaël has started a collaborative doc of Gradle tips [4] as we all
> learn the new system-- please add your own. This will eventually feed into
> official documentation on the website.
> * Łukasz Gajowy is working on migrating performance testing framework [5]
> * Daniel is working on updating documentation to refer to Gradle instead
> of maven
>
> If I missed anything, please add it to this thread.
>
> The general roadmap we're working towards is:
> (a) Publish release artifacts with Gradle (SNAPSHOT and signed releases)
> (b) Postcommits migrated to Gradle
> (c) Migrate documentation from maven to Gradle
> (d) Migrate perfkit 

Re: [PROPOSAL] Python 3 support

2018-04-06 Thread Ahmet Altay
I tried to create a shared kanban board but I failed. I think I am lacking
some permission to create a shared filter. Could someone help with creating
this?

The filter I planned to use was "project = BEAM AND (parent = BEAM-2784 OR
parent = BEAM-1251) ORDER BY Rank ASC"
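
Editorial sketch: the filter above follows a simple pattern (one project, several parent issues), so it can be generated rather than typed by hand. The helper name below is hypothetical; only the resulting JQL string comes from this thread.

```python
def parent_filter(project, parents):
    """Build a JQL filter matching sub-tasks of any of the given parent issues."""
    clause = " OR ".join("parent = {}".format(key) for key in parents)
    return "project = {} AND ({}) ORDER BY Rank ASC".format(project, clause)


print(parent_filter("BEAM", ["BEAM-2784", "BEAM-1251"]))
```

Adding another umbrella issue to the board would then only mean appending its key to the list.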

Ahmet

On Fri, Apr 6, 2018 at 5:45 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
wrote:

> Hi all,
>
> I don't seem to have the permissions to create a Kanban board or even
> assign tasks to myself. Who could help me with this?
>
> I've updated the coders package pull request [1] and added the applied
> strategy to the proposal document [2].
> It would be great to get some feedback on this, so we can start moving
> forward with other subpackages.
>
> Kind regards,
> Robbe
>
> [1] https://github.com/apache/beam/pull/4990
> [2] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
>
> On Mon, 2 Apr 2018 at 21:07 Robbe Sneyders <robbe.sneyd...@ml6.eu> wrote:
>
>> Hello Robert,
>>
>> I think a Kanban board on Jira as proposed by Ahmet can be helpful for
>> this. I'll look into setting one up tomorrow.
>>
>> In the meantime, you can find the first pull request with the updated
>> coders package here:
>> https://github.com/apache/beam/pull/4990
>>
>> Kind regards,
>> Robbe
>>
>> On Fri, 30 Mar 2018 at 18:01 Robert Bradshaw <rober...@google.com> wrote:
>>
>>> On Fri, Mar 30, 2018 at 8:39 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>> wrote:
>>>
>>>> Thanks Ahmet and Robert,
>>>>
>>>> I think we can work on different subpackages in parallel, but it's
>>>> important to apply the same strategy everywhere. I'm currently working on
>>>> applying step 1 (was mostly done already) and 2 of the proposal to the
>>>> coders subpackage to create a first pull request. We can then discuss the
>>>> applied strategy in detail before merging and applying it to the other
>>>> subpackages.
>>>>
>>>
>>> Sounds good. Again, could you document (in a more permanent/easy to look
>>> up state than email) when packages are started/done?
>>>
>>>
>>>> This strategy also includes the choice of automated tools. I'm focusing
>>>> on writing python 3 code with python 2 compatibility, which means depending
>>>> on the future package instead of the six package (which is already used in
>>>> some places in the current code base). I have already noticed that this
>>>> indeed requires a lot of manual work after running the automated script.
>>>> The future package supports python 3.3+ compatibility, so I don't think
>>>> there is a higher cost supporting 3.4 compared to 3.5+.
>>>>
>>>
>>> Sure. It may incur a higher maintenance burden long-term though.
>>> (Basically, if we go out the door with 3.4 it's a promise to support it for
>>> some time to come.)
>>>
>>>
>>>> I have already added a tox environment to run pylint2 with the --py3k
>>>> argument per updated subpackage, which should help avoid regression between
>>>> step 2 and step 3 of the proposal. This update will be pushed with the
>>>> first pull request.
>>>>
>>>> Kind regards,
>>>> Robbe
>>>>
>>>>
>>>> On Fri, 30 Mar 2018 at 02:22 Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>
>>>>> Thank you, Robbie, for your offer to help with contribution here. I
>>>>> read over your doc and the one thing I'd like to add is that this work is
>>>>> very parallelizable, but if we have enough people looking at it we'll want
>>>>> some way to coordinate so as to not overlap work (or just waste time
>>>>> discovering what's been done). Tracking individual JIRAs and PRs gets
>>>>> unwieldy, perhaps a spreadsheet with modules/packages on one axis and the
>>>>> various automated/manual conversions along the other would be helpful?
>>>>>
>>>>> A note on automated tools, they're sometimes overly conservative, so
>>>>> we should be sure to review the changes manually. (A typical example of
>>>>> this is unnecessarily importing six.moves.xrange when there was no big
>>>>> reason to use xrange over range in Python 2, or conversely using
>>>>> list(range(...) in Python 3.)
>>>>>
>>>>> Also, +1 to targett

Re: Python SDK feature set

2018-04-02 Thread Ahmet Altay
On Mon, Apr 2, 2018 at 6:17 PM, Thomas Weise  wrote:

> Hi,
>
> I’m trying to find a summary of the feature set that is currently
> supported in the Python SDK. I understand it is experimental and
> currently only supports a subset of the Beam model like fixed interval
> windows but not merging windows and custom window functions.
>
> I’m specifically interested in stateful processing and timers as basic
> building blocks that could be used with a global window to implement
> session functionality and other higher level abstractions without being
> constrained by window functions and predefined triggers.
>

These are missing. Charles started working on those [1]. We can use any
help we can get if you are interested.


>
> Also since we have the runner capability matrix, would
> it be useful to track SDK capabilities in a similar way so that users
> know what’s supported?
>

I think this would be really helpful. Especially now that we have multiple
SDKs in master. Recently Rafael proposed having per-transform documentation
[2]. Building an SDK capability matrix would be a natural extension of it.

[1] https://issues.apache.org/jira/browse/BEAM-2687
[2]
https://lists.apache.org/thread.html/626427f65a97eb5b40ecb4963202e7bdb43fccf139d82add698a7113@%3Cdev.beam.apache.org%3E



>
> Thanks,
> Thomas
>
>


Re: [ANNOUCEMENT] New Foundation members!

2018-03-30 Thread Ahmet Altay
Congratulations to all of you!

On Fri, Mar 30, 2018, 4:29 PM Pablo Estrada  wrote:

> Congratulations y'all! Very cool.
> Best
> -P.
>
> On Fri, Mar 30, 2018 at 4:09 PM Davor Bonaci  wrote:
>
>> Now that this is public... please join me in welcoming three newly
>> elected members of the Apache Software Foundation with ties to this
>> community, who were elected during the most recent Members' Meeting.
>>
>> * Ismaël Mejía (Beam PMC)
>>
>> * Josh Wills (Crunch Chair; Beam, DataFu PMC)
>>
>> * Holden Karau (Spark, SystemML PMC; Mahout, Subversion committer; Beam
>> contributor)
>>
>> These individuals demonstrated merit in the Foundation's growth, evolution,
>> and progress. They were recognized, nominated, and elected by existing
>> membership for their significant impact to the Foundation as a whole, such
>> as the roots of project-related and cross-project activities.
>>
>> As members, they now become legal owners and shareholders of the
>> Foundation. They can vote for the Board, incubate new projects, nominate
>> new members, participate in any PMC-private discussions, and contribute to
>> any project.
>>
>> (For the Beam community, this election nearly doubles the number of
>> Foundation members. The new members are joining Jean-Baptiste Onofré,
>> Stephan Ewen, Romain Manni-Bucau and myself in this role.)
>>
>> I'm happy to be able to call all three of you my fellow members.
>> Congratulations!
>>
>>
>> Davor
>>
> --
> Got feedback? go/pabloem-feedback
>


Re: [PROPOSAL] Python 3 support

2018-03-27 Thread Ahmet Altay
On Tue, Mar 27, 2018 at 7:12 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

>
> On Tue, Mar 27, 2018 at 4:27 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
> wrote:
>
>> Hi Anand,
>>
>> Thanks for the feedback.
>>
>> It should be no problem to run everything on DataflowRunner as well.
>> Are there any performance tests in place to check for performance
>> regressions?
>>
>
Yes there is a suite (
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PerformanceTests_Python.groovy).
It may not be very comprehensive and seems to have been failing for a while. I
would not block python 3 work on performance for now. That is the
unfortunate state of things.

If anybody in the community is interested, this would be a great
opportunity to help with benchmarks in general.


>
>> Some questions were raised in the proposal document which I want to add
>> to this conversation:
>>
>> The first comment was about the targeted python 3 versions. We proposed
>> to target 3.6 since it is the latest version available and added 3.5
>> because 3.6 adoption seems rather low (hard to find any relevant sources on
>> this though).
>> If the beam community prefers 3.4, I would propose to target 3.4 only
>> during porting and add 3.5 and 3.6 later so we don't slow down the porting
>> progress. 3.4 has the advantage of already being installed on the workers
>> and allows pySpark pipelines to be moved over to beam more easily.
>> It would be great to get some opinions on this.
>>
>
My preference is to support 3.4+. I searched a bit on the web to understand
the usage statistics for python 3, it seems like python 3.4 has ~20% usage
and python 3.4+ has 99% (
https://semaphoreci.com/blog/2017/10/18/python-versions-used-in-commercial-projects-in-2017.html).
Based on that, I think it makes sense to support it.



>
>> Another comment was made on how to avoid regression during the porting
>> progress.
>> After applying step 1 and step 2, no python 3 compatibility lint warnings
>> should remain, so it would be great if we could enforce this check for
>> every pull request on an already updated subpackage.
>> After applying step 3, all tests should run on python 3, so again it
>> would be great if we can enforce these per updated subpackage.
>> Any insights on how to best accomplish this?
>>
> So you can look at some of the recent changes to tox.ini in the git log to
> see what we’ve done so far around this. I suspect you can repeat that same
> pattern.
>

+1 updating tox.ini and adding new checks to run_mini_py3lint.sh would help
a lot to prevent regressions.



>
>> Thanks,
>> Robbe
>>
>> On Fri, 23 Mar 2018 at 19:59 Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you Robbe.
>>>
>>> I reviewed the document, and it looks reasonable to me. I will touch on some
>>> points that were not mentioned:
>>> - Runners exercise different code paths. Doing auto conversions and
>>> focusing on DirectRunner is not enough. It is worthwhile to run things on
>>> DataflowRunner as well. This can be triggered from Jenkins. It will
>>> validate that we are still compatible for python 2.
>>> - Similar to above but with an eye on perf regressions.
>>>
>>> For project tracking on JIRA, please feel free to create any new issues,
>>> close stale ones, or take ownership of any open issues. All JIRAs should be
>>> assigned to the people actively working on them. If you want to track it in
>>> a separate way, you can also propose that. (For example a kanban board is
>>> used for portability effort which is fully supported in JIRA.)
>>>
>>> I will also call out to a few other people in addition to Holden who
>>> helped out or showed interest in helping with Python 3. @cclaus, @luke-zhu,
>>> @udim, @robertwb, @charlesccychen, @tvalentyn. You can include these
>>> people (and myself) for reviews and other questions that you have.
>>>
>>> Welcome again, and looking forward to your contributions.
>>>
>>> Thank you,
>>> Ahmet
>>>
>>>
>>>
>>> On Fri, Mar 23, 2018 at 9:27 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> In the next month(s), my colleague Matthias and I will commit a lot of
>>>> time and effort to python 3 support for beam and we would like to discuss
>>>> the best way to go forward with this.
>>>>
>>>> We have drawn up a document [1] with a high level outline of the
>>>>

Re: Dataflow throwing backend error

2018-03-27 Thread Ahmet Altay
Hi Rajesh,

This looks like a transient error from GCS. Beam SDK will retry tasks in
the face of such errors and those typically do not make your pipeline fail.
If you have additional questions please reach out to Dataflow support (
https://cloud.google.com/dataflow/support).
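
As an illustration of the retry behavior mentioned above, here is a minimal, generic sketch of retrying a transient failure with exponential backoff. It is not the Beam SDK's or Dataflow's actual implementation; TransientError and call_with_retries are hypothetical names introduced only for this example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable backend error (e.g. HTTP 5xx)."""


def call_with_retries(operation, max_attempts=5, base_delay=0.1,
                      sleep=time.sleep):
    """Call operation(), retrying on TransientError with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Double the delay on each attempt and add jitter so that many
            # workers do not retry at exactly the same moment.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

With this shape, an occasional backend error only shows up in the logs, and the work fails only if every attempt fails.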

Thank you,
Ahmet

On Tue, Mar 27, 2018 at 3:58 AM, Rajesh Hegde 
wrote:

> Hi,
> We are getting backend error with Google Cloud Storage service, any idea
> why it happens and how to fix it? Error traceback is pasted below.
>
> *Traceback (most recent call last):*
> *  File
> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py",
> line 609, in do_work*
> *work_executor.execute()*
> *  File
> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line
> 170, in execute*
> *op.finish()*
> *  File "dataflow_worker/native_operations.py", line 93, in
> dataflow_worker.native_operations.NativeWriteOperation.finish*
> *def finish(self):*
> *  File "dataflow_worker/native_operations.py", line 94, in
> dataflow_worker.native_operations.NativeWriteOperation.finish*
> *with self.scoped_finish_state:*
> *  File "dataflow_worker/native_operations.py", line 95, in
> dataflow_worker.native_operations.NativeWriteOperation.finish*
> *self.writer.__exit__(None, None, None)*
> *  File
> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/nativefileio.py",
> line 459, in __exit__*
> *self.file.close()*
> *  File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filesystemio.py",
> line 201, in close*
> *self._uploader.finish()*
> *  File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py", line
> 575, in finish*
> *raise self._upload_thread.last_error  # pylint:
> disable=raising-bad-type*
> *HttpError: HttpError accessing
>  /o?uploadType=resumable=json_id=AEnB2UoK85wrqW1nWJ2cN7DQ5JKdQtTyDX-LwRwgIlIHgVL0KWR8JlUcLOJYFWmv9_YfpVhlKsooB4tHL2cxXch9hVls4nxFnw=temp%2Fbeamapp-bzftmxc-0327102557-645475.1522146357.645609%2F11032707444842841251%2Fdax-tmp-2018-03-27_03_27_30-7388048597187286914-S05-0-adbd1398680cf5c7%2F-shard--try-0c9aa3475bd67907-endshard.json>:
> response: <{'status': '410', 'content-length': '177', 'vary': 'Origin,
> X-Origin', 'server': 'UploadServer', 'x-guploader-uploadid':
> 'AEnB2UoK85wrqW1nWJ2cN7DQ5JKdQtTyDX-LwRwgIlIHgVL0KWR8JlUcLOJYFWmv9_YfpVhlKsooB4tHL2cxXch9hVls4nxFnw',
> 'date': 'Tue, 27 Mar 2018 10:41:17 GMT', 'content-type': 'application/json;
> charset=UTF-8'}>, content <{*
> * "error": {*
> *  "errors": [*
> *   {*
> *"domain": "global",*
> *"reason": "backendError",*
> *"message": "Backend Error"*
> *   }*
> *  ],*
> *  "code": 500,*
> *  "message": "Backend Error"*
> * }*
> *}*
> *>*
>
> --
>
> *Rajesh Hegde | Lead Product Developer | Datalicious*
> *e*: rhe...@datalicious.com | *m*: +919167571827 <+91%2091675%2071827>
> *a*: L-77, 15th Cross Rd, Sector 6, HSR Layout,
> Bangalore Karnataka- 560102
> *w*: www.datalicious.com
>
> *Contact supp...@datalicious.com anytime, we're
> keen to help!*
>


Re: executing the pipeline from datalab

2018-03-23 Thread Ahmet Altay
+ user, dev to bcc

Eila,

Is it possible that you are using an old version? I remember pending was
missing in the dictionary and was added later. If that is not the reason,
could you file a JIRA issue?
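
To make the failure mode concrete, here is a minimal sketch of how a state map without a "pending" entry produces a KeyError, and how a defensive lookup avoids it. The dictionary and function names are hypothetical and only mirror the shape of the traceback below; this is not the actual Beam code.

```python
UNKNOWN = 'UNKNOWN'

# Hypothetical mapping from service job states to pipeline states.
# JOB_STATE_PENDING is deliberately missing, as in the older version.
API_STATE_TO_PIPELINE_STATE = {
    'JOB_STATE_RUNNING': 'RUNNING',
    'JOB_STATE_DONE': 'DONE',
    'JOB_STATE_FAILED': 'FAILED',
}


def pipeline_state_strict(api_state):
    """Direct indexing: raises KeyError for any state not in the mapping."""
    return API_STATE_TO_PIPELINE_STATE[api_state]


def pipeline_state_lenient(api_state):
    """Falls back to UNKNOWN for states the mapping does not know about."""
    return API_STATE_TO_PIPELINE_STATE.get(api_state, UNKNOWN)
```

The strict version fails only when the service reports a state the client does not know about, which would explain why a short job can finish without ever hitting the error.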

Thank you,
Ahmet


On Fri, Mar 23, 2018 at 6:15 AM, Jean-Baptiste Onofré 
wrote:

> Hi Eila,
>
> can you please address this kind of question to the user mailing list ?
>
> Thanks !
> Regards
> JB
>
> On 03/23/2018 02:08 PM, OrielResearch Eila Arich-Landkof wrote:
> > Hello all,
> >
> > When I run the pipeline with 4 samples (very small dataset), I don't get
> any
> > error on DirectRunner or DataflowRunner
> >
> > When I run it with 50 samples dataset, I get the following error for the
> > result.wait_until_finish()
> > What does this error mean?
> > Thanks,
> > Eila
> >
> > KeyError                                  Traceback (most recent call last)
> >  in ()
> >       1 result = pc.run()
> > ----> 2 result.wait_until_finish()
> >
> > /usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc
> > in wait_until_finish(self, duration)
> >     771 while thread.isAlive():
> >     772 time.sleep(5.0)
> > --> 773 if self.state != PipelineState.DONE:
> >     774 # TODO(BEAM-1290): Consider converting this to an error log based on the
> >     775 # resolution of the issue.
> >
> > /usr/local/envs/py2env/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.pyc
> > in state(self)
> >     741 }
> >     742
> > --> 743 return (api_jobstate_map[self._job.currentState] if self._job.currentState
> >     744 else PipelineState.UNKNOWN)
> >     745
> >
> > KeyError: CurrentStateValueValuesEnum(JOB_STATE_PENDING, 9)
> >
> >
> >
> >
> > --
> > Eila
> > www.orielresearch.org 
> > https://www.meetup.com/Deep-Learning-In-Production/
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [PROPOSAL] Python 3 support

2018-03-23 Thread Ahmet Altay
Thank you Robbe.

I reviewed the document, and it looks reasonable to me. I will touch on some
points that were not mentioned:
- Runners exercise different code paths. Doing auto conversions and focusing
on DirectRunner is not enough. It is worthwhile to run things on
DataflowRunner as well. This can be triggered from Jenkins. It will
validate that we are still compatible for python 2.
- Similar to above but with an eye on perf regressions.

For project tracking on JIRA, please feel free to create any new issues,
close stale ones, or take ownership of any open issues. All JIRAs should be
assigned to the people actively working on them. If you want to track it in
a separate way, you can also propose that. (For example a kanban board is
used for portability effort which is fully supported in JIRA.)

I will also call out to a few other people in addition to Holden who helped
out or showed interest in helping with Python 3. @cclaus, @luke-zhu, @udim,
@robertwb, @charlesccychen, @tvalentyn. You can include these people (and
myself) for reviews and other questions that you have.

Welcome again, and looking forward to your contributions.

Thank you,
Ahmet



On Fri, Mar 23, 2018 at 9:27 AM, Robbe Sneyders 
wrote:

> Hello everyone,
>
> In the next month(s), my colleague Matthias and I will commit a lot of
> time and effort to python 3 support for beam and we would like to discuss
> the best way to go forward with this.
>
> We have drawn up a document [1] with a high level outline of the proposed
> approach and would like to get your feedback on this.
>
> The main Jira issue [2] for python 3 support has been mostly inactive for
> the past year. Other smaller issues have been opened, but it's hard to
> track the general progress. It would be great if anyone could offer some
> insights on how to best handle this project on Jira.
>
> @Holden Karau, you seem to have already put in a lot of effort to add
> python 3 support, so it would be great to get your insights and find a way
> to merge our efforts.
>
> Kind regards,
> Robbe
>
> [1] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
> [2] https://issues.apache.org/jira/browse/BEAM-1251
> --
>
> * Robbe Sneyders*
>
> ML6 Gent
> 
>
> M: +32 474 71 31 08 <+32%20474%2071%2031%2008>
>


Re: Python PostCommit Broken

2018-03-23 Thread Ahmet Altay
https://issues.apache.org/jira/browse/BEAM-3922 is the JIRA for tracking
this.

On Fri, Mar 23, 2018 at 10:51 AM, Pablo Estrada  wrote:

> Hello everyone,
> I see that the Python PostCommit has been broken for a couple days. Is
> there a PR / JIRA to track this?
> See breakage: https://builds.apache.org/job/beam_PostCommit_Python_Verify/4472/console
>
> Best
> -P.
> --
> Got feedback? go/pabloem-feedback
> 
>


Re: [PROPOSAL] Scripting extension based on Java JSR-223

2018-03-23 Thread Ahmet Altay
Thank you Ismaël, this looks really cool.

On Fri, Mar 23, 2018 at 5:33 AM, Jean-Baptiste Onofré 
wrote:

> Hi,
>
> it sounds like a very good extension mechanism to PTransform.
>
> +1
>
> Regards
> JB
>
> On 03/23/2018 12:03 PM, Ismaël Mejía wrote:
> > This is a really simple proposal to add an extension with transforms
> > that package the Java Scripting API (JSR-223) [1] to allow users to
> > specialize some transforms via a scripting language. This work was
> > initially created by Romain [2] and I just took it with his
> > authorization and refined it to make it pass all the Beam validations
> > + style. I also added ValueProviders that now allow users to template
> > scripts in Dataflow as well.
> >
> > Notice that Dataflow recently added something similar to create really
> > simple data movement pipelines [3], so maybe the rest of the community
> > can benefit from a similar extension (and eventually dataflow may
> > converge to this implementation).
> >
> > I hope there is interest in this extension, so far we have a
> > ScriptingParDo transform to show the idea, hopefully we can expand
> > this to other transforms.
> >
> > For those interested in more details you can check the Jira issue [4]
> > and the PR [5].
> >
> > [1] https://www.jcp.org/en/jsr/detail?id=223
> > [2] https://github.com/rmannibucau/beam-jsr223
> > [3] https://cloud.google.com/blog/big-data/2018/03/pre-built-cloud-dataflow-templates-kiss-for-data-movement
> > [4] https://issues.apache.org/jira/browse/BEAM-3921
> > [5] https://github.com/apache/beam/pull/4944
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Apache beam DataFlow runner throwing setup error

2018-03-23 Thread Ahmet Altay
Hi Rajesh,

Have you looked at the worker-startup logs [1]? You should be able to see
the setup error there. It is possible that something in your requirements
file is failing to install in the workers. If that is the case,
see Managing Python Pipeline Dependencies [2] for alternative options. You
could also reach out to Google Cloud Dataflow support for
additional help [3].

Thank you,
Ahmet

[1]
https://cloud.google.com/dataflow/pipelines/logging#monitoring-pipeline-logs
[2] https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
[3] https://cloud.google.com/dataflow/support

On Thu, Mar 22, 2018 at 10:08 PM, Rajesh Hegde 
wrote:

> Hi,
> We are building data pipeline using Beam Python SDK and trying to run on
> Dataflow, but getting the below error,
>
> *A setup error was detected in
> beamapp--0322102737-03220329-8a74-harness-lm6v. Please refer to the
> worker-startup log for detailed information.*
>
> But could not find detailed worker-startup logs.
>
> We tried increasing memory size, worker count etc, but still getting the
> same error.
>
> Here is the command we use,
> *python run.py \*
> *--project=xyz \*
> *--runner=DataflowRunner \*
> *--staging_location=gs://xyz/staging \*
> *--temp_location=gs://xyz/temp \*
> *--requirements_file=requirements.txt \*
> *--worker_machine_type n1-standard-8 \*
> *--num_workers 2*
>
>
> pipeline snippet
>
> *data = pipeline | "load data" >> beam.io.Read(*
> *beam.io.BigQuerySource(query="SELECT * FROM abc_table LIMIT 100")*
> *)*
>
> *data | "filter data" >> beam.Filter(lambda x: x.get('column_name') ==
> value)*
>
>
> Above pipeline is just loading the data from BigQuery and filtering based
> on some column value. This pipeline works like a charm in DirectRunner but
> fails on Dataflow.
>
> Are we making an obvious setup mistake? Is anyone else getting the same
> error? We could use some help to resolve the issue.
>
>
> --
>
> *Rajesh Hegde | Lead Product Developer | Datalicious*
> *e*: rhe...@datalicious.com | *m*: +919167571827 <+91%2091675%2071827>
> *a*: L-77, 15th Cross Rd, Sector 6, HSR Layout,
> Bangalore Karnataka- 560102
> *w*: www.datalicious.com
>
> *Contact supp...@datalicious.com anytime, we're
> keen to help!*
>


Re: Pubsub API feedback

2018-03-20 Thread Ahmet Altay
Thank you Udi. Left some high level comments on the PR.


On Mon, Mar 19, 2018 at 5:13 PM, Udi Meiri  wrote:

> Hi,
> I wanted to get feedback about the upcoming Python Pubsub API. It is
> currently experimental and only supports reading and writing UTF-8 strings.
> My current proposal only concerns reading from Pubsub.
>
> Classes:
> - PubsubMessage: encapsulates Pubsub message payload and attributes.
>
> PTransforms:
> - ReadMessagesFromPubSub: Outputs elements of type ``PubsubMessage``.
>
> - ReadPayloadsFromPubSub: Outputs elements of type ``str``.
>
> - ReadStringsFromPubSub: Outputs elements of type ``unicode``, decoded
> from UTF-8.
>
> Description of common PTransform arguments:
>   topic: Cloud Pub/Sub topic in the form
> "projects/<project>/topics/<topic>".
> If provided, subscription must be None.
>   subscription: Existing Cloud Pub/Sub subscription to use in the
> form "projects/<project>/subscriptions/<subscription>". If not specified,
> a temporary subscription will be created from the specified topic. If
> provided, topic must be None.
>   id_label: The attribute on incoming Pub/Sub messages to use as a unique
> record identifier. When specified, the value of this attribute (which
> can be any string that uniquely identifies the record) will be used for
> deduplication of messages. If not provided, we cannot guarantee
> that no duplicate data will be delivered on the Pub/Sub stream. In this
> case, deduplication of the stream will be strictly best effort.
>   timestamp_attribute: Message value to use as element timestamp. If None,
> uses message publishing time as the timestamp.
> Timestamp values should be in one of two formats:
> - A numerical value representing the number of milliseconds since the
> Unix
>   epoch.
> - A string in RFC 3339 format. For example,
>   {@code 2015-10-29T23:41:41.123Z}. The sub-second component of the
>   timestamp is optional, and digits beyond the first three (i.e., time
> units
>   smaller than milliseconds) will be ignored.
>
> Code: https://github.com/udim/beam/blob/b981dd618e9e1f667597eec2a91c7265a389c405/sdks/python/apache_beam/io/gcp/pubsub.py
> PR: https://github.com/apache/beam/pull/4901
>
>
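
The two timestamp_attribute formats described in the proposal above can be sketched as follows. This is only an illustration of the stated rules (milliseconds since the Unix epoch, or an RFC 3339 string whose digits beyond milliseconds are ignored). The function name is hypothetical, the code is not taken from the linked PR, and only the "Z" (UTC) form from the example is handled.

```python
import re
from datetime import datetime, timezone

# Matches the UTC form shown in the proposal, e.g. 2015-10-29T23:41:41.123Z.
_RFC3339_UTC = re.compile(
    r'^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?Z$')


def parse_timestamp_attribute(value):
    """Return milliseconds since the Unix epoch for either accepted format."""
    if value.isdigit():
        # Format 1: a numerical value in milliseconds since the Unix epoch.
        return int(value)
    match = _RFC3339_UTC.match(value)
    if match is None:
        raise ValueError('unsupported timestamp value: %r' % value)
    year, month, day, hour, minute, second = (
        int(group) for group in match.groups()[:6])
    # The sub-second part is optional; digits beyond the first three
    # (units smaller than milliseconds) are ignored.
    millis = int((match.group(7) or '')[:3].ljust(3, '0'))
    whole_seconds = datetime(
        year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(whole_seconds.timestamp()) * 1000 + millis
```

Both formats reduce to the same integer representation, so the rest of a pipeline would not need to know which one the publisher used.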


Re: [VOTE] Release 2.4.0, release candidate #3

2018-03-19 Thread Ahmet Altay
I was able to run hourly_team_score. I was passing a wrong argument. No
need for an alarm. :)

On Mon, Mar 19, 2018 at 5:33 PM, Ahmet Altay <al...@google.com> wrote:

> +1 Thank you Robert.
>
> Verified python mobile gaming examples using the wheel files on direct
> runner. Got user_score working but hourly_team_score failed with (
> https://issues.apache.org/jira/browse/BEAM-3824). Since this is an
> example, I think it is fine to continue with the release. I will work on
> fixing the example post release.
>
> On Mon, Mar 19, 2018 at 2:46 PM, Konstantinos Katsiapis <
> katsia...@google.com> wrote:
>
>> +1, since Tf.Transform <https://github.com/tensorflow/transform> 0.6
>> depends on (and is blocked by) Beam 2.4
>>
>> On Sat, Mar 17, 2018 at 2:19 AM, Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Please review and vote on the release candidate #3 for the version 2.4.0,
>>> as follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release to be deployed to dist.apache.org
>>> [2],
>>> which is signed with the key with fingerprint BDC9 89B0 1BD2 A463 6010
>>> A1CA
>>> 8F15 5E09 610D 69FB [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "v2.4.0-RC3" [5],
>>> * website pull request listing the release and publishing the API
>>> reference
>>> manual [6].
>>> * Java artifacts were built with Maven 3.2.5 and OpenJDK 1.8.0_112.
>>> * Python artifacts are deployed along with the source release to the
>>> dist.apache.org [2].
>>>
>>> The validation spreadsheet is available at
>>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit?ts=5a1c7310#gid=1663314475
>>>
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>>
>>> Thanks,
>>> - Robert
>>>
>>> [1]
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12342682=12319527
>>> [2] https://dist.apache.org/repos/dist/dev/beam/2.4.0/
>>> [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1031/
>>> [5] https://github.com/apache/beam/tree/v2.4.0-RC3
>>> [6] https://github.com/apache/beam-site/pull/398
>>>
>>
>>
>>
>> --
>> Gus Katsiapis | Software Engineer | katsia...@google.com | 650-918-7487
>> <(650)%20918-7487>
>>
>
>


Re: [VOTE] Release 2.4.0, release candidate #3

2018-03-19 Thread Ahmet Altay
+1 Thank you Robert.

Verified python mobile gaming examples using the wheel files on direct
runner. Got user_score working but hourly_team_score failed with (
https://issues.apache.org/jira/browse/BEAM-3824). Since this is an example,
I think it is fine to continue with the release. I will work on fixing the
example post release.

On Mon, Mar 19, 2018 at 2:46 PM, Konstantinos Katsiapis <
katsia...@google.com> wrote:

> +1, since Tf.Transform 0.6
> depends on (and is blocked by) Beam 2.4
>
> On Sat, Mar 17, 2018 at 2:19 AM, Robert Bradshaw 
> wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #3 for the version 2.4.0,
>> as follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org
>> [2],
>> which is signed with the key with fingerprint BDC9 89B0 1BD2 A463 6010
>> A1CA
>> 8F15 5E09 610D 69FB [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v2.4.0-RC3" [5],
>> * website pull request listing the release and publishing the API
>> reference
>> manual [6].
>> * Java artifacts were built with Maven 3.2.5 and OpenJDK 1.8.0_112.
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2].
>>
>> The validation spreadsheet is available at
>> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit?ts=5a1c7310#gid=1663314475
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> - Robert
>>
>> [1]
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12342682=12319527
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.4.0/
>> [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1031/
>> [5] https://github.com/apache/beam/tree/v2.4.0-RC3
>> [6] https://github.com/apache/beam-site/pull/398
>>
>
>
>
> --
> Gus Katsiapis | Software Engineer | katsia...@google.com | 650-918-7487
> <(650)%20918-7487>
>


Re: Contributing

2018-03-15 Thread Ahmet Altay
Hi Austin,

It was great meeting with you. We mentioned a list of starter bugs, here is
that list [1]. It might give you some ideas on where to start.

[1]
https://issues.apache.org/jira/browse/BEAM-3844?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20in%20(starter%2C%20newbie)%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC


On Thu, Mar 15, 2018 at 6:11 PM, Jean-Baptiste Onofré 
wrote:

> Hi
>
> Great to see you yesterday and welcome aboard !
>
> I'm looking forward to your contributions ! Please ping me or the team if you
> need help or guidance !
>
> Regards
> JB
> On March 15, 2018, at 17:31, Austin Bennett wrote:
>>
>> Hi All,
>>
>> Enjoyed meeting many of you yesterday, and look forward to helping the
>> project!
>>
>> I hope in the next week or two to submit some bit of code, and then
>> [slowly] to get up to speed on conventions, guidelines, appropriate style,
>> etc.  Ideally, with some guidance.
>>
>> Also, should we be able to identify areas that need contributions which
>> I'm more currently equipped to handle (ex: I heard a bunch around tracking,
>> metrics, data and notifications) would be happy to address.  This might be
>> more concrete/substantive/worthwhile contribution in the short run.  Not
>> even sure where would seek that info (as not quite JIRA tickets as I think
>> of them, but will look there), nor what sort of permissions would be needed
>> for such things.
>>
>
I think this is a great idea. JIRA is a good place to track these ideas.
You should be able to file issues in Beam JIRA. Give it a try and if it
does not work we are here to help.

Ahmet


>
>> Best,
>> Austin
>>
>>
>>


Re: Proposal: build Python wheel distributions for Apache Beam releases

2018-03-07 Thread Ahmet Altay
I do not know what the best practice is. For practical purposes it makes
sense to stage to the same svn repo, so that we can test it as part of the
release process.

On Wed, Mar 7, 2018 at 4:22 PM, Robert Bradshaw <rober...@google.com> wrote:

> Yes, we should. There's a bit of an open question of where these release
> artifacts should be staged. (Eventually, of course, they'll be published to
> PyPI). Should they be placed alongside the source artifacts in the svn
> repository?
>
>
> On Wed, Mar 7, 2018 at 3:00 PM Ahmet Altay <al...@google.com> wrote:
>
>> Are we planning to do this for the 2.4.0 release? I am asking, because
>> they were not part of RC1 artifacts.
>>
>> On Tue, Feb 13, 2018 at 9:18 AM, Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> On Tue, Feb 13, 2018 at 8:31 AM, Nima Mousavi <nima.mous...@gmail.com>
>>> wrote:
>>> > Related question:
>>> >
>>> > How can we tell if the docker image of our binary contains the cython
>>> > optimized beam or the slower codepath?
>>> > The image was built on Google cloud (using gcloud container builds
>>> submit).
>>>
>>> There are certain modules (corresponding to the pyx files) that are
>>> only built if Cython is present. We can (1) make sure Cython is
>>> installed before installing apache beam into the container, and (2)
>>> assert as part of the build process that these modules exist.
>>>
>>> > On Mon, Feb 12, 2018 at 9:32 PM, Ahmet Altay <al...@google.com> wrote:
>>> >>
>>> >> +1 to wheels. The main effort for this would be updating the release
>>> >> guide, and adding support for other platforms in Jenkins for building
>>> and
>>> >> testing wheels.  In light of this, maybe we can prioritize having test
>>> >> infrastructure for other platforms.
>>> >>
>>> >> On Mon, Feb 12, 2018 at 1:47 PM, Ismaël Mejía <ieme...@gmail.com>
>>> wrote:
>>> >>>
>>> >>> +1 for wheels, they are the standard binary distribution format so it
>>> >>> makes sense. Also wheels support packaging python 2 and 3 on
>>> universal
>>> >>> packages so they are future proof.
>>> >>>
>>> >>> On Mon, Feb 12, 2018 at 10:26 PM, Robert Bradshaw <
>>> rober...@google.com>
>>> >>> wrote:
>>> >>> > +1, is it too late to try to release these as part of the 2.3
>>> release
>>> >>> > (to get familiar with the process, no code changes should be
>>> needed)?
>>> >>
>>> >>
>>> >> It would be nice to have this for the current release. How can we
>>> build
>>> >> and test these binaries? I think it will be prudent to wait until we
>>> have
>>> >> infrastructure.
>>> >>
>>> >>>
>>> >>> >
>>> >>> > The wheels are advantageous when running locally (e.g. during
>>> testing
>>> >>> > and development) where requiring containers will probably be
>>> overkill.
>>> >>> > This will become especially relevant with the switch to use the
>>> >>> > FnApiRunner.
>>> >>> >
>>> >>> > On Mon, Feb 12, 2018 at 1:22 PM, Lukasz Cwik <lc...@google.com>
>>> wrote:
>>> >>> >> If we want all our code related to pipeline execution to be in a
>>> >>> >> container,
>>> >>> >> what value does building wheel distributions provide?
>>> >>> >>
>>> >>> >>
>>> >>> >> On Mon, Feb 12, 2018 at 1:18 PM, Kenneth Knowles <k...@google.com>
>>> >>> >> wrote:
>>> >>> >>>
>>> >>> >>> +1
>>> >>> >>>
>>> >>> >>> On Mon, Feb 12, 2018 at 1:04 PM, Charles Chen <c...@google.com>
>>> wrote:
>>> >>> >>>>
>>> >>> >>>> Currently, Apache Beam distributes Python packages through pip
>>> and
>>> >>> >>>> PyPI.
>>> >>> >>>> On PyPI, developers can release either source tarballs, and / or
>>> >>> >>>> precompiled
>>> >>> >>>> "wheel" distributions for each platform, which would be used if
>>> >>> >>>> available
>>> >>> >>>> for a particular platform.  Currently, we only distribute the
>>> source
>>> >>> >>>> tarballs, so any user who installs Beam using "pip install
>>> >>> >>>> apache_beam" has
>>> >>> >>>> to have a compiler and toolchain installed to take advantage of
>>> >>> >>>> Cython
>>> >>> >>>> optimizations in Beam (which require compiled C code).  If such
>>> a
>>> >>> >>>> compiler
>>> >>> >>>> is not available, Beam is currently configured to install
>>> anyway,
>>> >>> >>>> but will
>>> >>> >>>> use slower Python codepaths instead of the more optimized ones
>>> (for
>>> >>> >>>> example,
>>> >>> >>>> for Coder encoding / decoding).
>>> >>> >>>>
>>> >>> >>>> I would like to propose that we start distributing binary wheel
>>> >>> >>>> distributions for our releases, for common platforms like
>>> Windows /
>>> >>> >>>> Mac /
>>> >>> >>>> Linux.  We could potentially use a method similar to this one
>>> >>> >>>> (https://github.com/MacPython/cython-wheels) for building these
>>> >>> >>>> wheel
>>> >>> >>>> distributions.  Thoughts?
>>> >>> >>>>
>>> >>> >>>> Best,
>>> >>> >>>> Charles
>>> >>> >>>
>>> >>> >>>
>>> >>> >>
>>> >>
>>> >>
>>> >
>>>
>>
>>


Re: Proposal: build Python wheel distributions for Apache Beam releases

2018-03-07 Thread Ahmet Altay
Are we planning to do this for the 2.4.0 release? I am asking because they
were not part of the RC1 artifacts.

On Tue, Feb 13, 2018 at 9:18 AM, Robert Bradshaw <rober...@google.com>
wrote:

> On Tue, Feb 13, 2018 at 8:31 AM, Nima Mousavi <nima.mous...@gmail.com>
> wrote:
> > Related question:
> >
> > How can we tell if the docker image of our binary contains the cython
> > optimized beam or the slower codepath?
> > The image was built on Google cloud (using gcloud container builds
> submit).
>
> There are certain modules (corresponding to the pyx files) that are
> only built if Cython is present. We can (1) make sure Cython is
> installed before installing apache beam into the container, and (2)
> assert as part of the build process that these modules exist.
>
> > On Mon, Feb 12, 2018 at 9:32 PM, Ahmet Altay <al...@google.com> wrote:
> >>
> >> +1 to wheels. The main effort for this would be updating the release
> >> guide, and adding support for other platforms in Jenkins for building
> and
> >> testing wheels.  In light of this, maybe we can prioritize having test
> >> infrastructure for other platforms.
> >>
> >> On Mon, Feb 12, 2018 at 1:47 PM, Ismaël Mejía <ieme...@gmail.com>
> wrote:
> >>>
> >>> +1 for wheels, they are the standard binary distribution format so it
> >>> makes sense. Also wheels support packaging python 2 and 3 on universal
> >>> packages so they are future proof.
> >>>
> >>> On Mon, Feb 12, 2018 at 10:26 PM, Robert Bradshaw <rober...@google.com
> >
> >>> wrote:
> >>> > +1, is it too late to try to release these as part of the 2.3 release
> >>> > (to get familiar with the process, no code changes should be needed)?
> >>
> >>
> >> It would be nice to have this for the current release. How can we build
> >> and test these binaries? I think it will be prudent to wait until we
> have
> >> infrastructure.
> >>
> >>>
> >>> >
> >>> > The wheels are advantageous when running locally (e.g. during testing
> >>> > and development) where requiring containers will probably be
> overkill.
> >>> > This will become especially relevant with the switch to use the
> >>> > FnApiRunner.
> >>> >
> >>> > On Mon, Feb 12, 2018 at 1:22 PM, Lukasz Cwik <lc...@google.com>
> wrote:
> >>> >> If we want all our code related to pipeline execution to be in a
> >>> >> container,
> >>> >> what value does building wheel distributions provide?
> >>> >>
> >>> >>
> >>> >> On Mon, Feb 12, 2018 at 1:18 PM, Kenneth Knowles <k...@google.com>
> >>> >> wrote:
> >>> >>>
> >>> >>> +1
> >>> >>>
> >>> >>> On Mon, Feb 12, 2018 at 1:04 PM, Charles Chen <c...@google.com>
> wrote:
> >>> >>>>
> >>> >>>> Currently, Apache Beam distributes Python packages through pip and
> >>> >>>> PyPI.
> >>> >>>> On PyPI, developers can release either source tarballs, and / or
> >>> >>>> precompiled
> >>> >>>> "wheel" distributions for each platform, which would be used if
> >>> >>>> available
> >>> >>>> for a particular platform.  Currently, we only distribute the
> source
> >>> >>>> tarballs, so any user who installs Beam using "pip install
> >>> >>>> apache_beam" has
> >>> >>>> to have a compiler and toolchain installed to take advantage of
> >>> >>>> Cython
> >>> >>>> optimizations in Beam (which require compiled C code).  If such a
> >>> >>>> compiler
> >>> >>>> is not available, Beam is currently configured to install anyway,
> >>> >>>> but will
> >>> >>>> use slower Python codepaths instead of the more optimized ones
> (for
> >>> >>>> example,
> >>> >>>> for Coder encoding / decoding).
> >>> >>>>
> >>> >>>> I would like to propose that we start distributing binary wheel
> >>> >>>> distributions for our releases, for common platforms like Windows
> /
> >>> >>>> Mac /
> >>> >>>> Linux.  We could potentially use a method similar to this one
> >>> >>>> (https://github.com/MacPython/cython-wheels) for building these
> >>> >>>> wheel
> >>> >>>> distributions.  Thoughts?
> >>> >>>>
> >>> >>>> Best,
> >>> >>>> Charles
> >>> >>>
> >>> >>>
> >>> >>
> >>
> >>
> >
>
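
The build-time assertion suggested in this thread (checking that the
Cython-built modules exist) could look roughly like the sketch below. It is
only an illustration: the helper names are made up for this example, and the
module name in the usage note is an assumption, not something confirmed in
the thread.

```python
import importlib.machinery
import importlib.util


def is_compiled_extension(module_name):
    """Return True if module_name resolves to a compiled extension file."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # The parent package is missing or the module has no usable spec.
        return False
    if spec is None or not spec.origin:
        return False
    # EXTENSION_SUFFIXES lists the platform's suffixes, e.g. ".so" or ".pyd".
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return spec.origin.endswith(suffixes)


def assert_compiled(module_names):
    """Raise AssertionError naming every module that is not compiled."""
    missing = [name for name in module_names if not is_compiled_extension(name)]
    assert not missing, "not compiled: %s" % ", ".join(missing)
```

A container build could call assert_compiled(["apache_beam.coders.stream"])
after installing Beam (the module name is assumed here) and fail the build
when the slower pure-Python codepath would be used.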


Re: [VOTE] Release 2.4.0, release candidate #1

2018-03-07 Thread Ahmet Altay
-1 for the same reason as Ismaël. Python version is not updated in the
release branch [1].

[1]
https://github.com/apache/beam/blob/release-2.4.0/sdks/python/apache_beam/version.py#L21

On Wed, Mar 7, 2018 at 8:39 AM, Jean-Baptiste Onofré 
wrote:

> No it's not (I'm testing the release right now), I just was curious and
> noticed
> the missing details ;)
>
> Thanks !
> Regards
> JB
>
> On 03/07/2018 05:17 PM, Robert Bradshaw wrote:
> > On Wed, Mar 7, 2018 at 12:50 AM Jean-Baptiste Onofré wrote:
> >
> > For the record, the vote e-mail doesn't contain actual MAVEN_VERSION
> and
> > JDK_VERSION used to build.
> >
> >
> > Sorry, it's Apache Maven 3.2.5 with Java version: 1.8.0_112. Hopefully
> this
> > isn't a deciding factor :).
> >
> >
> > Regards
> > JB
> >
> > On 03/07/2018 09:44 AM, Robert Bradshaw wrote:
> > > Hi everyone,
> > >
> > > Please review and vote on the release candidate #1 for the version
> 2.4.0,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific
> comments)
> > >
> > > The complete staging area is available for your review, which
> includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to
> dist.apache.org
> >  [2],
> > > which is signed with the key with fingerprint BDC9 89B0 1BD2 A463
> 6010
> > >   A1CA 8F15 5E09 610D 69FB [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag "v2.4.0-RC1" [5],
> > > * website pull request listing the release and publishing the API
> reference
> > > manual [6].
> > > * Java artifacts were built with Maven MAVEN_VERSION and
> OpenJDK/Oracle JDK
> > > JDK_VERSION.
> > > * Python artifacts are deployed along with the source release to
> the
> > > dist.apache.org  [2].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by
> majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > - Robert
> > >
> > > [1]
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> version=12342682=12319527
> > > [2] https://dist.apache.org/repos/dist/dev/beam/2.4.0/
> > > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> > > [4] https://repository.apache.org/content/repositories/
> orgapachebeam-1028/
> > > [5] https://github.com/apache/beam/tree/v2.4.0-RC1
> > > [6] https://github.com/apache/beam-site/pull/398
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org 
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-02 Thread Ahmet Altay
That is my understanding as well; it requires attention from infra.
Could anyone help with this? I know we worked with infra before, what is
the best way to approach this?

On Fri, Mar 2, 2018 at 9:50 AM, Holden Karau <holden.ka...@gmail.com> wrote:

> I agree, however I'm of the impression it's blocked on infra? (e.g. it's
> important but out of my hands).
>
> On Mar 1, 2018 11:05 PM, "Ahmet Altay" <al...@google.com> wrote:
>
>> I think we should prioritize the issue of installing Python 3 on the
>> workers (https://issues.apache.org/jira/browse/BEAM-3671). I would
>> appreciate if folks pay attention to these 2 steps but I am worried that it
>> will be easily forgotten.
>>
>> On Thu, Mar 1, 2018 at 6:51 PM, Holden Karau <hol...@pigscanfly.ca>
>> wrote:
>>
>>> I may have watched too many buzzfeed videos this week but the steps are:
>>> 1) git checkout the PR in question
>>> 2) Run tox -e lint_py2,lint_py3
>>>
>>> This is important since Python 3 isn't installed on the Jenkins workers
>>> just yet and we have some tests to catch basic invalid Python 3 which we
>>> can slowly grow as we fix the issues and you can help us keep moving
>>> forward!
>>>
>>> If step 1 is too much work, I like using the hub program; I find it helps
>>> me with this part of my workflow in other projects. That being said you
>>> don't have to do this, we'll fix whatever errors come up, so if this is
>>> going to slow your workflow down or you otherwise don't like it feel free
>>> to pass along.
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>


Re: Merging Python code? Help avoid Python 3 regressions with these two simple steps :)

2018-03-01 Thread Ahmet Altay
I think we should prioritize the issue of installing Python 3 on the
workers (https://issues.apache.org/jira/browse/BEAM-3671). I would
appreciate if folks pay attention to these 2 steps but I am worried that it
will be easily forgotten.

On Thu, Mar 1, 2018 at 6:51 PM, Holden Karau  wrote:

> I may have watched too many buzzfeed videos this week but the steps are:
> 1) git checkout the PR in question
> 2) Run tox -e lint_py2,lint_py3
>
> This is important since Python 3 isn't installed on the Jenkins workers
> just yet and we have some tests to catch basic invalid Python 3 which we
> can slowly grow as we fix the issues and you can help us keep moving
> forward!
>
> If step 1 is too much work, I like using the hub program; I find it helps me
> with this part of my workflow in other projects. That being said you don't
> have to do this, we'll fix whatever errors come up, so if this is going to
> slow your workflow down or you otherwise don't like it feel free to pass
> along.
>
> --
> Twitter: https://twitter.com/holdenkarau
>
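
The idea of catching "basic invalid Python 3" can be sketched with the
built-in compile() function. This is not what the tox lint environments
above actually run; it is a minimal illustration, and the helper name and
the sample file names are hypothetical.

```python
def find_syntax_errors(sources):
    """Map each filename to an error message if its source fails to compile.

    sources is a dict of filename -> source text. Only syntax is checked;
    the code is never executed.
    """
    errors = {}
    for filename, text in sources.items():
        try:
            compile(text, filename, "exec")
        except SyntaxError as exc:
            errors[filename] = "line %s: %s" % (exc.lineno, exc.msg)
    return errors


if __name__ == "__main__":
    samples = {
        "ok.py": "print('hello')\n",
        "py2_only.py": "print 'hello'\n",  # print statement is invalid in Python 3
    }
    for name, message in sorted(find_syntax_errors(samples).items()):
        print(name, message)
```

Run under a Python 3 interpreter, this flags Python 2 only constructs such as
the print statement without needing Python 3 on the Jenkins workers.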


Re: Instructions to install Python SDK from source

2018-02-28 Thread Ahmet Altay
Hi Pablo,

You have a great point. The getting-started instructions for Python SDK
developers are not well documented, and installing the Python SDK from source
is one part of that gap. There is a JIRA for this (
https://issues.apache.org/jira/browse/BEAM-3075). It would be great if you
can take that and add some documentation. It could also contain information
on running tests, setting up an IDE etc. I think it would be better to
update the website as the single source of documentation, rather than
having individual README files spread across our repository. It would also
be more accessible for new developers coming into Beam.

Thank you,
Ahmet

On Wed, Feb 28, 2018 at 4:23 PM, Pablo Estrada  wrote:

> Hello all,
> I noticed that the README for Beam has some basic instructions on
> installing Beam from source using mvn. Do these instructions work also to
> install Python SDK?
> I have my own script to install the Python SDK from source reliably, but I
> recently noticed that there are no instructions to install exclusively the
> Python SDK within the Beam docs. (Or perhaps I missed it?)
> I feel like we should document this better, so I was wondering what the
> community thinks about:
>
> 1) Having a README file within sdks/python/ with this and other important
> python documentation,
> 2) and/or only having an entry in the Beam docs.
>
> Best
> -P.
> --
> Got feedback? go/pabloem-feedback
> 
>


Re: Python 3 flake 8: splitting up on the errors?

2018-02-28 Thread Ahmet Altay
I think this is a great idea. I would encourage everyone who would like to
help with Python 3 migration to help with this effort. Holden, if you
already have a list, could you either share the list or create individual
JIRAs so that we can track the work among us.

On Tue, Feb 27, 2018 at 4:53 PM, Holden Karau  wrote:

> How would folks feel about splitting up some of the Python 3 migration
> work by the different flake8 errors in Py3? This might allow us to
> parallelize some of the work while still keeping things fairly small?
>
> --
> Twitter: https://twitter.com/holdenkarau
>


Re: Proposed improvements to our documentation

2018-02-28 Thread Ahmet Altay
+1

I think this is a great idea, it can also serve as an inventory of where a
language might be lacking in transforms and provide a good starting point
for new contributors to fill in those gaps by looking at the existing Java
implementations.

On Wed, Feb 28, 2018 at 10:53 AM, Lukasz Cwik  wrote:

> +1
>
> On Wed, Feb 28, 2018 at 10:46 AM, Kenneth Knowles  wrote:
>
>> Yes! I love the idea of having a good cross-language transform reference
>> on the web site. Very good idea to get started now and provide the
>> skeleton, then fill out additional transforms and additional languages
>> incrementally.
>>
>> Kenn
>>
>> On Wed, Feb 28, 2018 at 10:23 AM, Rafael Fernandez 
>> wrote:
>>
>>> Hi folks,
>>>
>>> I think we've all seen a few areas of improvement here and there in our
>>> docs. For example, one can find a Javadoc entry with outdated content
>>> here and there [1], or "sample" code snippets that have problems, such as
>>> not compiling [2].
>>>
>>> I think a good thing to do is to invest in extending our documentation
>>> to having a robust per-transform reference, which has samples and a good
>>> description of what the transform does, and keep JavaDoc as a solid source
>>> of API documentation. I believe similar approaches can benefit Python and
>>> other languages.
>>>
>>> What do you think? I'm happy to spend some time now and then and
>>> incrementally move in this direction. I would like some help from the
>>> community with reviews, suggestions (and perhaps picking up associated
>>> JIRAs as I file them.) Good idea? Bad? Try? +1?
>>>
>>> Thanks,
>>> r
>>>
>>> [1] See
>>> https://github.com/apache/beam/blob/a629f73ee4e64c470e0c78cc6f51b8625d781b41/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineWithContext.java
>>> which contains a stale reference to KeyedCombineFn.
>>>
>>> [2]
>>> https://github.com/apache/beam/blob/5fb30ec8265c841cd8c4e6ae16b43be1f171eabb/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/FlatMapElements.java#L65
>>>
>>
>>
>


Re: Python 3 reviewers

2018-02-22 Thread Ahmet Altay
Thank you Holden for doing this work. I agree with Robert's comment. I know
there are a few folks working on this now (you, @luke-zhu and @cclauss).
Perhaps you could do python 3 related code reviews within that group. I
would be happy to chime in and review some chunks as well.

On Thu, Feb 22, 2018 at 4:54 PM, Robert Bradshaw 
wrote:

> I'd really like to see Python 3 support sooner rather than later, and
> have been reviewing some (simple) PRs in this direction. As long as
> they're broken up into small enough chunks, feel free to send some my
> way.
>
> On Thu, Feb 22, 2018 at 3:59 PM, Holden Karau 
> wrote:
> > Hi Y'all,
> >
> > I'm trying to make some progress on Python 3 support for Beam but I'm
> having
> > a bit of difficulty finding people with review bandwidth. Are there any
> > committers with time to spare who would be willing to work on this? If
> not
> > no worries I'll refocus my efforts elsewhere :)
> >
> > Cheers,
> >
> > Holden :)
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
>


Re: [DISCUSS]: Beam 2.3.0 release archetypes missing mobile gaming examples

2018-02-22 Thread Ahmet Altay
In my opinion, waiting for the 2.4.0 release makes sense, since there is a
plan to cut the 2.4.0 release soon and this affects examples rather than core
functionality. In the meantime, we could add a notice to the website warning
users about this issue and suggesting that they use the previous release to
try out these examples.

Ahmet

On Thu, Feb 22, 2018 at 4:09 PM, Yifan Zou  wrote:

> Greetings,
>
> We stopped copying the mobile gaming examples into the Maven archetypes after
> merging the Java 8 examples into the "main" Java examples. As a result, those
> pipelines cannot be run by creating a Maven project, since the files are not
> included.
>
> To solve this problem, we could:
>
>- 2.3.1 point ct with fix for archetypes.
>- Wait for 2.4.0 release with fix.
>
> Any thoughts?
>
> BEAM-3735 is filed to track this issue.
>
> Thanks.
>
> Regards.
> Yifan
>


Re: Beam 2.4.0

2018-02-20 Thread Ahmet Altay
+1 for having regular release cycles. Finalizing a release takes on the
order of a few weeks, so starting a new release soon after the previous
one is a reliable way to have releases every 6 weeks.

On Tue, Feb 20, 2018 at 2:30 PM, Robert Bradshaw 
wrote:

> Yep. I am starting the "Let's do a 2.4.0 release" thread almost
> exactly 6 weeks after JB first started the 2.3.0 release thread.
>
> On Tue, Feb 20, 2018 at 2:20 PM, Charles Chen  wrote:
> > I would like to +1 the faster release cycle process JB and Robert have
> been
> > advocating and implementing, and thank JB for releasing 2.3.0 smoothly.
> > When we block for specific features and increase the time between
> releases,
> > we increase the urgency for PR authors to push for their change to go
> into
> > an upcoming release, which is a feedback loop that results in our
> releases
> > taking months instead of weeks.  We should however try to get pending PRs
> > wrapped up.
> >
> > On Tue, Feb 20, 2018 at 2:15 PM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> > wrote:
> >>
> >> Kind of agree but rhythm was supposed to be 6 weeks IIRC, 2.3 is just out
> >> so 1 week is a bit fast IMHO.
> >>
> >> Le 20 févr. 2018 23:13, "Robert Bradshaw"  a
> écrit :
> >>>
> >>> One of the main shifts that I think helped this release was explicitly
> >>> not being feature driven, rather releasing what's already in the
> >>> branch. That doesn't mean it's not a good call to action to try and
> >>> get long-pending PRs or similar wrapped up.
> >>>
> >>> On Tue, Feb 20, 2018 at 2:10 PM, Romain Manni-Bucau
> >>>  wrote:
> >>> > There are a lot of long-pending PRs; it would be good to merge them
> before
> >>> > 2.4.
> >>> > Some are bringing tests for the 2.3 release which can be critical to
> >>> > include.
> >>> >
> >>> > Maybe we should list the PRs and JIRAs we want in before picking a
> date?
> >>> >
> >>> > Le 20 févr. 2018 22:02, "Konstantinos Katsiapis" <
> katsia...@google.com>
> >>> > a
> >>> > écrit :
> >>> >>
> >>> >> +1 since tf.transform 0.6 depends on Beam 2.4 and Tensorflow 1.6
> (and
> >>> >> the
> >>> >> latter already has an RC out, so we will likely be blocked on Beam).
> >>> >>
> >>> >> On Tue, Feb 20, 2018 at 12:50 PM, Robert Bradshaw
> >>> >> 
> >>> >> wrote:
> >>> >>>
> >>> >>> Now that Beam 2.3.0 went out (and in record time, kudos to all that
> >>> >>> made this happen!), it'd be great to keep the ball rolling for a
> >>> >>> similarly well-executed 2.4. A lot has gone in [1] since we made
> the
> >>> >>> 2.3 cut, and to keep our cadence up I would propose a time-based
> cut
> >>> >>> date early next week (say the 28th).
> >>> >>>
> >>> >>> I'll volunteer to do this release.
> >>> >>>
> >>> >>> [1] https://github.com/apache/beam/compare/release-2.3.0...master
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Gus Katsiapis | Software Engineer | katsia...@google.com |
> >>> >> 650-918-7487
>


Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-17 Thread Ahmet Altay
On Fri, Feb 16, 2018 at 9:52 PM, Jean-Baptiste Onofré 
wrote:

> Hi,
>
> Can someone from Python grant me permission to upload Python SDK 2.3.0 to
> PyPI?
>

lukecwik, kennknowles, aljoscha, robertwb, davorbonaci are the package
owners, any of them can do it.


>
> My user is jbonofre.
>
> Thanks !
> Regards
> JB
>
> On 02/16/2018 03:40 AM, Jean-Baptiste Onofré wrote:
> > Great !!!
> >
> > Thanks for the update, I will close the vote then.
> >
> > Regards
> > JB
> >
> > On 02/15/2018 11:45 PM, Ben Sidhom wrote:
> >> I just successfully ran the quickstart on Flink 1.4.0 on Dataproc.
> Should be
> >> good to go.
> >>
> >>
> >> On Thu, Feb 15, 2018 at 9:21 AM Jean-Baptiste Onofré wrote:
> >>
> >> Luke said he tested Flink locally. So, we have to test on a Yarn
> cluster.
> >>
> >> Regards
> >> JB
> >>
> >> On 02/15/2018 06:16 PM, Reuven Lax wrote:
> >> > I count enough votes :) So sounds like someone needs to verify
> Flink 1.4
> >> > quickstart, and then we're ready?
> >> >
> >> > On Wed, Feb 14, 2018 at 5:38 PM, Robert Bradshaw <rober...@google.com> wrote:
> >> >
> >> > +1 (binding) pending Flink verification. I have checked the
> signatures
> >> > and checksums of the artifacts, and that it agrees with commit
> >> > 67b5e1bab25d284cdac2127b47f44acc8e83499e on github *except*
> for 76
> >> > places where -SNAPSHOT was removed.
> >> >
> >> > FWIW, http://www.apache.org/dev/release-distribution now recommends
> >> > sha512 (over sha1 which has been broken).
> >> >
> >> > On Wed, Feb 14, 2018 at 5:24 PM, Eugene Kirpichov wrote:
> >> > > Thanks Kenn. I retract my -1, but then someone must verify
> it with Flink
> >> > > 1.4. I might give it a shot tomorrow (installing Flink 1.4
> on Dataproc).
> >> > >
> >> > > On Wed, Feb 14, 2018 at 4:53 PM Reuven Lax <re...@google.com> wrote:
> >> > >>
> >> > >> +1 (binding)
> >> > >>
> >> > >> On Wed, Feb 14, 2018 at 10:42 AM, Alan Myrvold <amyrv...@google.com>
> >> > >> wrote:
> >> > >>>
> >> > >>> +1 Validated java quickstarts for direct, dataflow, apex,
> flink, and
> >> > >>> spark.
> >> > >>>
> >> > >>> On Wed, Feb 14, 2018 at 9:21 AM, Lukasz Cwik <lc...@google.com> wrote:
> >> > 
> >> >  +1 (binding)
> >> >  Validated several quickstarts including the regression
> that I
> >> originally
> >> >  reported with Spark.
> >> > 
> >> >  On Wed, Feb 14, 2018 at 5:34 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> >> > >
> >> > > +1 (binding)
> >> > >
> >> > > Validated SHAs + tag vs source.zip file.
> >> > > Run mvn clean install -Prelease OK
> >> > > Validated that the 3 regressions reported for RC1 were
> fixed.
> >> > > Run Nexmark on Direct/Flink runner on local mode, no
> regressions
> >> now.
> >> > > Installed python version on virtualenv and run local
> wordcount with
> >> > > success.
> >> > > Checked that the hadoop-input-format artifact is in the
> extended
> >> > > staging area.
> >> > >
> >> > > On Tue, Feb 13, 2018 at 5:41 PM, Jean-Baptiste Onofré
> >> > > wrote:
> >> > > > +1 (binding)
> >> > > >
> >> > > > Tested the Spark runner (with wordcount example and
> beam samples)
> >> > > > Tested the performance of the direct runner
> >> > > >
> >> > > > I just updated the spreadsheet.
> >> > > >
> >> > > > Regards
> >> > > > JB
> >> > > >
> >> > > > On 02/11/2018 06:33 AM, Jean-Baptiste Onofré wrote:
> >> > > >> Hi everyone,
> >> > > >>
> >> > > >> Please review and vote on the release candidate #3
> for the
> >> 
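
The checksum part of the validation discussed in this thread can be sketched
in Python as below. This is an illustration under stated assumptions, not the
release guide's procedure: the helper names and the file names in the usage
note are hypothetical; only hashlib.sha512 comes from the standard library.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at path, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_text):
    """Compare a file against the contents of a published checksum file.

    Only the first whitespace-separated token of expected_text is used, so
    both "HASH" and "HASH  filename" layouts are accepted.
    """
    expected = expected_text.split()[0].lower()
    return sha512_hex(path) == expected
```

For example, matches_checksum("source-release.zip", open("source-release.zip.sha512").read())
would return True when the downloaded artifact matches its published digest
(both file names are placeholders).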

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-14 Thread Ahmet Altay
+1

Thank you JB and thank you everyone for doing the validations.

On Wed, Feb 14, 2018 at 5:24 PM, Eugene Kirpichov 
wrote:

> Thanks Kenn. I retract my -1, but then someone must verify it with Flink
> 1.4. I might give it a shot tomorrow (installing Flink 1.4 on Dataproc).
>
> On Wed, Feb 14, 2018 at 4:53 PM Reuven Lax  wrote:
>
>> +1 (binding)
>>
>> On Wed, Feb 14, 2018 at 10:42 AM, Alan Myrvold 
>> wrote:
>>
>>> +1 Validated java quickstarts for direct, dataflow, apex, flink, and
>>> spark.
>>>
>>> On Wed, Feb 14, 2018 at 9:21 AM, Lukasz Cwik  wrote:
>>>
 +1 (binding)
 Validated several quickstarts including the regression that I
 originally reported with Spark.

 On Wed, Feb 14, 2018 at 5:34 AM, Ismaël Mejía 
 wrote:

> +1 (binding)
>
> Validated SHAs + tag vs source.zip file.
> Run mvn clean install -Prelease OK
> Validated that the 3 regressions reported for RC1 were fixed.
> Run Nexmark on Direct/Flink runner on local mode, no regressions now.
> Installed python version on virtualenv and run local wordcount with
> success.
> Checked that the hadoop-input-format artifact is in the extended
> staging area.
>
> On Tue, Feb 13, 2018 at 5:41 PM, Jean-Baptiste Onofré 
> wrote:
> > +1 (binding)
> >
> > Tested the Spark runner (with wordcount example and beam samples)
> > Tested the performance of the direct runner
> >
> > I just updated the spreadsheet.
> >
> > Regards
> > JB
> >
> > On 02/11/2018 06:33 AM, Jean-Baptiste Onofré wrote:
> >> Hi everyone,
> >>
> >> Please review and vote on the release candidate #3 for the version
> 2.3.0, as
> >> follows:
> >>
> >> [ ] +1, Approve the release
> >> [ ] -1, Do not approve the release (please provide specific
> comments)
> >>
> >>
> >> The complete staging area is available for your review, which
> includes:
> >> * JIRA release notes [1],
> >> * the official Apache source release to be deployed to
> dist.apache.org [2],
> >> which is signed with the key with fingerprint C8282E76 [3],
> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> * source code tag "v2.3.0-RC3" [5],
> >> * website pull request listing the release and publishing the API
> reference
> >> manual [6].
> >> * Java artifacts were built with Maven 3.3.9 and Oracle JDK
> 1.8.0_111.
> >> * Python artifacts are deployed along with the source release to the
> >> dist.apache.org [2].
> >>
> >> The vote will be open for at least 72 hours. It is adopted by
> majority approval,
> >> with at least 3 PMC affirmative votes.
> >>
> >> Thanks,
> >> JB
> >>
> >> [1]
> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12319527=12341608
> >> [2] https://dist.apache.org/repos/dist/dev/beam/2.3.0/
> >> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> >> [4] https://repository.apache.org/content/repositories/
> orgapachebeam-1028/
> >> [5] https://github.com/apache/beam/tree/v2.3.0-RC3
> >> [6] https://github.com/apache/beam-site/pull/381
> >>
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>


>>>


Re: Proposal: build Python wheel distributions for Apache Beam releases

2018-02-12 Thread Ahmet Altay
+1 to wheels. The main effort for this would be updating the release guide,
and adding support for other platforms in Jenkins for building and testing
wheels.  In light of this, maybe we can prioritize having test
infrastructure for other platforms.

On Mon, Feb 12, 2018 at 1:47 PM, Ismaël Mejía  wrote:

> +1 for wheels, they are the standard binary distribution format so it
> makes sense. Also wheels support packaging python 2 and 3 on universal
> packages so they are future proof.
>
> On Mon, Feb 12, 2018 at 10:26 PM, Robert Bradshaw 
> wrote:
> > +1, is it too late to try to release these as part of the 2.3 release
> > (to get familiar with the process, no code changes should be needed)?
>

It would be nice to have this for the current release. How can we build and
test these binaries? I think it will be prudent to wait until we have
infrastructure.


> >
> > The wheels are advantageous when running locally (e.g. during testing
> > and development) where requiring containers will probably be overkill.
> > This will become especially relevant with the switch to use the
> > FnApiRunner.
> >
> > On Mon, Feb 12, 2018 at 1:22 PM, Lukasz Cwik  wrote:
> >> If we want all our code related to pipeline execution to be in a
> container,
> >> what value does building wheel distributions provide?
> >>
> >>
> >> On Mon, Feb 12, 2018 at 1:18 PM, Kenneth Knowles 
> wrote:
> >>>
> >>> +1
> >>>
> >>> On Mon, Feb 12, 2018 at 1:04 PM, Charles Chen  wrote:
> 
>  Currently, Apache Beam distributes Python packages through pip and
> PyPI.
>  On PyPI, developers can release either source tarballs, and / or
> precompiled
>  "wheel" distributions for each platform, which would be used if
> available
>  for a particular platform.  Currently, we only distribute the source
>  tarballs, so any user who installs Beam using "pip install
> apache_beam" has
>  to have a compiler and toolchain installed to take advantage of Cython
>  optimizations in Beam (which require compiled C code).  If such a
> compiler
>  is not available, Beam is currently configured to install anyway, but
> will
>  use slower Python codepaths instead of the more optimized ones (for
> example,
>  for Coder encoding / decoding).
> 
>  I would like to propose that we start distributing binary wheel
>  distributions for our releases, for common platforms like Windows /
> Mac /
>  Linux.  We could potentially use a method similar to this one
>  (https://github.com/MacPython/cython-wheels) for building these wheel
>  distributions.  Thoughts?
> 
>  Best,
>  Charles
> >>>
> >>>
> >>
>
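
The fallback described in the proposal (use the compiled codepath when it is
available, otherwise the slower pure-Python one) is commonly written as a
guarded import. The sketch below shows that general pattern only; it is not
Beam's actual code, and the helper name and module names are hypothetical.

```python
import importlib


def load_first_available(module_names):
    """Import and return the first module in module_names that can be loaded.

    List the fast (compiled) implementation first and the pure-Python
    fallback last.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not built or not installed on this platform; try the next one.
            continue
    raise ImportError("none of these modules could be imported: %r" % (module_names,))
```

With binary wheels installed, the first (compiled) entry would load; with a
source install and no compiler, the same call would fall through to the
pure-Python module.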


Re: Off for 3 weeks

2018-02-12 Thread Ahmet Altay
Best wishes, hope you will recover quickly.

On Mon, Feb 12, 2018 at 9:03 AM, Kenneth Knowles  wrote:

> Best wishes for a swift recovery.
>
> Kenn
>
>
> On Sat, Feb 10, 2018 at 5:56 AM, Etienne Chauchot 
> wrote:
>
>> Hi guys,
>>
>> I've been off this week for a surgery. I will not be available until 2
>> more weeks.
>>
>> See you later then.
>>
>> Etienne
>>
>>
>


Re: [VOTE] Release 2.3.0, release candidate #2

2018-02-08 Thread Ahmet Altay
+1

I verified python quick start, mobile gaming examples, streaming on Direct
and Dataflow runners. Thank you JB!

On Thu, Feb 8, 2018 at 2:27 AM, Romain Manni-Bucau 
wrote:

> +1 (non-binding), thanks JB for the effort!
>
>
> Romain Manni-Bucau
> @rmannibucau
>
> 2018-02-08 11:12 GMT+01:00 Ismaël Mejía :
>
>> +1 (binding)
>>
>> Validated SHAs + tag vs source.zip file.
>> Run mvn clean install -Prelease OK
>> Validated that the 3 regressions reported for RC1 were fixed.
>> Run Nexmark on Direct/Flink runner on local mode, no regressions now.
>> Installed python version on virtualenv and run wordcount with success.
>>
>> On Thu, Feb 8, 2018 at 6:37 AM, Jean-Baptiste Onofré 
>> wrote:
>> > Hi everyone,
>> >
>> > Please review and vote on the release candidate #2 for the version
>> 2.3.0, as
>> > follows:
>> >
>> > [ ] +1, Approve the release
>> > [ ] -1, Do not approve the release (please provide specific comments)
>> >
>> >
>> > The complete staging area is available for your review, which includes:
>> > * JIRA release notes [1],
>> > * the official Apache source release to be deployed to dist.apache.org
>> [2],
>> > which is signed with the key with fingerprint C8282E76 [3],
>> > * all artifacts to be deployed to the Maven Central Repository [4],
>> > * source code tag "v2.3.0-RC2" [5],
>> > * website pull request listing the release and publishing the API
>> reference
>> > manual [6].
>> > * Java artifacts were built with Maven 3.3.9 and Oracle JDK 1.8.0_111.
>> > * Python artifacts are deployed along with the source release to the
>> > dist.apache.org [2].
>> >
>> > The vote will be open for at least 72 hours. It is adopted by majority
>> approval,
>> > with at least 3 PMC affirmative votes.
>> >
>> > Thanks,
>> > JB
>> >
>> > [1]
>> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
>> ctId=12319527=12341608
>> > [2] https://dist.apache.org/repos/dist/dev/beam/2.3.0/
>> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> > [4] https://repository.apache.org/content/repositories/orgapache
>> beam-1027/
>> > [5] https://github.com/apache/beam/tree/v2.3.0-RC2
>> > [6] https://github.com/apache/beam-site/pull/381
>> > --
>> > Jean-Baptiste Onofré
>> > jbono...@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>>
>
>


Re: Replacing Python DirectRunner apply_* hooks with PTransformOverrides

2018-02-02 Thread Ahmet Altay
+1 to this change.

Thank you Charles for improving the DirectRunner, sharing your progress and
seeking feedback. This change would allow us to migrate to a faster
DirectRunner for Python, a long-requested feature and an important part of
the first-use experience for new users trying out Beam.

Ahmet

On Fri, Feb 2, 2018 at 11:00 AM, Charles Chen  wrote:

> Thanks Kenn.  We already do the Runner API roundtripping (I believe Robert
> implemented this).  With this change, we would start doing exactly what
> you're suggesting, where we apply overrides to a post-deserialization
> pipeline.
>
> On Thu, Feb 1, 2018 at 6:45 PM Kenneth Knowles  wrote:
>
>> +1 for removing apply_*
>>
>> For the Java SDK, removing specialized intercepts was an important first
>> step towards the portability framework. I wonder if there is a way for the
>> Python SDK to leapfrog, taking advantage of some of the lessons that Java
>> learned a bit more painfully. Most pertinent I think is that if an SDK's
>> role is to construct a pipeline and ship the proto to a runner (service)
>> then overrides apply to a post-deserialization pipeline. The Java
>> DirectRunner does a proto round-trip to avoid accidentally depending on
>> things that are not really part of the pipeline. I would think this crisp
>> abstraction enforcement would add even more value to Python.
>>
>> Kenn
>>
>> On Thu, Feb 1, 2018 at 5:21 PM, Charles Chen  wrote:
>>
>>> In the Python DirectRunner, we currently use apply_* overrides to
>>> override the operation of the default .expand() operation for certain
>>> transforms. For example, GroupByKey has a special implementation in the
>>> DirectRunner, so we use an apply_* override hook to replace the
>>> implementation of GroupByKey.expand().
>>>
>>> However, this strategy has drawbacks. Because this override operation
>>> happens eagerly during graph construction, the pipeline graph is
>>> specialized and modified before a specific runner is bound to the
>>> pipeline's execution. This makes the pipeline graph non-portable and blocks
>>> full migration to using the Runner API pipeline representation in the
>>> DirectRunner.
>>>
>>> By contrast, the SDK's PTransformOverride mechanism allows the
>>> expression of matchers that operate on the unspecialized graph, replacing
>>> PTransforms as necessary to produce a DirectRunner-specialized pipeline
>>> graph for execution.
>>>
>>> I therefore propose to replace these eager apply_* overrides with
>>> PTransformOverrides that operate on the completely constructed graph.
>>>
>>> The JIRA issue is https://issues.apache.org/jira/browse/BEAM-3566, and
>>> I've prepared a candidate patch at
>>> https://github.com/apache/incubator-beam/pull/4529.
>>>
>>> Best,
>>> Charles
>>>
>>
>>
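
The difference Charles describes, between overriding eagerly during construction and applying matchers to a fully constructed graph, can be sketched with a toy model. None of the classes below are Beam APIs: `Transform`, `Override` and `specialize` are hypothetical names that only mirror the idea of a matcher plus a replacement applied after the pipeline graph is complete.

```python
import copy


class Transform(object):
    """Toy stand-in for a node in a pipeline graph (not a Beam class)."""

    def __init__(self, label, kind, impl="default"):
        self.label = label
        self.kind = kind
        self.impl = impl


class Override(object):
    """Pairs a matcher with the implementation a runner wants instead."""

    def __init__(self, matcher, replacement_impl):
        self.matcher = matcher
        self.replacement_impl = replacement_impl


def specialize(graph, overrides):
    """Return a runner-specific copy; the input graph is left untouched."""
    specialized = copy.deepcopy(graph)
    for transform in specialized:
        for override in overrides:
            if override.matcher(transform):
                transform.impl = override.replacement_impl
                break  # first matching override wins
    return specialized


# The graph is built once, with no runner-specific knowledge baked in.
portable_graph = [
    Transform("read", "Read"),
    Transform("group", "GroupByKey"),
    Transform("write", "Write"),
]

# A runner supplies its overrides only when it is asked to execute.
direct_overrides = [
    Override(lambda t: t.kind == "GroupByKey", "direct_runner_gbk"),
]
direct_graph = specialize(portable_graph, direct_overrides)
```

Because `specialize` works on a copy, the unspecialized graph stays portable and could be handed to a different runner with a different override list.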


Re: [VOTE] Release 2.3.0, release candidate #1

2018-01-31 Thread Ahmet Altay
This will require a change in the Beam code, because image names are
hardcoded into the code (Python) and configuration (Java). RC1 as it is will
not work correctly with Cloud Dataflow.

On Wed, Jan 31, 2018 at 2:08 PM, Reuven Lax  wrote:

> Hopefully we can validate soon. I believe some of the delays are because
> of integrating major changes done over the last week (e.g. Java 8
> migration).
>
> On Wed, Jan 31, 2018 at 2:04 PM, Ismaël Mejía  wrote:
>
>> What is the common procedure in cases like this? It doesn't seem to
>> need a re-vote, just an extra day or two for validation. Any ideas, JB?
>>
>> On Wed, Jan 31, 2018 at 10:41 PM, Alan Myrvold 
>> wrote:
>> > Yes, it is a dataflow step. Happy to test this again when they are
>> > available.
>> >
>> > On Wed, Jan 31, 2018 at 1:39 PM, Jean-Baptiste Onofré 
>> > wrote:
>> >>
>> >> OK, I think I understood ;)
>> >>
>> >> So it's not "directly" related to Beam itself (it's more a Dataflow step
>> >> to perform).
>> >>
>> >> @Alan, I think it's better to test first and then cast the vote. These
>> >> kinds of tests are valuable for validating the release and make sense. But
>> >> the vote should represent the state of the Beam release, so I think a -1
>> >> vote is a bit too early before the test.
>> >>
>> >> Thanks !
>> >> Regards
>> >> JB
>> >>
>> >> On 31/01/2018 22:33, Reuven Lax wrote:
>> >>>
>> >>> It's just a step that needs to be performed before the new release works
>> >>> on Dataflow. Alan is saying that we've been unable to validate Dataflow so
>> >>> far, as worker images are not yet built. Hopefully they'll be built soon,
>> >>> and we'll be able to validate.
>> >>>
>> >>> On Wed, Jan 31, 2018 at 1:31 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>> >>>
>> >>> Hi Alan
>> >>>
>> >>> Is it related to a change in the codebase or in a dependency/related
>> >>> project?
>> >>>
>> >>> I mean: is it something we have to fix/change in Beam ?
>> >>>
>> >>> Just curious as I'm not sure what you mean by "worker images" ;)
>> >>>
>> >>> Thanks !
>> >>> Regards
>> >>> JB
>> >>>
>> >>> On 31/01/2018 22:18, Alan Myrvold wrote:
>> >>>
>> >>> -1 (for now, hope to change this)
>> >>>
>> >>> Dataflow runner jobs are failing for me with 2.3.0 RC1, for both
>> >>> Java and Python.
>> >>>
>> >>> This is not an issue with the 2.3.0 RC1 SDK; we (Google) need
>> >>> to release worker images.
>> >>>
>> >>> I have assigned these bugs to myself, and will be working to
>> >>> help get these workers released.
>> >>>
>> >>> [BEAM-3584] Java dataflow job fails with 2.3.0 RC1, due to
>> >>> missing worker image
>> >>> [BEAM-3585] Python dataflow job fails with 2.3.0 RC1, due to
>> >>> missing worker image
>> >>>
>> >>> On Wed, Jan 31, 2018 at 6:12 AM, Jean-Baptiste Onofré wrote:
>> >>>
>> >>>  Thanks Kenn,
>> >>>
>> >>>  I prepared the list of tasks I did. I will complete it once the
>> >>>  release is out.
>> >>>
>> >>>  Regards
>> >>>  JB
>> >>>
>> >>>  On 01/31/2018 03:07 PM, Kenneth Knowles wrote:
>> >>>  > I've cloned the release validation spreadsheet:
>> >>>  >
>> >>>  > https://s.apache.org/beam-2.3.0-release-validation
>> >>>  >
>> >>>  > If you plan to perform a manual validation task, please sign up so
>> >>>  > multiple people don't waste effort.
>> >>>  >
>> >>>  > Alan & JB, as far as your pairing up to automate more, anything manual
>> >>>  > on this sheet should be considered.
>> >>>  >
>> >>>  > Kenn
>> >>>  >
>> >>>  > On Wed, Jan 31, 2018 at 5:59 AM, Jean-Baptiste Onofré wrote:
>> >>>  >
>> >>>  > +1 (binding)
>> >>>  >
>> >>>  > Casting my own +1 ;)
>> >>>  >
>> >>>  > Regards
>> >>>  > JB
>> >>>  >
>> >>>  > On 01/30/2018 09:04 AM, Jean-Baptiste Onofré wrote:
>> >>>  > > 

Re: Does Apache Beam for python support server-based shuffle with Dataflow runner yet?

2018-01-17 Thread Ahmet Altay
Hi Nima,

You can try this feature with the Python SDK using the same instructions from
the announcement. However, it is not ready for production usage. The team is
working on officially supporting it. We cannot share an ETA; once it is
available it will be announced.

For future questions related to the Dataflow service please see the GCP
Dataflow support page [1].

Thank you,
Ahmet

[1] https://cloud.google.com/dataflow/support

On Wed, Jan 17, 2018 at 11:25 AM, Nima Mousavi 
wrote:

> Hi,
>
> In June 2017, Google introduced server-based shuffle for Dataflow
> pipelines, which can result in a 5x performance improvement. However, at the
> time of the announcement this feature was only available for Cloud Dataflow SDK
> for Java version 1. What is the status for Dataflow SDK for Python? Is it
> supported already? Any plan to add it soon?
>
>
> https://cloud.google.com/blog/big-data/2017/06/introducing-cloud-dataflow-shuffle-for-up-to-5x-performance-improvement-in-data-analytic-pipelines
>
> Thanks!
>


Re: Pushing daily/test containers for python

2017-12-21 Thread Ahmet Altay
Thank you all for the comments. We can prototype something closer to (a)
and we can always change it later. My concern was that this would consume
more resources, but this might be a non-issue.

From a procedural perspective, do we need a formal vote on this?

On Thu, Dec 21, 2017 at 1:33 PM, Holden Karau <hol...@pigscanfly.ca> wrote:

> So I think we (or more accurately the PMC) need to be careful with how we
> post the container artifacts from an Apache POV since they most likely
> contain non-Apache licensed code (and also posting dailies can be
> complicated since the PMC hasn’t voted on each one).
>

> For just testing it should probably be OK but we need to make sure users
> aren’t confused and think they are releases.
>

+1. Perhaps we can make these images private or have mechanisms in the
tests to remove images as part of the test cleanup.


>
>
> On Thu, Dec 21, 2017 at 10:03 AM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> The GCR repository can be configured with public pull access, which I
>> think will be required to use the container.
>>
>> On Thu, Dec 21, 2017 at 2:34 AM, David Sabater Dinter <
>> david.saba...@gmail.com> wrote:
>>
>>> +1
>>> Hi,
>>> It makes sense to use GCR (locality with GCP services and works like any
>>> other container repository). The only caveat is that the images will be
>>> private, so anyone who needs to debug locally will need access to pull
>>> the image, or will have to build locally and push.
>>> I agree getting closer to (a) is preferable assuming the build time
>>> doesn't increase dramatically in the post commit process.
>>>
>>> On Thu, Dec 21, 2017 at 1:59 AM Henning Rohde <hero...@google.com>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> It would be great to be able to test this aspect of portability. For
>>>> testing purposes, I think whatever container registry is convenient to use
>>>> for distribution is fine.
>>>>
>>>> Regarding frequency, I think we should consider something closer to
>>>> (a). The container images -- although usually quite stable -- are part of
>>>> the SDK at that commit and are not guaranteed to work with any other
>>>> version. Breaking changes in their interaction would cause confusion and
>>>> create noise. Any local tests can also in theory just build the container
>>>> images directly and not use any registry, so it might make sense to set up
>>>> the tests so that pushing occurs less frequently than building.
>>>>
>>>> Henning
>>>>
>>>>
>>>>
>>>> On Wed, Dec 20, 2017 at 3:10 PM, Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> After some recent changes (e.g. [1]) we have a feasible container that
>>>>> we can use to test Python SDK on portability framework. Until now we were
>>>>> using Google provided container images for testing and for the released
>>>>> product. We can gradually move away from that (at least partially) for
>>>>> Python SDK.
>>>>>
>>>>> I would like to propose building containers for testing purposes only
>>>>> and pushing them to gcr.io as part of jenkins jobs. I would like to
>>>>> clarify two points with the team first:
>>>>>
>>>>> 1. Use of GCR, I am proposing it for a few reasons:
>>>>> - Beam's jenkins workers run on GCP, and it would be easy to push them
>>>>> to gcr from there.
>>>>> - If we use another service (perhaps with a free tier for open source
>>>>> projects) we might be overusing it by pushing/pulling from our daily
>>>>> tests.
>>>>> - This is similar to how we stage some artifacts to GCS as part of the
>>>>> testing process.
>>>>>
>>>>> 2. Frequency of building and pushing containers
>>>>>
>>>>> a. We can run it at every PR, by integrating with python post commit
>>>>> tests.
>>>>> b. We can run it daily, by having a new Jenkins job.
>>>>> c. We can run it manually, by having a parameterized Jenkins job that
>>>>> can build and push a new container from a tag/commit. Given that we
>>>>> infrequently change container code, I would suggest choosing this option.
>>>>>
>>>>> What do you think about this? To be clear, this is just a proposal
>>>>> about the testing environment. I am not suggesting anything about the
>>>>> release artifacts.
>>>>>
>>>>> Thank you,
>>>>> Ahmet
>>>>>
>>>>> [1] https://github.com/apache/beam/pull/4286
>>>>>
>>>>
>>>>
>> --
> Twitter: https://twitter.com/holdenkarau
>


Re: A question regarding BEAM-3280

2017-12-15 Thread Ahmet Altay
Hi Norio,

The ticket is asking for an example; however, this feature is currently not
supported by the SDK. The support needs to be added first.

On Tue, Dec 12, 2017 at 10:44 PM, Akagi Norio <redtree.dev1...@gmail.com>
wrote:

> Hi Ahmet and Robert,
>
> Thank you for the reply.
> Just to clarify, I initially thought this ticket is just to add an
> example, is it correct?
>
> An example from Robert does not look supported by the current SDK:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/ptransform.py#L280
> so I’m wondering if I actually need to update the SDK to support a typehint
> with multiple output tags.
>
> I just started reading Beam’s code base, so it may take some time.
> If that’s okay then I’d be happy to work on the ticket, so please assign it
> to me.
>

It is OK to take your time before working on this problem. I could not
assign it to you (I guess you first need to be added as a contributor to the
project). I added a comment mentioning that you are working on this issue.

Thank you again!
Ahmet



>
> Regards,
> Norio Akagi
>
>
> On Dec 11, 2017, at 4:50 PM, Ahmet Altay <al...@google.com> wrote:
>
> Hi Norio,
>
> Thank you for your interest. If you would like to work on this I can
> assign the JIRA to you. I do not think this change is sufficient or
> correct. This reads as if SplitLinesToWordsFn returns a Tuple of things;
> however, it instead produces three unrelated collections of different types.
>
> I think the work for fixing the issue should be:
> - Clarifying what the API needs to look like for typehints in case of
> multiple outputs.
> - Updating documentation for that (pydocs &
> https://beam.apache.org/documentation/sdks/python-type-safety/).
> - Adding examples. At that point we can choose to either update current
> examples or add new examples.
>
> Thank you,
> Ahmet
>
> On Mon, Dec 11, 2017 at 2:48 AM, Akagi Norio <redtree.dev1...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I’m working on a task BEAM-3280 (Add typehints with TaggedOutput) and
>> just want to clarify before I send a PR.
>> https://issues.apache.org/jira/browse/BEAM-3280
>>
>> Is it sufficient to modify the code in
>> apache_beam.cookbook.multiple_output_pardo.py like below?
>>
>> # with_outputs allows accessing the explicitly tagged outputs of a DoFn.
>> split_lines_result = (
>>     lines
>>     | beam.ParDo(SplitLinesToWordsFn().with_output_types(
>>         beam.typehints.Tuple[
>>             beam.typehints.Generator[unicode],
>>             beam.typehints.Generator[unicode],
>>             beam.typehints.Generator[int],
>>         ],
>>     )).with_outputs(
>>         SplitLinesToWordsFn.OUTPUT_TAG_SHORT_WORDS,
>>         SplitLinesToWordsFn.OUTPUT_TAG_CHARACTER_COUNT,
>>         main='words')
>> )
>>
>> Or do you expect something different to add a typehint to multiple
>> outputs?
>>
>> Regards,
>> Norio Akagi
>>
>
>
>


Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Ahmet Altay
I agree with Robert. Also, there is usually a workaround to unbreak a previous
version by installing a specific version of the offending dependency.

On Fri, Dec 15, 2017 at 2:53 PM, Robert Bradshaw <rober...@google.com>
wrote:

> This has the downside of pinning the dependencies for all downstream
> projects, making it impossible for them to use different versions than
> the ones we happened to choose. (Imagine the pain of two or more of
> our dependencies pinned all their dependencies...)
>
> On Fri, Dec 15, 2017 at 2:48 PM, Udi Meiri <eh...@google.com> wrote:
> > +1 to pinning to exact versions, to be sure that our releases do not break
> > when newer versions of dependencies are released.
> >
> > On Fri, Dec 15, 2017 at 2:44 PM Ahmet Altay <al...@google.com> wrote:
> >>
> >> On Fri, Dec 15, 2017 at 2:42 PM, Chamikara Jayalath <chamik...@google.com>
> >> wrote:
> >>>
> >>> +1 for automating the process of checking for possible version bumps.
> >>>
> >>> Also, what do you think about pinning dependencies to exact versions
> >>> (instead of ranges) after cutting a release branch? This should improve the
> >>> stability of released SDKs (but not a perfect solution since transitive
> >>> dependencies can still change).
> >>
> >>
> >> This is a reasonable suggestion. The issue with that is, by being less
> >> flexible we will prevent users from using latest versions of dependencies.
> >> On the other hand it will prevent breaking of already released versions.
> >>
> >>>
> >>>
> >>> Thanks,
> >>> Cham
> >>>
> >>> On Fri, Dec 15, 2017 at 2:19 PM Ahmet Altay <al...@google.com> wrote:
> >>>>
> >>>> On Fri, Dec 15, 2017 at 2:02 PM, Robert Bradshaw <rober...@google.com>
> >>>> wrote:
> >>>>>
> >>>>> On Fri, Dec 15, 2017 at 1:51 PM, Ahmet Altay <al...@google.com> wrote:
> >>>>> >
> >>>>> > On Fri, Dec 15, 2017 at 1:38 PM, Robert Bradshaw <rober...@google.com>
> >>>>> > wrote:
> >>>>> >>
> >>>>> >> I am also in favor of pinning as an immediate fix, bumping the bound
> >>>>> >> otherwise.
> >>>>> >>
> >>>>> >> Regarding putting an upper bound to avoid being broken, the last two
> >>>>> >> breaks have been due to just having an (unneeded) upper bound (which
> >>>>> >> held us back to broken/incompatible releases in relationship to other
> >>>>> >> dependencies). We should try to trust semantic versioning when
> >>>>> >> possible, and when not we must regularly audit.
> >>>>> >
> >>>>> > +1 to this, especially the auditing part. We also had breaks because we
> >>>>> > trusted semantic versioning. So far our semi-official policy was to trust a
> >>>>> > package until they prove it otherwise. I will argue that grpc here is making
> >>>>> > a breaking change in a minor version increment by changing the way they are
> >>>>> > depending on a major package.
> >>>>>
> >>>>> A minor version bump should be allowed to require a minor version bump
> >>>>> in its dependencies.
> >>>>>
> >>>>> > We have done a good job of auditing and updating those pinned (or upper
> >>>>> > bounded) dependencies, and probably we are behind in some of those.
> >>>>> >
> >>>>> > I wonder if we can automate some of this? If we can get a report, that
> >>>>> > audits our dependencies, warns us about new releases and potential conflicts
> >>>>> > it would be much easier to keep things up to date.
> >>>>>
> >>>>> Big +1, it should be easy to set up a nightly that relaxes some of the
> >>>>> requirements and sees what (if anything) breaks. Not breaking is
> >>>>> likely a signal that we should relax ours.
> >>>>
> >>&

Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Ahmet Altay
On Fri, Dec 15, 2017 at 2:42 PM, Chamikara Jayalath <chamik...@google.com>
wrote:

> +1 for automating the process of checking for possible version bumps.
>
> Also, what do you think about pinning dependencies to exact versions
> (instead of ranges) after cutting a release branch? This should improve
> the stability of released SDKs (but not a perfect solution since transitive
> dependencies can still change).
>

This is a reasonable suggestion. The issue with that is, by being less
flexible we will prevent users from using latest versions of dependencies.
On the other hand it will prevent breaking of already released versions.


>
> Thanks,
> Cham
>
> On Fri, Dec 15, 2017 at 2:19 PM Ahmet Altay <al...@google.com> wrote:
>
>> On Fri, Dec 15, 2017 at 2:02 PM, Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> On Fri, Dec 15, 2017 at 1:51 PM, Ahmet Altay <al...@google.com> wrote:
>>> >
>>> > On Fri, Dec 15, 2017 at 1:38 PM, Robert Bradshaw <rober...@google.com>
>>> > wrote:
>>> >>
>>> >> I am also in favor of pinning as an immediate fix, bumping the bound
>>> >> otherwise.
>>> >>
>>> >> Regarding putting an upper bound to avoid being broken, the last two
>>> >> breaks have been due to just having an (unneeded) upper bound (which
>>> >> held us back to broken/incompatible releases in relationship to other
>>> >> dependencies). We should try to trust semantic versioning when
>>> >> possible, and when not we must regularly audit.
>>> >
>>> > +1 to this, especially the auditing part. We also had breaks because we
>>> > trusted semantic versioning. So far our semi-official policy was to trust a
>>> > package until they prove it otherwise. I will argue that grpc here is making
>>> > a breaking change in a minor version increment by changing the way they are
>>> > depending on a major package.
>>>
>>> A minor version bump should be allowed to require a minor version bump
>>> in its dependencies.
>>>
>>> > We have done a good job of auditing and updating those pinned (or upper
>>> > bounded) dependencies, and probably we are behind in some of those.
>>> >
>>> > I wonder if we can automate some of this? If we can get a report, that
>>> > audits our dependencies, warns us about new releases and potential conflicts
>>> > it would be much easier to keep things up to date.
>>>
>>> Big +1, it should be easy to set up a nightly that relaxes some of the
>>> requirements and sees what (if anything) breaks. Not breaking is
>>> likely a signal that we should relax ours.
>>>
>>
>> Filed https://issues.apache.org/jira/browse/BEAM-3363 to track this. I
>> think it would be awesome if we can tackle this as part of a better
>> infrastructure for testing work.
>>
>>
>>>
>>> >> On Fri, Dec 15, 2017 at 1:33 PM, Chamikara Jayalath (JIRA)
>>> >> <j...@apache.org> wrote:
>>> >> >
>>> >> > [
>>> >> > https://issues.apache.org/jira/browse/BEAM-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293276#comment-16293276
>>> >> > ]
>>> >> >
>>> >> > Chamikara Jayalath commented on BEAM-3357:
>>> >> > --
>>> >> >
>>> >> > You mean we could bump up the upper bound? I think we should keep some
>>> >> > upper bound in case we get badly broken by a future protobuf release.
>>> >> >
>>> >> >> Python SDK head fails to run tests due to
>>> >> >> Requirement.parse('protobuf<=3.4.0,>=3.2.0')
>>> >> >>
>>> >> >> --
>>> >> >>
>>> >> >> Key: BEAM-3357
>>> >> >> URL: https://issues.apache.org/jira/browse/BEAM-3357
>>> >> >> Project: Beam
>>> >> >>  Issue Type: Bug
>>> >> >>  Components: sdk-py-core
>>> >> >>Reporter: Chamikara Jayalath
>>> >> >>Priority: Critical
>>&

Re: [jira] [Commented] (BEAM-3357) Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')

2017-12-15 Thread Ahmet Altay
On Fri, Dec 15, 2017 at 1:38 PM, Robert Bradshaw 
wrote:

> I am also in favor of pinning as an immediate fix, bumping the bound
> otherwise.
>
> Regarding putting an upper bound to avoid being broken, the last two
> breaks have been due to just having an (unneeded) upper bound (which
> held us back to broken/incompatible releases in relationship to other
> dependencies). We should try to trust semantic versioning when
> possible, and when not we must regularly audit.
>

+1 to this, especially the auditing part. We also had breaks because we
trusted semantic versioning. So far our semi-official policy was to trust a
package until they prove it otherwise. I will argue that grpc here is
making a breaking change in a minor version increment by changing the way
they are depending on a major package.

We have done a good job of auditing and updating those pinned (or upper
bounded) dependencies, and probably we are behind in some of those.

I wonder if we can automate some of this? If we can get a report that
audits our dependencies and warns us about new releases and potential
conflicts, it would be much easier to keep things up to date.


>
> On Fri, Dec 15, 2017 at 1:33 PM, Chamikara Jayalath (JIRA)
>  wrote:
> >
> > [ https://issues.apache.org/jira/browse/BEAM-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293276#comment-16293276 ]
> >
> > Chamikara Jayalath commented on BEAM-3357:
> > --
> >
> > You mean we could bump up the upper bound? I think we should keep some
> > upper bound in case we get badly broken by a future protobuf release.
> >
> >> Python SDK head fails to run tests due to Requirement.parse('protobuf<=3.4.0,>=3.2.0')
> >> --
> >>
> >> Key: BEAM-3357
> >> URL: https://issues.apache.org/jira/browse/BEAM-3357
> >> Project: Beam
> >>  Issue Type: Bug
> >>  Components: sdk-py-core
> >>Reporter: Chamikara Jayalath
> >>Priority: Critical
> >>
> >> Error is:
> >> running build_ext
> >> Traceback (most recent call last):
> >>   File "setup.py", line 202, in <module>
> >>     'test': generate_protos_first(test),
> >>   File "/Users/chamikara/testing/test_py_12_14_2017_2/env_proto_3.4/lib/python2.7/site-packages/setuptools/__init__.py", line 129, in setup
> >>     return distutils.core.setup(**attrs)
> >>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 151, in setup
> >>     dist.run_commands()
> >>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
> >>     self.run_command(cmd)
> >>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
> >>     cmd_obj.run()
> >>   File "setup.py", line 142, in run
> >>     super(cmd, self).run()
> >>   File "/Users/chamikara/testing/test_py_12_14_2017_2/env_proto_3.4/lib/python2.7/site-packages/setuptools/command/test.py", line 225, in run
> >>     with self.project_on_sys_path():
> >>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
> >>     return self.gen.next()
> >>   File "/Users/chamikara/testing/test_py_12_14_2017_2/env_proto_3.4/lib/python2.7/site-packages/setuptools/command/test.py", line 164, in project_on_sys_path
> >>     require('%s==%s' % (ei_cmd.egg_name, ei_cmd.egg_version))
> >>   File "/Users/chamikara/testing/test_py_12_14_2017_2/env_proto_3.4/lib/python2.7/site-packages/pkg_resources/__init__.py", line 984, in require
> >>     needed = self.resolve(parse_requirements(requirements))
> >>   File "/Users/chamikara/testing/test_py_12_14_2017_2/env_proto_3.4/lib/python2.7/site-packages/pkg_resources/__init__.py", line 875, in resolve
> >>     raise VersionConflict(dist, req).with_context(dependent_req)
> >> pkg_resources.ContextualVersionConflict: (protobuf 3.5.0.post1 (/Users/chamikara/testing/test_py_12_14_2017_2/beam/sdks/python/.eggs/protobuf-3.5.0.post1-py2.7.egg), Requirement.parse('protobuf<=3.4.0,>=3.2.0'), set(['apache-beam']))
> >> Seems like grpcio did a release today which is breaking us:
> >> https://pypi.python.org/pypi/grpcio/1.8.1
> >> We have to either bump our protobuf dependency or reduce the upper
> >> bound of grpcio dependency to previous release (1.7.3).
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.4.14#64029)
>
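
The conflict in the traceback above comes down to a range check: protobuf 3.5.0.post1 was installed, but the requirement was `protobuf<=3.4.0,>=3.2.0`. A minimal sketch of that check follows. It is not how pkg_resources implements it; `release_tuple` and `satisfies` are hypothetical helpers, and the sketch assumes plain dotted numeric versions, dropping suffixes such as `.post1`.

```python
import operator
import re

# Comparison operators supported by this sketch.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def release_tuple(version):
    """Turn '3.5.0.post1' into (3, 5, 0); non-numeric suffixes are dropped."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies(version, spec):
    """Return True when `version` meets every comma-separated clause in `spec`."""
    installed = release_tuple(version)
    for clause in spec.split(","):
        match = re.match(r"\s*(<=|>=|==|<|>)\s*([\d.]+)\s*$", clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        op, bound = match.groups()
        if not _OPS[op](installed, release_tuple(bound)):
            return False
    return True
```

With the range from the thread, `satisfies("3.5.0.post1", "<=3.4.0,>=3.2.0")` is false while `satisfies("3.4.0", "<=3.4.0,>=3.2.0")` is true, which is why either raising the protobuf upper bound or holding grpcio at the earlier release resolves the conflict.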


Re: A personal update

2017-12-12 Thread Ahmet Altay
Welcome back! Looking forward to your contributions.

Ahmet

On Tue, Dec 12, 2017 at 10:05 PM, Jesse Anderson 
wrote:

> Congrats!
>
> On Wed, Dec 13, 2017, 5:54 AM Jean-Baptiste Onofré 
> wrote:
>
>> Hi Davor,
>>
>> welcome back !!
>>
>> It's really great to see you back active in the Beam community. We really
>> need you !
>>
>> I'm so happy !
>>
>> Regards
>> JB
>>
>> On 12/13/2017 05:51 AM, Davor Bonaci wrote:
>> > My dear friends,
>> > As many of you have noticed, I’ve been visibly absent from the project for a
>> > little while. During this time, a great number of you kept reaching out, and for
>> > that I’m deeply humbled and grateful to each and every one of you.
>> >
>> > I needed some time for personal reflection, which led to a transition in my
>> > professional life. As things have settled, I’m happy to again be working among
>> > all of you, as we propel this project forward. I plan to be active in the
>> > future, but perhaps not quite full-time as I was before.
>> >
>> > In the near term, I’m working on getting the report to the Board completed, as
>> > well as framing the discussion about the project state and vision going
>> > forwards. Additionally, I’ll make sure that we foster healthy community culture
>> > and operate in the Apache Way.
>> >
>> > For those who are curious, I’m happy to say that I’m starting a company building
>> > products related to Beam, along with several other members of this community and
>> > authors of this technology. I’ll share more on this next year, but until then if
>> > you have a data processing problem or an Apache Beam question, I’d love to hear
>> > from you ;-).
>> >
>> > Thanks -- and so happy to be back!
>> >
>> > Davor
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>


Re: A question regarding BEAM-3280

2017-12-11 Thread Ahmet Altay
Hi Norio,

Thank you for your interest. If you would like to work on this I can assign
the JIRA to you. I do not think this change is sufficient or correct. This
reads as if SplitLinesToWordsFn returns a Tuple of things; however, it
instead produces three unrelated collections of different types.

I think the work for fixing the issue should be:
- Clarifying what the API needs to look like for typehints in case of
multiple outputs.
- Updating documentation for that (pydocs &
https://beam.apache.org/documentation/sdks/python-type-safety/).
- Adding examples. At that point we can choose to either update current
examples or add new examples.

Thank you,
Ahmet

On Mon, Dec 11, 2017 at 2:48 AM, Akagi Norio 
wrote:

> Hi,
>
> I’m working on a task BEAM-3280 (Add typehints with TaggedOutput) and just
> want to clarify before I send a PR.
> https://issues.apache.org/jira/browse/BEAM-3280
>
> Is it sufficient to modify the code in
> apache_beam.cookbook.multiple_output_pardo.py like below?
>
> # with_outputs allows accessing the explicitly tagged outputs of a DoFn.
> split_lines_result = (
>     lines
>     | beam.ParDo(SplitLinesToWordsFn().with_output_types(
>         beam.typehints.Tuple[
>             beam.typehints.Generator[unicode],
>             beam.typehints.Generator[unicode],
>             beam.typehints.Generator[int],
>         ],
>     )).with_outputs(
>         SplitLinesToWordsFn.OUTPUT_TAG_SHORT_WORDS,
>         SplitLinesToWordsFn.OUTPUT_TAG_CHARACTER_COUNT,
>         main='words')
> )
>
> Or do you expect something different to add a typehint to multiple outputs?
>
> Regards,
> Norio Akagi
>
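
Ahmet's objection, that the transform produces three unrelated collections and not one tuple per element, can be illustrated without Beam. The sketch below is plain Python: `Tagged`, `split_line` and `route` are hypothetical stand-ins, and the tag names and the three-character cutoff for short words are chosen here for illustration only.

```python
from collections import defaultdict


class Tagged(object):
    """Hypothetical stand-in for a tagged element; not a Beam class."""

    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def split_line(line):
    """Yield main-output words plus tagged side-output elements."""
    yield Tagged("character_count", len(line))
    for word in line.split():
        if len(word) <= 3:
            yield Tagged("short_words", word)
        else:
            yield word  # untagged elements go to the main output


def route(lines, main="words"):
    """Collect every emitted element into the collection named by its tag."""
    outputs = defaultdict(list)
    for line in lines:
        for element in split_line(line):
            if isinstance(element, Tagged):
                outputs[element.tag].append(element.value)
            else:
                outputs[main].append(element)
    return dict(outputs)


result = route(["the quick brown fox", "a lazy dog"])
```

Each key in `result` holds its own list with its own element type (strings, strings, integers) and its own length, so a single `Tuple[...]` hint over the three does not describe what is produced. A typehint API for multiple outputs would instead need one hint per tag.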


Re: Apache Beam, version 2.2.0

2017-12-07 Thread Ahmet Altay
On Thu, Dec 7, 2017 at 3:51 PM, Eugene Kirpichov 
wrote:

> I've sent the poll:
> https://lists.apache.org/thread.html/5bc2e184a24de9dbc8184ffd2720d1894010497d47d956b395e037df@%3Cuser.beam.apache.org%3E
> Will figure out how to tweet from @ApacheBeam, and send the Twitter poll
> as well (or ask someone to).
>

I tweeted the poll.


>
> On Wed, Dec 6, 2017 at 1:47 PM Lukasz Cwik  wrote:
>
>> +1 on moving forward with the plan suggested by kirpichov@
>>
>> On Wed, Dec 6, 2017 at 9:14 AM, Robert Bradshaw 
>> wrote:
>>
>>> +1 to moving forward with this plan.
>>>
>>> (FWIW, this seems *less* backwards incompatible than, say, moving from
>>> Spark 1 to Spark 2, which was decided much quicker. I suppose the
>>> Spark change has a lower bound on the number of users it could impact
>>> though.)
>>>
>>> On Wed, Dec 6, 2017 at 9:09 AM, Eugene Kirpichov 
>>> wrote:
>>> > Okay, then let's go forward. Seems that we should:
>>> > - Open a new poll on user@, in light of 2.2 having been released
>>> > - Open a twitter poll
>>> > - Tweet that there's also a poll going on on user@
>>> > - Runner authors will reach out to respective runner user communities
>>> > - 2 weeks later we gather results and decide
>>> > ?
>>> >
>>> > On Wed, Dec 6, 2017 at 6:16 AM Ismaël Mejía  wrote:
>>> >>
>>> >> +1 for Eugene’s arguments: Beam 3.0 still seems far away,
>>> >> and starting to improve Beam to offer a Java 8 friendly experience
>>> >> seems like an excellent idea.
>>> >>
>>> >> I understand the backwards compatibility argument. We should do the
>>> >> poll on Twitter + try to reach more users for comments. If you
>>> >> consider it worthwhile, I can open a second poll at user@.
>>> >>
>>> >> In any case we should try to move forward. Even if more than
>>> >> 5% of users want to stay on Java 7, we can consider maintaining
>>> >> minor releases of a backwards compatible version where we backport
>>> >> only critical fixes (e.g. security/data related errors) but nothing new,
>>> >> in case some users really need them. Of course this can be
>>> >> some extra work (to be discussed).
>>> >>
>>> >>
>>> >> On Tue, Dec 5, 2017 at 7:24 AM, Jean-Baptiste Onofré
>>> >> wrote:
>>> >> > +1, and sorry again, I thought we had a consensus.
>>> >> >
>>> >> > Regards
>>> >> > JB
>>> >> >
>>> >> > On 12/05/2017 07:10 AM, Kenneth Knowles wrote:
>>> >> >>
>>> >> >> +1 to the poll and also to Reuven's point.
>>> >> >>
>>> >> >> Those without a support contract would have been using JDK 7 without
>>> >> >> security updates for years. IMO it seems harmful, as a netizen, to
>>> >> >> encourage its use/existence.
>>> >> >>
>>> >> >> If there's no noise from the prior thread, then I would assume no one
>>> >> >> on user@ has any objection. Anyone else with customers should reach
>>> >> >> out to them.
>>> >> >>
>>> >> >> Kenn
>>> >> >>
>>> >> >> On Mon, Dec 4, 2017 at 9:49 PM, Reuven Lax wrote:
>>> >> >>
>>> >> >> Technically it's a backwards-incompatible change, however if we are
>>> >> >> convinced the risk is low we could do it.
>>> >> >>
>>> >> >> As mentioned on the original thread, it's not clear that all Beam
>>> >> >> users read user@ - e.g. most Dataflow users definitely do not. I
>>> >> >> think we need to separately reach out to users of each runner
>>> >> >> through runner-specific channels.
>>> >> >>
>>> >> >> Reuven
>>> >> >>
>>> >> >> On Mon, Dec 4, 2017 at 9:37 PM, Eugene Kirpichov wrote:
>>> >> >>
>>> >> >> On the original thread
>>> >> >> https://lists.apache.org/thread.html/2e1890c62d9f022f09b20e9f12f130fe9f1042e391979087f725d2e0@%3Cuser.beam.apache.org%3E ,
>>> >> >> Robert and Ismaël were in favor of no major version change
>>> >> >> [Ismaël said: /Also I am afraid that if we wait until we have enough
>>> >> >> changes to switch Beam to a new major version the switch to Java 8
>>> >> >> will happen too late, probably after Java 8's end of life. And I am
>>> >> >> not exaggerating, Java 8 is planned to EOL next March 2018!/];
>>> >> >> JB and now Reuven are in favor of a major version change;
>>> >> >> nobody so far argued against switching to Java8 in general.
>>> >> >>
>>> >> >> I'm personally in favor of no major version change (i.e. not
>>> >> >> waiting until all other

Re: Apache Beam, version 2.2.0

2017-12-04 Thread Ahmet Altay
Thank you Reuven! I tweeted the release announcement on Beam's account.

On Mon, Dec 4, 2017 at 9:49 PM, Reuven Lax  wrote:

> Technically it's a backwards-incompatible change, however if we are
> convinced the risk is low we could do it.
>
> As mentioned on the original thread, it's not clear that all Beam users
> read user@ - e.g. most Dataflow users definitely do not. I think we need
> to separately reach out to users of each runner through runner-specific
> channels.
>
> Reuven
>
> On Mon, Dec 4, 2017 at 9:37 PM, Eugene Kirpichov 
> wrote:
>
>> On the original thread
>> https://lists.apache.org/thread.html/2e1890c62d9f022f09b20e9f12f130fe9f1042e391979087f725d2e0@%3Cuser.beam.apache.org%3E ,
>> Robert and Ismaël were in favor of no major version change [Ismaël said:
>> *Also I am afraid that if we wait until we have enough changes to switch
>> Beam to a new major version the switch to Java 8 will happen too late,
>> probably after Java 8's end of life. And I am not exaggerating, Java 8 is
>> planned to EOL next March 2018!*]; JB and now Reuven are in favor of a
>> major version change; nobody so far argued against switching to Java8 in
>> general.
>>
>> I'm personally in favor of no major version change (i.e. not waiting
>> until all other large changes for Beam 3.0 converge, which will likely be
>> many months), because:
>> - Reasons Ismaël cited; plus the reason that most people are likely
>> already using Java 8.
>> - Going Java8-only earlier will make other Beam 3.0 APIs better for Java8
>> users, because we (Beam contributors) will have experience working with
>> them within the SDK in Java8 (e.g. writing tests with use of lambdas and
>> noticing whether it's clunky, or whether some other Beam APIs need better
>> Java8 support).
>> - Going Java8 will make it more reasonable to include (mostly or only)
>> Java8 snippets in Beam documentation, which will obviously look more
>> concise and attractive, addressing one of the common concerns of Beam users
>> that it has a heavyweight API compared to functional-style APIs of Spark
>> etc.
>>
>> I think resolving this via a poll of users would be reasonable. I'd
>> suggest e.g. the following phrasing:
>>
>> Apache Beam is considering dropping support for Java 7, and supporting
>> only Java 8 and above in a subsequent release. How would it impact your
>> usage of Beam?
>> - I am already using only Java 8+ for building my Beam code
>> - I am using Java 7 for building my Beam code, but I would have no
>> trouble switching to Java 8
>> - I am using Java 7 for building my Beam code, and dropping Java 7 would
>> be a blocker or hindrance to adopting the new release for me
>>
>> We could tweet this poll on Apache Beam twitter and publish on user@,
>> and, say, if we receive 5% or fewer votes for option 3 after keeping it
>> open for 2 weeks, then adopt Java 8 without a major version change.
>>
>> WDYT?
>>
>> On Mon, Dec 4, 2017 at 8:34 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> Good idea ! Definitely +1
>>>
>>> Regards
>>> JB
>>>
>>> On 12/05/2017 05:25 AM, Reuven Lax wrote:
>>> > We should bring this up on the Beam 3.0 thread. Since it's technically a
>>> > backwards-incompatible change, it might make a good item for Beam 3.0.
>>> >
>>> > Reuven
>>> >
>>> > On Mon, Dec 4, 2017 at 8:20 PM, Jean-Baptiste Onofré wrote:
>>> >
>>> > My apologies, I thought we had a consensus already.
>>> >
>>> > Regards
>>> > JB
>>> >
>>> > On 12/04/2017 11:22 PM, Eugene Kirpichov wrote:
>>> >
>>> > Thanks JB for sending the detailed notes about new stuff in 2.2.0! A lot
>>> > of exciting things indeed.
>>> >
>>> > Regarding Java 8: I thought our consensus was to have the release notes
>>> > say that we're *considering* going Java8-only, and use that to get more
>>> > opinions from the user community - but I can't find the emails that made
>>> > me think so.
>>> >
>>> > +Ismaël Mejía <ieme...@gmail.com> - do you think we should formally
>>> > conclude the vote on the thread [VOTE] [DISCUSSION] Remove support for
>>> > Java 7?
>>> > Or should we take more steps - e.g. perhaps tweet a link to that thread
>>> > from the Beam twitter account, ask people to chime in, and wait for say
>>> > 2 weeks before declaring a conclusion?
>>> >
>>> > Let's also have a process JIRA for going Java8. I've filed one:
>>> > https://issues.apache.org/jira/browse/BEAM-3285
>>> >
>>> > On Mon, Dec 4, 2017 at 1:58 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> > wrote:
>>> >
>>> >  Just an important note that we 
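
The decision rule proposed in the quoted poll (adopt Java 8 without a major
version change if option 3 gets 5% or fewer of the votes after two weeks) can
be sketched as below. This is only an illustration: the option labels and
function names are hypothetical, and the thread does not say how votes from
Twitter and user@ would be combined, so the sketch assumes a single flat list.

```python
from collections import Counter

# Hypothetical labels for the three poll answers described in the thread.
ALREADY_JAVA8 = "already_java8"
CAN_SWITCH = "java7_can_switch"
BLOCKED = "java7_blocked"


def blocked_share(votes):
    """Return the fraction of votes cast for the 'dropping Java 7 blocks me' option."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no votes recorded")
    return counts[BLOCKED] / total


def can_drop_java7(votes, threshold=0.05):
    """Apply the proposed rule: proceed if option 3 gets 5% or fewer of the votes."""
    return blocked_share(votes) <= threshold
```

The threshold is a parameter because the 5% figure was itself only a
suggestion in the thread.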

Re: [DISCUSS] Thinking about Beam 3.x roadmap and release schedule

2017-11-29 Thread Ahmet Altay
My wishlist for 2018 would be

- Python 3 support
- Python SDK to work with more runners. This is covered in portability in
general. I would like to see an enterprise grade Python SDK that can run on
a range of Beam runners.
- Related to the above item, full streaming support with Python SDK.
- Python SDK to catch-up on missing features. From larger APIs such as
State API to smaller things like setup/teardown support in DoFn.
- Interactive support, perhaps integrations with related projects like
Apache Zeppelin.


On Wed, Nov 29, 2017 at 5:19 AM, Ismaël Mejía  wrote:

> It is good to see so much enthusiasm about the future of Beam
> independently of the fact that we call it Beam 3 or not.
>
> I have some doubts about the idea of a release per month. Apache
> releases are designed to be slow-paced (via the 3-day voting process).
> All it takes is a holiday period in the same month, plus some issues
> during the release that require two RCs, and it will easily take two
> weeks (of course I understand the will to improve this considering our
> not so good status quo of 6 weeks for the last two votes). My point is
> that a monthly release can bring a ton of extra work to validate every
> release; remember, validating a release is not just running the unit
> tests.
>
> I want to add one idea to the wishlist for Beam in the future:
>
> - We need to improve Beam’s monitorability in a unified way even if
> this goes beyond the initial goals of the project because this is a
> big pain point for Beam adopters. We need things like system metrics
> and utilities to monitor what is going on with Beam pipelines in a
> runner-agnostic way.
>
> It would be nice to create JIRAs for the issues discussed in this
> thread (that don’t exist yet); with these we can follow them and
> organize some sort of roadmap.
>
>
> On Wed, Nov 29, 2017 at 7:05 AM, Romain Manni-Bucau
>  wrote:
> > Ps: forgot another wish: make beam sql usable. Today you need to add a fn
> > before and after because of the type breakage, which is not consistent
> > with the pipeline API. It would be nice to support pojo (extracted from
> > the select fields or created from "views" like in jackson) but not having
> > to wrap the sql usage in multiple UDF would make it powerful and ready to
> > use.
> >
> > On Nov 29, 2017 07:01, "Romain Manni-Bucau" wrote:
> >>
> >> My user wishes - whatever version, it is just a number after all ;):
> >>
> >> - make coder usage simpler and consistent (PCollection TypeDescriptor and
> >> Coder are duplicated in terms of API)
> >> - have a beam api (split from the sdk and internals and impl)
> >> - have SDF supported by runners
> >> - have a SDFRunner allowing to simulate the SDF lifecycle manually (same
> >> for DoFn short term - see next point for the current issue)
> >> - ensure classloader usage is consistent, ie any proxy is created into the
> >> final artifact classloader (transform if custom, dofn/source/sdf otherwise)
> >> - have a test compatibility kit (TCK) for runner. It would be a jar any
> >> runner impl can import to run with surefire
> >> - make IO configuration reflection friendly (get rid of the autovalue
> >> pattern which is not industrializable and allow pojo like classes or
> >> alternatively support reading the conf from properties)
> >> - support pipeline implicit option based on transform names to override
> >> some attributes
> >> - change runner implementations to let the bundle size have a pipeline
> >> option defining an upper bound and not hardcode them arbitrarily -
> >> defaults can stay the current ones
> >> - better multi input/output support (just PCollection based and fully
> >> wireable)
> >> - a smoother pipeline API would be nice. I like hazelcast jet one for
> >> instance
> >>
> >> On Nov 29, 2017 03:29, "Robert Bradshaw" wrote:
> >>>
> >>> On Tue, Nov 28, 2017 at 9:48 AM, Reuven Lax  wrote:
> >>> >
> >>> > On Tue, Nov 28, 2017 at 9:14 AM, Jean-Baptiste Onofré <
> j...@nanthrax.net>
> >>> > wrote:
> >>> >>
> >>> >> Hi Reuven,
> >>> >>
> >>> >> Yes, I remember that we agreed on a release per month. However, we
> >>> >> didn't do it before. I think the most important is not the period,
> >>> >> it's more a stable pace. I think it's more interesting for our
> >>> >> community to have "always" a release every two months, more than an
> >>> >> attempt at a release every month that ends later than that. Of
> >>> >> course, if we can do both, it's perfect ;)
> >>> >
> >>> > Agree. A stable pace is the most important thing.
> >>>
> >>> +1, and I think everyone who's done a release is in favor of making it
> >>> easier and more frequent. Someone should put together a proposal of
> >>> easy things we can do to automate, etc.
> >>>
> >>> >> For Beam 3.x, I wasn't talking about breaking change, but more about
> >>> >> "marketing" 

Re: Version 2.2.0 release date

2017-11-22 Thread Ahmet Altay
Hi Stefania,

Release candidate for 2.2.0 is currently being voted on [1]. The release will
happen after a successful vote.

Ahmet

[1] https://lists.apache.org/thread.html/da2acabdb15c9f8d11351f9167633a4b089664fe3cce014ba619c937@%3Cdev.beam.apache.org%3E

On Mon, Nov 20, 2017 at 7:04 AM, Stefania Mantisi <
stefania.mant...@noovle.it> wrote:

> Hi everyone,
> I saw all the issues are currently marked as having been fixed.
> When will version 2.2 come out?
>
> Thank you!
>
> Stefania Mantisi
>
> --
> *Stefania Mantisi *
> Software Engineer - Cloud Development - Noovle S.r.l.
> mail: stefania.mant...@noovle.it
> 
>
> Noovle  | The Nexus of forces
>


Re: Python SDK DirectRunner (new feature)

2017-11-20 Thread Ahmet Altay
Thank you María.

On Mon, Nov 20, 2017 at 5:31 PM, María García Herrero <
mari...@google.com.invalid> wrote:

> Hello,
>
> I recently worked on adding a bundle retry for the Python SDK DirectRunner
> (https://issues.apache.org/jira/browse/BEAM-2718).
>
> The goal was to have a more reliable processing of bundles. The change
> included having any bundle retry be processed up to 4 times and making sure
> GroupByKey doesn't do partial write-outs in case of a retry. For 2.2.0, it
> will remain in opt-in mode with a message in case of failure, suggesting
> the use of the --direct_runner_bundle_retry flag. In our next release, it
> will be fully integrated (opt-in removed).
>
> If you have any questions about the addition, please let me know.
>
> Best,
>
> María
>
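
The retry behaviour described above (a bounded number of attempts per bundle,
with no partial output surviving a failed attempt) can be sketched in plain
Python. This is not the DirectRunner code: the function name and signature are
hypothetical, and it assumes "up to 4 times" means four attempts in total.

```python
def process_with_retry(process_element, bundle, max_attempts=4):
    """Process every element of a bundle, retrying the whole bundle on failure.

    Output is staged per attempt and returned only when an attempt succeeds,
    so a failed attempt never leaves partial results behind.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        staged = []  # fresh buffer: results of a failed attempt are discarded
        try:
            for element in bundle:
                staged.extend(process_element(element))
        except Exception as error:  # any failure triggers a full-bundle retry
            last_error = error
            continue
        return staged, attempt
    raise RuntimeError(
        "bundle failed after %d attempts" % max_attempts) from last_error
```

Staging the output per attempt is what keeps a retried bundle from emitting
the same results twice.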


Re: HDFS Support for Python SDK

2017-11-20 Thread Ahmet Altay
Thank you Udi, this is a great comparison of available options.

On Mon, Nov 20, 2017 at 5:26 PM, Udi Meiri  wrote:

> Hi,
>
> I've done some research into implementing HDFS support for Python SDK and
> I'd like your input. This work is regarding BEAM-3099.
>
> This doc lists several options for implementing HDFS support and attempts
> to weigh the differences.
> https://docs.google.com/document/d/1-uzKf4VPlGrkBMXM00sxxf3K01Ss3ZzXeju0w5L0LY0/edit?usp=sharing
>
> Thanks,
> Udi
>


Re: [VOTE] Release 2.2.0, release candidate #4

2017-11-20 Thread Ahmet Altay
+1

I verified the python quick start on Windows. I could not verify the
documentation changes because the staged version expired.


On Mon, Nov 20, 2017 at 12:08 PM, Eugene Kirpichov <
kirpic...@google.com.invalid> wrote:

> Thanks Luke. I was able to validate quickstart on Dataflow and on Spark
> cluster (using Cloud Dataproc). So +1 from me so far.
>
> On Sun, Nov 19, 2017 at 4:28 PM Lukasz Cwik 
> wrote:
>
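> > Eugene, you can set up your ~/.m2/settings.xml to point to the repository
> > containing the release candidate.
> > <settings>
> >   <profiles>
> >     <profile>
> >       <id>release-repo</id>
> >       <activation>
> >         <activeByDefault>true</activeByDefault>
> >       </activation>
> >       <repositories>
> >         <repository>
> >           <id>Release 2.2.0 RC4</id>
> >           <name>Release 2.2.0 RC4</name>
> >           <url>https://repository.apache.org/content/repositories/orgapachebeam-1025/</url>
> >         </repository>
> >       </repositories>
> >     </profile>
> >   </profiles>
> > </settings>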
> >
> > The URL for the release candidate is always part of the vote e-mail.
> > For more details about having multiple repositories, take a look at
> > https://maven.apache.org/guides/mini/guide-multiple-repositories.html
> >
> > On Fri, Nov 17, 2017 at 5:09 PM, Reuven Lax 
> > wrote:
> >
> > > > hmmm, I thought I removed those generated files from the zip file
> > > > before sending this email. Let me check again.
> > >
> > > Reuven
> > >
> > > On Sat, Nov 18, 2017 at 8:52 AM, Robert Bradshaw <
> > > rober...@google.com.invalid> wrote:
> > >
> > > > > The source distribution contains a couple of files not on github (e.g.
> > > > > folders that were added on master, Python generated files). The pom
> > > > > files differed only by missing -SNAPSHOT, other than that presumably
> > > > > the source release should just be "wget
> > > > > https://github.com/apache/beam/archive/release-2.2.0.zip"?
> > > >
> > > > diff -rq apache-beam-2.2.0 beam/ | grep -v pom.xml
> > > >
> > > > # OK?
> > > >
> > > > Only in apache-beam-2.2.0: DEPENDENCIES
> > > >
> > > > # Expected.
> > > >
> > > > Only in beam/: .git
> > > > Only in beam/: .gitattributes
> > > > Only in beam/: .gitignore
> > > >
> > > > # These folders are probably from switching around between master and
> > > > git branches.
> > > >
> > > > Only in apache-beam-2.2.0: model
> > > > Only in apache-beam-2.2.0/runners/flink: examples
> > > > Only in apache-beam-2.2.0/runners/flink: runner
> > > > Only in apache-beam-2.2.0/runners/gearpump: jarstore
> > > > Only in apache-beam-2.2.0/sdks/java/extensions: gcp-core
> > > > Only in apache-beam-2.2.0/sdks/java/extensions: sketching
> > > > Only in apache-beam-2.2.0/sdks/java/io: file-based-io-tests
> > > > Only in apache-beam-2.2.0/sdks/java/io: hdfs
> > > > > Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources: src
> > > > > Only in apache-beam-2.2.0/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources: src
> > > > Only in apache-beam-2.2.0/sdks/java: microbenchmarks
> > > >
> > > > # Here's the generated protos.
> > > >
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_artifact_api_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_artifact_api_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_fn_api_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_fn_api_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_job_api_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_job_api_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_provision_api_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_provision_api_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_runner_api_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: beam_runner_api_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: endpoints_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: endpoints_pb2.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: standard_window_fns_pb2_grpc.py
> > > > > Only in apache-beam-2.2.0/sdks/python/apache_beam/portability/api: standard_window_fns_pb2.py
> > > >
> > > > And some other sdist generated Python files.
> > > >
> > > > Only in apache-beam-2.2.0/sdks/python: .eggs
> > > > Only in apache-beam-2.2.0/sdks/python: LICENSE
> > > > Only in apache-beam-2.2.0/sdks/python: NOTICE
> > > > Only in apache-beam-2.2.0/sdks/python: README.md
> > > >
> > > > Presumably we should just purge these files from the rc?
> > > >
> > > >
> > > > FWIW, the Python tarball looks fine.
> > > >
> > > > On Fri, Nov 17, 2017 at 
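
The "Only in ..." part of the `diff -rq apache-beam-2.2.0 beam/ | grep -v
pom.xml` check quoted above can be approximated with the standard library.
This is a rough sketch with a hypothetical function name: it ignores entries
named pom.xml rather than filtering output lines, and it does not report files
that exist in both trees with different content, which `diff -rq` would.

```python
import filecmp
import os


def only_in_one_tree(left, right, ignore_names=("pom.xml",)):
    """Yield (directory, name) for entries present in exactly one of two trees."""
    # Passing ignore= replaces dircmp's default ignore list, so entries such
    # as .git are still reported, as they are in the quoted diff output.
    comparison = filecmp.dircmp(left, right, ignore=list(ignore_names))
    for name in sorted(comparison.left_only):
        yield left, name
    for name in sorted(comparison.right_only):
        yield right, name
    # Recurse into directories that exist on both sides.
    for sub in sorted(comparison.subdirs):
        yield from only_in_one_tree(
            os.path.join(left, sub), os.path.join(right, sub), ignore_names)
```

Calling `list(only_in_one_tree("apache-beam-2.2.0", "beam"))` would then list
the extra files on either side, given those two directories exist locally.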

Re: python3 support schedule

2017-11-05 Thread Ahmet Altay
For reference https://issues.apache.org/jira/browse/BEAM-1251 is the
umbrella issue tracking python3 support in the core SDK. There needs to be
additional runner specific work (e.g. DataflowRunner needs to use python3
binary on its workers) once the core work is completed.

Ahmet

On Thu, Nov 2, 2017 at 12:58 PM, Jesse Anderson 
wrote:

> Holden is being modest in her contributions to Python frameworks,
> especially Apache Spark.
>
> On Thu, Nov 2, 2017 at 12:55 PM Holden Karau  wrote:
>
> > Hi! So this is something I'm currently working on (e.g. in between
> > checking my e-mails :p). If you want to help join in we can split up the
> > work into smaller components and parallelize the process a bit :) Always
> > happy to see more folks who care about Python 3 support.
> >
> > On Thu, Nov 2, 2017 at 12:44 PM, Lukasz Cwik 
> > wrote:
> >
> > > Contributions are always welcome to improve progress.
> > >
> > > You can always vote/watch the Python 3 JIRA issue as this helps people
> > > know what others are looking for.
> > >
> > > On Thu, Nov 2, 2017 at 10:33 AM, Yue Yang  wrote:
> > >
> > > > Hello,
> > > >   I wonder what the schedule is to support python 3. It seems that
> > > > the progress is very slow.
> > > >   Thanks.
> > > >
> > >
> >
> >
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
> >
> --
> Thanks,
>
> Jesse
>


Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-01 Thread Ahmet Altay
Has anyone started a POC with Bazel? I would be interested in helping that
effort.

On Wed, Nov 1, 2017 at 9:27 AM, Lukasz Cwik 
wrote:

> I have started a POC for using Gradle here:
> https://github.com/lukecwik/incubator-beam/tree/gradle
>
> Things that work:
> * compiling all Java code (src/main and src/test)
> * generating source from protos
> * generating source from avro
> * running rat, checkstyle
>
> Partially working:
> * generating maven pom (albeit with wrong dependencies for some
> subprojects)
> * running tests (~80% pass, remainder seem to be dependency related but are
> uninvestigated)
>
> Things that don't work:
> * anything Python/Go/Docker compilation related
> * many tests fail because I messed up dependencies
> * anything shading related
> * minor plugins like eclipse code formatter/...
> * running @NeedsRunner/@ValidatesRunner/integration tests
>
> Feel free to reach out to me on Slack if you would like to try to tackle a
> piece of the POC to prevent duplication of effort from anyone working on
> it.
>
>
>
> On Tue, Oct 31, 2017 at 10:25 PM, Jean-Baptiste Onofré 
> wrote:
>
> > Agree to move forward on a PoC.
> >
> > Thanks Reuven for bringing discussion on the mailing list !
> >
> > Regards
> > JB
> >
> > On Nov 1, 2017, 03:20, at 03:20, Reuven Lax 
> > wrote:
> > >Some good discussion here, and thanks to JB and Romain for adding to
> > >it!
> > >
> > >JB makes the good point that we still need to release Maven artifacts,
> > >as
> > >many Beam users want to develop using Maven. So none of this discussion
> > >will affect our release process, as we still need Maven "releases."
> > >
> > >At this point, if people are interested, I see no harm in prototyping.
> > >Having working alternatives will give us a better basis for comparison
> > >to
> > >understand whether these other build systems give us anything over what
> > >Maven does.
> > >
> > >Reuven
> > >
> > >On Tue, Oct 31, 2017 at 11:05 AM, Charles Chen 
> > >wrote:
> > >
> > >> As a contributor to the Beam Python SDK, I noticed that many of the
> > >> points above regarding Maven and Gradle pertain mostly to Java SDK
> > >> development. For Python development, Maven is much less natural, and we
> > >> end up just shelling out to perform builds and tests.  For Python SDK
> > >> (and upcoming Go SDK development), an option to use Bazel would be quite
> > >> useful.
> > >>
> > >> On Tue, Oct 31, 2017 at 10:42 AM Robert Bradshaw
> > >>  wrote:
> > >>
> > >> > +1, Maven is both a build tool and a repository, and the latter is
> > >> > essential to keep. Both Gradle and Bazel can interface with this
> > >> > repository.
> > >> >
> > >> > I am, however, very supportive of moving away from Maven to a tool
> > >> > that supports correct incremental, hermetic, dependency-driven,
> > >> > multi-language, and hopefully fast builds for our own development.
> > >> >
> > >> > On Tue, Oct 31, 2017 at 10:00 AM, Kenneth Knowles
> > >> >  wrote:
> > >> > > Echoing what JB and Reuven said, we absolutely must provide maven
> > >> > > central artifacts for Java users, just as we provide pypi artifacts
> > >> > > for Python users.
> > >> > >
> > >> > > I see Maven as still a viable tool for single-module Java builds,
> > >> > > especially considering its rich plugin ecosystem.
> > >> > >
> > >> > > On Mon, Oct 30, 2017 at 11:27 PM, Reuven Lax wrote:
> > >> > >
> > >> > >> I think that's a very good point. No matter what build system we
> > >> > >> use for our own personal development, we still need to release
> > >> > >> Maven artifacts and releases as we need to support our users using
> > >> > >> Maven.
> > >> > >>
> > >> > >> On Mon, Oct 30, 2017 at 11:26 PM, Jean-Baptiste Onofré
> > >> > >> <j...@nanthrax.net> wrote:
> > >> > >>
> > >> > >> > Generally speaking, it's interesting to evaluate alternatives,
> > >> > >> > especially Gradle. My point is also to keep Maven artifacts and
> > >> > >> > "releases" as most of our users will use Maven.
> > >> > >> > For incremental build, afair, there's some enhancements on Maven
> > >> > >> > but I have to take a look.
> > >> > >> >
> > >> > >> > Regards
> > >> > >> > JB
> > >> > >> >
> > >> > >> > On Oct 31, 2017, 07:22, at 07:22, Eugene Kirpichov
> > >> > >> >  wrote:
> > >> > >> > >Hi!
> > >> > >> > >
> > >> > >> > >Many of these points sound valid, but AFAICT Maven doesn't
> > >> > >> > >really do incremental builds [1]. The best it can do is, it
> > >> > >> > >seems, recompile only changed files, but Java compilation is a
> > >> > >> > >tiny part of the overall build.
> > >> > >> > >
> > >> > >> > >Almost all time is taken by other plugins, 

Re: [Proposal] Sharing Neville's post and upcoming meetups in the Twitter handle

2017-10-26 Thread Ahmet Altay
Done.

On Thu, Oct 26, 2017 at 11:26 AM, Griselda Cuevas  wrote:

> Hi folks, could you help us to tweet the second part of Neville's blogpost?
>
> Here's a suggested tweet:
> 2nd part of @sinisa_lyh's post is out! Read how @Spotify developed Scio, a
> high level Scala API 4 the Beam Java SDK. https://goo.gl/kyjr4n
>
> Thx!
>
>
>
> On 20 October 2017 at 13:08, Griselda Cuevas  wrote:
>
>> Hi everyone - What do you think about sharing Neville's blogpost[1] about
>> the road to Scio on the Apache Beam Twitter account? I think it'd be good
>> to share some content since the last time we were active was 9/27.
>>
>> Also - could you help promote some of the upcoming Meetups? I made the
>> following tweets:
>>
>> 11/1 - San Francisco Cloud Mafia
>> Tweet:
>> Come join the SF Cloud Mafia to learn about stream & batch processing
>> with #ApacheBeam on Nov. 1st.
>> https://www.meetup.com/San-Francisco-Cloud-Mafia/events/244180581/
>>
>> 11/22 - Stockholm Apache Beam Meetup
>> Tweet:
>> Stockholm is ready for its first #ApacheBeam meetup on Nov. 22nd. Join if
>> you're around! https://www.meetup.com/Apache-Beam-Stockholm/
>>
>> [1] https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/
>>
>> Thanks!
>> G
>>
>
>


Re: [Proposal] Sharing Neville's post and upcoming meetups in the Twitter handle

2017-10-20 Thread Ahmet Altay
This makes sense to me. I published the first tweet, we can publish the
second one perhaps closer to the event.

Ahmet

On Fri, Oct 20, 2017 at 1:08 PM, Griselda Cuevas 
wrote:

> Hi everyone - What do you think about sharing Neville's blogpost[1] about
> the road to Scio on the Apache Beam Twitter account? I think it'd be good
> to share some content since the last time we were active was 9/27.
>
> Also - could you help promote some of the upcoming Meetups? I made the
> following tweets:
>
> 11/1 - San Francisco Cloud Mafia
> Tweet:
> Come join the SF Cloud Mafia to learn about stream & batch processing with
> #ApacheBeam on Nov. 1st.
> https://www.meetup.com/San-Francisco-Cloud-Mafia/events/244180581/
>
> 11/22 - Stockholm Apache Beam Meetup
> Tweet:
> Stockholm is ready for its first #ApacheBeam meetup on Nov. 22nd. Join if
> you're around! https://www.meetup.com/Apache-Beam-Stockholm/
>
> [1] https://labs.spotify.com/2017/10/16/big-data-processing-at-spotify-the-road-to-scio-part-1/
>
> Thanks!
> G
>


Re: New contributor

2017-10-18 Thread Ahmet Altay
Welcome Vilhelm!

On Wed, Oct 18, 2017 at 4:55 AM, Etienne Chauchot 
wrote:

> Welcome!
>
>
>
> On 17/10/2017 at 22:18, Vilhelm von Ehrenheim wrote:
>
>> Hi everyone!
>> My name is Vilhelm von Ehrenheim and I would like to start contributing to
>> Beam.
>> I work as a Data Engineer at EQT in Stockholm and we are actively using
>> Beam on Dataflow for our processing tasks. That aside, I am also the host
>> of the Apache Beam Stockholm Meetup and would love to start being more
>> involved in the project.
>>
>> My ASF id is `while`.
>>
>> Regards,
>> Vilhelm von Ehrenheim
>>
>>
>


Re: Problem while upgrading lib

2017-10-03 Thread Ahmet Altay
The google-apitools dependency (which is required for GCS) does not work
with oauth2client >= 4.0.0 [1]. Because of this, the Beam Python SDK also
does not work with oauth2client >= 4.0.0, and this is captured
correctly in the setup.py [2].

Ahmet

[1]
https://github.com/google/apitools/blob/7aff8d88960b669c9e946c938de5841c5f296f4f/setup.py#L32
[2]
https://github.com/apache/beam/blob/f9bc76364636b92239510f9e6bd242ea0ea62ac6/sdks/python/setup.py#L104
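
As a minimal sketch of the constraint described above, the check below treats
4.0.0 as an exclusive upper bound on the installed oauth2client release. The
helper names are hypothetical, and the parser only handles plain dotted
numeric versions (no pre-release or local suffixes).

```python
def parse_version(text):
    """Turn a dotted release string such as '3.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def oauth2client_is_compatible(installed, upper_bound="4.0.0"):
    """True when the installed release is strictly below the upper bound."""
    return parse_version(installed) < parse_version(upper_bound)
```

In practice this means installing with a pin such as 'oauth2client<4.0.0'
next to apache_beam instead of forcing 'oauth2client>=4.0.0'.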

On Tue, Sep 19, 2017 at 8:40 AM, Morand, Sebastien <
sebastien.mor...@veolia.com> wrote:

> Hi,
>
> No help on this?
>
> Regards,
>
> *Sébastien MORAND*
> Team Lead Solution Architect
> Technology & Operations / Digital Factory
> Veolia - Group Information Systems & Technology (IS)
> Cell.: +33 7 52 66 20 81 / Direct: +33 1 85 57 71 08
> Bureau 0144C (Ouest)
> 30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
> *www.veolia.com*
>
> On 15 September 2017 at 10:26, Morand, Sebastien <
> sebastien.mor...@veolia.com> wrote:
>
> > Hi,
> >
> > I got a problem when I install the oauth2client>=4.0.0 version:
> >
> > >>> import apache_beam.io.gcp.gcsio
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "/home/ubuntu/workspace/tmp/beam_oauth/env/local/lib/python2.7/site-packages/apache_beam/io/gcp/gcsio.py", line 53, in <module>
> >     'Google Cloud Storage I/O not supported for this execution environment '
> > ImportError: Google Cloud Storage I/O not supported for this execution
> > environment (could not import storage API client).
> > >>>
> >
> > Steps to reproduce:
> > virtualenv env && source env/bin/activate && pip install
> > 'apache_beam==2.1.0' 'oauth2client>=4.0.0' && echo "import
> > apache_beam.io.gcp.gcsio"|python2
> >
> > What is going on? How can I make apache_beam work with oauth2client >= 4?
> >
> > Thanks in advance,
> > Regards,
> >
> > *Sébastien MORAND*
> > Team Lead Solution Architect
> > Technology & Operations / Digital Factory
> > Veolia - Group Information Systems & Technology (IS)
> > Cell.: +33 7 52 66 20 81 / Direct: +33 1 85 57 71 08
> > Bureau 0144C (Ouest)
> > 30, rue Madeleine-Vionnet - 93300 Aubervilliers, France
> > *www.veolia.com*
> >
>

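For readers following the setup.py link, here is a minimal sketch of the kind
of check a pinned dependency range implies. The bounds used below are
hypothetical placeholders, not values taken from the linked file, and the
helper names are made up for illustration. It assumes plain dotted numeric
versions only.

```python
def parse_version(text):
    """Turn '4.0.0' into (4, 0, 0). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def within_range(installed, lower, upper):
    """Return True if lower <= installed < upper (upper bound exclusive)."""
    return parse_version(lower) <= parse_version(installed) < parse_version(upper)


# Hypothetical bounds for illustration; check setup.py for the real pin.
LOWER, UPPER = "2.0.1", "4.0.0"

if __name__ == "__main__":
    for candidate in ("3.0.0", "4.0.0"):
        print(candidate, within_range(candidate, LOWER, UPPER))
```

If the real pin does exclude 4.x, installing 'oauth2client>=4.0.0' next to
apache_beam would fall outside the supported range, which would be consistent
with the import failure reported above.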

Ahmet offline for the next 3 weeks

2017-09-07 Thread Ahmet Altay
Hi all,

I will be on vacation from tomorrow through the first week of October, and I
will not be able to respond to most things.

Happy Beaming,
Ahmet


Re: Merge branch DSL_SQL to master

2017-09-07 Thread Ahmet Altay
+1 Thanks to all contributors/reviewers!

On Thu, Sep 7, 2017 at 9:55 AM, Kai Jiang  wrote:

> +1 looking forward to this.
>
> On Thu, Sep 7, 2017, 09:53 Tyler Akidau 
> wrote:
>
> > +1, thanks for all the hard work to everyone that contributed!
> >
> > -Tyler
> >
> > On Thu, Sep 7, 2017 at 2:39 AM Ismaël Mejía  wrote:
> >
> > > +1
> > > A nice feature to have on Beam. Great work guys !
> > >
> > > On Thu, Sep 7, 2017 at 10:21 AM, Pei HE  wrote:
> > > > +1
> > > >
> > > > On Thu, Sep 7, 2017 at 4:03 PM, tarush grover <
> tarushappt...@gmail.com
> > >
> > > > wrote:
> > > >
> > > >> Thank you all, it was a great learning experience!
> > > >>
> > > >> Regards,
> > > >> Tarush
> > > >>
> > > >> On Thu, 7 Sep 2017 at 1:05 PM, Jean-Baptiste Onofré <
> j...@nanthrax.net>
> > > >> wrote:
> > > >>
> > > >> > +1
> > > >> >
> > > >> > Great work guys !
> > > >> > Ready to help for the merge and maintain !
> > > >> >
> > > >> > Regards
> > > >> > JB
> > > >> >
> > > >> > On 09/07/2017 08:48 AM, Mingmin Xu wrote:
> > > >> > > Hi all,
> > > >> > >
> > > >> > > On behalf of the virtual Beam SQL team[1], I'd like to propose
> to
> > > merge
> > > >> > > DSL_SQL branch into master (PR #3782 [2]) and include it in
> > release
> > > >> > version
> > > >> > > 2.2.0, which will give it more visibility to other contributors
> > and
> > > >> > users.
> > > >> > > The SQL feature satisfies the following criteria outlined in
> > > >> contribution
> > > >> > > guide[3].
> > > >> > >
> > > >> > > 1. Have at least 2 contributors interested in maintaining it,
> and
> > 1
> > > >> > > committer interested in supporting it
> > > >> > >
> > > > >> > > * James and I will continue to develop new features and maintain it;
> > > > >> > >
> > > > >> > >   Tyler, James and I will support it as committers;
> > > >> > >
> > > >> > > 2. Provide both end-user and developer-facing documentation
> > > >> > >
> > > >> > > * A web page[4] is added to describe the usage of SQL DSL and
> how
> > it
> > > >> > works;
> > > >> > >
> > > >> > >
> > > >> > > 3. Have at least a basic level of unit test coverage
> > > >> > >
> > > > >> > > * 230 unit/integration tests in total, with code coverage of 83.4%;
> > > >> > >
> > > >> > > 4. Run all existing applicable integration tests with other Beam
> > > >> > components
> > > >> > > and create additional tests as appropriate
> > > >> > >
> > > > >> > > * Besides the integration tests in package
> > > >> > org.apache.beam.sdk.extensions.sql,
> > > >> > > there's another example in
> > > org.apache.beam.sdk.extensions.sql.example.
> > > >> > > BeamSqlExample.
> > > >> > >
> > > >> > > [1]. Special thanks to all contributors/reviewers:
> > > >> > >
> > > >> > >   Tyler Akidau
> > > >> > >
> > > >> > >   Davor Bonaci
> > > >> > >
> > > >> > >   Robert Bradshaw
> > > >> > >
> > > >> > >   Lukasz Cwik
> > > >> > >
> > > >> > >   Tarush Grover
> > > >> > >
> > > >> > >   Kai Jiang
> > > >> > >
> > > >> > >   Kenneth Knowles
> > > >> > >
> > > >> > >   Jingsong Lee
> > > >> > >
> > > >> > >   Ismaël Mejía
> > > >> > >
> > > >> > >   Jean-Baptiste Onofré
> > > >> > >
> > > >> > >   James Xu
> > > >> > >
> > > >> > >   Mingmin Xu
> > > >> > >
> > > >> > > [2]. https://github.com/apache/beam/pull/3782
> > > >> > >
> > > >> > > [3]. https://beam.apache.org/contribute/contribution-guide/
> > > >> > > #merging-into-master
> > > >> > >
> > > >> > > [4]. https://beam.apache.org/documentation/dsls/sql/
> > > >> > >
> > > >> > > Thanks!
> > > >> > > 
> > > >> > > Mingmin
> > > >> > >
> > > >> >
> > > >> > --
> > > >> > Jean-Baptiste Onofré
> > > >> > jbono...@apache.org
> > > >> > http://blog.nanthrax.net
> > > >> > Talend - http://www.talend.com
> > > >> >
> > > >>
> > >
> >
>


Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-23 Thread Ahmet Altay
On Tue, Aug 22, 2017 at 5:12 PM, Ahmet Altay <al...@google.com> wrote:

> I believe this release is complete now. Thank you JB for pushing this
> release, and everyone else who contributed to it.
>
> On Tue, Aug 22, 2017 at 4:26 PM, Ahmet Altay <al...@google.com> wrote:
>
>> Remaining items for closing this release are:
>> - Move source distribution from dev repository to release repository in
>> dist.apache.org
>>
>
> Done.
>
>
>> - Finalize the version in JIRA.
>>
>
> Done.
>
>
>> - Announce on user@ and other places.
>>
>
> JB, I think it would be best if you announce this to user@ as the release
> manager. If you do not have time I will send the announcement message
> tomorrow.
>

I sent out the release email to user@. Everyone, please feel free to promote
the release on other channels.


>
>
>>
>> On Tue, Aug 22, 2017 at 2:18 PM, Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 22, 2017 at 2:07 PM, Ahmet Altay <al...@google.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 22, 2017 at 12:08 PM, Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you JB.
>>>>>>
>>>>>> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <
>>>>>> j...@nanthrax.net> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> This vote passed with only +1.
>>>>>>>
>>>>>>> I'm promoting the artifacts to central and update Jira.
>>>>>>>
>>>>>>
>>>>>> Are you also pushing the PyPI artifacts? Do you want me to do it?
>>>>>>
>>>>>
>>> Let me know if you want me to push to PyPI.
>>>
>>
>> I noticed that the Maven artifacts are published and for parity published
>> the PyPI packages.
>>
>>
>>>
>>>
>>>>
>>>>>>
>>>>>>
>>>>>>> As I'm in vacation can a committer deal with the tag and website or
>>>>>>> merge ?
>>>>>>>
>>>>>>
>>>>>> Website PR had errors with dead-links. Re-running the test. If it
>>>>>> fails again, I can merge and we can fix with a follow up PR.
>>>>>>
>>>>>
>>>>> I noticed that pydocs contents are empty (i.e. headers are in there,
>>>>> but content is missing) in this PR. I am not sure what caused it. What is
>>>>> the recommendation here, we can do two things:
>>>>>
>>>>> a) Drop this PR, and re-create whole documentation again
>>>>> b) Merge this PR, and create another PR to fix the pydocs.
>>>>>
>>>>
>>>> This is done. I updated the pydocs while merging it. Pydoc generation
>>>> worked fine for me, I do not know what was the issue.
>>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> I can do the tagging.
>>>>>>
>>>>>
>>> This is also done.
>>>
>>>
>>>>
>>>>>>
>>>>>>>
>>>>>>> Sorry for this very short e-mail. Thanks all for your vote.
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>>
>>>>>>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <
>>>>>>> j...@nanthrax.net> wrote:
>>>>>>> >Hi
>>>>>>> >
>>>>>>> >I'm in vacation so I'm looking for a decent Internet connection to
>>>>>>> >finalize the release.
>>>>>>> >
>>>>>>> >I keep you posted.
>>>>>>> >
>>>>>>> >Regards
>>>>>>> >JB
>>>>>>> >
>>>>>>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>>>>>>> ><kirpic...@google.com.INVALID> wrote:
>>>>>>> >>Hi JB,
>>>>>>> >>
>>>>>>> >>Any updates on finalizing the releas

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
I believe this release is complete now. Thank you JB for pushing this
release, and everyone else who contributed to it.

On Tue, Aug 22, 2017 at 4:26 PM, Ahmet Altay <al...@google.com> wrote:

> Remaining items for closing this release are:
> - Move source distribution from dev repository to release repository in
> dist.apache.org
>

Done.


> - Finalize the version in JIRA.
>

Done.


> - Announce on user@ and other places.
>

JB, I think it would be best if you announce this to user@ as the release
manager. If you do not have time I will send the announcement message
tomorrow.


>
> On Tue, Aug 22, 2017 at 2:18 PM, Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Tue, Aug 22, 2017 at 2:07 PM, Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 22, 2017 at 12:08 PM, Ahmet Altay <al...@google.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> Thank you JB.
>>>>>
>>>>> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <
>>>>> j...@nanthrax.net> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> This vote passed with only +1.
>>>>>>
>>>>>> I'm promoting the artifacts to central and update Jira.
>>>>>>
>>>>>
>>>>> Are you also pushing the PyPI artifacts? Do you want me to do it?
>>>>>
>>>>
>> Let me know if you want me to push to PyPI.
>>
>
> I noticed that the Maven artifacts are published and for parity published
> the PyPI packages.
>
>
>>
>>
>>>
>>>>>
>>>>>
>>>>>> As I'm in vacation can a committer deal with the tag and website or
>>>>>> merge ?
>>>>>>
>>>>>
>>>>> Website PR had errors with dead-links. Re-running the test. If it
>>>>> fails again, I can merge and we can fix with a follow up PR.
>>>>>
>>>>
>>>> I noticed that pydocs contents are empty (i.e. headers are in there,
>>>> but content is missing) in this PR. I am not sure what caused it. What is
>>>> the recommendation here, we can do two things:
>>>>
>>>> a) Drop this PR, and re-create whole documentation again
>>>> b) Merge this PR, and create another PR to fix the pydocs.
>>>>
>>>
>>> This is done. I updated the pydocs while merging it. Pydoc generation
>>> worked fine for me, I do not know what was the issue.
>>>
>>>
>>>>
>>>>
>>>>>
>>>>> I can do the tagging.
>>>>>
>>>>
>> This is also done.
>>
>>
>>>
>>>>>
>>>>>>
>>>>>> Sorry for this very short e-mail. Thanks all for your vote.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>>
>>>>>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <
>>>>>> j...@nanthrax.net> wrote:
>>>>>> >Hi
>>>>>> >
>>>>>> >I'm in vacation so I'm looking for a decent Internet connection to
>>>>>> >finalize the release.
>>>>>> >
>>>>>> >I keep you posted.
>>>>>> >
>>>>>> >Regards
>>>>>> >JB
>>>>>> >
>>>>>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>>>>>> ><kirpic...@google.com.INVALID> wrote:
>>>>>> >>Hi JB,
>>>>>> >>
>>>>>> >>Any updates on finalizing the release?
>>>>>> >>
>>>>>> >>Thanks.
>>>>>> >>
>>>>>> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <
>>>>>> aljos...@apache.org>
>>>>>> >>wrote:
>>>>>> >>
>>>>>> >>> (Belated) +1
>>>>>> >>>
>>>>>> >>>  * verified signatures
>>>>>> >>>  * verified that Quickstart works with Flink Runner
>>>>>> >>>
>>>>>> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>>>>>> >>

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
Remaining items for closing this release are:
- Move source distribution from dev repository to release repository in
dist.apache.org
- Finalize the version in JIRA.
- Announce on user@ and other places.

On Tue, Aug 22, 2017 at 2:18 PM, Ahmet Altay <al...@google.com> wrote:

>
>
> On Tue, Aug 22, 2017 at 2:07 PM, Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Tue, Aug 22, 2017 at 12:08 PM, Ahmet Altay <al...@google.com> wrote:
>>
>>>
>>>
>>> On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Thank you JB.
>>>>
>>>> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <j...@nanthrax.net
>>>> > wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> This vote passed with only +1.
>>>>>
>>>>> I'm promoting the artifacts to central and update Jira.
>>>>>
>>>>
>>>> Are you also pushing the PyPI artifacts? Do you want me to do it?
>>>>
>>>
> Let me know if you want me to push to PyPI.
>

I noticed that the Maven artifacts are published and for parity published
the PyPI packages.


>
>
>>
>>>>
>>>>
>>>>> As I'm in vacation can a committer deal with the tag and website or
>>>>> merge ?
>>>>>
>>>>
>>>> Website PR had errors with dead-links. Re-running the test. If it fails
>>>> again, I can merge and we can fix with a follow up PR.
>>>>
>>>
>>> I noticed that pydocs contents are empty (i.e. headers are in there, but
>>> content is missing) in this PR. I am not sure what caused it. What is the
>>> recommendation here, we can do two things:
>>>
>>> a) Drop this PR, and re-create whole documentation again
>>> b) Merge this PR, and create another PR to fix the pydocs.
>>>
>>
>> This is done. I updated the pydocs while merging it. Pydoc generation
>> worked fine for me, I do not know what was the issue.
>>
>>
>>>
>>>
>>>>
>>>> I can do the tagging.
>>>>
>>>
> This is also done.
>
>
>>
>>>>
>>>>>
>>>>> Sorry for this very short e-mail. Thanks all for your vote.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>>
>>>>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <
>>>>> j...@nanthrax.net> wrote:
>>>>> >Hi
>>>>> >
>>>>> >I'm in vacation so I'm looking for a decent Internet connection to
>>>>> >finalize the release.
>>>>> >
>>>>> >I keep you posted.
>>>>> >
>>>>> >Regards
>>>>> >JB
>>>>> >
>>>>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>>>>> ><kirpic...@google.com.INVALID> wrote:
>>>>> >>Hi JB,
>>>>> >>
>>>>> >>Any updates on finalizing the release?
>>>>> >>
>>>>> >>Thanks.
>>>>> >>
>>>>> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <
>>>>> aljos...@apache.org>
>>>>> >>wrote:
>>>>> >>
>>>>> >>> (Belated) +1
>>>>> >>>
>>>>> >>>  * verified signatures
>>>>> >>>  * verified that Quickstart works with Flink Runner
>>>>> >>>
>>>>> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>>>>> >><rober...@google.com.INVALID>
>>>>> >>> wrote:
>>>>> >>> >
>>>>> >>> > +1 binding
>>>>> >>> >
>>>>> >>> > (I've been on vacation as well.)
>>>>> >>> >
>>>>> >>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
>>>>> >><lc...@google.com.invalid>
>>>>> >>> wrote:
>>>>> >>> >> Back from vacation.
>>>>> >>> >>
>>>>> >>> >> +1 binding
>>>>> >>> >>
>>>>> >>> >> BEAM-2671 has been marked for 2.2.0 release.
>>>>> >>> >>
>>>>> >>> >

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
On Tue, Aug 22, 2017 at 2:07 PM, Ahmet Altay <al...@google.com> wrote:

>
>
> On Tue, Aug 22, 2017 at 12:08 PM, Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you JB.
>>>
>>> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> This vote passed with only +1.
>>>>
>>>> I'm promoting the artifacts to central and update Jira.
>>>>
>>>
>>> Are you also pushing the PyPI artifacts? Do you want me to do it?
>>>
>>
Let me know if you want me to push to PyPI.


>
>>>
>>>
>>>> As I'm in vacation can a committer deal with the tag and website or
>>>> merge ?
>>>>
>>>
>>> Website PR had errors with dead-links. Re-running the test. If it fails
>>> again, I can merge and we can fix with a follow up PR.
>>>
>>
>> I noticed that pydocs contents are empty (i.e. headers are in there, but
>> content is missing) in this PR. I am not sure what caused it. What is the
>> recommendation here, we can do two things:
>>
>> a) Drop this PR, and re-create whole documentation again
>> b) Merge this PR, and create another PR to fix the pydocs.
>>
>
> This is done. I updated the pydocs while merging it. Pydoc generation
> worked fine for me, I do not know what was the issue.
>
>
>>
>>
>>>
>>> I can do the tagging.
>>>
>>
This is also done.


>
>>>
>>>>
>>>> Sorry for this very short e-mail. Thanks all for your vote.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>>
>>>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <
>>>> j...@nanthrax.net> wrote:
>>>> >Hi
>>>> >
>>>> >I'm in vacation so I'm looking for a decent Internet connection to
>>>> >finalize the release.
>>>> >
>>>> >I keep you posted.
>>>> >
>>>> >Regards
>>>> >JB
>>>> >
>>>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>>>> ><kirpic...@google.com.INVALID> wrote:
>>>> >>Hi JB,
>>>> >>
>>>> >>Any updates on finalizing the release?
>>>> >>
>>>> >>Thanks.
>>>> >>
>>>> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <aljos...@apache.org
>>>> >
>>>> >>wrote:
>>>> >>
>>>> >>> (Belated) +1
>>>> >>>
>>>> >>>  * verified signatures
>>>> >>>  * verified that Quickstart works with Flink Runner
>>>> >>>
>>>> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>>>> >><rober...@google.com.INVALID>
>>>> >>> wrote:
>>>> >>> >
>>>> >>> > +1 binding
>>>> >>> >
>>>> >>> > (I've been on vacation as well.)
>>>> >>> >
>>>> >>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
>>>> >><lc...@google.com.invalid>
>>>> >>> wrote:
>>>> >>> >> Back from vacation.
>>>> >>> >>
>>>> >>> >> +1 binding
>>>> >>> >>
>>>> >>> >> BEAM-2671 has been marked for 2.2.0 release.
>>>> >>> >>
>>>> >>> >>
>>>> >>> >>
>>>> >>> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant
>>>> >><kobi.sal...@gmail.com>
>>>> >>> wrote:
>>>> >>> >>
>>>> >>> >>> Hi,
>>>> >>> >>>
>>>> >>> >>> Spark runner was tested with word count example and a more
>>>> >>complex
>>>> >>> session
>>>> >>> >>> based application on a yarn cluster.
>>>> >>> >>> Both application run successfully so we can say that spark
>>>> >runner
>>>> >>> passed
>>>> >>> >>> the sanity tests needed.
>>>> >

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
On Tue, Aug 22, 2017 at 12:08 PM, Ahmet Altay <al...@google.com> wrote:

>
>
> On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com> wrote:
>
>> Thank you JB.
>>
>> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi
>>>
>>> This vote passed with only +1.
>>>
>>> I'm promoting the artifacts to central and update Jira.
>>>
>>
>> Are you also pushing the PyPI artifacts? Do you want me to do it?
>>
>>
>>
>>> As I'm in vacation can a committer deal with the tag and website or
>>> merge ?
>>>
>>
>> Website PR had errors with dead-links. Re-running the test. If it fails
>> again, I can merge and we can fix with a follow up PR.
>>
>
> I noticed that pydocs contents are empty (i.e. headers are in there, but
> content is missing) in this PR. I am not sure what caused it. What is the
> recommendation here, we can do two things:
>
> a) Drop this PR, and re-create whole documentation again
> b) Merge this PR, and create another PR to fix the pydocs.
>

This is done. I updated the pydocs while merging it. Pydoc generation
worked fine for me, I do not know what was the issue.


>
>
>>
>> I can do the tagging.
>>
>>
>>>
>>> Sorry for this very short e-mail. Thanks all for your vote.
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <
>>> j...@nanthrax.net> wrote:
>>> >Hi
>>> >
>>> >I'm in vacation so I'm looking for a decent Internet connection to
>>> >finalize the release.
>>> >
>>> >I keep you posted.
>>> >
>>> >Regards
>>> >JB
>>> >
>>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>>> ><kirpic...@google.com.INVALID> wrote:
>>> >>Hi JB,
>>> >>
>>> >>Any updates on finalizing the release?
>>> >>
>>> >>Thanks.
>>> >>
>>> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <aljos...@apache.org>
>>> >>wrote:
>>> >>
>>> >>> (Belated) +1
>>> >>>
>>> >>>  * verified signatures
>>> >>>  * verified that Quickstart works with Flink Runner
>>> >>>
>>> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>>> >><rober...@google.com.INVALID>
>>> >>> wrote:
>>> >>> >
>>> >>> > +1 binding
>>> >>> >
>>> >>> > (I've been on vacation as well.)
>>> >>> >
>>> >>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
>>> >><lc...@google.com.invalid>
>>> >>> wrote:
>>> >>> >> Back from vacation.
>>> >>> >>
>>> >>> >> +1 binding
>>> >>> >>
>>> >>> >> BEAM-2671 has been marked for 2.2.0 release.
>>> >>> >>
>>> >>> >>
>>> >>> >>
>>> >>> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant
>>> >><kobi.sal...@gmail.com>
>>> >>> wrote:
>>> >>> >>
>>> >>> >>> Hi,
>>> >>> >>>
>>> >>> >>> Spark runner was tested with word count example and a more
>>> >>complex
>>> >>> session
>>> >>> >>> based application on a yarn cluster.
>>> >>> >>> Both application run successfully so we can say that spark
>>> >runner
>>> >>> passed
>>> >>> >>> the sanity tests needed.
>>> >>> >>>
>>> >>> >>> Still there is an open ticket
>>> >>> >>> https://issues.apache.org/jira/browse/BEAM-2671 which Stas is
>>> >>working
>>> >>> on
>>> >>> >>> and its implications should be taken into consideration
>>> >regarding
>>> >>the
>>> >>> >>> release.
>>> >>> >>>
>>> >>> >>> Regards
>>> >>> >>> Kobi
>>> >>> >>>
>>> >>> >>> 2017-08-16 5:02 G

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
On Tue, Aug 22, 2017 at 10:59 AM, Ahmet Altay <al...@google.com> wrote:

> Thank you JB.
>
> On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi
>>
>> This vote passed with only +1.
>>
>> I'm promoting the artifacts to central and update Jira.
>>
>
> Are you also pushing the PyPI artifacts? Do you want me to do it?
>
>
>
>> As I'm in vacation can a committer deal with the tag and website or merge
>> ?
>>
>
> Website PR had errors with dead-links. Re-running the test. If it fails
> again, I can merge and we can fix with a follow up PR.
>

I noticed that pydocs contents are empty (i.e. headers are in there, but
content is missing) in this PR. I am not sure what caused it. What is the
recommendation here, we can do two things:

a) Drop this PR, and re-create whole documentation again
b) Merge this PR, and create another PR to fix the pydocs.


>
> I can do the tagging.
>
>
>>
>> Sorry for this very short e-mail. Thanks all for your vote.
>>
>> Regards
>> JB
>>
>>
>> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <j...@nanthrax.net>
>> wrote:
>> >Hi
>> >
>> >I'm in vacation so I'm looking for a decent Internet connection to
>> >finalize the release.
>> >
>> >I keep you posted.
>> >
>> >Regards
>> >JB
>> >
>> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
>> ><kirpic...@google.com.INVALID> wrote:
>> >>Hi JB,
>> >>
>> >>Any updates on finalizing the release?
>> >>
>> >>Thanks.
>> >>
>> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <aljos...@apache.org>
>> >>wrote:
>> >>
>> >>> (Belated) +1
>> >>>
>> >>>  * verified signatures
>> >>>  * verified that Quickstart works with Flink Runner
>> >>>
>> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>> >><rober...@google.com.INVALID>
>> >>> wrote:
>> >>> >
>> >>> > +1 binding
>> >>> >
>> >>> > (I've been on vacation as well.)
>> >>> >
>> >>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
>> >><lc...@google.com.invalid>
>> >>> wrote:
>> >>> >> Back from vacation.
>> >>> >>
>> >>> >> +1 binding
>> >>> >>
>> >>> >> BEAM-2671 has been marked for 2.2.0 release.
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant
>> >><kobi.sal...@gmail.com>
>> >>> wrote:
>> >>> >>
>> >>> >>> Hi,
>> >>> >>>
>> >>> >>> Spark runner was tested with word count example and a more
>> >>complex
>> >>> session
>> >>> >>> based application on a yarn cluster.
>> >>> >>> Both application run successfully so we can say that spark
>> >runner
>> >>> passed
>> >>> >>> the sanity tests needed.
>> >>> >>>
>> >>> >>> Still there is an open ticket
>> >>> >>> https://issues.apache.org/jira/browse/BEAM-2671 which Stas is
>> >>working
>> >>> on
>> >>> >>> and its implications should be taken into consideration
>> >regarding
>> >>the
>> >>> >>> release.
>> >>> >>>
>> >>> >>> Regards
>> >>> >>> Kobi
>> >>> >>>
>> >>> >>> 2017-08-16 5:02 GMT+03:00 Eugene Kirpichov
>> >>> <kirpic...@google.com.invalid>:
>> >>> >>>
>> >>> >>>> Hey all,
>> >>> >>>>
>> >>> >>>> Seems like we're missing one more affirmative vote from a PMC
>> >>member
>> >>> (so
>> >>> >>>> far we have JB and Ahmet) to proceed with the release.
>> >>> >>>>
>> >>> >>>> On Mon, Aug 14, 2017 at 9:30 AM Ahmet Altay
>> >><al...@google.com.invalid
>> >>> >
>> >>> >>>> wrote:
>> >

Re: [RESULT][VOTE] Release 2.1.0, release candidate #3

2017-08-22 Thread Ahmet Altay
Thank you JB.

On Mon, Aug 21, 2017 at 11:33 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi
>
> This vote passed with only +1.
>
> I'm promoting the artifacts to central and update Jira.
>

Are you also pushing the PyPI artifacts? Do you want me to do it?



> As I'm in vacation can a committer deal with the tag and website or merge ?
>

Website PR had errors with dead-links. Re-running the test. If it fails
again, I can merge and we can fix with a follow up PR.

I can do the tagging.


>
> Sorry for this very short e-mail. Thanks all for your vote.
>
> Regards
> JB
>
>
> On Aug 18, 2017, 18:43, at 18:43, "Jean-Baptiste Onofré" <j...@nanthrax.net>
> wrote:
> >Hi
> >
> >I'm in vacation so I'm looking for a decent Internet connection to
> >finalize the release.
> >
> >I keep you posted.
> >
> >Regards
> >JB
> >
> >On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov
> ><kirpic...@google.com.INVALID> wrote:
> >>Hi JB,
> >>
> >>Any updates on finalizing the release?
> >>
> >>Thanks.
> >>
> >>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek <aljos...@apache.org>
> >>wrote:
> >>
> >>> (Belated) +1
> >>>
> >>>  * verified signatures
> >>>  * verified that Quickstart works with Flink Runner
> >>>
> >>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
> >><rober...@google.com.INVALID>
> >>> wrote:
> >>> >
> >>> > +1 binding
> >>> >
> >>> > (I've been on vacation as well.)
> >>> >
> >>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
> >><lc...@google.com.invalid>
> >>> wrote:
> >>> >> Back from vacation.
> >>> >>
> >>> >> +1 binding
> >>> >>
> >>> >> BEAM-2671 has been marked for 2.2.0 release.
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant
> >><kobi.sal...@gmail.com>
> >>> wrote:
> >>> >>
> >>> >>> Hi,
> >>> >>>
> >>> >>> Spark runner was tested with word count example and a more
> >>complex
> >>> session
> >>> >>> based application on a yarn cluster.
> >>> >>> Both application run successfully so we can say that spark
> >runner
> >>> passed
> >>> >>> the sanity tests needed.
> >>> >>>
> >>> >>> Still there is an open ticket
> >>> >>> https://issues.apache.org/jira/browse/BEAM-2671 which Stas is
> >>working
> >>> on
> >>> >>> and its implications should be taken into consideration
> >regarding
> >>the
> >>> >>> release.
> >>> >>>
> >>> >>> Regards
> >>> >>> Kobi
> >>> >>>
> >>> >>> 2017-08-16 5:02 GMT+03:00 Eugene Kirpichov
> >>> <kirpic...@google.com.invalid>:
> >>> >>>
> >>> >>>> Hey all,
> >>> >>>>
> >>> >>>> Seems like we're missing one more affirmative vote from a PMC
> >>member
> >>> (so
> >>> >>>> far we have JB and Ahmet) to proceed with the release.
> >>> >>>>
> >>> >>>> On Mon, Aug 14, 2017 at 9:30 AM Ahmet Altay
> >><al...@google.com.invalid
> >>> >
> >>> >>>> wrote:
> >>> >>>>
> >>> >>>>> On Mon, Aug 14, 2017 at 6:32 AM, Ismaël Mejía
> >><ieme...@gmail.com>
> >>> >>> wrote:
> >>> >>>>>
> >>> >>>>>> +1 (non-binding)
> >>> >>>>>>
> >>> >>>>>> - Validated signatures OK
> >>> >>>>>> - mvn clean verify -Prelease on both OpenJDK 1.7 and Oracle
> >>JDK 8
> >>> >>> with
> >>> >>>>>> the docker development images (WIP), both OK
> >>> >>>>>> - Run WordCount on local Flink and Spark runners OK
> >>> >>>>>>
> >>> >>>>>> Everything looks nice, only one minor thing (not block

Re: Policy for stale PRs

2017-08-18 Thread Ahmet Altay
To summarize the stale PR issue, do we agree on the following statement:

A PR becomes stale after its author fails to respond to actionable comments
for 60 days. The community will close stale PRs. The author is welcome to
reopen the same PR in the future. The associated JIRAs will be
unassigned from the author but will stay open.
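
The rule above can be sketched as a small check. This is only an illustration
of the proposed wording, not an existing tool: the function name, its
parameters, and the choice to count exactly 60 days as stale are assumptions.

```python
from datetime import date, timedelta

# Threshold from the proposed statement: 60 days without an author response.
STALE_AFTER = timedelta(days=60)


def is_stale(last_author_response, today, has_actionable_comments):
    """Return True if the PR counts as stale under the proposed rule.

    A PR with no actionable comments pending is never stale, because the
    author has nothing to respond to.
    """
    if not has_actionable_comments:
        return False
    return today - last_author_response >= STALE_AFTER


if __name__ == "__main__":
    print(is_stale(date(2017, 6, 1), date(2017, 8, 18), True))
    print(is_stale(date(2017, 8, 1), date(2017, 8, 18), True))
```

Note that the clock is tied to the author's last response, so reviewer pings
alone do not reset it.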

On Wed, Aug 16, 2017 at 3:25 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. JIRAs should still stay open but should become unassigned
>
> The above would need admin privilege, right ?
> Is there automated way to do it ?
>
> bq. Prevent contributors/committers from taking more than 'n' JIRAs at the
> same time
>
> It would be hard to determine the N above since the amount of coding /
> testing varies greatly across JIRAs.
>

I agree with Ismaël that there is an issue here. We currently have 969 open
JIRAs: 427 of them are unassigned, and the remaining 542 are assigned to 87
people. The average of about 6 issues per assignee is not that high. I think
the problem is that some of us (mainly component leads, including myself) have
too many issues assigned. The top 5 assignees hold 218 issues between them. I
believe these issues are automatically assigned for triage purposes. We
probably do not need to codify an exact set of rules; we could ask
component leads to regularly triage their components, including unassigning
issues.
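
The arithmetic behind these figures can be worked through as follows. The
inputs are the numbers quoted above; the variable names and the derived
"everyone except the top 5" average are added here for illustration.

```python
# Figures quoted in the message above.
open_issues = 969
unassigned = 427
assignees = 87
top5_issues = 218

assigned = open_issues - unassigned                # issues that have an owner
mean_all = assigned / float(assignees)             # average over all assignees
mean_rest = (assigned - top5_issues) / float(assignees - 5)  # without the top 5
top5_share = top5_issues / float(assigned)         # fraction held by the top 5

if __name__ == "__main__":
    print("assigned issues: %d" % assigned)
    print("mean per assignee: %.2f" % mean_all)
    print("mean excluding top 5: %.2f" % mean_rest)
    print("share held by top 5: %.0f%%" % (100 * top5_share))
```

Excluding the five largest assignees, the average drops to roughly 4 issues
per person, which supports the point that the backlog is concentrated in a few
triage owners rather than spread across contributors.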


>
>
>
> On Wed, Aug 16, 2017 at 3:20 PM, Ismaël Mejía <ieme...@gmail.com> wrote:
>
> > Thanks Ahmet for bringing this subject.
> >
> > +1 to close the stale PRs automatically after a fixed time of inactivity.
> > 90
> > days is ok, but maybe a shorter period is better. If we consider that
> being
> > stale is just not having any activity i.e., the author of the PR does not
> > answer
> > any message. The author can buy extra time just by adding a message to
> say,
> > 'wait I am still working on this', and win a complete period of time, so
> > the
> > longer the staleness period is the longer it can eventually be extended.
> >
> > I agree with Thomas the JIRAs should still stay open but should become
> > unassigned because the issue won't be yet fixed but we want to encourage
> > people
> > to work on it.
> >
> > Other additional subject that makes sense to discuss here is if we need
> > policies
> > to avoid 'stale' JIRAs (JIRAs that have been taken but that don't have
> > progress)?, for example:
> >
> > - Prevent contributors/committers from taking more than 'n' JIRAs at the
> > same
> >   time (we should define this n considering the period of staleness,
> maybe
> > 10?).
> >
> > - Automatically free 'stale' JIRAs after a fixed time period with no
> > active work
> >
> > Remember the objective is to encourage more people to contribute but
> people
> > won't be encouraged to contribute on subjects that other people have
> > taken, this
> > is a well known anti-pattern in volunteer communities, see
> > http://communitymgt.wikia.com/wiki/Cookie_Licking
> >
> > On Wed, Aug 16, 2017 at 10:38 PM, Thomas Groh <tg...@google.com.invalid>
> > wrote:
> > > JIRAs should only be closed if the issue that they track is no longer
> > > relevant (either via being fixed or being determined to not be a
> > problem).
> > > If a JIRA isn't being meaningfully worked on, it should be unassigned
> (in
> > > all cases, not just if there's an associated pull request that has not
> > been
> > > worked on).
> > >
> > > +1 on closing PRs with no action from the original author after some
> > > reasonable time frame (90 days is certainly reasonable; 30 might be too
> > > short) if the author has not responded to actionable feedback.
> > >
> > > On Wed, Aug 16, 2017 at 12:07 PM, Sourabh Bajaj <
> > > sourabhba...@google.com.invalid> wrote:
> > >
> > >> Some projects I have seen close stale PRs after 30 days, saying
> "Closing
> > >> due to lack of activity, please feel free to re-open".
> > >>
> > >> On Wed, Aug 16, 2017 at 12:05 PM Ahmet Altay <al...@google.com.invalid
> >
> > >> wrote:
> > >>
> > >> > Sounds like we have consensus. Since this is a new policy, I would
> > >> suggest
> > >> > picking the most flexible option for now (90 days) and we can
> tighten
> > it
> > >> in
> > >> > the future. To answer Kenn's question, I do not know how other
> > projects
> > >> > handle this. I did a basic search but could not find a good answer.
> > >> >
> > 

Re: Policy for stale PRs

2017-08-16 Thread Ahmet Altay
Sounds like we have consensus. Since this is a new policy, I would suggest
picking the most flexible option for now (90 days) and we can tighten it in
the future. To answer Kenn's question, I do not know how other projects
handle this. I did a basic search but could not find a good answer.

What mechanism can we use to close PRs, assuming that the author will be out of
communication? We can push a commit with a "This closes #xyz #abc" message.
Is there another way to do this?
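
For finding candidates in the first place, here is a minimal sketch, assuming
each PR is available as a record with the date of the author's last activity.
The record keys (number, last_author_activity) and the sample PR numbers are
hypothetical, and the 90-day threshold is the one proposed in this thread. It
only identifies stale PRs; it does not close anything.

```python
from datetime import datetime, timedelta

# Threshold suggested in this thread; could be tightened later.
STALE_AFTER = timedelta(days=90)


def find_stale_prs(prs, now, stale_after=STALE_AFTER):
    """Return the numbers of PRs whose author has been inactive too long.

    Each PR is a dict with the hypothetical keys 'number' and
    'last_author_activity' (a datetime). Reviewer pings are assumed
    to be excluded from that timestamp already.
    """
    return [
        pr["number"]
        for pr in prs
        if now - pr["last_author_activity"] > stale_after
    ]


# Made-up sample data, not real Beam PRs.
sample_prs = [
    {"number": 101, "last_author_activity": datetime(2017, 3, 1)},
    {"number": 102, "last_author_activity": datetime(2017, 8, 1)},
]
stale = find_stale_prs(sample_prs, now=datetime(2017, 8, 16))
print(stale)
```

Keeping "now" as a parameter makes the check easy to test and to re-run with a
different cutoff if we tighten the policy.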

Ahmet

On Wed, Aug 16, 2017 at 4:32 AM, Aviem Zur <aviem...@gmail.com> wrote:

> Makes sense to close after a long time of inactivity and no response, and
> as Kenn mentioned they can always re-open.
>
> On Wed, Aug 16, 2017 at 12:20 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > If we consider the author, it makes sense.
> >
> > Regards
> > JB
> >
> > On Aug 15, 2017, 01:29, at 01:29, Ted Yu <yuzhih...@gmail.com> wrote:
> > >The proposal makes sense.
> > >
> > >If the author of PR doesn't respond for 90 days, the PR is likely out
> > >of
> > >sync with current repo.
> > >
> > >Cheers
> > >
> > >On Mon, Aug 14, 2017 at 5:27 PM, Ahmet Altay <al...@google.com.invalid>
> > >wrote:
> > >
> > >> Hi all,
> > >>
> > >> Do we have an existing policy for handling stale PRs? If not, could we
> > >come
> > >> up with one? We are getting close to 100 open PRs. Some of the open
> > >PRs
> > >> have not been touched for a while, and if we exclude the pings the
> > >number
> > >> will be higher.
> > >>
> > >> For example, we could close PRs that have not been updated by the
> > >original
> > >> author for 90 days even after multiple attempts to reach them (e.g.
> > >[1],
> > >> [2] are such PRs.)
> > >>
> > >> What do you think?
> > >>
> > >> Thank you,
> > >> Ahmet
> > >>
> > >> [1] https://github.com/apache/beam/pull/1464
> > >> [2] https://github.com/apache/beam/pull/2949
> > >>
> >
>


Re: Hello from a newbie to the data world living in the city by the bay!

2017-08-15 Thread Ahmet Altay
Welcome both of you!

Some helpful starting points:
- Contribution guide [1]
- Unassigned starter issues in JIRA [2]

Ahmet

[1] https://beam.apache.org/contribute/contribution-guide/
[2]
https://issues.apache.org/jira/browse/BEAM-2632?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20labels%20%3D%20starter%20AND%20assignee%20in%20(EMPTY)%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC

On Tue, Aug 15, 2017 at 11:13 AM, Umang Sharma  wrote:

> Hi Gris,
> Nice to meet you.
>
> I'd like to take this opportunity to introduce myself to you and everyone else
> on the dev team.
>
> I'm Umang Sharma. I'm an associate in Data Science and Applications at
> Accenture Digital.
>
>
> I write in Python, Java and a number of other languages.
> I'd love to contribute to Beam. It'd be great if someone could guide me to get
> started with contributing :)
>
> Among the other things I like are polo golf, giving talks and talking about
> my work.
>
> Thanks,
> Umang
>
>
> On Aug 15, 2017 22:40, "Griselda Cuevas"  wrote:
>
> Hi Beam community,
>
> I’m Griselda (Gris) Cuevas and I’m very excited to join the community, I’m
> looking forward to learning awesome things from you and to getting the
> chance to collaborate on great initiatives.
>
> I’m currently working at Google and I’m studying a masters in operations
> research and data science at UC Berkeley. I’m interested in Natural
> Language Processing, Information Retrieval and Online Communities. Some
> other fun topics I love are juggling, camping and -just getting into it-
>  listening to podcasts, so if you ever want to discuss and talk about any
> of these topics, here I am!
>
> Another reason why I’m here is because I want to help this project grow and
> thrive. This means that you’ll see me contributing to the project, reaching
> out to ask questions as I get familiar with our community, and also
> helping evangelize Apache Beam by organizing meetups, hangouts, etc.
>
> I say bye for now, I’ll see you around,
>
> Cheers,
>
> G
>


Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-14 Thread Ahmet Altay
On Mon, Aug 14, 2017 at 6:32 AM, Ismaël Mejía  wrote:

> +1 (non-binding)
>
> - Validated signatures OK
> - mvn clean verify -Prelease on both OpenJDK 1.7 and Oracle JDK 8 with
> the docker development images (WIP), both OK
> - Run WordCount on local Flink and Spark runners OK
>
> Everything looks nice, only one minor thing (not blocking at all). The
> proto generated files for python are not cleaned correctly and this
> causes the validation to complain because the maven rat plugin does
> not find the apache headers on the files  (this happens if you execute
> mvn clean verify -Prelease immediately after the validation).
>

Ismaël, could you create a JIRA issue for this (to be fixed in a future
release)?


>
> On Sun, Aug 13, 2017 at 6:52 AM, Jean-Baptiste Onofré 
> wrote:
> > +1 (binding)
> >
> > I did my own tests and am casting my own vote ;)
> >
> > Regards
> > JB
> >
> > On 08/09/2017 07:08 AM, Jean-Baptiste Onofré wrote:
> >>
> >> Hi everyone,
> >>
> >> Please review and vote on the release candidate #3 for the version
> 2.1.0,
> >> as follows:
> >>
> >> [ ] +1, Approve the release
> >> [ ] -1, Do not approve the release (please provide specific comments)
> >>
> >>
> >> The complete staging area is available for your review, which includes:
> >> * JIRA release notes [1],
> >> * the official Apache source release to be deployed to dist.apache.org
> >> [2], which is signed with the key with fingerprint C8282E76 [3],
> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> * source code tag "v2.1.0-RC3" [5],
> >> * website pull request listing the release and publishing the API
> >> reference manual [6].
> >> * Python artifacts are deployed along with the source release to the
> >> dist.apache.org [2].
> >>
> >> The vote will be open for at least 72 hours. It is adopted by majority
> >> approval, with at least 3 PMC affirmative votes.
> >>
> >> Thanks,
> >> JB
> >>
> >> [1]
> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12319527=12340528
> >> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
> >> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> >> [4] https://repository.apache.org/content/repositories/
> orgapachebeam-1020/
> >> [5] https://github.com/apache/beam/tree/v2.1.0-RC3
> >> [6] https://github.com/apache/beam-site/pull/270
> >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>


Re: [ANNOUNCEMENT] New committers, August 2017 edition!

2017-08-11 Thread Ahmet Altay
Congratulations to all of you. Well deserved and thank you for your
contributions.

On Fri, Aug 11, 2017 at 10:43 AM, tarush grover 
wrote:

> Congratulations!!
>
> Regards,
> Tarush
>
> On Fri, 11 Aug 2017 at 11:11 PM, Davor Bonaci  wrote:
>
> > Please join me and the rest of Beam PMC in welcoming the following
> > contributors as our newest committers. They have significantly
> contributed
> > to the project in different ways, and we look forward to many more
> > contributions in the future.
> >
> > * Reuven Lax
> > Reuven has been with the project since the very beginning, contributing
> > mostly to the core SDK and the GCP IO connectors. He accumulated 52
> commits
> > (19,824 ++ / 12,039 --). Most recently, Reuven re-wrote several IO
> > connectors that significantly expanded their functionality. Additionally,
> > Reuven authored important new design documents relating to update and
> > snapshot functionality.
> >
> > * Jingsong Lee
> > Jingsong has been contributing to Apache Beam since the beginning of the
> > year, particularly to the Flink runner. He has accumulated 34 commits
> > (11,214 ++ / 6,314 --) of deep, fundamental changes that significantly
> > improved the quality of the runner. Additionally, Jingsong has
> contributed
> > to the project in other ways too -- reviewing contributions, and
> > participating in discussions on the mailing list, design documents, and
> > JIRA issue tracker.
> >
> > * Mingmin Xu
> > Mingmin started the SQL DSL effort, and has driven it to the point of
> > merging to the master branch. In this effort, he extended the project to
> > a significant new user community.
> >
> > * Mingming (James) Xu
> > James joined the SQL DSL effort, contributing some of the trickier parts,
> > such as the Join functionality. Additionally, he's consistently shown
> > himself to be an insightful code reviewer, significantly impacting the
> > project’s code quality and ensuring the success of the new major
> component.
> >
> > * Manu Zhang
> > Manu initiated and developed a runner for the Apache Gearpump
> (incubating)
> > engine, and has driven it to the point of merging to the master branch.
> In
> > this effort, he accumulated 65 commits (7,812 ++ / 4,882 --) and extended
> > the project to a new user community.
> >
> > Congratulations to all five! Welcome!
> >
> > Davor
> >
>


Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-09 Thread Ahmet Altay
+1, Thank you JB!

- I verified the hashes for apache-beam-2.1.0-python.zip,
apache-beam-2.1.0-source-release.zip files
- Unzipped apache-beam-2.1.0-source-release.zip and ran python packaging
and unittests using tox
- Ran python wordcount and mobile gaming examples with DirectRunner and
DataflowRunner on Linux.
- I could NOT verify the documentation; the staging link in PR270 is no longer
working. The PR likely needs an update.
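
For anyone repeating the hash check, here is a minimal sketch, assuming the
published digest is a plain hex string. The helper names (file_digest, verify)
and the default algorithm are my assumptions, not part of the release tooling;
use whichever algorithm the checksum file next to the artifact was made with.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches the published one.

    Surrounding whitespace and letter case in expected_hex are ignored.
    """
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Calling verify("apache-beam-2.1.0-source-release.zip", published_digest) would
then return True or False, where published_digest is the string copied from
the checksum file.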

Ahmet

On Tue, Aug 8, 2017 at 10:08 PM, Jean-Baptiste Onofré 
wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 2.1.0,
> as follows:
>
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which is signed with the key with fingerprint C8282E76 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.1.0-RC3" [5],
> * website pull request listing the release and publishing the API
> reference manual [6].
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> JB
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
> ctId=12319527=12340528
> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1020/
> [5] https://github.com/apache/beam/tree/v2.1.0-RC3
> [6] https://github.com/apache/beam-site/pull/270
>


Re: [INFO] Build fails on master

2017-08-08 Thread Ahmet Altay
Hey, are you referring to a specific Jenkins build? If not, could you share
the actual error?

Last python post commit test was successful (
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/2887/),
although the one before that failed. The failure looks like a flake to me
and I opened (https://issues.apache.org/jira/browse/BEAM-2755) to fix that.

Ahmet

On Mon, Aug 7, 2017 at 11:44 PM, Jean-Baptiste Onofré 
wrote:

> Hi guys,
>
> it seems we have an issue building master on the Python SDK:
>
> Test failed: 
> error: Test failed:  failures=0>
> ERROR: InvocationError: '/home/jbonofre/Workspace/beam/sdks/python/target/.tox/py27gcp/bin/python setup.py test'
> ___ summary ___
>   docs: commands succeeded
>   lint: commands succeeded
>   py27: commands succeeded
>   py27cython: commands succeeded
> ERROR:   py27gcp: commands failed
> [ERROR] Command execution failed.
>
> I will check the recent changes (but RC3 first ;)).
>
> Regards
> JB
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [DISCUSS] Beam pipeline logical and physical DAGs visualization.

2017-08-03 Thread Ahmet Altay
+1, this looks great and it will be very useful for users to understand
their pipelines.
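
To make the dotfile idea quoted below concrete, here is a minimal sketch,
assuming the DAG is available as a list of (producer, consumer) transform
names. The function name to_dot and the sample transform names are
hypothetical; this is not the code from Pei's commit.

```python
def to_dot(name, edges):
    """Render (producer, consumer) name pairs as Graphviz DOT text."""
    lines = ["digraph %s {" % name]
    for src, dst in edges:
        # Quote node names so parentheses and spaces are valid in DOT.
        lines.append('  "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)


# Made-up logical graph for illustration.
dot = to_dot("pipeline", [
    ("Read", "ParDo(Extract)"),
    ("ParDo(Extract)", "GroupByKey"),
])
print(dot)
```

The resulting text can be rendered by any Graphviz-compatible viewer, and the
same helper would work for a physical DAG if a runner exposes its stages as
edges.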

On Thu, Aug 3, 2017 at 8:25 PM, Pei HE  wrote:

> Hi all,
> While working on JStorm and MapReduce runners, I found that it is very
> helpful to understand Beam pipelines by visualizing them.
>
> Logical graph:
> https://drive.google.com/file/d/0B6iZ7iRh-LOYc0dUS0Rwb2tvWGM/view?usp=
> sharing
>
> Physical graph:
> https://drive.google.com/file/d/0B6iZ7iRh-LOYbDFWeDlCcDhnQmc/view?usp=
> sharing
>
> I think we can visualize Beam logical DAG in runner-core. It should also be
> easy to visualize the physical DAG in each runner. (Maybe we can define
> some shared data structures to make it more automatic, and even support
> visualizing them in Apex/Flink/Spark/Gearpump UIs).
>
> I have a commit for MapReduce runner in here (<200 lines). And, this commit
> generates dotfiles for logical and physical DAGs.
>
> https://github.com/peihe/incubator-beam/commit/
> bb3349e10c0cfacd81b610880ddfec030fedf34d
>
> Looking forward to ideas and feedback.
> --
> Pei
>


Re: [VOTE] Release 2.1.0, release candidate #2

2017-07-19 Thread Ahmet Altay
Yes, +1 on RC2.

On Wed, Jul 19, 2017 at 5:10 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Aviem,
>
> as mentioned in the first e-mail:
>
> - Distributions are available here:
> https://dist.apache.org/repos/dist/dev/beam/2.1.0/
>
> - Artifacts are on the staging repository:
> https://repository.apache.org/content/repositories/orgapachebeam-1019/
>
> Regards
> JB
>
>
> On 07/19/2017 12:26 PM, Aviem Zur wrote:
>
>> Have the jars for RC2 been uploaded somewhere?
>>
>> On Wed, Jul 19, 2017 at 10:19 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>> So, I guess you are voting +1 on RC2, correct (just for the tracking) ?
>>>
>>> Thanks,
>>> Regards
>>> JB
>>>
>>> On 07/19/2017 08:00 AM, Ahmet Altay wrote:
>>>
>>>> Thank you JB.
>>>>
>>>> I validated python wordcount and mobile gaming examples on Linux. Found
>>>>
>>> one
>>>
>>>> issue (https://issues.apache.org/jira/browse/BEAM-2636). This does not
>>>>
>>> need
>>>
>>>> to be a blocking issue for RC2, but if we end up having a RC3 we should
>>>> consider fixing this issue.
>>>>
>>>> Ahmet
>>>>
>>>> On Tue, Jul 18, 2017 at 4:18 PM, Mingmin Xu <mingm...@gmail.com> wrote:
>>>>
>>>> Thanks Kenn. SQL DSL should be ready in the next version 2.2.0, and
>>>>>
>>>> agree
>>>
>>>> to have an overall row "Add SQL DSL" instead of listing all the detailed
>>>>> tasks.
>>>>>
>>>>> On Tue, Jul 18, 2017 at 3:54 PM, Kenneth Knowles
>>>>> <k...@google.com.invalid
>>>>>
>>>>
>>>> wrote:
>>>>>
>>>>> Done.
>>>>>>
>>>>>> Since it is all on a feature branch and the release notes when it goes
>>>>>>
>>>>> to
>>>
>>>> master will include "Add SQL DSL" I did not associate the little bits
>>>>>>
>>>>> with
>>>>>
>>>>>> a release.
>>>>>>
>>>>>> On Tue, Jul 18, 2017 at 2:51 PM, Mingmin Xu <mingm...@gmail.com>
>>>>>>
>>>>> wrote:
>>>
>>>>
>>>>>> The tasks of SQL should not be labeled as 2.1.0, I've updated some
>>>>>>>
>>>>>> with
>>>
>>>> 2.2.0, fail to change the 'closed' ones. Can anyone with the
>>>>>>>
>>>>>> permission
>>>
>>>> update these tasks
>>>>>>> https://issues.apache.org/jira/browse/BEAM-2171?jql=
>>>>>>> project%20%3D%20BEAM%20AND%20fixVersion%20%3D%202.1.0%
>>>>>>> 20AND%20component%20%3D%20dsl-sql?
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Mingmin
>>>>>>>
>>>>>>> On Tue, Jul 18, 2017 at 2:23 PM, Jean-Baptiste Onofré <
>>>>>>>
>>>>>> j...@nanthrax.net
>>>
>>>>
>>>>>> wrote:
>>>>>>>
>>>>>>> Yeah, indeed, the issue like BEAM-2171 should not have "Fix Version"
>>>>>>>>
>>>>>>> set
>>>>>>
>>>>>>> to 2.1.0.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>> On 07/18/2017 06:52 PM, James wrote:
>>>>>>>>
>>>>>>>> Just noticed that some of the DSL_SQL issues are included in this
>>>>>>>>>
>>>>>>>> release?
>>>>>>>
>>>>>>>> e.g. The first one: BEAM-2171, this is not expected,right?
>>>>>>>>> On Wed, 19 Jul 2017 at 12:30 AM Jean-Baptiste Onofré <
>>>>>>>>>
>>>>>>>> j...@nanthrax.net
>>>>>
>>>>>>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please review and vote on the release candidat

Re: [VOTE] Release 2.1.0, release candidate #2

2017-07-19 Thread Ahmet Altay
Thank you JB.

I validated python wordcount and mobile gaming examples on Linux. Found one
issue (https://issues.apache.org/jira/browse/BEAM-2636). This does not need
to be a blocking issue for RC2, but if we end up having a RC3 we should
consider fixing this issue.

Ahmet

On Tue, Jul 18, 2017 at 4:18 PM, Mingmin Xu  wrote:

> Thanks Kenn. SQL DSL should be ready in the next version 2.2.0, and agree
> to have an overall row "Add SQL DSL" instead of listing all the detailed
> tasks.
>
> On Tue, Jul 18, 2017 at 3:54 PM, Kenneth Knowles 
> wrote:
>
> > Done.
> >
> > Since it is all on a feature branch and the release notes when it goes to
> > master will include "Add SQL DSL" I did not associate the little bits
> with
> > a release.
> >
> > On Tue, Jul 18, 2017 at 2:51 PM, Mingmin Xu  wrote:
> >
> > > The SQL tasks should not be labeled as 2.1.0. I've updated some to
> > > 2.2.0, but failed to change the 'closed' ones. Can anyone with permission
> > > update these tasks
> > > https://issues.apache.org/jira/browse/BEAM-2171?jql=
> > > project%20%3D%20BEAM%20AND%20fixVersion%20%3D%202.1.0%
> > > 20AND%20component%20%3D%20dsl-sql?
> > >
> > >
> > > Thanks!
> > > Mingmin
> > >
> > > On Tue, Jul 18, 2017 at 2:23 PM, Jean-Baptiste Onofré  >
> > > wrote:
> > >
> > > > Yeah, indeed, the issue like BEAM-2171 should not have "Fix Version"
> > set
> > > > to 2.1.0.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 07/18/2017 06:52 PM, James wrote:
> > > >
> > > >> Just noticed that some of the DSL_SQL issues are included in this
> > > release?
> > > >> e.g. The first one: BEAM-2171, this is not expected, right?
> > > >> On Wed, 19 Jul 2017 at 12:30 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> > >
> > > >> wrote:
> > > >>
> > > >> Hi everyone,
> > > >>>
> > > >>> Please review and vote on the release candidate #2 for the version
> > > 2.1.0,
> > > >>> as
> > > >>> follows:
> > > >>>
> > > >>> [ ] +1, Approve the release
> > > >>> [ ] -1, Do not approve the release (please provide specific
> comments)
> > > >>>
> > > >>>
> > > >>> The complete staging area is available for your review, which
> > includes:
> > > >>> * JIRA release notes [1],
> > > >>> * the official Apache source release to be deployed to
> > dist.apache.org
> > > >>> [2],
> > > >>> which is signed with the key with fingerprint C8282E76 [3],
> > > >>> * all artifacts to be deployed to the Maven Central Repository [4],
> > > >>> * source code tag "v2.1.0-RC2" [5],
> > > >>> * website pull request listing the release and publishing the API
> > > >>> reference
> > > >>> manual [6].
> > > >>> * Python artifacts are deployed along with the source release to
> the
> > > >>> dist.apache.org [2].
> > > >>>
> > > >>> The vote will be open for at least 72 hours. It is adopted by
> > majority
> > > >>> approval,
> > > >>> with at least 3 PMC affirmative votes.
> > > >>>
> > > >>> Thanks,
> > > >>> JB
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
> > > >>> ctId=12319527=12340528
> > > >>> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
> > > >>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> > > >>> [4] https://repository.apache.org/content/repositories/orgapache
> > > >>> beam-1019/
> > > >>> [5] https://github.com/apache/beam/tree/v2.1.0-RC2
> > > >>> [6] https://github.com/apache/beam-site/pull/270
> > > >>>
> > > >>>
> > > >>
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > >
> > >
> > >
> > > --
> > > 
> > > Mingmin
> > >
> >
>
>
>
> --
> 
> Mingmin
>


Re: [CANCEL][VOTE] Release 2.1.0, release candidate #1

2017-07-17 Thread Ahmet Altay
On Mon, Jul 17, 2017 at 5:08 AM, Jean-Baptiste Onofré 
wrote:

> Hi all,
>
> as discussed, I cancel the vote on RC1 in order to prepare a RC2.
>
> The RC2 should include the following fixes:
>
> - BEAM-2595 is already fixed and cherry-picked on release-2.1.0 branch
> - BEAM-2271 is still pending. In RC1, I did the python zip cleanup
> manually. I think it's good enough for RC2. @Ahmet: is it fine for you ?
>
Yes. Thank you!


>
> I will prepare a RC2 later today.
>
> Regards
> JB
>
>
> On 07/11/2017 03:02 PM, Jean-Baptiste Onofré wrote:
>
>> Hi everyone,
>>
>> Please review and vote on the release candidate #1 for the version 2.1.0,
>> as follows:
>>
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>>
>>
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release to be deployed to dist.apache.org
>> [2], which is signed with the key with fingerprint C8282E76 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "v2.1.0-RC1" [5],
>> * website pull request listing the release and publishing the API
>> reference manual [6].
>> * Python artifacts are deployed along with the source release to the
>> dist.apache.org [2].
>>
>> The vote will be open for at least 72 hours. It is adopted by majority
>> approval, with at least 3 PMC affirmative votes.
>>
>> Thanks,
>> JB
>>
>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
>> ctId=12319527=12340528
>> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>> [4] https://repository.apache.org/content/repositories/orgapache
>> beam-1018/
>> [5] https://github.com/apache/beam/tree/v2.1.0-RC1
>> [6] https://github.com/apache/beam-site/pull/270
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: [VOTE] Release 2.1.0, release candidate #1

2017-07-15 Thread Ahmet Altay
Thank you JB, Sourabh.

On Fri, Jul 14, 2017 at 11:42 PM, Sourabh Bajaj <
sourabhba...@google.com.invalid> wrote:

> Hi JB,
>
> https://github.com/apache/beam/pull/3563 cherrypicks the fix for BEAM-2595
> to the release branch, please review.


Reviewed and merged this PR to the release branch. I will close this
issue.


>
>
> I wasn't able to reproduce the issue in BEAM-2271 but was hoping
> https://github.com/apache/beam/pull/3563 will fix it, so would be great if
> you can take a look at it as well.
>
> Thanks
> Sourabh
>
> On Fri, Jul 14, 2017 at 10:30 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Hi Ahmet,
> >
> > sorry I missed those Jira.
> >
> > Can you help to cherry-pick and fix both issues on release-2.1.0 branch ?
> >
> > The release guide already mentions checking for open Jira issues on the release
> > target. I
> > just missed these two Jira issues, sorry about that.
> >
> > I don't think an additional list is required.
> >
> > I will cancel this vote and cut a RC2 as soon as BEAM-2595 and BEAM-2271
> > are
> > addressed.
> >
> > Thanks,
> > Regards
> > JB
> >
> > On 07/14/2017 06:15 AM, Ahmet Altay wrote:
> > > -1
> > >
> > > Thank you JB. Unfortunately I do not want to approve this RC :(. My
> > reason
> > > is that there are two open issues in the burndown list (
> > > https://s.apache.org/beam-2.1.0-burndown). I think we should either
> fix
> > > them or explicitly move them out of the list. BEAM-2595 is a regression
> > in
> > > usability (not in functionality), and it is fixed in master. We could
> > > cherry pick that. BEAM-2271 is an improvement to the release process. I
> > > would prefer fixing the process now instead of the next release cycle.
> > > However, if we want to release sooner, it is fine to clean the zip
> files
> > > manually.
> > >
> > > Another point I would like to raise is about the validation process.
> > During
> > > 2.0 release we created a list of things to validate before that
> release.
> > > Should we re-use that list for this and subsequent releases?
> > >
> > > Ahmet
> > >
> > > On Tue, Jul 11, 2017 at 6:02 AM, Jean-Baptiste Onofré <j...@nanthrax.net
> >
> > > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Please review and vote on the release candidate #1 for the version
> > 2.1.0,
> > >> as follows:
> > >>
> > >> [ ] +1, Approve the release
> > >> [ ] -1, Do not approve the release (please provide specific comments)
> > >>
> > >>
> > >> The complete staging area is available for your review, which
> includes:
> > >> * JIRA release notes [1],
> > >> * the official Apache source release to be deployed to
> dist.apache.org
> > >> [2], which is signed with the key with fingerprint C8282E76 [3],
> > >> * all artifacts to be deployed to the Maven Central Repository [4],
> > >> * source code tag "v2.1.0-RC1" [5],
> > >> * website pull request listing the release and publishing the API
> > >> reference manual [6].
> > >> * Python artifacts are deployed along with the source release to the
> > >> dist.apache.org [2].
> > >>
> > >> The vote will be open for at least 72 hours. It is adopted by majority
> > >> approval, with at least 3 PMC affirmative votes.
> > >>
> > >> Thanks,
> > >> JB
> > >>
> > >> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
> > >> ctId=12319527=12340528
> > >> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
> > >> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> > >> [4]
> > https://repository.apache.org/content/repositories/orgapachebeam-1018/
> > >> [5] https://github.com/apache/beam/tree/v2.1.0-RC1
> > >> [6] https://github.com/apache/beam-site/pull/270
> > >>
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>


Re: [VOTE] Release 2.1.0, release candidate #1

2017-07-13 Thread Ahmet Altay
-1

Thank you JB. Unfortunately I do not want to approve this RC :(. My reason
is that there are two open issues in the burndown list (
https://s.apache.org/beam-2.1.0-burndown). I think we should either fix
them or explicitly move them out of the list. BEAM-2595 is a regression in
usability (not in functionality), and it is fixed in master. We could
cherry pick that. BEAM-2271 is an improvement to the release process. I
would prefer fixing the process now instead of the next release cycle.
However, if we want to release sooner, it is fine to clean the zip files
manually.

Another point I would like to raise is about the validation process. During
2.0 release we created a list of things to validate before that release.
Should we re-use that list for this and subsequent releases?

Ahmet

On Tue, Jul 11, 2017 at 6:02 AM, Jean-Baptiste Onofré 
wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #1 for the version 2.1.0,
> as follows:
>
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2], which is signed with the key with fingerprint C8282E76 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.1.0-RC1" [5],
> * website pull request listing the release and publishing the API
> reference manual [6].
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> JB
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
> ctId=12319527=12340528
> [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1018/
> [5] https://github.com/apache/beam/tree/v2.1.0-RC1
> [6] https://github.com/apache/beam-site/pull/270
>


Re: Passing pipeline options into PTransforms and Filesystems in Python

2017-07-11 Thread Ahmet Altay
+1 to the above responses on passing options into PTransforms.

As Robert mentioned in the JIRA issue, filesystem plug-ins are in a
different category. It is reasonable for them to create credentials based
on options/environment variables. We could have a protocol for
instantiating file system plugins and pass them the pipeline options
in their entirety at that point.
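
A minimal sketch of such a protocol, assuming options are a plain dict and
plugins are looked up by URL scheme. Every name here (FileSystem,
S3LikeFileSystem, get_filesystem, the aws_access_key_id key) is hypothetical
and is not the Python SDK's actual FileSystem API.

```python
class FileSystem(object):
    """Hypothetical plugin base class; not Beam's real FileSystem."""
    scheme = None

    def __init__(self, options):
        # The plugin receives the pipeline options in their entirety
        # and picks out whatever it needs.
        self.options = options


class S3LikeFileSystem(FileSystem):
    """Hypothetical plugin that reads one credential from the options."""
    scheme = "s3"

    def __init__(self, options):
        super(S3LikeFileSystem, self).__init__(options)
        self.access_key = options.get("aws_access_key_id")


# Registry keyed by URL scheme; a real system would populate it on import.
_REGISTRY = {cls.scheme: cls for cls in (S3LikeFileSystem,)}


def get_filesystem(path, options):
    """Instantiate the plugin for the path's scheme, passing all options."""
    if "://" not in path:
        raise ValueError("path has no scheme: %r" % path)
    scheme = path.split("://", 1)[0]
    if scheme not in _REGISTRY:
        raise ValueError("no file system registered for %r" % scheme)
    return _REGISTRY[scheme](options)


fs = get_filesystem("s3://bucket/key", {"aws_access_key_id": "EXAMPLE"})
print(type(fs).__name__, fs.access_key)
```

The point of the design is that options reach the plugin at instantiation
time, so PTransforms never need to see them at construction time.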

There is also the security aspect of all of this. None of the proposed
options here will ensure that credentials passed in this way will be secure.
It is possible for the ptransform/file system plugin to log them (or do more
insecure things) once it gets the credentials.

Ahmet

On Tue, Jul 11, 2017 at 4:00 PM, Robert Bradshaw <
rober...@google.com.invalid> wrote:

> Templates, including ValueProviders, were recently added to the Python
> SDK. +1 to pursuing this train of thought (and as I mentioned on the
> bug, and has been mentioned here, we don't want to add PipelineOptions
> access to PTransforms/at construction time).
>
> On Tue, Jul 11, 2017 at 3:21 PM, Kenneth Knowles 
> wrote:
> > Hi Dmitry,
> >
> > This is a very worthwhile discussion that has recently come up on
> > StackOverflow, here: https://stackoverflow.com/a/45024542/4820657
> >
> > We actually recently _removed_ the PipelineOptions from Pipeline.apply in
> > Java since they tend to cause transforms to have implicit changes that
> make
> > them non-portable. Baking in credentials would probably fall into this
> > category.
> >
> > The other aspect to this is that we want to be able to build a pipeline
> and
> > run it later, in an environment chosen when we decide to run it. So
> > PipelineOptions are really for running, not building, a Pipeline. You can
> > still use them for arg parsing and passing specific values to transforms
> -
> > that is essentially orthogonal and just accidentally conflated.
> >
> > I can't speak to the state of Python SDK's maturity in this regard, but
> > there is a concept of a "ValueProvider" that is a deferred value that can
> > be specified by PipelineOptions when you run your pipeline. This may be
> > what you want. You build a PTransform passing some of its configuration
> > parameters as ValueProvider and at run time you set them to actual values
> > that are passed to the UDFs in your pipeline.
> >
> > Hope this helps. Despite not being deeply involved in Python, I wanted to
> > lay out the territory so someone else could comment further without
> having
> > to go into background.
> >
> > Kenn
> >
> > On Tue, Jul 11, 2017 at 3:03 PM, Dmitry Demeshchuk  >
> > wrote:
> >
> >> Hi folks,
> >>
> >> Sometimes, it would be very useful if PTransforms had access to global
> >> pipeline options, such as various credentials, settings and so on.
> >>
> >> Per conversation in https://issues.apache.org/jira/browse/BEAM-2572,
> I'd
> >> like to kick off a discussion about that.
> >>
> >> This would be beneficial for at least one major use case: support for
> >> different cloud providers (AWS, Azure, etc) and an ability to specify
> each
> >> provider's credentials just once in the pipeline options.
> >>
> >> It looks like the trickiest part is not to make the PTransform objects have
> >> access to pipeline options (we could possibly just modify the
> >> Pipeline.apply
> >>  >> apache_beam/pipeline.py#L355>
> >> method), but to actually pass these options down the road, such as to DoFn
> >> objects and FileSystem objects.
> >>
> >> I'm still in the process of reading the code and understanding what this
> >> could look like, so any input would be really appreciated.
> >>
> >> Thank you.
> >>
> >> --
> >> Best regards,
> >> Dmitry Demeshchuk.
> >>
>
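
A minimal sketch of the deferred-value idea Kenn describes above, in plain
Python. It does not use the Beam SDK: DeferredValue, AddPrefix, bind and the
"prefix" option are hypothetical names chosen for illustration, and the real
ValueProvider API may well differ. The point is only that the transform is
built against a placeholder and the concrete value arrives when the pipeline
is run.

```python
# Illustrative stand-in only: these classes are NOT the Beam SDK's own
# ValueProvider types; every name here is hypothetical.


class DeferredValue:
    """A value that may not be known until the pipeline is run."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default
        self._runtime_options = None  # filled in at run time

    def bind(self, runtime_options):
        # Called once the options for a particular run are known.
        self._runtime_options = runtime_options

    def is_accessible(self):
        return self._runtime_options is not None

    def get(self):
        if not self.is_accessible():
            raise RuntimeError("%s is not available until run time" % self.name)
        return self._runtime_options.get(self.name, self.default)


class AddPrefix:
    """Stand-in for a user function that is configured at build time."""

    def __init__(self, prefix):
        self.prefix = prefix  # a DeferredValue, not a plain string

    def process(self, element):
        # The deferred value is resolved only when elements are processed.
        return self.prefix.get() + element


# Build time: no concrete prefix is known yet.
prefix = DeferredValue("prefix", default="")
fn = AddPrefix(prefix)

# Run time: the options chosen for this particular run are bound.
prefix.bind({"prefix": "prod-"})
result = [fn.process(x) for x in ["a", "b"]]
```

Because the user function holds the placeholder rather than the value, the
same built pipeline could be run later with different options, which is the
build/run separation argued for in the thread.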


Re: Mixed-Language Pipelines

2017-07-11 Thread Ahmet Altay
Thank you Thomas. I think this will especially be great for Python SDK,
allowing it to tap into many sources that exist in the Java SDK. I added my
comments.

Ahmet

On Mon, Jul 10, 2017 at 9:58 AM, Thomas Groh 
wrote:

> Hey everyone;
>
> I've been working on a design for implementing multi-language pipelines
> within the Beam SDKs (also known as mix-and-match). This kind of pipeline
> lets us reuse transforms written in one language in any other language that
> supports the Runner API and the Fn API. Letting us write a transform once
> and run it everywhere is pretty exciting to me, so I'm pretty excited for
> this.
>
> The document is available at
> https://s.apache.org/beam-mixed-language-pipelines. Comments and questions
> are welcome, and I'm looking forward to any feedback available.
>
> Thanks,
>
> Thomas
>


Re: [Proposal] Submitting pipelines to Runners in another language

2017-07-06 Thread Ahmet Altay
Thank you Sourabh. I added my comments as well and +1 to Kenn.

On Thu, Jul 6, 2017 at 2:21 PM, Kenneth Knowles 
wrote:

> I added a few detailed comments. I definitely think we should move forward
> on this to get Python pipelines running on all of our runners, and
> hopefully that gets us ready for any future SDKs too.
>
> On Wed, Jul 5, 2017 at 2:21 PM, Sourabh Bajaj <
> sourabhba...@google.com.invalid> wrote:
>
> > Hi,
> >
> > I wanted to share a proposal for submitting pipelines from SDK X
> > (Python/Go) to runners written in another language Y (Java) (Flink / Spark
> > / Apex) using the Runner API. Please find the doc here
> >  > UVoH5BsFUofZuo/edit#>
> > .
> >
> > As always comments and feedback are welcome.
> >
> > Thanks
> > Sourabh
> >
>


Re: [DISCUSS] Apache Beam 2.1.0 release next week ?

2017-06-22 Thread Ahmet Altay
+1

For Python, there are 2 hard blocking issues (and 2 nice to haves) all
tagged as blocking 2.1.0 [1].

Ahmet

[1]
https://issues.apache.org/jira/browse/BEAM-2497?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%202.1.0%20AND%20component%20%3D%20sdk-py%20ORDER%20BY%20created%20DESC%2C%20priority%20DESC

On Thu, Jun 22, 2017 at 10:05 AM, Davor Bonaci  wrote:

> +1
>
> On Thu, Jun 22, 2017 at 5:42 AM, Etienne Chauchot 
> wrote:
>
> > Besides, there are some minor fixes/enhancements that are lacking in Spark.
> >
> > For info, below are the ones raised by the nexmark test suite:
> >
> > https://issues.apache.org/jira/browse/BEAM-2499
> >
> > https://issues.apache.org/jira/browse/BEAM-2112
> >
> > https://issues.apache.org/jira/browse/BEAM-2409
> >
> > https://issues.apache.org/jira/browse/BEAM-1035
> >
> >
> > But I don't think it blocks the release
> >
> > Best
> >
> > Etienne
> >
> >
> >
> > On 22/06/2017 at 13:46, Aviem Zur wrote:
> >
> >> +1
> >> There are important bug fixes that need to be released.
> >>
> >> On Thu, Jun 22, 2017 at 11:42 AM Etienne Chauchot 
> >> wrote:
> >>
> >> +1 on Ismaël words, but not a blocking point indeed, maybe more a nice
> >>> to have.
> >>>
> >>>
> >>> On 22/06/2017 at 06:59, Ismaël Mejía wrote:
> >>>
>  Thanks JB for keeping the time based release agenda. I really don't
>  have any blocker but I would like to have the hadoop version alignment
>  PR merged before this one and probably also Nexmark (considering that
>  Etienne fixed most of the issues and we have already the LGTM, we are
>  just waiting for a last review on the final fixes).
>  Of course none of the two are blockers, but it would be nice to get
>  them merged out if possible.
>  When do you plan to start the vote?
> 
>  On Thu, Jun 22, 2017 at 6:48 AM, Reuven Lax wrote:
>
>  > Does mean that value-dependent FileBasedSink will miss 2.1.0, but I guess
>  > it will make 2.2.0 then.
>  >
>  > On Wed, Jun 21, 2017 at 7:23 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
>  > wrote:
>  >
>  >> Hi guys,
>  >>
>  >> As we released 2.0.0 (first stable release) last month during ApacheCon,
>  >> and to maintain our release pace, I would like to release 2.1.0 next week.
>  >>
>  >> This release would include lot of bug fixes and some new features:
>  >>
>  >> https://issues.apache.org/jira/projects/BEAM/versions/12340528
>  >>
>  >> I'm volunteer to be release manager for this one.
>  >>
>  >> Thoughts ?
>  >>
>  >> Thanks,
>  >> Regards
>  >> JB
>  >> --
>  >> Jean-Baptiste Onofré
>  >> jbono...@apache.org
>  >> http://blog.nanthrax.net
>  >> Talend - http://www.talend.com
>  >>
> >
>


Re: Windows OS Compatibility and Jenkins Postcommit

2017-06-09 Thread Ahmet Altay
Thank you Luke, this is great! I hope that we will see a mac version of
this at some point.

It looks like there is no python binary and python tests are failing for
that.

Ahmet

On Thu, Jun 8, 2017 at 3:55 PM, Flavio Fiszman 
wrote:

> Thanks Luke! I'm working on fixing the Windows incompatibilities related to
> file schemes and globs in FileSystems.
>
> On Thu, Jun 8, 2017 at 3:52 PM, Lukasz Cwik 
> wrote:
>
> > I added the Windows Jenkins Postcommit run so that people can address a
> > class of issues when attempting to build, test, run the SDK with the
> > Windows OS. It is not expected that the Jenkins Postcommit passes or will
> > pass until the issues below are addressed. I was hoping that contributors
> > who are able to develop, test, and use the SDK with the Windows OS could
> > help fix known issues and enumerate others.
> >
> > Postcommit:
> > https://builds.apache.org/view/All/job/beam_PostCommit_Java_MavenInstall_Windows/
> > Github trigger: Run Java Windows PostCommit
> >
> > The current list of known Windows OS incompatibilities:
> > Support building/running archetypes:
> > https://issues.apache.org/jira/browse/BEAM-1042
> > Filesystem path pattern issues:
> > https://issues.apache.org/jira/browse/BEAM-1045
> > Launch Apex local cluster: https://issues.apache.org/jira/browse/BEAM-2269
> > Build the Apache Beam SDK: https://issues.apache.org/jira/browse/BEAM-2299
> >
>


Re: Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3982

2017-06-01 Thread Ahmet Altay
Thank you Kenn. https://github.com/apache/beam/pull/3278 is for fixing the
lint errors.

On Thu, Jun 1, 2017 at 11:11 AM, Kenneth Knowles  wrote:

> On the surface, it looks like Python lint failed, but I failed after a few
> seconds to track down the actionable error message.
>
> On Thu, Jun 1, 2017 at 10:27 AM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> See > tall/3982/display/redirect?page=changes>
>>
>> Changes:
>>
>> [altay] Add template examples to snippets.py
>>
>> --
>> [...truncated 2.98 MB...]
>> test_visit_entire_graph (apache_beam.pipeline_test.PipelineTest) ... ok
>> test_simple (apache_beam.pipeline_test.RunnerApiTest)
>> Tests serializing, deserializing, and running a simple pipeline. ... ok
>> test_pvalue_expected_arguments (apache_beam.pvalue_test.PValueTest) ...
>> ok
>> test_file_checksum_matchcer_invalid_sleep_time
>> (apache_beam.testing.pipeline_verifiers_test.PipelineVerifiersTest) ... <
>> https://builds.apache.org/job/beam_PostCommit_Java_MavenIns
>> tall/ws/sdks/python/apache_beam/testing/pipeline_verifiers_test.py>:129:
>> DeprecationWarning: BaseException.message has been deprecated as of Python
>> 2.6
>>   self.assertEqual(cm.exception.message,
>> ok
>> test_file_checksum_matcher_read_failed (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest) ... ok
>> test_file_checksum_matcher_service_error (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest) ... ok
>> test_file_checksum_matcher_sleep_before_verify
>> (apache_beam.testing.pipeline_verifiers_test.PipelineVerifiersTest) ...
>> ok
>> test_file_checksum_matcher_success (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest) ... ok
>> test_pipeline_state_matcher_fails (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest)
>> Test PipelineStateMatcher fails when using default expected state ... ok
>> test_pipeline_state_matcher_given_state (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest)
>> Test PipelineStateMatcher successes when matches given state ... ok
>> test_pipeline_state_matcher_success (apache_beam.testing.pipeline_
>> verifiers_test.PipelineVerifiersTest)
>> Test PipelineStateMatcher successes when using default expected state ...
>> ok
>> test_append_extra_options 
>> (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_append_verifier_in_extra_opt 
>> (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_create_test_pipeline_options 
>> (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_empty_option_args_parsing 
>> (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_get_option (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_option_args_parsing 
>> (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... ok
>> test_skip_IT (apache_beam.testing.test_pipeline_test.TestPipelineTest)
>> ... SKIP: IT is skipped because --test-pipeline-options is not specified
>> test_basic_test_stream (apache_beam.testing.test_stream_test.TestStreamTest)
>> ... ok
>> test_test_stream_errors (apache_beam.testing.test_stream_test.TestStreamTest)
>> ... ok
>> test_assert_that_fails (apache_beam.testing.util_test.UtilTest) ... ok
>> test_assert_that_fails_on_empty_expected 
>> (apache_beam.testing.util_test.UtilTest)
>> ... ok
>> test_assert_that_fails_on_empty_input 
>> (apache_beam.testing.util_test.UtilTest)
>> ... ok
>> test_assert_that_passes (apache_beam.testing.util_test.UtilTest) ... ok
>>
>> --
>> Ran 1185 tests in 181.744s
>>
>> OK (skipped=17)
>> ___ summary
>> 
>>   docs: commands succeeded
>> ERROR:   lint: commands failed
>>   py27: commands succeeded
>>   py27cython: commands succeeded
>>   py27gcp: commands succeeded
>> 2017-06-01T17:26:23.147 [ERROR] Command execution failed.
>> org.apache.commons.exec.ExecuteException: Process exited with an error:
>> 1 (Exit value: 1)
>> at org.apache.commons.exec.DefaultExecutor.executeInternal(Defa
>> ultExecutor.java:404)
>> at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecu
>> tor.java:166)
>> at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.
>> java:764)
>> at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.
>> java:711)
>> at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289)
>> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMoj
>> o(DefaultBuildPluginManager.java:134)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(Moj
>> oExecutor.java:208)
>> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(Moj
>> oExecutor.java:153)
>> at 

Re: low availability in the coming 4 weeks

2017-05-25 Thread Ahmet Altay
Congratulations!

On Thu, May 25, 2017 at 10:54 AM, Jean-Baptiste Onofré 
wrote:

> Congrats and enjoy !
>
> Regards
> JB
>
>
> On 05/25/2017 05:33 AM, Mingmin Xu wrote:
>
>> Hello everyone,
>>
>> I'll take 4 weeks off to take care of my new born baby. I'm very glad that
>> James Xu agrees to take my role in Beam SQL feature.
>>
>> Ps, I'll consolidate the PR for BEAM-2010 soon before that.
>>
>> Thank you!
>> 
>> Mingmin
>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: First stable release completed!

2017-05-17 Thread Ahmet Altay
Congratulations everyone, this is great!

On Wed, May 17, 2017 at 7:26 AM, Kenneth Knowles 
wrote:

> Awesome. A huge step.
>
> On Wed, May 17, 2017 at 6:30 AM, Andrew Psaltis 
> wrote:
>
> > This is fantastic.  Great job!
> > On Wed, May 17, 2017 at 08:20 Jean-Baptiste Onofré 
> > wrote:
> >
> > > > Huge congrats to everyone who helped reach this important milestone!
> > >
> > > Honestly, we are a great team, WE ROCK ! ;)
> > >
> > > Regards
> > > JB
> > >
> > > On 05/17/2017 01:28 PM, Davor Bonaci wrote:
> > > > The first stable release is now complete!
> > > >
> > > > > Release artifacts are available through various repositories, including
> > > > > dist.apache.org, Maven Central, and PyPI. The website is updated, and
> > > > > announcements are published.
> > > >
> > > > Apache Software Foundation press release:
> > > >
> > > > > http://globenewswire.com/news-release/2017/05/17/986839/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-v2-0-0.html
> > > >
> > > > Beam blog:
> > > > > https://beam.apache.org/blog/2017/05/17/beam-first-stable-release.html
> > > >
> > > > Congratulations to everyone -- this is a really big milestone for the
> > > > project, and I'm proud to be a part of this great community.
> > > >
> > > > Davor
> > > >
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> > --
> > Thanks,
> > Andrew
> >
> > Subscribe to my book: Streaming Data 
> > 
> > twitter: @itmdata
> >
>


Re: First stable release: version designation?

2017-05-05 Thread Ahmet Altay
I would also like to vote strongly for 2.0, for the same reasons Dan
mentioned. It will be less confusing for the users overall.

Ahmet

On Fri, May 5, 2017 at 9:33 AM, Davor Bonaci  wrote:

> Strongly for 2.0.0:
> * Aljoscha
> * Cham
> * Dan
> * Luke
>
> Slight preference toward 2.0.0, but fine with 1.0.0:
> * Davor
> * Ismael
> * Kenn
>
> Strongly for 1.0.0: none.
>
> Slight preference toward 1.0.0, but fine with 2.0.0:
> * Amit
> * Jesse
> * JB
> * Manu
> * Mingmin
> * Ted
> * Thomas W.
>
> Unbelievably, the tally is 7 : 7. However, the 2.0 camp tends to feel more
> strongly, and we have nobody who feels strongly for 1.0. Thus, it seems
> going with 2.0.0 is the path of least resistance.
>
> With that, I'll start building the 2.0.0 RCs, and we'll formally
> ratify/reject this decision in an RC vote.
>
> On Thu, May 4, 2017 at 6:30 PM, María García Herrero <
> mari...@google.com.invalid> wrote:
>
> > The bigger letters aimed to indicate "strongly in favor of" as opposed to
> > "weakly in favor of." I'm OK with not using a doc, just responding to Ted's
> > question.
> >
> > On Thu, May 4, 2017 at 3:39 PM, Ted Yu  wrote:
> >
> > > What's the difference between first and second, third and fourth columns?
> > >
> > > On Thu, May 4, 2017 at 3:36 PM, María García Herrero <
> > > mari...@google.com.invalid> wrote:
> > >
> > > > Thanks for the suggestion, Ted. Get your vote in here
> > > >  > > > Wqz5B6eQ40TEgk/edit?usp=sharing>
> > > > .
> > > > I have already added all the votes that Davor compiled 3 hours ago and the
> > > > responses afterwards.
> > > >
> > > > On Thu, May 4, 2017 at 12:49 PM, Ted Yu  wrote:
> > > >
> > > > > Maybe create a google doc with columns as the camps.
> > > > >
> > > > > Each person can put his/her name under the camp in his/her favor.
> > > > >
> > > > > On Thu, May 4, 2017 at 12:32 PM, Thomas Weise 
> > wrote:
> > > > >
> > > > > > I'm in the relaxed 1.0.0 camp.
> > > > > >
> > > > > > --
> > > > > > sent from mobile
> > > > > > On May 4, 2017 12:29 PM, "Mingmin Xu" 
> wrote:
> > > > > >
> > > > > > > > I slightly prefer 1.0.0 for the *first* stable release, but fine with
> > > > > > > > 2.0.0.
> > > > > > >
> > > > > > > On Thu, May 4, 2017 at 12:25 PM, Lukasz Cwik
> > > >  > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Put me under Strongly for 2.0.0
> > > > > > > >
> > > > > > > > On Thu, May 4, 2017 at 12:24 PM, Kenneth Knowles
> > > > > >  > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I'll join Davor's group.
> > > > > > > > >
> > > > > > > > > On Thu, May 4, 2017 at 12:07 PM, Davor Bonaci <
> > > da...@apache.org>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > > I don't think we have reached a consensus here yet. Let's re-examine this
> > > > > > > > > > > after some time has passed.
> > > > > > > > > > >
> > > > > > > > > > > If I understand everyone's opinion correctly, this is the summary:
> > > > > > > > > >
> > > > > > > > > > Strongly for 2.0.0:
> > > > > > > > > > * Aljoscha
> > > > > > > > > > * Dan
> > > > > > > > > >
> > > > > > > > > > Slight preference toward 2.0.0, but fine with 1.0.0:
> > > > > > > > > > * Davor
> > > > > > > > > >
> > > > > > > > > > Strongly for 1.0.0: none.
> > > > > > > > > >
> > > > > > > > > > Slight preference toward 1.0.0, but fine with 2.0.0:
> > > > > > > > > > * Amit
> > > > > > > > > > * Jesse
> > > > > > > > > > * JB
> > > > > > > > > > * Ted
> > > > > > > > > >
> > > > > > > > > > Any additional opinions?
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > Davor
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 8, 2017 at 12:58 PM, Amit Sela <
> > > > amitsel...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > > If we were to go with a 2.0 release, we would have to be very clear on
> > > > > > > > > > > > maturity of different modules; for example python SDK is not as mature as
> > > > > > > > > > > > Java SDK, some runners support streaming better than others, some run on
> > > > > > > > > > > > YARN better than others, etc.
> > > > > > > > > > > >
> > > > > > > > > > > > My only reservation here is that the Apache community usually expects
> > > > > > > > > > > > version 2.0 to be a mature product, so I'm OK as long as we do some
> > > > > > > > > > > > "maturity-analysis" and document properly.
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 4:48 AM Ted Yu <
> > yuzhih...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > If 

Re: Congratulations Davor!

2017-05-04 Thread Ahmet Altay
Congratulations, well deserved!

On Thu, May 4, 2017 at 10:35 AM, Andrew Psaltis 
wrote:

> Congrats Davor!
>
> On Thu, May 4, 2017 at 1:34 PM, Melissa Pashniak <
> meliss...@google.com.invalid> wrote:
>
> > Congratulations Davor!
> >
> >
> > On Thu, May 4, 2017 at 10:32 AM, Robert Bradshaw <
> > rober...@google.com.invalid> wrote:
> >
> > > Congratulations, Davor! Well deserved.
> > >
> > > On Thu, May 4, 2017 at 9:53 AM, Hadar Hod 
> > > wrote:
> > > > Congrats, Davor!
> > > >
> > > > On Thu, May 4, 2017 at 8:56 AM, Chamikara Jayalath <
> > chamik...@apache.org
> > > >
> > > > wrote:
> > > >
> > > >> Congrats Davor. Very well deserved.
> > > >>
> > > >> - Cham
> > > >>
> > > >> On Thu, May 4, 2017, 8:51 AM tarush grover  >
> > > >> wrote:
> > > >>
> > > >> > Congrats Davor
> > > >> >
> > > >> > Regards,
> > > >> > Tarush
> > > >> > On Thu, 4 May 2017 at 8:54 PM, Frances Perry 
> > > wrote:
> > > >> >
> > > >> > > Woohoo! So well deserved.
> > > >> > >
> > > >> > > On Thu, May 4, 2017 at 8:18 AM, Etienne Chauchot <
> > > echauc...@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Congratulations Davor!
> > > >> > > >
> > > >> > > > Well deserved indeed!
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > Le 04/05/2017 à 17:02, Thomas Groh a écrit :
> > > >> > > >
> > > >> > > >> Congratulations!
> > > >> > > >>
> > > >> > > >> On Thu, May 4, 2017 at 7:56 AM, Thomas Weise  >
> > > >> wrote:
> > > >> > > >>
> > > >> > > >> Congrats!
> > > >> > > >>>
> > > >> > > >>>
> > > >> > > >>> On Thu, May 4, 2017 at 7:53 AM, Sourabh Bajaj <
> > > >> > > >>> sourabhba...@google.com.invalid> wrote:
> > > >> > > >>>
> > > >> > > >>> Congrats!!
> > > >> > >  On Thu, May 4, 2017 at 7:48 AM Mingmin Xu <
> > mingm...@gmail.com>
> > > >> > wrote:
> > > >> > > 
> > > >> > >  Congratulations @Davor!
> > > >> > > >
> > > >> > > >
> > > >> > > > On May 4, 2017, at 7:08 AM, Amit Sela <
> amitsel...@gmail.com
> > >
> > > >> > wrote:
> > > >> > > >>
> > > >> > > >> Congratulations Davor!
> > > >> > > >>
> > > >> > > >> On Thu, May 4, 2017, 10:02 JingsongLee <
> > > lzljs3620...@aliyun.com
> > > >> >
> > > >> > > >>>
> > > >> > > >> wrote:
> > > >> > > 
> > > >> > > > Congratulations!
> > > >> > > >>>
> > > >> > 
> --
> > > >> > > >>> From:Jesse Anderson 
> > > >> > > >>> Time:2017 May 4 (Thu) 21:36
> > > >> > > >>> To:dev 
> > > >> > > >>> Subject:Re: Congratulations Davor!
> > > >> > > >>> Congrats!
> > > >> > > >>>
> > > >> > > >>> On Thu, May 4, 2017, 6:20 AM Aljoscha Krettek <
> > > >> > aljos...@apache.org
> > > >> > > 
> > > >> > > >>> wrote:
> > > >> > > >
> > > >> > > >> Congrats! :-)
> > > >> > > 
> > > >> > > > On 4. May 2017, at 14:34, Kenneth Knowles
> > > >> >  > > >> > > >
> > > >> > >  wrote:
> > > >> > > 
> > > >> > > > Awesome!
> > > >> > > >
> > > >> > > > On Thu, May 4, 2017 at 1:19 AM, Ted Yu <
> > > yuzhih...@gmail.com>
> > > >> > > >>
> > > >> > > > wrote:
> > > >> > > 
> > > >> > > > Congratulations, Davor!
> > > >> > > >>
> > > >> > > >> On Thu, May 4, 2017 at 12:45 AM, Aviem Zur <
> > > >> > aviem...@gmail.com
> > > >> > > >>
> > > >> > > > wrote:
> > > >> > > 
> > > >> > > > Congrats Davor! :)
> > > >> > > >>>
> > > >> > > >>> On Thu, May 4, 2017 at 10:42 AM Jean-Baptiste
> Onofré <
> > > >> > > >>>
> > > >> > > >> j...@nanthrax.net>
> > > >> > > >>>
> > > >> > >  wrote:
> > > >> > > >>>
> > > >> > > >>> Congrats ! Well deserved ;)
> > > >> > > 
> > > >> > >  Regards
> > > >> > >  JB
> > > >> > > 
> > > >> > >  On 05/04/2017 09:30 AM, Jason Kuster wrote:
> > > >> > > > Hi all,
> > > >> > > >
> > > > >> > > > The ASF has just published a blog post[1] welcoming new members of
> > > > >> > > > the Apache Software Foundation, and our own Davor Bonaci is among
> > > > >> > > > them!
> > > > >> > > >
> > > > >> > > > Congratulations and thank you to Davor for all of your work for the
> > > > >> > > > Beam community, and the ASF at large. Well deserved.
> > > > >> > > >
> 

Re: Community hackathon

2017-04-24 Thread Ahmet Altay
+1, this is a great idea.

On Mon, Apr 24, 2017 at 3:54 AM, JingsongLee 
wrote:

> +1
> best,
> Jingsonglee
> --
> From: Ted Yu
> Time: 2017 Apr 24 (Mon) 17:29
> To: dev <dev@beam.apache.org>
> Subject: Re: Community hackathon
> +1
>
> > On Apr 24, 2017, at 12:51 AM, Jean-Baptiste Onofré wrote:
> >
> > That's a wonderful idea !
> >
> > I think the easiest way to organize this event is using the Slack channels
> > to discuss, help each other, and sync together.
> >
> > Regards
> > JB
> >
> >> On 04/24/2017 09:48 AM, Davor Bonaci wrote:
> >> We've been working as a community towards the first stable release for a
> >> while now, and I think we made a ton of progress across the board over the
> >> last few weeks.
> >>
> >> We could try to organize a community-wide hackathon to identify and fix
> >> those last few issues, as well as to get a better sense of the overall
> >> project quality as it stands right now.
> >>
> >> This could be a self-organized event, and coordinated via the Slack
> >> channel. For example, we (as a community and participants) can try out the
> >> project in various ways -- quickstart, examples, different runners,
> >> different platforms -- immediately fixing issues as we run into them. It
> >> could last, say, 24 hours, with people from different time zones
> >> participating at the time of their choosing.
> >>
> >> Thoughts?
> >>
> >> Davor
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>


Re: Python build artifacts seem to be misconfigured

2017-04-14 Thread Ahmet Altay
Robert, would passing `build_dir='target/build'` to `cythonize` accomplish
what Davor is saying?

Using build_dir seems to be working but I could not find documentation of
it.

Ahmet

On Wed, Apr 12, 2017 at 7:11 PM, Davor Bonaci <da...@apache.org> wrote:

> I think it would be great if it can be configured that all (or, as many as
> possible) build-generated files use one specific directory -- "target/".
> Likely, all problems would just go away.
>
> On Wed, Apr 12, 2017 at 3:24 PM, Ahmet Altay <al...@google.com.invalid>
> wrote:
>
> > This is also the root cause of the flakiness in test_using_slow_impl, which
> > is very flaky locally (https://issues.apache.org/jira/browse/BEAM-1910).
> >
> > Kenn, have you found anything that might explain why tox is not deleting
> > them?
> >
> > Ahmet
> >
> > On Tue, Apr 11, 2017 at 11:50 AM, Robert Bradshaw <
> > rober...@google.com.invalid> wrote:
> >
> > > We should also ignore them: https://github.com/apache/beam/pull/2494
> > >
> > > On Thu, Apr 6, 2017 at 6:45 PM, Kenneth Knowles <k...@google.com.invalid
> >
> > > wrote:
> > > > Thanks for the pointer. I'll dig in to tox docs to see why this isn't
> > > > happening. Probably something to do with unclean shutdowns.
> > > >
> > > > On Thu, Apr 6, 2017 at 6:10 PM, Vikas RK <vikky...@gmail.com> wrote:
> > > >
> > > >> Those are cython generated files that should be deleted according to
> > > >> https://github.com/apache/beam/blob/master/sdks/python/tox.ini#L54
> > > >>
> > > >>
> > > >>
> > > >> On 6 April 2017 at 17:58, Kenneth Knowles <k...@google.com.invalid>
> > > wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > It appears that the Python build process creates quite a few files that are
> > > >> > not accounted for in our .gitignore and that also trip the RAT check next
> > > >> > time around. These should be set up so that RAT and git both ignore the
> > > >> > files.
> > > >> >
> > > >> > It is possible that others have defaults that differ from mine, but
> > > >> > droppings from a recent `mvn verify` include:
> > > >> >
> > > >> > sdks/python/apache_beam/coders/coder_impl.c
> > > >> > sdks/python/apache_beam/coders/coder_impl.so
> > > >> > sdks/python/apache_beam/coders/stream.c
> > > >> > sdks/python/apache_beam/coders/stream.so
> > > >> > sdks/python/apache_beam/metrics/execution.c
> > > >> > sdks/python/apache_beam/metrics/execution.so
> > > >> > sdks/python/apache_beam/runners/common.c
> > > >> > sdks/python/apache_beam/runners/common.so
> > > >> > sdks/python/apache_beam/transforms/cy_combiners.c
> > > >> > sdks/python/apache_beam/transforms/cy_combiners.so
> > > >> > sdks/python/apache_beam/utils/counters.c
> > > >> > sdks/python/apache_beam/utils/counters.so
> > > >> > sdks/python/apache_beam/utils/windowed_value.c
> > > >> > sdks/python/apache_beam/utils/windowed_value.so
> > > >> > sdks/python/nose-1.3.7-py2.7.egg/
> > > >> >
> > > >> > Can someone who knows the Python SDK build process rectify?
> > > >> >
> > > >> > Kenn
> > > >> >
> > > >>
> > >
> >
>
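
The cleanup step discussed in this thread (removing Cython-generated files so
that neither RAT nor git trips over them) could be sketched as below. This is
an assumption-laden illustration, not the project's tox.ini logic:
remove_generated_files is a hypothetical helper, and the ".c" / ".so"
suffixes are taken only from the file list Kenn quoted. Note that it would
also delete any hand-written .c file under the given root, so it is only
safe for a tree where all such files are build output.

```python
import os

# Hypothetical helper, not part of the Beam build. The suffixes are an
# assumption based on the generated files listed in this thread.
GENERATED_SUFFIXES = (".c", ".so")


def remove_generated_files(root, suffixes=GENERATED_SUFFIXES):
    """Delete files under root whose names end with one of the suffixes.

    Returns the sorted list of removed paths, relative to root.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(os.path.relpath(path, root))
    return sorted(removed)
```

Directing build output into a single directory such as "target/", as Davor
suggests, would make a sweep like this unnecessary, since one directory
could be ignored and deleted wholesale.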


Re: Python build artifacts seem to be misconfigured

2017-04-12 Thread Ahmet Altay
This is also the root cause of the flakiness in test_using_slow_impl, which
is very flaky locally (https://issues.apache.org/jira/browse/BEAM-1910).

Kenn, have you found anything that might explain why tox is not deleting
them?

Ahmet

On Tue, Apr 11, 2017 at 11:50 AM, Robert Bradshaw <
rober...@google.com.invalid> wrote:

> We should also ignore them: https://github.com/apache/beam/pull/2494
>
> On Thu, Apr 6, 2017 at 6:45 PM, Kenneth Knowles 
> wrote:
> > Thanks for the pointer. I'll dig in to tox docs to see why this isn't
> > happening. Probably something to do with unclean shutdowns.
> >
> > On Thu, Apr 6, 2017 at 6:10 PM, Vikas RK  wrote:
> >
> >> Those are cython generated files that should be deleted according to
> >> https://github.com/apache/beam/blob/master/sdks/python/tox.ini#L54
> >>
> >>
> >>
> >> On 6 April 2017 at 17:58, Kenneth Knowles 
> wrote:
> >>
> >> > Hi all,
> >> >
> >> > It appears that the Python build process creates quite a few files that are
> >> > not accounted for in our .gitignore and that also trip the RAT check next
> >> > time around. These should be set up so that RAT and git both ignore the
> >> > files.
> >> >
> >> > It is possible that others have defaults that differ from mine, but
> >> > droppings from a recent `mvn verify` include:
> >> >
> >> > sdks/python/apache_beam/coders/coder_impl.c
> >> > sdks/python/apache_beam/coders/coder_impl.so
> >> > sdks/python/apache_beam/coders/stream.c
> >> > sdks/python/apache_beam/coders/stream.so
> >> > sdks/python/apache_beam/metrics/execution.c
> >> > sdks/python/apache_beam/metrics/execution.so
> >> > sdks/python/apache_beam/runners/common.c
> >> > sdks/python/apache_beam/runners/common.so
> >> > sdks/python/apache_beam/transforms/cy_combiners.c
> >> > sdks/python/apache_beam/transforms/cy_combiners.so
> >> > sdks/python/apache_beam/utils/counters.c
> >> > sdks/python/apache_beam/utils/counters.so
> >> > sdks/python/apache_beam/utils/windowed_value.c
> >> > sdks/python/apache_beam/utils/windowed_value.so
> >> > sdks/python/nose-1.3.7-py2.7.egg/
> >> >
> >> > Can someone who knows the Python SDK build process rectify?
> >> >
> >> > Kenn
> >> >
> >>
>


Re: [DISCUSSION] Consistent use of loggers

2017-03-28 Thread Ahmet Altay
On Wed, Mar 22, 2017 at 10:38 AM, Tibor Kiss  wrote:

> This is a great idea!
>
> I believe Python-SDK's logging could also be enhanced (a bit differently):
> Currently we are not instantiating a logger, just using what the logging
> package provides directly.
> A shortcoming of this approach is that the user cannot set the log level on
> a per-module basis, as all log messages end up in the root logger.
>

+1 to this. Python SDK needs to expand its logging capabilities. Filed [1]
for this.

Ahmet

[1] https://issues.apache.org/jira/browse/BEAM-1825


>
> On 3/22/17, 5:46 AM, "Aviem Zur"  wrote:
>
> +1 to what JB said.
>
> Will just have to be documented well as if we provide no binding there will
> be no logging out of the box unless the user adds a binding.
>
> On Wed, Mar 22, 2017 at 6:24 AM Jean-Baptiste Onofré 
> wrote:
>
> > Hi Aviem,
> >
> > Good point.
> >
> > I think, in our dependencies set, we should just depend on slf4j-api and
> > let the user provide the binding they want (slf4j-log4j12, slf4j-simple,
> > whatever).
> >
> > We define a binding only with test scope in our modules.
> >
> > Regards
> > JB
> >
> > On 03/22/2017 04:58 AM, Aviem Zur wrote:
> > > Hi all,
> > >
> > > There have been a few reports lately (On JIRA [1] and on Slack) from users
> > > regarding inconsistent loggers used across Beam's modules.
> > >
> > > While we use SLF4J, different modules use a different logger behind it
> > > (JUL, log4j, etc)
> > > So when people add a log4j.properties file to their classpath for instance,
> > > they expect this to affect all of their dependencies on Beam modules, but
> > > it doesn’t and they miss out on some logs they thought they would see.
> > >
> > > I think we should strive for consistency in which logger is used behind
> > > SLF4J, and try to enforce this in our modules.
> > > I for one think it should be slf4j-log4j. However, if performance of
> > > logging is critical we might want to consider logback.
> > >
> > > Note: SLF4J will still be the facade for logging across the project. The
> > > only change would be the logger SLF4J delegates to.
> > >
> > > Once we have something like this it would also be useful to add
> > > documentation on logging in Beam to the website.
> > > [1] https://issues.apache.org/jira/browse/BEAM-1757
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
>
>


Re: [ANNOUNCEMENT] New committers, March 2017 edition!

2017-03-17 Thread Ahmet Altay
Congratulations to all of you!

On Fri, Mar 17, 2017 at 4:28 PM, Chamikara Jayalath 
wrote:

> Thanks all. Congrats to other new committers !!
>
> I'm very excited to join.
>
> - Cham
>
> On Fri, Mar 17, 2017 at 3:02 PM Mark Liu 
> wrote:
>
> > Congrats to all of them!
> >
> > On Fri, Mar 17, 2017 at 2:24 PM, Neelesh Salian <
> neeleshssal...@gmail.com>
> > wrote:
> >
> > > Congratulations!
> > >
> > > > On Fri, Mar 17, 2017 at 2:16 PM, Kenneth Knowles wrote:
> > >
> > > > Congrats all!
> > > >
> > > > On Fri, Mar 17, 2017 at 2:13 PM, Davor Bonaci 
> > wrote:
> > > >
> > > > > Please join me and the rest of Beam PMC in welcoming the following
> > > > > contributors as our newest committers. They have significantly
> > > > contributed
> > > > > to the project in different ways, and we look forward to many more
> > > > > contributions in the future.
> > > > >
> > > > > * Chamikara Jayalath
> > > > > Chamikara has been contributing to Beam since inception, and
> > previously
> > > > to
> > > > > Google Cloud Dataflow, accumulating a total of 51 commits (8,301
> ++ /
> > > > 3,892
> > > > > --) since February 2016 [1]. He contributed broadly to the project,
> > but
> > > > > most significantly to the Python SDK, building the IO framework in
> > this
> > > > SDK
> > > > > [2], [3].
> > > > >
> > > > > * Eugene Kirpichov
> > > > > Eugene has been contributing to Beam since inception, and
> previously
> > to
> > > > > Google Cloud Dataflow, accumulating a total of 95 commits (22,122
> ++
> > /
> > > > > 18,407 --) since February 2016 [1]. In recent months, he’s been
> > driving
> > > > the
> > > > > Splittable DoFn effort [4]. A true expert on IO subsystem, Eugene
> has
> > > > > reviewed nearly every IO contributed to Beam. Finally, Eugene
> > > contributed
> > > > > the Beam Style Guide, and is championing it across the project.
> > > > >
> > > > > * Ismaël Mejia
> > > > > Ismaël has been contributing to Beam since mid-2016, accumulating a
> > > total
> > > > > of 35 commits (3,137 ++ / 1,328 --) [1]. He authored the HBaseIO
> > > > connector,
> > > > > helped on the Spark runner, and contributed in other areas as well,
> > > > > including cross-project collaboration with Apache Zeppelin. Ismaël
> > > > reported
> > > > > 24 Jira issues.
> > > > >
> > > > > * Aviem Zur
> > > > > Aviem has been contributing to Beam since early fall, accumulating
> a
> > > > total
> > > > > of 49 commits (6,471 ++ / 3,185 --) [1]. He reported 43 Jira
> issues,
> > > and
> > > > > resolved ~30 issues. Aviem improved the stability of the Spark
> > runner a
> > > > > lot, and introduced support for metrics. Finally, Aviem is
> > championing
> > > > > dependency management across the project.
> > > > >
> > > > > Congratulations to all four! Welcome!
> > > > >
> > > > > Davor
> > > > >
> > > > > [1]
> > > > > https://github.com/apache/beam/graphs/contributors?from=
> > > > > 2016-02-01=2017-03-17=c
> > > > > [2]
> > > > > https://github.com/apache/beam/blob/v0.6.0/sdks/python/
> > > > > apache_beam/io/iobase.py#L70
> > > > > [3]
> > > > > https://github.com/apache/beam/blob/v0.6.0/sdks/python/
> > > > > apache_beam/io/iobase.py#L561
> > > > > [4] https://s.apache.org/splittable-do-fn
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Neelesh S. Salian
> > >
> >
>


Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Ahmet Altay
JB,

0.6.0 is flagged as released now, thank you for catching this. As a side
note, I did not have enough permissions to do this and asked Davor to do it. I
will add this to the release notes.

Ahmet

On Wed, Mar 15, 2017 at 7:16 AM, Jesse Anderson <je...@bigdatainstitute.io>
wrote:

> Excellent!
>
> On Wed, Mar 15, 2017, 6:13 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Hi Ahmet,
> >
> > it seems Jira is not up to date: 0.6.0 version is not flagged as
> > "Released".
> >
> > Can you fix that please ?
> >
> > Thanks !
> > Regards
> > JB
> >
> > On 03/15/2017 05:22 AM, Ahmet Altay wrote:
> > > I'm happy to announce that we have unanimously approved this release.
> > >
> > > There are 7 approving votes, 4 of which are binding:
> > > * Aljoscha Krettek
> > > * Davor Bonaci
> > > * Ismaël Mejía
> > > * Jean-Baptiste Onofré
> > > * Robert Bradshaw
> > > * Ted Yu
> > > * Tibor Kiss
> > >
> > > There are no disapproving votes.
> > >
> > > Thanks everyone!
> > >
> > > Ahmet
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
> --
> Thanks,
>
> Jesse
>


Re: [RESULT] [VOTE] Release 0.6.0, release candidate #2

2017-03-15 Thread Ahmet Altay
This release is now complete. Thanks to everyone who has helped make this
release possible!

Before sending a note to users@, I would like to make a pass over the
website and simplify things now that we have an official python release. I
did the first 'pip install apache-beam' today and it felt amazing!

Ahmet


On Tue, Mar 14, 2017 at 2:22 PM, Ahmet Altay <al...@google.com> wrote:

> I'm happy to announce that we have unanimously approved this release.
>
> There are 7 approving votes, 4 of which are binding:
> * Aljoscha Krettek
> * Davor Bonaci
> * Ismaël Mejía
> * Jean-Baptiste Onofré
> * Robert Bradshaw
> * Ted Yu
> * Tibor Kiss
>
> There are no disapproving votes.
>
> Thanks everyone!
>
> Ahmet
>

