Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Pei HE
+1 on moving forward with Spark 2.x only.
Spark 1 users can still use the already released Spark runners, and we can
support them with minor version releases for future bug fixes.

I don't see how important it is to make future Beam releases available to
Spark 1 users. If they choose not to upgrade their Spark clusters, maybe they
don't need the newest Beam releases either.

I think it is more important to 1) be able to leverage new features in
Spark 2.x, and 2) extend the user base to Spark 2.
--
Pei



Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Holden Karau
That's a good point about Oozie only supporting Spark 1 or 2 at a
time on a cluster -- but do we know people using Oozie and Spark 1 that
would still be using Spark 1 by the time of the next Beam release? The last
Spark 1 release was a year ago (and the last non-maintenance release almost 20
months ago).


Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread NerdyNick
I don't know if ditching Spark 1 outright right now would be a great move,
given that a lot of the main support applications around Spark haven't
fully moved to Spark 2 yet, let alone added support for having a cluster
with both. Oozie, for example, is still pre stable release for their Spark 1
and can't support a cluster with mixed Spark versions. I think maybe doing
as suggested above with the common, spark1, spark2 packaging might be best
during this carry-over phase. Maybe even just flagging Spark 1 as deprecated
and maintenance-only might be enough.


Re: Portability overview webpage

2017-11-08 Thread Holden Karau
Awesome! Out of interest is there any discussion around common formats for
interchange going on?

On Tue, Nov 7, 2017 at 9:15 AM, Henning Rohde 
wrote:

> Thanks everyone! The page is now live at:
>
>https://beam.apache.org/contribute/portability/
>
> Henning
>
> On Thu, Nov 2, 2017 at 8:22 AM, Kenneth Knowles 
> wrote:
>
> > This is a superb high-level overview of the effort, understandable at a
> > glance. I think it is the first time someone has made it clear what we are
> > actually doing!
> >
> > Kenn
> >
> > On Wed, Nov 1, 2017 at 10:23 AM, Jean-Baptiste Onofré 
> > wrote:
> >
> > > Thanks for the update. I will take a look.
> > >
> > > Regards
> > > JB
> > >
> > > On Nov 1, 2017, 18:21, at 18:21, Henning Rohde wrote:
> > > >Hi everyone,
> > > >
> > > >Although portability is a large and involved effort, it seems it doesn't
> > > >have a high-level overview and plan written down anywhere. I added a
> > > >proposed page with a 10,000 ft view and links to the website under
> > > >'Contribute (technical references)'. There is a page for ongoing projects,
> > > >but portability is much more encompassing and seems to be more suited for
> > > >its own page.
> > > >
> > > >The PR is:
> > > >
> > > > https://github.com/apache/beam-site/pull/340
> > > >
> > > >I'm sending it out to the dev list for more visibility. Please let me know
> > > >if you have any comments or objections, or if there is a better place for
> > > >this content.
> > > >
> > > >Thanks,
> > > > Henning
> > >
> >
>



-- 
Twitter: https://twitter.com/holdenkarau


Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Holden Karau
Also, upgrading Spark 1 to 2 is generally easier than changing JVM
versions. For folks using YARN or the hosted environments it's pretty much
trivial since you can effectively have distinct Spark clusters for each job.

On Wed, Nov 8, 2017 at 9:19 PM, Holden Karau  wrote:

> I'm +1 on dropping Spark 1. There are a lot of exciting improvements in
> Spark 2, and trying to write efficient code that runs between Spark 1 and
> Spark 2 is super painful in the long term. It would be one thing if there
> were a lot of people available to work on the Spark runners, but it seems
> like our energy would be better spent focusing on the future.
>
> I don't know a lot of folks who are stuck on Spark 1, and the few that I
> know are planning to migrate in the next few months anyways.
>
> Note: this is a non-binding vote as I'm not a committer or PMC member.
>
> On Wed, Nov 8, 2017 at 3:43 AM, Ted Yu  wrote:
>
>> Having both Spark1 and Spark2 modules would benefit a wider user base.
>>
>> I would vote for that.
>>
>> Cheers
>>
>> On Wed, Nov 8, 2017 at 12:51 AM, Jean-Baptiste Onofré 
>> wrote:
>>
>> > Hi Robert,
>> >
>> > Thanks for your feedback !
>> >
>> > From a user perspective, with the current state of the PR, the same
>> > pipelines can run on both Spark 1.x and 2.x: the only difference is the
>> > set of dependencies.
>> >
>> > I'm calling the vote to get this kind of feedback: if we consider that
>> > Spark 1.x still needs to be supported, no problem, I will improve the PR
>> > to have three modules (common, spark1, spark2) and let users pick the
>> > desired version.
>> >
>> > Let's wait a bit for other feedback; I will update the PR accordingly.
>> >
>> > Regards
>> > JB
>> >
>> >
>> > On 11/08/2017 09:47 AM, Robert Bradshaw wrote:
>> >
>> >> I'm generally a -0.5 on this change, or at least doing so hastily.
>> >>
>> >> As with dropping Java 7 support, I think this should at least be
>> >> announced in release notes that we're considering dropping support in
>> >> the subsequent release, as this dev list likely does not reach a
>> >> substantial portion of the userbase.
>> >>
>> >> How much work is it to move from a Spark 1.x cluster to a Spark 2.x
>> >> cluster? I get the feeling it's not nearly as transparent as upgrading
>> >> Java versions. Can Spark 1.x pipelines be run on Spark 2.x clusters,
>> >> or is a new cluster (and/or upgrading all pipelines) required (e.g.
>> >> for those who operate spark clusters shared among their many users)?
>> >>
>> >> Looks like the latest release of Spark 1.x was about a year ago,
>> >> overlapping a bit with the 2.x series which is coming up on 1.5 years
>> >> old, so I could see a lot of people still using 1.x even if 2.x is
>> >> clearly the future. But it sure doesn't seem very backwards
>> >> compatible.
>> >>
>> >> Mostly I'm not comfortable with dropping 1.x in the same release as
>> >> adding support for 2.x, giving no transition period, but could be
>> >> convinced if this transition is mostly a no-op or no one's still using
>> >> 1.x. If there are non-trivial code complexity issues, I would perhaps
>> >> revisit the issue of having a single Spark Runner that chooses
>> >> the backend implicitly in favor of simply having two runners which
>> >> share the code that's easy to share and diverge otherwise (which seems
>> >> it would be much simpler both to implement and explain to users). I
>> >> would be OK with even letting the Spark 1.x runner be somewhat
>> >> stagnant (e.g. few or no new features) until we decide we can kill it
>> >> off.
>> >>
>> >> On Tue, Nov 7, 2017 at 11:27 PM, Jean-Baptiste Onofré wrote:
>> >>
>> >>> Hi all,
>> >>>
>> >>> as you might know, we are working on Spark 2.x support in the Spark
>> >>> runner.
>> >>>
>> >>> I'm working on a PR about that:
>> >>>
>> >>> https://github.com/apache/beam/pull/3808
>> >>>
>> >>> Today, we have something working with both Spark 1.x and 2.x from a code
>> >>> standpoint, but I have to deal with dependencies. It's the first step of
>> >>> the update as I'm still using RDD; the second step would be to support
>> >>> dataframe (but for that, I would need PCollection elements with schemas,
>> >>> which is another topic that Eugene, Reuven and I are discussing).
>> >>>
>> >>> However, as all major distributions now ship Spark 2.x, I don't think
>> >>> it's required anymore to support Spark 1.x.
>> >>>
>> >>> If we agree, I will update and clean up the PR to only support and focus
>> >>> on Spark 2.x.
>> >>>
>> >>> So, that's why I'm calling for a vote:
>> >>>
>> >>>[ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>> >>>[ ] 0 (I don't care ;))
>> >>>[ ] -1, I would like to still support Spark 1.x, and so have support
>> >>> for both Spark 1.x and 2.x (please provide specific comment)
>> >>>
>> >>> This vote is open for 48 hours (I have the commits ready, just 

Re: Jira Access

2017-11-08 Thread Paul Gerver
Oh, yes. When I registered pgerv12, my browser said the request timed out.
When I tried to register again, it said the ID already existed, so I created
pfgerver. Unfortunately, I didn't see a way to delete the pfgerver account.

Do you know if a note to the Jira admins could handle it?

On 2017-11-08 18:02, Lukasz Cwik  wrote:
> I have added you and saw the Jira that you commented on and assigned it to
> you.
>
> Curious note, I also saw a pfgerver which also seems to be you.
>
> On Wed, Nov 8, 2017 at 2:08 PM, Paul Gerver  wrote:
>
> > Hello,
> >
> > I'm part of the IBM Streams team and would like to contribute to the Apache
> > Beam community.
> > My ASF Jira ID is pgerv12.
> >
> > Thanks!
> >
> > --
> >
> > *Paul Gerver*
> >
>


Re: [VOTE] Release 2.2.0, release candidate #3

2017-11-08 Thread Valentyn Tymofieiev
I looked at the Python side of the Dataflow & Direct runners on Linux. There
are two findings:

1. One of the mobile gaming examples did not pass for the Dataflow runner
(addressed in https://github.com/apache/beam/pull/4102).

2. Python streaming did not work for the Dataflow runner. One PR is out,
https://github.com/apache/beam/pull/4106, but follow-up PRs may be required
as we continue to investigate. If we had a PostCommit test suite running
against a release branch, this could have been caught earlier. Filed
https://issues.apache.org/jira/browse/BEAM-3163.

On Wed, Nov 8, 2017 at 2:39 PM, Reuven Lax  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 2.2.0,
> as follows:
>   [ ] +1, Approve the release
>   [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
>   * JIRA release notes [1],
>   * the official Apache source release to be deployed to dist.apache.org
> [2],
> which is signed with the key with fingerprint B98B7708 [3],
>   * all artifacts to be deployed to the Maven Central Repository [4],
>   * source code tag "v2.2.0-RC3" [5],
>   * website pull request listing the release and publishing the API
> reference manual [6].
>   * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
> 1.8.0_144.
>   * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Reuven
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12341044
> [2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1023/
> [5] https://github.com/apache/beam/tree/v2.2.0-RC3
> 
> [6] https://github.com/apache/beam-site/pull/337
>


Re: Jira Access

2017-11-08 Thread Lukasz Cwik
I have added you and saw the Jira that you commented on and assigned it to
you.

Curious note, I also saw a pfgerver which also seems to be you.

On Wed, Nov 8, 2017 at 2:08 PM, Paul Gerver  wrote:

> Hello,
>
> I'm part of the IBM Streams team and would like to contribute to the Apache
> Beam community.
> My ASF Jira ID is pgerv12.
>
> Thanks!
>
> --
>
> *Paul Gerver*
>


Jira Access

2017-11-08 Thread Paul Gerver
Hello,

I'm part of the IBM Streams team and would like to contribute to the Apache
Beam community.
My ASF Jira ID is pgerv12.

Thanks!

-- 

*Paul Gerver*


[VOTE] Release 2.2.0, release candidate #3

2017-11-08 Thread Reuven Lax
Hi everyone,

Please review and vote on the release candidate #3 for the version 2.2.0,
as follows:
  [ ] +1, Approve the release
  [ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
  * JIRA release notes [1],
  * the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint B98B7708 [3],
  * all artifacts to be deployed to the Maven Central Repository [4],
  * source code tag "v2.2.0-RC3" [5],
  * website pull request listing the release and publishing the API
reference manual [6].
  * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
1.8.0_144.
  * Python artifacts are deployed along with the source release to the
dist.apache.org [2].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Reuven

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12341044
[2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1023/
[5] https://github.com/apache/beam/tree/v2.2.0-RC3

[6] https://github.com/apache/beam-site/pull/337
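
As a rough illustration of the adoption rule stated in the message above
(majority approval, with at least 3 PMC affirmative votes), the check could be
sketched as below. The function name `release_vote_passes` and the
`(value, is_binding)` tuple layout are hypothetical, and the sketch assumes
that only binding (PMC) votes count toward the majority, which the message
does not spell out.

```python
def release_vote_passes(votes, min_binding_plus_ones=3):
    """Return True if a release vote is adopted.

    votes: iterable of (value, is_binding) pairs, with value in {+1, 0, -1}.
    Assumption (not stated in the thread): only binding votes are counted.
    """
    binding = [value for value, is_binding in votes if is_binding]
    plus_ones = sum(1 for value in binding if value > 0)
    minus_ones = sum(1 for value in binding if value < 0)
    # Needs at least three binding +1s and more binding +1s than -1s.
    return plus_ones >= min_binding_plus_ones and plus_ones > minus_ones
```

Under these assumptions, non-binding +1 votes (such as the one Holden Karau
flags in the Spark thread) are recorded but do not change the outcome.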


Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Reuven Lax
Resending. The issues with the links were caused by gmail not updating them.

On Wed, Nov 8, 2017 at 2:31 PM, Reuven Lax  wrote:

> The thread is misnamed - I sent it out quickly before getting on a plane
> to Singapore. I'll resend it with RC2 fixed up.
>
> On Wed, Nov 8, 2017 at 2:17 PM, Valentyn Tymofieiev <
> valen...@google.com.invalid> wrote:
>
>> I think the thread is misnamed and should refer to RC #3, also there are
>> typos in the links, some of them are pointing to RC2 and some to RC3.
>>
>> Aside from that, one of the mobile gaming examples [1] is not working for me
>> on the Dataflow runner, but works on the Direct runner. There were changes to
>> the example recently [2]. David Cavazos is taking a look.
>>
>> [1] https://github.com/apache/beam/blob/v2.2.0-RC3/sdks/python/apache_beam/examples/complete/game/user_score.py
>> [2] https://github.com/apache/beam/commit/12c0fa68f463b52f21c666ef8cebc7235b79aedf#diff-f204362d66104bc997bbf22fd8719b77
>>
>> On Wed, Nov 8, 2017 at 12:53 PM, Ben Chambers
>> 
>> wrote:
>>
>> > This seems to be the second thread entitled "[VOTE] Release 2.2.0, release
>> > candidate #2". The subject and description refer to release candidate #2,
>> > however the artifacts mention v2.2.0-RC3. Which release candidate is this
>> > vote thread for?
>> >
>> > On Wed, Nov 8, 2017 at 12:52 PM Jean-Baptiste Onofré 
>> > wrote:
>> >
>> > > Agree.
>> > >
>> > > I would just like to know what changed exactly, as I didn't have any
>> > > issue when I did the 2.1.0 release.
>> > >
>> > > Regards
>> > > JB
>> > >
>> > > On Nov 8, 2017, 21:50, at 21:50, Kenneth Knowles
>> > > >
>> > > wrote:
>> > > >Agree with everything Robert said. So if we just rebuild the Python zip
>> > > >then this should g2g?
>> > > >
>> > > >On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
>> > > >rober...@google.com.invalid> wrote:
>> > > >
>> > > >> Let me try to clarify the state of the world (with regards to Python
>> > > >> and proto files).
>> > > >>
>> > > >> * When Python setup.py is run, it checks to see if the generated pb2
>> > > >> files exist. If not, it attempts to generate them by installing the
>> > > >> proto compiler and looking up the .proto definitions in its parent
>> > > >> directory. This works great for the developer that checked out the
>> > > >> full pristine sources from git (or otherwise obtained them).
>> > > >>
>> > > >> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
>> > > >> ship the generated pb2 files both because (1) we don't want to force
>> > > >> the user to install the proto compiler and (2) the "parent" directory
>> > > >> doesn't exist as we're just shipping the sdks/python/... portion of
>> > > >> the full git repository.
>> > > >>
>> > > >> * All previous "releases" in
>> > > >> https://dist.apache.org/repos/dist/release/beam/ post the Python
>> > > >> artifact (which is Python sources + generated proto files, but notably
>> > > >> no source proto files) in addition to the full source artifact (which
>> > > >> contains some snapshot of the full git repository, Python and proto
>> > > >> files included). We also separately publish Java artifacts offsite
>> > > >> which is what people will install from.
>> > > >>
>> > > >> So it seems the purpose of the -python.zip file is just to stage what
>> > > >> we intend to release on PyPi (e.g. for testing); it is not a source
>> > > >> distribution (that is taken care of by the adjacent -source.zip file)
>> > > >> and so there's no issue with it containing generated files. It should
>> > > >> be the output of "python setup.py sdist" (possibly invoked by the mvn
>> > > >> release commands, if you can get those to work). On the other hand,
>> > > >> creating a separate python-only source distribution would serve no
>> > > >> purpose, as it would be redundant with the existing
>> > > >> everything-source-distribution which is just a manually taken snapshot
>> > > >> of the entire git repository. The confusion is around the role of the
>> > > >> -python.zip file, and if we clarify that it's the proposed Python PyPi
>> > > >> artifact, and *not* some kind of python-only source distribution, the
>> > > >> release process is WAI.
>> > > >>
>> > > >> - Robert
>> > > >>
>> > > >>
>> > > >> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré
>> > > >
>> > > >> wrote:
>> > > >> > Let me take a look. Afair I didn't touch those files in the last
>> > > >> > release.
>> > > >> >
>> > > >> > I'll keep you posted.
>> > > >> >
>> > > >> > Regards
>> > > >> > JB
>> > > >> >
>> > > >> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax
>> > > >
>> > > >> wrote:
>> > > >> >>I explicitly removed the pb2 files as I thought we determined
>> they
>> > > >> >>shouldn't be in the source release, and they caused RAT failures.
>> > > >What
>> > > >> 
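
The setup.py behaviour Robert Bradshaw describes in the quoted message above
(use the generated pb2 files if they exist, otherwise generate them from the
.proto definitions found outside the Python tree) can be sketched roughly as
follows. This is a minimal illustration under stated assumptions, not Beam's
actual setup.py: `ensure_pb2_files`, `fake_compile`, `demo`, and the directory
layout are hypothetical names, and the real proto compiler invocation is
replaced by a stand-in callable.

```python
import glob
import os
import tempfile


def ensure_pb2_files(package_dir, proto_dir, compile_proto):
    """Return 'present' if pb2 files exist, else generate and return 'generated'.

    Hypothetical helper; compile_proto(proto_file, out_dir) stands in for
    the real proto compiler.
    """
    # Case 1: generated files are already shipped (e.g. inside an sdist).
    if glob.glob(os.path.join(package_dir, "*_pb2.py")):
        return "present"

    # Case 2: full git checkout, so the .proto sources are available.
    proto_files = sorted(glob.glob(os.path.join(proto_dir, "*.proto")))
    if not proto_files:
        raise RuntimeError(
            "no *_pb2.py files and no .proto sources in %s" % proto_dir)

    for proto_file in proto_files:
        compile_proto(proto_file, package_dir)
    return "generated"


def fake_compile(proto_file, out_dir):
    # Stand-in for the proto compiler: writes a stub module per .proto file.
    base = os.path.splitext(os.path.basename(proto_file))[0]
    with open(os.path.join(out_dir, base + "_pb2.py"), "w") as handle:
        handle.write("# generated stub\n")


def demo():
    """Exercise both cases in a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "pkg")
        proto_dir = os.path.join(root, "protos")
        os.makedirs(package_dir)
        os.makedirs(proto_dir)
        with open(os.path.join(proto_dir, "example.proto"), "w") as handle:
            handle.write('syntax = "proto3";\n')
        first = ensure_pb2_files(package_dir, proto_dir, fake_compile)
        second = ensure_pb2_files(package_dir, proto_dir, fake_compile)
        return first, second
```

The error branch corresponds to the sdist situation described in the message:
without the parent directory there is nothing to generate from, which is why
the generated files have to be shipped in the PyPi artifact.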

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Reuven Lax
The thread is misnamed - I sent it out quickly before getting on a plane to
Singapore. I'll resend it with RC2 fixed up.

On Wed, Nov 8, 2017 at 2:17 PM, Valentyn Tymofieiev <
valen...@google.com.invalid> wrote:

> I think the thread is misnamed and should refer to RC #3, also there are
> typos in the links, some of them are pointing to RC2 and some to RC3.
>
> Aside from that, one of mobile gaming examples [1] is not working for me on
> Dataflow runner, but works on Direct runner. There were changes to the
> example recently [2]. David Cavazos is taking a look.
>
> [1] https://github.com/apache/beam/blob/v2.2.0-RC3/sdks/python/apache_beam/examples/complete/game/user_score.py
> [2] https://github.com/apache/beam/commit/12c0fa68f463b52f21c666ef8cebc7235b79aedf#diff-f204362d66104bc997bbf22fd8719b77
>
> On Wed, Nov 8, 2017 at 12:53 PM, Ben Chambers  >
> wrote:
>
> > This seems to be the second thread entitled "[VOTE] Release 2.2.0,
> release
> > candidate #2". The subject and description refer to release candidate #2,
> > however the artifacts mention v2.2.0-RC3. Which release candidate is this
> > vote thread for?
> >
> > On Wed, Nov 8, 2017 at 12:52 PM Jean-Baptiste Onofré 
> > wrote:
> >
> > > Agree.
> > >
> > > I just would like what changed exactly as I didn't have any issue when
> I
> > > did the 2.1.0 release.
> > >
> > > Regards
> > > JB
> > >
> > > On Nov 8, 2017, 21:50, at 21:50, Kenneth Knowles
>  > >
> > > wrote:
> > > >Agree with everything Robert said. So if we just rebuild the Python
> zip
> > > >then this should g2g?
> > > >
> > > >On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
> > > >rober...@google.com.invalid> wrote:
> > > >
> > > >> Let me try to clarify the state of the world (with regards to Python
> > > >> and proto files).
> > > >>
> > > >> * When Python setup.py is run, it checks to see if the generated pb2
> > > >> files exist. If not, it attempts to generate them by installing the
> > > >> proto compiler and looking up the .proto definitions in its parent
> > > >> directory. This works great for the developer that checked out the
> > > >> full pristine sources from git (or otherwise obtained them).
> > > >>
> > > >> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
> > > >> ship the generated pb2 files both because (1) we don't want to force
> > > >> the user to install the proto compiler and (2) the "parent"
> directory
> > > >> doesn't exist as we're just shipping the sdks/python/... portion of
> > > >> the full git repository.
> > > >>
> > > >> * All previous "releases" in
> > > >> https://dist.apache.org/repos/dist/release/beam/ post the Python
> > > >> artifact (which is Python sources + generated proto files, but
> > > >notably
> > > >> no source proto files) in addition to the full source artifact
> (which
> > > >> contains some snapshot of the full git repository, Python and proto
> > > >> files included). We also separately publish Java artifacts offsite
> > > >> which is what people will install from.
> > > >>
> > > >> So it seems the purpose of the -python.zip file is just to stage
> what
> > > >> we intend to release on PyPi (e.g. for testing); it is not a source
> > > >> distribution (that is taken care of by the adjacent -source.zip
> file)
> > > >> and so there's no issue with it containing generated files. It
> should
> > > >> be the output of "python setup.py sdist" (possibly invoked by the
> mvn
> > > >> release commands, if you can get those to work). On the other hand,
> > > >> creating a separate python-only source distribution would serve no
> > > >> purpose, as it would be redundant with the existing
> > > >> everything-source-distribution which is just a manually taken
> > > >snapshot
> > > >> of the entire git repository. The confusion is around the role of
> the
> > > >> -python.zip file, and if we clarify that it's the proposed Python
> > > >PyPi
> > > >> artifact, and *not* some kind of python-only source distribution,
> the
> > > >> release process is WAI.
> > > >>
> > > >> - Robert
> > > >>
> > > >>
> > > >> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré
> > > >
> > > >> wrote:
> > > >> > Let me take a look. Afair I didn't touch those files in last
> > > >release.
> > > >> >
> > > >> > I keep you posted.
> > > >> >
> > > >> > Regards
> > > >> > JB
> > > >> >
> > > >> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax
> > > >
> > > >> wrote:
> > > >> >>I explicitly removed the pb2 files as I thought we determined they
> > > >> >>shouldn't be in the source release, and they caused RAT failures.
> > > >What
> > > >> >>should I be doing here?
> > > >> >>
> > > >> >>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
> > > >> >>rober...@google.com.invalid> wrote:
> > > >> >>
> > > >> >>> This is due to having removed the auto-generated pb2 files.
> > > >> >>>
> > > >> >>> On Wed, Nov 8, 2017 at 9:37 AM, 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Valentyn Tymofieiev
I think the thread is misnamed and should refer to RC #3. There are also
typos in the links: some of them point to RC2 and some to RC3.

Aside from that, one of the mobile gaming examples [1] is not working for me
on the Dataflow runner, but works on the Direct runner. There were changes to
the example recently [2]. David Cavazos is taking a look.

[1] https://github.com/apache/beam/blob/v2.2.0-RC3/sdks/python/apache_beam/examples/complete/game/user_score.py
[2] https://github.com/apache/beam/commit/12c0fa68f463b52f21c666ef8cebc7235b79aedf#diff-f204362d66104bc997bbf22fd8719b77

On Wed, Nov 8, 2017 at 12:53 PM, Ben Chambers 
wrote:

> This seems to be the second thread entitled "[VOTE] Release 2.2.0, release
> candidate #2". The subject and description refer to release candidate #2,
> however the artifacts mention v2.2.0-RC3. Which release candidate is this
> vote thread for?
>
> On Wed, Nov 8, 2017 at 12:52 PM Jean-Baptiste Onofré 
> wrote:
>
> > Agree.
> >
> > I just would like what changed exactly as I didn't have any issue when I
> > did the 2.1.0 release.
> >
> > Regards
> > JB
> >
> > On Nov 8, 2017, 21:50, at 21:50, Kenneth Knowles  >
> > wrote:
> > >Agree with everything Robert said. So if we just rebuild the Python zip
> > >then this should g2g?
> > >
> > >On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
> > >rober...@google.com.invalid> wrote:
> > >
> > >> Let me try to clarify the state of the world (with regards to Python
> > >> and proto files).
> > >>
> > >> * When Python setup.py is run, it checks to see if the generated pb2
> > >> files exist. If not, it attempts to generate them by installing the
> > >> proto compiler and looking up the .proto definitions in its parent
> > >> directory. This works great for the developer that checked out the
> > >> full pristine sources from git (or otherwise obtained them).
> > >>
> > >> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
> > >> ship the generated pb2 files both because (1) we don't want to force
> > >> the user to install the proto compiler and (2) the "parent" directory
> > >> doesn't exist as we're just shipping the sdks/python/... portion of
> > >> the full git repository.
> > >>
> > >> * All previous "releases" in
> > >> https://dist.apache.org/repos/dist/release/beam/ post the Python
> > >> artifact (which is Python sources + generated proto files, but
> > >notably
> > >> no source proto files) in addition to the full source artifact (which
> > >> contains some snapshot of the full git repository, Python and proto
> > >> files included). We also separately publish Java artifacts offsite
> > >> which is what people will install from.
> > >>
> > >> So it seems the purpose of the -python.zip file is just to stage what
> > >> we intend to release on PyPi (e.g. for testing); it is not a source
> > >> distribution (that is taken care of by the adjacent -source.zip file)
> > >> and so there's no issue with it containing generated files. It should
> > >> be the output of "python setup.py sdist" (possibly invoked by the mvn
> > >> release commands, if you can get those to work). On the other hand,
> > >> creating a separate python-only source distribution would serve no
> > >> purpose, as it would be redundant with the existing
> > >> everything-source-distribution which is just a manually taken
> > >snapshot
> > >> of the entire git repository. The confusion is around the role of the
> > >> -python.zip file, and if we clarify that it's the proposed Python
> > >PyPi
> > >> artifact, and *not* some kind of python-only source distribution, the
> > >> release process is WAI.
> > >>
> > >> - Robert
> > >>
> > >>
> > >> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré
> > >
> > >> wrote:
> > >> > Let me take a look. Afair I didn't touch those files in last
> > >release.
> > >> >
> > >> > I keep you posted.
> > >> >
> > >> > Regards
> > >> > JB
> > >> >
> > >> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax
> > >
> > >> wrote:
> > >> >>I explicitly removed the pb2 files as I thought we determined they
> > >> >>shouldn't be in the source release, and they caused RAT failures.
> > >What
> > >> >>should I be doing here?
> > >> >>
> > >> >>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
> > >> >>rober...@google.com.invalid> wrote:
> > >> >>
> > >> >>> This is due to having removed the auto-generated pb2 files.
> > >> >>>
> > >> >>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
> > >> >>>  wrote:
> > >> >>> > Confirming Ismaël's finding - I also see this error and it did
> > >not
> > >> >>see it
> > >> >>> > on a candidate that was in the staging area yesterday.
> > >> >>> >
> > >> >>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía
> > >
> > >> >>wrote:
> > >> >>> >
> > >> >>> >> I tested the python version of the release I just created a
> > >new
> > >> >>> >> 

Re: [DISCUSS] Dealing with differing Scala versions in Runner dependencies

2017-11-08 Thread Lukasz Cwik
An alternative would be to contribute to the Gradle-based build and
configure a test set with the specific dependencies/configurations that are
needed on a per-runner basis, without needing to activate a mixed bag of
profiles that leads to dependency conflicts between runners.

On Tue, Nov 7, 2017 at 9:16 AM, Kenneth Knowles 
wrote:

> Is there a JIRA for this issue? I've just cut
> https://github.com/apache/beam/pull/4093 since I don't think we actually
> need them in some of those.
>
> I believe the precommit is the only place where we actually need both, to
> run the examples' integration tests with each runner (albeit currently in
> local mode for most runners).
>
> The only solutions I can think of are two executions/jobs or separate
> modules that set things up explicitly. I believe the latter is probably
> more robust to changes in the build, like so:
>
> runners/flink/examples-integration-tests
> -> runners/flink
> -> examples/java
>
> Now this module should be able to be pretty flexible to do what it needs to
> do. Our dependency graph won't be intuitive from the directory structure,
> so maybe it should have a different home (maybe all ITs by their nature
> should live alongside the other modules).
>
> On Tue, Nov 7, 2017 at 7:38 AM, Aljoscha Krettek 
> wrote:
>
> > I'd like to do it yes, but I've been swamped lately with Flink work.
> >
> > Also, the situation with Jenkins/Maven (especially the Pipeline Job
> > changes) makes it somewhat unclear how I should proceed because I don't
> > want to add expensive pre-commit hooks. The number of profiles that would
> > need splitting is also quite high:
> > ~/D/i/.test-infra (master|✔) $ ag flink-runner
> > jenkins/job_beam_PreCommit_Java_MavenInstall.groovy
> > 47:'--activate-profiles release,jenkins-precommit,
> > direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner',
> >
> > jenkins/job_beam_PreCommit_Python_MavenInstall.groovy
> > 47:--activate-profiles release,jenkins-precommit,
> > direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
> >
> > jenkins/job_beam_Java_UnitTest.groovy
> > 35:'flink-runner',
> >
> > jenkins/job_beam_PreCommit_Go_MavenInstall.groovy
> > 47:--activate-profiles release,jenkins-precommit,
> > direct-runner,dataflow-runner,spark-runner,flink-runner,apex-runner \
> >
> > jenkins/job_beam_Java_Build.groovy
> > 51:'flink-runner',
> >
> > jenkins/job_beam_Java_IntegrationTest.groovy
> > 36:'flink-runner',
> > 48:'flink-runner-integration-tests',
> >
> > These are all profiles where we build with several runner profiles active
> > at the same time.
> >
> > Unfortunately, this is becoming somewhat pressing now since Flink 1.4
> > will drop support for Scala 2.10, meaning we have to update the Flink
> > Runner to 2.11 deps.
> >
> > > On 27. Oct 2017, at 09:52, Kenneth Knowles 
> > wrote:
> > >
> > > You are entirely correct in how you would pull this off - groovy files
> > and
> > > tweaking the profiles. Seeding is done daily, or also by commenting
> "Run
> > > Seed Job" on a pull request. One thing to consider, in light of recent
> > > conversations, is making new jobs that are by post-commit and by
> request
> > > only, or multi-step in order to avoid running lots of extra tests, etc.
> > >
> > > Do you think you might have time to work on this goal of splitting
> apart
> > > jobs that require splitting?
> > >
> > >
> > > On Wed, Oct 11, 2017 at 2:08 AM, Aljoscha Krettek  >
> > > wrote:
> > >
> > >> I also like option 2 (allowing differing dependencies for runners)
> > better.
> > >> With the current situation this would mean splitting
> > >> PreCommit_Java_MavenInstall (and possibly also
> > >> PreCommit_Python_MavenInstall and PreCommit_Go_MavenInstall) into
> > separate
> > >> jobs. For my goals, splitting into one job for
> > >> "direct-runner,dataflow-runner,spark-runner,apex-runner" and one for
> > >> "flink-runner" would be enough, so we should probably go with that
> > >> until we have the "custom make" solution.
> > >>
> > >> What do you think?
> > >>
> > >> @Jason For pulling this off I would copy the groovy files in
> .test-infra
> > >> and change the --activate-profiles line, right? Are there still manual
> > >> steps required for "re-seeding" the jobs?
> > >>
> > >>
> > >>
> > >>> On 9. Oct 2017, at 18:06, Kenneth Knowles 
> > >> wrote:
> > >>>
> > >>> +1 to the goal, and replying inline on details.
> > >>>
> > >>> On Mon, Oct 9, 2017 at 8:06 AM, Aljoscha Krettek <
> aljos...@apache.org>
> > >>> wrote:
> > >>>
> > 
> >  - We want to update the Flink dependencies to _2.11 dependencies
> > because
> >  2.10 is quite outdated
> > >>>
> > >>> - This doesn't work well because some modules (examples, for example)
> >  depend on all Runners and at least the Spark Runner has _2.10
> > >> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Ben Chambers
This seems to be the second thread entitled "[VOTE] Release 2.2.0, release
candidate #2". The subject and description refer to release candidate #2,
but the artifacts mention v2.2.0-RC3. Which release candidate is this
vote thread for?

On Wed, Nov 8, 2017 at 12:52 PM Jean-Baptiste Onofré 
wrote:

> Agree.
>
> I just would like what changed exactly as I didn't have any issue when I
> did the 2.1.0 release.
>
> Regards
> JB
>
> On Nov 8, 2017, 21:50, at 21:50, Kenneth Knowles 
> wrote:
> >Agree with everything Robert said. So if we just rebuild the Python zip
> >then this should g2g?
> >
> >On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
> >rober...@google.com.invalid> wrote:
> >
> >> Let me try to clarify the state of the world (with regards to Python
> >> and proto files).
> >>
> >> * When Python setup.py is run, it checks to see if the generated pb2
> >> files exist. If not, it attempts to generate them by installing the
> >> proto compiler and looking up the .proto definitions in its parent
> >> directory. This works great for the developer that checked out the
> >> full pristine sources from git (or otherwise obtained them).
> >>
> >> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
> >> ship the generated pb2 files both because (1) we don't want to force
> >> the user to install the proto compiler and (2) the "parent" directory
> >> doesn't exist as we're just shipping the sdks/python/... portion of
> >> the full git repository.
> >>
> >> * All previous "releases" in
> >> https://dist.apache.org/repos/dist/release/beam/ post the Python
> >> artifact (which is Python sources + generated proto files, but
> >notably
> >> no source proto files) in addition to the full source artifact (which
> >> contains some snapshot of the full git repository, Python and proto
> >> files included). We also separately publish Java artifacts offsite
> >> which is what people will install from.
> >>
> >> So it seems the purpose of the -python.zip file is just to stage what
> >> we intend to release on PyPi (e.g. for testing); it is not a source
> >> distribution (that is taken care of by the adjacent -source.zip file)
> >> and so there's no issue with it containing generated files. It should
> >> be the output of "python setup.py sdist" (possibly invoked by the mvn
> >> release commands, if you can get those to work). On the other hand,
> >> creating a separate python-only source distribution would serve no
> >> purpose, as it would be redundant with the existing
> >> everything-source-distribution which is just a manually taken
> >snapshot
> >> of the entire git repository. The confusion is around the role of the
> >> -python.zip file, and if we clarify that it's the proposed Python
> >PyPi
> >> artifact, and *not* some kind of python-only source distribution, the
> >> release process is WAI.
> >>
> >> - Robert
> >>
> >>
> >> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré
> >
> >> wrote:
> >> > Let me take a look. Afair I didn't touch those files in last
> >release.
> >> >
> >> > I keep you posted.
> >> >
> >> > Regards
> >> > JB
> >> >
> >> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax
> >
> >> wrote:
> >> >>I explicitly removed the pb2 files as I thought we determined they
> >> >>shouldn't be in the source release, and they caused RAT failures.
> >What
> >> >>should I be doing here?
> >> >>
> >> >>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
> >> >>rober...@google.com.invalid> wrote:
> >> >>
> >> >>> This is due to having removed the auto-generated pb2 files.
> >> >>>
> >> >>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
> >> >>>  wrote:
> >> >>> > Confirming Ismaël's finding - I also see this error and it did
> >not
> >> >>see it
> >> >>> > on a candidate that was in the staging area yesterday.
> >> >>> >
> >> >>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía
> >
> >> >>wrote:
> >> >>> >
> >> >>> >> I tested the python version of the release I just created a
> >new
> >> >>> >> virtualenv and run
> >> >>> >>
> >> >>> >> python setup.py install and it gave me this message:
> >> >>> >>
> >> >>> >> Traceback (most recent call last):
> >> >>> >>   File "setup.py", line 203, in 
> >> >>> >> 'test': generate_protos_first(test),
> >> >>> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in
> >setup
> >> >>> >> dist.run_commands()
> >> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
> >> >>> run_commands
> >> >>> >> self.run_command(cmd)
> >> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
> >> >>run_command
> >> >>> >> cmd_obj.run()
> >> >>> >>   File
> >> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> >>> >> te-packages/setuptools/command/install.py",
> >> >>> >> line 67, in run
> >> >>> >> self.do_egg_install()
> >> >>> >>   File
> >> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Jean-Baptiste Onofré
Agree.

I would just like to know what exactly changed, as I didn't have any issue
when I did the 2.1.0 release.

Regards
JB

On Nov 8, 2017, 21:50, at 21:50, Kenneth Knowles  
wrote:
>Agree with everything Robert said. So if we just rebuild the Python zip
>then this should g2g?
>
>On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
>rober...@google.com.invalid> wrote:
>
>> Let me try to clarify the state of the world (with regards to Python
>> and proto files).
>>
>> * When Python setup.py is run, it checks to see if the generated pb2
>> files exist. If not, it attempts to generate them by installing the
>> proto compiler and looking up the .proto definitions in its parent
>> directory. This works great for the developer that checked out the
>> full pristine sources from git (or otherwise obtained them).
>>
>> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
>> ship the generated pb2 files both because (1) we don't want to force
>> the user to install the proto compiler and (2) the "parent" directory
>> doesn't exist as we're just shipping the sdks/python/... portion of
>> the full git repository.
>>
>> * All previous "releases" in
>> https://dist.apache.org/repos/dist/release/beam/ post the Python
>> artifact (which is Python sources + generated proto files, but
>notably
>> no source proto files) in addition to the full source artifact (which
>> contains some snapshot of the full git repository, Python and proto
>> files included). We also separately publish Java artifacts offsite
>> which is what people will install from.
>>
>> So it seems the purpose of the -python.zip file is just to stage what
>> we intend to release on PyPi (e.g. for testing); it is not a source
>> distribution (that is taken care of by the adjacent -source.zip file)
>> and so there's no issue with it containing generated files. It should
>> be the output of "python setup.py sdist" (possibly invoked by the mvn
>> release commands, if you can get those to work). On the other hand,
>> creating a separate python-only source distribution would serve no
>> purpose, as it would be redundant with the existing
>> everything-source-distribution which is just a manually taken
>snapshot
>> of the entire git repository. The confusion is around the role of the
>> -python.zip file, and if we clarify that it's the proposed Python
>PyPi
>> artifact, and *not* some kind of python-only source distribution, the
>> release process is WAI.
>>
>> - Robert
>>
>>
>> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré
>
>> wrote:
>> > Let me take a look. Afair I didn't touch those files in last
>release.
>> >
>> > I keep you posted.
>> >
>> > Regards
>> > JB
>> >
>> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax
>
>> wrote:
>> >>I explicitly removed the pb2 files as I thought we determined they
>> >>shouldn't be in the source release, and they caused RAT failures.
>What
>> >>should I be doing here?
>> >>
>> >>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
>> >>rober...@google.com.invalid> wrote:
>> >>
>> >>> This is due to having removed the auto-generated pb2 files.
>> >>>
>> >>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
>> >>>  wrote:
>> >>> > Confirming Ismaël's finding - I also see this error and it did
>not
>> >>see it
>> >>> > on a candidate that was in the staging area yesterday.
>> >>> >
>> >>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía
>
>> >>wrote:
>> >>> >
>> >>> >> I tested the python version of the release I just created a
>new
>> >>> >> virtualenv and run
>> >>> >>
>> >>> >> python setup.py install and it gave me this message:
>> >>> >>
>> >>> >> Traceback (most recent call last):
>> >>> >>   File "setup.py", line 203, in 
>> >>> >> 'test': generate_protos_first(test),
>> >>> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in
>setup
>> >>> >> dist.run_commands()
>> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
>> >>> run_commands
>> >>> >> self.run_command(cmd)
>> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>> >>run_command
>> >>> >> cmd_obj.run()
>> >>> >>   File
>> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >>> >> te-packages/setuptools/command/install.py",
>> >>> >> line 67, in run
>> >>> >> self.do_egg_install()
>> >>> >>   File
>> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >>> >> te-packages/setuptools/command/install.py",
>> >>> >> line 109, in do_egg_install
>> >>> >> self.run_command('bdist_egg')
>> >>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>> >>run_command
>> >>> >> self.distribution.run_command(command)
>> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>> >>run_command
>> >>> >> cmd_obj.run()
>> >>> >>   File
>> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >>> >> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Kenneth Knowles
Agree with everything Robert said. So if we just rebuild the Python zip
then this should g2g?

On Wed, Nov 8, 2017 at 12:37 PM, Robert Bradshaw <
rober...@google.com.invalid> wrote:

> Let me try to clarify the state of the world (with regards to Python
> and proto files).
>
> * When Python setup.py is run, it checks to see if the generated pb2
> files exist. If not, it attempts to generate them by installing the
> proto compiler and looking up the .proto definitions in its parent
> directory. This works great for the developer that checked out the
> full pristine sources from git (or otherwise obtained them).
>
> * For the sdist tarball uploaded to PyPi (aka Python Artifact), we
> ship the generated pb2 files both because (1) we don't want to force
> the user to install the proto compiler and (2) the "parent" directory
> doesn't exist as we're just shipping the sdks/python/... portion of
> the full git repository.
>
> * All previous "releases" in
> https://dist.apache.org/repos/dist/release/beam/ post the Python
> artifact (which is Python sources + generated proto files, but notably
> no source proto files) in addition to the full source artifact (which
> contains some snapshot of the full git repository, Python and proto
> files included). We also separately publish Java artifacts offsite
> which is what people will install from.
>
> So it seems the purpose of the -python.zip file is just to stage what
> we intend to release on PyPi (e.g. for testing); it is not a source
> distribution (that is taken care of by the adjacent -source.zip file)
> and so there's no issue with it containing generated files. It should
> be the output of "python setup.py sdist" (possibly invoked by the mvn
> release commands, if you can get those to work). On the other hand,
> creating a separate python-only source distribution would serve no
> purpose, as it would be redundant with the existing
> everything-source-distribution which is just a manually taken snapshot
> of the entire git repository. The confusion is around the role of the
> -python.zip file, and if we clarify that it's the proposed Python PyPi
> artifact, and *not* some kind of python-only source distribution, the
> release process is WAI.
>
> - Robert
>
>
> On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré 
> wrote:
> > Let me take a look. Afair I didn't touch those files in last release.
> >
> > I keep you posted.
> >
> > Regards
> > JB
> >
> > On Nov 8, 2017, 20:50, at 20:50, Reuven Lax 
> wrote:
> >>I explicitly removed the pb2 files as I thought we determined they
> >>shouldn't be in the source release, and they caused RAT failures. What
> >>should I be doing here?
> >>
> >>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
> >>rober...@google.com.invalid> wrote:
> >>
> >>> This is due to having removed the auto-generated pb2 files.
> >>>
> >>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
> >>>  wrote:
> >>> > Confirming Ismaël's finding - I also see this error, and I did not
> >>> > see it on a candidate that was in the staging area yesterday.
> >>> >
> >>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía 
> >>wrote:
> >>> >
> >>> >> I tested the python version of the release I just created a new
> >>> >> virtualenv and run
> >>> >>
> >>> >> python setup.py install and it gave me this message:
> >>> >>
> >>> >> Traceback (most recent call last):
> >>> >>   File "setup.py", line 203, in 
> >>> >> 'test': generate_protos_first(test),
> >>> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> >>> >> dist.run_commands()
> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
> >>> run_commands
> >>> >> self.run_command(cmd)
> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
> >>run_command
> >>> >> cmd_obj.run()
> >>> >>   File
> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >>> >> te-packages/setuptools/command/install.py",
> >>> >> line 67, in run
> >>> >> self.do_egg_install()
> >>> >>   File
> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >>> >> te-packages/setuptools/command/install.py",
> >>> >> line 109, in do_egg_install
> >>> >> self.run_command('bdist_egg')
> >>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
> >>run_command
> >>> >> self.distribution.run_command(command)
> >>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
> >>run_command
> >>> >> cmd_obj.run()
> >>> >>   File
> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >>> >> te-packages/setuptools/command/bdist_egg.py",
> >>> >> line 169, in run
> >>> >> cmd = self.call_command('install_lib', warn_dir=0)
> >>> >>   File
> >>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >>> >> te-packages/setuptools/command/bdist_egg.py",
> >>> >> line 155, in call_command
> >>> >> self.run_command(cmdname)
> >>> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Robert Bradshaw
Let me try to clarify the state of the world (with regard to Python
and proto files).

* When Python setup.py is run, it checks to see if the generated pb2
files exist. If not, it attempts to generate them by installing the
proto compiler and looking up the .proto definitions in its parent
directory. This works great for the developer that checked out the
full pristine sources from git (or otherwise obtained them).

* For the sdist tarball uploaded to PyPI (aka Python Artifact), we
ship the generated pb2 files both because (1) we don't want to force
the user to install the proto compiler and (2) the "parent" directory
doesn't exist as we're just shipping the sdks/python/... portion of
the full git repository.

* All previous "releases" in
https://dist.apache.org/repos/dist/release/beam/ post the Python
artifact (which is Python sources + generated proto files, but notably
no source proto files) in addition to the full source artifact (which
contains some snapshot of the full git repository, Python and proto
files included). We also separately publish Java artifacts offsite
which is what people will install from.
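
As a rough illustration of the first two bullets, here is a minimal sketch
of the "use the generated files if present, otherwise generate them"
decision. It is not Beam's actual setup.py: the function name
ensure_generated_protos, its parameters, the flat directory layout, and the
error it raises are all assumptions made for this example, and the real
proto compiler call is replaced by a caller-supplied function.

```python
import glob
import os


def ensure_generated_protos(out_dir, proto_dir, generate):
    """Hypothetical helper: reuse shipped *_pb2.py modules or generate them.

    out_dir   -- directory where the generated *_pb2.py modules should live
    proto_dir -- directory holding the .proto sources (the "parent" directory)
    generate  -- callable invoked once per .proto path; stands in for the
                 real proto compiler, which this sketch does not run
    """
    # sdist case: the generated modules were shipped, nothing to do.
    if glob.glob(os.path.join(out_dir, '*_pb2.py')):
        return 'present'

    # git checkout case: the .proto sources are available next to the SDK.
    protos = sorted(glob.glob(os.path.join(proto_dir, '*.proto')))
    if not protos:
        # Neither the generated modules nor their sources exist. This is the
        # situation an archive ends up in when the pb2 files are stripped
        # and the parent directory is not shipped.
        raise RuntimeError(
            'no *_pb2.py in %s and no .proto files in %s' % (out_dir, proto_dir))

    for path in protos:
        generate(path)
    return 'generated'
```

With this shape, an archive that ships neither the generated modules nor
the .proto sources fails at install time, which is why the staged Python
artifact needs to include the generated pb2 files.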

So it seems the purpose of the -python.zip file is just to stage what
we intend to release on PyPI (e.g. for testing). It is not a source
distribution (that is taken care of by the adjacent -source.zip file),
so there's no issue with it containing generated files. It should
be the output of "python setup.py sdist" (possibly invoked by the mvn
release commands, if you can get those to work). On the other hand,
creating a separate python-only source distribution would serve no
purpose, as it would be redundant with the existing
everything-source-distribution, which is just a manually taken snapshot
of the entire git repository. The confusion is around the role of the
-python.zip file; if we clarify that it's the proposed Python PyPI
artifact, and *not* some kind of python-only source distribution, the
release process is working as intended (WAI).

- Robert


On Wed, Nov 8, 2017 at 11:55 AM, Jean-Baptiste Onofré  wrote:
> Let me take a look. Afair I didn't touch those files in last release.
>
> I keep you posted.
>
> Regards
> JB
>
> On Nov 8, 2017, 20:50, at 20:50, Reuven Lax  wrote:
>>I explicitly removed the pb2 files as I thought we determined they
>>shouldn't be in the source release, and they caused RAT failures. What
>>should I be doing here?
>>
>>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
>>rober...@google.com.invalid> wrote:
>>
>>> This is due to having removed the auto-generated pb2 files.
>>>
>>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
>>>  wrote:
>>> > Confirming Ismaël's finding - I also see this error, and I did not
>>> > see it on a candidate that was in the staging area yesterday.
>>> >
>>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía 
>>wrote:
>>> >
>>> >> I tested the Python version of the release: I just created a new
>>> >> virtualenv and ran
>>> >>
>>> >> python setup.py install and it gave me this message:
>>> >>
>>> >> Traceback (most recent call last):
>>> >>   File "setup.py", line 203, in 
>>> >> 'test': generate_protos_first(test),
>>> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
>>> >> dist.run_commands()
>>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
>>> run_commands
>>> >> self.run_command(cmd)
>>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>>run_command
>>> >> cmd_obj.run()
>>> >>   File
>>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>>> >> te-packages/setuptools/command/install.py",
>>> >> line 67, in run
>>> >> self.do_egg_install()
>>> >>   File
>>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>>> >> te-packages/setuptools/command/install.py",
>>> >> line 109, in do_egg_install
>>> >> self.run_command('bdist_egg')
>>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>>run_command
>>> >> self.distribution.run_command(command)
>>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>>run_command
>>> >> cmd_obj.run()
>>> >>   File
>>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>>> >> te-packages/setuptools/command/bdist_egg.py",
>>> >> line 169, in run
>>> >> cmd = self.call_command('install_lib', warn_dir=0)
>>> >>   File
>>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>>> >> te-packages/setuptools/command/bdist_egg.py",
>>> >> line 155, in call_command
>>> >> self.run_command(cmdname)
>>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>>run_command
>>> >> self.distribution.run_command(command)
>>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>>run_command
>>> >> cmd_obj.run()
>>> >>   File
>>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>>> >> te-packages/setuptools/command/install_lib.py",
>>> >> line 11, in run
>>> >> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Jean-Baptiste Onofré
Let me take a look. AFAIR, I didn't touch those files in the last release.

I'll keep you posted.

Regards
JB

On Nov 8, 2017, 20:50, at 20:50, Reuven Lax  wrote:
>I explicitly removed the pb2 files as I thought we determined they
>shouldn't be in the source release, and they caused RAT failures. What
>should I be doing here?
>
>On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
>rober...@google.com.invalid> wrote:
>
>> This is due to having removed the auto-generated pb2 files.
>>
>> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
>>  wrote:
>> > Confirming Ismaël's finding - I also see this error and it did not
>see it
>> > on a candidate that was in the staging area yesterday.
>> >
>> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía 
>wrote:
>> >
>> >> I tested the python version of the release I just created a new
>> >> virtualenv and run
>> >>
>> >> python setup.py install and it gave me this message:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "setup.py", line 203, in 
>> >> 'test': generate_protos_first(test),
>> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
>> >> dist.run_commands()
>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
>> run_commands
>> >> self.run_command(cmd)
>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>run_command
>> >> cmd_obj.run()
>> >>   File
>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >> te-packages/setuptools/command/install.py",
>> >> line 67, in run
>> >> self.do_egg_install()
>> >>   File
>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >> te-packages/setuptools/command/install.py",
>> >> line 109, in do_egg_install
>> >> self.run_command('bdist_egg')
>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>run_command
>> >> self.distribution.run_command(command)
>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>run_command
>> >> cmd_obj.run()
>> >>   File
>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >> te-packages/setuptools/command/bdist_egg.py",
>> >> line 169, in run
>> >> cmd = self.call_command('install_lib', warn_dir=0)
>> >>   File
>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >> te-packages/setuptools/command/bdist_egg.py",
>> >> line 155, in call_command
>> >> self.run_command(cmdname)
>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>run_command
>> >> self.distribution.run_command(command)
>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>run_command
>> >> cmd_obj.run()
>> >>   File
>"/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
>> >> te-packages/setuptools/command/install_lib.py",
>> >> line 11, in run
>> >> self.build()
>> >>   File "/usr/lib/python2.7/distutils/command/install_lib.py", line
>109,
>> >> in build
>> >> self.run_command('build_py')
>> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in
>run_command
>> >> self.distribution.run_command(command)
>> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in
>run_command
>> >> cmd_obj.run()
>> >>   File "setup.py", line 143, in run
>> >> gen_protos.generate_proto_files()
>> >>   File
>"/home/ismael/releases/votes/beam/apache-beam-2.2.0-python/g
>> >> en_protos.py",
>> >> line 66, in generate_proto_files
>> >> 'Not in apache git tree; unable to find proto definitions.')
>> >> RuntimeError: Not in apache git tree; unable to find proto
>definitions.
>> >>
>> >> Not sure if this is something in my environment, but this passed
>when
>> >> I validated the previous release (2.1.0).
>> >>
>> >>
>> >> On Wed, Nov 8, 2017 at 11:30 AM, Reuven Lax
>
>> >> wrote:
>> >> > Hi everyone,
>> >> >
>> >> > Please review and vote on the release candidate #2 for the
>version
>> 2.2.0,
>> >> > as follows:
>> >> >   [ ] +1, Approve the release
>> >> >   [ ] -1, Do not approve the release (please provide specific
>> comments)
>> >> >
>> >> >
>> >> > The complete staging area is available for your review, which
>> includes:
>> >> >   * JIRA release notes [1],
>> >> >   * the official Apache source release to be deployed to
>> dist.apache.org
>> >> [2],
>> >> > which is signed with the key with fingerprint B98B7708 [3],
>> >> >   * all artifacts to be deployed to the Maven Central Repository
>[4],
>> >> >   * source code tag "v2.2.0-RC3" [5],
>> >> >   * website pull request listing the release and publishing the
>API
>> >> > reference manual [6].
>> >> >   * Java artifacts were built with Maven 3.5.0 and
>OpenJDK/Oracle JDK
>> >> > 1.8.0_144.
>> >> >   * Python artifacts are deployed along with the source release
>to the
>> >> > dist.apache.org [2].
>> >> >
>> >> > The vote will be open for at least 72 hours. It is adopted by
>majority
>> >> > approval, with at least 3 PMC affirmative votes.
>> >> >
>> >> > Thanks,
>> >> > 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Reuven Lax
I explicitly removed the pb2 files as I thought we determined they
shouldn't be in the source release, and they caused RAT failures. What
should I be doing here?

On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
rober...@google.com.invalid> wrote:

> This is due to having removed the auto-generated pb2 files.
>
> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
>  wrote:
> > Confirming Ismaël's finding - I also see this error and it did not see it
> > on a candidate that was in the staging area yesterday.
> >
> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía  wrote:
> >
> >> I tested the python version of the release I just created a new
> >> virtualenv and run
> >>
> >> python setup.py install and it gave me this message:
> >>
> >> Traceback (most recent call last):
> >>   File "setup.py", line 203, in 
> >> 'test': generate_protos_first(test),
> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> >> dist.run_commands()
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
> run_commands
> >> self.run_command(cmd)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install.py",
> >> line 67, in run
> >> self.do_egg_install()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install.py",
> >> line 109, in do_egg_install
> >> self.run_command('bdist_egg')
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/bdist_egg.py",
> >> line 169, in run
> >> cmd = self.call_command('install_lib', warn_dir=0)
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/bdist_egg.py",
> >> line 155, in call_command
> >> self.run_command(cmdname)
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install_lib.py",
> >> line 11, in run
> >> self.build()
> >>   File "/usr/lib/python2.7/distutils/command/install_lib.py", line 109,
> >> in build
> >> self.run_command('build_py')
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "setup.py", line 143, in run
> >> gen_protos.generate_proto_files()
> >>   File "/home/ismael/releases/votes/beam/apache-beam-2.2.0-python/g
> >> en_protos.py",
> >> line 66, in generate_proto_files
> >> 'Not in apache git tree; unable to find proto definitions.')
> >> RuntimeError: Not in apache git tree; unable to find proto definitions.
> >>
> >> Not sure if this is something in my environment, but this passed when
> >> I validated the previous release (2.1.0).
> >>
> >>
> >> On Wed, Nov 8, 2017 at 11:30 AM, Reuven Lax 
> >> wrote:
> >> > Hi everyone,
> >> >
> >> > Please review and vote on the release candidate #2 for the version
> 2.2.0,
> >> > as follows:
> >> >   [ ] +1, Approve the release
> >> >   [ ] -1, Do not approve the release (please provide specific
> comments)
> >> >
> >> >
> >> > The complete staging area is available for your review, which
> includes:
> >> >   * JIRA release notes [1],
> >> >   * the official Apache source release to be deployed to
> dist.apache.org
> >> [2],
> >> > which is signed with the key with fingerprint B98B7708 [3],
> >> >   * all artifacts to be deployed to the Maven Central Repository [4],
> >> >   * source code tag "v2.2.0-RC3" [5],
> >> >   * website pull request listing the release and publishing the API
> >> > reference manual [6].
> >> >   * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
> >> > 1.8.0_144.
> >> >   * Python artifacts are deployed along with the source release to the
> >> > dist.apache.org [2].
> >> >
> >> > The vote will be open for at least 72 hours. It is adopted by majority
> >> > approval, with at least 3 PMC affirmative votes.
> >> >
> >> > Thanks,
> >> > Reuven
> >> >
> >> > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> >> > projectId=12319527=12341044
> >> > [2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
> >> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> >> > [4] https://repository.apache.org/content/repositories/orgapache
> >> 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Kenneth Knowles
I have a couple of thoughts. It seems like our release processes for Java and
Python are asymmetric in undesirable ways.

We should have the following:

 - An official source release per ASF releasing policy that includes all
sources from Python, Java, Go, Proto. This should pass RAT. This seems the
proper form of [2]. Is there a reason we cannot do this?

 - A staged release candidate artifact for each end-user distribution
mechanism. No RAT here.
    - For Java this is maven central aka [4].
    - We don't currently stage a built candidate for Python, aka the output
of `python setup.py sdist` destined for pypi.

From my understanding of where this RC is at, the -python.zip is neither a
valid built artifact (it needs the generated pb2 files) nor a valid source
dist (it doesn't have the proto files needed to build).

We can manually tweak stuff this time around if it is easier and fix up our
build & poms & documentation to build the right thing next time.
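
To make the "neither a built artifact nor a source dist" distinction concrete, here is a minimal sketch of a check one could run against a staged zip. The function name, the labels, and the rule itself (generated `*_pb2.py` files mean "built", `.proto` files mean "source") are assumptions for illustration only; this is not part of the Beam release tooling.

```python
import io
import zipfile


def classify_python_zip(zip_file):
    """Classify a staged zip by what it contains (hypothetical rule)."""
    with zipfile.ZipFile(zip_file) as archive:
        names = archive.namelist()
    # Assumption: generated protobuf modules end in _pb2.py.
    if any(name.endswith("_pb2.py") for name in names):
        return "built artifact"
    # Assumption: a source dist must carry the .proto definitions.
    if any(name.endswith(".proto") for name in names):
        return "source dist"
    return "neither"


def make_zip(names):
    """Build an in-memory zip with empty files, for trying the check out."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # A zip with setup.py but no pb2 and no proto files, like this RC.
    print(classify_python_zip(make_zip(["pkg/setup.py", "pkg/gen_protos.py"])))
```

Under these assumptions, a zip with neither kind of file is classified as "neither", which is the state described above.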

Kenn

On Wed, Nov 8, 2017 at 10:21 AM, Robert Bradshaw <
rober...@google.com.invalid> wrote:

> This is due to having removed the auto-generated pb2 files.
>
> On Wed, Nov 8, 2017 at 9:37 AM, Valentyn Tymofieiev
>  wrote:
> > Confirming Ismaël's finding - I also see this error and it did not see it
> > on a candidate that was in the staging area yesterday.
> >
> > On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía  wrote:
> >
> >> I tested the python version of the release I just created a new
> >> virtualenv and run
> >>
> >> python setup.py install and it gave me this message:
> >>
> >> Traceback (most recent call last):
> >>   File "setup.py", line 203, in 
> >> 'test': generate_protos_first(test),
> >>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> >> dist.run_commands()
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in
> run_commands
> >> self.run_command(cmd)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install.py",
> >> line 67, in run
> >> self.do_egg_install()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install.py",
> >> line 109, in do_egg_install
> >> self.run_command('bdist_egg')
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/bdist_egg.py",
> >> line 169, in run
> >> cmd = self.call_command('install_lib', warn_dir=0)
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/bdist_egg.py",
> >> line 155, in call_command
> >> self.run_command(cmdname)
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> >> te-packages/setuptools/command/install_lib.py",
> >> line 11, in run
> >> self.build()
> >>   File "/usr/lib/python2.7/distutils/command/install_lib.py", line 109,
> >> in build
> >> self.run_command('build_py')
> >>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> >> self.distribution.run_command(command)
> >>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> >> cmd_obj.run()
> >>   File "setup.py", line 143, in run
> >> gen_protos.generate_proto_files()
> >>   File "/home/ismael/releases/votes/beam/apache-beam-2.2.0-python/g
> >> en_protos.py",
> >> line 66, in generate_proto_files
> >> 'Not in apache git tree; unable to find proto definitions.')
> >> RuntimeError: Not in apache git tree; unable to find proto definitions.
> >>
> >> Not sure if this is something in my environment, but this passed when
> >> I validated the previous release (2.1.0).
> >>
> >>
> >> On Wed, Nov 8, 2017 at 11:30 AM, Reuven Lax 
> >> wrote:
> >> > Hi everyone,
> >> >
> >> > Please review and vote on the release candidate #2 for the version
> 2.2.0,
> >> > as follows:
> >> >   [ ] +1, Approve the release
> >> >   [ ] -1, Do not approve the release (please provide specific
> comments)
> >> >
> >> >
> >> > The complete staging area is available for your review, which
> includes:
> >> >   * JIRA release notes [1],
> >> >   * the official Apache source release to be deployed to
> dist.apache.org
> >> [2],
> >> > which is signed with the key with fingerprint B98B7708 [3],
> >> >   * all artifacts to be deployed to the Maven Central Repository [4],
> >> >   * source code tag "v2.2.0-RC3" [5],
> 

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-08 Thread Jean-Baptiste Onofré
Thanks for the update. I was swamped on some meetings. I'm back to test the 
latest changes.

Regards
JB

On Nov 8, 2017, 18:56, at 18:56, Lukasz Cwik  wrote:
>Thanks everyone for trying this build out in different workspaces /
>configurations. This will help make sure the build works for more
>people
>and will get rid of any rough edges.
>
>Performance (All):
>Maven performs parallelization at the module level, an entire module
>needs
>to complete before any dependent modules can start, this means running
>all
>the checks like findbugs, checkstyle, tests need to finish. Gradle has
>task
>level parallelism between subprojects which means that as soon as the
>compile and shade steps are done for a project, and dependent
>subprojects
>can typically start. This means that we get increased parallelism due
>to
>not needing to wait for findbugs, checkstyle, tests to run. I typically
>see
>~20 tasks (at peak) running on my desktop in parallel.
>
>Apache Rat (JB / Romain):
>What files are in the rat report that fail (its likely that I'm missing
>some exclusion for a build time artifact)? Also, please try the build
>again
>after running `git clean -fdx` in your workspace.
>
>Python (JB):
>As for the Python SDK, you'll need to share more details about the
>failure.
>
>Gradle 4.3:
>I would like to defer the swap to Gradle 4.3 until after this PR since
>it
>will be a much smaller set of changes.
>
>On Wed, Nov 8, 2017 at 12:54 AM, Jean-Baptiste Onofré 
>wrote:
>
>> Same for me for rat and python build too:
>>
>> FAILURE: Build completed with 2 failures.
>>
>> 1: Task failed with an exception.
>> ---
>> * What went wrong:
>> Execution failed for task ':rat'.
>> > Found 905 files with unapproved/unknown licenses. See
>> file:/home/jbonofre/Workspace/beam/build/reports/rat/rat-report.txt
>>
>> * Try:
>> Run with --stacktrace option to get the stack trace. Run with --info
>or
>> --debug option to get more log output.
>> 
>> ==
>>
>> 2: Task failed with an exception.
>> ---
>> * Where:
>> Build file '/home/jbonofre/Workspace/beam/sdks/python/build.gradle'
>line:
>> 64
>>
>> * What went wrong:
>> Execution failed for task ':beam-sdks-parent:beam-sdks-python:lint'.
>> > Process 'command 'tox'' finished with non-zero exit value 1
>>
>>
>>
>> On 11/08/2017 09:51 AM, Romain Manni-Bucau wrote:
>>
>>> gradle branch doesnt build for me (some rat issues)
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>>>
>>>
>>> 2017-11-08 5:41 GMT+01:00 Jean-Baptiste Onofré :
>>>
 Great !

 What explain these difference ? I'm curious especially for the
>clean
 build
 all Java modules: is it a question of parallel execution ?

 Regards
 JB


 On 11/08/2017 02:59 AM, Lukasz Cwik wrote:

>
> The Gradle POC has made significant advances since last week
>(shading,
> Python, Go, Docker builds, ...). I believe the current state is
>close
> enough to the Maven build system to warrant a comparison.
>
> The largest build differences I noticed are:
> * Full build takes about ~22mins using Gradle (parallelizing the
>three
> rounds of Python tests would reduce this to ~17mins) compared to
>~38mins
> in
> Maven
> * Clean build all Java modules (skipping over Go/Python
> ) takes ~8mins in
> Gradle which takes ~36mins in Maven
> * Build output is cached allowing for faster subsequent builds
>with
> "gradle
> buildDependents" allowing for most single module changes taking
>~2mins
> to
> build and test without needing to rely on "mvn install"
>
> I have opened PR 4096 
>so
> that
> the Gradle build files merged and then follow up with new Jenkins
> precommits which are powered by Gradle. This will allow the
>community to
> continuing contributing to the Gradle build and also allow for a
> comparison
> of the precommit times on the Jenkins executor when using
>Maven/Gradle.
> I
> suggest that those who are interested try out the PR.
>
> On Fri, Nov 3, 2017 at 10:29 PM, Jean-Baptiste Onofré
>
> wrote:
>
> That makes sense. The point is that we have to compare
>equivalently. I'm
>> also curious about Gradle PoC assuming it does the same actions
>as
>> Maven.
>>
>> Regards
>> JB
>>
>> On Nov 3, 2017, 20:41, at 20:41, Kenneth Knowles
>> 
>> wrote:
>>
>>>
>>> I'm confident that any choice will speed things up dramatically
>even
>>> beyond
>>> a fast profile, even if the new tool runs all the extra stuff.
>But
>>> that
>>> is
>>> a question that we can answer empirically anyhow. Let's see how
>it
>>> goes!
>>>

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-08 Thread Lukasz Cwik
Thanks everyone for trying this build out in different workspaces /
configurations. This will help make sure the build works for more people
and will get rid of any rough edges.

Performance (All):
Maven parallelizes at the module level: an entire module needs to complete
before any dependent modules can start, which means all the checks
(findbugs, checkstyle, tests) need to finish first. Gradle has task-level
parallelism between subprojects, which means that as soon as the compile and
shade steps are done for a project, any dependent subprojects can typically
start. We get increased parallelism because we don't need to wait for
findbugs, checkstyle, and tests to run. I typically see ~20 tasks (at peak)
running on my desktop in parallel.
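
A toy model of the difference, as a rough sketch only: each module has a compile step and a checks step, workers are assumed unlimited, and the module names and minute values are invented for illustration (they are not measurements from the Beam build).

```python
def module_level_finish(modules, deps):
    """Finish time when a module starts only after upstream modules fully finish."""
    finish = {}

    def done(name):
        if name not in finish:
            start = max((done(d) for d in deps.get(name, [])), default=0)
            compile_minutes, check_minutes = modules[name]
            finish[name] = start + compile_minutes + check_minutes
        return finish[name]

    return max(done(name) for name in modules)


def task_level_finish(modules, deps):
    """Finish time when a module's compile waits only for upstream compiles."""
    compiled = {}

    def compile_done(name):
        if name not in compiled:
            start = max((compile_done(d) for d in deps.get(name, [])), default=0)
            compiled[name] = start + modules[name][0]
        return compiled[name]

    # Checks run right after a module's own compile, in parallel with downstream work.
    return max(compile_done(name) + modules[name][1] for name in modules)


# Hypothetical chain: io depends on runner, runner depends on core.
# Values are (compile minutes, checks minutes).
MODULES = {"core": (2, 5), "runner": (1, 4), "io": (1, 3)}
DEPS = {"runner": ["core"], "io": ["runner"]}

if __name__ == "__main__":
    print(module_level_finish(MODULES, DEPS))  # 16
    print(task_level_finish(MODULES, DEPS))    # 7
```

In this toy chain the module-level schedule pays for every upstream module's checks before moving on, while the task-level schedule only pays for the compile steps along the chain plus one module's checks.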

Apache Rat (JB / Romain):
Which files in the rat report fail (it's likely that I'm missing some
exclusion for a build-time artifact)? Also, please try the build again
after running `git clean -fdx` in your workspace.

Python (JB):
As for the Python SDK, you'll need to share more details about the failure.

Gradle 4.3:
I would like to defer the swap to Gradle 4.3 until after this PR since it
will be a much smaller set of changes.

On Wed, Nov 8, 2017 at 12:54 AM, Jean-Baptiste Onofré 
wrote:

> Same for me for rat and python build too:
>
> FAILURE: Build completed with 2 failures.
>
> 1: Task failed with an exception.
> ---
> * What went wrong:
> Execution failed for task ':rat'.
> > Found 905 files with unapproved/unknown licenses. See
> file:/home/jbonofre/Workspace/beam/build/reports/rat/rat-report.txt
>
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or
> --debug option to get more log output.
> 
> ==
>
> 2: Task failed with an exception.
> ---
> * Where:
> Build file '/home/jbonofre/Workspace/beam/sdks/python/build.gradle' line:
> 64
>
> * What went wrong:
> Execution failed for task ':beam-sdks-parent:beam-sdks-python:lint'.
> > Process 'command 'tox'' finished with non-zero exit value 1
>
>
>
> On 11/08/2017 09:51 AM, Romain Manni-Bucau wrote:
>
>> gradle branch doesnt build for me (some rat issues)
>>
>> Romain Manni-Bucau
>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>>
>>
>> 2017-11-08 5:41 GMT+01:00 Jean-Baptiste Onofré :
>>
>>> Great !
>>>
>>> What explain these difference ? I'm curious especially for the clean
>>> build
>>> all Java modules: is it a question of parallel execution ?
>>>
>>> Regards
>>> JB
>>>
>>>
>>> On 11/08/2017 02:59 AM, Lukasz Cwik wrote:
>>>

>>>> The Gradle POC has made significant advances since last week (shading,
>>>> Python, Go, Docker builds, ...). I believe the current state is close
>>>> enough to the Maven build system to warrant a comparison.
>>>>
>>>> The largest build differences I noticed are:
>>>> * Full build takes about ~22mins using Gradle (parallelizing the three
>>>> rounds of Python tests would reduce this to ~17mins) compared to ~38mins
>>>> in
>>>> Maven
>>>> * Clean build all Java modules (skipping over Go/Python
>>>> ) takes ~8mins in
>>>> Gradle which takes ~36mins in Maven
>>>> * Build output is cached allowing for faster subsequent builds with
>>>> "gradle
>>>> buildDependents" allowing for most single module changes taking ~2mins
>>>> to
>>>> build and test without needing to rely on "mvn install"
>>>>
>>>> I have opened PR 4096  so
>>>> that
>>>> the Gradle build files merged and then follow up with new Jenkins
>>>> precommits which are powered by Gradle. This will allow the community to
>>>> continuing contributing to the Gradle build and also allow for a
>>>> comparison
>>>> of the precommit times on the Jenkins executor when using Maven/Gradle.
>>>> I
>>>> suggest that those who are interested try out the PR.
>>>>
>>>> On Fri, Nov 3, 2017 at 10:29 PM, Jean-Baptiste Onofré 
>>>> wrote:
>>>>
>>>> That makes sense. The point is that we have to compare equivalently. I'm
>>>>> also curious about Gradle PoC assuming it does the same actions as
>>>>> Maven.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On Nov 3, 2017, 20:41, at 20:41, Kenneth Knowles
>>>>> 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> I'm confident that any choice will speed things up dramatically even
>>>>>> beyond
>>>>>> a fast profile, even if the new tool runs all the extra stuff. But
>>>>>> that
>>>>>> is
>>>>>> a question that we can answer empirically anyhow. Let's see how it
>>>>>> goes!
>>>>>>
>>>>>> Incidentally, my experiments with Bazel have led me to the conclusion
>>>>>> that
>>>>>> it is not the right choice for us so I'm not going to be proposing any
>>>>>> completed POC of that right now. I'm interested in the outcome of the
>>>>>> Gradle POC.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 3, 2017 at 3:30 AM, Jean-Baptiste 

Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Valentyn Tymofieiev
Confirming Ismaël's finding - I also see this error, and I did not see it
on a candidate that was in the staging area yesterday.

On Wed, Nov 8, 2017 at 9:07 AM, Ismaël Mejía  wrote:

> I tested the python version of the release I just created a new
> virtualenv and run
>
> python setup.py install and it gave me this message:
>
> Traceback (most recent call last):
>   File "setup.py", line 203, in 
> 'test': generate_protos_first(test),
>   File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> te-packages/setuptools/command/install.py",
> line 67, in run
> self.do_egg_install()
>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> te-packages/setuptools/command/install.py",
> line 109, in do_egg_install
> self.run_command('bdist_egg')
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> te-packages/setuptools/command/bdist_egg.py",
> line 169, in run
> cmd = self.call_command('install_lib', warn_dir=0)
>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> te-packages/setuptools/command/bdist_egg.py",
> line 155, in call_command
> self.run_command(cmdname)
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/si
> te-packages/setuptools/command/install_lib.py",
> line 11, in run
> self.build()
>   File "/usr/lib/python2.7/distutils/command/install_lib.py", line 109,
> in build
> self.run_command('build_py')
>   File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
> self.distribution.run_command(command)
>   File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "setup.py", line 143, in run
> gen_protos.generate_proto_files()
>   File "/home/ismael/releases/votes/beam/apache-beam-2.2.0-python/g
> en_protos.py",
> line 66, in generate_proto_files
> 'Not in apache git tree; unable to find proto definitions.')
> RuntimeError: Not in apache git tree; unable to find proto definitions.
>
> Not sure if this is something in my environment, but this passed when
> I validated the previous release (2.1.0).
>
>
> On Wed, Nov 8, 2017 at 11:30 AM, Reuven Lax 
> wrote:
> > Hi everyone,
> >
> > Please review and vote on the release candidate #2 for the version 2.2.0,
> > as follows:
> >   [ ] +1, Approve the release
> >   [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> >   * JIRA release notes [1],
> >   * the official Apache source release to be deployed to dist.apache.org
> [2],
> > which is signed with the key with fingerprint B98B7708 [3],
> >   * all artifacts to be deployed to the Maven Central Repository [4],
> >   * source code tag "v2.2.0-RC3" [5],
> >   * website pull request listing the release and publishing the API
> > reference manual [6].
> >   * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
> > 1.8.0_144.
> >   * Python artifacts are deployed along with the source release to the
> > dist.apache.org [2].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Reuven
> >
> > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> > projectId=12319527=12341044
> > [2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
> > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> > [4] https://repository.apache.org/content/repositories/orgapache
> beam-1023/
> > 
> > [5] https://github.com/apache/beam/tree/v2.2.0-RC
> > 3
> > [6] https://github.com/apache/beam-site/pull/337
>


Re: [VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Ismaël Mejía
I tested the Python version of the release. I just created a new
virtualenv and ran

python setup.py install

and it gave me this message:

Traceback (most recent call last):
  File "setup.py", line 203, in <module>
    'test': generate_protos_first(test),
  File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/site-packages/setuptools/command/install.py", line 67, in run
    self.do_egg_install()
  File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/site-packages/setuptools/command/install.py", line 109, in do_egg_install
    self.run_command('bdist_egg')
  File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/site-packages/setuptools/command/bdist_egg.py", line 169, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/site-packages/setuptools/command/bdist_egg.py", line 155, in call_command
    self.run_command(cmdname)
  File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/ismael/.virtualenvs/beam-vote2/local/lib/python2.7/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/usr/lib/python2.7/distutils/command/install_lib.py", line 109, in build
    self.run_command('build_py')
  File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "setup.py", line 143, in run
    gen_protos.generate_proto_files()
  File "/home/ismael/releases/votes/beam/apache-beam-2.2.0-python/gen_protos.py", line 66, in generate_proto_files
    'Not in apache git tree; unable to find proto definitions.')
RuntimeError: Not in apache git tree; unable to find proto definitions.

Not sure if this is something in my environment, but this passed when
I validated the previous release (2.1.0).
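
For anyone trying to reproduce this, here is a minimal sketch of the kind
of check that could raise the error above. It is not the actual
gen_protos.py: the function name find_proto_dir, the model_subdir
parameter and the directory layout are assumptions made for illustration.
The only point is that a tree unpacked from a release archive may lack
proto definitions that a git checkout keeps next to the SDK.

```python
import os
import tempfile


def find_proto_dir(sdk_root, model_subdir="model"):
    """Return the proto directory for sdk_root, or raise if it is missing.

    Hypothetical layout: in a git checkout the SDK is assumed to live two
    levels below the repository root, next to a `model_subdir` directory.
    """
    candidate = os.path.normpath(
        os.path.join(sdk_root, os.pardir, os.pardir, model_subdir))
    if not os.path.isdir(candidate):
        # Same message as in the traceback above.
        raise RuntimeError(
            "Not in apache git tree; unable to find proto definitions.")
    return candidate


def check_layouts():
    """Build a checkout-like tree and a release-like tree and probe both."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        # Checkout-like: <root>/repo/sdks/python next to <root>/repo/model.
        checkout_sdk = os.path.join(root, "repo", "sdks", "python")
        os.makedirs(checkout_sdk)
        os.makedirs(os.path.join(root, "repo", "model"))
        results["checkout"] = os.path.basename(find_proto_dir(checkout_sdk))

        # Release-like: the SDK directory alone, with no model directory.
        release_sdk = os.path.join(root, "unpacked", "votes", "sdk")
        os.makedirs(release_sdk)
        try:
            find_proto_dir(release_sdk)
            results["release"] = "found"
        except RuntimeError as err:
            results["release"] = str(err)
    return results
```

Calling check_layouts() exercises both cases: the checkout-like tree
resolves to its model directory, while the release-like tree raises the
RuntimeError.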


On Wed, Nov 8, 2017 at 11:30 AM, Reuven Lax  wrote:
> Hi everyone,
>
> Please review and vote on the release candidate #2 for the version 2.2.0,
> as follows:
>   [ ] +1, Approve the release
>   [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
>   * JIRA release notes [1],
>   * the official Apache source release to be deployed to dist.apache.org [2],
> which is signed with the key with fingerprint B98B7708 [3],
>   * all artifacts to be deployed to the Maven Central Repository [4],
>   * source code tag "v2.2.0-RC3" [5],
>   * website pull request listing the release and publishing the API
> reference manual [6].
>   * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
> 1.8.0_144.
>   * Python artifacts are deployed along with the source release to the
> dist.apache.org [2].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Reuven
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12341044
> [2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1023/
> 
> [5] https://github.com/apache/beam/tree/v2.2.0-RC3
> [6] https://github.com/apache/beam-site/pull/337


Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Ted Yu
Having both Spark1 and Spark2 modules would benefit a wider user base.

I would vote for that.

Cheers

On Wed, Nov 8, 2017 at 12:51 AM, Jean-Baptiste Onofré 
wrote:

> Hi Robert,
>
> Thanks for your feedback !
>
> From a user perspective, with the current state of the PR, the same
> pipelines can run on both Spark 1.x and 2.x: the only difference is the
> set of dependencies.
>
> I'm calling the vote to get this kind of feedback: if we consider that
> Spark 1.x still needs to be supported, no problem, I will improve the PR
> to have three modules (common, spark1, spark2) and let users pick the
> desired version.
>
> Let's wait a bit for other feedback; I will update the PR accordingly.
>
> Regards
> JB
>
>
> On 11/08/2017 09:47 AM, Robert Bradshaw wrote:
>
>> I'm generally a -0.5 on this change, or at least doing so hastily.
>>
>> As with dropping Java 7 support, I think this should at least be
>> announced in release notes that we're considering dropping support in
>> the subsequent release, as this dev list likely does not reach a
>> substantial portion of the userbase.
>>
>> How much work is it to move from a Spark 1.x cluster to a Spark 2.x
>> cluster? I get the feeling it's not nearly as transparent as upgrading
>> Java versions. Can Spark 1.x pipelines be run on Spark 2.x clusters,
>> or is a new cluster (and/or upgrading all pipelines) required (e.g.
>> for those who operate spark clusters shared among their many users)?
>>
>> Looks like the latest release of Spark 1.x was about a year ago,
>> overlapping a bit with the 2.x series which is coming up on 1.5 years
>> old, so I could see a lot of people still using 1.x even if 2.x is
>> clearly the future. But it sure doesn't seem very backwards
>> compatible.
>>
>> Mostly I'm not comfortable with dropping 1.x in the same release as
>> adding support for 2.x, giving no transition period, but could be
>> convinced if this transition is mostly a no-op or no one's still using
>> 1.x. If there are non-trivial code complexity issues, I would perhaps
>> revisit the issue of having a single Spark Runner that chooses
>> the backend implicitly in favor of simply having two runners which
>> share the code that's easy to share and diverge otherwise (which seems
>> it would be much simpler both to implement and explain to users). I
>> would be OK with even letting the Spark 1.x runner be somewhat
>> stagnant (e.g. few or no new features) until we decide we can kill it
>> off.
>>
>> On Tue, Nov 7, 2017 at 11:27 PM, Jean-Baptiste Onofré 
>> wrote:
>>
>>> Hi all,
>>>
>>> as you might know, we are working on Spark 2.x support in the Spark
>>> runner.
>>>
>>> I'm working on a PR about that:
>>>
>>> https://github.com/apache/beam/pull/3808
>>>
>>> Today, we have something working with both Spark 1.x and 2.x from a code
>>> standpoint, but I have to deal with dependencies. It's the first step of
>>> the update as I'm still using RDD; the second step would be to support
>>> dataframe (but for that, I would need PCollection elements with schemas,
>>> that's another topic on which Eugene, Reuven and I are discussing).
>>>
>>> However, as all major distributions now ship Spark 2.x, I don't think
>>> it's required anymore to support Spark 1.x.
>>>
>>> If we agree, I will update and clean up the PR to only support and focus
>>> on Spark 2.x.
>>>
>>> So, that's why I'm calling for a vote:
>>>
>>>[ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>>>[ ] 0 (I don't care ;))
>>>[ ] -1, I would like to still support Spark 1.x, and so have support
>>> of both Spark 1.x and 2.x (please provide specific comment)
>>>
>>> This vote is open for 48 hours (I have the commits ready, just waiting
>>> for the end of the vote to push on the PR).
>>>
>>> Thanks !
>>> Regards
>>> JB
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


[VOTE] Release 2.2.0, release candidate #2

2017-11-08 Thread Reuven Lax
Hi everyone,

Please review and vote on the release candidate #2 for the version 2.2.0,
as follows:
  [ ] +1, Approve the release
  [ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
  * JIRA release notes [1],
  * the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint B98B7708 [3],
  * all artifacts to be deployed to the Maven Central Repository [4],
  * source code tag "v2.2.0-RC3" [5],
  * website pull request listing the release and publishing the API
reference manual [6].
  * Java artifacts were built with Maven 3.5.0 and OpenJDK/Oracle JDK
1.8.0_144.
  * Python artifacts are deployed along with the source release to the
dist.apache.org [2].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Reuven

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12341044
[2] https://dist.apache.org/repos/dist/dev/beam/2.2.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1023/

[5] https://github.com/apache/beam/tree/v2.2.0-RC3
[6] https://github.com/apache/beam-site/pull/337


Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-08 Thread Jean-Baptiste Onofré

Same for me for rat and python build too:

FAILURE: Build completed with 2 failures.

1: Task failed with an exception.
---
* What went wrong:
Execution failed for task ':rat'.
> Found 905 files with unapproved/unknown licenses. See file:/home/jbonofre/Workspace/beam/build/reports/rat/rat-report.txt


* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

==

2: Task failed with an exception.
---
* Where:
Build file '/home/jbonofre/Workspace/beam/sdks/python/build.gradle' line: 64

* What went wrong:
Execution failed for task ':beam-sdks-parent:beam-sdks-python:lint'.
> Process 'command 'tox'' finished with non-zero exit value 1


On 11/08/2017 09:51 AM, Romain Manni-Bucau wrote:

The Gradle branch doesn't build for me (some RAT issues)

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn


2017-11-08 5:41 GMT+01:00 Jean-Baptiste Onofré :

Great !

What explains these differences? I'm especially curious about the clean
build of all Java modules: is it a question of parallel execution?

Regards
JB


On 11/08/2017 02:59 AM, Lukasz Cwik wrote:


The Gradle POC has made significant advances since last week (shading,
Python, Go, Docker builds, ...). I believe the current state is close
enough to the Maven build system to warrant a comparison.

The largest build differences I noticed are:
* Full build takes ~22mins using Gradle (parallelizing the three
rounds of Python tests would reduce this to ~17mins) compared to ~38mins
in Maven
* Clean build of all Java modules (skipping over Go/Python) takes ~8mins in
Gradle versus ~36mins in Maven
* Build output is cached, allowing for faster subsequent builds with
"gradle buildDependents", so most single-module changes take ~2mins to
build and test without needing to rely on "mvn install"

I have opened PR 4096 so that
the Gradle build files can be merged, to be followed by new Jenkins
precommits which are powered by Gradle. This will allow the community to
continue contributing to the Gradle build and also allow for a comparison
of the precommit times on the Jenkins executor when using Maven/Gradle. I
suggest that those who are interested try out the PR.

On Fri, Nov 3, 2017 at 10:29 PM, Jean-Baptiste Onofré 
wrote:


That makes sense. The point is that we have to compare equivalently. I'm
also curious about Gradle PoC assuming it does the same actions as Maven.

Regards
JB

On Nov 3, 2017, 20:41, at 20:41, Kenneth Knowles 
wrote:


I'm confident that any choice will speed things up dramatically even beyond
a fast profile, even if the new tool runs all the extra stuff. But that is
a question that we can answer empirically anyhow. Let's see how it goes!

Incidentally, my experiments with Bazel have led me to the conclusion that
it is not the right choice for us so I'm not going to be proposing any
completed POC of that right now. I'm interested in the outcome of the
Gradle POC.

Kenn


On Fri, Nov 3, 2017 at 3:30 AM, Jean-Baptiste Onofré 
wrote:


Hi

It's what I said in a previous e-mail: I don't think that just changing
the build tool will improve the build time a lot.

We already know (and discussed a while ago) that plugins like findbugs,
checkstyle, etc. are taking time.

So, I think we can already have a fast profile.

Regards
JB

On Nov 3, 2017, 11:16, at 11:16, Romain Manni-Bucau
wrote:


Hi guys,

when you check the duration of each mojo of the build (almost, since the
python part of the build just breaks it locally) you see that there is
no real link with maven for the perf issues beam can encounter:
https://gist.github.com/rmannibucau/f65fdde28d5dab0fdac50633f84554c9
(generated from the profiling of tesla-profile and parsed with
https://gist.github.com/rmannibucau/e329d54b8af6c009f46fd151d10037ad)

Before PoC-ing other tools, which will end up either having the same
issues if the other builds do the same things (test, checkstyle,
enforcer, findbugs, ...) or having a less reliable build (trying to skip
some parts of the build if "untouched"; note that this is a very hard
issue since static code analysis doesn't give you any guarantee of
what it does with modern code), maybe some action can be taken on
the current build:

- testing https://github.com/vackosar/gitflow-incremental-builder or
https://github.com/khmarbaise/incremental-module-builder maybe or do
the same kind of extension including the beam needs (/!\ the previous
warning is still accurate and requires a full run at some point to
validate the graph detection algorithm didn't get abused by some
indirect code dependency)
- maybe try to get rid of some shades (it is a bit crazy ATM to have
so many shades, no?)
- the CI can have profiles based on 

Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-08 Thread Romain Manni-Bucau
The Gradle branch doesn't build for me (some RAT issues)

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn


2017-11-08 5:41 GMT+01:00 Jean-Baptiste Onofré :
> Great !
>
> What explains these differences? I'm especially curious about the clean
> build of all Java modules: is it a question of parallel execution?
>
> Regards
> JB
>
>
> On 11/08/2017 02:59 AM, Lukasz Cwik wrote:
>>
>> The Gradle POC has made significant advances since last week (shading,
>> Python, Go, Docker builds, ...). I believe the current state is close
>> enough to the Maven build system to warrant a comparison.
>>
>> The largest build differences I noticed are:
>> * Full build takes ~22mins using Gradle (parallelizing the three
>> rounds of Python tests would reduce this to ~17mins) compared to ~38mins
>> in Maven
>> * Clean build of all Java modules (skipping over Go/Python) takes ~8mins in
>> Gradle versus ~36mins in Maven
>> * Build output is cached, allowing for faster subsequent builds with
>> "gradle buildDependents", so most single-module changes take ~2mins to
>> build and test without needing to rely on "mvn install"
>>
>> I have opened PR 4096 so that
>> the Gradle build files can be merged, to be followed by new Jenkins
>> precommits which are powered by Gradle. This will allow the community to
>> continue contributing to the Gradle build and also allow for a
>> comparison of the precommit times on the Jenkins executor when using
>> Maven/Gradle. I suggest that those who are interested try out the PR.
>>
>> On Fri, Nov 3, 2017 at 10:29 PM, Jean-Baptiste Onofré 
>> wrote:
>>
>>> That makes sense. The point is that we have to compare equivalently. I'm
>>> also curious about Gradle PoC assuming it does the same actions as Maven.
>>>
>>> Regards
>>> JB
>>>
>>> On Nov 3, 2017, 20:41, at 20:41, Kenneth Knowles 
>>> wrote:

>>>> I'm confident that any choice will speed things up dramatically even
>>>> beyond a fast profile, even if the new tool runs all the extra stuff.
>>>> But that is a question that we can answer empirically anyhow. Let's
>>>> see how it goes!
>>>>
>>>> Incidentally, my experiments with Bazel have led me to the conclusion
>>>> that it is not the right choice for us so I'm not going to be proposing
>>>> any completed POC of that right now. I'm interested in the outcome of
>>>> the Gradle POC.
>>>>
>>>> Kenn
>>>>
>>>>
>>>> On Fri, Nov 3, 2017 at 3:30 AM, Jean-Baptiste Onofré
>>>> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> It's what I said in a previous e-mail: I don't think that just
>>>>> changing the build tool will improve the build time a lot.
>>>>>
>>>>> We already know (and discussed a while ago) that plugins like findbugs,
>>>>> checkstyle, etc. are taking time.
>>>>>
>>>>> So, I think we can already have a fast profile.
>>>>>
>>>>> Regards
>>>>> JB
>>>>>
>>>>> On Nov 3, 2017, 11:16, at 11:16, Romain Manni-Bucau
>>>>> wrote:
>>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> when you check the duration of each mojo of the build (almost, since
>>>>>> the python part of the build just breaks it locally) you see that
>>>>>> there is no real link with maven for the perf issues beam can
>>>>>> encounter:
>>>>>> https://gist.github.com/rmannibucau/f65fdde28d5dab0fdac50633f84554c9
>>>>>> (generated from the profiling of tesla-profile and parsed with
>>>>>> https://gist.github.com/rmannibucau/e329d54b8af6c009f46fd151d10037ad)
>>>>>>
>>>>>> Before PoC-ing other tools, which will end up either having the same
>>>>>> issues if the other builds do the same things (test, checkstyle,
>>>>>> enforcer, findbugs, ...) or having a less reliable build (trying to
>>>>>> skip some parts of the build if "untouched"; note that this is a very
>>>>>> hard issue since static code analysis doesn't give you any guarantee
>>>>>> of what it does with modern code), maybe some action can be taken on
>>>>>> the current build:
>>>>>>
>>>>>> - testing https://github.com/vackosar/gitflow-incremental-builder or
>>>>>> https://github.com/khmarbaise/incremental-module-builder maybe or do
>>>>>> the same kind of extension including the beam needs (/!\ the previous
>>>>>> warning is still accurate and requires a full run at some point to
>>>>>> validate the graph detection algorithm didn't get abused by some
>>>>>> indirect code dependency)
>>>>>> - maybe try to get rid of some shades (it is a bit crazy ATM to have
>>>>>> so many shades, no?)
>>>>>> - the CI can have profiles based on a PR convention (name of the
>>>>>> branch?) to select the build profile, for instance
>>>>>> fb/elasticsearch_super-nice-PR would build only the elasticsearch
>>>>>> modules, jenkins/travis have this ability since they support scripting
>>>>>> -

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Robert Bradshaw
I'm generally a -0.5 on this change, or at least doing so hastily.

As with dropping Java 7 support, I think this should at least be
announced in release notes that we're considering dropping support in
the subsequent release, as this dev list likely does not reach a
substantial portion of the userbase.

How much work is it to move from a Spark 1.x cluster to a Spark 2.x
cluster? I get the feeling it's not nearly as transparent as upgrading
Java versions. Can Spark 1.x pipelines be run on Spark 2.x clusters,
or is a new cluster (and/or upgrading all pipelines) required (e.g.
for those who operate spark clusters shared among their many users)?

Looks like the latest release of Spark 1.x was about a year ago,
overlapping a bit with the 2.x series which is coming up on 1.5 years
old, so I could see a lot of people still using 1.x even if 2.x is
clearly the future. But it sure doesn't seem very backwards
compatible.

Mostly I'm not comfortable with dropping 1.x in the same release as
adding support for 2.x, giving no transition period, but could be
convinced if this transition is mostly a no-op or no one's still using
1.x. If there are non-trivial code complexity issues, I would perhaps
revisit the issue of having a single Spark Runner that chooses
the backend implicitly in favor of simply having two runners which
share the code that's easy to share and diverge otherwise (which seems
it would be much simpler both to implement and explain to users). I
would be OK with even letting the Spark 1.x runner be somewhat
stagnant (e.g. few or no new features) until we decide we can kill it
off.

On Tue, Nov 7, 2017 at 11:27 PM, Jean-Baptiste Onofré  wrote:
> Hi all,
>
> as you might know, we are working on Spark 2.x support in the Spark runner.
>
> I'm working on a PR about that:
>
> https://github.com/apache/beam/pull/3808
>
> Today, we have something working with both Spark 1.x and 2.x from a code
> standpoint, but I have to deal with dependencies. It's the first step of the
> update as I'm still using RDD, the second step would be to support dataframe
> (but for that, I would need PCollection elements with schemas, that's
> another topic on which Eugene, Reuven and I are discussing).
>
> However, as all major distributions now ship Spark 2.x, I don't think it's
> required anymore to support Spark 1.x.
>
> If we agree, I will update and clean up the PR to only support and focus on
> Spark 2.x.
>
> So, that's why I'm calling for a vote:
>
>   [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>   [ ] 0 (I don't care ;))
>   [ ] -1, I would like to still support Spark 1.x, and so have support of
> both Spark 1.x and 2.x (please provide specific comment)
>
> This vote is open for 48 hours (I have the commits ready, just waiting for the
> end of the vote to push on the PR).
>
> Thanks !
> Regards
> JB
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: [DISCUSS] Move away from Apache Maven as build tool

2017-11-08 Thread Jean-Baptiste Onofré

Great !

What explains these differences? I'm especially curious about the clean build
of all Java modules: is it a question of parallel execution?


Regards
JB

On 11/08/2017 02:59 AM, Lukasz Cwik wrote:

The Gradle POC has made significant advances since last week (shading,
Python, Go, Docker builds, ...). I believe the current state is close
enough to the Maven build system to warrant a comparison.

The largest build differences I noticed are:
* Full build takes ~22mins using Gradle (parallelizing the three
rounds of Python tests would reduce this to ~17mins) compared to ~38mins in
Maven
* Clean build of all Java modules (skipping over Go/Python) takes ~8mins in
Gradle versus ~36mins in Maven
* Build output is cached, allowing for faster subsequent builds with "gradle
buildDependents", so most single-module changes take ~2mins to
build and test without needing to rely on "mvn install"
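
To put the numbers above in perspective, here is a small sketch that
computes the ratios. The timings are the approximate figures quoted in the
list; the function name speedup and the labels are only for illustration.

```python
def speedup(maven_minutes, gradle_minutes):
    """Return how many times faster the Gradle build is than the Maven one."""
    return maven_minutes / gradle_minutes


# Approximate timings from the list above, in minutes: (Maven, Gradle).
timings = {
    "full build": (38, 22),
    "full build, Python tests parallelized": (38, 17),
    "clean build, Java modules only": (36, 8),
}

for name, (maven, gradle) in timings.items():
    ratio = speedup(maven, gradle)
    print(f"{name}: {ratio:.1f}x faster, {maven - gradle} min saved")
```

With these figures the full build comes out at roughly 1.7x faster and the
clean Java-only build at 4.5x.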

I have opened PR 4096 so that
the Gradle build files can be merged, to be followed by new Jenkins
precommits which are powered by Gradle. This will allow the community to
continue contributing to the Gradle build and also allow for a comparison
of the precommit times on the Jenkins executor when using Maven/Gradle. I
suggest that those who are interested try out the PR.

On Fri, Nov 3, 2017 at 10:29 PM, Jean-Baptiste Onofré 
wrote:


That makes sense. The point is that we have to compare equivalently. I'm
also curious about Gradle PoC assuming it does the same actions as Maven.

Regards
JB

On Nov 3, 2017, 20:41, at 20:41, Kenneth Knowles 
wrote:

I'm confident that any choice will speed things up dramatically even beyond
a fast profile, even if the new tool runs all the extra stuff. But that is
a question that we can answer empirically anyhow. Let's see how it goes!

Incidentally, my experiments with Bazel have led me to the conclusion that
it is not the right choice for us so I'm not going to be proposing any
completed POC of that right now. I'm interested in the outcome of the
Gradle POC.

Kenn


On Fri, Nov 3, 2017 at 3:30 AM, Jean-Baptiste Onofré 
wrote:


Hi

It's what I said in a previous e-mail: I don't think that just changing
the build tool will improve the build time a lot.

We already know (and discussed a while ago) that plugins like findbugs,
checkstyle, etc. are taking time.

So, I think we can already have a fast profile.

Regards
JB

On Nov 3, 2017, 11:16, at 11:16, Romain Manni-Bucau
wrote:

Hi guys,

when you check the duration of each mojo of the build (almost, since the
python part of the build just breaks it locally) you see that there is
no real link with maven for the perf issues beam can encounter:
https://gist.github.com/rmannibucau/f65fdde28d5dab0fdac50633f84554c9
(generated from the profiling of tesla-profile and parsed with
https://gist.github.com/rmannibucau/e329d54b8af6c009f46fd151d10037ad)

Before PoC-ing other tools, which will end up either having the same
issues if the other builds do the same things (test, checkstyle,
enforcer, findbugs, ...) or having a less reliable build (trying to skip
some parts of the build if "untouched"; note that this is a very hard
issue since static code analysis doesn't give you any guarantee of
what it does with modern code), maybe some action can be taken on
the current build:

- testing https://github.com/vackosar/gitflow-incremental-builder or
https://github.com/khmarbaise/incremental-module-builder maybe or do
the same kind of extension including the beam needs (/!\ the previous
warning is still accurate and requires a full run at some point to
validate the graph detection algorithm didn't get abused by some
indirect code dependency)
- maybe try to get rid of some shades (it is a bit crazy ATM to have
so many shades, no?)
- the CI can have profiles based on a PR convention (name of the
branch?) to select the build profile, for instance
fb/elasticsearch_super-nice-PR would build only the elasticsearch
modules, jenkins/travis have this ability since they support scripting
- document how to set up a "fastBuild" profile in one's settings.xml
which bypasses checkstyle, enforcer plugin, findbugs, etc... for fast
development iterations
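
The branch-name convention in the list above could be sketched as follows.
The pattern fb/<module>_<description> is only inferred from the single
example given; the function name and the fallback to a full build are
assumptions, and how the result would be handed to the build (for instance
as a module list) is left out.

```python
import re

# Hypothetical convention taken from the example: "fb/<module>_<description>".
BRANCH_PATTERN = re.compile(
    r"^fb/(?P<module>[A-Za-z0-9-]+)_(?P<description>.+)$")


def module_for_branch(branch):
    """Return the module named by the branch, or None for a full build."""
    match = BRANCH_PATTERN.match(branch)
    return match.group("module") if match else None


# A CI script could call this with the branch name of the PR under test.
selected = module_for_branch("fb/elasticsearch_super-nice-PR")
```

Branches that do not follow the convention return None, so the CI job
would fall back to the full build rather than guess.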




Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn


2017-11-01 21:02 GMT+01:00 Kenneth Knowles :

I have started one, here:
https://github.com/kennknowles/beam/commits/bazel.

It is not nearly as far along as Luke's. For the POC I am just putting
things in one root BUILD, and learning where we might find the necessary
plugins as I go. I am happy to grant push access to this branch. It would
be superb if you had some time to work through the Python steps.

On Wed, Nov 1, 2017 at 10:09 AM, Ahmet Altay
wrote:

Has anyone started a POC with Bazel? I would be interested in helping that


Re: Nightly build doesn't pass

2017-11-08 Thread Jean-Baptiste Onofré

I confirm that the build now passes with your fix.

Thanks !

Regards
JB

On 11/07/2017 06:18 PM, Kenneth Knowles wrote:

I just noticed it before I made it through my email. Filed
https://issues.apache.org/jira/browse/BEAM-3150 and opened
https://github.com/apache/beam/pull/4092. I believe anyone can review this.

On Tue, Nov 7, 2017 at 1:56 AM, Jean-Baptiste Onofré 
wrote:


Hi guys,

FYI, our nightly build job doesn't pass on Jenkins due to a test failure
on SpannerIO:

(c2dfd45f33ad1a3a): com.google.cloud.spanner.SpannerException:
INVALID_ARGUMENT: Syntax error: Unexpected keyword WHERE [at 1:95]
...FROM information_schema.columns as c WHERE where c.table_catalog = ''
AND ...

I will take a look later today. Any help is welcome.

Sorry for the inconvenience.

Thanks !
Regards
JB
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com





--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com