Re: Style: how much testing for transform builder classes?

2017-08-18 Thread Jean-Baptiste Onofré
Hi Eugene

It sounds good to me.

Regards
JB

On Aug 18, 2017, 21:42, at 21:42, Eugene Kirpichov 
 wrote:
>Hi all,
>
>This seems to have slipped through the cracks. I'd like to raise this
>again
>since I'm doing a cleanup in https://github.com/apache/beam/pull/3730 ,
>and
>I'd like to get consensus and add the guidance to PTransform Style
>Guide or
>the Testing Guide.
>
>Let me rephrase my suggestions from the previous email more clearly and
>concisely:
>- Do not test successful validation
>- Do not test trivial validation errors (e.g. "fails when a property is
>unset/null/empty/negative/...")
>- Add 1 test per non-trivial class of validation error (e.g.
>"properties
>foo and bar are exclusive", validation of query syntax, etc)
>- Define "nontrivial" as: where missing/incorrect/uninformative
>validation
>may lead to serious problems: data loss, counter-intuitive behavior,
>value
>of a property being silently ignored, or other hard-to-debug errors.
>- Test that each withFoo() method generally has effect (is not
>ignored),
>using TestPipeline and PAssert
>
>Does this sound reasonable?
>
>On Mon, Mar 27, 2017 at 2:28 PM Eugene Kirpichov 
>wrote:
>
>> From Ismael's and Dan's comments, I think I agree that there's a
>class of
>> easy-to-make errors that justifies having some builder tests.
>>
>> 1. Errors of the form "withFoo() instead sets bar, which is of the
>same
>> type as foo so it still compiles" (what Ismael quoted). This, I
>think,
>> should be caught by NeedsRunner tests, because it means that
>withFoo() is
>> effectively being completely ignored, and NeedsRunner tests need to
>test
>> that each property is taking effect, semantically.
>> 2. Errors of the form "withFoo() overrides bar with a default value"
>(e.g.
>> in the commit Dan quoted). This means that foo and bar don't commute,
>i.e.
>> .withFoo().withBar() is not the same as .withBar().withFoo().
>>
>> Errors of the form #2 warrant builder tests. To catch errors of this
>form,
>> it's sufficient to have a single test - that a transform fully
>configured
>> (with non-default values) in one order is equivalent to a transform
>fully
>> configured in precisely the reverse order - this way any two
>properties
>> foo,bar will be tested for whether they commute. Indirect effects
>(where
>> the relative order of more than two properties matters) seem very
>unlikely.
>>
>> OK, here's my revised proposal for what testing of transforms with
>many
>> properties should look like, for inclusion (if people agree)
>> almost-verbatim into the style/testing guide:
>> - For each property with non-trivial validation or a non-trivial
>error
>> message produced from validation, a unit test verifying that invalid
>values
>> are rejected with the proper error message.
>> - Same for rejecting invalid configuration of the transform as a
>whole
>> (e.g. unset required properties, or incompatible but individually
>valid
>> values specified for property combinations), one test per
>substantially
>> different class of error.
>> - One "blackbox" (@NeedsRunner) test per property (or per combination
>of
>> property values causing the transform to behave substantially
>differently),
>> verifying that every property has the desired effect when the
>transform is
>> executed.
>> - A single "whitebox" test that constructs a transform with
>non-default
>> values of all properties in one order and in the exact reverse order,
>and
>> verifies that the resulting transforms are configured equivalently.
>>
>> WDYT?
>>
>> On Tue, Mar 21, 2017 at 5:22 PM Robert Bradshaw
>>  wrote:
>>
>>> On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin
>>> >
>>> wrote:
>>>
>>> >
>https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f
>>> > 99e6a947ef
>>> > is one example where tests of feature Bar exist but did not
>discover
>>> bugs
>>> > that could be introduced by builders.
>>> >
>>>
>>> True, though one would need to test the full cross-product of
>builder
>>> orderings to discover this, a simple test of setValidate() wouldn't
>>> suffice
>>> either. (And if not the full cross product, what subset?) Maybe some
>>> automated fuzz testing would be a fair thing to do here (cheaper and
>more
>>> comprehensive than manual tests...)?
>>>
>>> AutoValue like alleviates many, but not all, of these concerns - as
>Ismael
>>> > points out.
>>> >
>>>
>>> If two features are not orthogonal, that perhaps merits more test
>(and
>>> documentation).
>>>
>>>
>>> >
>>> >
>>> >
>>> > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw <
>>> > rober...@google.com.invalid> wrote:
>>> >
>>> > > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía
>
>>> wrote:
>>> > >
>>> > > > +1 to Vikas point maybe the right place to enforce things
>correct
>>> > > > build tests is in the validate and like this reduce the test
>>> > > > boilerplate and only test the validate, but I wonder if this
>totally
>>> > > > covers both cases (the buildsCorrectly and
>>> > > > buildsCorrectlyInDifferentOrder ones).
>>> > > >
>>> > > > I answer Euge

Re: Policy for stale PRs

2017-08-18 Thread Ted Yu
bq. component leads regularly triage their components, including
unassigning issues.

+1

On Fri, Aug 18, 2017 at 5:11 PM, Ahmet Altay 
wrote:

> To summarize the stale PR issue, do we agree on the following statement:
>
> A PR becomes stale after its author fails to respond to actionable comments
> for 60 days. The community will close stale PRs. Author is welcome to
> reopen the same PR again in the future. The associated JIRAs will be
> unassigned from the author but will stay open.
>
> On Wed, Aug 16, 2017 at 3:25 PM, Ted Yu  wrote:
>
> > bq. IRAs should still stay open but should become unassigned
> >
> > The above would need admin privilege, right ?
> > Is there automated way to do it ?
> >
> > bq. Prevent contributors/committers from taking more than 'n' JIRAs at
> the
> > same time
> >
> > It would be hard to determine the N above since the amount of coding /
> > testing varies greatly across JIRAs.
> >
>
> I agree with Ismaël that there is an issue here. We currently have 969 open
> JIRAs, 427 of them are unassigned and the remaining 542 are assigned to 87
> people. The average of 6 issues per assignee is not that high. I think the
> problem is some of us (mainly component leads, including myself) have too
> many issues assigned.  Top 5 of them have 218 issues assigned to them. I
> believe these issues are automatically assigned for triage purposes. We
> probably do not need to codify an exact set of rules,, we could ask
> component leads regularly triage their components, including unassigning
> issues.
>
>
> >
> >
> >
> > On Wed, Aug 16, 2017 at 3:20 PM, Ismaël Mejía  wrote:
> >
> > > Thanks Ahmet for bringing this subject.
> > >
> > > +1 to close the stale PRs automatically after a fixed time of
> inactivity.
> > > 90
> > > days is ok, but maybe a shorter period is better. If we consider that
> > being
> > > stale is just not having any activity i.e., the author of the PR does
> not
> > > answer
> > > any message. The author can buy extra time just by adding a message to
> > say,
> > > 'wait I am still working on this', and win a complete period of time,
> so
> > > the
> > > longer the staleness period is the longer it can eventually be
> extended.
> > >
> > > I agree with Thomas the JIRAs should still stay open but should become
> > > unassigned because the issue won't be yet fixed but we want to
> encourage
> > > people
> > > to work on it.
> > >
> > > Other additional subject that makes sense to discuss here is if we need
> > > policies
> > > to avoid 'stale' JIRAs (JIRAs that have been taken but that don't have
> > > progress)?, for example:
> > >
> > > - Prevent contributors/committers from taking more than 'n' JIRAs at
> the
> > > same
> > >   time (we should define this n considering the period of staleness,
> > maybe
> > > 10?).
> > >
> > > - Automatically free 'stale' JIRAs after a fixed time period with no
> > > active work
> > >
> > > Remember the objective is to encourage more people to contribute but
> > people
> > > won't be encouraged to contribute on subjects that other people have
> > > taken, this
> > > is a well known anti-pattern in volunteer communities, see
> > > http://communitymgt.wikia.com/wiki/Cookie_Licking
> > >
> > > On Wed, Aug 16, 2017 at 10:38 PM, Thomas Groh  >
> > > wrote:
> > > > JIRAs should only be closed if the issue that they track is no longer
> > > > relevant (either via being fixed or being determined to not be a
> > > problem).
> > > > If a JIRA isn't being meaningfully worked on, it should be unassigned
> > (in
> > > > all cases, not just if there's an associated pull request that has
> not
> > > been
> > > > worked on).
> > > >
> > > > +1 on closing PRs with no action from the original author after some
> > > > reasonable time frame (90 days is certainly reasonable; 30 might be
> too
> > > > short) if the author has not responded to actionable feedback.
> > > >
> > > > On Wed, Aug 16, 2017 at 12:07 PM, Sourabh Bajaj <
> > > > sourabhba...@google.com.invalid> wrote:
> > > >
> > > >> Some projects I have seen close stale PRs after 30 days, saying
> > "Closing
> > > >> due to lack of activity, please feel free to re-open".
> > > >>
> > > >> On Wed, Aug 16, 2017 at 12:05 PM Ahmet Altay
>  > >
> > > >> wrote:
> > > >>
> > > >> > Sounds like we have consensus. Since this is a new policy, I would
> > > >> suggest
> > > >> > picking the most flexible option for now (90 days) and we can
> > tighten
> > > it
> > > >> in
> > > >> > the future. To answer Kenn's question, I do not know, how other
> > > projects
> > > >> > handle this. I did a basic search but could not find a good
> answer.
> > > >> >
> > > >> > What mechanism can we use to close PRs, assuming that author will
> be
> > > out
> > > >> of
> > > >> > communication. We can push a commit with a "This closes #xyz #abc"
> > > >> message.
> > > >> > Is there another way to do this?
> > > >> >
> > > >> > Ahmet
> > > >> >
> > > >> > On Wed, Aug 16, 2017 at 4:32 AM, Aviem Zur 
> > > wrote:
> > > >> >
> > > 

Re: Policy for stale PRs

2017-08-18 Thread Ahmet Altay
To summarize the stale PR issue, do we agree on the following statement:

A PR becomes stale after its author fails to respond to actionable comments
for 60 days. The community will close stale PRs. Author is welcome to
reopen the same PR again in the future. The associated JIRAs will be
unassigned from the author but will stay open.

On Wed, Aug 16, 2017 at 3:25 PM, Ted Yu  wrote:

> bq. IRAs should still stay open but should become unassigned
>
> The above would need admin privilege, right ?
> Is there automated way to do it ?
>
> bq. Prevent contributors/committers from taking more than 'n' JIRAs at the
> same time
>
> It would be hard to determine the N above since the amount of coding /
> testing varies greatly across JIRAs.
>

I agree with Ismaël that there is an issue here. We currently have 969 open
JIRAs, 427 of them are unassigned and the remaining 542 are assigned to 87
people. The average of 6 issues per assignee is not that high. I think the
problem is some of us (mainly component leads, including myself) have too
many issues assigned.  Top 5 of them have 218 issues assigned to them. I
believe these issues are automatically assigned for triage purposes. We
probably do not need to codify an exact set of rules,, we could ask
component leads regularly triage their components, including unassigning
issues.


>
>
>
> On Wed, Aug 16, 2017 at 3:20 PM, Ismaël Mejía  wrote:
>
> > Thanks Ahmet for bringing this subject.
> >
> > +1 to close the stale PRs automatically after a fixed time of inactivity.
> > 90
> > days is ok, but maybe a shorter period is better. If we consider that
> being
> > stale is just not having any activity i.e., the author of the PR does not
> > answer
> > any message. The author can buy extra time just by adding a message to
> say,
> > 'wait I am still working on this', and win a complete period of time, so
> > the
> > longer the staleness period is the longer it can eventually be extended.
> >
> > I agree with Thomas the JIRAs should still stay open but should become
> > unassigned because the issue won't be yet fixed but we want to encourage
> > people
> > to work on it.
> >
> > Other additional subject that makes sense to discuss here is if we need
> > policies
> > to avoid 'stale' JIRAs (JIRAs that have been taken but that don't have
> > progress)?, for example:
> >
> > - Prevent contributors/committers from taking more than 'n' JIRAs at the
> > same
> >   time (we should define this n considering the period of staleness,
> maybe
> > 10?).
> >
> > - Automatically free 'stale' JIRAs after a fixed time period with no
> > active work
> >
> > Remember the objective is to encourage more people to contribute but
> people
> > won't be encouraged to contribute on subjects that other people have
> > taken, this
> > is a well known anti-pattern in volunteer communities, see
> > http://communitymgt.wikia.com/wiki/Cookie_Licking
> >
> > On Wed, Aug 16, 2017 at 10:38 PM, Thomas Groh 
> > wrote:
> > > JIRAs should only be closed if the issue that they track is no longer
> > > relevant (either via being fixed or being determined to not be a
> > problem).
> > > If a JIRA isn't being meaningfully worked on, it should be unassigned
> (in
> > > all cases, not just if there's an associated pull request that has not
> > been
> > > worked on).
> > >
> > > +1 on closing PRs with no action from the original author after some
> > > reasonable time frame (90 days is certainly reasonable; 30 might be too
> > > short) if the author has not responded to actionable feedback.
> > >
> > > On Wed, Aug 16, 2017 at 12:07 PM, Sourabh Bajaj <
> > > sourabhba...@google.com.invalid> wrote:
> > >
> > >> Some projects I have seen close stale PRs after 30 days, saying
> "Closing
> > >> due to lack of activity, please feel free to re-open".
> > >>
> > >> On Wed, Aug 16, 2017 at 12:05 PM Ahmet Altay  >
> > >> wrote:
> > >>
> > >> > Sounds like we have consensus. Since this is a new policy, I would
> > >> suggest
> > >> > picking the most flexible option for now (90 days) and we can
> tighten
> > it
> > >> in
> > >> > the future. To answer Kenn's question, I do not know, how other
> > projects
> > >> > handle this. I did a basic search but could not find a good answer.
> > >> >
> > >> > What mechanism can we use to close PRs, assuming that author will be
> > out
> > >> of
> > >> > communication. We can push a commit with a "This closes #xyz #abc"
> > >> message.
> > >> > Is there another way to do this?
> > >> >
> > >> > Ahmet
> > >> >
> > >> > On Wed, Aug 16, 2017 at 4:32 AM, Aviem Zur 
> > wrote:
> > >> >
> > >> > > Makes sense to close after a long time of inactivity and no
> > response,
> > >> and
> > >> > > as Kenn mentioned they can always re-open.
> > >> > >
> > >> > > On Wed, Aug 16, 2017 at 12:20 AM Jean-Baptiste Onofré <
> > j...@nanthrax.net
> > >> >
> > >> > > wrote:
> > >> > >
> > >> > > > If we consider the author, it makes sense.
> > >> > > >
> > >> > > > Regards
> > >> > > > JB
> > >> > > >
> > >> > > > 

Re: Proposal: file-based IOs should support readAllMatches()

2017-08-18 Thread Chamikara Jayalath
+1 for this. Also it looks like IO authors should be able to use existing
'ReadAllViaFileBasedSource' transform when implementing
FooIO.readAllMatches().

- Cham

On Fri, Aug 18, 2017 at 2:38 PM Eugene Kirpichov
 wrote:

> Hi all,
>
> I've been adding new features to TextIO and AvroIO recently, see e.g.
> https://github.com/apache/beam/pull/3725. The features are:
> - withHintMatchesManyFiles()
> - readAll() that reads a PCollection of filepatterns
> - configurable treatment of filepatterns that match no files
> - watchForNewFiles() that incrementally watches for new files matching the
> filepatterns
>
> However, these features also make sense for other file-based IOs
> (TFRecordIO, XmlIO, the in-review WholeFileIO), and adding them explicitly
> to each of these requires a lot of boilerplate and reeks of lack of
> modularity. I don't want to add this much duplicated code to each of them,
> nor to require authors of new such IOs to add it.
>
> Note that all of these features are available on the recently added
> Match.filepatterns() transform, that converts a PCollection to a
> PCollection (file path and size). The boilerplate in
> file-based IOs ends up simply passing on the properties to the Match
> transform.
>
> Because of this, I'd like to propose the following recommendation for
> file-based IOs:
> A file-based FooIO should include:
> - A read transform that reads the data from a filepattern specified at
> pipeline construction time - FooIO.read().from(filepattern) or something
> analogous, as a PCollection
> - A transform FooIO.readAllMatches() that converts a PCollection
> to PCollection
>
> Then FooIO.read() handles the common case, and the user can solve all
> advanced cases by combining Match.filepatterns() with
> FooIO.readAllMatches():
>
> // Read files in a filepattern but don't fail if it's empty
> PCollection foos = p.apply(Create.of(myFilepattern))
>.apply(Match.filepatterns().withEmptyMatchTreatment(ALLOW))
>.apply(FooIO.readAllMatches());
>
> // Read new filepatterns arriving over PubSub, and for each filepattern
> // continuously watch for new files matching it, polling every 1 minute
> // and stop polling a filepattern if no new files appear for 5 minutes
> PCollection filepatterns = p.apply(PubsubIO.readStrings()...)
> PCollection foos = p.apply(Create.of(myFilepattern))
>.apply(Match.filepatterns().continuously(
>   Duration.standardMinutes(1),
> afterTimeSinceNewOutput(Duration.standardMinutes(5
>.apply(FooIO.readAllMatches());
>
> Adding explicit support for these configuration options to FooIO.read(),
> and adding a FooIO.readAll() should be optional.
>
> WDYT?
>


Re: Becoming a project contributor?

2017-08-18 Thread arost...@google.com
Thank you!

On 2017-08-18 02:11, Kenneth Knowles  wrote: 
> Done, and welcome!
> 
> On Thu, Aug 17, 2017 at 2:24 PM, Asha Rostamianfar <
> arost...@google.com.invalid> wrote:
> 
> > Hi,
> >
> > I have started working on BEAM-2774
> >  and was wondering how to
> > become a project contributor so that the issue can be assigned to me?
> >
> > Thanks,
> > Asha
> >
> 


Proposal: file-based IOs should support readAllMatches()

2017-08-18 Thread Eugene Kirpichov
Hi all,

I've been adding new features to TextIO and AvroIO recently, see e.g.
https://github.com/apache/beam/pull/3725. The features are:
- withHintMatchesManyFiles()
- readAll() that reads a PCollection of filepatterns
- configurable treatment of filepatterns that match no files
- watchForNewFiles() that incrementally watches for new files matching the
filepatterns

However, these features also make sense for other file-based IOs
(TFRecordIO, XmlIO, the in-review WholeFileIO), and adding them explicitly
to each of these requires a lot of boilerplate and reeks of lack of
modularity. I don't want to add this much duplicated code to each of them,
nor to require authors of new such IOs to add it.

Note that all of these features are available on the recently added
Match.filepatterns() transform, that converts a PCollection to a
PCollection (file path and size). The boilerplate in
file-based IOs ends up simply passing on the properties to the Match
transform.

Because of this, I'd like to propose the following recommendation for
file-based IOs:
A file-based FooIO should include:
- A read transform that reads the data from a filepattern specified at
pipeline construction time - FooIO.read().from(filepattern) or something
analogous, as a PCollection
- A transform FooIO.readAllMatches() that converts a PCollection
to PCollection

Then FooIO.read() handles the common case, and the user can solve all
advanced cases by combining Match.filepatterns() with
FooIO.readAllMatches():

// Read files in a filepattern but don't fail if it's empty
PCollection foos = p.apply(Create.of(myFilepattern))
   .apply(Match.filepatterns().withEmptyMatchTreatment(ALLOW))
   .apply(FooIO.readAllMatches());

// Read new filepatterns arriving over PubSub, and for each filepattern
// continuously watch for new files matching it, polling every 1 minute
// and stop polling a filepattern if no new files appear for 5 minutes
PCollection filepatterns = p.apply(PubsubIO.readStrings()...)
PCollection foos = p.apply(Create.of(myFilepattern))
   .apply(Match.filepatterns().continuously(
  Duration.standardMinutes(1),
afterTimeSinceNewOutput(Duration.standardMinutes(5
   .apply(FooIO.readAllMatches());

Adding explicit support for these configuration options to FooIO.read(),
and adding a FooIO.readAll() should be optional.

WDYT?


Re: Style: how much testing for transform builder classes?

2017-08-18 Thread Eugene Kirpichov
Hi all,

This seems to have slipped through the cracks. I'd like to raise this again
since I'm doing a cleanup in https://github.com/apache/beam/pull/3730 , and
I'd like to get consensus and add the guidance to PTransform Style Guide or
the Testing Guide.

Let me rephrase my suggestions from the previous email more clearly and
concisely:
- Do not test successful validation
- Do not test trivial validation errors (e.g. "fails when a property is
unset/null/empty/negative/...")
- Add 1 test per non-trivial class of validation error (e.g. "properties
foo and bar are exclusive", validation of query syntax, etc)
- Define "nontrivial" as: where missing/incorrect/uninformative validation
may lead to serious problems: data loss, counter-intuitive behavior, value
of a property being silently ignored, or other hard-to-debug errors.
- Test that each withFoo() method generally has effect (is not ignored),
using TestPipeline and PAssert

Does this sound reasonable?

On Mon, Mar 27, 2017 at 2:28 PM Eugene Kirpichov 
wrote:

> From Ismael's and Dan's comments, I think I agree that there's a class of
> easy-to-make errors that justifies having some builder tests.
>
> 1. Errors of the form "withFoo() instead sets bar, which is of the same
> type as foo so it still compiles" (what Ismael quoted). This, I think,
> should be caught by NeedsRunner tests, because it means that withFoo() is
> effectively being completely ignored, and NeedsRunner tests need to test
> that each property is taking effect, semantically.
> 2. Errors of the form "withFoo() overrides bar with a default value" (e.g.
> in the commit Dan quoted). This means that foo and bar don't commute, i.e.
> .withFoo().withBar() is not the same as .withBar().withFoo().
>
> Errors of the form #2 warrant builder tests. To catch errors of this form,
> it's sufficient to have a single test - that a transform fully configured
> (with non-default values) in one order is equivalent to a transform fully
> configured in precisely the reverse order - this way any two properties
> foo,bar will be tested for whether they commute. Indirect effects (where
> the relative order of more than two properties matters) seem very unlikely.
>
> OK, here's my revised proposal for what testing of transforms with many
> properties should look like, for inclusion (if people agree)
> almost-verbatim into the style/testing guide:
> - For each property with non-trivial validation or a non-trivial error
> message produced from validation, a unit test verifying that invalid values
> are rejected with the proper error message.
> - Same for rejecting invalid configuration of the transform as a whole
> (e.g. unset required properties, or incompatible but individually valid
> values specified for property combinations), one test per substantially
> different class of error.
> - One "blackbox" (@NeedsRunner) test per property (or per combination of
> property values causing the transform to behave substantially differently),
> verifying that every property has the desired effect when the transform is
> executed.
> - A single "whitebox" test that constructs a transform with non-default
> values of all properties in one order and in the exact reverse order, and
> verifies that the resulting transforms are configured equivalently.
>
> WDYT?
>
> On Tue, Mar 21, 2017 at 5:22 PM Robert Bradshaw
>  wrote:
>
>> On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin > >
>> wrote:
>>
>> > https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f
>> > 99e6a947ef
>> > is one example where tests of feature Bar exist but did not discover
>> bugs
>> > that could be introduced by builders.
>> >
>>
>> True, though one would need to test the full cross-product of builder
>> orderings to discover this, a simple test of setValidate() wouldn't
>> suffice
>> either. (And if not the full cross product, what subset?) Maybe some
>> automated fuzz testing would be a fair thing to do here (cheaper and more
>> comprehensive than manual tests...)?
>>
>> AutoValue like alleviates many, but not all, of these concerns - as Ismael
>> > points out.
>> >
>>
>> If two features are not orthogonal, that perhaps merits more test (and
>> documentation).
>>
>>
>> >
>> >
>> >
>> > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw <
>> > rober...@google.com.invalid> wrote:
>> >
>> > > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía 
>> wrote:
>> > >
>> > > > +1 to Vikas point maybe the right place to enforce things correct
>> > > > build tests is in the validate and like this reduce the test
>> > > > boilerplate and only test the validate, but I wonder if this totally
>> > > > covers both cases (the buildsCorrectly and
>> > > > buildsCorrectlyInDifferentOrder ones).
>> > > >
>> > > > I answer Eugene’s question here even if you are aware now since you
>> > > > commented in the PR, so everyone understands the case.
>> > > >
>> > > > The case is pretty simple, when you extend an IO and add a new
>> > > > configuration parameter, suppose we have withFo

Re: beam-site issues with Jenkins and MergeBot

2017-08-18 Thread Jason Kuster
I'll take a look this afternoon Eugene, thanks! I'll also send a PR to
update MergeBot documentation to make testing and deployment information
clearer.

On Thu, Aug 17, 2017 at 3:58 PM, Eugene Kirpichov 
wrote:

> Hi Jason,
>
> Mergebot seems to be generally working, thanks! However I found another
> issue with it: it fails to add new files when regenerating the website. I
> ran into this in https://github.com/apache/beam-site/pull/292: mergebot
> generated commit b548e9ba
> 
>  which
> failed to add new files and I had to add them manually in 44d3769f
> 
> .
>
> I sent a PR to fix this https://github.com/jasonkuster/merge-bot/pull/24 
> though
> I don't know how to test or deploy it.
>
> On Thu, Aug 10, 2017 at 3:32 PM Jason Kuster 
> wrote:
>
>> Investigating mergebot outage currently. Apologies for the downtime.
>>
>> On Wed, Aug 9, 2017 at 9:55 PM, Eugene Kirpichov 
>> wrote:
>>
>>> Indeed beam-site is at https://gitbox.apache.org/repos/asf/beam-site.git
>>>  now.
>>>
>>> However, Mergebot appears to still be not working.
>>> https://github.com/apache/beam-site/pull/283 fixes the dead link and it
>>> passes the Jenkins precommit tests, but my "@asfgit merge" appears to have
>>> done nothing. I'm gonna have to merge things manually for now.
>>>
>>> +Jason Kuster  Any ideas on why Mergebot is not
>>> working?
>>>
>>> On Wed, Aug 9, 2017 at 9:40 PM Jean-Baptiste Onofré 
>>> wrote:
>>>
 Beam site is no more on git-wip-us, but it moved to gitbox afair.

 Regards
 JB

 On 08/09/2017 10:08 PM, Eugene Kirpichov wrote:
 > Hello,
 >
 > I've been trying to merge a PR https://github.com/apache/
 beam-site/pull/278
 > and ran into the following issues:
 >
 > 1) When I do "git fetch --all" on beam-site, I get an error "fatal:
 > repository 'https://git-wip-us.apache.org/repos/asf/beam-site.git/'
 not
 > found". Has the git address of the apache repo changed? Is it no
 longer
 > valid because we have MergeBot?
 >
 > 2) Precommit tests are failing nearly 100% of the time.
 > If you look at build history on
 > https://builds.apache.org/job/beam_PreCommit_Website_Test/ - 9 out
 of 10
 > last builds failed.
 > Failures I saw:
 >
 > 7 times:
 > + gpg --keyserver hkp://keys.gnupg.net --recv-keys
 > 409B6B1796C275462A1703113804BB82D39DC0E3
 > gpg: requesting key D39DC0E3 from hkp server keys.gnupg.net
 > ?: keys.gnupg.net: Cannot assign requested address
 >
 > 2 times:
 > - ./content/subdir/contribute/testing/index.html
 >*  External link https://builds.apache.org/view/Beam/ failed: 404
 No error
 >
 > The second failure seems legit - https://builds.apache.org/view/Beam/
 is
 > actually 404 right now (I'll send a separate email about htis)
 >
 > The gnupg failure is not legit - I'm able to run the same command
 myself
 > with no issues.
 >
 > 3) Suppose because of this, I'm not able to merge my PR with "@asfgit
 > merge" command - I suppose it requires a successful test run. Would
 be nice
 > if it posted a comment saying why it refuses to merge.
 >

 --
 Jean-Baptiste Onofré
 jbono...@apache.org
 http://blog.nanthrax.net
 Talend - http://www.talend.com

>>>
>>
>>
>> --
>> ---
>> Jason Kuster
>> Apache Beam / Google Cloud Dataflow
>>
>


-- 
---
Jason Kuster
Apache Beam / Google Cloud Dataflow


Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-18 Thread Jean-Baptiste Onofré
Hi

I'm in vacation so I'm looking for a decent Internet connection to finalize the 
release.

I keep you posted.

Regards
JB

On Aug 18, 2017, 17:48, at 17:48, Eugene Kirpichov 
 wrote:
>Hi JB,
>
>Any updates on finalizing the release?
>
>Thanks.
>
>On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek 
>wrote:
>
>> (Belated) +1
>>
>>  * verified signatures
>>  * verified that Quickstart works with Flink Runner
>>
>> > On 16. Aug 2017, at 20:41, Robert Bradshaw
>
>> wrote:
>> >
>> > +1 binding
>> >
>> > (I've been on vacation as well.)
>> >
>> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik
>
>> wrote:
>> >> Back from vacation.
>> >>
>> >> +1 binding
>> >>
>> >> BEAM-2671 has been marked for 2.2.0 release.
>> >>
>> >>
>> >>
>> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant
>
>> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> Spark runner was tested with word count example and a more
>complex
>> session
>> >>> based application on a yarn cluster.
>> >>> Both application run successfully so we can say that spark runner
>> passed
>> >>> the sanity tests needed.
>> >>>
>> >>> Still there is an open ticket
>> >>> https://issues.apache.org/jira/browse/BEAM-2671 which Stas is
>working
>> on
>> >>> and its implications should be taken into consideration regarding
>the
>> >>> release.
>> >>>
>> >>> Regards
>> >>> Kobi
>> >>>
>> >>> 2017-08-16 5:02 GMT+03:00 Eugene Kirpichov
>> :
>> >>>
>>  Hey all,
>> 
>>  Seems like we're missing one more affirmative vote from a PMC
>member
>> (so
>>  far we have JB and Ahmet) to proceed with the release.
>> 
>>  On Mon, Aug 14, 2017 at 9:30 AM Ahmet Altay
>> >
>>  wrote:
>> 
>> > On Mon, Aug 14, 2017 at 6:32 AM, Ismaël Mejía
>
>> >>> wrote:
>> >
>> >> +1 (non-binding)
>> >>
>> >> - Validated signatures OK
>> >> - mvn clean verify -Prelease on both OpenJDK 1.7 and Oracle
>JDK 8
>> >>> with
>> >> the docker development images (WIP), both OK
>> >> - Run WordCount on local Flink and Spark runners OK
>> >>
>> >> Everything looks nice, only one minor thing (not blocking at
>all).
>> >>> The
>> >> proto generated files for python are not cleaned correctly and
>this
>> >> causes the validation to complain because the maven rat plugin
>does
>> >> not find the apache headers on the files  (this happens if you
>> >>> execute
>> >> mvn clean verify -Prelease immediately after the validation).
>> >>
>> >
>> > Ismaël, could you create a JIRA issue for this (to be fixed at
>a
>> future
>> > release)?
>> >
>> >
>> >>
>> >> On Sun, Aug 13, 2017 at 6:52 AM, Jean-Baptiste Onofré <
>> >>> j...@nanthrax.net
>> >
>> >> wrote:
>> >>> +1 (binding)
>> >>>
>> >>> I do my own tests and casting my own vote ;)
>> >>>
>> >>> Regards
>> >>> JB
>> >>>
>> >>> On 08/09/2017 07:08 AM, Jean-Baptiste Onofré wrote:
>> 
>>  Hi everyone,
>> 
>>  Please review and vote on the release candidate #3 for the
>version
>> >> 2.1.0,
>>  as follows:
>> 
>>  [ ] +1, Approve the release
>>  [ ] -1, Do not approve the release (please provide specific
>>  comments)
>> 
>> 
>>  The complete staging area is available for your review,
>which
>> > includes:
>>  * JIRA release notes [1],
>>  * the official Apache source release to be deployed to
>> > dist.apache.org
>>  [2], which is signed with the key with fingerprint C8282E76
>[3],
>>  * all artifacts to be deployed to the Maven Central
>Repository
>> >>> [4],
>>  * source code tag "v2.1.0-RC3" [5],
>>  * website pull request listing the release and publishing
>the API
>>  reference manual [6].
>>  * Python artifacts are deployed along with the source
>release to
>> >>> the
>>  dist.apache.org [2].
>> 
>>  The vote will be open for at least 72 hours. It is adopted
>by
>>  majority
>>  approval, with at least 3 PMC affirmative votes.
>> 
>>  Thanks,
>>  JB
>> 
>>  [1]
>>  https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> >> projectId=12319527&version=12340528
>>  [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
>>  [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>  [4] https://repository.apache.org/content/repositories/
>> >> orgapachebeam-1020/
>>  [5] https://github.com/apache/beam/tree/v2.1.0-RC3
>>  [6] https://github.com/apache/beam-site/pull/270
>> >>>
>> >>>
>> >>> --
>> >>> Jean-Baptiste Onofré
>> >>> jbono...@apache.org
>> >>> http://blog.nanthrax.net
>> >>> Talend - http://www.talend.com
>> >>
>> >
>> 
>> >>>
>>
>>


Re: [VOTE] Release 2.1.0, release candidate #3

2017-08-18 Thread Eugene Kirpichov
Hi JB,

Any updates on finalizing the release?

Thanks.

On Thu, Aug 17, 2017 at 5:42 AM Aljoscha Krettek 
wrote:

> (Belated) +1
>
>  * verified signatures
>  * verified that Quickstart works with Flink Runner
>
> > On 16. Aug 2017, at 20:41, Robert Bradshaw 
> wrote:
> >
> > +1 binding
> >
> > (I've been on vacation as well.)
> >
> > On Wed, Aug 16, 2017 at 8:50 AM, Lukasz Cwik 
> wrote:
> >> Back from vacation.
> >>
> >> +1 binding
> >>
> >> BEAM-2671 has been marked for 2.2.0 release.
> >>
> >>
> >>
> >> On Wed, Aug 16, 2017 at 2:08 AM, Kobi Salant 
> wrote:
> >>
> >>> Hi,
> >>>
> >>> Spark runner was tested with word count example and a more complex
> session
> >>> based application on a yarn cluster.
> >>> Both application run successfully so we can say that spark runner
> passed
> >>> the sanity tests needed.
> >>>
> >>> Still there is an open ticket
> >>> https://issues.apache.org/jira/browse/BEAM-2671 which Stas is working
> on
> >>> and its implications should be taken into consideration regarding the
> >>> release.
> >>>
> >>> Regards
> >>> Kobi
> >>>
> >>> 2017-08-16 5:02 GMT+03:00 Eugene Kirpichov
> :
> >>>
>  Hey all,
> 
>  Seems like we're missing one more affirmative vote from a PMC member
> (so
>  far we have JB and Ahmet) to proceed with the release.
> 
>  On Mon, Aug 14, 2017 at 9:30 AM Ahmet Altay  >
>  wrote:
> 
> > On Mon, Aug 14, 2017 at 6:32 AM, Ismaël Mejía 
> >>> wrote:
> >
> >> +1 (non-binding)
> >>
> >> - Validated signatures OK
> >> - mvn clean verify -Prelease on both OpenJDK 1.7 and Oracle JDK 8
> >>> with
> >> the docker development images (WIP), both OK
> >> - Run WordCount on local Flink and Spark runners OK
> >>
> >> Everything looks nice, only one minor thing (not blocking at all).
> >>> The
> >> proto generated files for python are not cleaned correctly and this
> >> causes the validation to complain because the maven rat plugin does
> >> not find the apache headers on the files  (this happens if you
> >>> execute
> >> mvn clean verify -Prelease immediately after the validation).
> >>
> >
> > Ismaël, could you create a JIRA issue for this (to be fixed at a
> future
> > release)?
> >
> >
> >>
> >> On Sun, Aug 13, 2017 at 6:52 AM, Jean-Baptiste Onofré <
> >>> j...@nanthrax.net
> >
> >> wrote:
> >>> +1 (binding)
> >>>
> >>> I do my own tests and casting my own vote ;)
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 08/09/2017 07:08 AM, Jean-Baptiste Onofré wrote:
> 
>  Hi everyone,
> 
>  Please review and vote on the release candidate #3 for the version
> >> 2.1.0,
>  as follows:
> 
>  [ ] +1, Approve the release
>  [ ] -1, Do not approve the release (please provide specific
>  comments)
> 
> 
>  The complete staging area is available for your review, which
> > includes:
>  * JIRA release notes [1],
>  * the official Apache source release to be deployed to
> > dist.apache.org
>  [2], which is signed with the key with fingerprint C8282E76 [3],
>  * all artifacts to be deployed to the Maven Central Repository
> >>> [4],
>  * source code tag "v2.1.0-RC3" [5],
>  * website pull request listing the release and publishing the API
>  reference manual [6].
>  * Python artifacts are deployed along with the source release to
> >>> the
>  dist.apache.org [2].
> 
>  The vote will be open for at least 72 hours. It is adopted by
>  majority
>  approval, with at least 3 PMC affirmative votes.
> 
>  Thanks,
>  JB
> 
>  [1]
>  https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> >> projectId=12319527&version=12340528
>  [2] https://dist.apache.org/repos/dist/dev/beam/2.1.0/
>  [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>  [4] https://repository.apache.org/content/repositories/
> >> orgapachebeam-1020/
>  [5] https://github.com/apache/beam/tree/v2.1.0-RC3
>  [6] https://github.com/apache/beam-site/pull/270
> >>>
> >>>
> >>> --
> >>> Jean-Baptiste Onofré
> >>> jbono...@apache.org
> >>> http://blog.nanthrax.net
> >>> Talend - http://www.talend.com
> >>
> >
> 
> >>>
>
>


Jenkins build is back to normal : beam_Release_NightlySnapshot #507

2017-08-18 Thread Apache Jenkins Server
See