Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Kai Jiang
Hi Etienne,

It's awesome that you are working on these useful dashboards. I am getting the
TPC-H benchmark running on the Flink and Dataflow runners. I could work on
similar dashboards for the TPC benchmark after the code is merged.
Also, it would be great to have dashboards for Dataflow.

Best,
Kai

On Wed, Jul 11, 2018 at 6:35 AM Etienne Chauchot 
wrote:

> First catch of the nexmark-CI:
> It seems that there was a change in the direct runner.
>
> Query3 (exercise state and timers)
> - output size should be constant but has increased today => Was there a
> change in state and timer related code?
> - the output size of this query is different between batch and streaming
> modes on direct runner.
>
> Etienne
>
> On Wednesday, 11 July 2018 at 15:25 +0200, Etienne Chauchot wrote:
>
> Is someone interested in creating the scripts and dashboards for the other
> runners? They can be created by copying the existing scripts and dashboards
> and changing one gradle parameter in the scripts and the table name in the
> dashboards.
>
> I have created the tickets:
> https://issues.apache.org/jira/browse/BEAM-4763
> https://issues.apache.org/jira/browse/BEAM-4762
> https://issues.apache.org/jira/browse/BEAM-4761
> https://issues.apache.org/jira/browse/BEAM-4760
>
> Etienne
> On Wednesday, 11 July 2018 at 15:13 +0200, Etienne Chauchot wrote:
>
>
> Hi guys,
>
> I'm glad to announce that the CI of Beam has much improved! Indeed,
> Nexmark is now included in the perfkit dashboards.
>
> At each commit on master, the Nexmark suites are run and new points are
> plotted on the graphs.
>
> I've created two kinds of dashboards:
> - one for performance (run times of the queries)
> - one for the size of the output PCollection (which should be constant)
>
> There are dashboards for these runners:
> - spark
> - flink
> - direct runner
>
> Each dashboard contains:
> - graphs in batch mode
> - graphs in streaming mode
> - graphs for the 13 queries.
>
> That gives more than a hundred graphs (my right finger hurts after so
> many clicks on the mouse :) ). They are that detailed so that anyone can
> focus on the area they are interested in.
> Feel free to also create new dashboards with more aggregated data.
>
> Thanks to Lukasz and Cham for reviewing my PRs and showing how to use
> perfkit dashboards.
>
> The dashboards are here:
>
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
>
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
>
> https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
>
>
> Enjoy,
>
> Etienne
>
>
>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Anton Kedin
I think this looks good; we should enable the plugin and try it out.
The concrete details of the follow-up tasks (auto-assignment, triage, and
dashboarding) will probably depend on how functional the plugin is and what
the test failure data looks like.

Regards,
Anton

On Wed, Jul 11, 2018 at 5:00 PM Mikhail Gryzykhin  wrote:

> @Yifan Zou 
>
> I believe that we should test-drive the system with tickets + PR first and
> decide on email notification later. We already have test failure emails
> sent to commits@, though I believe most people filter them out or are not
> signed up for that list.
>
> The plugin creates only one ticket, and keeps it for recurring test failures.
>
> @Andrew Pilloud
> Thank you for the suggestion. I'll add it to the design doc.
>
> --Mikhail
>
>
>
> On Wed, Jul 11, 2018 at 4:52 PM Yifan Zou  wrote:
>
>> +1 to Andrew's concerns. Leaving the tickets unassigned will cause them
>> to be ignored and no action to be taken.
>>
>> I can see the challenges of ticket assignment. As Mikhail mentioned,
>> the plugin does not support dynamic assignment. We would have to implement
>> a custom script to determine the assignees and do some tricks in the Jenkins
>> job. Also, the post-commit tests usually cover so much that it is
>> difficult to find which part was broken and ask the right person to look
>> into it within the auto-JIRA process. Some naive thoughts: Are we able to send
>> emails to dev@ to ask people to take care of the JIRA issues? Are we
>> able to find component leads and ask them to triage the test failure tickets?
>>
>> Another nitpick. Does the Jenkins job file a JIRA issue on
>> every test failure? Sometimes a test fails continuously over a period of time
>> for the same reason. In this case, we will get duplicate issues
>> filed by Jenkins. I think it would be better to avoid filing an issue
>> if the previous one has not been resolved.
>>
>> Thanks.
>> Yifan
>>
>>
>> On Wed, Jul 11, 2018 at 4:37 PM Andrew Pilloud 
>> wrote:
>>
>>> That sounds great. You should add this detail to the doc.
>>>
>>> On Wed, Jul 11, 2018 at 4:29 PM Mikhail Gryzykhin 
>>> wrote:
>>>
>>>> We already have a component for this purpose: "test-failures". All
>>>> tickets created will go to that component. As an option, we can add a link
>>>> to the list of open JIRA tickets to the PR template.
>>>>
>>>> We would also want to create a graph on the dashboard with the number of
>>>> unassigned and assigned bugs.
>>>>
>>>> I believe that we can also add a counter of unassigned bugs to the PR
>>>> template. This way it will be easier for everyone to know when there is a
>>>> test issue that is not being attended to.
>>>>
>>>> --Mikhail
>>>>
>>>>
>>>> On Wed, Jul 11, 2018 at 4:24 PM Andrew Pilloud wrote:
>>>>
>>>>> So it sounds like you will want to create a component for untriaged
>>>>> issues so they are easy to find. I like the idea of distributing the work
>>>>> of triaging post commit failures to new PR authors as a condition of
>>>>> merging. I feel like we will just be filling JIRA with spam if the issues
>>>>> are automatically created without a plan for triage.
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang wrote:
>>>>>
>>>>>> Maybe this is also a good thread to start the discussion of whether we
>>>>>> want to enforce postcommit tests for every PR.
>>>>>>
>>>>>> Can we afford the cost of longer waiting time to catch potential
>>>>>> bugs?
>>>>>>
>>>>>> -Rui
>>>>>>
>>>>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin wrote:
>>>>>>
>>>>>>> That's a valid point.
>>>>>>>
>>>>>>> Unfortunately, the JiraTestResultReporter plugin does not have
>>>>>>> features to dynamically assign owners. Additionally, I don't think it is
>>>>>>> always easy to find the proper owner for post-commit tests at first glance,
>>>>>>> since they usually cover a broad spectrum of issues.
>>>>>>>
>>>>>>> My assumption is that we need someone to triage new issues.
>>>>>>>
>>>>>>> Ideally, any contributor who sees a failing test should check
>>>>>>> unassigned tickets and either do triage, or assign them to someone who
>>>>>>> can. I strongly encourage this approach.
>>>>>>>
>>>>>>> We have a couple of other ready-made options to consider:
>>>>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>>>>> created tickets.
>>>>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific
>>>>>>> user. This is configured per Jenkins job. We can utilize this if someone
>>>>>>> volunteers.
>>>>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>>>>
>>>>>>> --Mikhail
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud wrote:
>>>>>>>
>>>>>>>> Hi Mikhail,
>>>>>>>>
>>>>>>>> I like the proposal! Hopefully this can replace the constant stream
>>>>>>>> of build failure emails. I noticed one detail seems to be missing: How
>>>>>>>> will new issues be assigned to the proper owner? Will the tool do this
>>>>>>>> automatically or will we need someone to triage new issues?

Re: Vendoring / Shading Protobuf and gRPC

2018-07-11 Thread Lukasz Cwik
Got a fix[1] for Andrew's issue, which turned out to be a release blocker
since it broke performing the release. Also fixed several minor things, like
javadoc, that were wrong with the release. Solving it allowed me to do the
publishing in parallel and cut the release time from 20+ mins to 8 mins on
my machine.

1: https://github.com/apache/beam/pull/5936

On Wed, Jul 11, 2018 at 3:51 PM Andrew Pilloud  wrote:

> We discussed this in person, sounds like my issue is known and will be
> fixed shortly. I'm running builds with '-Ppublishing' because I need to
> generate release artifacts for bundling the Beam SQL shell with the Google
> Cloud SDK. Hope to eventually just use the Beam release, but we are
> currently cutting a release off master every week to quickly iterate on bug
> fixes.
>
> Andrew
>
> On Wed, Jul 11, 2018 at 1:39 PM Lukasz Cwik  wrote:
>
>> Andrew, to my knowledge it seems as though you're running into BEAM-4744.
>> Is there a reason you need to specify -Ppublishing?
>>
>> There is no particular reason for using ByteString within ByteKey and
>> TextSource. Note that we currently do shade away protobuf in sdks/java/core
>> so we could either migrate to using a vendored version or re-implement the
>> functionality to not use ByteString. Note that sdks/java/core can now
>> depend on the model/* classes and perform the Pipeline -> Proto
>> translation, as this will be needed to support portability efforts, so I
>> would prefer just migrating to use the vendored versions of the code. Filed
>> BEAM-4766.
>>
>> As for the IO module, I was referring to the upstream
>> bigtable/bigquery/... libraries vended by Google. If they trimmed their API
>> surface to not expose gRPC or protobuf, then we wouldn't have to worry
>> about having the shading logic within sdks/java/io/google-cloud-platform. I
>> know that this will be impossible for some connectors without backwards
>> incompatible changes since they exposed protobuf on their API surface. I
>> know that Chamikara was looking to shade this away in the
>> sdks/java/io/google-cloud-platform but only had limited success in the past.
>>
>> On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía  wrote:
>>
>>> This is great news in particular for runners (Spark) where the leaking
>>> of some grpc subdependencies caused stability issues and required extra
>>> shading. Great !
>>>
>>> About the other modules
>>>
>>> > Note, these are the following modules that still depend on protobuf
>>> that are shaded away and could move to use a vendored variant of protobuf:
>>> > * sdks/java/core
>>> > * sdks/java/extensions/sql
>>>
>>> For sdks/java/core the dependency on protobuf seems to be minor. From a
>>> quick look it seems that it is only used to import ByteString in two
>>> classes, ByteKey and TextSource, so hopefully we can rewrite both and get
>>> rid of the dependency altogether (making core smaller, which is always a
>>> win).
>>> Can we file a JIRA for this, or am I missing other reasons to depend on
>>> protobuf in core?
>>>
>>> For sdks/java/extensions/sql I don’t know if I am missing something, but
>>> I don’t see any code use of protobuf and I doubt that calcite uses protobuf,
>>> so maybe it is there just because it was leaking from somewhere else in
>>> Beam. We had better check this first.
>>>
>>> > These modules expose protobuf because it is part of the API surface:
>>> > * sdks/java/extensions/protobuf
>>> > * sdks/java/io/google-cloud-platform (I believe that gRPC could be
>>> > shaded here but preferably the IO module would do it so we wouldn't have
>>> > this maintenance burden.)
>>>
>>> Can you please elaborate on ‘but preferably the IO module would do it
>>> so we wouldn't have this maintenance burden’? I remember there was an issue
>>> when running the examples on the Spark runner because of
>>> sdks/java/io/google-cloud-platform leaking netty via gRPC (BEAM-3519). [Note
>>> that this is hidden at the moment because, by pure luck, Spark 2.3.x and
>>> Beam are aligned on the netty version, but this can change in the future, so
>>> hopefully this can be shaded/controlled.]
>>>
>>> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> This is really cool and should cut down our artifact size
>>>> significantly! Thanks Luke!
>>>>
>>>> I am running into one issue after this: builds with the publishing flag
>>>> no longer work. (We run './gradlew -Ppublishing shadowJar' to generate
>>>> release artifacts for the Beam SQL shell.) I get a bunch of errors like
>>>> this:
>>>>
>>>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148:
>>>> error: no suitable method found for
>>>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser,ExtensionRegistryLite)
>>>>
>>>> Is there something I need to change in my build?
>>>>
>>>> Andrew
>>>>
>>>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik wrote:
>>>>
>>>>> With the merge of PR #5594[1], we started shading all gRPC / Protobuf
>>>>>

Re: Running post-commit tests on every PR

2018-07-11 Thread Rui Wang
Thanks Mikhail.

I'd like to close this discussion now.

The idea of running post-commit tests on every PR is to catch post-commit
failures as soon as possible. However, bugs are supposed to be caught in
precommit tests. Since precommit tests run quickly, improving them is a
better option for keeping post-commit tests green.


-Rui





On Wed, Jul 11, 2018 at 4:32 PM Mikhail Gryzykhin  wrote:

> Hello everyone,
>
> Rui suggested running post-commit tests on every pull request.
>
> Let's use this forked thread to discuss this suggestion.
>
> Best regards,
> --Mikhail
>
>
> On Wed, Jul 11, 2018 at 4:25 PM Mikhail Gryzykhin 
> wrote:
>
>> Hi Rui,
>>
>> I would suggest starting another thread for that discussion.
>>
>> Let's keep this discussion focused on automated JIRA ticket creation.
>>
>> --Mikhail
>>
>>
>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>>
>>> Maybe this is also a good thread to start the discussion of whether we
>>> want to enforce postcommit tests for every PR.
>>>
>>> Can we afford the cost of longer waiting time to catch potential bugs?
>>>
>>> -Rui
>>>
>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
>>> wrote:
>>>
>>>> That's a valid point.
>>>>
>>>> Unfortunately, the JiraTestResultReporter plugin does not have features
>>>> to dynamically assign owners. Additionally, I don't think it is always easy
>>>> to find the proper owner for post-commit tests at first glance, since they
>>>> usually cover a broad spectrum of issues.
>>>>
>>>> My assumption is that we need someone to triage new issues.
>>>>
>>>> Ideally, any contributor who sees a failing test should check
>>>> unassigned tickets and either do triage, or assign them to someone who can.
>>>> I strongly encourage this approach.
>>>>
>>>> We have a couple of other ready-made options to consider:
>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>> created tickets.
>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific user.
>>>> This is configured per Jenkins job. We can utilize this if someone volunteers.
>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>
>>>> --Mikhail
>>>>
>>>>
>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud wrote:
>>>>
>>>>> Hi Mikhail,
>>>>>
>>>>> I like the proposal! Hopefully this can replace the constant stream of
>>>>> build failure emails. I noticed one detail seems to be missing: How will
>>>>> new issues be assigned to the proper owner? Will the tool do this
>>>>> automatically or will we need someone to triage new issues?
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I want to add automatic JIRA ticket creation for failing
>>>>>> post-commit tests.
>>>>>>
>>>>>> I wrote up a design proposal doc with more details on this:
>>>>>>
>>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>>
>>>>>> Quick summary:
>>>>>> I suggest utilizing the JiraTestResultReporter plugin.
>>>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>>>> ask the Infra team to add it.
>>>>>>
>>>>>> Please comment if this approach sounds good to you.
>>>>>>
>>>>>> Best regards,
>>>>>> --Mikhail
>>>>>>
>>>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Mikhail Gryzykhin
@Yifan Zou 

I believe that we should test-drive the system with tickets + PR first and
decide on email notification later. We already have test failure emails
sent to commits@, though I believe most people filter them out or are not
signed up for that list.

The plugin creates only one ticket, and keeps it for recurring test failures.
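
A minimal sketch of the "one ticket, kept for recurring failures" behaviour
described above, in Python. This is not the JiraTestResultReporter
implementation: the Ticket and TicketTracker names, the FAKE- key prefix, and
the in-memory dictionary are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Ticket:
    key: str                 # hypothetical ticket key, e.g. "FAKE-1"
    test_name: str
    resolved: bool = False
    failure_count: int = 1   # how many failures this ticket has absorbed


class TicketTracker:
    """Keeps at most one unresolved ticket per failing test (illustrative only)."""

    def __init__(self) -> None:
        self._latest: Dict[str, Ticket] = {}  # most recent ticket per test name
        self._next_id = 1

    def report_failure(self, test_name: str) -> Ticket:
        ticket = self._latest.get(test_name)
        if ticket is not None and not ticket.resolved:
            # Recurring failure: reuse the open ticket instead of filing a duplicate.
            ticket.failure_count += 1
            return ticket
        # No ticket yet, or the previous one was resolved: file a new one.
        ticket = Ticket(key=f"FAKE-{self._next_id}", test_name=test_name)
        self._next_id += 1
        self._latest[test_name] = ticket
        return ticket
```

With this shape, a test that keeps failing for the same reason only bumps a
counter on the existing ticket; a new ticket is filed only after the previous
one has been resolved.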

@Andrew Pilloud 
Thank you for the suggestion. I'll add it to the design doc.

--Mikhail



On Wed, Jul 11, 2018 at 4:52 PM Yifan Zou  wrote:

> +1 to Andrew's concerns. Leaving the tickets unassigned will cause them
> to be ignored and no action to be taken.
>
> I can see the challenges of ticket assignment. As Mikhail mentioned, the
> plugin does not support dynamic assignment. We would have to implement a
> custom script to determine the assignees and do some tricks in the Jenkins job.
> Also, the post-commit tests usually cover so much that it is
> difficult to find which part was broken and ask the right person to look
> into it within the auto-JIRA process. Some naive thoughts: Are we able to send
> emails to dev@ to ask people to take care of the JIRA issues? Are we
> able to find component leads and ask them to triage the test failure tickets?
>
> Another nitpick. Does the Jenkins job file a JIRA issue on every
> test failure? Sometimes a test fails continuously over a period of time for
> the same reason. In this case, we will get duplicate issues filed by
> Jenkins. I think it would be better to avoid filing an issue if the
> previous one has not been resolved.
>
> Thanks.
> Yifan
>
>
> On Wed, Jul 11, 2018 at 4:37 PM Andrew Pilloud 
> wrote:
>
>> That sounds great. You should add this detail to the doc.
>>
>> On Wed, Jul 11, 2018 at 4:29 PM Mikhail Gryzykhin 
>> wrote:
>>
>>> We already have a component for this purpose: "test-failures". All tickets
>>> created will go to that component. As an option, we can add a link to the
>>> list of open JIRA tickets to the PR template.
>>>
>>> We would also want to create a graph on the dashboard with the number of
>>> unassigned and assigned bugs.
>>>
>>> I believe that we can also add a counter of unassigned bugs to the PR
>>> template. This way it will be easier for everyone to know when there is a
>>> test issue that is not being attended to.
>>>
>>> --Mikhail
>>>
>>>
>>> On Wed, Jul 11, 2018 at 4:24 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> So it sounds like you will want to create a component for untriaged
>>>> issues so they are easy to find. I like the idea of distributing the work
>>>> of triaging post commit failures to new PR authors as a condition of
>>>> merging. I feel like we will just be filling JIRA with spam if the issues
>>>> are automatically created without a plan for triage.
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang wrote:
>>>>
>>>>> Maybe this is also a good thread to start the discussion of whether we
>>>>> want to enforce postcommit tests for every PR.
>>>>>
>>>>> Can we afford the cost of longer waiting time to catch potential bugs?
>>>>>
>>>>> -Rui
>>>>>
>>>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin wrote:
>>>>>
>>>>>> That's a valid point.
>>>>>>
>>>>>> Unfortunately, the JiraTestResultReporter plugin does not have
>>>>>> features to dynamically assign owners. Additionally, I don't think it is
>>>>>> always easy to find the proper owner for post-commit tests at first glance,
>>>>>> since they usually cover a broad spectrum of issues.
>>>>>>
>>>>>> My assumption is that we need someone to triage new issues.
>>>>>>
>>>>>> Ideally, any contributor who sees a failing test should check
>>>>>> unassigned tickets and either do triage, or assign them to someone who
>>>>>> can. I strongly encourage this approach.
>>>>>>
>>>>>> We have a couple of other ready-made options to consider:
>>>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>>>> created tickets.
>>>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific user.
>>>>>> This is configured per Jenkins job. We can utilize this if someone volunteers.
>>>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>>>
>>>>>> --Mikhail
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud wrote:
>>>>>>
>>>>>>> Hi Mikhail,
>>>>>>>
>>>>>>> I like the proposal! Hopefully this can replace the constant stream
>>>>>>> of build failure emails. I noticed one detail seems to be missing: How
>>>>>>> will new issues be assigned to the proper owner? Will the tool do this
>>>>>>> automatically or will we need someone to triage new issues?
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> I want to add automatic JIRA ticket creation for failing
>>>>>>>> post-commit tests.
>>>>>>>>
>>>>>>>> I wrote up a design proposal doc with more details on this:
>>>>>>>>
>>>>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>>>>
>>>>>>>> Quick summary:
>>>>>>>> I suggest to

Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Yifan Zou
+1 to Andrew's concerns. Leaving the tickets unassigned will cause them
to be ignored and no action to be taken.

I can see the challenges of ticket assignment. As Mikhail mentioned, the
plugin does not support dynamic assignment. We would have to implement a
custom script to determine the assignees and do some tricks in the Jenkins job.
Also, the post-commit tests usually cover so much that it is
difficult to find which part was broken and ask the right person to look
into it within the auto-JIRA process. Some naive thoughts: Are we able to send
emails to dev@ to ask people to take care of the JIRA issues? Are we
able to find component leads and ask them to triage the test failure tickets?

Another nitpick. Does the Jenkins job file a JIRA issue on every
test failure? Sometimes a test fails continuously over a period of time for
the same reason. In this case, we will get duplicate issues filed by
Jenkins. I think it would be better to avoid filing an issue if the
previous one has not been resolved.

Thanks.
Yifan


On Wed, Jul 11, 2018 at 4:37 PM Andrew Pilloud  wrote:

> That sounds great. You should add this detail to the doc.
>
> On Wed, Jul 11, 2018 at 4:29 PM Mikhail Gryzykhin 
> wrote:
>
>> We already have a component for this purpose: "test-failures". All tickets
>> created will go to that component. As an option, we can add a link to the
>> list of open JIRA tickets to the PR template.
>>
>> We would also want to create a graph on the dashboard with the number of
>> unassigned and assigned bugs.
>>
>> I believe that we can also add a counter of unassigned bugs to the PR
>> template. This way it will be easier for everyone to know when there is a
>> test issue that is not being attended to.
>>
>> --Mikhail
>>
>>
>> On Wed, Jul 11, 2018 at 4:24 PM Andrew Pilloud 
>> wrote:
>>
>>> So it sounds like you will want to create a component for untriaged
>>> issues so they are easy to find. I like the idea of distributing the work
>>> of triaging post commit failures to new PR authors as a condition of
>>> merging. I feel like we will just be filling JIRA with spam if the issues
>>> are automatically created without a plan for triage.
>>>
>>> Andrew
>>>
>>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>>>
>>>> Maybe this is also a good thread to start the discussion of whether we
>>>> want to enforce postcommit tests for every PR.
>>>>
>>>> Can we afford the cost of longer waiting time to catch potential bugs?
>>>>
>>>> -Rui
>>>>
>>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin wrote:
>>>>
>>>>> That's a valid point.
>>>>>
>>>>> Unfortunately, the JiraTestResultReporter plugin does not have
>>>>> features to dynamically assign owners. Additionally, I don't think it is
>>>>> always easy to find the proper owner for post-commit tests at first glance,
>>>>> since they usually cover a broad spectrum of issues.
>>>>>
>>>>> My assumption is that we need someone to triage new issues.
>>>>>
>>>>> Ideally, any contributor who sees a failing test should check
>>>>> unassigned tickets and either do triage, or assign them to someone who
>>>>> can. I strongly encourage this approach.
>>>>>
>>>>> We have a couple of other ready-made options to consider:
>>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>>> created tickets.
>>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific user.
>>>>> This is configured per Jenkins job. We can utilize this if someone volunteers.
>>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>>
>>>>> --Mikhail
>>>>>
>>>>>
>>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud wrote:
>>>>>
>>>>>> Hi Mikhail,
>>>>>>
>>>>>> I like the proposal! Hopefully this can replace the constant stream
>>>>>> of build failure emails. I noticed one detail seems to be missing: How
>>>>>> will new issues be assigned to the proper owner? Will the tool do this
>>>>>> automatically or will we need someone to triage new issues?
>>>>>>
>>>>>> Andrew
>>>>>>
>>>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I want to add automatic JIRA ticket creation for failing
>>>>>>> post-commit tests.
>>>>>>>
>>>>>>> I wrote up a design proposal doc with more details on this:
>>>>>>>
>>>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>>>
>>>>>>> Quick summary:
>>>>>>> I suggest utilizing the JiraTestResultReporter plugin.
>>>>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>>>>> ask the Infra team to add it.
>>>>>>>
>>>>>>> Please comment if this approach sounds good to you.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> --Mikhail
>>>>>>>
>>>>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Andrew Pilloud
That sounds great. You should add this detail to the doc.

On Wed, Jul 11, 2018 at 4:29 PM Mikhail Gryzykhin  wrote:

> We already have a component for this purpose: "test-failures". All tickets
> created will go to that component. As an option, we can add a link to the
> list of open JIRA tickets to the PR template.
>
> We would also want to create a graph on the dashboard with the number of
> unassigned and assigned bugs.
>
> I believe that we can also add a counter of unassigned bugs to the PR
> template. This way it will be easier for everyone to know when there is a
> test issue that is not being attended to.
>
> --Mikhail
>
>
> On Wed, Jul 11, 2018 at 4:24 PM Andrew Pilloud 
> wrote:
>
>> So it sounds like you will want to create a component for untriaged
>> issues so they are easy to find. I like the idea of distributing the work
>> of triaging post commit failures to new PR authors as a condition of
>> merging. I feel like we will just be filling JIRA with spam if the issues
>> are automatically created without a plan for triage.
>>
>> Andrew
>>
>> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>>
>>> Maybe this is also a good thread to start the discussion of whether we
>>> want to enforce postcommit tests for every PR.
>>>
>>> Can we afford the cost of longer waiting time to catch potential bugs?
>>>
>>> -Rui
>>>
>>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
>>> wrote:
>>>
>>>> That's a valid point.
>>>>
>>>> Unfortunately, the JiraTestResultReporter plugin does not have features
>>>> to dynamically assign owners. Additionally, I don't think it is always easy
>>>> to find the proper owner for post-commit tests at first glance, since they
>>>> usually cover a broad spectrum of issues.
>>>>
>>>> My assumption is that we need someone to triage new issues.
>>>>
>>>> Ideally, any contributor who sees a failing test should check
>>>> unassigned tickets and either do triage, or assign them to someone who can.
>>>> I strongly encourage this approach.
>>>>
>>>> We have a couple of other ready-made options to consider:
>>>> 1. We can configure a JIRA component owner who would be assigned to
>>>> created tickets.
>>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific user.
>>>> This is configured per Jenkins job. We can utilize this if someone volunteers.
>>>> 3. Dynamic assignment will most likely require a custom solution.
>>>>
>>>> --Mikhail
>>>>
>>>>
>>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud wrote:
>>>>
>>>>> Hi Mikhail,
>>>>>
>>>>> I like the proposal! Hopefully this can replace the constant stream of
>>>>> build failure emails. I noticed one detail seems to be missing: How will
>>>>> new issues be assigned to the proper owner? Will the tool do this
>>>>> automatically or will we need someone to triage new issues?
>>>>>
>>>>> Andrew
>>>>>
>>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I want to add automatic JIRA ticket creation for failing
>>>>>> post-commit tests.
>>>>>>
>>>>>> I wrote up a design proposal doc with more details on this:
>>>>>>
>>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>>
>>>>>> Quick summary:
>>>>>> I suggest utilizing the JiraTestResultReporter plugin.
>>>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>>>> ask the Infra team to add it.
>>>>>>
>>>>>> Please comment if this approach sounds good to you.
>>>>>>
>>>>>> Best regards,
>>>>>> --Mikhail
>>>>>>
>>>>>>


Running post-commit tests on every PR

2018-07-11 Thread Mikhail Gryzykhin
Hello everyone,

Rui suggested running post-commit tests on every pull request.

Let's use this forked thread to discuss this suggestion.

Best regards,
--Mikhail


On Wed, Jul 11, 2018 at 4:25 PM Mikhail Gryzykhin  wrote:

> Hi Rui,
>
> I would suggest starting another thread for that discussion.
>
> Let's keep this discussion focused on automated JIRA ticket creation.
>
> --Mikhail
>
>
> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>
>> Maybe this is also a good thread to start the discussion of whether we
>> want to enforce postcommit tests for every PR.
>>
>> Can we afford the cost of longer waiting time to catch potential bugs?
>>
>> -Rui
>>
>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
>> wrote:
>>
>>> That's a valid point.
>>>
>>> Unfortunately, the JiraTestResultReporter plugin does not have features
>>> to dynamically assign owners. Additionally, I don't think it is always easy
>>> to find the proper owner for post-commit tests at first glance, since they
>>> usually cover a broad spectrum of issues.
>>>
>>> My assumption is that we need someone to triage new issues.
>>>
>>> Ideally, any contributor who sees a failing test should check unassigned
>>> tickets and either do triage, or assign them to someone who can. I strongly
>>> encourage this approach.
>>>
>>> We have a couple of other ready-made options to consider:
>>> 1. We can configure a JIRA component owner who would be assigned to
>>> created tickets.
>>> 2. The JiraTestResultReporter plugin can assign tickets to a specific user.
>>> This is configured per Jenkins job. We can utilize this if someone volunteers.
>>> 3. Dynamic assignment will most likely require a custom solution.
>>>
>>> --Mikhail
>>>
>>>
>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> Hi Mikhail,
>>>>
>>>> I like the proposal! Hopefully this can replace the constant stream of
>>>> build failure emails. I noticed one detail seems to be missing: How will
>>>> new issues be assigned to the proper owner? Will the tool do this
>>>> automatically or will we need someone to triage new issues?
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I want to add automatic JIRA ticket creation for failing
>>>>> post-commit tests.
>>>>>
>>>>> I wrote up a design proposal doc with more details on this:
>>>>>
>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>
>>>>> Quick summary:
>>>>> I suggest utilizing the JiraTestResultReporter plugin.
>>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>>> ask the Infra team to add it.
>>>>>
>>>>> Please comment if this approach sounds good to you.
>>>>>
>>>>> Best regards,
>>>>> --Mikhail
>>>>>
>>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Mikhail Gryzykhin
We already have a component for this purpose: "test-failures". All tickets
created will go to that component. As an option, we can add a link to the
list of open JIRA tickets to the PR template.

We would also want to create a graph on the dashboard with the number of
unassigned and assigned bugs.

I believe that we can also add a counter of unassigned bugs to the PR template.
This way it will be easier for everyone to know when there is a test
issue that is not being attended to.
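
A rough sketch, in Python, of how the assigned/unassigned counts for such a
dashboard graph or PR-template counter could be computed. The ticket
dictionaries and the "assignee" field name are assumptions for illustration
only; this does not use or describe the JIRA API.

```python
from typing import Dict, Iterable, Mapping, Optional


def count_by_assignment(
    tickets: Iterable[Mapping[str, Optional[str]]],
) -> Dict[str, int]:
    """Count open tickets by whether they have an assignee.

    Each ticket is a mapping with a hypothetical "assignee" field; a missing,
    None, or empty value is treated as unassigned.
    """
    counts = {"assigned": 0, "unassigned": 0}
    for ticket in tickets:
        if ticket.get("assignee"):
            counts["assigned"] += 1
        else:
            counts["unassigned"] += 1
    return counts
```

The "unassigned" value is the number a PR-template counter would surface, and
both values together give the two series for a dashboard graph.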

--Mikhail


On Wed, Jul 11, 2018 at 4:24 PM Andrew Pilloud  wrote:

> So it sounds like you will want to create a component for untriaged issues
> so they are easy to find. I like the idea of distributing the work of
> triaging post commit failures to new PR authors as a condition of merging.
> I feel like we will just be filling JIRA with spam if the issues are
> automatically created without a plan for triage.
>
> Andrew
>
> On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:
>
>> Maybe this is also a good thread to start the discussion that if we want
>> to enforce postcommit test for every PR.
>>
>> Can we afford the cost of longer waiting time to catch potential bugs?
>>
>> -Rui
>>
>> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
>> wrote:
>>
>>> That's a valid point.
>>>
>>> Unfortunately, the JiraTestResultReporter plugin does not have features
>>> to dynamically assign owners. Additionally, I don't think it is always easy
>>> to find proper owner for post-commit tests at first glance, since they
>>> usually cover broad specter of issues.
>>>
>>> My assumption is that we need someone to triage new issues.
>>>
>>> Ideally, any contributor, who sees failing test, should check unassigned
>>> tickets and either do triage, or assign them to someone who can. I strongly
>>> encourage this approach.
>>>
>>> We have couple other ready-made options to consider:
>>> 1. We can configure JIRA component owner who would be assigned to
>>> created tickets.
>>> 2. JiraTestReporterPlugin can assign tickets to specific user. This is
>>> configured per Jenkins job. We can utilize this if someone volunteers.
>>> 3. Dynamic assignment will most likely require custom solution.
>>>
>>> --Mikhail
>>>
>>>
>>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud 
>>> wrote:
>>>
>>>> Hi Mikhail,
>>>>
>>>> I like the proposal! Hopefully this can replace the constant stream of
>>>> build failure emails. I noticed one detail seems to be missing:  How will
>>>> new issues be assigned to the proper owner? Will the tool do this
>>>> automatically or will we need someone to triage new issues?
>>>>
>>>> Andrew
>>>>
>>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin 
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I want to add an automatic JIRA tickets creation for failing
>>>>> post-commit tests.
>>>>>
>>>>> I wrote up design proposal doc with more details on this:
>>>>>
>>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>>
>>>>> Quick summary:
>>>>> I suggest to utilize JiraTestResultReporter plugin.
>>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>>> request to Infra team to add it.
>>>>>
>>>>> Please, comment if this approach sounds good to you.
>>>>>
>>>>> Best regards,
>>>>> --Mikhail
>>>>>
>>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Mikhail Gryzykhin
Hi Rui,

I would suggest starting another thread for that discussion.

Let's keep this discussion focused on automated JIRA ticket creation.

--Mikhail


On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:

> Maybe this is also a good thread to start the discussion that if we want
> to enforce postcommit test for every PR.
>
> Can we afford the cost of longer waiting time to catch potential bugs?
>
> -Rui
>
> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
> wrote:
>
>> That's a valid point.
>>
>> Unfortunately, the JiraTestResultReporter plugin does not have features
>> to dynamically assign owners. Additionally, I don't think it is always easy
>> to find proper owner for post-commit tests at first glance, since they
>> usually cover broad specter of issues.
>>
>> My assumption is that we need someone to triage new issues.
>>
>> Ideally, any contributor, who sees failing test, should check unassigned
>> tickets and either do triage, or assign them to someone who can. I strongly
>> encourage this approach.
>>
>> We have couple other ready-made options to consider:
>> 1. We can configure JIRA component owner who would be assigned to created
>> tickets.
>> 2. JiraTestReporterPlugin can assign tickets to specific user. This is
>> configured per Jenkins job. We can utilize this if someone volunteers.
>> 3. Dynamic assignment will most likely require custom solution.
>>
>> --Mikhail
>>
>>
>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud 
>> wrote:
>>
>>> Hi Mikhail,
>>>
>>> I like the proposal! Hopefully this can replace the constant stream of
>>> build failure emails. I noticed one detail seems to be missing:  How will
>>> new issues be assigned to the proper owner? Will the tool do this
>>> automatically or will we need someone to triage new issues?
>>>
>>> Andrew
>>>
>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin 
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I want to add an automatic JIRA tickets creation for failing
>>>> post-commit tests.
>>>>
>>>> I wrote up design proposal doc with more details on this:
>>>>
>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>
>>>> Quick summary:
>>>> I suggest to utilize JiraTestResultReporter plugin.
>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>> request to Infra team to add it.
>>>>
>>>> Please, comment if this approach sounds good to you.
>>>>
>>>> Best regards,
>>>> --Mikhail
>>>>
>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Andrew Pilloud
So it sounds like you will want to create a component for untriaged issues
so they are easy to find. I like the idea of distributing the work of
triaging post commit failures to new PR authors as a condition of merging.
I feel like we will just be filling JIRA with spam if the issues are
automatically created without a plan for triage.

Andrew

On Wed, Jul 11, 2018 at 4:12 PM Rui Wang  wrote:

> Maybe this is also a good thread to start the discussion that if we want
> to enforce postcommit test for every PR.
>
> Can we afford the cost of longer waiting time to catch potential bugs?
>
> -Rui
>
> On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin 
> wrote:
>
>> That's a valid point.
>>
>> Unfortunately, the JiraTestResultReporter plugin does not have features
>> to dynamically assign owners. Additionally, I don't think it is always easy
>> to find proper owner for post-commit tests at first glance, since they
>> usually cover broad specter of issues.
>>
>> My assumption is that we need someone to triage new issues.
>>
>> Ideally, any contributor, who sees failing test, should check unassigned
>> tickets and either do triage, or assign them to someone who can. I strongly
>> encourage this approach.
>>
>> We have couple other ready-made options to consider:
>> 1. We can configure JIRA component owner who would be assigned to created
>> tickets.
>> 2. JiraTestReporterPlugin can assign tickets to specific user. This is
>> configured per Jenkins job. We can utilize this if someone volunteers.
>> 3. Dynamic assignment will most likely require custom solution.
>>
>> --Mikhail
>>
>>
>> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud 
>> wrote:
>>
>>> Hi Mikhail,
>>>
>>> I like the proposal! Hopefully this can replace the constant stream of
>>> build failure emails. I noticed one detail seems to be missing:  How will
>>> new issues be assigned to the proper owner? Will the tool do this
>>> automatically or will we need someone to triage new issues?
>>>
>>> Andrew
>>>
>>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin 
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I want to add an automatic JIRA tickets creation for failing
>>>> post-commit tests.
>>>>
>>>> I wrote up design proposal doc with more details on this:
>>>>
>>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>>
>>>> Quick summary:
>>>> I suggest to utilize JiraTestResultReporter plugin.
>>>> Since this plugin is not installed on our Jenkins yet, we have to
>>>> request to Infra team to add it.
>>>>
>>>> Please, comment if this approach sounds good to you.
>>>>
>>>> Best regards,
>>>> --Mikhail
>>>>
>>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Rui Wang
Maybe this is also a good thread to start the discussion of whether we want
to enforce post-commit tests for every PR.

Can we afford the cost of longer waiting time to catch potential bugs?

-Rui

On Wed, Jul 11, 2018 at 4:04 PM Mikhail Gryzykhin  wrote:

> That's a valid point.
>
> Unfortunately, the JiraTestResultReporter plugin does not have features to
> dynamically assign owners. Additionally, I don't think it is always easy to
> find proper owner for post-commit tests at first glance, since they usually
> cover broad specter of issues.
>
> My assumption is that we need someone to triage new issues.
>
> Ideally, any contributor, who sees failing test, should check unassigned
> tickets and either do triage, or assign them to someone who can. I strongly
> encourage this approach.
>
> We have couple other ready-made options to consider:
> 1. We can configure JIRA component owner who would be assigned to created
> tickets.
> 2. JiraTestReporterPlugin can assign tickets to specific user. This is
> configured per Jenkins job. We can utilize this if someone volunteers.
> 3. Dynamic assignment will most likely require custom solution.
>
> --Mikhail
>
>
> On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud 
> wrote:
>
>> Hi Mikhail,
>>
>> I like the proposal! Hopefully this can replace the constant stream of
>> build failure emails. I noticed one detail seems to be missing:  How will
>> new issues be assigned to the proper owner? Will the tool do this
>> automatically or will we need someone to triage new issues?
>>
>> Andrew
>>
>> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I want to add an automatic JIRA tickets creation for failing post-commit
>>> tests.
>>>
>>> I wrote up design proposal doc with more details on this:
>>>
>>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>>
>>> Quick summary:
>>> I suggest to utilize JiraTestResultReporter plugin.
>>> Since this plugin is not installed on our Jenkins yet, we have to
>>> request to Infra team to add it.
>>>
>>> Please, comment if this approach sounds good to you.
>>>
>>> Best regards,
>>> --Mikhail
>>>
>>>


Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Mikhail Gryzykhin
That's a valid point.

Unfortunately, the JiraTestResultReporter plugin does not have features to
dynamically assign owners. Additionally, I don't think it is always easy to
find the proper owner for post-commit tests at first glance, since they usually
cover a broad spectrum of issues.

My assumption is that we need someone to triage new issues.

Ideally, any contributor who sees a failing test should check the unassigned
tickets and either triage them or assign them to someone who can. I strongly
encourage this approach.

We have a couple of other ready-made options to consider:
1. We can configure a JIRA component owner who would be assigned to created
tickets.
2. The JiraTestResultReporter plugin can assign tickets to a specific user.
This is configured per Jenkins job. We can utilize this if someone volunteers.
3. Dynamic assignment will most likely require a custom solution.
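
To illustrate how the first two options could combine, here is a rough sketch
of the decision order only. It is not the plugin's configuration format, and
the job and user names are hypothetical: a per-job volunteer wins, otherwise
the component owner is used, otherwise the ticket stays unassigned and waits
for manual triage.

```python
def pick_assignee(job_name, job_assignees, component_owner=None):
    """Choose an assignee for a ticket created by a failing job.

    Preference order: a volunteer configured for this job, then the
    component owner, then None (the ticket stays unassigned).
    """
    if job_name in job_assignees:
        return job_assignees[job_name]
    return component_owner


# Hypothetical job and user names, for illustration only.
job_assignees = {"example_postcommit_job": "volunteer_a"}

print(pick_assignee("example_postcommit_job", job_assignees, "owner_b"))
print(pick_assignee("another_job", job_assignees, "owner_b"))
print(pick_assignee("another_job", job_assignees))
```

Anything smarter than this lookup, such as guessing an owner from the failing
test, falls under option 3.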

--Mikhail


On Wed, Jul 11, 2018 at 3:34 PM Andrew Pilloud  wrote:

> Hi Mikhail,
>
> I like the proposal! Hopefully this can replace the constant stream of
> build failure emails. I noticed one detail seems to be missing:  How will
> new issues be assigned to the proper owner? Will the tool do this
> automatically or will we need someone to triage new issues?
>
> Andrew
>
> On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin 
> wrote:
>
>> Hi everyone,
>>
>> I want to add an automatic JIRA tickets creation for failing post-commit
>> tests.
>>
>> I wrote up design proposal doc with more details on this:
>>
>> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>>
>> Quick summary:
>> I suggest to utilize JiraTestResultReporter plugin.
>> Since this plugin is not installed on our Jenkins yet, we have to request
>> to Infra team to add it.
>>
>> Please, comment if this approach sounds good to you.
>>
>> Best regards,
>> --Mikhail
>>
>>


Re: Vendoring / Shading Protobuf and gRPC

2018-07-11 Thread Andrew Pilloud
We discussed this in person; it sounds like my issue is known and will be
fixed shortly. I'm running builds with '-Ppublishing' because I need to
generate release artifacts for bundling the Beam SQL shell with the Google
Cloud SDK. I hope to eventually just use the Beam release, but we are
currently cutting a release off master every week to quickly iterate on bug
fixes.

Andrew

On Wed, Jul 11, 2018 at 1:39 PM Lukasz Cwik  wrote:

> Andrew, to my knowledge it seems as though your running into BEAM-4744, is
> there a reason you need to specify -Ppublishing?
>
> No particular reason to using ByteString within ByteKey and TextSource.
> Note that we currently do shade away protobuf in sdks/java/core so we could
> either migrate to using a vendored version or re-implement the
> functionality to not use ByteString. Note that sdks/java/core can now
> dependend on the model/* classes and perform the Pipeline -> Proto
> translation as this will be needed to support portability efforts so I
> would prefer just migrating to use the vendored versions of the code. Filed
> BEAM-4766.
>
> As for the IO module, I was referring to the upstream
> bigtable/bigquery/... libraries vended by Google. If they trimmed their API
> surface to not expose gRPC or protobuf, then we wouldn't have to worry
> about having the shading logic within sdks/java/io/google-cloud-platform. I
> know that this will be impossible for some connectors without backwards
> incompatible changes since they exposed protobuf on their API surface. I
> know that Chamikara was looking to shade this away in the
> sdks/java/io/google-cloud-platform but only had limited success in the past.
>
> On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía  wrote:
>
>> This is great news in particular for runners (Spark) where the leaking of
>> some grpc subdependencies caused stability issues and required extra
>> shading. Great !
>>
>> About the other modules
>>
>> > Note, these are the following modules that still depend on protobuf
>> that are shaded away and could move to use a vendored variant of protobuf:
>> > * sdks/java/core
>> > * sdks/java/extensions/sql
>>
>> For sdks/java/core the dependency in protobuf seems to be minor, from a
>> quick look it seems that it is only used to import ByteString in two
>> classes: ByteKey and TextSource so hopefully we can rewrite both and get
>> rid of the dependency altogether (making core smaller which is always a
>> win).
>> Can we fill a JIRA for this or do I miss other reasons to depend on
>> protobuf in core?
>>
>> For sdks/java/extensions/sql I don’t know if I am missing something, but
>> I don’t see any code use of protobuf and I doubt that calcite uses protobuf
>> so maybe it is there just because it was leaking from somewhere else in
>> Beam, we should better check this first.
>>
>> > These modules expose protobuf because it is part of the API surface:
>> > * sdks/java/extensions/protobuf
>> > * sdks/java/io/google-cloud-platform (I believe that gRPC could be
>> shaded here but preferrably the IO module would do it so we wouldn't have
>> this maintenance burden.)
>>
>> Can you please elaborate on ‘but preferrably the IO module would do it so
>> we wouldn't have this maintenance burden’. I remember there was an issue
>> when running the examples in the spark runner examples because of
>> sdks/java/io/google-cloud-platform leaking netty via gRPC (BEAM-3519) [Note
>> that this is hidden at this moment because of pure luck Spark 2.3.x and
>> Beam are aligned on netty version but this can change in the future so
>> hopefully this can be shaded/controlled].
>>
>> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud 
>> wrote:
>>
>>> This is really cool and should cut down our artifact size significantly!
>>> Thanks Luke!
>>>
>>> I am running into one issue after this: builds with the publishing flag
>>> no longer work. (We run './gradlew -Ppublishing shadowJar' to generate
>>> release artifacts for the Beam SQL shell.) I get a bunch of errors like
>>> this:
>>>
>>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148:
>>> error: no suitable method found for
>>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser,ExtensionRegistryLite)
>>>
>>> Is there something I need to change in my build?
>>>
>>> Andrew
>>>
>>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik  wrote:
>>>
>>>> With the merge of PR #5594[1], we started shading all gRPC / Protobuf
>>>> dependencies within all the modules that depended on the model/*
>>>> dependencies by vendoring them. The vendored versions are built and
>>>> packaged into the model jars (they should be separated out once I figure
>>>> out how to generate proto code using a shaded import path). Note that this
>>>> cleaned up several issues where we were incorrectly built shaded jars
>>>> without repackaging in some locations or the shading process was corrupting
>>>> the contents of some of the jars.
>>>>
>>>> Note that the

Re: Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Andrew Pilloud
Hi Mikhail,

I like the proposal! Hopefully this can replace the constant stream of
build failure emails. I noticed one detail seems to be missing:  How will
new issues be assigned to the proper owner? Will the tool do this
automatically or will we need someone to triage new issues?

Andrew

On Wed, Jul 11, 2018 at 3:07 PM Mikhail Gryzykhin  wrote:

> Hi everyone,
>
> I want to add an automatic JIRA tickets creation for failing post-commit
> tests.
>
> I wrote up design proposal doc with more details on this:
>
> https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI
>
> Quick summary:
> I suggest to utilize JiraTestResultReporter plugin.
> Since this plugin is not installed on our Jenkins yet, we have to request
> to Infra team to add it.
>
> Please, comment if this approach sounds good to you.
>
> Best regards,
> --Mikhail
>
>


Automatically create JIRA tickets for failing post-commit tests

2018-07-11 Thread Mikhail Gryzykhin
Hi everyone,

I want to add automatic JIRA ticket creation for failing post-commit
tests.

I wrote up a design proposal doc with more details on this:
https://docs.google.com/document/d/1kpsTy0sYJkLqlZvkPalDkqzBbpu-Wug0z-oWIVPo6UI

Quick summary:
I suggest using the JiraTestResultReporter plugin.
Since this plugin is not installed on our Jenkins yet, we have to ask the
Infra team to add it.

Please comment if this approach sounds good to you.

Best regards,
--Mikhail


Re: Vendoring / Shading Protobuf and gRPC

2018-07-11 Thread Lukasz Cwik
Andrew, to my knowledge it seems as though you're running into BEAM-4744. Is
there a reason you need to specify -Ppublishing?

There is no particular reason for using ByteString within ByteKey and
TextSource. Note that we currently do shade away protobuf in sdks/java/core, so
we could either migrate to using a vendored version or re-implement the
functionality to not use ByteString. Note that sdks/java/core can now
depend on the model/* classes and perform the Pipeline -> Proto
translation, as this will be needed to support portability efforts, so I
would prefer just migrating to use the vendored versions of the code. Filed
BEAM-4766.

As for the IO module, I was referring to the upstream bigtable/bigquery/...
libraries vended by Google. If they trimmed their API surface to not expose
gRPC or protobuf, then we wouldn't have to worry about having the shading
logic within sdks/java/io/google-cloud-platform. I know that this will be
impossible for some connectors without backwards incompatible changes since
they exposed protobuf on their API surface. I know that Chamikara was
looking to shade this away in the sdks/java/io/google-cloud-platform but
only had limited success in the past.
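
For readers less familiar with vendoring, here is a small sketch of the
package relocation it implies. The vendored prefix is taken from the class
name in Andrew's error message earlier in this thread; the mapping table and
the helper function are illustrative assumptions, not the actual Gradle
shading configuration.

```python
# Original package -> vendored package. Only the protobuf prefix is shown,
# because its full vendored path appears in this thread.
RELOCATIONS = {
    "com.google.protobuf": "org.apache.beam.vendor.protobuf.v3.com.google.protobuf",
}


def relocate(class_name, relocations=RELOCATIONS):
    """Return the vendored name of a class, or the name unchanged if no rule matches."""
    for original, vendored in relocations.items():
        # Match the package itself or anything below it, but not look-alike
        # packages such as "com.google.protobufx".
        if class_name == original or class_name.startswith(original + "."):
            return vendored + class_name[len(original):]
    return class_name


print(relocate("com.google.protobuf.Parser"))
print(relocate("java.util.List"))
```

Code that imports the original package and code that imports the relocated
one see two unrelated classes, which is why a module has to use one or the
other consistently.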

On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía  wrote:

> This is great news in particular for runners (Spark) where the leaking of
> some grpc subdependencies caused stability issues and required extra
> shading. Great !
>
> About the other modules
>
> > Note, these are the following modules that still depend on protobuf that
> are shaded away and could move to use a vendored variant of protobuf:
> > * sdks/java/core
> > * sdks/java/extensions/sql
>
> For sdks/java/core the dependency in protobuf seems to be minor, from a
> quick look it seems that it is only used to import ByteString in two
> classes: ByteKey and TextSource so hopefully we can rewrite both and get
> rid of the dependency altogether (making core smaller which is always a
> win).
> Can we fill a JIRA for this or do I miss other reasons to depend on
> protobuf in core?
>
> For sdks/java/extensions/sql I don’t know if I am missing something, but I
> don’t see any code use of protobuf and I doubt that calcite uses protobuf
> so maybe it is there just because it was leaking from somewhere else in
> Beam, we should better check this first.
>
> > These modules expose protobuf because it is part of the API surface:
> > * sdks/java/extensions/protobuf
> > * sdks/java/io/google-cloud-platform (I believe that gRPC could be
> shaded here but preferrably the IO module would do it so we wouldn't have
> this maintenance burden.)
>
> Can you please elaborate on ‘but preferrably the IO module would do it so
> we wouldn't have this maintenance burden’. I remember there was an issue
> when running the examples in the spark runner examples because of
> sdks/java/io/google-cloud-platform leaking netty via gRPC (BEAM-3519) [Note
> that this is hidden at this moment because of pure luck Spark 2.3.x and
> Beam are aligned on netty version but this can change in the future so
> hopefully this can be shaded/controlled].
>
> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud 
> wrote:
>
>> This is really cool and should cut down our artifact size significantly!
>> Thanks Luke!
>>
>> I am running into one issue after this: builds with the publishing flag
>> no longer work. (We run './gradlew -Ppublishing shadowJar' to generate
>> release artifacts for the Beam SQL shell.) I get a bunch of errors like
>> this:
>>
>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148:
>> error: no suitable method found for
>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser,ExtensionRegistryLite)
>>
>> Is there something I need to change in my build?
>>
>> Andrew
>>
>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik  wrote:
>>
>>> With the merge of PR #5594[1], we started shading all gRPC / Protobuf
>>> dependencies within all the modules that depended on the model/*
>>> dependencies by vendoring them. The vendored versions are built and
>>> packaged into the model jars (they should be separated out once I figure
>>> out how to generate proto code using a shaded import path). Note that this
>>> cleaned up several issues where we were incorrectly built shaded jars
>>> without repackaging in some locations or the shading process was corrupting
>>> the contents of some of the jars.
>>>
>>> Note that the majority of the code base (especially related to
>>> portability) should be using imports under the
>>> org.apache.beam.vendor.protobuf.v3 or org.apache.beam.vendor.grpc.v1 paths.
>>> I have yet to figure out a clean way to get Intellij to recognize these
>>> vendored paths. My only solution so far has been to manually add one of the
>>> built model jars to the compile classpath of the module being worked on in
>>> Intellij as described here[2]. I would greatly appreciate some ideas on how
>>> to improve this integration because from a few 

Re: [PROPOSAL] Prepare Beam 2.6.0 release

2018-07-11 Thread Alan Myrvold
+1 Thanks for volunteering, Pablo

On Wed, Jul 11, 2018 at 11:49 AM Jason Kuster 
wrote:

> +1 sounds great
>
> On Wed, Jul 11, 2018 at 11:06 AM Thomas Weise  wrote:
>
>> +1
>>
>> Thanks for volunteering, Pablo!
>>
>> On Mon, Jul 9, 2018 at 9:56 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> +1
>>>
>>> I planned to send the proposal as well ;)
>>>
>>> Regards
>>> JB
>>>
>>> On 09/07/2018 23:16, Pablo Estrada wrote:
>>> > Hello everyone!
>>> >
>>> > As per the previously agreed-upon schedule for Beam releases, the
>>> > process for the 2.6.0 Beam release should start on July 17th.
>>> >
>>> > I volunteer to perform this release.
>>> >
>>> > Here is the schedule that I have in mind:
>>> >
>>> > - We start triaging JIRA issues this week.
>>> > - I will cut a release branch on July 17.
>>> > - After July 17, any blockers will need to be cherry-picked into the
>>> > release branch.
>>> > - As soon as tests look good, and blockers have been addressed, I will
>>> > perform the other release tasks.
>>> >
>>> > Does that seem reasonable to the community?
>>> >
>>> > Best
>>> > -P.
>>> > --
>>> > Got feedback? go/pabloem-feedback
>>> 
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
>>
>
> --
> ---
> Jason Kuster
> Apache Beam / Google Cloud Dataflow
>
> See something? Say something. go/jasonkuster-feedback
> 
>


Re: Broken seed job

2018-07-11 Thread Alan Myrvold
The outage likely fixed it.
I would have expected that running the standalone job with the trigger phrase
would have fixed it. Did it not?

On Wed, Jul 11, 2018 at 8:45 AM Lukasz Cwik  wrote:

> I believe there was an outage a few hours ago.
>
> On Wed, Jul 11, 2018 at 8:36 AM Łukasz Gajowy 
> wrote:
>
>> It's totally fine. The problem is now gone even though I have taken no
>> action to fix it. I suppose Jenkins was restarted?
>>
>> BTW: Is it restarted only on demand by those who have access, or it's
>> done once in a while (periodically)?
>>
>> On Wed, Jul 11, 2018 at 5:07 PM Lukasz Cwik wrote:
>>
>>> Ah, sorry for my confusion.
>>>
>>> On Tue, Jul 10, 2018 at 4:59 PM Łukasz Gajowy 
>>> wrote:
>>>
>>>> I didn't edit the "Standalone Seed job", only the "SeedJob". Now every
>>>> time someone tries to run the seed job ("Run seed job") it results in an
>>>> error even despite prior running the standalone job from master branch the
>>>> way you described.
>>>>
>>>>
>>>>
>>>> On Wed, Jul 11, 2018 at 1:25 AM Lukasz Cwik wrote:
>>>>
>>>>> job_seed_standalone should only be edited when we know that the
>>>>> regular seed job is in a healthy state.
>>>>>
>>>>> Note, that you can always recover back to what is checked in master by:
>>>>> 1) Creating an empty PR
>>>>> 2) Using the standalone seed job trigger phrase: "Run Standalone Seed
>>>>> Job"
>>>>>
>>>>> I kicked one off right now:
>>>>>
>>>>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob_Standalone/1289/
>>>>>
>>>>> On Tue, Jul 10, 2018 at 4:17 PM Łukasz Gajowy 
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> while working on Jenkins' seed job, some changes I introduced didn't
>>>>>> get reverted even after running the seed job again from the
>>>>>> master branch. This is why seed job is now failing. More details here [1]
>>>>>> and here [2].
>>>>>>
>>>>>> Since I can operate only with jobs phrase-triggered from GitHub's PR,
>>>>>> I think there's nothing more I can do than I already tried. In my
>>>>>> opinion, fixing this issue requires an aid of a person with some greater
>>>>>> access to Jenkins. Can someone help with that?
>>>>>>
>>>>>> Sorry for the inconvenience - I didn't expect that such situation can
>>>>>> occur. "job_seed_standalone" works fine and it can be used instead
>>>>>> (until the issue is fixed).
>>>>>>
>>>>>> [1] https://github.com/apache/beam/pull/5915
>>>>>> [2]
>>>>>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/2190/console
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Łukasz
>>>>>>
>>>>>


Re: [PROPOSAL] Prepare Beam 2.6.0 release

2018-07-11 Thread Thomas Weise
+1

Thanks for volunteering, Pablo!

On Mon, Jul 9, 2018 at 9:56 PM Jean-Baptiste Onofré  wrote:

> +1
>
> I planned to send the proposal as well ;)
>
> Regards
> JB
>
> On 09/07/2018 23:16, Pablo Estrada wrote:
> > Hello everyone!
> >
> > As per the previously agreed-upon schedule for Beam releases, the
> > process for the 2.6.0 Beam release should start on July 17th.
> >
> > I volunteer to perform this release.
> >
> > Here is the schedule that I have in mind:
> >
> > - We start triaging JIRA issues this week.
> > - I will cut a release branch on July 17.
> > - After July 17, any blockers will need to be cherry-picked into the
> > release branch.
> > - As soon as tests look good, and blockers have been addressed, I will
> > perform the other release tasks.
> >
> > Does that seem reasonable to the community?
> >
> > Best
> > -P.
> > --
> > Got feedback? go/pabloem-feedback
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Proposal: Apache Beam Contributor Metrics: Collection, Display, Actions

2018-07-11 Thread Alan Myrvold
I am proposing an HTML dashboard for some key contributor metrics,
initially the weekly passing rate of post-commit tests and the speed of
pre-commit tests, then expanding to code review and release metrics.

https://s.apache.org/beam-contributor-metrics

Comments welcome.

Alan


Re: Broken seed job

2018-07-11 Thread Łukasz Gajowy
It's totally fine. The problem is now gone even though I have taken no
action to fix it. I suppose Jenkins was restarted?

BTW: Is it restarted only on demand by those who have access, or is it done
once in a while (periodically)?

On Wed, Jul 11, 2018 at 5:07 PM Lukasz Cwik wrote:

> Ah, sorry for my confusion.
>
> On Tue, Jul 10, 2018 at 4:59 PM Łukasz Gajowy 
> wrote:
>
>> I didn't edit the "Standalone Seed job", only the "SeedJob". Now every
>> time someone tries to run the seed job ("Run seed job") it results in an
>> error even despite prior running the standalone job from master branch the
>> way you described.
>>
>>
>>
>> On Wed, Jul 11, 2018 at 1:25 AM Lukasz Cwik wrote:
>>
>>> job_seed_standalone should only be edited when we know that the regular
>>> seed job is in a healthy state.
>>>
>>> Note, that you can always recover back to what is checked in master by:
>>> 1) Creating an empty PR
>>> 2) Using the standalone seed job trigger phrase: "Run Standalone Seed
>>> Job"
>>>
>>> I kicked one off right now:
>>>
>>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob_Standalone/1289/
>>>
>>> On Tue, Jul 10, 2018 at 4:17 PM Łukasz Gajowy 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> while working on Jenkins' seed job, some changes I introduced didn't
>>>> get reverted even after running the seed job again from the
>>>> master branch. This is why seed job is now failing. More details here [1]
>>>> and here [2].
>>>>
>>>> Since I can operate only with jobs phrase-triggered from GitHub's PR,
>>>> I think there's nothing more I can do than I already tried. In my
>>>> opinion, fixing this issue requires an aid of a person with some greater
>>>> access to Jenkins. Can someone help with that?
>>>>
>>>> Sorry for the inconvenience - I didn't expect that such situation can
>>>> occur. "job_seed_standalone" works fine and it can be used instead
>>>> (until the issue is fixed).
>>>>
>>>> [1] https://github.com/apache/beam/pull/5915
>>>> [2]
>>>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/2190/console
>>>>
>>>>
>>>> Best regards,
>>>> Łukasz
>>>>
>>>


Re: Broken seed job

2018-07-11 Thread Lukasz Cwik
Ah, sorry for my confusion.

On Tue, Jul 10, 2018 at 4:59 PM Łukasz Gajowy 
wrote:

> I didn't edit the "Standalone Seed job", only the "SeedJob". Now every
> time someone tries to run the seed job ("Run seed job") it results in an
> error even despite prior running the standalone job from master branch the
> way you described.
>
>
>
> On Wed, Jul 11, 2018 at 1:25 AM Lukasz Cwik wrote:
>
>> job_seed_standalone should only be edited when we know that the regular
>> seed job is in a healthy state.
>>
>> Note, that you can always recover back to what is checked in master by:
>> 1) Creating an empty PR
>> 2) Using the standalone seed job trigger phrase: "Run Standalone Seed Job"
>>
>> I kicked one off right now:
>>
>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob_Standalone/1289/
>>
>> On Tue, Jul 10, 2018 at 4:17 PM Łukasz Gajowy 
>> wrote:
>>
>>> Hi,
>>>
>>> while working on Jenkins' seed job, some changes I introduced didn't get
>>> reverted even after running the seed job again from the master branch. This
>>> is why seed job is now failing. More details here [1] and here [2].
>>>
>>> Since I can only operate jobs that are phrase-triggered from a GitHub PR, I
>>> think there's nothing more I can do beyond what I have already tried. In my
>>> opinion, fixing this issue requires help from someone with greater
>>> access to Jenkins. Can someone help with that?
>>>
>>> Sorry for the inconvenience - I didn't expect that such a situation could
>>> occur. "job_seed_standalone" works fine and can be used instead
>>> (until the issue is fixed).
>>>
>>> [1] https://github.com/apache/beam/pull/5915
>>> [2]
>>> https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/2190/console
>>>
>>>
>>> Best regards,
>>> Łukasz
>>>
>>


Re: The full list of proposals / prototype documents

2018-07-11 Thread Alexey Romanenko
Thank you for this link, Etienne. 
I agree that it doesn’t fit well on the design documents page. So, I think it
makes sense to add it either to the wiki or as part of the Nexmark documentation
on the web site: https://beam.apache.org/documentation/sdks/java/nexmark/


On 11 Jul 2018, at 11:16, Etienne Chauchot  wrote:
> 
> Alexey,
> 
> One doc that can be interesting that I forgot to point out is 
> https://docs.google.com/document/d/1VgnGiVu8vSfm7Et-xAtQYv0PlEpqeyfmhpQUNPmWRJs/edit?usp=sharing
>  
> 
> It is the doc I wrote when I submitted Nexmark PR to ease the reading of the 
> code.
> It is not a design doc, I don't know if it belongs to the website page or to 
> the wiki for beam devs.
> 
> Etienne
> 
> 
> On Wednesday, 06 June 2018 at 17:48 +0200, Alexey Romanenko wrote:
>> FYI: Finally, it was merged and you can find this page here:
>> https://beam.apache.org/contribute/design-documents/ 
>> 
>> 
>> Thank you everybody who helped me to compile this list! 
>> I’ll do my best to keep this updated with new docs as they come in. At the same
>> time, please feel free to add your new docs (or notify me if I missed any)
>> once they are finished and ready to be published.
>> 
>> WBR,
>> Alexey
>> 
>>> On 31 May 2018, at 18:52, Eugene Kirpichov wrote:
>>> 
>>> Thank you!
>>> 
>>> On Thu, May 31, 2018 at 8:30 AM Alexey Romanenko wrote:
 Thank you everybody for provided links. I collected all of them (please, 
 correct me if I missed something), categorized and created a dedicated 
 page for Beam website.
 
 Here is a PR for that (please, review):
 https://github.com/apache/beam-site/pull/456 
 
 
 WBR,
 Alexey
 
> On 30 May 2018, at 13:17, Łukasz Gajowy wrote:
> 
> Hi, 
> 
> I just wanted to add those two (sorry for being kinda late with this): 
> 
> https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit?usp=sharing
>  
> 
> https://docs.google.com/document/d/1Cb7XVmqe__nA_WCrriAifL-3WCzbZzV4Am5W_SkQLeA/edit?usp=sharing
>  
> 
> 
> Thanks, 
> Łukasz 
> 
> 2018-05-29 22:42 GMT+02:00 Lukasz Cwik:
>> Providing ownership to the PMC account allows others to take over 
>> ownership of the document once a contributor stops being active. This 
>> allows docs to be updated (even if just to point to a newer doc).
>> 
>> On Tue, May 29, 2018 at 1:20 PM Kenneth Knowles wrote:
>>> My position on ownership is design docs are really documents "of the 
>>> moment" and authored by a particular individual or group. Experience 
>>> shows that even if you try, keeping it fresh is not likely to happen. 
>>> Anything that needs freshness (like end-user docs) should be in a 
>>> different medium. I would just date the gdoc so readers know how to 
>>> interpret it (the automated "last edit" date is not sufficient for 
>>> understanding how stale something is). 
>>> 
>>> So it seems like it makes little difference if the project or PMC has 
>>> ownership or even write access. Of course I have no objections if 
>>> someone wants to transfer ownership, but is there a reason to encourage 
>>> it?
>>> 
>>> Kenn
>>> 
>>> On Tue, May 29, 2018 at 1:11 PM Lukasz Cwik wrote:
 I transferred ownership of the docs that I owned to the 
 apacheb...@gmail.com  PMC account and put 
 the ones that I owned into the drive folder.
 
 Would it be a good idea for others to follow suit?
 
 Instructions on how to transfer ownership are here: 
 http://support.it.mtu.edu/Accounts/E-Mail/75946047/How-do-I-transfer-ownership-of-a-Google-Doc.htm
  
 
 
 
 
 On Tue, May 29, 2018 at 11:23 AM Lukasz Cwik wrote:
> I created a PR for the beam-site to link to the design docs and 
> template from the contribution guide:
> https://github.com/apache/beam-site/pull/454 
> 
> 
> On Fri, May 25, 2018 at 10:23 AM Lukasz Cwik  

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Etienne Chauchot
First catch of the nexmark-CI:
It seems that there was a change in the direct runner.

Query3 (exercise state and timers)
- output size should be constant but has increased today => Was there a
change in state and timer related code?
- the output size of this query is different between batch and streaming
modes on direct runner.

Etienne
On Wednesday, 11 July 2018 at 15:25 +0200, Etienne Chauchot wrote:
> Is someone interested in creating the scripts and dashboards for the other 
> runners? They can be created by copying the
> existing scripts and dashboards and changing one gradle parameter in the 
> scripts and the table name in the
> dashboards. 
> I have created the tickets:
> https://issues.apache.org/jira/browse/BEAM-4763
> https://issues.apache.org/jira/browse/BEAM-4762
> https://issues.apache.org/jira/browse/BEAM-4761
> https://issues.apache.org/jira/browse/BEAM-4760
>
> Etienne
> On Wednesday, 11 July 2018 at 15:13 +0200, Etienne Chauchot wrote:
> > Hi guys, 
> > 
> > I'm glad to announce that the CI of Beam has much improved! Indeed,
> > Nexmark is now included in the perfkit
> > dashboards.
> > 
> > At each commit on master, nexmark suites are run and plots are created on 
> > the graphs.
> > 
> > I've created 2 kinds of dashboards:
> > - one for performance (run times of the queries)
> > - one for the size of the output PCollection (which should be constant)
> > 
> > There are dashboards for these runners:
> > - spark
> > - flink
> > - direct runner
> > 
> > Each dashboard contains:
> > - graphs in batch mode 
> > - graphs in streaming mode
> > - graphs for the 13 queries.
> > 
> > That gives more than a hundred graphs (my right finger hurts after so 
> > many clicks on the mouse :) ). They are
> > that detailed so that anyone can focus on the area they are interested 
> > in.
> > Feel free to also create new dashboards with more aggregated data.  
> > 
> > Thanks to Lukasz and Cham for reviewing my PRs and showing how to use 
> > perfkit dashboards.
> > 
> > Dashboards are there: 
> > 
> > https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> > https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> > https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> > 
> > https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> > https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> > https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> > 
> > 
> > Enjoy, 
> > 
> > Etienne
> > 
> > 

Re: [ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Etienne Chauchot
Is someone interested in creating the scripts and dashboards for the other 
runners? They can be created by copying the
existing scripts and dashboards and changing one gradle parameter in the 
scripts and the table name in the dashboards. 
I have created the tickets:
https://issues.apache.org/jira/browse/BEAM-4763
https://issues.apache.org/jira/browse/BEAM-4762
https://issues.apache.org/jira/browse/BEAM-4761
https://issues.apache.org/jira/browse/BEAM-4760

Etienne
On Wednesday, 11 July 2018 at 15:13 +0200, Etienne Chauchot wrote:
> Hi guys, 
> 
> I'm glad to announce that the CI of Beam has much improved! Indeed, Nexmark 
> is now included in the perfkit
> dashboards.
> 
> At each commit on master, nexmark suites are run and plots are created on the 
> graphs.
> 
> I've created 2 kinds of dashboards:
> - one for performance (run times of the queries)
> - one for the size of the output PCollection (which should be constant)
> 
> There are dashboards for these runners:
> - spark
> - flink
> - direct runner
> 
> Each dashboard contains:
> - graphs in batch mode 
> - graphs in streaming mode
> - graphs for the 13 queries.
> 
> That gives more than a hundred graphs (my right finger hurts after so many 
> clicks on the mouse :) ). They are that detailed
> so that anyone can focus on the area they are interested in.
> Feel free to also create new dashboards with more aggregated data.  
> 
> Thanks to Lukasz and Cham for reviewing my PRs and showing how to use perfkit 
> dashboards.
> 
> Dashboards are there: 
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> 
> https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
> https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
> https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000
> 
> 
> Enjoy, 
> 
> Etienne
> 
> 

[ANNOUNCEMENT] Nexmark included to the CI

2018-07-11 Thread Etienne Chauchot

Hi guys, 

I'm glad to announce that the CI of Beam has much improved! Indeed, Nexmark is 
now included in the perfkit dashboards.

At each commit on master, nexmark suites are run and plots are created on the 
graphs.

I've created 2 kinds of dashboards:
- one for performance (run times of the queries)
- one for the size of the output PCollection (which should be constant)

There are dashboards for these runners:
- spark
- flink
- direct runner

Each dashboard contains:
- graphs in batch mode 
- graphs in streaming mode
- graphs for the 13 queries.

That gives more than a hundred graphs (my right finger hurts after so many 
clicks on the mouse :) ). They are that detailed
so that anyone can focus on the area they are interested in.
Feel free to also create new dashboards with more aggregated data.  

Thanks to Lukasz and Cham for reviewing my PRs and showing how to use perfkit 
dashboards.

Dashboards are there: 

https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712

https://apache-beam-testing.appspot.com/explore?dashboard=5099379773931520
https://apache-beam-testing.appspot.com/explore?dashboard=5731568492478464
https://apache-beam-testing.appspot.com/explore?dashboard=5163657986048000


Enjoy, 

Etienne
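
The output-size dashboards rest on the idea that, for a fixed input, the size of each query's output PCollection should stay constant from one run to the next. A minimal sketch of that check follows. It assumes the per-run sizes have already been collected into a plain dictionary; the function name and the sample numbers are hypothetical and are not part of Nexmark or the perfkit dashboards.

```python
from typing import Dict, List


def find_output_size_changes(sizes_by_query: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return only the queries whose output size varies across runs.

    sizes_by_query maps a query name to the output sizes observed in
    successive runs (hypothetical data, e.g. exported by hand).
    """
    return {
        query: sizes
        for query, sizes in sizes_by_query.items()
        if len(set(sizes)) > 1  # more than one distinct size means a change
    }


# Hypothetical measurements: Query3 grows in the last run, Query5 is stable.
runs = {"Query3": [100, 100, 120], "Query5": [50, 50, 50]}
print(find_output_size_changes(runs))
```

A query that shows up in the result is a candidate for a closer look, in the same way a step in one of the output-size graphs would be.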




Build failed in Jenkins: beam_SeedJob #2197

2018-07-11 Thread Apache Jenkins Server
See 

--
GitHub pull request #5927 of commit cd1326b72617ee1ed40e11ce1ef85b10a61a9123, 
no merge conflicts.
Setting status of cd1326b72617ee1ed40e11ce1ef85b10a61a9123 to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/2197/ and message: 'Build started 
for merge commit.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam14 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git +refs/heads/*:refs/remotes/origin/* +refs/pull/5927/*:refs/remotes/origin/pr/5927/*
 > git rev-parse refs/remotes/origin/pr/5927/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5927/merge^{commit} # timeout=10
Checking out Revision 31ae3c3eb13c29a040c261f0355290d8c558e65d 
(refs/remotes/origin/pr/5927/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 31ae3c3eb13c29a040c261f0355290d8c558e65d
Commit message: "Merge cd1326b72617ee1ed40e11ce1ef85b10a61a9123 into 
3c80b61e6a91e689bb8d3d2c9860d76ca7f874ee"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_Dependency_Check.groovy
Processing DSL script job_Inventory.groovy
Processing DSL script job_PerformanceTests_Dataflow.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_PerformanceTests_JDBC.groovy
Processing DSL script job_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_PerformanceTests_Python.groovy
Processing DSL script job_PerformanceTests_Spark.groovy
Processing DSL script job_PostCommit_Go_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_PostCommit_Java_Nexmark_Direct.groovy
Processing DSL script job_PostCommit_Java_Nexmark_Flink.groovy
Processing DSL script job_PostCommit_Java_Nexmark_Spark.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Dataflow.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Flink.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Gearpump.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Samza.groovy
Processing DSL script job_PostCommit_Java_ValidatesRunner_Spark.groovy
Processing DSL script job_PostCommit_Python_ValidatesContainer_Dataflow.groovy
Processing DSL script job_PostCommit_Python_ValidatesRunner_Dataflow.groovy
Processing DSL script job_PostCommit_Python_Verify.groovy
Processing DSL script job_PostRelease_NightlySnapshot.groovy
Processing DSL script job_PreCommit_Go.groovy
Processing DSL script job_PreCommit_Java.groovy
Processing DSL script job_PreCommit_Python.groovy
Processing DSL script job_PreCommit_Website_Merge.groovy
Processing DSL script job_PreCommit_Website_Stage.groovy
Processing DSL script job_PreCommit_Website_Test.groovy
Processing DSL script job_ReleaseCandidate_Python.groovy
Processing DSL script job_Release_Gradle_NightlySnapshot.groovy
Processing DSL script job_beam_PerformanceTests_Analysis.groovy
Processing DSL script job_seed_standalone.groovy
Existing items:
GeneratedJob{name='beam_Dependency_Check'}
GeneratedJob{name='beam_Inventory_beam1'}
GeneratedJob{name='beam_Inventory_beam10'}
GeneratedJob{name='beam_Inventory_beam11'}
GeneratedJob{name='beam_Inventory_beam12'}
GeneratedJob{name='beam_Inventory_beam13'}
GeneratedJob{name='beam_Inventory_beam14'}
GeneratedJob{name='beam_Inventory_beam15'}
GeneratedJob{name='beam_Inventory_beam16'}
GeneratedJob{name='beam_Inventory_beam2'}
GeneratedJob{name='beam_Inventory_beam3'}
GeneratedJob{name='beam_Inventory_beam4'}
GeneratedJob{name='beam_Inventory_beam5'}
GeneratedJob{name='beam_Inventory_beam6'}
GeneratedJob{name='beam_Inventory_beam7'}
GeneratedJob{name='beam_Inventory_beam8'}
GeneratedJob{name='beam_Inventory_beam9'}
GeneratedJob{name='beam_PerformanceTests_Analysis'}
GeneratedJob{name='beam_PerformanceTests_AvroIOIT'}
GeneratedJob{name='beam_PerformanceTests_AvroIOIT_HDFS'}
GeneratedJob{name='beam_PerformanceTests_Compressed_TextIOIT'}
GeneratedJob{name='beam_PerformanceTests_Compressed_TextIOIT_HDFS'}

Re: The full list of proposals / prototype documents

2018-07-11 Thread Etienne Chauchot
Alexey,

One doc that can be interesting that I forgot to point out is
https://docs.google.com/document/d/1VgnGiVu8vSfm7Et-xAtQYv0PlEpqeyfmhpQUNPmWRJs/edit?usp=sharing

It is the doc I wrote when I submitted Nexmark PR to ease the reading of the code.
It is not a design doc, I don't know if it belongs to the website page or to 
the wiki for beam devs.

Etienne

On Wednesday, 06 June 2018 at 17:48 +0200, Alexey Romanenko wrote:
> FYI: Finally, it was merged and you can find this page here:
> https://beam.apache.org/contribute/design-documents/
> 
> Thank you everybody who helped me to compile this list! 
> I’ll do my best to keep this updated with new docs as they come in. At the same time,
> please feel free to add your new docs
> (or notify me if I missed any) once they are finished and ready to be 
> published.
> 
> WBR,
> Alexey
> 
> > On 31 May 2018, at 18:52, Eugene Kirpichov  wrote:
> > 
> > Thank you!
> > On Thu, May 31, 2018 at 8:30 AM Alexey Romanenko  
> > wrote:
> > > Thank you everybody for provided links. I collected all of them (please, 
> > > correct me if I missed something),
> > > categorized and created a dedicated page for Beam website.
> > > Here is a PR for that (please, review):
> > > https://github.com/apache/beam-site/pull/456
> > > 
> > > WBR,
> > > Alexey
> > > 
> > > > On 30 May 2018, at 13:17, Łukasz Gajowy  wrote:
> > > > 
> > > > Hi, 
> > > > 
> > > > I just wanted to add those two (sorry for being kinda late with this): 
> > > > 
> > > > https://docs.google.com/document/d/1dA-5s6OHiP_cz-NRAbwapoKF5MEC1wKps4A5tFbIPKE/edit?usp=sharing
> > > > https://docs.google.com/document/d/1Cb7XVmqe__nA_WCrriAifL-3WCzbZzV4Am5W_SkQLeA/edit?usp=sharing
> > > > 
> > > > Thanks, 
> > > > Łukasz 
> > > > 2018-05-29 22:42 GMT+02:00 Lukasz Cwik :
> > > > > Providing ownership to the PMC account allows others to take over 
> > > > > ownership of the document once a contributor
> > > > > stops being active. This allows docs to be updated (even if just to 
> > > > > point to a newer doc).
> > > > > 
> > > > > On Tue, May 29, 2018 at 1:20 PM Kenneth Knowles  
> > > > > wrote:
> > > > > > My position on ownership is design docs are really documents "of 
> > > > > > the moment" and authored by a particular
> > > > > > individual or group. Experience shows that even if you try, keeping 
> > > > > > it fresh is not likely to happen.
> > > > > > Anything that needs freshness (like end-user docs) should be in a 
> > > > > > different medium. I would just date the
> > > > > > gdoc so readers know how to interpret it (the automated "last edit" 
> > > > > > date is not sufficient for understanding
> > > > > > how stale something is). 
> > > > > > So it seems like it makes little difference if the project or PMC 
> > > > > > has ownership or even write access. Of
> > > > > > course I have no objections if someone wants to transfer ownership, 
> > > > > > but is there a reason to encourage it?
> > > > > > Kenn
> > > > > > 
> > > > > > On Tue, May 29, 2018 at 1:11 PM Lukasz Cwik  
> > > > > > wrote:
> > > > > > > I transferred ownership of the docs that I owned to the 
> > > > > > > apacheb...@gmail.com PMC account and put the ones
> > > > > > > that I owned into the drive folder.
> > > > > > > Would it be a good idea for others to follow suit?
> > > > > > > 
> > > > > > > Instructions on how to transfer ownership are here: 
> > > > > > > > http://support.it.mtu.edu/Accounts/E-Mail/75946047/How-do-I-transfer-ownership-of-a-Google-Doc.htm
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On Tue, May 29, 2018 at 11:23 AM Lukasz Cwik  
> > > > > > > wrote:
> > > > > > > > I created a PR for the beam-site to link to the design docs and 
> > > > > > > > > template from the contribution guide:
> > > > > > > > > https://github.com/apache/beam-site/pull/454
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Fri, May 25, 2018 at 10:23 AM Lukasz Cwik  
> > > > > > > > wrote:
> > > > > > > > > Here are some more links related to portability efforts:
> > > > > > > > > 
> > > > > > > > > https://s.apache.org/beam-fn-api
> > > > > > > > > https://s.apache.org/beam-fn-api-processing-a-bundle
> > > > > > > > > https://s.apache.org/beam-fn-api-send-and-receive-data
> > > > > > > > > https://s.apache.org/beam-fn-state-api-and-bundle-processing
> > > > > > > > > https://s.apache.org/beam-fn-api-progress-reporting
> > > > > > > > > https://s.apache.org/beam-fn-api-container-contract
> > > > > > > > > https://s.apache.org/beam-breaking-fusion
> > > > > > > > > https://s.apache.org/beam-runner-api-combine-model
> > > > > > > > > https://s.apache.org/beam-fn-api-metrics
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Thu, May 24, 2018 at 2:11 PM Scott Wegner 
> > > > > > > > >  wrote:
> > > > > > > > > > Thanks for sharing these. I also put together a design doc 
> > > > > > > > > > template based on common styling /
> > > > > > > > > > sections I saw in the docs listed above. Others are free to 
> > > > > > > 

Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #97

2018-07-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] object() takes no parameters

[Pablo] Improving signing of all published artifacts and adding test publication

[Pablo] Removing extra publication

[cclauss] [BEAM-3761] Define cmp() in Python 3

[cclauss] [BEAM-3959] Add Python 3 undefined names to flake8

[amyrvold] [BEAM-3457] Upgrade version of gogradle and add examples and 
containers

[apilloud] [SQL] Inject JDBC rules through Hook.

[lcwik] [BEAM-4481, BEAM-4484] Start vendoring portability dependencies to not

[apilloud] [SQL] Plumb pipelineOptions through IOSinkRel

--
[...truncated 1.16 MB...]
file or directory 
'
 not found
:beam-sdks-java-io-xml:testSourcesJar (Thread[Task worker for ':' Thread 
5,5,main]) completed. Took 0.004 secs.
:beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication
 (Thread[Task worker for ':' Thread 5,5,main]) started.

> Task 
> :beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication
Build cache key for task 
':beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication'
 is e88836a5bca732f78522d2de5a70d4e6
Caching disabled for task 
':beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication':
 Caching has not been enabled for the task
Task 
':beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication'
 is not up-to-date because:
  Task.upToDateWhen is false.
:beam-sdks-java-maven-archetypes-examples:generatePomFileForMavenJavaPublication
 (Thread[Task worker for ':' Thread 5,5,main]) completed. Took 0.005 secs.
:beam-sdks-java-maven-archetypes-examples:generateSources (Thread[Task worker 
for ':' Thread 5,5,main]) started.

> Task :beam-sdks-java-maven-archetypes-examples:generateSources
Caching disabled for task 
':beam-sdks-java-maven-archetypes-examples:generateSources': Caching has not 
been enabled for the task
Task ':beam-sdks-java-maven-archetypes-examples:generateSources' is not 
up-to-date because:
  Task has not declared any outputs despite executing actions.
Starting process 'command './generate-sources.sh''. Working directory: 

 Command: ./generate-sources.sh 
Successfully started process 'command './generate-sources.sh''
:beam-sdks-java-maven-archetypes-examples:generateSources (Thread[Task worker 
for ':' Thread 5,5,main]) completed. Took 0.225 secs.
:beam-sdks-java-maven-archetypes-examples:processResources (Thread[Task worker 
for ':' Thread 5,5,main]) started.

> Task :beam-sdks-java-maven-archetypes-examples:processResources
Build cache key for task 
':beam-sdks-java-maven-archetypes-examples:processResources' is 
425c2a9455014215ad6b1b6c5335767f
Caching disabled for task 
':beam-sdks-java-maven-archetypes-examples:processResources': Caching has not 
been enabled for the task
Task ':beam-sdks-java-maven-archetypes-examples:processResources' is not 
up-to-date because:
  No history is available.
:beam-sdks-java-maven-archetypes-examples:processResources (Thread[Task worker 
for ':' Thread 5,5,main]) completed. Took 0.109 secs.
:beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication
 (Thread[Task worker for ':' Thread 14,5,main]) started.

> Task 
> :beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication
Build cache key for task 
':beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication'
 is f01acc2a3fb8523136a4a5dce680860b
Caching disabled for task 
':beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication':
 Caching has not been enabled for the task
Task 
':beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication'
 is not up-to-date because:
  No history is available.
:beam-sdks-java-maven-archetypes-examples:generatePomPropertiesFileForMavenJavaPublication
 (Thread[Task worker for ':' Thread 14,5,main]) completed. Took 0.001 secs.
:beam-sdks-java-maven-archetypes-examples:processTestResources (Thread[Task 
worker for ':' Thread 14,5,main]) started.

> Task :beam-sdks-java-maven-archetypes-examples:processTestResources
Build cache key for task 
':beam-sdks-java-maven-archetypes-examples:processTestResources' is 
32f999b722303cf823d337566e91f1de
Caching disabled for task 
':beam-sdks-java-maven-archetypes-examples:processTestResources': Caching has 
not been enabled for the task
Task ':beam-sdks-java-maven-archetypes-examples:processTestResources' is not 
up-to-date because:
  No history is available.
:beam-sdks-java-maven-archetypes-examples:processTestResources (Thread[Task 
worker for ':' Thread 14,5,main]) completed. Took 0.002 secs.

Jenkins build is back to normal : beam_SeedJob #2194

2018-07-11 Thread Apache Jenkins Server
See