Shall we use the "tenacity" library to help deflake some Python tests using retry logic?

2019-01-10 Thread Valentyn Tymofieiev
I have been looking at a few test flakes in the Python SDK recently, and some of them could benefit from simple retry logic. See PR #7455 for an example[1]. I would not recommend retrying by default for all tests, or mechanically adding a retry to every test that we see flaking: some legitimate bugs
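To illustrate the kind of retry logic under discussion, here is a minimal standard-library sketch. It is not the tenacity API and not the code from the PR mentioned above; the `retry` decorator, its parameters, and `flaky_operation` are hypothetical names invented for this example. The decorator retries only on explicitly listed exception types, so unrelated failures still surface on the first attempt.

```python
import functools
import time


def retry(max_attempts=3, delay_seconds=0.0, exceptions=(Exception,)):
    """Hypothetical helper: retry a callable on the listed exception types."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the real failure
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


calls = {"count": 0}


@retry(max_attempts=3, exceptions=(ConnectionError,))
def flaky_operation():
    """Hypothetical flaky call: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = flaky_operation()
```

Restricting `exceptions` to known transient errors is one way to avoid masking legitimate bugs, which is the concern raised in the message above.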

Re: Add all tests to release validation

2019-01-10 Thread Kenneth Knowles
What do you think about crowd-sourcing?
1. Fix Version = 2.10.0
2. If assigned, ping ticket and maybe assignee, unassign if unresponsive
3. If unassigned, assign it to yourself while thinking about it
4. If you can route it a bit closer to someone who might know, great
5. If it doesn't look like a

Re: Add all tests to release validation

2019-01-10 Thread Mikhail Gryzykhin
+1, although we should be cautious when enabling this policy. We have a decent backlog of bugs that we need to work through. --Mikhail On Thu, Jan 10, 2019 at 11:44 AM Scott Wegner wrote: > +1, this sounds good to me. > > I believe the next step would

Re: Add all tests to release validation

2019-01-10 Thread Scott Wegner
+1, this sounds good to me. I believe the next step would be to open a PR to add this to the release guide: https://github.com/apache/beam/blob/master/website/src/contribute/release-guide.md On Wed, Jan 9, 2019 at 12:04 PM Sam Rohde wrote: > Cool, thanks for all of the replies. Does this summar

Re: [DISCUSS] (Forked thread) Beam issue triage & assignees

2019-01-10 Thread Scott Wegner
+1 > 3) Ensure that each component's unresolved issues get looked at regularly This is ideal, but I also don't know how to get to this state. Starting with clear component ownership and expectations will help. If the triaging process is well-defined, then members of the community can help for any

Re: Load testing on DirectRunner

2019-01-10 Thread Andrew Pilloud
My default advice here is to use the Direct Runner for small smoke tests, and use the Flink LocalRunner for larger datasets that can still be run locally. As Reuven points out, the Direct Runner is more of a validation test itself: it does many things designed to test pipelines for the worst combin

Re: Load testing on DirectRunner

2019-01-10 Thread Rui Wang
Agree with Reuven. It depends on the purpose of running load tests on the Direct Runner. As a runner for testing, the Direct Runner might not need load testing. However, if the load test is run on the Direct Runner only to verify that the load testing itself works, then reducing the size of the test data definitely works.

Re: Load testing on DirectRunner

2019-01-10 Thread Reuven Lax
The Direct Runner as currently implemented is purposely inefficient. It was designed for testing, and therefore does many things that are meant to expose bugs in user pipelines (e.g. randomly sorting PCollections, serializing/deserializing every element, etc.). So it's not surprising that it doesn'

Re: Dev contact for KafkaIO

2019-01-10 Thread Ismaël Mejía
Hello, Thanks a lot Raghu for all the nice work you did for KafkaIO and the project in general. I hope we can bother you a bit (only when needed) with advanced Kafka related issues. Best wishes for your new adventure. Ismaël On Thu, Jan 10, 2019 at 11:58 AM Alexey Romanenko wrote: > > Raghu, >

Re: error with DirectRunner

2019-01-10 Thread Allie Chen
Thank you so much for starting to work on this! On Thu, Jan 10, 2019 at 5:55 AM Robert Bradshaw wrote: > https://github.com/apache/beam/pull/7456 > > On Thu, Jan 10, 2019 at 10:59 AM Robert Bradshaw > wrote: > > > > Sorry this got lost. I filed > > https://issues.apache.org/jira/browse/BEAM-6404;

Load testing on DirectRunner

2019-01-10 Thread Katarzyna Kucharczyk
Hi Everyone, My name is Kasia and I contribute to Beam's tests. Currently, I am working with Łukasz Gajowy on load tests, and we created a Jenkins configuration to run the Synthetic Sources test on the DirectRunner. It was decided to generate 1 000 000 000 records (bytes) for a small suite (details you can f

Re: Dev contact for KafkaIO

2019-01-10 Thread Alexey Romanenko
Raghu, Thank you for your confidence. I'll be happy to lead the development of KafkaIO as much as I can if, of course, the Beam community agrees with that. I've been involved in KafkaIO development for some time and it was always a pleasure to work with you on bug fixes and new features. Anywa

Re: error with DirectRunner

2019-01-10 Thread Robert Bradshaw
https://github.com/apache/beam/pull/7456 On Thu, Jan 10, 2019 at 10:59 AM Robert Bradshaw wrote: > > Sorry this got lost. I filed > https://issues.apache.org/jira/browse/BEAM-6404; hopefully it'll be an > easy fix. > > On Wed, Jan 9, 2019 at 8:33 PM Allie Chen wrote: > > > > Greetings! > > > > M

Re: error with DirectRunner

2019-01-10 Thread Robert Bradshaw
Sorry this got lost. I filed https://issues.apache.org/jira/browse/BEAM-6404; hopefully it'll be an easy fix. On Wed, Jan 9, 2019 at 8:33 PM Allie Chen wrote: > > Greetings! > > May I ask whether there is any plan to work on this issue? Or if I just use > `BundleBasedDirectRunner` instead of `Di

Re: [DISCUSS] (Forked thread) Beam issue triage & assignees

2019-01-10 Thread Mikhail Gryzykhin
+1 to keeping issues unassigned and reevaluating the backlog from time to time. We can also auto-unassign if there has been no activity on a ticket for N days. Or we can have an auto-mailed report that highlights stale assigned issues. On Thu, Jan 10, 2019 at 12:10 AM Robert Bradshaw wrote: > On Thu, Jan 10, 2019
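The "no activity for N days" rule suggested in this message could be sketched as follows. This is only an illustration under stated assumptions: the issue records are plain dictionaries, the field names (`key`, `assignee`, `last_activity`), the function `find_stale_assigned`, and the sample data are all hypothetical, and no real issue-tracker API is used.

```python
from datetime import date, timedelta


def find_stale_assigned(issues, today, max_idle_days):
    """Return keys of assigned issues idle for more than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        issue["key"]
        for issue in issues
        if issue["assignee"] is not None and issue["last_activity"] < cutoff
    ]


# Hypothetical sample data; keys, names, and dates are made up for illustration.
issues = [
    {"key": "EXAMPLE-1", "assignee": "alice", "last_activity": date(2018, 6, 1)},
    {"key": "EXAMPLE-2", "assignee": "bob", "last_activity": date(2019, 1, 8)},
    {"key": "EXAMPLE-3", "assignee": None, "last_activity": date(2018, 1, 1)},
]

stale = find_stale_assigned(issues, today=date(2019, 1, 10), max_idle_days=90)
```

The resulting list could feed either option mentioned in the message: an automatic unassign step or a periodic mailed report.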

Re: [DISCUSS] (Forked thread) Beam issue triage & assignees

2019-01-10 Thread Robert Bradshaw
On Thu, Jan 10, 2019 at 3:20 AM Ahmet Altay wrote: > > I agree with the proposals here. Initial state of "Needs Review" and blocking > releases on untriaged issues will ensure that we will at least look at every > new issue once. +1. I'm more ambivalent about closing stale issues. Unlike PRs,