Flaky test issue report

2021-04-25 Thread Beam Jira Bot
This is your daily summary of Beam's current flaky tests. These are P1 issues 
because they have a major negative impact on the community and make it hard to 
determine the quality of the software.

BEAM-12200: SamzaStoreStateInternalsTest is flaky 
(https://issues.apache.org/jira/browse/BEAM-12200)
BEAM-12163: Python GHA PreCommits flake with grpc.FutureTimeoutError on SDK 
harness startup (https://issues.apache.org/jira/browse/BEAM-12163)
BEAM-12061: beam_PostCommit_SQL failing on 
KafkaTableProviderIT.testFakeNested 
(https://issues.apache.org/jira/browse/BEAM-12061)
BEAM-12020: :sdks:java:container:java8:docker failing missing licenses 
(https://issues.apache.org/jira/browse/BEAM-12020)
BEAM-12019: 
apache_beam.runners.portability.flink_runner_test.FlinkRunnerTestOptimized.test_flink_metrics
 is flaky (https://issues.apache.org/jira/browse/BEAM-12019)
BEAM-11792: Python precommit failed (flaked?) installing package  
(https://issues.apache.org/jira/browse/BEAM-11792)
BEAM-11733: [beam_PostCommit_Java] [testFhirIO_Import|export] flaky 
(https://issues.apache.org/jira/browse/BEAM-11733)
BEAM-11666: 
apache_beam.runners.interactive.recording_manager_test.RecordingManagerTest.test_basic_execution
 is flaky (https://issues.apache.org/jira/browse/BEAM-11666)
BEAM-11662: elasticsearch tests failing 
(https://issues.apache.org/jira/browse/BEAM-11662)
BEAM-11661: hdfsIntegrationTest flake: network not found (py38 postcommit) 
(https://issues.apache.org/jira/browse/BEAM-11661)
BEAM-11646: beam_PostCommit_XVR_Spark failing 
(https://issues.apache.org/jira/browse/BEAM-11646)
BEAM-11645: beam_PostCommit_XVR_Flink failing 
(https://issues.apache.org/jira/browse/BEAM-11645)
BEAM-11541: testTeardownCalledAfterExceptionInProcessElement flakes on 
direct runner. (https://issues.apache.org/jira/browse/BEAM-11541)
BEAM-11540: Linter sometimes flakes on apache_beam.dataframe.frames_test 
(https://issues.apache.org/jira/browse/BEAM-11540)
BEAM-11493: Spark test failure: 
org.apache.beam.sdk.transforms.GroupByKeyTest$WindowTests.testGroupByKeyAndWindows
 (https://issues.apache.org/jira/browse/BEAM-11493)
BEAM-11492: Spark test failure: 
org.apache.beam.sdk.transforms.GroupByKeyTest$WindowTests.testGroupByKeyMergingWindows
 (https://issues.apache.org/jira/browse/BEAM-11492)
BEAM-11491: Spark test failure: 
org.apache.beam.sdk.transforms.GroupByKeyTest$WindowTests.testGroupByKeyMultipleWindows
 (https://issues.apache.org/jira/browse/BEAM-11491)
BEAM-11490: Spark test failure: 
org.apache.beam.sdk.transforms.ReifyTimestampsTest.inValuesSucceeds 
(https://issues.apache.org/jira/browse/BEAM-11490)
BEAM-11489: Spark test failure: 
org.apache.beam.sdk.metrics.MetricsTest$AttemptedMetricTests.testAttemptedDistributionMetrics
 (https://issues.apache.org/jira/browse/BEAM-11489)
BEAM-11488: Spark test failure: 
org.apache.beam.sdk.metrics.MetricsTest$AttemptedMetricTests.testAttemptedCounterMetrics
 (https://issues.apache.org/jira/browse/BEAM-11488)
BEAM-11487: Spark test failure: 
org.apache.beam.sdk.transforms.WithTimestampsTest.withTimestampsShouldApplyTimestamps
 (https://issues.apache.org/jira/browse/BEAM-11487)
BEAM-11486: Spark test failure: 
org.apache.beam.sdk.testing.PAssertTest.testSerializablePredicate 
(https://issues.apache.org/jira/browse/BEAM-11486)
BEAM-11485: Spark test failure: 
org.apache.beam.sdk.transforms.CombineFnsTest.testComposedCombineNullValues 
(https://issues.apache.org/jira/browse/BEAM-11485)
BEAM-11484: Spark test failure: 
org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics 
(https://issues.apache.org/jira/browse/BEAM-11484)
BEAM-11483: Spark portable streaming PostCommit Test Improvements 
(https://issues.apache.org/jira/browse/BEAM-11483)
BEAM-10995: Java + Universal Local Runner: 
WindowingTest.testWindowPreservation fails 
(https://issues.apache.org/jira/browse/BEAM-10995)
BEAM-10987: stager_test.py::StagerTest::test_with_main_session flaky on 
windows py3.6,3.7 (https://issues.apache.org/jira/browse/BEAM-10987)
BEAM-10968: flaky test: 
org.apache.beam.sdk.metrics.MetricsTest$AttemptedMetricTests.testAttemptedDistributionMetrics
 (https://issues.apache.org/jira/browse/BEAM-10968)
BEAM-10955: Flink Java Runner test flake: Could not find Flink job  
(https://issues.apache.org/jira/browse/BEAM-10955)
BEAM-10923: Python requirements installation in docker container is flaky 
(https://issues.apache.org/jira/browse/BEAM-10923)
BEAM-10899: test_FhirIO_exportFhirResourcesGcs flake with OOM 
(https://issues.apache.org/jira/browse/BEAM-10899)
BEAM-10866: PortableRunnerTestWithSubprocesses.test_register_finalizations 
flaky on macOS (https://issues.apache.org/jira/browse/BEAM-10866)
BEAM-10763: Spotless flake (NullPointerException) 
(https://issues.apache.org/jira/browse/BEAM-10763)
BEAM-10590: BigQueryQueryToTableIT flaky: test_big_query_new_types 
(https

P1 issues report

2021-04-25 Thread Beam Jira Bot
This is your daily summary of Beam's current P1 issues, not including flaky 
tests.

See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the 
meaning and expectations around P1 issues.

BEAM-1: Dataflow side input translation "Unknown producer for value" 
(https://issues.apache.org/jira/browse/BEAM-1)
BEAM-12205: Dataflow pipelines broken NoSuchMethodError 
DoFnInvoker.invokeSetup() (https://issues.apache.org/jira/browse/BEAM-12205)
BEAM-12195: Flink Runner 1.11 uses old Scala-Version 
(https://issues.apache.org/jira/browse/BEAM-12195)
BEAM-11959: Python Beam SDK Harness hangs when installing pip packages 
(https://issues.apache.org/jira/browse/BEAM-11959)
BEAM-11906: No trigger early repeatedly for session windows 
(https://issues.apache.org/jira/browse/BEAM-11906)
BEAM-11875: XmlIO.Read does not handle XML encoding per spec 
(https://issues.apache.org/jira/browse/BEAM-11875)
BEAM-11828: JmsIO is not acknowledging messages correctly 
(https://issues.apache.org/jira/browse/BEAM-11828)
BEAM-11755: Cross-language consistency (RequiresStableInputs) is quietly 
broken (at least on portable flink runner) 
(https://issues.apache.org/jira/browse/BEAM-11755)
BEAM-11578: `dataflow_metrics` (python) fails with TypeError (when int 
overflowing?) (https://issues.apache.org/jira/browse/BEAM-11578)
BEAM-11576: Go ValidatesRunner failure: TestFlattenDup on Dataflow Runner 
(https://issues.apache.org/jira/browse/BEAM-11576)
BEAM-11434: Expose Spanner admin/batch clients in Spanner Accessor 
(https://issues.apache.org/jira/browse/BEAM-11434)
BEAM-11227: Upgrade beam-vendor-grpc-1_26_0-0.3 to fix CVE-2020-27216 
(https://issues.apache.org/jira/browse/BEAM-11227)
BEAM-11148: Kafka commitOffsetsInFinalize OOM on Flink 
(https://issues.apache.org/jira/browse/BEAM-11148)
BEAM-11017: Timer with dataflow runner can be set multiple times (dataflow 
runner) (https://issues.apache.org/jira/browse/BEAM-11017)
BEAM-10861: Adds URNs and payloads to PubSub transforms 
(https://issues.apache.org/jira/browse/BEAM-10861)
BEAM-10617: python CombineGlobally().with_fanout() cause duplicate combine 
results for sliding windows (https://issues.apache.org/jira/browse/BEAM-10617)
BEAM-10569: SpannerIO tests don't actually assert anything. 
(https://issues.apache.org/jira/browse/BEAM-10569)
BEAM-10288: Quickstart documents are out of date 
(https://issues.apache.org/jira/browse/BEAM-10288)
BEAM-10244: Populate requirements cache fails on poetry-based packages 
(https://issues.apache.org/jira/browse/BEAM-10244)
BEAM-10100: FileIO writeDynamic with AvroIO.sink not writing all data 
(https://issues.apache.org/jira/browse/BEAM-10100)
BEAM-9564: Remove insecure ssl options from MongoDBIO 
(https://issues.apache.org/jira/browse/BEAM-9564)
BEAM-9455: Environment-sensitive provisioning for Dataflow 
(https://issues.apache.org/jira/browse/BEAM-9455)
BEAM-9293: Python direct runner doesn't emit empty pane when it should 
(https://issues.apache.org/jira/browse/BEAM-9293)
BEAM-8986: SortValues may not work correct for numerical types 
(https://issues.apache.org/jira/browse/BEAM-8986)
BEAM-8985: SortValues should fail if SecondaryKey coder is not 
deterministic (https://issues.apache.org/jira/browse/BEAM-8985)
BEAM-8407: [SQL] Some Hive tests throw NullPointerException, but get marked 
as passing (Direct Runner) (https://issues.apache.org/jira/browse/BEAM-8407)
BEAM-7717: PubsubIO watermark tracking hovers near start of epoch 
(https://issues.apache.org/jira/browse/BEAM-7717)
BEAM-7716: PubsubIO returns empty message bodies for all messages read 
(https://issues.apache.org/jira/browse/BEAM-7716)
BEAM-7195: BigQuery - 404 errors for 'table not found' when using dynamic 
destinations - sometimes, new table fails to get created 
(https://issues.apache.org/jira/browse/BEAM-7195)
BEAM-6839: User reports protobuf ClassChangeError running against 2.6.0 or 
above (https://issues.apache.org/jira/browse/BEAM-6839)
BEAM-6466: KafkaIO doesn't commit offsets while being used as bounded 
source (https://issues.apache.org/jira/browse/BEAM-6466)


Re: [VOTE] Release 2.29.0, release candidate #1

2021-04-25 Thread Kenneth Knowles
I did an additional round of making sure the human-readable quickstart
instructions also succeed.

Kenn

On Thu, Apr 22, 2021 at 6:47 PM Ahmet Altay  wrote:

> +1 (binding)
>
> I ran some python quick start examples. Most validations in the sheet were
> already done :) Thank you all!
>
> On Thu, Apr 22, 2021 at 9:15 AM Kyle Weaver  wrote:
>
>> +1 (non-binding)
>>
>> Ran Python wordcount on Flink and Spark.
>>
>> On Wed, Apr 21, 2021 at 5:20 PM Brian Hulette 
>> wrote:
>>
>>> +1 (non-binding)
>>>
>>> I ran a python pipeline exercising the DataFrame API, and another
>>> exercising SQLTransform in Python, both on Dataflow.
>>>
>>> On Wed, Apr 21, 2021 at 12:55 PM Kenneth Knowles 
>>> wrote:
>>>
>>>> Since the artifacts were changed about 26 hours ago, I intend to leave
>>>> this vote open until 46 hours from now. Specifically, around noon my time
>>>> (US Pacific) on Friday I will close the vote and finalize the release, if
>>>> no problems are discovered.
>>>>
>>>> Kenn
>>>>
>>>> On Wed, Apr 21, 2021 at 12:52 PM Kenneth Knowles
>>>> wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> I ran the script at
>>>>> https://beam.apache.org/contribute/release-guide/#run-validations-using-run_rc_validationsh
>>>>> except for the part that requires a GitHub PR, since Cham already did that
>>>>> part.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Wed, Apr 21, 2021 at 12:11 PM Valentyn Tymofieiev <
>>>>> valen...@google.com> wrote:
>>>>>
>>>>>> +1, verified that my previous findings are fixed.
>>>>>>
>>>>>> On Wed, Apr 21, 2021 at 8:17 AM Chamikara Jayalath <
>>>>>> chamik...@google.com> wrote:
>>>>>>
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>> Ran some Python scenarios and updated the spreadsheet.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cham
>>>>>>>
>>>>>>> On Tue, Apr 20, 2021 at 3:39 PM Kenneth Knowles
>>>>>>> wrote:
>>>>>>>


>>>>>>>> On Tue, Apr 20, 2021 at 3:24 PM Robert Bradshaw <
>>>>>>>> rober...@google.com> wrote:
>>>>>>>>
>>>>>>>>> The artifacts and signatures look good to me. +1 (binding)
>>>>>>>>>
>>>>>>>>> (The release branch still has the .dev name, maybe you didn't push?
>>>>>>>>> https://github.com/apache/beam/blob/release-2.29.0/sdks/python/apache_beam/version.py
>>>>>>>>> )
>>>>>>>>>
>>>>>>>>
>>>>>>>> Good point. I'll highlight that I finally implemented the branching
>>>>>>>> changes from
>>>>>>>> https://lists.apache.org/thread.html/205472bdaf3c2c5876533750d417c19b0d1078131a3dc04916082ce8%40%3Cdev.beam.apache.org%3E
>>>>>>>>
>>>>>>>> The new guide with diagram is here:
>>>>>>>> https://beam.apache.org/contribute/release-guide/#tag-a-chosen-commit-for-the-rc
>>>>>>>>
>>>>>>>> TL;DR:
>>>>>>>>  - the release branch continues to be dev/SNAPSHOT for 2.29.0 while
>>>>>>>> the main branch is now dev/SNAPSHOT for 2.30.0
>>>>>>>>  - the RC tag v2.29.0-RC1 no longer lies on the release branch. It
>>>>>>>> is a single tagged commit that removes the dev/SNAPSHOT suffix
>>>>>>>>
>>>>>>>> Kenn


>>>>>>>>> On Tue, Apr 20, 2021 at 10:36 AM Kenneth Knowles
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Please take another look.
>>>>>>>>>>
>>>>>>>>>>  - I re-ran the RC creation script so the source release and
>>>>>>>>>> wheels are new and built from the RC tag. I confirmed the source zip and
>>>>>>>>>> wheels have version 2.29.0 (not .dev or -SNAPSHOT).
>>>>>>>>>>  - I fixed and rebuilt Dataflow worker container images from
>>>>>>>>>> exactly the RC commit, added dataclasses, with internal changes to get the
>>>>>>>>>> version to match.
>>>>>>>>>>  - I confirmed that the staged jars already have version 2.29.0
>>>>>>>>>> (not -SNAPSHOT).
>>>>>>>>>>  - I confirmed with `diff -r -q` that the source tarball matches
>>>>>>>>>> the RC tag (minus the .git* files and directories and gradlew)
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 19, 2021 at 9:19 PM Kenneth Knowles
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> At this point, the release train has just about come around to
>>>>>>>>>>> 2.30.0 which will pick up that change. I don't think it makes sense to
>>>>>>>>>>> cherry-pick anything more into 2.29.0 unless it is nonfunctional. As it is,
>>>>>>>>>>> I think we have a good commit and just need to build the expected
>>>>>>>>>>> artifacts. Since it isn't all the artifacts, I was planning on just
>>>>>>>>>>> overwriting the RC1 artifacts in question and re-verify. I could also roll
>>>>>>>>>>> a new RC2 from the same commit fairly easily.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 19, 2021 at 8:57 PM Reuven Lax
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Any chance we could include
>>>>>>>>>>>> https://github.com/apache/beam/pull/14548?
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Apr 19, 2021 at 8:54 PM Kenneth Knowles <
>>>>>>>>>>>> k...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> To clarify: I am running and fixing the release scripts on the
>>>>>>>>>>>>> `master` branch. They work from fresh clones of the RC tag so this s
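
The `diff -r -q` check described in the quoted thread (confirming that the source tarball matches the RC tag, minus the .git* files and directories and gradlew) could be approximated in Python roughly as follows. This is only a sketch under assumptions, not the release script the thread refers to: the function names, the ignore patterns, and the idea of pointing it at an unpacked tarball and a fresh checkout are all illustrative.

```python
import filecmp
import os
from fnmatch import fnmatch

# Assumed ignore patterns, taken from the wording in the thread
# ("minus the .git* files and directories and gradlew").
IGNORED_PATTERNS = (".git*", "gradlew")


def _ignored(name):
    """True if a file or directory name matches one of the ignore patterns."""
    return any(fnmatch(name, pattern) for pattern in IGNORED_PATTERNS)


def _list_files(root):
    """Collect relative paths of all non-ignored files under root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not _ignored(d)]
        for filename in filenames:
            if not _ignored(filename):
                full = os.path.join(dirpath, filename)
                found.add(os.path.relpath(full, root))
    return found


def tree_differences(left, right):
    """Return sorted relative paths that differ or exist on only one side.

    Empty directories are not compared; only files are.
    """
    left_files = _list_files(left)
    right_files = _list_files(right)
    differences = set(left_files ^ right_files)  # present on one side only
    for rel in left_files & right_files:
        # shallow=False forces a byte-by-byte content comparison.
        if not filecmp.cmp(os.path.join(left, rel),
                           os.path.join(right, rel), shallow=False):
            differences.add(rel)
    return sorted(differences)
```

An empty list from `tree_differences(unpacked_tarball_dir, rc_checkout_dir)` (both directory names hypothetical) would correspond to `diff -r -q` printing nothing for the non-ignored files.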

Re: [VOTE] Release 2.29.0, release candidate #1

2021-04-25 Thread Jarek Potiuk
+1 (non-binding) 

Thanks for tirelessly working on improving the python client :).

This is a friendly visit from Apache Airflow. I've just tested 2.29.0rc1 in
our "apache.beam" provider's tests and they are all green. To give a bit of
context: we are eagerly waiting for the 2.29.0rc1 release, as it will unblock
a few things for us. Most notably, relaxing the PyArrow dependency will help
us add Python 3.9 support to Apache Airflow (it has been long overdue, and
pyarrow < 3.0.0 coming from Apache Beam was one of the last blockers).

Also, FYI: I am happy to be a bit more involved with some (possible) future
dependency improvements for Beam. We had a bit of a struggle with pip 21,
which has a hard time with some of the dependency conflicts. We've managed to
work around it for the moment (https://github.com/apache/airflow/pull/15513),
but we are looking forward to improving this and making it better (especially
moving all Google Python clients to > 2).
