[jira] [Updated] (BEAM-13857) Add expansion service startup to Go integration test flags.

2022-04-18 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13857:
---
Fix Version/s: Not applicable
   Resolution: Fixed
   Status: Resolved  (was: Open)

> Add expansion service startup to Go integration test flags.
> ---
>
> Key: BEAM-13857
> URL: https://issues.apache.org/jira/browse/BEAM-13857
> Project: Beam
> Issue Type: Bug
> Components: sdk-go
> Reporter: Ritesh Ghorse
> Assignee: Daniel Oliveira
> Priority: P2
> Fix For: Not applicable
>
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> Currently, a separate Debezium IO expansion address flag must be passed to 
> the runner when running cross-language Debezium IO pipelines from the Go SDK. 
> Find a better way to do this, so that it can be started along with the Java 
> IO expansion service when spinning up the test, without bulking up 
> :sdks:java:io:expansion-service.
> In particular, needing to add a flag per expansion service jar to our 
> integration tests will eventually become quite cluttered, so we may wish to 
> settle on some kind of KV map flag approach instead to reduce copy-pasted 
> code.
> Edit: Decided to go with the KV map flag approach within the Go SDK 
> instead of in a bash script, and to move expansion service startup into the 
> codebase as well.
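
The KV map flag idea described above can be sketched roughly as follows. This is only an illustration in Python, not the Go SDK change itself: the flag name `--expansion_addr`, the `label:address` entry format, and the labels and ports are all hypothetical.

```python
import argparse


def parse_kv_map(text):
    """Parse 'label:value,label2:value2' into a dict (hypothetical format)."""
    result = {}
    if not text:
        return result
    for entry in text.split(","):
        # Split on the first ':' only, so values like 'localhost:8097' survive.
        key, sep, value = entry.partition(":")
        if not sep or not key or not value:
            raise argparse.ArgumentTypeError(
                f"expected label:value, got {entry!r}")
        result[key.strip()] = value.strip()
    return result


def build_parser():
    parser = argparse.ArgumentParser()
    # One map-valued flag replaces one flag per expansion service.
    parser.add_argument("--expansion_addr", type=parse_kv_map, default={})
    return parser


# Hypothetical labels and addresses, for illustration only.
args = build_parser().parse_args(
    ["--expansion_addr", "debeziumio:localhost:8097,io:localhost:8098"])
```

With this shape, adding another expansion service to a test run means adding one more `label:address` entry rather than defining a new flag.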



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (BEAM-14296) PostCommit Java VR Dataflow V2 Streaming failing (:release:go-licenses:java:dockerRun)

2022-04-18 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14296:
---
Fix Version/s: 2.38.0
   Resolution: Fixed
   Status: Resolved  (was: Open)

> PostCommit Java VR Dataflow V2 Streaming failing 
> (:release:go-licenses:java:dockerRun)
> --
>
> Key: BEAM-14296
> URL: https://issues.apache.org/jira/browse/BEAM-14296
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Emily Ye
> Assignee: Daniel Oliveira
> Priority: P2
> Labels: currently-failing
> Fix For: 2.38.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Links:
>  * First failure (no changes seem to have triggered this): 
> [https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/1987/]
> Segfault in :release:go-licenses:java:dockerRun, which seems to be related to 
> github.com/spf13/cobra v1.3.0 (see go.mod) and 
> https://github.com/google/go-licenses/issues/125.
> Will attempt to fix by pinning the version of go-licenses.
>  
> (Add any investigation notes so far)
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._




[jira] [Commented] (BEAM-14282) Fix exceptions swallowed in several Python I/O connectors

2022-04-12 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521399#comment-17521399
 ] 

Daniel Oliveira commented on BEAM-14282:


This has been cherry-picked into 2.38.0, so assuming there's no further work it 
should be safe to resolve.

> Fix exceptions swallowed in several Python I/O connectors
> -
>
> Key: BEAM-14282
> URL: https://issues.apache.org/jira/browse/BEAM-14282
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp
> Affects Versions: 2.32.0, 2.33.0, 2.34.0, 2.35.0, 2.36.0, 2.37.0
> Reporter: Chamikara Madhusanka Jayalath
> Assignee: Chamikara Madhusanka Jayalath
> Priority: P1
> Fix For: 2.38.0
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> It seems that we do not re-raise errors after reporting metrics at the 
> following locations:
> https://github.com/apache/beam/blob/8e217ea0d1f383ef5033ef507b14d01edf9c67e6/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py#L303
> https://github.com/apache/beam/blob/70d9e2a08cc32192790cd9c98ffa15a756877a73/sdks/python/apache_beam/io/gcp/gcsio.py#L644
> Not re-raising these errors could result in data correctness issues for 
> downstream consumers.
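
As a rough illustration of the problem described above (not the actual connector code), the sketch below contrasts a handler that records a metric and swallows the error with one that records the metric and re-raises. `CallMetrics`, `call_swallowing`, and `call_reraising` are hypothetical names invented for this example.

```python
class CallMetrics:
    """Hypothetical stand-in for a service-call metric recorder."""

    def __init__(self):
        self.statuses = []

    def record(self, status):
        self.statuses.append(status)


def call_swallowing(request, metrics):
    """Buggy pattern: the error is recorded but never re-raised."""
    try:
        result = request()
        metrics.record("ok")
        return result
    except Exception as exc:
        metrics.record(type(exc).__name__)
        # Falls through and returns None, so the caller sees no failure.


def call_reraising(request, metrics):
    """Fixed pattern: record the metric, then propagate the error."""
    try:
        result = request()
        metrics.record("ok")
        return result
    except Exception as exc:
        metrics.record(type(exc).__name__)
        raise  # A bare raise keeps the original exception and traceback.
```

In the first variant a failed write looks like a successful call that returned nothing, which is how downstream consumers can end up with missing data and no error to explain it.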




[jira] [Updated] (BEAM-13519) Java precommit flaky (timing out)

2022-04-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13519:
---
Fix Version/s: (was: 2.38.0)

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kyle Weaver
> Assignee: Kiley Sok
> Priority: P1
> Labels: flake
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Java precommits are sometimes timing out with no clear cause. Gradle will log 
> a bunch of routine build tasks, and then Jenkins will abort the job much 
> later. There are no logs to indicate what happened. It is not even clear 
> which task or tasks, if any, were the culprit, since many tasks are run in 
> parallel.
> 01:53:28 > Task :sdks:java:testing:nexmark:build
> 01:53:28 > Task :sdks:java:testing:nexmark:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:zetasql:buildDependents
> 01:53:28 > Task :sdks:java:io:google-cloud-platform:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:buildDependents
> 01:53:28 > Task :sdks:java:io:kafka:buildDependents
> 01:53:28 > Task :sdks:java:extensions:join-library:buildDependents
> 01:53:28 > Task :sdks:java:io:synthetic:buildDependents
> 01:53:28 > Task :sdks:java:io:mongodb:buildDependents
> 01:53:28 > Task :sdks:java:io:thrift:buildDependents
> 01:53:28 > Task :sdks:java:testing:test-utils:buildDependents
> 01:53:28 > Task :sdks:java:expansion-service:buildDependents
> 01:53:28 > Task :sdks:java:extensions:arrow:buildDependents
> 01:53:28 > Task :sdks:java:extensions:protobuf:buildDependents
> 01:53:28 > Task :sdks:java:io:common:buildDependents
> 01:53:28 > Task :runners:direct-java:buildDependents
> 01:53:28 > Task :runners:local-java:buildDependents
> 01:53:28 Build timed out (after 120 minutes). Marking the build as aborted.
> https://ci-beam.apache.org/job/beam_PreCommit_Java_cron/4874/




[jira] [Updated] (BEAM-13519) Java precommit flaky (timing out)

2022-04-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13519:
---
Fix Version/s: 2.38.0
   Resolution: Fixed
   Status: Resolved  (was: Open)

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kyle Weaver
> Assignee: Kiley Sok
> Priority: P1
> Labels: flake
> Fix For: 2.38.0
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>




[jira] [Updated] (BEAM-14252) beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors

2022-04-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14252:
---
Fix Version/s: (was: 2.38.0)

> beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors
> ---
>
> Key: BEAM-14252
> URL: https://issues.apache.org/jira/browse/BEAM-14252
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, test-failures
> Reporter: Daniel Oliveira
> Assignee: Emily Ye
> Priority: P1
>
> Test Suite: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1
> This is a catch-all bug for the various failures affecting this test. It 
> seems to have gone under the radar for a while, so it's likely that multiple 
> different failures have built up over time. Individual failures should be 
> linked as sub-tasks.
> Looking at the [build 
> trend|https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/buildTimeTrend],
>  this seems to have started around 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1386/, on 
> March 18, but even then it started with only 2 failures. Meanwhile, recent 
> builds show around 35-45 failures, and the count varies, implying that some 
> of the failures are flakes.




[jira] [Commented] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1

2022-04-05 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517738#comment-17517738
 ] 

Daniel Oliveira commented on BEAM-14253:


I added an error I saw in the Dataflow logs.

> pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1
> -
>
> Key: BEAM-14253
> URL: https://issues.apache.org/jira/browse/BEAM-14253
> Project: Beam
> Issue Type: Sub-task
> Components: io-java-gcp, test-failures
> Reporter: Daniel Oliveira
> Assignee: Daniel Collins
> Priority: P1
>
> Example: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/
> {noformat}
> java.lang.AssertionError: Did not receive signal on 
> projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
>  in 300s
> {noformat}
> Dataflow logs show this, might be related:
> {noformat}
> Error message from worker: java.lang.IllegalArgumentException
> 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:127)
> 
> org.apache.beam.sdk.io.gcp.pubsublite.internal.SubscriptionPartitionLoader$GeneratorFn.getInitialWatermarkEstimatorState(SubscriptionPartitionLoader.java:76)
> {noformat}




[jira] [Updated] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1

2022-04-05 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14253:
---
Description: 
Example: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/

{noformat}
java.lang.AssertionError: Did not receive signal on 
projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
 in 300s
{noformat}

Dataflow logs show this, might be related:

{noformat}
Error message from worker: java.lang.IllegalArgumentException

org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:127)

org.apache.beam.sdk.io.gcp.pubsublite.internal.SubscriptionPartitionLoader$GeneratorFn.getInitialWatermarkEstimatorState(SubscriptionPartitionLoader.java:76)
{noformat}

  was:
Example: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/

{noformat}
java.lang.AssertionError: Did not receive signal on 
projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
 in 300s
{noformat}



> pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1
> -
>
> Key: BEAM-14253
> URL: https://issues.apache.org/jira/browse/BEAM-14253
> Project: Beam
> Issue Type: Sub-task
> Components: io-java-gcp, test-failures
> Reporter: Daniel Oliveira
> Assignee: Daniel Collins
> Priority: P1
>




[jira] [Created] (BEAM-14263) beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently

2022-04-05 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14263:
--

Summary: beam_PostCommit_Java_DataflowV2, testBigQueryStorageWrite30MProto failing consistently
Key: BEAM-14263
URL: https://issues.apache.org/jira/browse/BEAM-14263
Project: Beam
Issue Type: Bug
Components: test-failures
Reporter: Daniel Oliveira
Assignee: Chamikara Madhusanka Jayalath


testBigQueryStorageWrite30MProto seems to have been failing since it was 
originally introduced. The first failure I found is 
https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV2/1391/, which 
is the build that includes the merge of this PR: https://github.com/apache/beam/pull/17038

I don't see any explicit reason this might be failing in either the console 
logs or the Dataflow logs. Here are the console logs:

{noformat}
java.lang.RuntimeException: Workflow failed. Causes: 
S06:WriteToBQ/StorageApiLoads/GroupIntoBatches/ParDo(GroupIntoBatches)/ParMultiDo(GroupIntoBatches)/Reshard/Read+WriteToBQ/StorageApiLoads/GroupIntoBatches/ParDo(GroupIntoBatches)/ParMultiDo(GroupIntoBatches)/Reshard/UnreifyWindow+WriteToBQ/StorageApiLoads/GroupIntoBatches/ParDo(GroupIntoBatches)/ParMultiDo(GroupIntoBatches)+WriteToBQ/StorageApiLoads/StorageApiWriteSharded/Write Records/ParMultiDo(WriteRecords)/Reshard/Reify+WriteToBQ/StorageApiLoads/StorageApiWriteSharded/Write Records/ParMultiDo(WriteRecords)/Reshard/Write 
failed., The job failed because 
a work item has failed 4 times. Look in previous log entries for the cause of 
each one of the 4 failures. For more information, see 
https://cloud.google.com/dataflow/docs/guides/common-errors. The work item was 
attempted on these workers: 

  testpipeline-jenkins-0319-03190548-sfjp-harness-263n
  Root cause: The worker lost contact with the service.,
{noformat}




[jira] [Commented] (BEAM-13519) Java precommit flaky (timing out)

2022-04-05 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517703#comment-17517703
 ] 

Daniel Oliveira commented on BEAM-13519:


I see there was an actual race condition causing this and not just an error 
with the test, so it looks important enough to cherry-pick. Any bug able to 
cause consistent failures in a Precommit like this is definitely 
release-blocking.

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kyle Weaver
> Assignee: Kiley Sok
> Priority: P1
> Labels: flake
> Fix For: 2.38.0
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>




[jira] [Updated] (BEAM-13519) Java precommit flaky (timing out)

2022-04-05 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13519:
---
Fix Version/s: 2.38.0

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kyle Weaver
> Assignee: Kiley Sok
> Priority: P1
> Labels: flake
> Fix For: 2.38.0
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>




[jira] [Commented] (BEAM-13950) PVR_Spark2_Streaming perma-red

2022-04-05 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517698#comment-17517698
 ] 

Daniel Oliveira commented on BEAM-13950:


Just noting that this is still affecting the 2.38.0 release. Not marking it 
as release-blocking since the reasoning above still holds.

> PVR_Spark2_Streaming perma-red
> --
>
> Key: BEAM-13950
> URL: https://issues.apache.org/jira/browse/BEAM-13950
> Project: Beam
> Issue Type: Bug
> Components: runner-spark, test-failures
> Affects Versions: 2.37.0, 2.38.0
> Reporter: Brian Hulette
> Priority: P1
>
> https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark2_Streaming has 
> been failing a variable number of tests for a while. 
> Last successful run was Dec 28, 2021 
> (https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark2_Streaming/1021/),
>  which was approximately coincident with gradle 7 changes.




[jira] [Updated] (BEAM-13950) PVR_Spark2_Streaming perma-red

2022-04-05 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13950:
---
Affects Version/s: 2.38.0

> PVR_Spark2_Streaming perma-red
> --
>
> Key: BEAM-13950
> URL: https://issues.apache.org/jira/browse/BEAM-13950
> Project: Beam
> Issue Type: Bug
> Components: runner-spark, test-failures
> Affects Versions: 2.37.0, 2.38.0
> Reporter: Brian Hulette
> Priority: P1
>
> https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark2_Streaming has 
> been failing a variable number of tests for a while. 
> Last successful run was Dec 28, 2021 
> (https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark2_Streaming/1021/),
>  which was approximately coincident with gradle 7 changes.




[jira] [Comment Edited] (BEAM-14254) beam_PostCommit_Java_PVR_Flink_Streaming failing due to new AfterSynchronizedProcessingTime test

2022-04-04 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517224#comment-17517224
 ] 

Daniel Oliveira edited comment on BEAM-14254 at 4/5/22 5:35 AM:


Update: Looks like the same test is causing issues with Samza: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/1249/testReport/org.apache.beam.sdk.transforms/GroupByKeyTest$BasicTests/testAfterProcessingTimeContinuationTriggerUsingState/


was (Author: danoliveira):
Update: Looks like same is happening in Samza: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/1249/testReport/org.apache.beam.sdk.transforms/GroupByKeyTest$BasicTests/testAfterProcessingTimeContinuationTriggerUsingState/

> beam_PostCommit_Java_PVR_Flink_Streaming failing due to new 
> AfterSynchronizedProcessingTime test
> 
>
> Key: BEAM-14254
> URL: https://issues.apache.org/jira/browse/BEAM-14254
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, test-failures
> Affects Versions: 2.38.0
> Reporter: Daniel Oliveira
> Assignee: Ankur Goenka
> Priority: P2
>
> Test: https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/
> Failure example: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/7644/
> This seems to be technically a flake, but a highly frequent one (it looks 
> like ~90% of runs are failing, and this is the only error in sight). I can't 
> pinpoint where it started since even the oldest failure shows this, but it 
> has likely been failing since the test was introduced, because the test also 
> caused failures in Dataflow (https://issues.apache.org/jira/browse/BEAM-13952).
> Error message:
> {noformat}
> java.lang.RuntimeException: The Runner experienced the following error during 
> execution:
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 10: org.apache.beam.sdk.util.UserCodeException: java.lang.AssertionError: 
> Second Triggered sum/Values/Values/Map/ParMultiDo(Anonymous).output: 
> Expected: iterable with items [<42>] in any order
>  but: no item matches: <42> in []
> {noformat}




[jira] [Commented] (BEAM-14254) beam_PostCommit_Java_PVR_Flink_Streaming failing due to new AfterSynchronizedProcessingTime test

2022-04-04 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517224#comment-17517224
 ] 

Daniel Oliveira commented on BEAM-14254:


Update: Looks like the same is happening in Samza: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/1249/testReport/org.apache.beam.sdk.transforms/GroupByKeyTest$BasicTests/testAfterProcessingTimeContinuationTriggerUsingState/

> beam_PostCommit_Java_PVR_Flink_Streaming failing due to new 
> AfterSynchronizedProcessingTime test
> 
>
> Key: BEAM-14254
> URL: https://issues.apache.org/jira/browse/BEAM-14254
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, test-failures
> Affects Versions: 2.38.0
> Reporter: Daniel Oliveira
> Assignee: Ankur Goenka
> Priority: P2
>




[jira] [Commented] (BEAM-14254) beam_PostCommit_Java_PVR_Flink_Streaming failing due to new AfterSynchronizedProcessingTime test

2022-04-04 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517221#comment-17517221
 ] 

Daniel Oliveira commented on BEAM-14254:


Ankur, I'm assigning this to you since you handled a similar issue 
(https://issues.apache.org/jira/browse/BEAM-13961), but feel free to hand it 
off elsewhere. I'm just not sure who's an appropriate owner.

> beam_PostCommit_Java_PVR_Flink_Streaming failing due to new 
> AfterSynchronizedProcessingTime test
> 
>
> Key: BEAM-14254
> URL: https://issues.apache.org/jira/browse/BEAM-14254
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, test-failures
> Affects Versions: 2.38.0
> Reporter: Daniel Oliveira
> Assignee: Ankur Goenka
> Priority: P2
>




[jira] [Created] (BEAM-14254) beam_PostCommit_Java_PVR_Flink_Streaming failing due to new AfterSynchronizedProcessingTime test

2022-04-04 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14254:
--

Summary: beam_PostCommit_Java_PVR_Flink_Streaming failing due to new AfterSynchronizedProcessingTime test
Key: BEAM-14254
URL: https://issues.apache.org/jira/browse/BEAM-14254
Project: Beam
Issue Type: Bug
Components: runner-flink, test-failures
Affects Versions: 2.38.0
Reporter: Daniel Oliveira
Assignee: Ankur Goenka


Test: https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/
Failure example: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/7644/

This seems to be technically a flake, but a highly frequent one (it looks like 
~90% of runs are failing, and this is the only error in sight). I can't 
pinpoint where it started since even the oldest failure shows this, but it has 
likely been failing since the test was introduced, because the test also caused 
failures in Dataflow (https://issues.apache.org/jira/browse/BEAM-13952).

Error message:

{noformat}
java.lang.RuntimeException: The Runner experienced the following error during 
execution:
java.lang.RuntimeException: Error received from SDK harness for instruction 10: 
org.apache.beam.sdk.util.UserCodeException: java.lang.AssertionError: Second 
Triggered sum/Values/Values/Map/ParMultiDo(Anonymous).output: 
Expected: iterable with items [<42>] in any order
 but: no item matches: <42> in []
{noformat}




[jira] [Updated] (BEAM-14222) Test failure org.apache.beam.sdk.io.gcp.spanner.SpannerReadIT.testReadAllRecordsInDb

2022-04-04 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14222:
---
Parent: BEAM-14252
Issue Type: Sub-task  (was: Test)

> Test failure 
> org.apache.beam.sdk.io.gcp.spanner.SpannerReadIT.testReadAllRecordsInDb
> 
>
> Key: BEAM-14222
> URL: https://issues.apache.org/jira/browse/BEAM-14222
> Project: Beam
> Issue Type: Sub-task
> Components: test-failures
> Reporter: Kiley Sok
> Assignee: Bingye Li
> Priority: P2
> Time Spent: 1h
> Remaining Estimate: 0h
>
> java.lang.AssertionError: Count PG rows/Flatten.PCollections.out: 
> Expected: <5L>
>  but: was <0L>
> https://ci-beam.apache.org/job/beam_PostCommit_Java/8806/testReport/junit/org.apache.beam.sdk.io.gcp.spanner/SpannerReadIT/testReadAllRecordsInDb/





[jira] [Updated] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime

2022-04-04 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14192:
---
Parent: BEAM-14252
Issue Type: Sub-task  (was: Bug)

> ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has 
> been compiled by a more recent version of the Java Runtime
> --
>
> Key: BEAM-14192
> URL: https://issues.apache.org/jira/browse/BEAM-14192
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, test-failures
>Reporter: Luke Cwik
>Assignee: Kiley Sok
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The Dataflow ITs fail with a class version mismatch. I believe the Dataflow 
> v1 container that is being tested was built with the wrong JDK version.
> Jenkins: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink
> Example Failure:
> {noformat}
> java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.beam.sdk.util.UserCodeException: 
> java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory 
> has been compiled by a more recent version of the Java Runtime (class file 
> version 55.0), this version of the Java Runtime only recognizes class file 
> versions up to 52.0
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187)
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39)
> {noformat}
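
For reference, the two class file versions in the error above map to Java releases: 55.0 is Java 11 and 52.0 is Java 8, so the library was compiled for Java 11 while the worker was running Java 8. The mapping can be sketched as below; this is only an illustration, and the class and method names are made up here, not part of Beam or Dataflow.

```java
import java.nio.ByteBuffer;

final class ClassFileVersion {
    /** Returns the major version stored in bytes 6-7 of a class file header. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt(0) != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        return buf.getShort(6) & 0xFFFF;
    }

    /** Maps a class file major version to its Java release (valid for Java 5 and later). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Hand-built header: magic number, minor version 0, major version 55.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
        System.out.println("library compiled for Java " + javaRelease(majorVersion(header)));
        System.out.println("runtime accepts up to Java " + javaRelease(52));
    }
}
```

Running the same check against a real .class file (reading its first 8 bytes) is a quick way to confirm which JDK target a dependency was built for.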





[jira] [Commented] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1

2022-04-04 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517174#comment-17517174
 ] 

Daniel Oliveira commented on BEAM-14253:


Brian, can you check whether this is the same failure as 
https://issues.apache.org/jira/browse/BEAM-13025? The message is the same, but 
the root cause could be different.

> pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1
> -
>
> Key: BEAM-14253
> URL: https://issues.apache.org/jira/browse/BEAM-14253
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-gcp, test-failures
>Reporter: Daniel Oliveira
>Assignee: Brian Hulette
>Priority: P1
>
> Example: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/
> {noformat}
> java.lang.AssertionError: Did not receive signal on 
> projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
>  in 300s
> {noformat}





[jira] [Created] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1

2022-04-04 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14253:
--

 Summary: pubsublite.ReadWriteIT failing in 
beam_PostCommit_Java_DataflowV1
 Key: BEAM-14253
 URL: https://issues.apache.org/jira/browse/BEAM-14253
 Project: Beam
  Issue Type: Sub-task
  Components: io-java-gcp, test-failures
Reporter: Daniel Oliveira


Example: 
https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/

{noformat}
java.lang.AssertionError: Did not receive signal on 
projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
 in 300s
{noformat}






[jira] [Assigned] (BEAM-14253) pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1

2022-04-04 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-14253:
--

Assignee: Brian Hulette

> pubsublite.ReadWriteIT failing in beam_PostCommit_Java_DataflowV1
> -
>
> Key: BEAM-14253
> URL: https://issues.apache.org/jira/browse/BEAM-14253
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-gcp, test-failures
>Reporter: Daniel Oliveira
>Assignee: Brian Hulette
>Priority: P1
>
> Example: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1455/testReport/junit/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/
> {noformat}
> java.lang.AssertionError: Did not receive signal on 
> projects/apache-beam-testing/subscriptions/result-subscription--586739339276181574
>  in 300s
> {noformat}





[jira] [Assigned] (BEAM-13025) pubsublite.ReadWriteIT flaky in beam_PostCommit_Java_DataflowV2

2022-04-04 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-13025:
--

Assignee: Brian Hulette

> pubsublite.ReadWriteIT flaky in beam_PostCommit_Java_DataflowV2  
> -
>
> Key: BEAM-13025
> URL: https://issues.apache.org/jira/browse/BEAM-13025
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Brian Hulette
>Priority: P1
>  Labels: currently-failing, flake
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV2/758/testReport/org.apache.beam.sdk.io.gcp.pubsublite/ReadWriteIT/testReadWrite/]
> java.lang.AssertionError: Did not receive signal on 
> projects/apache-beam-testing/subscriptions/result-subscription--5335365384640437489
>  in 300s





[jira] [Commented] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime

2022-04-04 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517171#comment-17517171
 ] 

Daniel Oliveira commented on BEAM-14192:


I'm marking this as affecting 2.38.0, but that really depends on whether it is 
caused by an SDK-side change. It looks like it may be entirely Dataflow-side, 
in which case it is probably not release-blocking.

> ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has 
> been compiled by a more recent version of the Java Runtime
> --
>
> Key: BEAM-14192
> URL: https://issues.apache.org/jira/browse/BEAM-14192
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, test-failures
>Reporter: Luke Cwik
>Assignee: Kiley Sok
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The Dataflow ITs fail with a class version mismatch. I believe the Dataflow 
> v1 container that is being tested was built with the wrong JDK version.
> Jenkins: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink
> Example Failure:
> {noformat}
> java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.beam.sdk.util.UserCodeException: 
> java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory 
> has been compiled by a more recent version of the Java Runtime (class file 
> version 55.0), this version of the Java Runtime only recognizes class file 
> versions up to 52.0
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187)
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39)
> {noformat}





[jira] [Updated] (BEAM-14192) ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has been compiled by a more recent version of the Java Runtime

2022-04-04 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14192:
---
Fix Version/s: 2.38.0

> ITs run on Dataflow v1 fails with org/apache/commons/logging/LogFactory has 
> been compiled by a more recent version of the Java Runtime
> --
>
> Key: BEAM-14192
> URL: https://issues.apache.org/jira/browse/BEAM-14192
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, test-failures
>Reporter: Luke Cwik
>Assignee: Kiley Sok
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The Dataflow ITs fail with a class version mismatch. I believe the Dataflow 
> v1 container that is being tested was built with the wrong JDK version.
> Jenkins: 
> https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1427/#showFailuresLink
> Example Failure:
> {noformat}
> java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.beam.sdk.util.UserCodeException: 
> java.lang.UnsupportedClassVersionError: org/apache/commons/logging/LogFactory 
> has been compiled by a more recent version of the Java Runtime (class file 
> version 55.0), this version of the Java Runtime only recognizes class file 
> versions up to 52.0
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:187)
>   at 
> org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56)
>   at 
> org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39)
> {noformat}





[jira] [Created] (BEAM-14252) beam_PostCommit_Java_DataflowV1 failing with a variety of flakes and errors

2022-04-04 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14252:
--

 Summary: beam_PostCommit_Java_DataflowV1 failing with a variety of 
flakes and errors
 Key: BEAM-14252
 URL: https://issues.apache.org/jira/browse/BEAM-14252
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow, test-failures
Reporter: Daniel Oliveira
 Fix For: 2.38.0


Test Suite: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1

This is a catch-all bug for the various failures affecting this test. It seems 
to have gone under the radar for a while, so it's likely that multiple 
different failures have built up over time. Individual failures should be 
linked as sub-tasks.

Looking at the [build 
trend|https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/buildTimeTrend],
 this seems to have started around 
https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV1/1386/, on March 
18, though at that point there were only 2 failures. Recent builds show around 
35-45 failures, and the count varies from build to build, implying that some of 
the failures are flakes.





[jira] [Updated] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues

2022-04-01 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14179:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> MonitoringInfoMetricName null value guard uncovering additional issues
> --
>
> Key: BEAM-14179
> URL: https://issues.apache.org/jira/browse/BEAM-14179
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-java-harness
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Additional integration testing 
> (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that 
> https://github.com/apache/beam/pull/17094 causes a regression:
> The test failed with:
> {noformat}
> Caused by: java.lang.NullPointerException: null value in entry: 
> DATASTORE_NAMESPACE=null
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.<init>(MonitoringInfoMetricName.java:46)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93)
>   at 
> org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965)
> {noformat}
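
The trace above boils down to copying a map that contains a null value into an immutable map, which rejects nulls. The sketch below reproduces that behavior using the JDK's Map.copyOf as a stand-in for the vendored Guava ImmutableMap.copyOf in the trace, and shows one possible guard (dropping null-valued entries before the copy). The class name, the DATASTORE_PROJECT label and its value, and the choice to drop rather than substitute a default are all assumptions made for illustration; this thread does not say how the actual fix handles the null namespace.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class NullLabelGuard {
    /** Copies labels into an unmodifiable map, dropping entries whose value is null. */
    static Map<String, String> immutableLabels(Map<String, String> labels) {
        return labels.entrySet().stream()
            .filter(e -> e.getValue() != null)
            .collect(Collectors.toUnmodifiableMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, String> labels = new HashMap<>();
        labels.put("DATASTORE_PROJECT", "my-project"); // hypothetical label and value
        labels.put("DATASTORE_NAMESPACE", null);       // the entry named in the trace

        try {
            Map.copyOf(labels); // immutable copies reject null values
        } catch (NullPointerException e) {
            System.out.println("unguarded copy rejected the null value");
        }
        System.out.println(immutableLabels(labels));
    }
}
```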





[jira] [Commented] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues

2022-04-01 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516139#comment-17516139
 ] 

Daniel Oliveira commented on BEAM-14179:


I cherry-picked a fix instead of rolling back, mostly because the PR that 
revealed this isn't the root cause: the bug would still exist even if I rolled 
back; it just wouldn't be revealed by our tests. Plus, the cherry-picked PR is 
a very small fix and unlikely to cause many issues.

> MonitoringInfoMetricName null value guard uncovering additional issues
> --
>
> Key: BEAM-14179
> URL: https://issues.apache.org/jira/browse/BEAM-14179
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-java-harness
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Additional integration testing 
> (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that 
> https://github.com/apache/beam/pull/17094 causes a regression:
> The test failed with:
> {noformat}
> Caused by: java.lang.NullPointerException: null value in entry: 
> DATASTORE_NAMESPACE=null
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.<init>(MonitoringInfoMetricName.java:46)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93)
>   at 
> org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965)
> {noformat}





[jira] [Commented] (BEAM-14116) Fix Pub/Sub Lite IO and SDF performance issues with shuffles

2022-04-01 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516134#comment-17516134
 ] 

Daniel Oliveira commented on BEAM-14116:


Rolled back #17004 on the release branch to resolve this as a release-blocker. 
This should probably still be addressed on master, though, so I'll leave the 
bug open.

> Fix Pub/Sub Lite IO and SDF performance issues with shuffles
> 
>
> Key: BEAM-14116
> URL: https://issues.apache.org/jira/browse/BEAM-14116
> Project: Beam
>  Issue Type: Task
>  Components: io-java-gcp, runner-dataflow
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-14116) Fix Pub/Sub Lite IO and SDF performance issues with shuffles

2022-04-01 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14116:
---
Fix Version/s: (was: 2.38.0)

> Fix Pub/Sub Lite IO and SDF performance issues with shuffles
> 
>
> Key: BEAM-14116
> URL: https://issues.apache.org/jira/browse/BEAM-14116
> Project: Beam
>  Issue Type: Task
>  Components: io-java-gcp, runner-dataflow
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: P2
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (BEAM-14185) [SpannerIO.readChangeStreams] Drop metadata tables at the end of the job

2022-03-31 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14185:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> [SpannerIO.readChangeStreams] Drop metadata tables at the end of the job
> 
>
> Key: BEAM-14185
> URL: https://issues.apache.org/jira/browse/BEAM-14185
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Affects Versions: 2.37.0
>Reporter: Thiago Nunes
>Assignee: Thiago Nunes
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The SpannerIO.readChangeStreams Connector uses metadata tables to keep track 
> of its internal state during execution. At the moment, these metadata tables 
> linger after the execution, meaning that users will have to drop them 
> manually.
> In this change, we would like to drop them automatically once the job 
> finishes. This should only occur after all partitions have been processed 
> successfully and marked as finished.





[jira] [Updated] (BEAM-14194) [SpannerIO.readChangeStream] Throw error when autoscaling algorithm is not NONE

2022-03-31 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14194:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> [SpannerIO.readChangeStream] Throw error when autoscaling algorithm is not 
> NONE
> ---
>
> Key: BEAM-14194
> URL: https://issues.apache.org/jira/browse/BEAM-14194
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-dataflow
>Affects Versions: 2.37.0
>Reporter: Thiago Nunes
>Assignee: Thiago Nunes
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> SpannerIO.readChangeStreams does not currently support the autoscaling 
> feature. To avoid customer confusion, we decided to error out if an 
> algorithm other than NONE is specified.
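
The fail-fast check described above might look roughly like the following. This is a hedged sketch only: the Autoscaling enum, the class name, and the error text are hypothetical stand-ins rather than the Beam or Dataflow types, and it assumes that leaving the algorithm unset (null here) is allowed, since the description only mentions erroring out when an algorithm other than NONE is specified.

```java
final class AutoscalingCheck {
    /** Illustrative stand-in for a runner's autoscaling setting; not the real Dataflow enum. */
    enum Autoscaling { NONE, THROUGHPUT_BASED }

    /** Rejects any explicitly specified algorithm other than NONE. */
    static void validate(Autoscaling algorithm) {
        if (algorithm != null && algorithm != Autoscaling.NONE) {
            throw new IllegalArgumentException(
                "SpannerIO.readChangeStreams does not support autoscaling; "
                    + "set the autoscaling algorithm to NONE (got " + algorithm + ")");
        }
    }

    public static void main(String[] args) {
        validate(Autoscaling.NONE); // accepted
        try {
            validate(Autoscaling.THROUGHPUT_BASED);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at pipeline construction time, rather than at runtime, keeps the unsupported configuration from ever reaching the service.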





[jira] [Updated] (BEAM-14181) BQ: Storage API Sink reuses closed connections

2022-03-31 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14181:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> BQ: Storage API Sink reuses closed connections
> --
>
> Key: BEAM-14181
> URL: https://issues.apache.org/jira/browse/BEAM-14181
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ahmet Altay
>Assignee: Reuven Lax
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Creating a Jira issue so that we can consider whether this is release-blocking 
> or not.
> Related change: https://github.com/apache/beam/pull/17187
> This causes the BigQuery sink to sometimes get fully stuck and never recover, 
> so the pipeline grinds to a halt. The regression was likely introduced in 
> the last Beam release.





[jira] [Updated] (BEAM-14171) CoGroupByKey loses values with large groups on Dataflow v1

2022-03-31 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14171:
---
Resolution: Fixed
Status: Resolved  (was: Triage Needed)

> CoGroupByKey loses values with large groups on Dataflow v1
> --
>
> Key: BEAM-14171
> URL: https://issues.apache.org/jira/browse/BEAM-14171
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-core
>Affects Versions: 2.36.0, 2.37.0
>Reporter: Niel Markwick
>Assignee: Robert Bradshaw
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> CoGroupByKey can lose elements, replacing them with null values, when a group 
> is large (>10,000 elements).
>
> This only occurs on the Dataflow v1 runner, not on the Dataflow v2 runner.
> Possibly related to BEAM-13541.
>
> https://lists.apache.org/thread/5y56kbgm3q0m1byzf7186rrkomrcfldm





[jira] [Updated] (BEAM-14201) Provide a courtesy notification for users who may set a deprecated prebuild_sdk_container_base_image option.

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14201:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> Provide a courtesy notification for users who may set a deprecated 
> prebuild_sdk_container_base_image option.
> -
>
> Key: BEAM-14201
> URL: https://issues.apache.org/jira/browse/BEAM-14201
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Since --prebuild_sdk_container_base_image was removed in the 2.38.0 SDK in 
> favor of --sdk_container_image, which can be used for the same purpose, it 
> would be nice to show a courtesy message telling users of the removed option 
> how to switch properly when that option is used in isolation.
>  
>  
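
The "used in isolation" condition described above can be sketched as a small check: emit a notice only when the old option is set and the new one is not. The option names come from the issue; everything else (the class, the map-of-options representation, the message wording) is hypothetical, and the sketch is written in Java purely for illustration even though the affected component is the Python SDK.

```java
import java.util.Map;
import java.util.Optional;

final class DeprecatedOptionNotice {
    static final String OLD_OPTION = "prebuild_sdk_container_base_image";
    static final String NEW_OPTION = "sdk_container_image";

    /** Returns a notice only when the old option is set without the new one. */
    static Optional<String> notice(Map<String, String> options) {
        if (options.containsKey(OLD_OPTION) && !options.containsKey(NEW_OPTION)) {
            return Optional.of(
                "--" + OLD_OPTION + " is no longer supported; use --" + NEW_OPTION + " instead.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical parsed options containing only the old flag.
        Map<String, String> options = Map.of(OLD_OPTION, "example/image:tag");
        notice(options).ifPresent(System.out::println);
    }
}
```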





[jira] [Commented] (BEAM-14201) Provide a courtesy notification for users who may set a deprecated prebuild_sdk_container_base_image option.

2022-03-30 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515071#comment-17515071
 ] 

Daniel Oliveira commented on BEAM-14201:


Whoops, I missed this yesterday, but the fix is all there and tested and this 
would definitely affect users.

> Provide a courtesy notification for users who may set a deprecated 
> prebuild_sdk_container_base_image option.
> -
>
> Key: BEAM-14201
> URL: https://issues.apache.org/jira/browse/BEAM-14201
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Since --prebuild_sdk_container_base_image was removed in the 2.38.0 SDK in 
> favor of --sdk_container_image, which can be used for the same purpose, it 
> would be nice to show a courtesy message telling users of the removed option 
> how to switch properly when that option is used in isolation.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (BEAM-8218) Implement Apache PulsarIO

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-8218:
--
Fix Version/s: (was: 2.38.0)

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Marco Robles
>Priority: P3
>  Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]





[jira] [Updated] (BEAM-14216) Multiple XVR Suites having similar flakes simultaneously

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14216:
---
Description: 
Didn't have time to look very closely into the root cause, but in taking a look 
at flaky cross-language tests I noticed a pattern of different suites on 
different runners flaking at the same time. The specific ones that I've noticed 
so far are:

Samza: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/
Spark: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/
Dataflow: 
https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/

Example flake (Mar 29, 12 PM):
https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/993/
https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/3530/
https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/242/

Example flake 2 (Mar 30, 6 PM):
https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/998/
https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/3535/
https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/247/

  was:
Didn't have time to look very closely into this. The test seems to be flaky and 
from the example failures I looked at there are potentially multiple different 
failures (or the same failure appearing as different error messages).

Example failure: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/994/


> Multiple XVR Suites having similar flakes simultaneously
> 
>
> Key: BEAM-14216
> URL: https://issues.apache.org/jira/browse/BEAM-14216
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, test-failures
>Reporter: Daniel Oliveira
>Priority: P2
>  Labels: flake
>
> Didn't have time to look very closely into the root cause, but in taking a 
> look at flaky cross-language tests I noticed a pattern of different suites on 
> different runners flaking at the same time. The specific ones that I've 
> noticed so far are:
> Samza: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/
> Spark: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/
> Dataflow: 
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/
> Example flake (Mar 29, 12 PM):
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/993/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/3530/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/242/
> Example flake 2 (Mar 30, 6 PM):
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/998/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/3535/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_PythonUsingJava_Dataflow/247/





[jira] [Updated] (BEAM-14216) Multiple XVR Suites having similar flakes simultaneously

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14216:
---
Summary: Multiple XVR Suites having similar flakes simultaneously  (was: 
beam_PostCommit_XVR_Samza is flaky)

> Multiple XVR Suites having similar flakes simultaneously
> 
>
> Key: BEAM-14216
> URL: https://issues.apache.org/jira/browse/BEAM-14216
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, test-failures
>Reporter: Daniel Oliveira
>Priority: P2
>  Labels: flake
>
> Didn't have time to look very closely into this. The test seems to be flaky 
> and from the example failures I looked at there are potentially multiple 
> different failures (or the same failure appearing as different error 
> messages).
> Example failure: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/994/





[jira] [Created] (BEAM-14216) beam_PostCommit_XVR_Samza is flaky

2022-03-30 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14216:
--

 Summary: beam_PostCommit_XVR_Samza is flaky
 Key: BEAM-14216
 URL: https://issues.apache.org/jira/browse/BEAM-14216
 Project: Beam
  Issue Type: Bug
  Components: cross-language, test-failures
Reporter: Daniel Oliveira


Didn't have time to look very closely into this. The test seems to be flaky and 
from the example failures I looked at there are potentially multiple different 
failures (or the same failure appearing as different error messages).

Example failure: https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/994/





[jira] [Commented] (BEAM-13830) XVR Direct/Spark/Flink tests are timing out

2022-03-30 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515020#comment-17515020
 ] 

Daniel Oliveira commented on BEAM-13830:


This was resolved a while ago.

> XVR Direct/Spark/Flink tests are timing out
> ---
>
> Key: BEAM-13830
> URL: https://issues.apache.org/jira/browse/BEAM-13830
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, test-failures
>Reporter: Chamikara Madhusanka Jayalath
>Priority: P1
> Fix For: 2.36.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/
> These tests seem to be running a set of 
> "org.apache.beam.sdk.extensions.schemaio.expansion" tests [1] that they did 
> not run before [2].
> I see that https://github.com/apache/beam/pull/16705 made some Gradle changes 
> related to SchemaIO and is also in the set of PRs mentioned in the first 
> failure, so it is possibly related.
> [1] https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/2260/testReport/
> [2]  
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/2259/testReport/





[jira] [Updated] (BEAM-13830) XVR Direct/Spark/Flink tests are timing out

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13830:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> XVR Direct/Spark/Flink tests are timing out
> ---
>
> Key: BEAM-13830
> URL: https://issues.apache.org/jira/browse/BEAM-13830
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, test-failures
>Reporter: Chamikara Madhusanka Jayalath
>Priority: P1
> Fix For: 2.36.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/
> These tests seem to be running a set of 
> "org.apache.beam.sdk.extensions.schemaio.expansion" tests [1] that they did 
> not run before [2].
> I see that https://github.com/apache/beam/pull/16705 made some Gradle changes 
> related to SchemaIO and is also in the set of PRs mentioned in the first 
> failure, so it is possibly related.
> [1] https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/2260/testReport/
> [2]  
> https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/2259/testReport/





[jira] [Created] (BEAM-14214) beam_PostCommit_XVR_GoUsingJava_Dataflow fails on some test transforms

2022-03-30 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14214:
--

 Summary: beam_PostCommit_XVR_GoUsingJava_Dataflow fails on some 
test transforms
 Key: BEAM-14214
 URL: https://issues.apache.org/jira/browse/BEAM-14214
 Project: Beam
  Issue Type: Bug
  Components: cross-language, sdk-go, test-failures
Reporter: Daniel Oliveira


Example failure: 
https://ci-beam.apache.org/job/beam_PostCommit_XVR_GoUsingJava_Dataflow/7/

I couldn't find accurate details about why the tests are failing, but 
TestXLang_Prefix, TestXLang_Multi, and TestXLang_Partition fail while running. 
The Dataflow logs show SDK harnesses failing to connect. For example:

{noformat}
"getPodContainerStatuses for pod 
"df-go-testxlang-multi-03300551-62xv-harness-3msv_default(a7f1d8dfb2c3d2b4e80f5d92c1728787)"
 failed: rpc error: code = Unknown desc = Error: No such container: 
bea0d9bde42bf890f6fe1d4f589932471037a5948fb9588d01a06425cd14c177"
{noformat}

However, I haven't been able to find any further details showing why the 
harness fails, and the tests keep running for a while after that with other 
errors that are also pretty inscrutable.






[jira] [Updated] (BEAM-12815) Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network buffers

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-12815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-12815:
---
Labels:   (was: test)

> Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network 
> buffers
> --
>
> Key: BEAM-12815
> URL: https://issues.apache.org/jira/browse/BEAM-12815
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Danny McCormick
>Priority: P3
> Fix For: Not applicable
>
>
> When running the cross-language test suites () Flink fails on TestXLang_Multi 
> with the following error:
> {noformat}
> 19:29:14 2021/08/27 02:29:14  (): java.io.IOException: Insufficient number of 
> network buffers: required 17, but only 16 available. The total number of 
> network buffers is currently set to 2048 of 32768 bytes each. You can 
> increase this number by setting the configuration keys 
> 'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and 
> 'taskmanager.memory.network.max'.
> 19:29:14 2021/08/27 02:29:14 Job state: FAILED
> 19:29:14 --- FAIL: TestXLang_Multi (6.26s){noformat}
> This doesn't seem to be a parallelism problem (go test is run with "-p 1" as 
> expected) and is only happening on this specific test.
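
For scale, the figures in the error message above can be checked with a quick calculation. This is a minimal sketch in Python that uses only the numbers quoted in the log (2048 buffers of 32768 bytes each, 17 required, 16 available); the variable and function names are illustrative and are not part of Flink or Beam.

```python
# Figures copied from the Flink error message quoted above.
TOTAL_BUFFERS = 2048
BUFFER_SIZE_BYTES = 32768
REQUIRED = 17
AVAILABLE = 16


def network_memory_mib(buffers: int, buffer_size: int) -> float:
    """Total size of the network buffer pool in MiB."""
    return buffers * buffer_size / (1024 * 1024)


pool_mib = network_memory_mib(TOTAL_BUFFERS, BUFFER_SIZE_BYTES)
shortfall = REQUIRED - AVAILABLE  # buffers missing for this one request

print(f"network buffer pool: {pool_mib:.0f} MiB")
print(f"shortfall: {shortfall} buffer(s), {shortfall * BUFFER_SIZE_BYTES} bytes")
```

Going by these numbers, the pool is 64 MiB in total and the request missed by a single 32 KiB buffer, with only 16 of the 2048 buffers still free when it was made.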





[jira] [Updated] (BEAM-12815) Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network buffers

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-12815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-12815:
---
Component/s: test-failures

> Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network 
> buffers
> --
>
> Key: BEAM-12815
> URL: https://issues.apache.org/jira/browse/BEAM-12815
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, sdk-go, test-failures
>Reporter: Daniel Oliveira
>Assignee: Danny McCormick
>Priority: P3
> Fix For: Not applicable
>
>
> When running the cross-language test suites () Flink fails on TestXLang_Multi 
> with the following error:
> {noformat}
> 19:29:14 2021/08/27 02:29:14  (): java.io.IOException: Insufficient number of 
> network buffers: required 17, but only 16 available. The total number of 
> network buffers is currently set to 2048 of 32768 bytes each. You can 
> increase this number by setting the configuration keys 
> 'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and 
> 'taskmanager.memory.network.max'.
> 19:29:14 2021/08/27 02:29:14 Job state: FAILED
> 19:29:14 --- FAIL: TestXLang_Multi (6.26s){noformat}
> This doesn't seem to be a parallelism problem (go test is run with "-p 1" as 
> expected) and is only happening on this specific test.





[jira] [Updated] (BEAM-12815) Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network buffers

2022-03-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-12815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-12815:
---
Labels: test  (was: )

> Flink Go XVR tests fail on TestXLang_Multi: Insufficient number of network 
> buffers
> --
>
> Key: BEAM-12815
> URL: https://issues.apache.org/jira/browse/BEAM-12815
> Project: Beam
>  Issue Type: Bug
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Danny McCormick
>Priority: P3
>  Labels: test
> Fix For: Not applicable
>
>
> When running the cross-language test suites () Flink fails on TestXLang_Multi 
> with the following error:
> {noformat}
> 19:29:14 2021/08/27 02:29:14  (): java.io.IOException: Insufficient number of 
> network buffers: required 17, but only 16 available. The total number of 
> network buffers is currently set to 2048 of 32768 bytes each. You can 
> increase this number by setting the configuration keys 
> 'taskmanager.memory.network.fraction', 'taskmanager.memory.network.min', and 
> 'taskmanager.memory.network.max'.
> 19:29:14 2021/08/27 02:29:14 Job state: FAILED
> 19:29:14 --- FAIL: TestXLang_Multi (6.26s){noformat}
> This doesn't seem to be a parallelism problem (go test is run with "-p 1" as 
> expected) and is only happening on this specific test.





[jira] [Comment Edited] (BEAM-14163) Performance Regressions in streaming python ParDo and GBK Load Tests

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514403#comment-17514403
 ] 

Daniel Oliveira edited comment on BEAM-14163 at 3/30/22, 2:42 AM:
--

Cherry-pick is in and the graphs linked show metrics returning to their 
previous values on master, so I think it's safe to mark this resolved. Thanks 
everyone who helped investigate!


was (Author: danoliveira):
Cherry-pick is in and the graphs linked show metrics returning to their 
previous values on master, so I think it's safe to mark this resolved.

> Performance Regressions in streaming python ParDo and GBK Load Tests
> 
>
> Key: BEAM-14163
> URL: https://issues.apache.org/jira/browse/BEAM-14163
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, sdk-py-core
>Affects Versions: 2.38.0
>Reporter: Daniel Oliveira
>Assignee: Robert Bradshaw
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> As specified in the [Beam Release 
> Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
>  I'm investigating performance regressions. The following load test metrics 
> show a clear and persistent performance regression starting around March 17 
> and affecting version 2.38.0.
> ParDo Load Tests: 
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
> GBK Load Tests: 
> http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now





[jira] [Commented] (BEAM-14163) Performance Regressions in streaming python ParDo and GBK Load Tests

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514403#comment-17514403
 ] 

Daniel Oliveira commented on BEAM-14163:


Cherry-pick is in and the graphs linked show metrics returning to their 
previous values on master, so I think it's safe to mark this resolved.

> Performance Regressions in streaming python ParDo and GBK Load Tests
> 
>
> Key: BEAM-14163
> URL: https://issues.apache.org/jira/browse/BEAM-14163
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, sdk-py-core
>Affects Versions: 2.38.0
>Reporter: Daniel Oliveira
>Assignee: Robert Bradshaw
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> As specified in the [Beam Release 
> Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
>  I'm investigating performance regressions. The following load test metrics 
> show a clear and persistent performance regression starting around March 17 
> and affecting version 2.38.0.
> ParDo Load Tests: 
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
> GBK Load Tests: 
> http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now





[jira] [Updated] (BEAM-14163) Performance Regressions in streaming python ParDo and GBK Load Tests

2022-03-29 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14163:
---
Resolution: Fixed
Status: Resolved  (was: Open)

> Performance Regressions in streaming python ParDo and GBK Load Tests
> 
>
> Key: BEAM-14163
> URL: https://issues.apache.org/jira/browse/BEAM-14163
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, sdk-py-core
>Affects Versions: 2.38.0
>Reporter: Daniel Oliveira
>Assignee: Robert Bradshaw
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> As specified in the [Beam Release 
> Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
>  I'm investigating performance regressions. The following load test metrics 
> show a clear and persistent performance regression starting around March 17 
> and affecting version 2.38.0.
> ParDo Load Tests: 
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
> GBK Load Tests: 
> http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now





[jira] [Commented] (BEAM-14194) [SpannerIO.readChangeStream] Throw error when autoscaling algorithm is not NONE

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514388#comment-17514388
 ] 

Daniel Oliveira commented on BEAM-14194:


(I'm copy-pasting my comment from the other release-blocker because it is 
completely applicable here)

I can fit this cherry-pick in since it's all ready and won't delay the release.

While it technically doesn't fit the requirements for a release-blocker (since 
it isn't a "significant regression or loss of functionality"), it's in a weird 
spot due to being a new feature. So sure, technically none of this can be a 
regression since it's all a new feature of this release. But this is still a 
known issue with noticeable user impact and the cherry-pick isn't delaying 
anything, so I think it's worth getting in.

> [SpannerIO.readChangeStream] Throw error when autoscaling algorithm is not 
> NONE
> ---
>
> Key: BEAM-14194
> URL: https://issues.apache.org/jira/browse/BEAM-14194
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-dataflow
>Affects Versions: 2.37.0
>Reporter: Thiago Nunes
>Assignee: Thiago Nunes
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> SpannerIO.readChangeStreams does not currently support the autoscaling 
> feature. In order to avoid customer confusion, we decided to error out if an 
> algorithm other than NONE is specified.





[jira] [Commented] (BEAM-14185) [SpannerIO.readChangeStreams] Drop metadata tables at the end of the job

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514378#comment-17514378
 ] 

Daniel Oliveira commented on BEAM-14185:


I can fit this cherry-pick in since it's all ready and won't delay the release.

While it technically doesn't fit the requirements for a release-blocker (since 
it isn't a "significant regression or loss of functionality"), it's in a weird 
spot due to being a new feature. So sure, technically none of this can be a 
regression since it's all a new feature of this release. But this is still a 
known issue with noticeable user impact and the cherry-pick isn't delaying 
anything, so I think it's worth getting in.

> [SpannerIO.readChangeStreams] Drop metadata tables at the end of the job
> 
>
> Key: BEAM-14185
> URL: https://issues.apache.org/jira/browse/BEAM-14185
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Affects Versions: 2.37.0
>Reporter: Thiago Nunes
>Assignee: Thiago Nunes
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The SpannerIO.readChangeStreams Connector uses metadata tables to keep track 
> of its internal state during execution. At the moment, these metadata tables 
> linger after the execution, meaning that users will have to drop them 
> manually.
> In this change, we would like to drop them automatically once the job 
> finishes. This should only occur after all partitions have been processed 
> successfully and marked as finished.





[jira] [Commented] (BEAM-13519) Java precommit flaky (timing out)

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514377#comment-17514377
 ] 

Daniel Oliveira commented on BEAM-13519:


I should probably add that this has been making reviewing PRs annoying since it 
makes the Java precommit very flaky, and since it takes ~3 hours to run, 
deflaking is a huge time sink.

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Priority: P1
>  Labels: flake
>
> Java precommits are sometimes timing out with no clear cause. Gradle will log 
> a bunch of routine build tasks, and then Jenkins will abort the job much 
> later. There are no logs to indicate what happened. It is not even clear 
> which task or tasks, if any, was the culprit, since many tasks are run in 
> parallel.
> 01:53:28 > Task :sdks:java:testing:nexmark:build
> 01:53:28 > Task :sdks:java:testing:nexmark:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:zetasql:buildDependents
> 01:53:28 > Task :sdks:java:io:google-cloud-platform:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:buildDependents
> 01:53:28 > Task :sdks:java:io:kafka:buildDependents
> 01:53:28 > Task :sdks:java:extensions:join-library:buildDependents
> 01:53:28 > Task :sdks:java:io:synthetic:buildDependents
> 01:53:28 > Task :sdks:java:io:mongodb:buildDependents
> 01:53:28 > Task :sdks:java:io:thrift:buildDependents
> 01:53:28 > Task :sdks:java:testing:test-utils:buildDependents
> 01:53:28 > Task :sdks:java:expansion-service:buildDependents
> 01:53:28 > Task :sdks:java:extensions:arrow:buildDependents
> 01:53:28 > Task :sdks:java:extensions:protobuf:buildDependents
> 01:53:28 > Task :sdks:java:io:common:buildDependents
> 01:53:28 > Task :runners:direct-java:buildDependents
> 01:53:28 > Task :runners:local-java:buildDependents
> 01:53:28 Build timed out (after 120 minutes). Marking the build as aborted.
> https://ci-beam.apache.org/job/beam_PreCommit_Java_cron/4874/





[jira] [Commented] (BEAM-13519) Java precommit flaky (timing out)

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514374#comment-17514374
 ] 

Daniel Oliveira commented on BEAM-13519:


This is resurfacing now, despite the Precommit timeout getting extended to 180 
minutes. Check out 
https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/5270/ for example:
{noformat}
12:00:48 > Task :sdks:java:core:validateShadedJarDoesntLeakNonProjectClasses
12:00:48 > Task :sdks:java:core:check
12:00:48 > Task :sdks:java:core:build
12:00:48 > Task :sdks:java:core:buildNeeded
14:16:44 Build timed out (after 180 minutes). Marking the build as aborted.
14:16:44 Build was aborted
{noformat}
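
To put the log excerpt above in numbers, here is a small sketch that computes the silent gap between the last logged Gradle task and the abort. It uses only the two timestamps shown in the excerpt and assumes both fall on the same day; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Jenkins log excerpt above (assumed same day).
last_task = datetime.strptime("12:00:48", "%H:%M:%S")
aborted = datetime.strptime("14:16:44", "%H:%M:%S")

gap = aborted - last_task
gap_minutes = gap.total_seconds() / 60

print(f"no log output for {gap} (about {gap_minutes:.0f} minutes)")
```

That works out to roughly 2 hours 16 minutes without any logged task, about three quarters of the 180-minute timeout.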

> Java precommit flaky (timing out)
> -
>
> Key: BEAM-13519
> URL: https://issues.apache.org/jira/browse/BEAM-13519
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Priority: P1
>  Labels: flake
>
> Java precommits are sometimes timing out with no clear cause. Gradle will log 
> a bunch of routine build tasks, and then Jenkins will abort the job much 
> later. There are no logs to indicate what happened. It is not even clear 
> which task or tasks, if any, was the culprit, since many tasks are run in 
> parallel.
> 01:53:28 > Task :sdks:java:testing:nexmark:build
> 01:53:28 > Task :sdks:java:testing:nexmark:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:zetasql:buildDependents
> 01:53:28 > Task :sdks:java:io:google-cloud-platform:buildDependents
> 01:53:28 > Task :sdks:java:extensions:sql:buildDependents
> 01:53:28 > Task :sdks:java:io:kafka:buildDependents
> 01:53:28 > Task :sdks:java:extensions:join-library:buildDependents
> 01:53:28 > Task :sdks:java:io:synthetic:buildDependents
> 01:53:28 > Task :sdks:java:io:mongodb:buildDependents
> 01:53:28 > Task :sdks:java:io:thrift:buildDependents
> 01:53:28 > Task :sdks:java:testing:test-utils:buildDependents
> 01:53:28 > Task :sdks:java:expansion-service:buildDependents
> 01:53:28 > Task :sdks:java:extensions:arrow:buildDependents
> 01:53:28 > Task :sdks:java:extensions:protobuf:buildDependents
> 01:53:28 > Task :sdks:java:io:common:buildDependents
> 01:53:28 > Task :runners:direct-java:buildDependents
> 01:53:28 > Task :runners:local-java:buildDependents
> 01:53:28 Build timed out (after 120 minutes). Marking the build as aborted.
> https://ci-beam.apache.org/job/beam_PreCommit_Java_cron/4874/





[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514308#comment-17514308
 ] 

Daniel Oliveira commented on BEAM-8218:
---

Note that this is marked as release-blocking for 2.38.0 due to the release 
version being set. Not sure if that was intentional, but this doesn't look like 
a release-blocking bug. Can we remove this from the list of release-blockers?

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Marco Robles
>Priority: P3
> Fix For: 2.38.0
>
>  Time Spent: 17h 50m
>  Remaining Estimate: 0h
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]





[jira] [Commented] (BEAM-14177) GroupByKey iteration caching broken for portable runners like Dataflow runner v2

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514305#comment-17514305
 ] 

Daniel Oliveira commented on BEAM-14177:


Since this has been cherry-picked into the release branch, can we mark this as 
resolved?

> GroupByKey iteration caching broken for portable runners like Dataflow runner 
> v2
> 
>
> Key: BEAM-14177
> URL: https://issues.apache.org/jira/browse/BEAM-14177
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The wrong cache key is being used as it has not been namespaced to the state 
> key.
> This was previously being done within StateFetchingIterators but 
> https://github.com/apache/beam/pull/17121 changed that to use a single shared 
> key.
> The fix is to subcache the cache before passing it into 
> StateFetchingIterators restoring the prior behavior.
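
The description above can be illustrated with a small sketch of the general idea: wrapping a shared cache so that every entry is namespaced by state key. This is not Beam's actual cache API; `SubCache`, `SHARED_KEY`, and the state key names are hypothetical and exist only to show why a single shared key collides and how a namespaced view avoids that.

```python
class SubCache:
    """View over a shared dict that prefixes every key with a namespace.

    Hypothetical helper for illustration; not a Beam class.
    """

    def __init__(self, backing: dict, namespace: str):
        self._backing = backing
        self._namespace = namespace

    def put(self, key, value):
        self._backing[(self._namespace, key)] = value

    def get(self, key):
        return self._backing.get((self._namespace, key))


SHARED_KEY = "iterable-page"  # the single shared key used by the iterator

# Without namespacing: two state keys write to the same cache entry,
# so the second write silently replaces the first.
unscoped = {}
unscoped[SHARED_KEY] = ["values for state key A"]
unscoped[SHARED_KEY] = ["values for state key B"]
collided = unscoped[SHARED_KEY]

# With a sub-cache per state key: the same shared key no longer collides.
shared = {}
cache_a = SubCache(shared, "state-key-A")
cache_b = SubCache(shared, "state-key-B")
cache_a.put(SHARED_KEY, ["values for state key A"])
cache_b.put(SHARED_KEY, ["values for state key B"])

print(collided)
print(cache_a.get(SHARED_KEY), cache_b.get(SHARED_KEY))
```

In this sketch the caller does the namespacing before handing the cache to the consumer, so the consumer can keep using one fixed key.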





[jira] [Comment Edited] (BEAM-14181) BQ: Storage API Sink reuses closed connections

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514304#comment-17514304
 ] 

Daniel Oliveira edited comment on BEAM-14181 at 3/29/22, 7:38 PM:
--

Cherry-picking #17187 to release-2.38.0: 
https://github.com/apache/beam/pull/17208


was (Author: danoliveira):
Cherry-picking #17187 to release-2.38.0

> BQ: Storage API Sink reuses closed connections
> --
>
> Key: BEAM-14181
> URL: https://issues.apache.org/jira/browse/BEAM-14181
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ahmet Altay
>Assignee: Reuven Lax
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Creating a jira so that it can be considered whether it is release blocking 
> or not.
> Related change: https://github.com/apache/beam/pull/17187
> This causes the BigQuery sink to sometimes get fully stuck and never recover, 
> and the pipeline grinds to a halt. The regression was likely introduced in 
> the last Beam release.





[jira] [Commented] (BEAM-14181) BQ: Storage API Sink reuses closed connections

2022-03-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514304#comment-17514304
 ] 

Daniel Oliveira commented on BEAM-14181:


Cherry-picking #17187 to release-2.38.0

> BQ: Storage API Sink reuses closed connections
> --
>
> Key: BEAM-14181
> URL: https://issues.apache.org/jira/browse/BEAM-14181
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ahmet Altay
>Assignee: Reuven Lax
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Creating a jira so that it can be considered whether it is release blocking 
> or not.
> Related change: https://github.com/apache/beam/pull/17187
> This causes the BigQuery sink to sometimes get fully stuck and never recover, 
> and the pipeline grinds to a halt. The regression was likely introduced in 
> the last Beam release.





[jira] [Updated] (BEAM-14181) BQ: Storage API Sink reuses closed connections

2022-03-28 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-14181:
---
Description: 
Creating a jira so that it can be considered whether it is release blocking or 
not.

Related change: https://github.com/apache/beam/pull/17187

This causes the BigQuery sink to sometimes get fully stuck and never recover, 
and the pipeline grinds to a halt. The regression was likely introduced in the 
last Beam release.

  was:
Creating a jira so that it can be considered whether it is release blocking or 
not.

Related change: https://github.com/apache/beam/pull/17187


> BQ: Storage API Sink reuses closed connections
> --
>
> Key: BEAM-14181
> URL: https://issues.apache.org/jira/browse/BEAM-14181
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ahmet Altay
>Assignee: Reuven Lax
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Creating a jira so that it can be considered whether it is release blocking 
> or not.
> Related change: https://github.com/apache/beam/pull/17187
> This causes the BigQuery sink to sometimes get fully stuck and never recover, 
> and the pipeline grinds to a halt. The regression was likely introduced in 
> the last Beam release.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (BEAM-14116) Fix Pub/Sub Lite IO and SDF performance issues with shuffles

2022-03-28 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513762#comment-17513762
 ] 

Daniel Oliveira commented on BEAM-14116:


Just a heads up that I'll be timeboxing a fix for this in 2.38.0 based on the 
progress of https://issues.apache.org/jira/browse/BEAM-14163. That is, if that 
gets fixed and there isn't a fix for this yet, I'll roll back these PRs on the 
release branch instead of delaying the RC further.

> Fix Pub/Sub Lite IO and SDF performance issues with shuffles
> 
>
> Key: BEAM-14116
> URL: https://issues.apache.org/jira/browse/BEAM-14116
> Project: Beam
>  Issue Type: Task
>  Components: io-java-gcp, runner-dataflow
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (BEAM-14179) MonitoringInfoMetricName null value guard uncovering additional issues

2022-03-25 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512580#comment-17512580
 ] 

Daniel Oliveira commented on BEAM-14179:


Regarding this bug's release blocker status, I'm currently leaning towards just 
rolling back the culprit PR on the release branch. However, I'm waiting on some 
other cherry-picks before I can make RC1, so if a fix is available before then 
I'll cherry-pick it in.

> MonitoringInfoMetricName null value guard uncovering additional issues
> --
>
> Key: BEAM-14179
> URL: https://issues.apache.org/jira/browse/BEAM-14179
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, sdk-java-harness
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: 2.38.0
>
>
> Additional integration testing 
> (//cloud/dataflow/testing/integration/sdk:V1ReadIT_testE2EV1Read) caught that 
> https://github.com/apache/beam/pull/17094 causes a regression:
> The test failed with:
> {noformat}
> Caused by: java.lang.NullPointerException: null value in entry: 
> DATASTORE_NAMESPACE=null
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:32)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:100)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.RegularImmutableMap.fromEntries(RegularImmutableMap.java:74)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:464)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:437)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.<init>(MonitoringInfoMetricName.java:46)
>   at 
> org.apache.beam.runners.core.metrics.MonitoringInfoMetricName.named(MonitoringInfoMetricName.java:93)
>   at 
> org.apache.beam.runners.core.metrics.ServiceCallMetric.call(ServiceCallMetric.java:82)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.runQueryWithRetries(DatastoreV1.java:927)
>   at 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:965)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512139#comment-17512139
 ] 

Daniel Oliveira commented on BEAM-14064:


That sounds reasonable. I'd agree that if this is now silently dropping 
elements, that's what I would call a new regression. If all that remains is to 
get a PR in and cherry-pick it into the release branch, then we can just focus 
on getting an answer to your question on watermark behavior ASAP.

> ElasticSearchIO#Write buffering and outputting across windows
> -
>
> Key: BEAM-14064
> URL: https://issues.apache.org/jira/browse/BEAM-14064
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-elasticsearch
>Affects Versions: 2.35.0, 2.36.0, 2.37.0
>Reporter: Luke Cwik
>Assignee: Evan Galpin
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db
> Bug PR: https://github.com/apache/beam/pull/15381
> ElasticsearchIO is collecting results from elements in window X and then 
> trying to output them in window Y when flushing the batch. This exposed a bug 
> where buffered elements were being output as part of a different window than 
> the one that produced them.
> This became visible because validation was added recently to ensure that, 
> when the pipeline is processing elements in window X, any output with a 
> timestamp is valid for window X. Note that this validation only occurs in 
> *@ProcessElement*, since output there is associated with the current window 
> of the input element being processed.
> It is OK to do this in *@FinishBundle*, since there is no existing windowing 
> context and each element you output is assigned to an appropriate window.
> *Further Context*
> We’ve bisected it to being introduced in 2.35.0, and I’m reasonably certain 
> it’s this PR https://github.com/apache/beam/pull/15381
> Our scenario is pretty trivial, we read off Pubsub and write to Elastic in a 
> streaming job, the config for the source and sink is respectively
> {noformat}
> pipeline.apply(
>     PubsubIO.readStrings().fromSubscription(subscription)
> ).apply(ParseJsons.of(OurObject::class.java))
>     .setCoder(KryoCoder.of())
> {noformat}
> and
> {noformat}
> ElasticsearchIO.write()
>     .withUseStatefulBatches(true)
>     .withMaxParallelRequestsPerWindow(1)
>     .withMaxBufferingDuration(Duration.standardSeconds(30))
>     // 5 bytes -> KiB -> MiB, so 5 MiB
>     .withMaxBatchSizeBytes(5L * 1024 * 1024)
>     // # of docs
>     .withMaxBatchSize(1000)
>     .withConnectionConfiguration(
>         ElasticsearchIO.ConnectionConfiguration.create(
>             arrayOf(host),
>             "fubar",
>             "_doc"
>         ).withConnectTimeout(5000)
>             .withSocketTimeout(3)
>     )
>     .withRetryConfiguration(
>         ElasticsearchIO.RetryConfiguration.create(
>             10,
>             // the duration is wall clock, against the connection and socket timeouts specified
>             // above. I.e., 10 x 30s is gonna be more than 3 minutes, so if we're getting
>             // 10 socket timeouts in a row, this would ignore the "10" part and terminate
>             // after 6. The idea is that in a mixed failure mode, you'd get different timeouts
>             // of different durations, and on average 10 x fails < 4m.
>             // That said, 4m is arbitrary, so adjust as and when needed.
>             Duration.standardMinutes(4)
>         )
>     )
>     .withIdFn { f: JsonNode -> f["id"].asText() }
>     .withIndexFn { f: JsonNode -> f["schema_name"].asText() }
>     .withIsDeleteFn { f: JsonNode -> f["_action"].asText("noop") == "delete" }
> {noformat}
> We recently tried upgrading from 2.33 to 2.36 and immediately hit a bug in 
> the consumer, due to alleged time skew, specifically:
> {noformat}
> 2022-03-07 10:48:37.886 GMT Error message from worker: 
> java.lang.IllegalArgumentException: Cannot output with timestamp 
> 2022-03-07T10:43:38.640Z. Output timestamps must be no earlier than the 
> timestamp of the 
> current input (2022-03-07T10:43:43.562Z) minus the allowed skew (0 
> milliseconds) and no later than 294247-01-10T04:00:54.775Z. See the 
> DoFn#getAllowedTimestampSkew() Javadoc 
> for details on changing the allowed skew. 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.cor
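
The lower bound in the error message quoted above can be illustrated with a small sketch. This is not Beam's implementation: the function name `is_valid_output_timestamp` is hypothetical, and the sketch checks only the "no earlier than the input timestamp minus the allowed skew" half of the rule, using the two timestamps from the quoted error.

```python
from datetime import datetime, timedelta, timezone


def is_valid_output_timestamp(output_ts, input_ts, allowed_skew=timedelta(0)):
    """Return True if output_ts is no earlier than input_ts minus allowed_skew.

    Hypothetical helper for illustration only; the upper bound mentioned in
    the error message is not modelled here.
    """
    return output_ts >= input_ts - allowed_skew


# Timestamps taken from the quoted error message.
input_ts = datetime(2022, 3, 7, 10, 43, 43, 562000, tzinfo=timezone.utc)
output_ts = datetime(2022, 3, 7, 10, 43, 38, 640000, tzinfo=timezone.utc)

# With the default skew of 0 ms the buffered element is rejected,
# because it is about 4.9 seconds older than the current input.
print(is_valid_output_timestamp(output_ts, input_ts))

# With a (hypothetical) allowed skew of 5 seconds the same output passes.
print(is_valid_output_timestamp(output_ts, input_ts, timedelta(seconds=5)))
```

The sketch shows why buffering across elements trips the check: the element flushed from the buffer carries an older timestamp than the input element that triggered the flush.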

[jira] [Assigned] (BEAM-14163) Performance Regressions in streaming python ParDo and GBK Load Tests

2022-03-24 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-14163:
--

Assignee: Valentyn Tymofieiev  (was: Daniel Oliveira)

> Performance Regressions in streaming python ParDo and GBK Load Tests
> 
>
> Key: BEAM-14163
> URL: https://issues.apache.org/jira/browse/BEAM-14163
> Project: Beam
>  Issue Type: Bug
>  Components: community-metrics, sdk-py-core
>Affects Versions: 2.38.0
>Reporter: Daniel Oliveira
>Assignee: Valentyn Tymofieiev
>Priority: P0
> Fix For: 2.38.0
>
>
> As specified in the [Beam Release 
> Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
>  I'm investigating performance regressions. The following load test metrics 
> show a clear and persistent performance regression starting around March 17 
> and affecting version 2.38.0.
> ParDo Load Tests: 
> http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
> GBK Load Tests: 
> http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now





[jira] [Commented] (BEAM-14171) CoGroupByKey loses values with large groups on Dataflow v1

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512130#comment-17512130
 ] 

Daniel Oliveira commented on BEAM-14171:


What does the timeline look like for getting a fix in, and should this block 
the release? I see this issue was reported recently and a PR just went up, but 
I'm not sure whether that PR addresses the root cause or is only tangentially 
related.

It looks like this affected the previous two versions, which by precedent is a 
reason for us _not_ to block, since this wouldn't be a new regression. But it 
also sounds like a pretty major problem if it affects all CoGroupByKeys, so I 
think it might be worth making an exception _if_ we think a fix can be 
implemented in a timely manner, say within the next three workdays as a rough 
target. Alternatively, if there are simple workarounds/mitigations for this, 
then I think we can just list it as a known issue and describe how users can 
mitigate it.

> CoGroupByKey loses values with large groups on Dataflow v1
> --
>
> Key: BEAM-14171
> URL: https://issues.apache.org/jira/browse/BEAM-14171
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-core
>Affects Versions: 2.36.0, 2.37.0
>Reporter: Niel Markwick
>Assignee: Robert Bradshaw
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CoGroupByKey can lose elements - replacing them with null values when a group 
> is large (>10,000 elements).
>  
> This only occurs in dataflow v1, not dataflow-v2 runner
> Possibly related to BEAM-13541.
>  
> https://lists.apache.org/thread/5y56kbgm3q0m1byzf7186rrkomrcfldm





[jira] [Commented] (BEAM-14153) Reshuffled Row Coder PCollection used direct to Side Input breaks Dataflow & PyPortable

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512127#comment-17512127
 ] 

Daniel Oliveira commented on BEAM-14153:


I'm looking at release-blocking bugs and trying to see if any can be safely 
removed from the list of release blockers.

What's the status on this? It does sound like a regression, which indicates 
that it's a release blocker, but how soon is a fix incoming? If a fix isn't 
coming soon, is it a major regression or something that can easily be worked 
around? The trigger sounds pretty specific, since it needs to be a reshuffled 
PCollection, so maybe we can just provide workaround instructions along with 
marking this as a known issue?

> Reshuffled Row Coder PCollection used direct to Side Input breaks Dataflow & 
> PyPortable
> ---
>
> Key: BEAM-14153
> URL: https://issues.apache.org/jira/browse/BEAM-14153
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.37.0
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: P2
> Fix For: 2.38.0
>
>
> Since first-class Iterable side inputs were implemented, passing a reshuffled 
> PCollection directly to a Side Input will cause a coder mismatch between 
> encoding the reshuffle and decoding it on Dataflow and on Python Portable. In 
> particular, the Row values are encoded without a length prefix, but the 
> decoder is then asked to decode them with a length prefix that was never 
> written.
> This is similar to the issue in BEAM-12438, which has been hacked around. 
> In this instance it's likely more resilient to always length-prefix 
> Row-encoded types, and make it explicit in the pipeline proto. This should 
> avoid issues with runners having odd behaviors WRT row coders at this time, 
> while not preventing them from introspecting row-encoded values should they 
> choose. This may also allow us to avoid the hack for BEAM-12438, though that 
> is something to be verified independently.
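
The mismatch described in the quoted issue can be sketched in a few lines. This is an illustration only, not Beam's coder code: the function names are hypothetical, and a fixed 4-byte big-endian length is assumed here for simplicity rather than whatever prefix format Beam actually uses.

```python
import struct


def encode_plain(payload: bytes) -> bytes:
    """Encode without any length prefix (hypothetical helper)."""
    return payload


def encode_length_prefixed(payload: bytes) -> bytes:
    """Encode with an assumed 4-byte big-endian length prefix."""
    return struct.pack(">I", len(payload)) + payload


def decode_length_prefixed(data: bytes) -> bytes:
    """Decode bytes that are expected to start with a 4-byte length prefix."""
    if len(data) < 4:
        raise ValueError("input too short to contain a length prefix")
    (length,) = struct.unpack(">I", data[:4])
    body = data[4:4 + length]
    if len(body) != length:
        raise ValueError(f"expected {length} bytes, found {len(body)}")
    return body


row = b"row-value"

# Matching encoder and decoder round-trip cleanly.
print(decode_length_prefixed(encode_length_prefixed(row)))

# Mismatched pair: the decoder misreads the first payload bytes as a length.
try:
    decode_length_prefixed(encode_plain(row))
except ValueError as err:
    print("decode failed:", err)
```

Always writing the prefix, as the issue proposes, removes the case where the two sides disagree about whether it is present.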





[jira] [Commented] (BEAM-14064) ElasticSearchIO#Write buffering and outputting across windows

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512106#comment-17512106
 ] 

Daniel Oliveira commented on BEAM-14064:


Hi, I'm managing the 2.38.0 release. I see this is set to release-blocking, but 
I'm not sure if it's actually worth blocking the release.

While it does fit our "significant regression or loss of functionality" 
requirement [from the 
website|https://beam.apache.org/contribute/release-blocking/], we've set a 
precedent of blocking for new regressions and not blocking for known existing 
issues. Basically, since this issue has been present for the past 3 releases, 
it's not a new regression and not something we want to block a release on. The 
cat's already out of the bag, so to speak. In addition to all that, the scope 
of this seems limited to ElasticsearchIO. If it had broad impact it might be 
worth making an exception, but not as it stands.

Evan, can we take this off the release blocker list and just get it in for 
2.39.0 instead? Do you have an argument for keeping it as a release blocker?

> ElasticSearchIO#Write buffering and outputting across windows
> -
>
> Key: BEAM-14064
> URL: https://issues.apache.org/jira/browse/BEAM-14064
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-elasticsearch
>Affects Versions: 2.35.0, 2.36.0, 2.37.0
>Reporter: Luke Cwik
>Assignee: Evan Galpin
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Source: https://lists.apache.org/thread/mtwtno2o88lx3zl12jlz7o5w1lcgm2db
> Bug PR: https://github.com/apache/beam/pull/15381

[jira] [Commented] (BEAM-14129) Fix issues with Pub/Sub Lite IO at high volumes

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512088#comment-17512088
 ] 

Daniel Oliveira commented on BEAM-14129:


Currently this Jira is marked as a release-blocker (due to the fix version 
being set to an upcoming release, and the issue not being resolved). This issue 
doesn't look like it's "a significant regression or loss of functionality" 
([see this page|https://beam.apache.org/contribute/release-blocking/]). Can we 
unmark it as release-blocking by clearing the "Fix Version" field until after 
it's resolved?

> Fix issues with Pub/Sub Lite IO at high volumes
> ---
>
> Key: BEAM-14129
> URL: https://issues.apache.org/jira/browse/BEAM-14129
> Project: Beam
>  Issue Type: Task
>  Components: io-java-gcp
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: P1
> Fix For: 2.38.0
>
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-13695) Provide more accurate size estimates for cache objects in Java 17

2022-03-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512085#comment-17512085
 ] 

Daniel Oliveira commented on BEAM-13695:


Currently this Jira is marked as a release-blocker (due to the fix version 
being set to an upcoming release, and the issue not being resolved). This issue 
doesn't look like it's "a significant regression or loss of functionality" 
([see this page|https://beam.apache.org/contribute/release-blocking/]). Can we 
unmark it as release-blocking?

> Provide more accurate size estimates for cache objects in Java 17
> -
>
> Key: BEAM-13695
> URL: https://issues.apache.org/jira/browse/BEAM-13695
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Kiley Sok
>Assignee: Kiley Sok
>Priority: P2
> Fix For: 2.38.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-14163) Performance Regressions in streaming python ParDo and GBK Load Tests

2022-03-23 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-14163:
--

 Summary: Performance Regressions in streaming python ParDo and GBK 
Load Tests
 Key: BEAM-14163
 URL: https://issues.apache.org/jira/browse/BEAM-14163
 Project: Beam
  Issue Type: Bug
  Components: community-metrics, sdk-py-core
Affects Versions: 2.38.0
Reporter: Daniel Oliveira
 Fix For: 2.38.0


As specified in the [Beam Release 
Guide|https://beam.apache.org/contribute/release-guide/#4-investigate-performance-regressions],
 I'm investigating performance regressions. The following load test metrics 
show a clear and persistent performance regression starting around March 17 
and affecting version 2.38.0.

ParDo Load Tests: 
http://metrics.beam.apache.org/d/MOi-kf3Zk/pardo-load-tests?orgId=1&var-processingType=streaming&var-sdk=python
GBK Load Tests: 
http://metrics.beam.apache.org/d/UYZ-oJ3Zk/gbk-load-tests?orgId=1&var-processingType=streaming&var-sdk=python&from=now-30d&to=now





[jira] [Commented] (BEAM-14122) Python portable precommit broken: 'get_installed_distributions'

2022-03-18 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509110#comment-17509110
 ] 

Daniel Oliveira commented on BEAM-14122:


I'm running into this on all the XLang Dataflow test suites too (which is 
blocking me since I'm trying to add a new one for Go).

> Python portable precommit broken: 'get_installed_distributions'
> ---
>
> Key: BEAM-14122
> URL: https://issues.apache.org/jira/browse/BEAM-14122
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Priority: P1
>  Labels: currently-failing
>
> Successfully installed PTable-0.9.2 pip-licenses-2.3.0
> WARNING: Running pip as the 'root' user can result in broken permissions and 
> conflicting behaviour with the system package manager. It is recommended to 
> use a virtual environment instead: https://pip.pypa.io/warnings/venv
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.9/site-packages/piplicenses.py", line 40, in 
> <module>
> from pip._internal.utils.misc import get_installed_distributions
> ImportError: cannot import name 'get_installed_distributions' from 
> 'pip._internal.utils.misc' 
> (/usr/local/lib/python3.9/site-packages/pip/_internal/utils/misc.py)
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/bin/pip-licenses", line 5, in <module>
> from piplicenses import main
>   File "/usr/local/lib/python3.9/site-packages/piplicenses.py", line 42, in 
> <module>
> from pip import get_installed_distributions
> ImportError: cannot import name 'get_installed_distributions' from 'pip' 
> (/usr/local/lib/python3.9/site-packages/pip/__init__.py)
> Traceback (most recent call last):
>   File "/tmp/license_scripts/pull_licenses_py.py", line 166, in <module>
> dependencies = run_pip_licenses()
>   File "/tmp/license_scripts/pull_licenses_py.py", line 49, in 
> run_pip_licenses
> dependencies = run_bash_command(command)
>   File "/tmp/license_scripts/pull_licenses_py.py", line 44, in 
> run_bash_command
> return subprocess.check_output(command.split()).decode('utf-8')
>   File "/usr/local/lib/python3.9/subprocess.py", line 424, in check_output
> return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
>   File "/usr/local/lib/python3.9/subprocess.py", line 528, in run
> raise CalledProcessError(retcode, process.args,
> subprocess.CalledProcessError: Command '['pip-licenses', 
> '--with-license-file', '--with-urls', '--from=mixed', '--ignore', 
> 'apache-beam', '--format=json']' returned non-zero exit status 1.
> The command '/bin/sh -c if [ "$pull_licenses" = "true" ] ; then   pip 
> install 'pip-licenses<3.0.0' pyyaml tenacity &&   python 
> /tmp/license_scripts/pull_licenses_py.py ; fi' returned a non-zero code: 1
> > Task :sdks:python:container:py39:docker FAILED
> https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/4748
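
The traceback above shows `get_installed_distributions` failing to import from the installed pip, both from `pip._internal.utils.misc` and from `pip` itself. As a hedged illustration of listing installed packages without pip internals, the sketch below uses the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_distributions` is hypothetical, and this is not necessarily how pip-licenses or the Beam license script resolved the problem.

```python
from importlib import metadata


def installed_distributions():
    """Return sorted (name, version) pairs for installed distributions.

    Hypothetical helper; entries missing a name or version are skipped.
    """
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:
            pairs.add((name, version))
    return sorted(pairs)


for name, version in installed_distributions():
    print(f"{name}=={version}")
```

Because this relies only on a documented standard-library API, it does not break when pip reorganizes its private `_internal` modules.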





[jira] [Commented] (BEAM-14017) beam_PreCommit_CommunityMetrics_Cron is failing.

2022-03-09 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503924#comment-17503924
 ] 

Daniel Oliveira commented on BEAM-14017:


Also relevant: the scripts behind the failing Gradle task are here: 
https://github.com/apache/beam/tree/master/.test-infra/metrics
The port that's failing to connect, 443, is the standard HTTPS port and is 
hardcoded in a few places in that directory: 
https://github.com/apache/beam/search?l=Python&q=%22443%22
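
A quick way to tell an unreachable endpoint apart from a misconfigured client is a plain TCP probe with a timeout. The sketch below is a generic check, not part of the Beam metrics scripts, and the helper name `can_connect` is hypothetical.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout.

    Connection refusals, timeouts, and other socket errors all count as
    "not reachable" and return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("104.154.102.21", 443)` from a Jenkins worker, using the address from the quoted log, would show whether the i/o timeout is reproducible outside of the Gradle task.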

> beam_PreCommit_CommunityMetrics_Cron is failing.
> 
>
> Key: BEAM-14017
> URL: https://issues.apache.org/jira/browse/BEAM-14017
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Daniel Oliveira
>Priority: P1
>
> https://ci-beam.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/4805/console
> 10:14:48 > Task :beam-test-infra-metrics:validateConfiguration
> 10:14:48 W0228 18:14:48.092605  389274 helpers.go:549] --dry-run=true is 
> deprecated (boolean value) and can be replaced with --dry-run=client.
> 10:15:20 Unable to connect to the server: dial tcp 104.154.102.21:443: i/o 
> timeout (Client.Timeout exceeded while awaiting headers)
> 10:15:20 
> 10:15:20 > Task :beam-test-infra-metrics:validateConfiguration FAILED
> 10:15:20 
> 10:15:20 FAILURE: Build failed with an exception.
> 10:15:20 
> 10:15:20 * What went wrong:
> 10:15:20 Execution failed for task 
> ':beam-test-infra-metrics:validateConfiguration'.





[jira] [Assigned] (BEAM-14017) beam_PreCommit_CommunityMetrics_Cron is failing.

2022-03-09 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-14017:
--

Assignee: Daniel Oliveira  (was: Heejong Lee)

> beam_PreCommit_CommunityMetrics_Cron is failing.
> 
>
> Key: BEAM-14017
> URL: https://issues.apache.org/jira/browse/BEAM-14017
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Daniel Oliveira
>Priority: P1
>
> https://ci-beam.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/4805/console
> 10:14:48 > Task :beam-test-infra-metrics:validateConfiguration
> 10:14:48 W0228 18:14:48.092605  389274 helpers.go:549] --dry-run=true is 
> deprecated (boolean value) and can be replaced with --dry-run=client.
> 10:15:20 Unable to connect to the server: dial tcp 104.154.102.21:443: i/o 
> timeout (Client.Timeout exceeded while awaiting headers)
> 10:15:20 
> 10:15:20 > Task :beam-test-infra-metrics:validateConfiguration FAILED
> 10:15:20 
> 10:15:20 FAILURE: Build failed with an exception.
> 10:15:20 
> 10:15:20 * What went wrong:
> 10:15:20 Execution failed for task 
> ':beam-test-infra-metrics:validateConfiguration'.





[jira] [Commented] (BEAM-14017) beam_PreCommit_CommunityMetrics_Cron is failing.

2022-03-09 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17503923#comment-17503923
 ] 

Daniel Oliveira commented on BEAM-14017:


I spent some time looking into this and couldn't figure out anything really 
useful. The best lead I have is that the error that's appearing is probably due 
to being unable to connect to a Kubernetes cluster, and that the error first 
started appearing Feb. 25 sometime between 4 PM PST and 10 PM PST. It seems 
likely that this is due to a change in our GCP environment but I haven't been 
able to find what it could be.

Last good run: 
https://ci-beam.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/4794/
First bad run: 
https://ci-beam.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/4795/

> beam_PreCommit_CommunityMetrics_Cron is failing.
> 
>
> Key: BEAM-14017
> URL: https://issues.apache.org/jira/browse/BEAM-14017
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Assignee: Heejong Lee
>Priority: P1
>
> https://ci-beam.apache.org/job/beam_PreCommit_CommunityMetrics_Cron/4805/console
> 10:14:48 > Task :beam-test-infra-metrics:validateConfiguration
> 10:14:48 W0228 18:14:48.092605  389274 helpers.go:549] --dry-run=true is 
> deprecated (boolean value) and can be replaced with --dry-run=client.
> 10:15:20 Unable to connect to the server: dial tcp 104.154.102.21:443: i/o 
> timeout (Client.Timeout exceeded while awaiting headers)
> 10:15:20 
> 10:15:20 > Task :beam-test-infra-metrics:validateConfiguration FAILED
> 10:15:20 
> 10:15:20 FAILURE: Build failed with an exception.
> 10:15:20 
> 10:15:20 * What went wrong:
> 10:15:20 Execution failed for task 
> ':beam-test-infra-metrics:validateConfiguration'.





[jira] [Updated] (BEAM-13857) Add expansion service startup to Go integration test flags.

2022-02-17 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13857:
---
Description: 
Currently a separate debezium io expansion address flag needs to be passed to 
the runner when running cross-language debezium IO pipelines from Go SDK. Find 
a way to do this in a better way so that we could have it started along with 
java io expansion service while spinning up the test without bulking 
:sdks:java:io:expansion-service.

In particular, needing to add a flag per expansion service jar to our 
integration tests will eventually become quite cluttered, so we may wish to 
settle on some kind of KV map flag approach instead to reduce copypasta code 
overhead.

Edit: Decided on going with the KV map flag approach within the Go SDK instead 
of in a bash script, and moving expansion service startup into the codebase as 
well.

  was:
Currently a separate debezium io expansion address flag needs to be passed to 
the runner when running cross-language debezium IO pipelines from Go SDK. Find 
a way to do this in a better way so that we could have it started along with 
java io expansion service while spinning up the test without bulking 
:sdks:java:io:expansion-service.

In particular, needing to add a flag per expansion service jar to our 
integration tests will eventually become quite cluttered, so we may wish to 
settle on some kind of KV map flag approach instead to reduce copypasta code 
overhead.


> Add expansion service startup to Go integration test flags.
> ---
>
> Key: BEAM-13857
> URL: https://issues.apache.org/jira/browse/BEAM-13857
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Ritesh Ghorse
>Assignee: Daniel Oliveira
>Priority: P2
>
> Currently a separate debezium io expansion address flag needs to be passed to 
> the runner when running cross-language debezium IO pipelines from Go SDK. 
> Find a way to do this in a better way so that we could have it started along 
> with java io expansion service while spinning up the test without bulking 
> :sdks:java:io:expansion-service.
> In particular, needing to add a flag per expansion service jar to our 
> integration tests will eventually become quite cluttered, so we may wish to 
> settle on some kind of KV map flag approach instead to reduce copypasta code 
> overhead.
> Edit: Decided on going with the KV map flag approach within the Go SDK 
> instead of in a bash script, and moving expansion service startup into the 
> codebase as well.
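
The "KV map flag" idea in the quoted description, one repeatable flag carrying label/address pairs instead of one flag per expansion service, can be sketched as follows. The actual change was made in the Go SDK; this Python version is only an illustration, and the flag name `--expansion_addr`, the `label:address` format, and the helper names are all assumptions.

```python
import argparse


def parse_kv(entry):
    """Split 'label:address' on the first colon (format is an assumption)."""
    label, sep, addr = entry.partition(":")
    if not sep or not label or not addr:
        raise argparse.ArgumentTypeError(
            f"expected label:address, got {entry!r}")
    return label, addr


def parse_expansion_addrs(argv):
    """Collect every --expansion_addr occurrence into a label -> address map."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--expansion_addr", action="append",
                        type=parse_kv, default=[])
    args = parser.parse_args(argv)
    return dict(args.expansion_addr)


addrs = parse_expansion_addrs([
    "--expansion_addr=io:localhost:8097",
    "--expansion_addr=debezium:localhost:8098",
])
print(addrs)
```

Adding another expansion service then means passing one more `label:address` pair rather than defining a new flag in the test harness.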





[jira] [Updated] (BEAM-13857) Add expansion service startup to Go integration test flags.

2022-02-17 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13857:
---
Summary: Add expansion service startup to Go integration test flags.  (was: 
DebeziumIO expansion address flag in Go SDK)

> Add expansion service startup to Go integration test flags.
> ---
>
> Key: BEAM-13857
> URL: https://issues.apache.org/jira/browse/BEAM-13857
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Ritesh Ghorse
>Assignee: Daniel Oliveira
>Priority: P2
>
> Currently a separate debezium io expansion address flag needs to be passed to 
> the runner when running cross-language debezium IO pipelines from Go SDK. 
> Find a way to do this in a better way so that we could have it started along 
> with java io expansion service while spinning up the test without bulking 
> :sdks:java:io:expansion-service.
> In particular, needing to add a flag per expansion service jar to our 
> integration tests will eventually become quite cluttered, so we may wish to 
> settle on some kind of KV map flag approach instead to reduce copypasta code 
> overhead.





[jira] [Assigned] (BEAM-13857) DebeziumIO expansion address flag in Go SDK

2022-02-08 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-13857:
--

Assignee: Daniel Oliveira

> DebeziumIO expansion address flag in Go SDK
> ---
>
> Key: BEAM-13857
> URL: https://issues.apache.org/jira/browse/BEAM-13857
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Ritesh Ghorse
>Assignee: Daniel Oliveira
>Priority: P2
>
> Currently a separate debezium io expansion address flag needs to be passed to 
> the runner when running cross-language debezium IO pipelines from Go SDK. 
> Find a way to do this in a better way so that we could have it started along 
> with java io expansion service while spinning up the test without bulking 
> :sdks:java:io:expansion-service.
> In particular, needing to add a flag per expansion service jar to our 
> integration tests will eventually become quite cluttered, so we may wish to 
> settle on some kind of KV map flag approach instead to reduce copypasta code 
> overhead.





[jira] [Updated] (BEAM-13321) [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO

2022-02-08 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13321:
---
Fix Version/s: 2.37.0
   Resolution: Fixed
   Status: Resolved  (was: In Progress)

> [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO
> ---
>
> Key: BEAM-13321
> URL: https://issues.apache.org/jira/browse/BEAM-13321
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, io-java-gcp
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: 2.37.0
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This is described in detail in this design doc: 
> [https://s.apache.org/beam-bigquery-externalization]
> The short version of this task is to have a minimum viable implementation of 
> BigQuery IO available for cross-language usage via SchemaIO.





[jira] [Updated] (BEAM-13732) [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO

2022-02-08 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13732:
---
Fix Version/s: 2.37.0
   Resolution: Fixed
   Status: Resolved  (was: In Progress)

> [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO
> ---
>
> Key: BEAM-13732
> URL: https://issues.apache.org/jira/browse/BEAM-13732
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: 2.37.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Title says it all.





[jira] [Work started] (BEAM-13806) [Cross-Language] Jenkins integration test for Go SDK BigQuery IO.

2022-02-08 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-13806 started by Daniel Oliveira.
--
> [Cross-Language] Jenkins integration test for Go SDK BigQuery IO.
> -
>
> Key: BEAM-13806
> URL: https://issues.apache.org/jira/browse/BEAM-13806
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, io-go-gcp, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Title says it all. Add an integration test for cross-language BigQuery IO 
> that runs on Jenkins.





[jira] [Created] (BEAM-13806) [Cross-Language] Jenkins integration test for Go SDK BigQuery IO.

2022-02-02 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13806:
--

 Summary: [Cross-Language] Jenkins integration test for Go SDK 
BigQuery IO.
 Key: BEAM-13806
 URL: https://issues.apache.org/jira/browse/BEAM-13806
 Project: Beam
  Issue Type: New Feature
  Components: cross-language, io-go-gcp, sdk-go
Reporter: Daniel Oliveira


Title says it all. Add an integration test for cross-language BigQuery IO that 
runs on Jenkins.





[jira] [Commented] (BEAM-13732) [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO

2022-01-24 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481378#comment-17481378
 ] 

Daniel Oliveira commented on BEAM-13732:


Requires BEAM-13321 to be implemented.

> [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO
> ---
>
> Key: BEAM-13732
> URL: https://issues.apache.org/jira/browse/BEAM-13732
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>
> Title says it all.





[jira] [Work started] (BEAM-13321) [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO

2022-01-24 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-13321 started by Daniel Oliveira.
--
> [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO
> ---
>
> Key: BEAM-13321
> URL: https://issues.apache.org/jira/browse/BEAM-13321
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, io-java-gcp
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> This is described in detail in this design doc: 
> [https://s.apache.org/beam-bigquery-externalization]
> The short version of this task is to have a minimum viable implementation of 
> BigQuery IO available for cross-language usage via SchemaIO.





[jira] [Created] (BEAM-13732) [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO

2022-01-24 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13732:
--

 Summary: [Cross-Language] Implement Go SDK wrapper for xlang 
BigQuery IO
 Key: BEAM-13732
 URL: https://issues.apache.org/jira/browse/BEAM-13732
 Project: Beam
  Issue Type: New Feature
  Components: cross-language, sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


Title says it all.





[jira] [Work started] (BEAM-13732) [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO

2022-01-24 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-13732 started by Daniel Oliveira.
--
> [Cross-Language] Implement Go SDK wrapper for xlang BigQuery IO
> ---
>
> Key: BEAM-13732
> URL: https://issues.apache.org/jira/browse/BEAM-13732
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>
> Title says it all.





[jira] [Commented] (BEAM-13321) [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO

2022-01-11 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474236#comment-17474236
 ] 

Daniel Oliveira commented on BEAM-13321:


It doesn't; there are still some additional PRs coming up.

> [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO
> ---
>
> Key: BEAM-13321
> URL: https://issues.apache.org/jira/browse/BEAM-13321
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, io-java-gcp
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> This is described in detail in this design doc: 
> [https://s.apache.org/beam-bigquery-externalization]
> The short version of this task is to have a minimum viable implementation of 
> BigQuery IO available for cross-language usage via SchemaIO.





[jira] [Created] (BEAM-13618) Java BigQuery IO: DirectRead does not work with Beam Schema support.

2022-01-09 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13618:
--

 Summary: Java BigQuery IO: DirectRead does not work with Beam 
Schema support.
 Key: BEAM-13618
 URL: https://issues.apache.org/jira/browse/BEAM-13618
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp
Affects Versions: 2.35.0
Reporter: Daniel Oliveira


Currently in BigQueryIO, Reads with Beam Schema support (for example using 
[readTableRowsWithSchema|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L553])
 don't actually have Schema support if using DirectRead as a read method. This 
appears to be because the expansion logic for DirectReads takes [a different 
path|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1060]
 that doesn't include any considerations for beam schemas ([example of the code 
handling Beam schemas in the default 
path|https://github.com/apache/beam/blob/v2.35.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1204]).

Part of the reason for this is likely that the current approach to Beam Schema 
support is to get a description of the BQ table's schema and then convert it to 
a Beam schema. However, with DirectRead, specific columns can be excluded while 
reading, so the Beam schema needed doesn't correspond directly to the table's 
schema; it would need to be constructed based on the specific fields selected 
for the read.

(As a side note, this is currently not documented anywhere, leading me to 
believe this is an oversight or potential bug. I will add some documentation 
indicating that schema support currently does not work with DirectRead.)
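
To illustrate the point about constructing the schema from the selected fields, here is a minimal Go sketch. It is not the BigQueryIO implementation: the Field type and the projectSchema helper are hypothetical names made up for this illustration, and the column types are plain strings rather than real BigQuery or Beam schema types.

```go
package main

import "fmt"

// Field is a hypothetical, simplified stand-in for one column of a schema.
type Field struct {
	Name string
	Type string
}

// projectSchema builds the schema for a read that selects only some columns.
// It returns the table's fields named in selected, in selection order, and
// an error if a selected name is not present in the table schema.
func projectSchema(table []Field, selected []string) ([]Field, error) {
	byName := make(map[string]Field, len(table))
	for _, f := range table {
		byName[f.Name] = f
	}
	out := make([]Field, 0, len(selected))
	for _, name := range selected {
		f, ok := byName[name]
		if !ok {
			return nil, fmt.Errorf("selected field %q is not in the table schema", name)
		}
		out = append(out, f)
	}
	return out, nil
}

func main() {
	// Illustrative table schema; the column names are assumptions.
	table := []Field{
		{Name: "word", Type: "STRING"},
		{Name: "word_count", Type: "INTEGER"},
		{Name: "corpus", Type: "STRING"},
	}
	schema, err := projectSchema(table, []string{"word", "word_count"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(schema)
}
```

The design point is only that the resulting schema is derived from the selection, not from the full table description.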





[jira] [Assigned] (BEAM-13456) beam_PostCommit_Java consistently timing out.

2021-12-14 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-13456:
--

Assignee: Kenneth Knowles

> beam_PostCommit_Java consistently timing out.
> -
>
> Key: BEAM-13456
> URL: https://issues.apache.org/jira/browse/BEAM-13456
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Affects Versions: 2.36.0
>Reporter: Daniel Oliveira
>Assignee: Kenneth Knowles
>Priority: P1
>
> This seems to have first appeared with build #8367: 
> [https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]
> Frustratingly, no build scans pop up when the test fails this way, and no 
> error messages appear except for the timeout. It may be easiest to attempt to 
> determine which commit introduced the error.
> The previous successful test is at commit 
> b52762bf150cacceb0fdeb1f0dc85cbea6e6f39c
> The first failing test is at commit 06a5e67332aae53ea90dedb4ef6421c2a7d65035





[jira] [Updated] (BEAM-13456) beam_PostCommit_Java consistently timing out.

2021-12-13 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13456:
---
Description: 
This seems to have first appeared with build #8367: 
[https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]

Frustratingly, no build scans pop up when the test fails this way, and no error 
messages appear except for the timeout. It may be easiest to attempt to 
determine which commit introduced the error.

The previous successful test is at commit 
b52762bf150cacceb0fdeb1f0dc85cbea6e6f39c

The first failing test is at commit 06a5e67332aae53ea90dedb4ef6421c2a7d65035

  was:
This seems to have first appeared with build #8367: 
[https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]

Frustratingly, no build scans pop up when the test fails this way, and no error 
messages appear except for the timeout. It may be easiest to attempt to 
determine which CL introduced the error.


> beam_PostCommit_Java consistently timing out.
> -
>
> Key: BEAM-13456
> URL: https://issues.apache.org/jira/browse/BEAM-13456
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Affects Versions: 2.36.0
>Reporter: Daniel Oliveira
>Priority: P1
>
> This seems to have first appeared with build #8367: 
> [https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]
> Frustratingly, no build scans pop up when the test fails this way, and no 
> error messages appear except for the timeout. It may be easiest to attempt to 
> determine which commit introduced the error.
> The previous successful test is at commit 
> b52762bf150cacceb0fdeb1f0dc85cbea6e6f39c
> The first failing test is at commit 06a5e67332aae53ea90dedb4ef6421c2a7d65035





[jira] [Commented] (BEAM-13456) beam_PostCommit_Java consistently timing out.

2021-12-13 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458912#comment-17458912
 ] 

Daniel Oliveira commented on BEAM-13456:


As a side note, the beam_PostCommit_Java_ValidatesRunner_ULR test appears to be 
suffering from the same issue starting at nearly the same time, so it's 
probably related: 
[https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/]

> beam_PostCommit_Java consistently timing out.
> -
>
> Key: BEAM-13456
> URL: https://issues.apache.org/jira/browse/BEAM-13456
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Affects Versions: 2.36.0
>Reporter: Daniel Oliveira
>Priority: P1
>
> This seems to have first appeared with build #8367: 
> [https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]
> Frustratingly, no build scans pop up when the test fails this way, and no 
> error messages appear except for the timeout. It may be easiest to attempt to 
> determine which CL introduced the error.





[jira] [Created] (BEAM-13456) beam_PostCommit_Java consistently timing out.

2021-12-13 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13456:
--

 Summary: beam_PostCommit_Java consistently timing out.
 Key: BEAM-13456
 URL: https://issues.apache.org/jira/browse/BEAM-13456
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Affects Versions: 2.36.0
Reporter: Daniel Oliveira


This seems to have first appeared with build #8367: 
[https://ci-beam.apache.org/job/beam_PostCommit_Java/8367/]

Frustratingly, no build scans pop up when the test fails this way, and no error 
messages appear except for the timeout. It may be easiest to attempt to 
determine which CL introduced the error.





[jira] [Commented] (BEAM-13433) beam_PostCommit_Python37 failing, potentially due to apache_beam.ml.gcp.cloud_dlp_it_test.CloudDLPIT

2021-12-09 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456879#comment-17456879
 ] 

Daniel Oliveira commented on BEAM-13433:


Assigning to you, Pablo, as a general Python IO dev, since I can't find any 
specific owner for the test.

> beam_PostCommit_Python37 failing, potentially due to 
> apache_beam.ml.gcp.cloud_dlp_it_test.CloudDLPIT
> 
>
> Key: BEAM-13433
> URL: https://issues.apache.org/jira/browse/BEAM-13433
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Affects Versions: 2.36.0
>Reporter: Daniel Oliveira
>Assignee: Pablo Estrada
>Priority: P2
>
> It's difficult for me to test for sure, because each run seems to show 
> slightly different errors, and sometimes the errors don't even show at all. 
> To track this down, you need to check the gradle build scan for the test, 
> because the raw logs are too long to find the appropriate error.
> This is one that shows an error: 
> [https://ci-beam.apache.org/job/beam_PostCommit_Python37/4617/]
> As far as I can tell, this is the error, apparently happening due to 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/gcp/cloud_dlp_it_test.py]
> {noformat}
> Traceback (most recent call last):
> File "apache_beam/runners/common.py", line 1198, in 
> apache_beam.runners.common.DoFnRunner.process
> File "apache_beam/runners/common.py", line 536, in 
> apache_beam.runners.common.SimpleInvoker.invoke_process
> File "apache_beam/runners/common.py", line 1334, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", 
> line 199, in process
>   item={"value": element}, **self.params)
> TypeError: deidentify_content() got an unexpected keyword argument 'item'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
> File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", 
> line 644, in do_work
>   work_executor.execute()
> File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", 
> line 208, in execute
>   op.start()
> File "dataflow_worker/native_operations.py", line 38, in 
> dataflow_worker.native_operations.NativeReadOperation.start
> File "dataflow_worker/native_operations.py", line 39, in 
> dataflow_worker.native_operations.NativeReadOperation.start
> File "dataflow_worker/native_operations.py", line 44, in 
> dataflow_worker.native_operations.NativeReadOperation.start
> File "dataflow_worker/native_operations.py", line 54, in 
> dataflow_worker.native_operations.NativeReadOperation.start
> File "apache_beam/runners/worker/operations.py", line 348, in 
> apache_beam.runners.worker.operations.Operation.output
> File "apache_beam/runners/worker/operations.py", line 215, in 
> apache_beam.runners.worker.operations.SingletonConsumerSet.receive
> File "apache_beam/runners/worker/operations.py", line 707, in 
> apache_beam.runners.worker.operations.DoOperation.process
> File "apache_beam/runners/worker/operations.py", line 708, in 
> apache_beam.runners.worker.operations.DoOperation.process
> File "apache_beam/runners/common.py", line 1200, in 
> apache_beam.runners.common.DoFnRunner.process
> File "apache_beam/runners/common.py", line 1281, in 
> apache_beam.runners.common.DoFnRunner._reraise_augmented
> File "apache_beam/runners/common.py", line 1198, in 
> apache_beam.runners.common.DoFnRunner.process
> File "apache_beam/runners/common.py", line 536, in 
> apache_beam.runners.common.SimpleInvoker.invoke_process
> File "apache_beam/runners/common.py", line 1334, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
> File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", 
> line 199, in process
>   item={"value": element}, **self.params)
> TypeError: deidentify_content() got an unexpected keyword argument 'item' 
> [while running 'MaskDetectedDetails/ParDo(_DeidentifyFn)']{noformat}





[jira] [Updated] (BEAM-13433) beam_PostCommit_Python37 failing, potentially due to apache_beam.ml.gcp.cloud_dlp_it_test.CloudDLPIT

2021-12-09 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13433:
---
Description: 
It's difficult for me to test for sure, because each run seems to show slightly 
different errors, and sometimes the errors don't even show at all. To track 
this down, you need to check the gradle build scan for the test, because the 
raw logs are too long to find the appropriate error.

This is one that shows an error: 
[https://ci-beam.apache.org/job/beam_PostCommit_Python37/4617/]

As far as I can tell, this is the error, apparently happening due to 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/gcp/cloud_dlp_it_test.py]
{noformat}
Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1198, in 
apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 536, in 
apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1334, in 
apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", 
line 199, in process
  item={"value": element}, **self.params)
TypeError: deidentify_content() got an unexpected keyword argument 'item'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", 
line 644, in do_work
  work_executor.execute()
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 
208, in execute
  op.start()
File "dataflow_worker/native_operations.py", line 38, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 39, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 44, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 54, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "apache_beam/runners/worker/operations.py", line 348, in 
apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 215, in 
apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 707, in 
apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 708, in 
apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1200, in 
apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1281, in 
apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 1198, in 
apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 536, in 
apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1334, in 
apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", 
line 199, in process
  item={"value": element}, **self.params)
TypeError: deidentify_content() got an unexpected keyword argument 'item' 
[while running 'MaskDetectedDetails/ParDo(_DeidentifyFn)']{noformat}

  was:
It's difficult for me to test for sure, because each run seems to show slightly 
different errors, and sometimes the errors don't even show at all. To track 
this down, you need to check the gradle build scan for the test, because the 
raw logs are too long to find the appropriate error.

This is one that shows an error: 
[https://ci-beam.apache.org/job/beam_PostCommit_Python37/4617/]

As far as I can tell, this is the error, apparently happening due to 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/gcp/cloud_dlp_it_test.py]
{noformat}
Traceback (most recent call last):  
 File "apache_beam/runners/common.py", line 
1198, in apache_beam.runners.common.DoFnRunner.process  
 File 
"apache_beam/runners/common.py", line 536, in 
apache_beam.runners.common.SimpleInvoker.invoke_process 
  File 
"apache_beam/runners/common.py", line 1334, in 
apache_beam.runners.common._OutputProcessor.process_outputs 
  File 
"/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", line 
199, in process 
item={"value": element}, **self.params) 
  

[jira] [Created] (BEAM-13433) beam_PostCommit_Python37 failing, potentially due to apache_beam.ml.gcp.cloud_dlp_it_test.CloudDLPIT

2021-12-09 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13433:
--

 Summary: beam_PostCommit_Python37 failing, potentially due to 
apache_beam.ml.gcp.cloud_dlp_it_test.CloudDLPIT
 Key: BEAM-13433
 URL: https://issues.apache.org/jira/browse/BEAM-13433
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Affects Versions: 2.36.0
Reporter: Daniel Oliveira
Assignee: Pablo Estrada


It's difficult for me to test for sure, because each run seems to show slightly 
different errors, and sometimes the errors don't even show at all. To track 
this down, you need to check the gradle build scan for the test, because the 
raw logs are too long to find the appropriate error.

This is one that shows an error: 
[https://ci-beam.apache.org/job/beam_PostCommit_Python37/4617/]

As far as I can tell, this is the error, apparently happening due to 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/ml/gcp/cloud_dlp_it_test.py]
{noformat}
Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1198, in 
apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 536, in 
apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1334, in 
apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.7/site-packages/apache_beam/ml/gcp/cloud_dlp.py", 
line 199, in process
  item={"value": element}, **self.params)
TypeError: deidentify_content() got an unexpected keyword argument 'item'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", 
line 644, in do_work
  work_executor.execute()
File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 
208, in execute
  op.start()
File "dataflow_worker/native_operations.py", line 38, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 39, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 44, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "dataflow_worker/native_operations.py", line 54, in 
dataflow_worker.native_operations.NativeReadOperation.start
File "apache_beam/runners/worker/operations.py", line 348, in 
apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 215, in 
apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 707, in 
apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 708, in 
apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 1200, in 
apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 1281, in 
apach

[jira] [Created] (BEAM-13420) [Go SDK] Decoding a schema row doesn't respect field names from struct tags

2021-12-08 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13420:
--

 Summary: [Go SDK] Decoding a schema row doesn't respect field 
names from struct tags
 Key: BEAM-13420
 URL: https://issues.apache.org/jira/browse/BEAM-13420
 Project: Beam
  Issue Type: Bug
  Components: sdk-go
Reporter: Daniel Oliveira


Will attempt to create a reproducible code snippet soon.

For now, the basics of the bug are that when an element encoded as a Row via 
schemas gets decoded to a Go struct, it doesn't respect struct tags somewhere 
in the conversion process. More specifically, if I have an external transform 
that outputs a Row, and I take that as input to a native Go transform that 
accepts some Go struct "Foo", it will fail if the field names are different, 
even if the struct tags on Foo match the external row's field names.
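
Until a proper reproduction is available, the failure mode can be approximated with the standard library alone. The sketch below does not use the Beam SDK; the decoded type is an assumption standing in for whatever struct type the row decoder synthesizes, and process and callWith are hypothetical helpers. It shows that reflect refuses to call a function with a struct whose field names differ from the parameter type's, even when the struct tags are identical.

```go
package main

import (
	"fmt"
	"reflect"
)

// Foo is the user's native struct: Go-style field names, with struct tags
// that match the external row's field names.
type Foo struct {
	Word      string `beam:"word"`
	WordCount int64  `beam:"word_count"`
}

// decoded is an assumed stand-in for a struct type derived from the raw
// row's field names. The tags are identical to Foo's, the names are not.
type decoded struct {
	Word       string `beam:"word"`
	Word_count int64  `beam:"word_count"`
}

// process plays the role of a native Go transform that accepts Foo.
func process(f Foo) int64 { return f.WordCount }

// callWith invokes fn with arg through reflection and converts any panic
// raised by reflect.Value.Call into an error.
func callWith(fn, arg interface{}) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	reflect.ValueOf(fn).Call([]reflect.Value{reflect.ValueOf(arg)})
	return nil
}

func main() {
	// Same type as the parameter: the call succeeds.
	fmt.Println(callWith(process, Foo{Word: "a", WordCount: 1}))
	// Same tags, different field names: reflect panics on the call.
	fmt.Println(callWith(process, decoded{Word: "a", Word_count: 1}))
}
```

Matching tags are not enough here because Go treats the two structs as distinct types, so any decoder relying on reflection has to build or convert to the exact target type.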

Example error message: the following error is caused by the WordCount and 
CorpusDate fields in the native struct not matching the field names 
"word_count" and "corpus_date" from the raw Row. The row gets decoded into a 
struct with the field names Word_count and Corpus_date, ignoring the struct 
tags of the struct it's attempting to match:
{noformat}
panic: reflect: Call using struct { Word string "beam:\"word\""; Word_count 
int64 "beam:\"word_count\""; Corpus string "beam:\"corpus\""; Corpus_date int64 
"beam:\"corpus_date\"" } as type struct { Word string "beam:\"word\""; 
WordCount int64 "beam:\"word_count\""; Corpus string "beam:\"corpus\""; 
CorpusDate int64 "beam:\"corpus_date\"" }
Full error:
while executing Process for Plan[process-bundle-descriptor-291]:
2: DataSink[S[ptransform-289@localhost:12371]] 
Coder:W;coder-315>!GWC
3: PCollection[pcollection-304] Out:[2]
4: ParDo[beam.addFixedKeyFn] Out:[2]
5: PCollection[pcollection-300] Out:[4]
6: ParDo[main.main.func1] Out:[5]
1: DataSource[S[ptransform-288@localhost:12371], 0] 
Coder:W;coder-310>!GWC Out:6
caused by:
panic: reflect: Call using struct { Word string "beam:\"word\""; Word_count 
int64 "beam:\"word_count\""; Corpus string "beam:\"corpus\""; Corpus_date int64 
"beam:\"corpus_date\"" } as type struct { Word string "beam:\"word\""; 
WordCount int64 "beam:\"word_count\""; Corpus string "beam:\"corpus\""; 
CorpusDate int64 "beam:\"corpus_date\"" } goroutine 39 [running]:
runtime/debug.Stack()
/usr/lib/google-golang/src/runtime/debug/stack.go:24 +0x65
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.callNoPanic.func1()

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/util.go:58
 +0xa5
panic({0xe8f160, 0xc0003149e0})
/usr/lib/google-golang/src/runtime/panic.go:1038 +0x215
reflect.Value.call({0xc0003267e0, 0xc0003b3e10, 0x5be732}, {0x102e0e5, 0x4}, 
{0xcfe3d8, 0x1, 0x679031})
/usr/lib/google-golang/src/reflect/value.go:411 +0x1965
reflect.Value.Call({0xc0003267e0, 0xc0003b3e10, 0x1}, {0xcfe3d8, 0x1, 0x1})
/usr/lib/google-golang/src/reflect/value.go:339 +0xc5
github.com/apache/beam/sdks/v2/go/pkg/beam/core/util/reflectx.(*reflectFunc).Call(0xc000426060,
 {0xc0003326f0, 0x0, 0xf5ef00})

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/util/reflectx/call.go:87
 +0x59
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*invoker).initCall.func33({0x18e7a90,
 0x1, 0x1}, 0xffdf3b645a1cac09)

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/fn_arity.go:229
 +0x7b
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*invoker).Invoke(0xc000477860,
 {0x1168a50, 0xc00033cdc0}, {0x18e7a90, 0x0, 0x1}, 0x203000, 0xc00015d660, 
{0x1934830, 0x0, ...})

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/fn.go:186
 +0x7a2
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*ParDo).invokeProcessFn(0xc0004420e0,
 {0x1168a50, 0xc00033cdc0}, {0x18e7a90, 0x1, 0x1}, 0x30, 0x1904be0)

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/pardo.go:316
 +0x146
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*ParDo).processSingleWindow(0xc0004420e0,
 0xc00015d660)

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/pardo.go:166
 +0x4b
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*ParDo).processMainInput(0xc000348fc0,
 0x1168a50)

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/pardo.go:146
 +0x9c
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*ParDo).ProcessElement(0xc0004420e0,
 {0x114d8a0, 0xc000426300}, 0xc0001f2540, {0x0, 0x0, 0x0})

/usr/local/google/home/danoliveira/repos/beam/sdks/go/pkg/beam/core/runtime/exec/pardo.go:132
 +0x1a5
github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/exec.(*DataSource).Process(0xcec8c0,
 {0x1168a50, 0xc00033cd00})

/u

[jira] [Created] (BEAM-13419) Add Go integration test errors when forgetting ptest.Main/beam.Init

2021-12-08 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13419:
--

 Summary: Add Go integration test errors when forgetting 
ptest.Main/beam.Init
 Key: BEAM-13419
 URL: https://issues.apache.org/jira/browse/BEAM-13419
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


Currently when someone writes an integration test and forgets to put ptest.Main 
into TestMain (or their own code calling beam.Init), then the SDK harness runs 
the tests as unit tests and ends up passing them because ptest.Run and beam.Run 
seem to just instantly pass without a problem when beam.Init hasn't been called.

The end result is that SDK harnesses in this setup just instantly pass all the 
tests and then close without any error messages.

This code path should have an error added so that if beam.Init hasn't been run 
when ptest.Run executes, then it fails with an error.
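
A minimal sketch of the proposed guard, written against the standard library only. It does not reproduce the real ptest or beam packages: the initialized flag, Init, and Run below are hypothetical stand-ins for beam.Init and ptest.Run, and the error text is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// initialized is a hypothetical stand-in for whatever state the real
// initialization call records.
var initialized bool

// errNotInitialized is the error the guard returns. The wording is an
// assumption, not the SDK's actual message.
var errNotInitialized = errors.New(
	"pipeline run before initialization: was ptest.Main called from TestMain?")

// Init stands in for the initialization call.
func Init() { initialized = true }

// Run stands in for the test runner entry point. Instead of silently
// succeeding when initialization was skipped, it fails with an error.
func Run(pipeline func() error) error {
	if !initialized {
		return errNotInitialized
	}
	return pipeline()
}

func main() {
	noop := func() error { return nil }

	// Without Init, the run is rejected rather than passing instantly.
	fmt.Println("before Init:", Run(noop))

	Init()
	fmt.Println("after Init:", Run(noop))
}
```

Returning an error rather than panicking keeps the failure visible in the test report, which addresses the case where the harness closes without any message.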





[jira] [Created] (BEAM-13321) [Cross-Language] Externalize a minimal Implementation of Java's BigQuery IO

2021-11-24 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13321:
--

 Summary: [Cross-Language] Externalize a minimal Implementation of 
Java's BigQuery IO
 Key: BEAM-13321
 URL: https://issues.apache.org/jira/browse/BEAM-13321
 Project: Beam
  Issue Type: New Feature
  Components: cross-language, io-java-gcp
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


This is described in detail in this design doc: 
[https://s.apache.org/beam-bigquery-externalization]

The short version of this task is to have a minimum viable implementation of 
BigQuery IO available for cross-language usage via SchemaIO.





[jira] [Updated] (BEAM-12862) Implement Go SDK-side initialization of Java/Python expansion services.

2021-11-24 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-12862:
---
Status: Open  (was: Triage Needed)

> Implement Go SDK-side initialization of Java/Python expansion services.
> ---
>
> Key: BEAM-12862
> URL: https://issues.apache.org/jira/browse/BEAM-12862
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Priority: P2
>
> This feature allows users to run cross-language transforms without manually 
> starting an expansion service beforehand. If no expansion service is running, 
> the SDK will default to starting up a predetermined expansion service.
> This behavior already exists in Java and Python, which might be a useful 
> reference point.
> Note: It may be preferable to implement this after cross-language override 
> registration is implemented. That feature will allow registering alternate 
> behavior for expanding cross-language transforms, which is a good place to 
> slot in a default behavior for when no expansion address is provided.
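
The fallback described in the quoted issue can be sketched as follows. This is 
an illustration under assumptions, not the Go SDK's implementation: the 
resolver type, its startDefault hook, and the address method are hypothetical 
names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// resolver picks the expansion service address for a cross-language transform.
// startDefault is a hypothetical hook that launches a predetermined expansion
// service and returns the address it listens on.
type resolver struct {
	startDefault func() (string, error)
}

// address returns userAddr when one was provided. Otherwise it falls back to
// starting the default expansion service, if a hook was registered.
func (r resolver) address(userAddr string) (string, error) {
	if userAddr != "" {
		return userAddr, nil
	}
	if r.startDefault == nil {
		return "", errors.New("no expansion address provided and no default expansion service registered")
	}
	return r.startDefault()
}

func main() {
	r := resolver{startDefault: func() (string, error) {
		// A real hook would start a service process here; this one only
		// returns a placeholder address.
		return "localhost:8097", nil
	}}
	fmt.Println(r.address("example-host:1234")) // explicit address is used as-is
	fmt.Println(r.address(""))                  // empty address triggers the default
}
```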





[jira] [Assigned] (BEAM-12862) Implement Go SDK-side initialization of Java/Python expansion services.

2021-11-24 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-12862:
--

Assignee: Jack McCluskey

> Implement Go SDK-side initialization of Java/Python expansion services.
> ---
>
> Key: BEAM-12862
> URL: https://issues.apache.org/jira/browse/BEAM-12862
> Project: Beam
>  Issue Type: New Feature
>  Components: cross-language, sdk-go
>Reporter: Daniel Oliveira
>Assignee: Jack McCluskey
>Priority: P2
>
> This feature allows users to run cross-language transforms without manually 
> starting an expansion service beforehand. If no expansion service is running, 
> the SDK will default to starting up a predetermined expansion service.
> This behavior already exists in Java and Python, which might be a useful 
> reference point.
> Note: It may be preferable to implement this after cross-language override 
> registration is implemented. That feature will allow registering alternate 
> behavior for expanding cross-language transforms, which is a good place to 
> slot in a default behavior for when no expansion address is provided.





[jira] [Created] (BEAM-13215) Portable OSS runners do not support GCP credentials for GCP IOs.

2021-11-09 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-13215:
--

 Summary: Portable OSS runners do not support GCP credentials for 
GCP IOs.
 Key: BEAM-13215
 URL: https://issues.apache.org/jira/browse/BEAM-13215
 Project: Beam
  Issue Type: Bug
  Components: io-go-gcp, io-java-gcp, io-py-gcp, java-fn-execution
Reporter: Daniel Oliveira


When a pipeline that uses a GCP IO runs on a portable runner with Docker as the 
SDK Harness environment, the SDK Harness does not have the user's GCP 
credentials available, and the pipeline fails. There are apparently 
[pipeline options for setting 
credentials|https://github.com/apache/beam/blob/v2.33.0/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java#L170],
 but as far as I can tell they are meant either only for non-portable 
pipelines or only for the Dataflow runner.

The tricky part of implementing this is that credentials for GCP are not 
straightforward, and having them available for something like the Application 
Default Credentials API involves copying over multiple files or environment 
variables. The following article provides a lot of context for the difficulties 
involved: 
[https://medium.com/datamindedbe/application-default-credentials-477879e31cb5]

Possible solutions (note that these are mostly untested):
 # Perform some volume-mounting when calling the "docker run" command to mount 
directories containing credentials. Preferably this can be set via some sort of 
pipeline option. (This could potentially also be used to provide directories 
for docker containers to write output files to with TextIO or FileIO.) See the 
article above for an example.
 ** This solution may not work with runners on remote endpoints though. The 
directory mounted must be on the same machine as the docker container to work 
properly, which may not be possible in some cases with remote runners.
 # Require custom containers with appropriate credentials provided. This is 
more robust than the solution above, but less user-friendly, and would require 
a good amount of documentation to be available.
 ** This could be possible in conjunction with the solution above, and might be 
a good way of supporting GCP credentials on remote runners. Custom containers 
can store any valid credentials of the user's choice (for example, service 
account credentials for a production service) and then be run on any machine.
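
The volume-mounting idea in the first solution can be sketched as below. The 
function dockerRunArgs, the /creds mount point, and the example image name are 
assumptions made for illustration; only the "docker run" flags -v and -e and 
the GOOGLE_APPLICATION_CREDENTIALS environment variable are existing 
conventions. The sketch builds the argument list and does not invoke Docker.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// dockerRunArgs builds arguments for "docker run" that mount the directory
// holding hostCredFile read-only into the container and point
// GOOGLE_APPLICATION_CREDENTIALS at the mounted copy.
func dockerRunArgs(image, hostCredFile string) []string {
	const containerDir = "/creds" // hypothetical mount point inside the container
	containerFile := path.Join(containerDir, filepath.Base(hostCredFile))
	return []string{
		"run",
		"-v", filepath.Dir(hostCredFile) + ":" + containerDir + ":ro",
		"-e", "GOOGLE_APPLICATION_CREDENTIALS=" + containerFile,
		image,
	}
}

func main() {
	// Both the image name and the credentials path are placeholders.
	args := dockerRunArgs("example/sdk-harness:latest", "/home/user/creds/key.json")
	fmt.Println("docker " + strings.Join(args, " "))
}
```

As the issue notes, this only helps when the credentials directory is on the 
same machine as the container, so it does not cover remote runners.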





[jira] [Commented] (BEAM-13037) Update Golang version on Jenkins worker images.

2021-10-19 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430737#comment-17430737
 ] 

Daniel Oliveira commented on BEAM-13037:


This seems to be some kind of PATH issue. When I SSH into the workers, 
/snap/bin/go is there and available and I can run the go tests manually from my 
instance just fine. But when the Jenkins workers try to use go, it's not 
available, and creating a symlink from /snap/bin/go to /usr/bin/go seems to fix 
the issue.

I suspect the Jenkins agents either have some hardcoded path in their 
configuration that doesn't include the snap/bin directory, or it's cached in 
some way and it's trying to find the go binary in /usr/bin/go after it moved. 
As a workaround for now, I'm creating a symlink on all the VMs, but we should 
try to fix the root cause. 
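
One way to check the PATH hypothesis above is to run a small program under the 
Jenkins agent's environment and see whether /snap/bin is among its PATH 
entries. This is a diagnostic sketch only; the helper dirOnPath is a 
hypothetical name, and nothing here reflects how the agents are configured.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dirOnPath reports whether dir is one of the entries in the PATH-style
// list pathEnv, ignoring trailing slashes and empty entries.
func dirOnPath(pathEnv, dir string) bool {
	want := filepath.Clean(dir)
	for _, entry := range filepath.SplitList(pathEnv) {
		if entry != "" && filepath.Clean(entry) == want {
			return true
		}
	}
	return false
}

func main() {
	// Prints whether the current process would search /snap/bin for binaries.
	fmt.Println(dirOnPath(os.Getenv("PATH"), "/snap/bin"))
}
```

If this prints false under the agent but true in an SSH session, the agent's 
environment is the place to fix, and the symlink is no longer needed.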

> Update Golang version on Jenkins worker images.
> ---
>
> Key: BEAM-13037
> URL: https://issues.apache.org/jira/browse/BEAM-13037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-go, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: Not applicable
>
>
> Update the version of the `go` command on our Jenkins VMs from 1.12.X to 
> 1.16.X, to match as closely as possible the current version specified for Go 
> in our BeamModulePlugin (1.16.5).
> Follows the instructions on Confluence here: 
> https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips#JenkinsTips-HowtoinstallandupgradesoftwareonJenkinsworkers



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-13037) Update Golang version on Jenkins worker images.

2021-10-18 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-13037:
---
Fix Version/s: Not applicable
   Resolution: Fixed
   Status: Resolved  (was: Open)

> Update Golang version on Jenkins worker images.
> ---
>
> Key: BEAM-13037
> URL: https://issues.apache.org/jira/browse/BEAM-13037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-go, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
> Fix For: Not applicable
>
>
> Update the version of the `go` command on our Jenkins VMs from 1.12.X to 
> 1.16.X, to match as closely as possible the current version specified for Go 
> in our BeamModulePlugin (1.16.5).
> Follows the instructions on Confluence here: 
> https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips#JenkinsTips-HowtoinstallandupgradesoftwareonJenkinsworkers




