Re: [VOTE] Release 2.41.0, release candidate #2

2022-08-18 Thread Ahmet Altay via dev
+1 - I validated the Python quickstarts on the DirectRunner. Thank you Kiley!

On Thu, Aug 18, 2022 at 1:31 PM Kiley Sok via dev wrote:

> Hi everyone,
> Please review and vote on the release candidate #2 for the version 2.41.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> Reviewers are encouraged to test their own use cases with the release
> candidate, and vote +1 if no issues are found.
>
> The complete staging area is available for your review, which includes:
> * GitHub Release notes [1],
> * the official Apache source release to be deployed to dist.apache.org [2],
> which is signed with the key with fingerprint
> 4D5731CC0AA38097D091EB091E7B28884452AE5D [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "v2.41.0-RC2" [5],
> * website pull request listing the release [6], the blog post [6], and
> publishing the API reference manual [7].
> * Java artifacts were built with Gradle 7.4 and OpenJDK/Oracle JDK
> 1.8.0_312.
> * Python artifacts are deployed along with the source release to the
> dist.apache.org [2] and PyPI[8].
> * Validation sheet with a tab for 2.41.0 release to help with validation
> [9].
> * Docker images published to Docker Hub [10].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> For guidelines on how to try the release in your projects, check out our
> blog post at https://beam.apache.org/blog/validate-beam-release/.
>
> Thanks,
> Release Manager
>
> [1] https://github.com/apache/beam/milestone/3
> [2] https://dist.apache.org/repos/dist/dev/beam/2.41.0/
> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
> [4] https://repository.apache.org/content/repositories/orgapachebeam-1283/
> [5] https://github.com/apache/beam/tree/v2.41.0-RC2
> [6] https://github.com/apache/beam/pull/22706
> [7] https://github.com/apache/beam-site/pull/633
> [8] https://pypi.org/project/apache-beam/2.41.0rc2/
> [9]
> https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=331459080
> [10] https://hub.docker.com/search?q=apache%2Fbeam&type=image
>


[VOTE] Release 2.41.0, release candidate #2

2022-08-18 Thread Kiley Sok via dev
Hi everyone,
Please review and vote on the release candidate #2 for the version 2.41.0,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


Reviewers are encouraged to test their own use cases with the release
candidate, and vote +1 if no issues are found.

The complete staging area is available for your review, which includes:
* GitHub Release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which is signed with the key with fingerprint
4D5731CC0AA38097D091EB091E7B28884452AE5D [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "v2.41.0-RC2" [5],
* website pull request listing the release [6], the blog post [6], and
publishing the API reference manual [7].
* Java artifacts were built with Gradle 7.4 and OpenJDK/Oracle JDK
1.8.0_312.
* Python artifacts are deployed along with the source release to the
dist.apache.org [2] and PyPI[8].
* Validation sheet with a tab for 2.41.0 release to help with validation
[9].
* Docker images published to Docker Hub [10].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.
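
As a rough illustration of the adoption rule above, here is a minimal Python sketch of a tally check. The `Vote` type and `release_vote_passes` function are hypothetical names, not part of any Beam or Apache tooling, and the sketch assumes that only binding (PMC) votes count toward both the threshold of three and the majority.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    """One vote on the thread (hypothetical structure for illustration)."""
    voter: str
    value: int     # +1 to approve, -1 to reject
    binding: bool  # True when cast by a PMC member


def release_vote_passes(votes: Iterable[Vote], min_binding_approvals: int = 3) -> bool:
    """Check majority approval with at least `min_binding_approvals` PMC +1 votes.

    Assumption: non-binding votes are advisory and are ignored in the tally.
    """
    binding = [v for v in votes if v.binding]
    approvals = sum(1 for v in binding if v.value == 1)
    rejections = sum(1 for v in binding if v.value == -1)
    return approvals >= min_binding_approvals and approvals > rejections


# Made-up example: three binding approvals, one non-binding approval.
example_votes = [
    Vote("pmc-a", 1, True),
    Vote("pmc-b", 1, True),
    Vote("pmc-c", 1, True),
    Vote("contributor-d", 1, False),
]
example_result = release_vote_passes(example_votes)
```

The 72-hour minimum is not modelled here; it would be a separate check on the time elapsed since the vote email was sent.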

For guidelines on how to try the release in your projects, check out our
blog post at https://beam.apache.org/blog/validate-beam-release/.

Thanks,
Release Manager

[1] https://github.com/apache/beam/milestone/3
[2] https://dist.apache.org/repos/dist/dev/beam/2.41.0/
[3] https://dist.apache.org/repos/dist/release/beam/KEYS
[4] https://repository.apache.org/content/repositories/orgapachebeam-1283/
[5] https://github.com/apache/beam/tree/v2.41.0-RC2
[6] https://github.com/apache/beam/pull/22706
[7] https://github.com/apache/beam-site/pull/633
[8] https://pypi.org/project/apache-beam/2.41.0rc2/
[9]
https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=331459080
[10] https://hub.docker.com/search?q=apache%2Fbeam&type=image


Beam Dependency Check Report (2022-08-18)

2022-08-18 Thread Apache Jenkins Server


Re: [VOTE] Release 2.41.0, release candidate #1

2022-08-18 Thread Alexey Romanenko
Just for reference: it looks like this happens only with classes in the default
package [1].

[1] https://github.com/raphw/byte-buddy/issues/1301#issuecomment-1218475514


> On 18 Aug 2022, at 12:40, Alexey Romanenko wrote:
> 
> 
>> On 17 Aug 2022, at 22:52, Kenneth Knowles wrote:
>> 
>> Seems like there has been a lot of progress on 
>> https://github.com/raphw/byte-buddy/issues/1301. Since it has been 
>> identified, I think we can be pretty confident that the version downgrade is 
>> the necessary part. So we can revert the PR for the release, then on main 
>> branch we can proceed with unvendoring but keeping the same version.
> 
> +1 for this.
> 
>> BTW just noting for the thread that I also checked mvn dependency:tree of 
>> Talend/beam-samples and confirmed that only Beam depends on bytebuddy so it 
>> is not a dependency conflict.
> 
> Sorry, I forgot to mention that I'd checked this before I posted that issue 
> on dev@ (to make sure that it depends only on one version of byte_buddy).
> Thanks for highlighting this.
> 
>> I also googled the error message and found 
>> https://stackoverflow.com/questions/51650074/apache-beam-invisible-parameter-type-exception
>> and https://issues.apache.org/jira/browse/BEAM-5061, where using JDK 10 instead 
>> of 8 causes a similar symptom, but it does not seem related. It was never 
>> directly addressed.
> 
> To me, it does not look closely related, since it concerns a pretty old version 
> of byte_buddy, and in “beam-samples” we test builds with OpenJDK v8, v11, and 
> even v17, and they all fail in the same way.
> 
> —
> Alexey
> 
>> 
>> Kenn
>> 
>> On Wed, Aug 17, 2022 at 10:36 AM Kiley Sok wrote:
>> PR to revert the change for the release: 
>> https://github.com/apache/beam/pull/22759
>> 
>> I'll rebuild a new RC once the tests pass and the PR is merged.
>> 
>> On Wed, Aug 17, 2022 at 8:16 AM Liam Miller-Cushon wrote:
>> On Tue, Aug 16, 2022 at 2:42 PM Kiley Sok wrote:
>> Liam, are we okay to roll back this change for this release?
>> 
>> No concerns from me with rolling back to unblock the release.
>> 
>> It looks like this is a change between bytebuddy 1.12.3 and 1.12.4. I filed 
>> https://github.com/raphw/byte-buddy/issues/1301 to get help understanding 
>> what changed. It sounds like the change might be working as intended (WAI), 
>> but there's a suggested fix. I will prepare a PR for that as a follow-up.
> 



Re: Benchmark tests for the Beam RunInference API

2022-08-18 Thread Danny McCormick via dev
I left a few comments, but overall this sounds like a good plan to me -
thanks for the writeup!

On Tue, Aug 16, 2022 at 9:36 AM Anand Inguva via dev wrote:

> Hi,
>
> I created a doc [1] which outlines the plan for the RunInference API [2]
> benchmark/performance tests. I would appreciate feedback on the following:
>
>    - Models used for the benchmark tests.
>    - Metrics calculated as part of the benchmark tests.
>
>
> If you have any input or suggestions on additional metrics/models
> that would be helpful for the Beam ML community as part of the benchmark
> tests, please let us know.
>
> [1]
> https://docs.google.com/document/d/1xmh9D_904H-6X19Mi0-tDACwCCMvP4_MFA9QT0TOym8/edit#
> [2]
> https://github.com/apache/beam/blob/67cb87ecc2d01b88f8620ed6821bcf71376d9849/sdks/python/apache_beam/ml/inference/base.py#L269
>
>
> Thanks,
> Anand
>


Re: Representation of logical type beam:logical_type:datetime:v1

2022-08-18 Thread Yi Hu via dev
On Wed, Aug 17, 2022 at 5:14 PM Chamikara Jayalath wrote:

>
> I think this is fine (even though it would add a small perf hit to
> JdbcIO.Read). We also probably should make this conversion a utility method
> that can be used elsewhere when we need to encode datetime fields.
> We should also document that "beam:logical_type:datetime:v1" is not
> portable (till we fix the incompatibility).
>
>
+1 for the utility method and documentation.
If we were to change JDBC instead of making millis_instant compatible with
InstantCoder, this would only fix JDBC cross-language timestamps. I expect
this is still a problem for other IO connectors, which is why I would
like to take a generic approach. In general, inside each SDK we would like
to follow the language-specific convention of that SDK. I remember a
related discussion about the timestamp types,
https://github.com/apache/beam/pull/17380#discussion_r852422314, which
reached the conclusion that we follow the language convention for timestamp
values, e.g. use a milli-precision (long-backed) Instant in Java and a
micro-precision (float-backed) timestamp in Python.
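
To make the precision mismatch concrete, here is a small Python sketch of what happens when a microsecond-precision value passes through a millisecond-precision representation and back. It uses plain integer epoch values and hypothetical helper names rather than any Beam type or coder, so treat it as an illustration of the arithmetic only.

```python
from datetime import datetime, timedelta, timezone

MICROS_PER_MILLI = 1_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(dt: datetime) -> int:
    """Whole microseconds since the Unix epoch (exact integer arithmetic)."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def micros_to_millis(micros: int) -> int:
    """Drop sub-millisecond digits, as a milli-precision field would."""
    return micros // MICROS_PER_MILLI


def millis_to_micros(millis: int) -> int:
    """Widen a milli-precision value back to microseconds."""
    return millis * MICROS_PER_MILLI


# A timestamp carrying 123456 microseconds within the second.
original = to_epoch_micros(datetime(2022, 8, 18, 12, 0, 0, 123456, tzinfo=timezone.utc))
round_tripped = millis_to_micros(micros_to_millis(original))
lost_micros = original - round_tripped  # the sub-millisecond part that was dropped
```

Values that are already whole milliseconds survive the round trip unchanged, so the loss only shows up for data written with finer precision than the receiving side can hold.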

Best,
Yi


Re: [VOTE] Release 2.41.0, release candidate #1

2022-08-18 Thread Alexey Romanenko

> On 17 Aug 2022, at 22:52, Kenneth Knowles wrote:
> 
> Seems like there has been a lot of progress on 
> https://github.com/raphw/byte-buddy/issues/1301. Since it has been 
> identified, I think we can be pretty confident that the version downgrade is 
> the necessary part. So we can revert the PR for the release, then on main 
> branch we can proceed with unvendoring but keeping the same version.

+1 for this.

> BTW just noting for the thread that I also checked mvn dependency:tree of 
> Talend/beam-samples and confirmed that only Beam depends on bytebuddy so it 
> is not a dependency conflict.

Sorry, I forgot to mention that I'd checked this before I posted that issue on 
dev@ (to make sure that it depends only on one version of byte_buddy).
Thanks for highlighting this.
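
For anyone who wants to repeat the single-version check, here is a rough Python sketch that scans saved `mvn dependency:tree` output for the versions of one artifact. The function name and the sample text are made up for illustration, and the regular expression assumes the common five-field `group:artifact:type:version:scope` form (coordinates with a classifier would need extra handling).

```python
import re

# Matches tree lines such as
# "[INFO] |  +- net.bytebuddy:byte-buddy:jar:1.12.3:compile"
_DEP_LINE = re.compile(
    r"[+\\]- (?P<group>[^:\s]+):(?P<artifact>[^:\s]+):(?P<type>[^:\s]+)"
    r":(?P<version>[^:\s]+):(?P<scope>[^:\s]+)"
)


def versions_in_tree(tree_output: str, group: str, artifact: str) -> set:
    """Collect every version of group:artifact that appears in the tree text."""
    versions = set()
    for line in tree_output.splitlines():
        match = _DEP_LINE.search(line)
        if match and match["group"] == group and match["artifact"] == artifact:
            versions.add(match["version"])
    return versions


# Invented sample output, not taken from a real build.
SAMPLE_TREE = """\
[INFO] com.example:demo:jar:1.0
[INFO] +- org.example:some-sdk:jar:2.0.0:compile
[INFO] |  +- net.bytebuddy:byte-buddy:jar:1.12.3:compile
[INFO] \\- junit:junit:jar:4.13.2:test
"""

found = versions_in_tree(SAMPLE_TREE, "net.bytebuddy", "byte-buddy")
```

A result with exactly one element means a single version is on the classpath; more than one would point to a dependency conflict.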

> I also googled the error message and found 
> https://stackoverflow.com/questions/51650074/apache-beam-invisible-parameter-type-exception
> and https://issues.apache.org/jira/browse/BEAM-5061, where using JDK 10 instead 
> of 8 causes a similar symptom, but it does not seem related. It was never 
> directly addressed.

To me, it does not look closely related, since it concerns a pretty old version 
of byte_buddy, and in “beam-samples” we test builds with OpenJDK v8, v11, and 
even v17, and they all fail in the same way.

—
Alexey

> 
> Kenn
> 
> On Wed, Aug 17, 2022 at 10:36 AM Kiley Sok wrote:
> PR to revert the change for the release: 
> https://github.com/apache/beam/pull/22759
> 
> I'll rebuild a new RC once the tests pass and the PR is merged.
> 
> On Wed, Aug 17, 2022 at 8:16 AM Liam Miller-Cushon wrote:
> On Tue, Aug 16, 2022 at 2:42 PM Kiley Sok wrote:
> Liam, are we okay to roll back this change for this release?
> 
> No concerns from me with rolling back to unblock the release.
> 
> It looks like this is a change between bytebuddy 1.12.3 and 1.12.4. I filed 
> https://github.com/raphw/byte-buddy/issues/1301 to get help understanding 
> what changed. It sounds like the change might be working as intended (WAI), 
> but there's a suggested fix. I will prepare a PR for that as a follow-up.



Beam High Priority Issue Report (70)

2022-08-18 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/22749 [Bug]: Bytebuddy version update 
causes Invisible parameter type error
https://github.com/apache/beam/issues/22743 [Bug]: Test flake: 
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImplTest.testInsertWithinRowCountLimits
https://github.com/apache/beam/issues/22642 [Bug]: Dataflow fails to drain a 
job when using BigQuery (java sdk v.2.38)
https://github.com/apache/beam/issues/22440 [Bug]: Python Batch Dataflow 
SideInput LoadTests failing
https://github.com/apache/beam/issues/22321 
PortableRunnerTestWithExternalEnv.test_pardo_large_input is regularly failing 
on jenkins
https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka SDF and 
fix known and discovered issues
https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze at 
getConnection() in WriteFn
https://github.com/apache/beam/issues/22283 [Bug]: Python Lots of fn runner 
test items cost exactly 5 seconds to run
https://github.com/apache/beam/issues/21794 Dataflow runner creates a new timer 
whenever the output timestamp is change
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output 
to Failed Inserts PCollection
https://github.com/apache/beam/issues/21704 beam_PostCommit_Java_DataflowV2 
failures parent bug
https://github.com/apache/beam/issues/21703 pubsublite.ReadWriteIT failing in 
beam_PostCommit_Java_DataflowV1 and V2
https://github.com/apache/beam/issues/21702 SpannerWriteIT failing in beam 
PostCommit Java V1
https://github.com/apache/beam/issues/21701 beam_PostCommit_Java_DataflowV1 
failing with a variety of flakes and errors
https://github.com/apache/beam/issues/21700 
--dataflowServiceOptions=use_runner_v2 is broken
https://github.com/apache/beam/issues/21696 Flink Tests failure :  
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.runners.core.construction.SerializablePipelineOptions 
https://github.com/apache/beam/issues/21695 DataflowPipelineResult does not 
raise exception for unsuccessful states.
https://github.com/apache/beam/issues/21694 BigQuery Storage API insert with 
writeResult retry and write to error table
https://github.com/apache/beam/issues/21480 flake: 
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
https://github.com/apache/beam/issues/21472 Dataflow streaming tests failing 
new AfterSynchronizedProcessingTime test
https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache entry
https://github.com/apache/beam/issues/21470 Test flake: test_split_half_sdf
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: 
Connection refused
https://github.com/apache/beam/issues/21468 
beam_PostCommit_Python_Examples_Dataflow failing
https://github.com/apache/beam/issues/21467 GBK and CoGBK streaming Java load 
tests failing
https://github.com/apache/beam/issues/21465 Kafka commit offset drop data on 
failure for runners that have non-checkpointing shuffle
https://github.com/apache/beam/issues/21463 NPE in Flink Portable 
ValidatesRunner streaming suite
https://github.com/apache/beam/issues/21462 Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky in 
beam_PostCommit_Java_DataflowV2  
https://github.com/apache/beam/issues/21270 
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
 flaky on Dataflow Runner V2
https://github.com/apache/beam/issues/21268 Race between member variable being 
accessed due to leaking uninitialized state via OutboundObserverFactory
https://github.com/apache/beam/issues/21267 WriteToBigQuery submits a duplicate 
BQ load job if a 503 error code is returned from googleapi
https://github.com/apache/beam/issues/21266 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
 is flaky in Java ValidatesRunner Flink suite.
https://github.com/apache/beam/issues/21265 
apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
https://github.com/apache/beam/issues/21263 (Broken Pipe induced) Bricked 
Dataflow Pipeline 
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not 
follow spec
https://github.com/apache/beam/issues/21261 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit 
data at GC time
https://github.com/apache/beam/issues/21257 Either Create or DirectRunner fails 
to produce all elements