Re: [PROPOSAL] Preparing for 2.42.0 Release

2022-09-08 Thread Robert Burke
My recent travels appear to be preventing me from working at my best and
focusing on getting the release out this week.

I am going to take a day/weekend to rest so I can work better next week.
Fortunately, I've been rather thoroughly testing Covid negative since
my trip.

While I'm resting, I will still keep an eye on emails and cherry-pick
requests for 2.42.0 that come across my GitHub dashboard (@lostluck).
If you have fixes for any regressions, please do incorporate them into the
main branch, and send the cherry-pick PR to the release-2.42.0 branch for
my review.

Thank you for your patience with this delay. Release processes will
resume in earnest on Monday the 12th PT.

Your friendly neighbourhood release manager,
Robert Burke


On Wed, Sep 7, 2022 at 6:21 PM Robert Burke  wrote:

> A release branch has been cut!
>
> https://github.com/apache/beam/tree/release-2.42.0
>
> However, I don't have time to start further verification procedures. The
> plan is to get them going tomorrow morning PT, and proceed to a quick RC so
> that community verification can begin promptly. It's not anticipated that
> that will be the final RC.
>
> Either an RC will be published tomorrow, with its accompanying RC thread,
> or this thread will be updated by EoD.
>
> Thank you for your patience,
> Your friendly neighbourhood release manager
> Robert Burke
>
> On Wed, Sep 7, 2022 at 10:55 AM Robert Burke  wrote:
>
>> One issue remains open in the 2.42.0 milestone [1], blocking the cut.
>>
>> Since it's a BOM update I'm inclined to wait for it, rather than cause a
>> cherry pick to invalidate all prior testing.
>>
>> The PR [2] is making good progress in ensuring no linkage issues, so I
>> still anticipate that the 2.42.0 branch cut will happen by EoD today
>> Pacific Time.
>>
>> If that doesn't occur, I'll send another email to this thread with an
>> update.
>>
>> Thank you for your patience,
>> Your friendly neighbourhood release manager
>> Robert Burke
>>
>> [1] https://github.com/apache/beam/milestone/4
>> [2] https://github.com/apache/beam/pull/22996
>>
>> On Wed, Aug 24, 2022 at 3:05 PM Robert Burke  wrote:
>>
>>> Hi everyone!
>>>
>>> The next (2.42.0) release branch cut is scheduled for Sept 7th,
>>> according to
>>> the release calendar [1].
>>>
>>> I would like to volunteer myself to do this release. My plan is to cut
>>> the branch on that date, and cherrypick release-blocking fixes afterwards,
>>> if any.
>>>
>>> Please help me make sure the release goes smoothly by:
>>> - Making sure that any unresolved release-blocking issues for 2.42.0
>>> have their "Milestone" marked as "2.42.0 Release" as soon as possible.
>>> - Reviewing the current release blockers [2] and removing the Milestone if
>>> they don't meet the criteria at [3].
>>>
>>> Let me know if you have any comments/objections/questions.
>>>
>>> Due to travel, I will be unavailable to answer any emails from August
>>> 26th until Sept 5th, but cut-blocking concerns will be resolved before I
>>> make the cut, and I will delay it if necessary. This thread will be notified
>>> in that instance.
>>>
>>> Thanks,
>>> Robert Burke
>>>
>>> [1]
>>> https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com
>>> [2] https://github.com/apache/beam/milestone/4
>>> [3] https://beam.apache.org/contribute/release-blocking/
>>>
>>


Beam Dependency Check Report (2022-09-08)

2022-09-08 Thread Apache Jenkins Server


Re: [JmsIO] => Pull Request to fix message acknowledgement issue

2022-09-08 Thread Luke Cwik via dev
Could we have more than one active checkpoint per reader instance?
Yes. Readers are saved and reused across multiple bundles. They aren't
always closed at bundle boundaries.

Are we sure that all checkpoints are finalized when the reader is closed?
No, readers are closed after a certain period of inactivity. It is
likely that all checkpoints will have expired or been finalized by the time
the reader is closed, but it is not guaranteed. For example, in
multi-language pipelines the downstream processing in another language can
delay committing the output to the runner, which can lead to the readers
being closed due to inactivity and the checkpoint being finalized only
afterwards.

We could choose to hand off the session ownership to the JmsCheckpoint and
create a new one. This way finalizing the checkpoint would own closing the
session.
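
A minimal sketch of that hand-off, under stated assumptions: `SketchSession`,
`RecordingSession`, `SketchCheckpointMark` and `SketchReader` are hypothetical
names invented for illustration, not the JmsIO classes or the JMS API. The
session is modelled by a small stand-in interface so the example runs without
a JMS provider. It only shows the ownership rule: the mark acknowledges and
then closes the session it was handed, and the reader continues on a new one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a JMS session; not the javax.jms API.
interface SketchSession {
    void acknowledge(String messageId);
    void close();
}

// In-memory fake that records calls so the ordering can be inspected.
final class RecordingSession implements SketchSession {
    final List<String> events = new ArrayList<>();
    private boolean closed = false;

    @Override
    public void acknowledge(String messageId) {
        if (closed) {
            throw new IllegalStateException("acknowledge on closed session");
        }
        events.add("ack:" + messageId);
    }

    @Override
    public void close() {
        closed = true;
        events.add("close");
    }
}

// Hypothetical checkpoint mark that owns the session its messages came from.
final class SketchCheckpointMark {
    private final SketchSession session;
    private final List<String> messageIds;

    SketchCheckpointMark(SketchSession session, List<String> messageIds) {
        this.session = session;
        this.messageIds = new ArrayList<>(messageIds);
    }

    // Acknowledge first, then close: the mark is the only owner of this session.
    void finalizeCheckpoint() {
        try {
            for (String id : messageIds) {
                session.acknowledge(id);
            }
        } finally {
            session.close();
        }
    }
}

// Hypothetical reader that gives its session away at every checkpoint.
final class SketchReader {
    private final Supplier<SketchSession> sessionFactory;
    private SketchSession current;
    private List<String> pending = new ArrayList<>();

    SketchReader(Supplier<SketchSession> sessionFactory) {
        this.sessionFactory = sessionFactory;
        this.current = sessionFactory.get();
    }

    void onMessage(String messageId) {
        pending.add(messageId);
    }

    SketchCheckpointMark checkpoint() {
        SketchCheckpointMark mark = new SketchCheckpointMark(current, pending);
        current = sessionFactory.get(); // fresh session for the next messages
        pending = new ArrayList<>();
        return mark;
    }

    // Closes only the session the reader still owns; sessions already handed
    // to checkpoint marks stay open until those marks are finalized.
    void close() {
        current.close();
    }
}
```

With this shape, closing the reader before a late finalizeCheckpoint does not
affect the session that the mark needs for acknowledgement. What should happen
to sessions owned by marks that expire without being finalized is left open
here.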




On Thu, Sep 8, 2022 at 8:01 AM BALLADA Vincent 
wrote:

> Hello Luke,
>
> Thanks for your remarks.
>
> *Connection reuse*
>
> Concerning the use of a single connection for the entire process per
> connection factory, that would mean that we would have one JMS connection
> per worker, and there may be a downside to doing so:
>
> If the broker is hosted on a multi-node cluster infrastructure, and if
> we want to consume messages from all cluster nodes, we have to make sure
> that we have enough connections to be load balanced across all the nodes.
>
> If for some reason (autoscaling, low backlog size) we have only one
> worker, we may not consume from all the cluster nodes.
>
> As the number of connections is limited by the number of splits/readers,
> and as connections are opened/closed infrequently (when workers are killed
> or created, or readers are closed/started), I would suggest keeping the
> connection management as it is currently.
>
>
>
> *Session and consumer lifecycle*
>
> 1. Session unique per checkpoint
>
> Could we have more than one active checkpoint per reader instance?
>
> Should we close the session/consumer and create a new session/consumer at
> the end of finalizeCheckpoint? The goal here is to ensure that the message
> acknowledgement occurs before the session is closed.
>
> If advance and finalizeCheckpoint can be called concurrently, we need to
> make sure that the session is active in “advance” in order to receive
> messages.
>
> Are we sure that all checkpoints are finalized when the reader is closed?
>
> 2. Session scoped to the reader start/close
>
> It seems to be more or less the case currently.
>
>
>
> Regards
>
>
>
> Vincent BALLADA
>
>
>
>
>
> *From: *Luke Cwik via dev
> *Date: *Thursday, 1 September 2022 at 18:48
> *To: *dev
> *Subject: *Re: [JmsIO] => Pull Request to fix message acknowledgement issue
>
> I have a better understanding of the problem after reviewing the doc and
> we need to decide on what lifecycle scope we want the `Connection`,
> `Session`, and `MessageConsumer` to have.
>
> It looks like for the `Connection` we should try to have at most one
> instance for the entire process per connection factory.
> https://docs.oracle.com/javaee/7/api/javax/jms/Connection.html says that
> the connection should be re-used. Having fewer connections would likely be
> beneficial, unless you think there would be a performance limitation from
> using a single connection per process for all the messages?
>
> For the `Session` and `MessageConsumer`, I'm not sure what lifecycle scope
> they should have. Some ideas:
> 1. we could make it so that each `Session` is unique per checkpoint, e.g.
> we hand off the ownership of the `Session` to the JmsCheckpointMark
> every time we checkpoint and create a new `Session` for the next set of
> messages we receive. This would mean that we would also close the
> `MessageConsumer` at every checkpoint and create a new one.
> 2. we could make it so that the `Session` is scoped to the reader
> start/close and possibly multiple checkpoint marks and effectively close
> the `Session` once the reader is closed and all checkpoint marks are
> finalized/expired. We would close the `MessageConsumer` whenever the reader
> is closed.
> 3. we could make it so that the `Session` is scoped to the `Connection`
> and would only close it when the `Connection` closes.
>
> Option 1 seems pretty simple since the `Session` always has a single
> distinct owner. This seems like it would make the most sense if
> `Session` creation and management were cheap. Another positive is that once
> the `Session` closes, any messages that weren't acknowledged are returned
> to the queue and we will not have to wait for the reader to be closed
> or all the checkpoint marks to be finalized.
>
> What do you think?
>
>
>
> On Mon, Aug 29, 2022 at 10:06 PM Jean-Baptiste Onofré 
> wrote:
>
> Hi Vincent,
>
> thanks, I will take a look (as original JmsIO author ;)).
>
> Regards
> JB
>
> On Mon, Aug 29, 2022 at 6:43 PM BALLADA Vincent
>  wrote:
> >
> > Hi all,
> >
> >
> >
> > Here is a 

Re: [JmsIO] => Pull Request to fix message acknowledgement issue

2022-09-08 Thread BALLADA Vincent
Hello Luke,

Thanks for your remarks.

Connection reuse
Concerning the use of a single connection for the entire process per
connection factory, that would mean that we would have one JMS connection per
worker, and there may be a downside to doing so:
If the broker is hosted on a multi-node cluster infrastructure, and if we
want to consume messages from all cluster nodes, we have to make sure that we
have enough connections to be load balanced across all the nodes.
If for some reason (autoscaling, low backlog size) we have only one worker, we
may not consume from all the cluster nodes.
As the number of connections is limited by the number of splits/readers, and as
connections are opened/closed infrequently (when workers are killed or created,
or readers are closed/started), I would suggest keeping the connection
management as it is currently.

Session and consumer lifecycle

  1.  Session unique per checkpoint
Could we have more than one active checkpoint per reader instance?

Should we close the session/consumer and create a new session/consumer at the
end of finalizeCheckpoint? The goal here is to ensure that the message
acknowledgement occurs before the session is closed.
If advance and finalizeCheckpoint can be called concurrently, we need to make
sure that the session is active in “advance” in order to receive messages.
Are we sure that all checkpoints are finalized when the reader is closed?

  2.  Session scoped to the reader start/close
It seems to be more or less the case currently.
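
For comparison, the second idea (a session that outlives the reader until all
of its checkpoint marks are finalized or expired) could be sketched with a
count of outstanding marks. This is only an illustration under assumptions:
`SharedSessionSketch` and its method names are hypothetical, the session close
is modelled by a plain `Runnable`, and nothing here reflects how JmsIO
currently behaves.

```java
// Hypothetical holder for a session shared by one reader and its checkpoint marks.
final class SharedSessionSketch {
    private final Runnable closeSession; // stand-in for closing the real session
    private int outstandingMarks = 0;
    private boolean readerClosed = false;
    private boolean sessionClosed = false;

    SharedSessionSketch(Runnable closeSession) {
        this.closeSession = closeSession;
    }

    // Called when the reader produces a checkpoint mark that uses this session.
    synchronized void markCreated() {
        outstandingMarks++;
    }

    // Called when a mark has been finalized or has expired.
    synchronized void markFinalizedOrExpired() {
        outstandingMarks--;
        closeIfUnused();
    }

    // Called when the reader itself is closed.
    synchronized void readerClosed() {
        readerClosed = true;
        closeIfUnused();
    }

    synchronized boolean isSessionClosed() {
        return sessionClosed;
    }

    // The session is closed exactly once, and only when nobody needs it anymore.
    private void closeIfUnused() {
        if (readerClosed && outstandingMarks == 0 && !sessionClosed) {
            sessionClosed = true;
            closeSession.run();
        }
    }
}
```

The methods are synchronized because the reader and checkpoint finalization may
run on different threads. The cost of this approach is that unacknowledged
messages stay tied to the session until the last mark is finalized or expires.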

Regards

Vincent BALLADA


From: Luke Cwik via dev
Date: Thursday, 1 September 2022 at 18:48
To: dev
Subject: Re: [JmsIO] => Pull Request to fix message acknowledgement issue

I have a better understanding of the problem after reviewing the doc and we 
need to decide on what lifecycle scope we want the `Connection`, `Session`, and 
`MessageConsumer` to have.

It looks like for the `Connection` we should try to have at most one instance 
for the entire process per connection factory. 
https://docs.oracle.com/javaee/7/api/javax/jms/Connection.html says that the 
connection should be re-used. Having fewer connections would likely be
beneficial, unless you think there would be a performance limitation from using
a single connection per process for all the messages?

For the `Session` and `MessageConsumer`, I'm not sure what lifecycle scope they
should have. Some ideas:
1. we could make it so that each `Session` is unique per checkpoint, e.g. we
hand off the ownership of the `Session` to the JmsCheckpointMark every time we
checkpoint and create a new `Session` for the next set of messages we receive.
This would mean that we would also close the `MessageConsumer` at every
checkpoint and create a new one.
2. we could make it so that the `Session` is scoped to the reader start/close 
and possibly multiple checkpoint marks and effectively close the `Session` once 
the reader is closed and all checkpoint marks are finalized/expired. We would 
close the `MessageConsumer` whenever the reader is closed.
3. we could make it so that the `Session` is scoped to the `Connection` and 
would only close it when the `Connection` closes.

Option 1 seems pretty simple since the `Session` always has a single distinct
owner. This seems like it would make the most sense if `Session` creation and
management were cheap. Another positive is that once the `Session` closes, any
messages that weren't acknowledged are returned to the queue and we will not
have to wait for the reader to be closed or all the checkpoint marks to be
finalized.

What do you think?

On Mon, Aug 29, 2022 at 10:06 PM Jean-Baptiste Onofré
<j...@nanthrax.net> wrote:
Hi Vincent,

thanks, I will take a look (as original JmsIO author ;)).

Regards
JB

On Mon, Aug 29, 2022 at 6:43 PM BALLADA Vincent
<vincent.ball...@renault.com> wrote:
>
> Hi all,
>
>
>
> Here is a PR related to the following issue (Runner acknowledges messages on 
> closed session):
>
> https://github.com/apache/beam/issues/20814
>
>
>
> And here is a document explaining the fix:
>
> https://docs.google.com/document/d/19HiNPoJeIlzCFyWGdlw7WEFutceL2AmN_5Z0Vmi1s-I/edit?usp=sharing
>
>
>
> And finally the PR:
>
> https://github.com/apache/beam/pull/22932
>
>
>
> Regards
>
>
>
> Vincent BALLADA

Beam High Priority Issue Report (68)

2022-09-08 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need 
attention.

See https://beam.apache.org/contribute/issue-priorities for the meaning and 
expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/23020 [Bug]: Kubernetes service not 
cleaned up and reaching quota causing multiple performance test failure
https://github.com/apache/beam/issues/22913 [Bug]: 
beam_PostCommit_Java_ValidatesRunner_Flink is failing
https://github.com/apache/beam/issues/22749 [Bug]: Bytebuddy version update 
causes Invisible parameter type error
https://github.com/apache/beam/issues/22743 [Bug]: Test flake: 
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImplTest.testInsertWithinRowCountLimits
https://github.com/apache/beam/issues/22440 [Bug]: Python Batch Dataflow 
SideInput LoadTests failing
https://github.com/apache/beam/issues/22321 
PortableRunnerTestWithExternalEnv.test_pardo_large_input is regularly failing 
on jenkins
https://github.com/apache/beam/issues/22303 [Task]: Add tests to Kafka SDF and 
fix known and discovered issues
https://github.com/apache/beam/issues/22299 [Bug]: JDBCIO Write freeze at 
getConnection() in WriteFn
https://github.com/apache/beam/issues/22283 [Bug]: Python Lots of fn runner 
test items cost exactly 5 seconds to run
https://github.com/apache/beam/issues/21794 Dataflow runner creates a new timer 
whenever the output timestamp is change
https://github.com/apache/beam/issues/21713 404s in BigQueryIO don't get output 
to Failed Inserts PCollection
https://github.com/apache/beam/issues/21704 beam_PostCommit_Java_DataflowV2 
failures parent bug
https://github.com/apache/beam/issues/21703 pubsublite.ReadWriteIT failing in 
beam_PostCommit_Java_DataflowV1 and V2
https://github.com/apache/beam/issues/21702 SpannerWriteIT failing in beam 
PostCommit Java V1
https://github.com/apache/beam/issues/21701 beam_PostCommit_Java_DataflowV1 
failing with a variety of flakes and errors
https://github.com/apache/beam/issues/21700 
--dataflowServiceOptions=use_runner_v2 is broken
https://github.com/apache/beam/issues/21696 Flink Tests failure :  
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.beam.runners.core.construction.SerializablePipelineOptions 
https://github.com/apache/beam/issues/21695 DataflowPipelineResult does not 
raise exception for unsuccessful states.
https://github.com/apache/beam/issues/21694 BigQuery Storage API insert with 
writeResult retry and write to error table
https://github.com/apache/beam/issues/21480 flake: 
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored
https://github.com/apache/beam/issues/21472 Dataflow streaming tests failing 
new AfterSynchronizedProcessingTime test
https://github.com/apache/beam/issues/21471 Flakes: Failed to load cache entry
https://github.com/apache/beam/issues/21470 Test flake: test_split_half_sdf
https://github.com/apache/beam/issues/21469 beam_PostCommit_XVR_Flink flaky: 
Connection refused
https://github.com/apache/beam/issues/21468 
beam_PostCommit_Python_Examples_Dataflow failing
https://github.com/apache/beam/issues/21467 GBK and CoGBK streaming Java load 
tests failing
https://github.com/apache/beam/issues/21465 Kafka commit offset drop data on 
failure for runners that have non-checkpointing shuffle
https://github.com/apache/beam/issues/21463 NPE in Flink Portable 
ValidatesRunner streaming suite
https://github.com/apache/beam/issues/21462 Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use
https://github.com/apache/beam/issues/21271 pubsublite.ReadWriteIT flaky in 
beam_PostCommit_Java_DataflowV2  
https://github.com/apache/beam/issues/21270 
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
 flaky on Dataflow Runner V2
https://github.com/apache/beam/issues/21268 Race between member variable being 
accessed due to leaking uninitialized state via OutboundObserverFactory
https://github.com/apache/beam/issues/21267 WriteToBigQuery submits a duplicate 
BQ load job if a 503 error code is returned from googleapi
https://github.com/apache/beam/issues/21266 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
 is flaky in Java ValidatesRunner Flink suite.
https://github.com/apache/beam/issues/21265 
apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible
https://github.com/apache/beam/issues/21263 (Broken Pipe induced) Bricked 
Dataflow Pipeline 
https://github.com/apache/beam/issues/21262 Python AfterAny, AfterAll do not 
follow spec
https://github.com/apache/beam/issues/21261 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky
https://github.com/apache/beam/issues/21260 Python DirectRunner does