Re: [VOTE] Release 2.52.0, release candidate #5

2023-11-14 Thread Bruno Volpato via dev
+1 (non-binding). Tested with https://github.com/GoogleCloudPlatform/DataflowTemplates (Java SDK 11, Dataflow runner). Thanks Danny! On Mon, Nov 13, 2023 at 6:07 PM Danny McCormick via dev wrote: > Hi everyone, > Please review and vote on the release candidate #5 for the version 2.52.0, > as

Re: Upgrading Avro dependencies

2023-11-14 Thread Alexey Romanenko
Thanks! Please, let me know if you need any help on this. — Alexey > On 14 Nov 2023, at 17:52, John Casey wrote: > > The vulnerability said to upgrade to 1.11.3, so I think that would be my > starting point. > > > On Mon, Nov 13, 2023 at 12:23 PM Alexey Romanenko

Re: Hiding logging for beam playground examples

2023-11-14 Thread Robert Bradshaw via dev
+1 to at least setting the log level to higher than info. Some runner logging (e.g. job started/done) may be useful. On Tue, Nov 14, 2023 at 9:37 AM Joey Tran wrote: > > Hi all, > > I just had a workshop to demo beam for people at my company and there was a > bit of confusion about whether the
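
For illustration, raising the log level above INFO can be sketched with Python's standard logging module. This is a minimal sketch, not the playground's actual configuration: the helper name quiet_runner_logs is hypothetical, and it assumes the runner logs through the root logger.

```python
import logging


def quiet_runner_logs(level=logging.WARNING):
    """Raise the root logger level so INFO messages are dropped.

    Hypothetical helper: assumes the noisy runner output goes through
    Python's root logger, which may not hold for every runner.
    """
    root = logging.getLogger()
    root.setLevel(level)
    return root.getEffectiveLevel()


quiet_runner_logs()
logging.info("job progress detail")      # suppressed at WARNING level
logging.warning("something went wrong")  # still emitted
```

Choosing WARNING hides routine INFO output; any job started/done messages logged at INFO would be hidden too, so a real setup might filter per logger instead.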

Hiding logging for beam playground examples

2023-11-14 Thread Joey Tran
Hi all, I just had a workshop to demo beam for people at my company and there was a bit of confusion about whether the beam python playground examples were even working and it turned out they just got confused by all the runner logging that is output. Is this worth keeping? It seems like it'd be

Re: The Current State of Beam Python Type Hinting

2023-11-14 Thread Robert Bradshaw via dev
Thanks for writing this up! Added some comments to the doc itself. On Mon, Nov 13, 2023 at 11:01 PM Johanna Öjeling via dev < dev@beam.apache.org> wrote: > Thanks - well written! Interesting with the Any type, I learned something > new. Added a comment. > > Johanna > > On Mon, Nov 13, 2023 at

Re: Upgrading Avro dependencies

2023-11-14 Thread John Casey via dev
The vulnerability said to upgrade to 1.11.3, so I think that would be my starting point. On Mon, Nov 13, 2023 at 12:23 PM Alexey Romanenko wrote: > > > On 10 Nov 2023, at 19:23, John Casey wrote: > > I guess I'm a bit confused as to why specifically generateTestAvroJava > seems to use the

Beam High Priority Issue Report (47)

2023-11-14 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29413 [Bug]: Can not use

Re: The Current State of Beam Python Type Hinting

2023-11-13 Thread Johanna Öjeling via dev
Thanks - well written! Interesting with the Any type, I learned something new. Added a comment. Johanna On Mon, Nov 13, 2023 at 6:02 PM Jack McCluskey via dev wrote: > Hey everyone, > > I put together a small doc explaining how Beam Python type hinting works + > where the module needs to go in

[VOTE] Release 2.52.0, release candidate #5

2023-11-13 Thread Danny McCormick via dev
Hi everyone, Please review and vote on the release candidate #5 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

[RFC] Enrichment Transform Design

2023-11-13 Thread Ritesh Ghorse via dev
Hey everyone, I've written a doc for a new Enrichment Transform in Beam. I will note that this transform will be built on top of RequestResponseIO (Implementation

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-13 Thread via GitHub
damccorm opened a new pull request, #655: URL: https://github.com/apache/beam-site/pull/655 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC5. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Publish docs for 2.52.0 release [beam-site]

2023-11-13 Thread via GitHub
damccorm closed pull request #654: Publish docs for 2.52.0 release URL: https://github.com/apache/beam-site/pull/654 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: Upgrading Avro dependencies

2023-11-13 Thread Alexey Romanenko
> On 10 Nov 2023, at 19:23, John Casey wrote: > > I guess I'm a bit confused as to why specifically generateTestAvroJava seems > to use the wrong version. I see our version specific generated code, but this > action appears to be inherited from the plugin, and is configured with > whichever

Re: Adding Dead Letter Queues to Beam IOs

2023-11-13 Thread Alexey Romanenko
Thanks a lot for working on this long-awaited and much-requested user feature. I’ll try to take a look at the design doc in the next few days. — Alexey > On 8 Nov 2023, at 21:43, John Casey via dev wrote: > > Hi All, > > I've written up a design for adding DLQs to existing Beam IOs. It's been >

The Current State of Beam Python Type Hinting

2023-11-13 Thread Jack McCluskey via dev
Hey everyone, I put together a small doc explaining how Beam Python type hinting works + where the module needs to go in the future with changes to Python itself. This is over at https://s.apache.org/beam-python-type-hinting-overview and I'll be putting it into a few places for discoverability as

Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-13 Thread Danny McCormick via dev
I agree that we should take this, I'll work on a new RC once that PR is in/cherry-picked On Mon, Nov 13, 2023 at 9:30 AM Bruno Volpato via dev wrote: > I'm having some problems validating the RCs proposed here (both 3 and 4). > User code that depends on versions newer than Avro 1.8.2 are having

Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-13 Thread Bruno Volpato via dev
I'm having some problems validating the RCs proposed here (both 3 and 4). User code that depends on versions newer than Avro 1.8.2 are having problems running on Dataflow. > Caused by: java.io.InvalidClassException: org.apache.avro.specific.SpecificRecordBase; local class incompatible: stream

Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-13 Thread Jan Lukavský
+1 (binding) Validated Java SDK with Flink runner on own use cases.  Jan On 11/12/23 00:44, Danny McCormick via dev wrote: Hi everyone, Please review and vote on the release candidate #3 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release

Beam High Priority Issue Report (46)

2023-11-13 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java

Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-12 Thread Danny McCormick via dev
Yep, that is correct - this should've read release candidate #4 - thanks for calling that out On Sun, Nov 12, 2023 at 1:13 PM Austin Bennett wrote: > Danny: > > The rc # differs between the subject [ #4 ] and the first sentence of your > email [ #3 ]. > > I think we can assume this is a VOTE to

Re: Adding Dead Letter Queues to Beam IOs

2023-11-12 Thread Austin Bennett
This will eventually be a great addition to aid usability -- thanks for starting to think through and address, John! On Fri, Nov 10, 2023, 10:54 AM Danny McCormick via dev wrote: > Thanks - the general ideas seem solid, I added some questions/comments as > well. > > On Fri, Nov 10, 2023 at

Re: [VOTE] Release 2.52.0, release candidate #4

2023-11-12 Thread Austin Bennett
Danny: The rc # differs between the subject [ #4 ] and the first sentence of your email [ #3 ]. I think we can assume this is a VOTE to match the subject and is for RC #4 , but wanted to call that out. Cheers, Austin On Sat, Nov 11, 2023, 3:44 PM Danny McCormick via dev wrote: > Hi

[VOTE] Release 2.52.0, release candidate #4

2023-11-11 Thread Danny McCormick via dev
Hi everyone, Please review and vote on the release candidate #3 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-11 Thread via GitHub
damccorm opened a new pull request, #654: URL: https://github.com/apache/beam-site/pull/654 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC4. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-11 Thread Danny McCormick via dev
I will go ahead and create an RC4 - IMO this vulnerability patch warrants a new RC. Thanks Valentyn! On Fri, Nov 10, 2023 at 9:11 PM Valentyn Tymofieiev via dev < dev@beam.apache.org> wrote: > As mentioned in another thread [1], there is a recently detected > vulnerability in pyarrow [2]. > >

Re: [PR] Publish docs for 2.52.0 release [beam-site]

2023-11-11 Thread via GitHub
damccorm closed pull request #653: Publish docs for 2.52.0 release URL: https://github.com/apache/beam-site/pull/653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Valentyn Tymofieiev via dev
As mentioned in another thread [1], there is a recently detected vulnerability in pyarrow [2]. It appears to be a concern for Beam users that we can mitigate in the upcoming release. We can reassess early next week in case there is a revised assessment for severity for this vulnerability. In the

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Chamikara Jayalath via dev
+1 (binding). Tested multi-lang Java/Python jobs. Thanks, Cham On Fri, Nov 10, 2023, 12:28 PM Svetak Sundhar via dev wrote: > +1 Non Binding -- tested Python SDK batch. > > > Svetak Sundhar > > Data Engineer > s vetaksund...@google.com > > > > On Fri, Nov 10, 2023 at 2:58 PM Danny McCormick

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Svetak Sundhar via dev
+1 Non Binding -- tested Python SDK batch. Svetak Sundhar Data Engineer s vetaksund...@google.com On Fri, Nov 10, 2023 at 2:58 PM Danny McCormick via dev wrote: > > Note: the release guide >

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Danny McCormick via dev
> Note: the release guide and blog post say the RC image has a tag

Re: [Python SDK] PyArrow Critical Vulnerability

2023-11-10 Thread Valentyn Tymofieiev via dev
From https://pypi.org/project/pyarrow-hotfix/ : pyarrow_hotfix must be imported in your application or library code for it to take effect. Just installing the package is not sufficient: For Beam users, that means that the pipeline code running on the workers would need to import this module on
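
As a rough illustration of "installed is not the same as imported", the sketch below attempts the import at runtime and reports whether it succeeded. The helper name try_import_hotfix is hypothetical. Where such an import would have to live so that it runs on Beam workers is not settled by the message above, so treat the placement as an assumption.

```python
import importlib


def try_import_hotfix(module_name="pyarrow_hotfix"):
    """Return True if the named module could be imported, else False.

    Hypothetical helper: the module only takes effect once imported,
    so code running on a worker would need to perform this import itself.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Reports whether the hotfix module is importable in this environment.
hotfix_active = try_import_hotfix()
```

If the import fails, the package is missing from that environment, and installing it elsewhere (for example, only on the launcher machine) would not help.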

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Johanna Öjeling via dev
+1 (non-binding) Tested the Go SDK on Dataflow with own use cases. Note: the release guide and blog post

Re: Adding Dead Letter Queues to Beam IOs

2023-11-10 Thread Danny McCormick via dev
Thanks - the general ideas seem solid, I added some questions/comments as well. On Fri, Nov 10, 2023 at 1:32 PM Robert Bradshaw via dev wrote: > Thanks. I added some comments to the doc and open PR. > > On Wed, Nov 8, 2023 at 12:44 PM John Casey via dev > wrote: > > > > Hi All, > > > > I've

Re: Adding Dead Letter Queues to Beam IOs

2023-11-10 Thread Robert Bradshaw via dev
Thanks. I added some comments to the doc and open PR. On Wed, Nov 8, 2023 at 12:44 PM John Casey via dev wrote: > > Hi All, > > I've written up a design for adding DLQs to existing Beam IOs. It's been > through a round of reviews with some Dataflow folks at Google, but I'd > appreciate any

Re: Upgrading Avro dependencies

2023-11-10 Thread John Casey via dev
I guess I'm a bit confused as to why specifically generateTestAvroJava seems to use the wrong version. I see our version specific generated code, but this action appears to be inherited from the plugin, and is configured with whichever avro version is provided. Given that I tried to just change to

Re: [Python SDK] PyArrow Critical Vulnerability

2023-11-10 Thread Valentyn Tymofieiev via dev
Hi Piotr, thanks for bringing this to the list. There is a FR to support pyarrow https://github.com/apache/beam/issues/28410 . I looked into it briefly in https://github.com/apache/beam/pull/28437 but saw some test failures and it has been on back burner. Given the news about vulnerability it

Re: Upgrading Avro dependencies

2023-11-10 Thread Alexey Romanenko
Hi John, This old Avro version in Beam is a very long story. Briefly, since Avro was initially tightly integrated into the Java SDK “core” module, it was not possible to upgrade the Avro version without breaking changes for users (because of some incompatible Avro changes, as you have noticed

Upgrading Avro dependencies

2023-11-10 Thread John Casey via dev
Hi All, There was a CVE detected in Avro 1.8.2 (CVE-2023-39410), so I'm trying to upgrade to avro 1.11.3. Unfortunately, it seems that our auto-generated Avro test classes aren't being generated properly with this new version. I've updated our avro generation plugin as well, but for whatever

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Robert Bradshaw via dev
+1 (binding) Artifacts and signatures look good, validated one of the Python wheels in a fresh install. On Fri, Nov 10, 2023 at 7:23 AM Alexey Romanenko wrote: > > +1 (binding) > > Java SDK with Spark runner > > — > Alexey > > On 9 Nov 2023, at 16:44, Ritesh Ghorse via dev wrote: > > +1

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-10 Thread Alexey Romanenko
+1 (binding) Java SDK with Spark runner — Alexey > On 9 Nov 2023, at 16:44, Ritesh Ghorse via dev wrote: > > +1 (non-binding) > > Validated Python SDK quickstart batch and streaming. > > Thanks! > > On Thu, Nov 9, 2023 at 9:25 AM Jan Lukavský > wrote: >> +1

Beam High Priority Issue Report (46)

2023-11-10 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29386 [Failing Test]:

Re: [VOTE] Release 2.52.0, release candidate #2

2023-11-09 Thread Yi Hu via dev
+1 (non-binding) Tested on Java IO load tests ( https://github.com/bvolpato/DataflowTemplates/tree/56d18a31c1c95e58543d7a1656bd83d7e859b482/it) BigQueryIO, TextIO, BigtableIO, SpannerIO on Dataflow legacy runner and runner v2 While it was announced there will be an RC3, the RC2 validation for IO

Re: [External Sender] Re: [Question] Error handling for IO Write Functions

2023-11-09 Thread Robert Bradshaw via dev
+1 Specifically, p.run().waitUntilFinish() would throw an exception if there were errors during pipeline execution. On Wed, Nov 8, 2023 at 8:05 AM John Casey via dev wrote: > Yep, that's a common misunderstanding with Beam. > > The code that is actually executed in the try block is just for

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-09 Thread Ritesh Ghorse via dev
+1 (non-binding) Validated Python SDK quickstart batch and streaming. Thanks! On Thu, Nov 9, 2023 at 9:25 AM Jan Lukavský wrote: > +1 (binding) > > Validated Java SDK with Flink runner on own use cases. > Jan > > On 11/9/23 03:31, Danny McCormick via dev wrote: > > Hi everyone, > Please

Re: [VOTE] Release 2.52.0, release candidate #3

2023-11-09 Thread Jan Lukavský
+1 (binding) Validated Java SDK with Flink runner on own use cases.  Jan On 11/9/23 03:31, Danny McCormick via dev wrote: Hi everyone, Please review and vote on the release candidate #3 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release

Beam High Priority Issue Report (45)

2023-11-09 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java

[VOTE] Release 2.52.0, release candidate #3

2023-11-08 Thread Danny McCormick via dev
Hi everyone, Please review and vote on the release candidate #3 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

Adding Dead Letter Queues to Beam IOs

2023-11-08 Thread John Casey via dev
Hi All, I've written up a design for adding DLQs to existing Beam IOs. It's been through a round of reviews with some Dataflow folks at Google, but I'd appreciate any comments the rest of Beam have around how to refine the design. TL;DR: Make it easy for a user to configure IOs to route bad data

Re: [External Sender] Re: [Question] Error handling for IO Write Functions

2023-11-08 Thread John Casey via dev
Yep, that's a common misunderstanding with Beam. The code that is actually executed in the try block is just for pipeline construction, and no data is processed at this point in time. Once the pipeline is constructed, the various ParDos are serialized and sent to the runners, where they are

Re: [External Sender] Re: [Question] Error handling for IO Write Functions

2023-11-08 Thread Ramya Prasad via dev
Hey John, Yes that's how my code is set up, I have the FileIO.write() in its own try-catch block. I took a second look at where exactly the code is failing, and it's actually in a ParDo function which is happening before I call FileIO.write(). But even within that, I've tried adding a try-catch

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-08 Thread via GitHub
damccorm opened a new pull request, #653: URL: https://github.com/apache/beam-site/pull/653 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC3. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Publish docs for 2.52.0 release [beam-site]

2023-11-08 Thread via GitHub
damccorm closed pull request #652: Publish docs for 2.52.0 release URL: https://github.com/apache/beam-site/pull/652 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [VOTE] Release 2.52.0, release candidate #2

2023-11-08 Thread Danny McCormick via dev
Hey everyone, @Ritesh Ghorse pointed out to me that the docker containers were not pushed for RC2, just for RC1. On closer inspection, I've realized that I accidentally built the RC from the RC1 tag (https://github.com/apache/beam/tree/v2.52.0-RC1) instead of the RC2 tag (

Re: [Question] Error handling for IO Write Functions

2023-11-08 Thread John Casey via dev
There are 2 execution times when using Beam. The first execution is local, when a pipeline is constructed, and the second is remote on the runner, processing data. Based on what you said, it sounds like you are wrapping pipeline construction in a try-catch, and constructing FileIO isn't failing.
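
The two execution times can be illustrated with a toy model. This is a sketch only: ToyPipeline and parse_int are hypothetical stand-ins, not Beam APIs, and everything here runs locally. It shows why a try block around construction catches nothing, while the same error surfaces when the pipeline actually runs.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline: apply() records, run() executes."""

    def __init__(self):
        self._steps = []

    def apply(self, fn):
        # Construction time: the function is stored, not called.
        self._steps.append(fn)
        return self

    def run(self, elements):
        # Execution time: every stored step is applied to every element.
        results = []
        for element in elements:
            value = element
            for step in self._steps:
                value = step(value)
            results.append(value)
        return results


def parse_int(text):
    return int(text)  # raises ValueError on bad input


pipeline = ToyPipeline()

construction_failed = False
try:
    pipeline.apply(parse_int)  # nothing is processed here
except ValueError:
    construction_failed = True

run_failed = False
try:
    pipeline.run(["1", "oops"])  # the bad element fails only now
except ValueError:
    run_failed = True
```

In this model, the place to catch a data error is either inside the step function itself or around the call that runs the pipeline, not around the code that builds it.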

Re: [VOTE] Release 2.52.0, release candidate #2

2023-11-08 Thread Svetak Sundhar via dev
Thanks, Danny! @all: Reminder that if there's anything you think that is worth documenting while RC testing, please feel free to add it here . We can then use it to update

Re: [VOTE] Release 2.52.0, release candidate #2

2023-11-08 Thread Jean-Baptiste Onofré
+1 (binding) Regards JB On Wed, Nov 8, 2023 at 12:24 AM Danny McCormick via dev wrote: > > Hi everyone, > Please review and vote on the release candidate #2 for the version 2.52.0, as > follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide specific

Beam High Priority Issue Report (45)

2023-11-08 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java

Re: [VOTE] Release 2.52.0, release candidate #2

2023-11-08 Thread Jan Lukavský
+1 (binding) Validated Java SDK with Flink runner on own use cases.  Jan On 11/8/23 00:24, Danny McCormick via dev wrote: Hi everyone, Please review and vote on the release candidate #2 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release

[VOTE] Release 2.52.0, release candidate #2

2023-11-07 Thread Danny McCormick via dev
Hi everyone, Please review and vote on the release candidate #2 for the version 2.52.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

Re: [Question] Error handling for IO Write Functions

2023-11-07 Thread Robert Bradshaw via dev
File write failures should be throwing exceptions that will terminate the pipeline on failure. (Generally a distributed runner will make multiple attempts before abandoning the entire pipeline of course.) Are you seeing files failing to be written but no exceptions being thrown? If so, this is

[Question] Error handling for IO Write Functions

2023-11-07 Thread Ramya Prasad via dev
Hello, I am a developer using Apache Beam in my Java application, and I need some help on how to handle exceptions when writing a file to S3. I have tried wrapping my code within a try-catch block, but no exception is being thrown within the try block. I'm assuming that FileIO doesn't throw any

Re: Disabling Jenkins Jobs

2023-11-07 Thread Alexey Romanenko
Danny, Yi, Thank you for taking care of this! — Alexey > On 7 Nov 2023, at 17:10, Yi Hu via dev wrote: > > Hi Alexey, > > > all Jenkins jobs are stuck and there is a big Build Queue on > > https://ci-beam.apache.org/ > > This is not intentional. This is likely due to INFRA's routine

Re: Disabling Jenkins Jobs

2023-11-07 Thread Yi Hu via dev
Hi Alexey, > all Jenkins jobs are stuck and there is a big Build Queue on https://ci-beam.apache.org/ This is not intentional. This is likely due to INFRA's routine Jenkins upgrade on Nov 5 and caused this outage. Have created

Re: Disabling Jenkins Jobs

2023-11-07 Thread Danny McCormick via dev
I don't think it's related. I noticed the problem half an hour ago; it seems there's an expired cert on the Jenkins machines. I'm hoping https://github.com/apache/beam/actions/runs/6786537134/job/18447281366 will fix this since the IO-Datastores cert is the problematic piece I think (and that has

Re: Disabling Jenkins Jobs

2023-11-07 Thread Alexey Romanenko
Not sure if it’s related, but it seems that all Jenkins jobs are stuck and there is a big Build Queue on https://ci-beam.apache.org/ Random clicks on jobs show the “All nodes of label ‘beam’ are offline” message. Is it a known problem? — Alexey > On 24 Oct 2023, at 21:50, Yi Hu via dev

Beam High Priority Issue Report (47)

2023-11-07 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29214 [Failing Test]:

Re: Lakehouse Formats with IO/Integration --> Hudi? Iceberg?

2023-11-07 Thread Ismaël Mejía
For Iceberg, there has been a long-open issue and some WIP for a sink: https://github.com/apache/beam/issues/20327 On Tue, Nov 7, 2023 at 2:08 AM Austin Bennett wrote: > Beam Devs, > > I was looking through GH Issue and online more generally and hadn't seen > much... Has anyone written

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-06 Thread via GitHub
damccorm opened a new pull request, #652: URL: https://github.com/apache/beam-site/pull/652 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Lakehouse Formats with IO/Integration --> Hudi? Iceberg?

2023-11-06 Thread Austin Bennett
Beam Devs, I was looking through GH Issue and online more generally and hadn't seen much... Has anyone written any Beam IO or other integration for writing to [ or reading from ] either Hudi or Iceberg? Any experience that can be shared [ on list, else feel free to message me off list and I'll

Re: Embeddings generation in MLTransform

2023-11-06 Thread Anand Inguva via dev
Hi all, After the initial email, I went ahead and added a few more things as per the comments on the doc . Please take a look and let me know what you think. Thanks, Anand On

Re: [PR] Publish docs for 2.52.0 release [beam-site]

2023-11-06 Thread via GitHub
damccorm closed pull request #651: Publish docs for 2.52.0 release URL: https://github.com/apache/beam-site/pull/651 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: Beam 2.52.0 Release

2023-11-06 Thread Danny McCormick via dev
Update: all issues mentioned in my last email are resolved and I was able to get all artifacts for RC published except for the PyPi artifacts (which I was about to do). However, during my validation before sending out the voting email I discovered a breaking change in the Datastore IO caused by

Beam High Priority Issue Report (49)

2023-11-06 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29214 [Failing Test]:

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-05 Thread via GitHub
damccorm opened a new pull request, #651: URL: https://github.com/apache/beam-site/pull/651 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [PR] Publish docs for 2.52.0 release [beam-site]

2023-11-05 Thread via GitHub
damccorm closed pull request #650: Publish docs for 2.52.0 release URL: https://github.com/apache/beam-site/pull/650 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: Beam 2.52.0 Release

2023-11-03 Thread Danny McCormick via dev
Update before the weekend - I likely won't have an RC out until early next week. I'm currently running down the following issues with RC creation: 1) Publishing java artifacts is failing because of an illegal implicit gradle dependency [1]. I have a PR [2] and cherrypick PR [3] prepared which

[PR] Publish docs for 2.52.0 release [beam-site]

2023-11-03 Thread via GitHub
damccorm opened a new pull request, #650: URL: https://github.com/apache/beam-site/pull/650 Content generated from https://github.com/apache/beam/tree/v2.52.0-RC1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Beam High Priority Issue Report (49)

2023-11-03 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29214 [Failing Test]:

Beam High Priority Issue Report (48)

2023-11-02 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29214 [Failing Test]:

[LAZY CONSENSUS] Deprecate Euphoria extension

2023-11-02 Thread Jan Lukavský
Hi, according to discussion [1], because no objections were raised and the overall usage (artifact download stats) is negligible compared to other Beam artifacts, I'll proceed with deprecating the Euphoria extension, unless there are any objections within 72 hours (excluding weekend). Best,

Re: Credentials Rotation Failure on Metrics cluster (2023-11-01)

2023-11-01 Thread Danny McCormick via dev
Yep, it is safe to ignore On Wed, Nov 1, 2023 at 2:43 PM Kenneth Knowles wrote: > +Danny McCormick is this the converse of the > other failure? (I didn't click through I just read the other thread) > > On Tue, Oct 31, 2023 at 10:10 PM gacti...@beam.apache.org < > beamacti...@gmail.com> wrote:

Re: Credentials Rotation Failure on Metrics cluster (2023-11-01)

2023-11-01 Thread Kenneth Knowles via dev
+Danny McCormick is this the converse of the other failure? (I didn't click through I just read the other thread) On Tue, Oct 31, 2023 at 10:10 PM gacti...@beam.apache.org < beamacti...@gmail.com> wrote: > Something went wrong during the automatic credentials rotation for Metrics > Cluster,

Re: Beam 2.52.0 Release

2023-11-01 Thread Danny McCormick via dev
I just cut the 2.52.0 release branch . All commits after https://github.com/apache/beam/commit/ea8596f2df0e3e4b9da9f215ae6745c2ddfb6612 will be targeted for a later release. There is currently still one open release blocker

Re: Credentials Rotation Failure on IO-Datastores cluster

2023-11-01 Thread Danny McCormick via dev
My guess is that this is due to running this both on GitHub Actions and Jenkins. The Actions run succeeded, so I don't think we need to worry about this - https://github.com/apache/beam/actions/runs/6714783844 It seems like for the metrics job the opposite happened - the Actions run failing

Beam High Priority Issue Report (48)

2023-11-01 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29214 [Failing Test]:

Re: Processing time watermarks in KinesisIO

2023-11-01 Thread Jan Lukavský
> That is a fair point, but I don't think we can guarantee that we have a timestamp embedded in the record. (Or is there some stable kafka metadata we could use here, I'm not that familiar with what kafka guarantees). We could require it to be opt-in given the caveats. Kafka (and Kinesis)

RE: Re: KafkaIO does not make use of Kafka Consumer Groups [kafka] [java] [io]

2023-11-01 Thread shaoj wu
Can't agree with Shahar Frank more On 2023/04/19 18:17:15 Shahar Frank wrote: > Hi Daniel, > > I think I've already answered these in a previous email but let me answer > them again. > > I was specifically responding to quoted points from your last email. I >> really don't understand why you,

Re: Credentials Rotation Failure on IO-Datastores cluster

2023-10-31 Thread Svetak Sundhar via dev
I took a quick look -- the error is the following: *22:17:26* ERROR: (gcloud.container.clusters.update) ResponseError: code=400, message=Operation operation-1698804621818-e9c8fe33-d4a2-44cd-86aa-9c4e09dea259 is currently upgrading cluster io-datastores. Please wait and try again once it is done.

Credentials Rotation Failure on IO-Datastores cluster

2023-10-31 Thread Apache Jenkins Server
Something went wrong during the automatic credentials rotation for IO-Datastores Cluster, performed at Wed Nov 01 00:52:45 UTC 2023. It may be necessary to check the state of the cluster certificates. For further details refer to the following links: * Failing job:

Credentials Rotation Failure on Metrics cluster (2023-11-01)

2023-10-31 Thread gacti...@beam.apache.org
Something went wrong during the automatic credentials rotation for Metrics Cluster, performed at 2023-11-01. It may be necessary to check the state of the cluster certificates. For further details refer to the following links: * Failing job:

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Robert Bradshaw via dev
On Tue, Oct 31, 2023 at 10:28 AM Jan Lukavský wrote: > > On 10/31/23 17:44, Robert Bradshaw via dev wrote: > > There are really two cases that make sense: > > > > (1) We read the event timestamps from the kafka records themselves and > > have some external knowledge that guarantees (or at least

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Jan Lukavský
On 10/31/23 17:44, Robert Bradshaw via dev wrote: There are really two cases that make sense: (1) We read the event timestamps from the kafka records themselves and have some external knowledge that guarantees (or at least provides a very good heuristic) about what the timestamps of unread

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Robert Bradshaw via dev
There are really two cases that make sense: (1) We read the event timestamps from the kafka records themselves and have some external knowledge that guarantees (or at least provides a very good heuristic) about what the timestamps of unread messages could be in the future to set the watermark.
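Case (1) above can be illustrated with a minimal sketch, assuming the "external knowledge" is simply a fixed upper bound on how far out of order records may arrive. The class name `BoundedDelayWatermark` and the `max_delay_s` parameter are hypothetical and are not part of KafkaIO, KinesisIO, or any Beam API; the sketch is plain Python and only shows the arithmetic of the heuristic.

```python
from dataclasses import dataclass


@dataclass
class BoundedDelayWatermark:
    """Hypothetical helper (not a Beam API): the watermark trails the
    largest event timestamp seen so far by an assumed maximum delay."""

    max_delay_s: float                 # assumed bound on out-of-orderness
    max_seen_s: float = float("-inf")  # largest event timestamp observed

    def observe(self, event_timestamp_s: float) -> None:
        # Track the largest event timestamp read from the records.
        self.max_seen_s = max(self.max_seen_s, event_timestamp_s)

    def current(self) -> float:
        # Heuristic: no unread record is expected to be older than this.
        return self.max_seen_s - self.max_delay_s


wm = BoundedDelayWatermark(max_delay_s=5.0)
for ts in (100.0, 103.0, 101.0):
    wm.observe(ts)
print(wm.current())  # 98.0
```

Under this assumption, a record with an event timestamp below `wm.current()` would count as late. Whether such a bound can actually be guaranteed for a given topic or stream is the open question in the thread.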

Beam High Priority Issue Report (47)

2023-10-31 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java

Re: Processing time watermarks in KinesisIO

2023-10-31 Thread Jan Lukavský
I think that instead of deprecating and creating a new version, we could leverage the proposed update compatibility flag for this [1]. I still have some doubts if the processing-time watermarking (and event-time assignment) makes sense. Do we have a valid use-case for that? This is actually the

Re: [YAML] Aggregations

2023-10-30 Thread Kenneth Knowles
Automatically dereferencing, basically. It is nice. Especially for many-to-many relationships like the example. I don't know if the aggregation is any different though, is it? Kenn On Sun, Oct 29, 2023 at 1:12 PM Robert Burke wrote: > I came across Edge DB, and it has a novel syntax moving

Re: Streaming update compatibility

2023-10-30 Thread Kenneth Knowles
+1 million to this. I think this could be a real game-changer. I would even more forcefully say update compatibility has pushed our development style into the "never make significant changes" or "every significant change is wildly more complex than it should be". It forces our

Call for Presentations now open: Community over Code EU 2024

2023-10-30 Thread Ryan Skraba
(Note: You are receiving this because you are subscribed to the dev@ list for one or more projects of the Apache Software Foundation.) It's back *and* it's new! We're excited to announce that the first edition of Community over Code Europe (formerly known as ApacheCon EU) which will be held at

Embeddings generation in MLTransform

2023-10-30 Thread Anand Inguva via dev
Hi all, In Apache Beam 2.50.0 Python SDK, we added MLTransform, which is used to pre/post process data using common ML operations. Now, we are planning to

Beam High Priority Issue Report (46)

2023-10-30 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java
