Re: [YAML] Aggregations

2023-10-18 Thread Robert Burke
MongoDB has its own concept of aggregation pipelines as well. https://www.mongodb.com/docs/manual/core/aggregation-pipeline/#std-label-aggregation-pipeline On Wed, Oct 18, 2023, 6:07 PM Robert Bradshaw via dev wrote: > On Wed, Oct 18, 2023 at 5:06 PM Byron Ellis wrote: > > > > Is it worth

Re: [YAML] Aggregations

2023-10-18 Thread Robert Bradshaw via dev
On Wed, Oct 18, 2023 at 5:06 PM Byron Ellis wrote: > > Is it worth taking a look at similar prior art in the space? +1. Pointers welcome. > The first one that comes to mind is Transform, but with the dbt labs > acquisition that spec is a lot harder to find. Rill is pretty similar though. Rill

Re: [YAML] Aggregations

2023-10-18 Thread Byron Ellis via dev
Is it worth taking a look at similar prior art in the space? The first one that comes to mind is Transform, but with the dbt labs acquisition that spec is a lot harder to find. Rill is pretty similar though. On Wed, Oct 18, 2023 at 1:12 PM Robert

Re: [NOTICE] Deprecation Avro classes in "core" and use "extensions/avro" instead for Java SDK

2023-10-18 Thread Byron Ellis via dev
Awesome! On Wed, Oct 18, 2023 at 1:14 PM Alexey Romanenko wrote: > Heads up! > > Finally, all Avro-related code and Avro dependency, that was deprecated > before (see a message above), has been removed from Beam Java SDK “core” > module [1]. We believe that it was a sufficient number of Beam

Re: [NOTICE] Deprecation Avro classes in "core" and use "extensions/avro" instead for Java SDK

2023-10-18 Thread Alexey Romanenko
Heads up! Finally, all Avro-related code and the Avro dependency, which were deprecated earlier (see the message above), have been removed from the Beam Java SDK “core” module [1]. We believe that a sufficient number of Beam releases (six!) have passed since this code was deprecated and users

[YAML] Aggregations

2023-10-18 Thread Robert Bradshaw via dev
Beam YAML has good support for IOs and mappings, but one key missing feature for even writing a WordCount is the ability to do aggregations [1]. While the traditional Beam primitive is GroupByKey (and CombineValues), we're eschewing KVs in the notion of more schema'd data (which has some

Re: [Discuss] Idea to increase RC voting participation

2023-10-18 Thread Robert Bradshaw via dev
+1 That's a great idea. They have incentive to make sure the issue was resolved for them, plus we get to ensure there were no other regressions. On Wed, Oct 18, 2023 at 11:30 AM Johanna Öjeling via dev < dev@beam.apache.org> wrote: > When I have contributed to Apache Airflow, they have tagged

Re: [Discuss] Idea to increase RC voting participation

2023-10-18 Thread Johanna Öjeling via dev
When I contributed to Apache Airflow, they tagged all the contributors concerned in a GitHub issue when the RC was available and asked us to validate it. Example: #29424. I found that to be an effective way to notify contributors of the RC and

Re: [PR] Publish docs for 2.51.0 release [beam-site]

2023-10-18 Thread via GitHub
kennknowles merged PR #649: URL: https://github.com/apache/beam-site/pull/649

[ANNOUNCE] Apache Beam 2.51.0 Released

2023-10-18 Thread Kenneth Knowles
The Apache Beam Team is pleased to announce the release of version 2.51.0. You can download the release here: https://beam.apache.org/get-started/downloads/ This release includes bug fixes, features, and improvements detailed on the Beam Blog: https://beam.apache.org/blog/beam-2.51.0/ and the

Beam 2.52.0 Release

2023-10-18 Thread Danny McCormick via dev
Hey everyone, the next release (2.52.0) branch cut is scheduled for Nov 1, 2023, 2 weeks from today, according to the release calendar [1]. I'd like to perform this release; I will cut the branch on that date and cherry-pick release-blocking fixes afterwards, if any. Please help with the release

Beam High Priority Issue Report (42)

2023-10-18 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/29022 [Failing Test]:

Re: KafkaIO does not make use of Kafka Consumer Groups [kafka] [java] [io]

2023-10-18 Thread Jan Lukavský
Hi, my two cents on this. While it would be perfectly possible to use consumer groups in KafkaIO, doing so has its own issues. The most visible is that using subscriptions might introduce unnecessary duplicates in downstream processing. The reason for this is that a consumer in a consumer group