Thanks for your thoughts, Robert.
On 15.05.23, 19:23, "Robert Bradshaw via dev" wrote:
On Mon, May 15, 2023 at 8:38 AM Moritz Mack wrote:
>
> Hi all,
>
> I was just looking into an old issue again, SerializablePipelineOptions
> calling FileSystems.setDefaultPipelineO
Hi all,
I was just looking into an old issue again, SerializablePipelineOptions calling
FileSystems.setDefaultPipelineOptions on deserialization [1]. This applies to
various runners including Flink and Spark, but not Dataflow as far as I know.
Problem:
Current initialization of FileSystems thr
Tue, Apr 11, 2023, 8:14 AM Moritz Mack wrote: Thanks so much
The coverage issue is specific to the Java builds.
Go and Python have their coverage numbers and codecov uploads done in GitHub
Actions instead.
On Tue, Apr 11, 2023, 8:14 AM Moritz Mack <mm...@talend.com> wrote:
Thanks so much for looking into this!
I’m absolutely +1 for removing Jenkins related friction and the proposed
changes sound legitimate.
Also, considering the number of flaky tests in general [1], code coverage might
not be the pressing issue. Should it be disabled everywhere in favor of more
r
Congrats, Jan!
On 16.02.23, 23:28, "Luke Cwik via dev" wrote:
Congrats, well deserved. On Thu, Feb 16, 2023 at 10:32 AM Anand Inguva via dev
wrote: Congratulations!! On Thu, Feb 16, 2023 at 12:42 PM Chamikara Jayalath
via dev wrote: Congrats Jan! On
Congrats, well deserved.
On Thu, Feb 16
Dear All,
The runner for Spark 2 was deprecated quite a while back in August 2022 with
the release of Beam 2.41.0 [1]. We’re planning to move ahead with this and
finally remove support for Spark 2 (beam-runners-spark) to only maintain
support for Spark 3 (beam-runners-spark-3) going forward.
N
Hi all,
Just bumping this up again. Would anyone familiar with Beam SQL be able to have
a look at this potential bug in Beam SQL? That help would be much appreciated!
Thanks so much,
Moritz
On 23.11.22, 09:58, "Moritz Mack" wrote:
Hi all, Not sure who’s best to ping. I spent
Hi Damon,
I fear the current release / versioning strategy of Beam doesn’t lend itself
well to such breaking changes. Alexey and I have spent quite some time
discussing how to proceed with the problematic Avro dependency in core (and
respectively AvroIO, of course).
Such changes essentially al
Thanks so much! Great to see this being picked up again with some good progress.
/ Moritz
On 11.12.22, 15:17, "Herman Mak via dev" wrote:
Hello Everyone, *TLDR* Should we adopt a set of standards that Connector I/Os
should adhere to? Attached is a first version of a Beam I/O Standards
guideli
Great, I really like the new simplified flow! Thanks for that!
On 08.12.22, 19:48, "Kenneth Knowles" wrote:
Merged it. Please be on the lookout for bugs I have introduced, since they
could result in issues slipping through the cracks. On Wed, Dec 7, 2022 at
3:31 PM Kenneth Knowles wrote: OK
Hi all,
Not sure who’s best to ping. I spent some time looking into the SqlTransform
translation of one of the TPC-DS queries yesterday and noticed it’s generating
an overly complex transform hierarchy. I’ve summarized my findings in [1]. It
would be great to get some more experienced eyes on i
Also, thanks so much for all the great and thorough reviews! That was always
much appreciated!
All the best, Brian
On 11.11.22, 23:23, "Ahmet Altay via dev" wrote:
Thank you for everything Brian! On Fri, Nov 11, 2022 at 11:27 AM Austin
Bennett wrote: Thanks for everything you've done, @
Bhul
Thanks a lot for the feedback so far! I can only second Alexey. It was painful
to come to realize that the only feasible option seems to be copying a lot of
code during the transition phase.
For that reason, it will be critical to be disciplined about the removal of the
to-be deprecated code in
Congrats, Ritesh 😊
On 05.11.22, 03:08, "Ahmet Altay via dev" wrote:
Congratulations Ritesh! On Fri, Nov 4, 2022 at 12:18 PM Ritesh Ghorse via dev
wrote: Thanks everyone! I'm glad to be a part of this
community and I look forward to making more contributions in whatever ways I
Congratulation
ote:
>>
>> Good idea. I'm curious about our current benchmarks. Some of them run on
>> clusters, but I think some of them are running locally and just being noisy.
>> Perhaps this could improve that. (or if they are running on local
>> Spark/Flink then maybe the
Hi team,
I’m looking for some help to set up infrastructure to periodically run Java
microbenchmarks (JMH).
Results of these runs will be added to our community metrics (InfluxDB) to help
us track performance, see [1].
To prevent noisy runs this would require a dedicated Jenkins machine that run
Sorry for the confusion. Beam migrated to using GitHub issues just recently and
the Confluence docs haven’t been updated yet.
Please create a new issue under https://github.com/apache/beam/issues and then
reference it in your commit message using the issue id, e.g.
git commit -am “Description of
Hi,
Please use a git clone of the apache/beam repository [1] as mentioned in the
instructions [2]:
> git clone g...@github.com:apache/beam.git
It looks like the source code archive you’ve downloaded doesn’t contain some
necessary build sources such as this plugin.
Regards, Moritz
[1] https://
simpler to just have a
flag on your translator that translates Create.Values into something that looks
unbounded at the Spark layer.
Kenn
On Thu, Jul 28, 2022 at 2:01 AM Moritz Mack <mm...@talend.com> wrote:
Hi all,
Wondering if somebody could help and shed some light on
Hi all,
Wondering if somebody could help and shed some light on the behavior of
Pipeline.replaceAll, particularly the outputs to expect after the replacement.
I’m currently looking into supporting VR tests for SparkRunner in streaming
mode [1]. Unfortunately, I didn’t succeed replacing (wrappin
Congrats, Steven!
On 21.07.22, 05:25, "Evan Galpin" wrote:
Congrats! Well deserved! On Wed, Jul 20, 2022 at 15:17 Chamikara Jayalath via
dev wrote: Congrats, Steve! On Wed, Jul 20, 2022,
9:16 AM Austin Bennett wrote: Great!
nal performance benchmarks! But what does JMH stand for? On
Tue, Jul 12, 2022, 7:54 AM Moritz Mack wrote: Hi all, This
is a very short proposal to start running JMH benchmarks periodically and
Hi all,
This is a very short proposal to start running JMH benchmarks periodically and
store benchmark results so we can start monitoring performance trends on the
community metrics dashboards over time. Comments most welcome!
https://s.apache.org/nvi9g
Best regards,
Moritz
2.16.0 35
2.17.0 18
2.18.0 67
2.19.0 29
2.20.0 22
2.21.0 23
2.22.0 47
2.23.0 19
2.24.0 63
2.25.0 17
2.26.0 23
2.27.0 18
2.28.0 81
2.29.0 268
2.30.0 26
2.31.0 24
2.32.0 69
2.33.0 403
2.34.0 352
2.35.0 1543
2.36.0 50
2.37.0 19
2.38.0 420
2.39.0 86
All 5224
On Wed, Jun 29, 2022 at 1:24 AM Moritz Mack
Who could help with pulling the latest Maven download stats for
beam-runners-spark and beam-runners-spark-3 for the last few Beam releases?
Thanks so much!
/ Moritz
On 01.04.22, 16:54, "Moritz Mack" wrote:
I just started looking into the Spark runner code a bit to hopefully help
sup
Hi Yichi,
Sorry for breaking that :/ I had a quick look and would suggest only
generating Javadocs for the latest Spark runner version. Please see here:
https://github.com/apache/beam/pull/17793
The process of aggregating Javadocs is a bit sketchy in cases where source sets
are shared. Partic
Hi Yushu,
Have a look at org.apache.beam.runners.spark.translation.EvaluationContext in
the Spark runner. It maintains that mapping between PCollections and RDDs
(wrapped in the BoundedDataset helper). As Reuven just pointed out, values are
timestamped (and windowed) in Beam, therefore BoundedD
Does anybody here have some insights on this? Really wondering about the
numbers, initializing all filesystems ~80k times for a pipeline run doesn’t
seem right.
On 13.05.22, 09:10, "Moritz Mack" wrote:
Hi Jack, Silencing info logs for that class during IT tests would be a quick
fix
Sorry, please ignore the previous empty reply …
On 17.05.22, 09:31, "Moritz Mack" wrote:
On 16.05.22, 18:09, "Robert Bradshaw" wrote:
On Mon, May 16, 2022 at 8:53 AM Alexey Romanenko
wrote:
>
>> On 13 May 2022, at 18:38, Robert Bradshaw wrote:
>>
On 17.05.22, 09:31, "Moritz Mack" wrote:
On 16.05.22, 18:09, "Robert Bradshaw" wrote: On Mon, May
16, 2022 at 8:53 AM Alexey Romanenko wrote: > >> On
13 May 2022, at 18:38, Robert Bradshaw
On 16.05.22, 18:09, "Robert Bradshaw"
On 16.05.22, 18:09, "Robert Bradshaw" wrote:
On Mon, May 16, 2022 at 8:53 AM Alexey Romanenko
wrote:
>
>> On 13 May 2022, at 18:38, Robert Bradshaw wrote:
>>
>> We should probably remove the experimental annotations from
> SchemaCoder at this point.
>
> Is there anything that stops us from th
Thanks so much for all these pointers, Alexey. Having that context really helps!
Skimming through the past conversations, this one key consideration hasn’t
changed and still seems critical:
AvroCoder is the de facto standard for encoding complex user types (with
SchemaCoder still being experimen
Hi Jack,
Silencing info logs for that class during IT tests would be a quick fix, but
also removing logging there entirely shouldn’t hurt.
If the S3 filesystem is used it’ll fail on first usage and the issue should be
fairly obvious…
Though wondering, this is logged once when file systems are i
should be the next major release but I’m not sure
it’s even on the distant horizon for now since this is a topic that we didn’t
discuss for a long time (maybe it’s a good time to come back to this).
—
Alexey
On 18 Mar 2022, at 12:19, Moritz Mack <mm...@talend.com> wrote:
Dear all,
From: Moritz Mack
Sent: 21 March 2022 12:58
To: dev@beam.apache.org
Subject: Re: [DISCUSS] Deprecation of AWS SDK v1 IO connectors
Thank you both! Absolutely agree on reaching out to users!
The release of 2.38 seems to be a very good time to do so t
I just started looking into the Spark runner code a bit to hopefully help
support it.
Besides having to maintain (test!) twice the number of artifacts, there’s also
a significant negative impact on developer ergonomics / productivity supporting
multiple major versions (separate modules to dea
goes along with the attachValues method, which is
similarly tricky to use. It's there to enable 0-copy code, but not
necessarily intended for general consumption.
On Tue, Mar 29, 2022 at 9:42 AM Moritz Mack <mm...@talend.com> wrote:
Dear team,
Is anybody around who could help me wi
Dear team,
Is anybody around who could help me with a question on Schemas / Rows? That
would be much appreciated!
I’m particularly looking at RowWithGetters currently and I’m stuck
understanding the semantics of Row.getValues() [1].
public List getValues() {
return getters.stream().map(g ->
ut I’m not sure
it’s even on the distant horizon for now since this is a topic that we didn’t
discuss for a long time (maybe it’s a good time to come back to this).
—
Alexey
On 18 Mar 2022, at 12:19, Moritz Mack <mm...@talend.com> wrote:
Dear all,
I’d like to bring up an old discussion
Dear all,
I’d like to bring up an old discussion again [1].
Currently we have two different versions of AWS IO connectors in Beam for the
Java SDK:
* amazon-web-services [2] and kinesis [3] for the AWS Java SDK v1
* amazon-web-services2 (including kinesis) [4] for the AWS Java SDK v2
M
A NATS connector would be great, Suresh.
Really enjoyed how easy to operate and reliable it is!
Curious, are you using NATS with JetStream enabled (the replacement of the
legacy NATS streaming layer) or core NATS (at-most-once delivery)?
Regards,
Moritz
From: Alexey Romanenko
Date: Thursday, 1
Thanks so much everyone 😊
From: Pablo Estrada
Date: Friday, 11. March 2022 at 17:43
To: dev
Subject: Re: [ANNOUNCE] New committer: Moritz Mack
Congrats Moritz! Well deserved indeed:) On Fri, Mar 11, 2022, 6:30 AM Evan
Galpin wrote: Congrats Moritz! On Fri, Mar 11, 2022 at
3:05 AM Etienne
Thanks so much Stephen and welcome to Beam!
I’m more than happy to review your PR, just ping me once opened (R: @mosche).
I’ve done a bit of work recently to get the AWS v2 module in a better / ready
shape, your help there is much appreciated.
/Moritz
From: Ahmet Altay
Date: Tuesday, 1. March 20
Just having a quick look, it looks like the respective interface in KafkaIO
should rather look like this to support KafkaAvroSerializer, which is a
Serializer:
public Write withValueSerializer(Class>
valueSerializer)
Thoughts?
Cheers, Moritz
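
To make the typing problem concrete, here is a minimal, self-contained sketch. It does not use the real Kafka or Beam classes: Serializer, ObjectSerializer and Write below are hypothetical stand-ins, and the "relaxed" signature is only one possible way to loosen the bound, not necessarily the exact signature proposed above. The one fact it relies on is that KafkaAvroSerializer implements Serializer<Object>, which is why a class token for it does not match a parameter that expects a serializer of exactly V.

```java
import java.nio.charset.StandardCharsets;

// Stand-in types only: these are NOT the Kafka or Beam classes, just a
// minimal model of the generics involved.
class SerializerCastSketch {

  /** Stand-in for Kafka's Serializer interface. */
  interface Serializer<T> {
    byte[] serialize(T value);
  }

  /** Stand-in for KafkaAvroSerializer, which is a Serializer<Object>. */
  static class ObjectSerializer implements Serializer<Object> {
    @Override
    public byte[] serialize(Object value) {
      return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
    }
  }

  /** Hypothetical stand-in for the write transform, with two signature variants. */
  static class Write<V> {
    Class<?> serializerClass;

    // Strict variant: only accepts serializers declared for exactly V.
    Write<V> withValueSerializer(Class<? extends Serializer<V>> cls) {
      this.serializerClass = cls;
      return this;
    }

    // Relaxed variant (an assumption): also accepts serializers of any supertype of V.
    Write<V> withValueSerializerRelaxed(Class<? extends Serializer<? super V>> cls) {
      this.serializerClass = cls;
      return this;
    }
  }

  @SuppressWarnings({"unchecked", "rawtypes"})
  static Class<?> viaRawCast() {
    Write<String> write = new Write<>();
    // ObjectSerializer is not a Serializer<String>, so the strict variant
    // only compiles with a raw cast (and an unchecked warning).
    return write.withValueSerializer((Class) ObjectSerializer.class).serializerClass;
  }

  static Class<?> viaRelaxedSignature() {
    Write<String> write = new Write<>();
    // Object is a supertype of String, so no cast is needed here.
    return write.withValueSerializerRelaxed(ObjectSerializer.class).serializerClass;
  }

  public static void main(String[] args) {
    System.out.println(viaRawCast().getSimpleName());
    System.out.println(viaRelaxedSignature().getSimpleName());
  }
}
```

Both paths end up with the same serializer class; the only difference is whether the caller has to write the raw cast and suppress the warning.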
From: Moritz Mack
Date: Tuesday, 8. Febru
Hi Matt,
Unfortunately, the types don’t play well when using KafkaAvroSerializer. It
currently requires a cast :/
The following will work:
write.withValueSerializer((Class) KafkaAvroSerializer.class)
This seems to be the cause of repeated confusion, so probably worth improving
the user experien
Hi,
Welcome 😊
I don’t have permissions to manage Jira, but you should have received an invite
for Slack.
Best,
Moritz
From: Mostafa Aghajani
Reply to: "dev@beam.apache.org"
Date: Tuesday, 2. November 2021 at 17:27
To: "dev@beam.apache.org"
Subject: Beam Contributer
Hi all,
I’m very much looking forward to contributing to Beam and just want to
briefly introduce myself.
My name is Moritz (mosche) and I’m working together with Alexey and Etienne.
Having worked mostly with Spark in the past, I’m excited to dive deeper into
Beam 😊
Looking forward to wo