Yes, it runs, but I think it has the same problem. It's a lot slower, so it took
me hours of running, but by the end the memory usage was high and the CPU
was at about 100%, so it seems to be the same problem.
Worth noting, perhaps, that when I use the DirectRunner I have to turn
enforceImmutability off
I think that might be possible too.
Yohei Onishi
On Tue, Jun 4, 2019 at 9:00 AM Chamikara Jayalath
wrote:
>
>
> On Mon, Jun 3, 2019 at 5:14 PM Yohei Onishi wrote:
>
>> Hi Nicolas,
>>
>> Are you running your job on Dataflow? According to GCP support, Dataflow
>> currently does not support Schem
On Mon, Jun 3, 2019 at 5:14 PM Yohei Onishi wrote:
> Hi Nicolas,
>
> Are you running your job on Dataflow? According to GCP support, Dataflow
> currently does not support Schema Registry. But we can still use Schema
> Registry to serialize / deserialize your message using
> custom KafkaAvroSeriali
Hi Nicolas,
Are you running your job on Dataflow? According to GCP support, Dataflow
currently does not support Schema Registry. But we can still use Schema
Registry to serialize / deserialize your message using
custom KafkaAvroSerializer.
In my case I implemented my custom KafkaAvroDeserializer t
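For context (this is not Yohei's actual code, just a hedged plain-Python sketch): a custom deserializer for Schema Registry messages has to handle the Confluent wire format, which is a 1-byte magic byte (0), a 4-byte big-endian schema ID, and then the Avro-encoded payload. The function name below is hypothetical:

```python
# Hedged sketch: splitting a Confluent Schema Registry framed Kafka
# message into its schema ID and Avro payload. A custom
# KafkaAvroDeserializer would look the ID up in the registry and then
# decode the payload with the matching Avro schema.
import struct

MAGIC_BYTE = 0  # Confluent wire-format magic byte

def split_confluent_frame(raw: bytes):
    """Return (schema_id, avro_payload) from a Schema Registry framed message."""
    if len(raw) < 5 or raw[0] != MAGIC_BYTE:
        raise ValueError("not a Schema Registry framed message")
    schema_id = struct.unpack(">I", raw[1:5])[0]  # 4-byte big-endian ID
    return schema_id, raw[5:]
```

The actual Avro decoding and registry lookup are omitted; this only shows the framing a custom deserializer must deal with when Dataflow itself has no Schema Registry support.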
Looking at your use case, the whole processing logic seems to be very
custom.
I would recommend using ParDos to express it. If the processing
for an individual dictionary is expensive, then you can potentially use a
reshuffle operation to distribute the updating of dictionaries over multipl
Hi Ankur,
Thanks for reply. Please find responses updated in below mail.
Thanks,
Anjana
From: Ankur Goenka [goe...@google.com]
Sent: Monday, June 03, 2019 11:01 AM
To: user@beam.apache.org
Subject: Re: How to build a beam python pipeline which does GET/POST reques
Thanks for providing more information.
Some follow up questions/comments
1. Call an API which would provide a dictionary as response.
Question: Do you need to make multiple of these API calls? If yes, what
distinguishes API call 1 from call 2? If it's the input to the API, then can
you provide the in
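For context, the per-element shape such a step usually takes (a hedged, plain-Python sketch of what a ParDo body would do; `call_api` and the injected `http_get` stub are hypothetical names, not anything from the thread):

```python
# Hedged sketch: each input element parameterizes one GET call, and the
# response dictionary is flattened into (key, value) records that
# downstream steps can consume. The HTTP client is injected so it can be
# replaced with a stub for testing.
def call_api(element, http_get):
    """element -> list of (key, value) pairs from the API's dict response."""
    response = http_get(element)      # one API call per input element
    return sorted(response.items())   # deterministic output order

# Example with a stub standing in for a real HTTP client:
fake_responses = {"call1": {"x": 1}, "call2": {"y": 2}}
records = [pair for e in ("call1", "call2")
           for pair in call_api(e, fake_responses.get)]
```

This is also why Ankur's question matters: if only the input distinguishes call 1 from call 2, the inputs themselves can be a PCollection and the calls fan out as ordinary per-element work.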
Hi Robert,
Thanks for the response.
As you've discovered, fully custom merging window fns are not yet
> supported portably, though this is on our todo list.
>
> https://issues.apache.org/jira/browse/BEAM-5601
>
Thanks for linking me to that. I've watched it and voted for it, and maybe
I'll even
Ha, sorry, I was only reading the screenshots and ignored your other comments.
So the Count fn indeed worked.
Can I ask if your SQL pipeline works on the direct runner?
-Rui
On Mon, Jun 3, 2019 at 10:39 AM Rui Wang wrote:
> BeamSQL actually only converts SELECT COUNT(*) query to the Java pipeline
> that ca
BeamSQL actually only converts a SELECT COUNT(*) query to the Java pipeline
that calls Java's builtin Count[1] transform.
Could you implement your pipeline with the Count transform to see whether this
memory issue still exists? By doing so we could narrow down the problem a bit.
If using Java directly without
Yep, that's correct.
On Mon, Jun 3, 2019 at 2:06 PM Juan Carlos Garcia wrote:
>
> Hi Robert,
>
> The elements of a PCollection are unordered. >> Yes this is something known
> and understood given the nature of a PCollection.
>
> So that means, that when we are doing a replay of past data (we rew
Hi Robert,
*The elements of a PCollection are unordered.* >> Yes this is something
known and understood given the nature of a PCollection.
So that means, that when we are doing a replay of past data (we rewind our
kafka consumer groups), in 1h of processing time, there might be multiple
1h window
The elements of a PCollection are unordered. This includes the results
of a GBK -- there is no promise that the output will be processed in any
particular order (in particular, with windows ordered by timestamp).
DoFns, especially ones with side effects, should be written with this in mind.
(There is actually ongoing dis
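A small plain-Python illustration of Robert's point (hedged: the store and function names are hypothetical, not Beam API): a side-effecting "keep the latest value" write must compare event timestamps itself rather than relying on arrival order.

```python
# Hedged illustration: because GBK output can arrive in any order, a
# DoFn that writes "latest" state to an external store must do a
# timestamp comparison itself instead of assuming later windows arrive
# later.
import random

def upsert_latest(store, key, value, event_ts):
    """Write (value, event_ts) only if it is newer than what's stored."""
    current = store.get(key)
    if current is None or event_ts > current[1]:
        store[key] = (value, event_ts)

updates = [("k", "v1", 1), ("k", "v2", 2), ("k", "v3", 3)]
random.shuffle(updates)            # simulate unordered GBK output
store = {}
for key, value, ts in updates:
    upsert_latest(store, key, value, ts)
# store["k"] ends up as ("v3", 3) regardless of processing order
```

A naive "last write wins" sink would instead give different results on each replay, which matches the symptom described when rewinding the Kafka consumer groups.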
Hi Folks,
My team and I have a situation that we cannot explain and
would like to hear your thoughts. We have a pipeline that
enriches the incoming messages and writes them to BigQuery; the pipeline looks
like this:
Apache Beam 2.12.0 / GCP Dataflow
-
- ReadFromKafka (with withCreateTime and 1
Hi Chad!
As you've discovered, fully custom merging window fns are not yet
supported portably, though this is on our todo list.
https://issues.apache.org/jira/browse/BEAM-5601
This involves calling back into the SDK to perform the actual
merging logic (and also, for full generality, being able