Yes, this is explicitly supported. You can return named tuples and
dictionaries (with PCollections as values) as well.
On Tue, Aug 4, 2020 at 5:00 PM Harrison Green wrote:
>
> Hi all,
>
> I've run into a situation where I would like to return two PCollections
> during a PTransform. I am aware
Hi all,
I've run into a situation where I would like to return two PCollections
during a PTransform. I am aware of the ParDo.with_outputs construct but in
this case, the PCollections are the flattened results of several other
transforms and it would be cleaner to just return multiple PCollections
Thanks Luke. I will go through them and come back if I have any questions.
Regards,
Praveen
On Tue, Aug 4, 2020 at 3:55 PM Luke Cwik wrote:
> Take a look at the WatchGrowthFn[1] and also the in-progress Kafka PR[2].
>
> 1:
>
Take a look at the WatchGrowthFn[1] and also the in-progress Kafka PR[2].
1:
https://github.com/apache/beam/blob/6612b24ada9382706373db547b5606d6e0496b02/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Watch.java#L787
2: https://github.com/apache/beam/pull/11749
On Tue, Aug 4, 2020
Thanks for the suggestions Luke. As you know, we are just starting and
should be able to switch to SplittableDoFn, if that's the future of Beam IO
Connectors. The SplittableDoFn page has the design details, but it would be
great if we could look into an IO connector built using SplittableDoFn
for
So, after some additional digging, it appears that Beam does not consistently
check for timer expiry before calling process. The result is that it may be the
case that the watermark has moved beyond your timer expiry, and if you're
counting on the timer callback happening at the time you set it
Hi Piotr,
Are you developing against the Beam master HEAD? Can you share your code? The
x-lang transform can be tested with Flink runner, where SDF is also
supported, such as
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L205-L261
On Tue,
Hi Mayank,
Which runner do you want to run your pipeline on? You should add
'beam_fn_api' when you launch the pipeline: --experiments=beam_fn_api.
In your code:
class TestDoFn(beam.DoFn):
  def process(
      self,
      element,
      restriction_tracker=beam.DoFn.RestrictionParam(
May I suggest we print a URL (and a message) you can use to file bugs at, on
the command line when you run a Beam pipeline. (And in any other user
interface we use for Beam; some of the runner-specific UIs may want to link
to this as well.)
On Tue, Aug 4, 2020 at 9:16 AM Alexey Romanenko
wrote:
Hello Team,
I was hoping to get anyone's help with an error I'm encountering in
running SDF.
Posted the question in Stack Overflow (includes code):
https://stackoverflow.com/questions/63252327/error-in-running-apache-beam-python-splittabledofn
However, I am receiving an error:
RuntimeError:
I'm in favor of a quarantine job whose tests are called out
prominently as "possibly broken" in the release notes. As a follow up,
+1 to exploring better tooling to track at a fine-grained level
exactly how flaky these tests are (and hopefully detect if/when they go
from flaky to just plain
https://github.com/marketplace/actions/gs-commit-message-checker
On Tue, Aug 4, 2020 at 10:25 AM Robert Bradshaw wrote:
> +1, thanks for the reminder.
>
> This should be really easy to automate, using
> https://developer.github.com/webhooks/event-payloads/#pull_request to
> give a warning when
+1, thanks for the reminder.
This should be really easy to automate, using
https://developer.github.com/webhooks/event-payloads/#pull_request to
give a warning when the change history is not sufficiently "clean."
I'm not sure where to host this though (or if it could be integrated
into
+1 thanks Alexey.
My apologies for merging such a case recently (not intentionally). I
tried to use the "squash and merge" button with a consolidated commit
message. After clicking the button, GitHub showed "failed to merge" and
offered a retry button, and after clicking that retry button,
Is there a simple way to register the splittable dofn for cross-language usage?
It's a bit of a black box to me right now.
The most meaningful logs for Flink are the ones I pasted and the following:
apache_beam.utils.subprocess_server: INFO: b'[grpc-default-executor-0] WARN
Yes, good point, thanks Valentyn.
> On 4 Aug 2020, at 18:29, Valentyn Tymofieiev wrote:
>
> +1, thanks, Alexey.
>
> Also a reminder from the contributor guide: do not use the default GitHub
> commit message for merge commits, which looks like:
>
> Merge pull request #1234 from
Hi dev team,
I want to add a side_input into a ParDo transform. The side input is a
table from BigQuery.
However, I am facing a weird issue. I could run this pipeline with
*DirectRunner* on a local machine but I am not able to run this pipeline
with *DataflowRunner*. The error message is:
```
Job did
```
+1, thanks, Alexey.
Also a reminder from the contributor guide: do not use the default GitHub
commit message for merge commits, which looks like:
Merge pull request #1234 from some_user/transient_branch_name
Instead, add the commit message into the subject line, for example: "Merge
pull request
Great topic, thanks Griselda for raising this question.
I’d prefer to keep Jira as the single main issue tracker and use the other
suggested ways, like emails, GitHub issues, a web form or a dedicated Slack
channel, as different interfaces designed to simplify the way users can
submit an issue. But
On Thu, Jul 30, 2020 at 6:24 PM Ahmet Altay wrote:
> I like:
> *Include ignored or quarantined tests in the release notes*
> *Run flaky tests only in postcommit* (related? *Separate flaky tests into
> quarantine job*)
>
The quarantine job would still allow them to run in presubmit; we would
Hi all,
+1 on pinging the assigned person.
For the flakes I know of (ESIO and CassandraIO), they are due to the
load of the CI server. These IOs are tested using real embedded backends
because those backends are complex and we need relevant tests.
Counter measures have been taken (retrial
Hi all,
I’d like to draw your attention to our Git commit history and a related
issue. A while ago I noticed that it has been getting unclear and quite
verbose compared to how it was before. We have a significant number of
recent commits like “fix”, “address comments”,
I did some small research on temporary directories. It seems there's no
unified way of telling applications to use a specific path, nor any
guarantee that all of them will use the dedicated custom directory. Badly
behaving apps can always hardcode `/tmp`, e.g. Java ;)
But, we should
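A quick illustration of the point for Python specifically: the stdlib `tempfile` module honors `TMPDIR`, but consults it only once and caches the result, so the environment must be set (and the cache cleared) before the first use:

```python
import os
import tempfile

# Create a real directory to act as the "dedicated custom directory".
custom = tempfile.mkdtemp()

os.environ['TMPDIR'] = custom
tempfile.tempdir = None  # clear tempfile's cached choice so TMPDIR is re-read
assert tempfile.gettempdir() == custom

# A badly behaved app that hardcodes '/tmp' ignores all of this, of course.
```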