Reshuffle stuck in python API

2021-02-17 Thread Manninger, Matyas
Dear Beam users,

I have a problem running a Python pipeline in Dataflow. Because of the many
side inputs and the complicated architecture, Google told us that their
optimization algorithm gets confused and that adding a Reshuffle to the
pipeline solves the issue. Unfortunately, the Reshuffle step does not seem
to be working properly. I added a 60-second fixed window in front of it, as
this is a streaming pipeline. Elements appear to enter the step, but they
seem to remain grouped inside it, as only very few elements come out of the
step. Any ideas what I might be doing wrong? The code is very long and
complicated, and I would rather not share it, but are there any typical
mistakes regarding reshuffling?

Thanks for any tips,
Matyas


Fwd: Defining Custom Labels / Label Support in Beam Metrics

2021-02-17 Thread Rion Williams
+user for additional reach

---------- Forwarded message ---------
From: Rion Williams 
Date: Mon, Feb 15, 2021 at 7:58 PM
Subject: Defining Custom Labels / Label Support in Beam Metrics
To: dev 


Hey all,

I've been working extensively on the observability stories surrounding some
of the pipelines that I've been building recently in Beam and while the
existing Metrics have been extremely helpful, I don't see any easy or
perhaps just well documented way of adding/updating labels for these
metrics.

What I was hoping to accomplish would be to just use these metrics and
scrape them into Prometheus/Grafana to create some nice dashboards. However,
in some of these use cases, I need to be able to slice the data by certain
attributes (e.g. a tenant, source, etc.) in addition to just
overall/aggregate data.

Does Beam have support for this? If not, what have folks been using to
accomplish this? My first thought was to consider using the existing
Prometheus Client Metrics [0]; however, if Beam had support for this, I'd
much prefer that. Additionally, I'm not entirely sure what challenges might
come with implementing metrics at the DoFn/Transform level using a
third-party library such as this, given the distributed nature of Beam.

Any insight / recommendations / thoughts would be greatly appreciated!

Thanks,

Rion

[0] https://prometheus.io/docs/instrumenting/clientlibs/
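
One possible workaround, assuming labels are not available through the
Metrics API in use, is to fold the label values into the metric name and
unpack them again when exporting to Prometheus. The sketch below is plain
Python and not a Beam feature; the helper names and the "|", "," and "="
separators are hypothetical choices made for illustration.

```python
RESERVED = "|,="  # separator characters, so they may not appear in names or labels


def encode_metric_name(base, labels):
    """Fold labels into one metric name, e.g. 'events|source=kafka,tenant=acme'."""
    for part in [base, *labels.keys(), *labels.values()]:
        if any(ch in part for ch in RESERVED):
            raise ValueError(f"reserved character in {part!r}")
    if not labels:
        return base
    # Sort the labels so the same label set always yields the same metric name.
    pairs = ",".join(f"{key}={value}" for key, value in sorted(labels.items()))
    return f"{base}|{pairs}"


def decode_metric_name(name):
    """Split an encoded name back into (base, labels)."""
    base, _, encoded = name.partition("|")
    labels = dict(pair.split("=", 1) for pair in encoded.split(",")) if encoded else {}
    return base, labels


def to_prometheus_line(name, value):
    """Render one sample in the Prometheus text exposition format."""
    base, labels = decode_metric_name(name)
    if not labels:
        return f"{base} {value}"
    rendered = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{base}{{{rendered}}} {value}"


# Inside a DoFn, the encoded name would be passed as the `name` argument of
# apache_beam.metrics.Metrics.counter(namespace, name).
name = encode_metric_name("events_processed", {"tenant": "acme", "source": "kafka"})
print(to_prometheus_line(name, 3))
```

Each distinct label combination becomes its own metric under this scheme, so
it is only practical for attributes with a small, bounded set of values.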


Regarding Beam 2.28 release timeline

2021-02-17 Thread Tao Li
Hi Beam community,

I am looking forward to the Beam 2.28 release, which will probably include
BEAM-11527. We will depend on BEAM-11527 for a major work item on my side.
Can someone please provide an ETA for the 2.28 release? Thanks so much!



Re: Regarding Beam 2.28 release timeline

2021-02-17 Thread Chamikara Jayalath
Currently the first release candidate is being validated.
Please see here for updates:
https://lists.apache.org/thread.html/r247d51726a8f330400468db0f69b3531e47525952fb6af5d8614%40%3Cdev.beam.apache.org%3E

Thanks,
Cham

On Wed, Feb 17, 2021 at 2:02 PM Tao Li wrote:

> Hi Beam community,
>
> I am looking forward to the Beam 2.28 release, which will probably include
> BEAM-11527. We will depend on BEAM-11527 for a major work item on my side.
> Can someone please provide an ETA for the 2.28 release? Thanks so much!