Hi @all,
From a quick debugging session, I conclude that the wiring is in place
for the Flink Runner. There is a ProgressReporter that reports
MonitoringInfos to Flink, in a similar fashion to the "legacy" Runner.
The bundle duration metrics are 0, but the element count is reported
correctly. It appears to be an issue in the Python/Java harness, because
the ProcessBundleProgressResponse contains only 0 values for the bundle
duration.
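For reference, a minimal Python sketch of what inspecting such a response
could look like. The dict shapes and the "label" field are simplified
stand-ins for the MonitoringInfo protos; the two URNs are modeled on the
metric names seen in this thread:

```python
# Illustrative sketch only: plain dicts stand in for MonitoringInfo
# protos, and "label" is a simplified stand-in for the real label map.
MSEC_URN = "beam:metric:pardo_execution_time:process_bundle_msecs:v1"
ELEMENT_COUNT_URN = "beam:metric:element_count:v1"

def split_progress(monitoring_infos):
    """Partition monitoring infos into msec and element-count buckets."""
    msecs, counts = {}, {}
    for info in monitoring_infos:
        if info["urn"] == MSEC_URN:
            msecs[info["label"]] = info["value"]
        elif info["urn"] == ELEMENT_COUNT_URN:
            counts[info["label"]] = info["value"]
    return msecs, counts

# A response like the one described: element counts populated,
# bundle-duration msecs all zero.
infos = [
    {"urn": ELEMENT_COUNT_URN, "label": "read/out", "value": 42},
    {"urn": MSEC_URN, "label": "read", "value": 0},
]
msecs, counts = split_progress(infos)
```

This is just to make the symptom concrete: the element-count bucket comes
back populated while every msec entry is zero.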
Thanks,
Max
On 04.04.19 19:54, Mikhail Gryzykhin wrote:
Hi everyone,
Quick summary on Python and the Dataflow Runner:
Python SDK already reports:
- MSec
- User metrics (int64 and distribution)
- PCollection Element Count
- Work on MeanByteCount for PCollections is ongoing here
<https://github.com/apache/beam/pull/8062>.
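As a point of reference for what "user metrics (int64 and distribution)"
aggregate, here is a minimal self-contained sketch of a distribution-style
cell. The class and method names are hypothetical, but the sum/count/min/max
shape matches what a distribution metric tracks:

```python
class DistributionCell:
    """Minimal sketch of a distribution user metric: tracks the
    sum, count, min, and max of reported int64 values."""

    def __init__(self):
        self.sum = 0
        self.count = 0
        self.min = None
        self.max = None

    def update(self, value):
        # Fold one reported value into the running aggregate.
        self.sum += value
        self.count += 1
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    def mean(self):
        return self.sum / self.count if self.count else 0

cell = DistributionCell()
for v in (3, 7, 5):
    cell.update(v)
# cell.sum == 15, cell.count == 3, cell.min == 3, cell.max == 7
```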
Dataflow Runner:
- all metrics listed above are passed through to Dataflow.
Ryan can give more information on the Flink Runner. I also see Maximilian on
some of the relevant PRs, so he might comment on this as well.
Regards,
Mikhail.
On Thu, Apr 4, 2019 at 10:43 AM Pablo Estrada <[email protected]
<mailto:[email protected]>> wrote:
Hello guys!
Alex, Mikhail and Ryan are working on support for metrics in the
portability framework. The support on the SDK is pretty advanced
AFAIK*, and the next step is to get the metrics back into the
runner. Lukasz and I are working on a project that depends on
this too, so I'm adding everyone so we can get an idea of what's
missing.
I believe:
- User metrics are fully wired up in the SDK
- State sampler (timing) metrics are wired up as well (is that
right, +Alex Amato <mailto:[email protected]>?)
- Work is ongoing to send the updates back to Flink.
- What is the plan for making metrics queryable from Flink? +Ryan
Williams <mailto:[email protected]>
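On the querying question: whatever the Flink answer ends up being, the query
surface could look roughly like a MetricsFilter-style lookup over accumulated
results. A hedged plain-Python sketch, with all names hypothetical:

```python
def query_metrics(results, namespace=None, name=None):
    """Filter accumulated metric results by namespace and/or name,
    loosely mimicking the shape of a MetricsFilter query.
    Illustrative only; not a Beam API."""
    return [
        r for r in results
        if (namespace is None or r["namespace"] == namespace)
        and (name is None or r["name"] == name)
    ]

# Hypothetical accumulated results a runner might hand back.
results = [
    {"namespace": "my.pipeline", "name": "elements", "value": 42},
    {"namespace": "my.pipeline", "name": "errors", "value": 1},
]
matched = query_metrics(results, name="elements")
```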
Thanks!
-P.
On Wed, Apr 3, 2019 at 12:02 PM Thomas Weise <[email protected]
<mailto:[email protected]>> wrote:
I believe this is where the metrics are supplied:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py
Running "git grep process_bundle_msecs" yields results for the
Dataflow worker only.
There isn't any test coverage for the Flink runner:
https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181
On Wed, Apr 3, 2019 at 10:45 AM Akshay Balwally
<[email protected] <mailto:[email protected]>> wrote:
Should have added- I'm using Python sdk, Flink runner
On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally
<[email protected] <mailto:[email protected]>> wrote:
Hi,
I'm hoping to get metrics on the amount of time spent in
each operator, so it seems like the stat
{organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean
would be pretty helpful. But in practice, this stat
always shows 0, which I interpret as 0 milliseconds
spent per bundle. That can't be correct: other stats
show that the operators are running, and timers within
the operators show more reasonable times. Is this a
known bug?
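Until the gauge reports real values, one workaround I've sketched
(a hypothetical helper, not a Beam API) is to time bundles by hand
and compare against the reported stat:

```python
import time

def timed_bundle(process_element, bundle):
    """Hypothetical sketch: process a bundle and measure its wall-clock
    duration in msecs, to cross-check process_bundle_msecs."""
    start = time.monotonic()
    out = [process_element(e) for e in bundle]
    elapsed_msecs = int((time.monotonic() - start) * 1000)
    return out, elapsed_msecs

out, msecs = timed_bundle(lambda x: x * 2, range(1000))
# msecs should be >= 0, and nonzero for real workloads
```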
--
*Akshay Balwally*
Software Engineer
937.271.6469 <tel:+19372716469>
Lyft <http://www.lyft.com/>