Installing cython in the application environment fixed the issue. Now I am
able to see the operator metrics
({organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean).

Thanks Ankur for looking into it and providing support.

I am going to close https://issues.apache.org/jira/browse/BEAM-7058 if no
one has any objection.

On Thu, Apr 11, 2019 at 7:13 AM Thomas Weise <[email protected]> wrote:

> Tracked as https://issues.apache.org/jira/browse/BEAM-7058
>
>
> On Wed, Apr 10, 2019 at 11:38 AM Pablo Estrada <[email protected]> wrote:
>
>> This sounds like a bug then? +Alex Amato <[email protected]>
>>
>> On Wed, Apr 10, 2019 at 3:59 AM Maximilian Michels <[email protected]>
>> wrote:
>>
>>> Hi @all,
>>>
>>> From a quick debugging session, I conclude that the wiring is in place
>>> for the Flink Runner. There is a ProgressReporter that reports
>>> MonitoringInfos to Flink, in a similar fashion as the "legacy" Runner.
>>>
>>> The bundle duration metrics are 0, but the element count gets reported
>>> correctly. It appears to be an issue of the Python/Java harness because
>>> "ProcessBundleProgressResponse" contains only 0 values for the bundle
>>> duration.
>>>
>>> Thanks,
>>> Max
>>>
>>> On 04.04.19 19:54, Mikhail Gryzykhin wrote:
>>> > Hi everyone,
>>> >
>>> > Quick summary on python and Dataflow Runner:
>>> > Python SDK already reports:
>>> > - MSec
>>> > - User metrics (int64 and distribution)
>>> > - PCollection Element Count
>>> > - Work on MeanByteCount for pcollection is ongoing here
>>> > <https://github.com/apache/beam/pull/8062>.
>>> >
>>> > Dataflow Runner:
>>> > - all metrics listed above are passed through to Dataflow.
>>> >
>>> > Ryan can give more information on Flink Runner. I also see Maximilian on
>>> > some of relevant PRs, so he might comment on this as well.
>>> >
>>> > Regards,
>>> > Mikhail.
>>> >
>>> >
>>> > On Thu, Apr 4, 2019 at 10:43 AM Pablo Estrada <[email protected]
>>> > <mailto:[email protected]>> wrote:
>>> >
>>> >     Hello guys!
>>> >     Alex, Mikhail and Ryan are working on support for metrics in the
>>> >     portability framework.
>>> >     The support on the SDK is pretty advanced AFAIK*, and the next
>>> >     step is to get the metrics back into the runner. Lukazs and
>>> >     myself are working on a project that depends on this too, so I'm
>>> >     adding everyone so we can get an idea of what's missing.
>>> >
>>> >     I believe:
>>> >     - User metrics are fully wired up in the SDK
>>> >     - State sampler (timing) metrics are wired up as well (is that
>>> >     right, +Alex Amato <mailto:[email protected]>?)
>>> >     - Work is ongoing to send the updates back to Flink.
>>> >     - What is the plan for making metrics queriable from Flink? +Ryan
>>> >     Williams <mailto:[email protected]>
>>> >
>>> >     Thanks!
>>> >     -P.
>>> >
>>> >
>>> >
>>> >     On Wed, Apr 3, 2019 at 12:02 PM Thomas Weise <[email protected]
>>> >     <mailto:[email protected]>> wrote:
>>> >
>>> >         I believe this is where the metrics are supplied:
>>> >         https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py
>>> >
>>> >         git grep process_bundle_msecs yields results for dataflow
>>> >         worker only
>>> >
>>> >         There isn't any test coverage for the Flink runner:
>>> >
>>> >         https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181
>>> >
>>> >
>>> >
>>> >         On Wed, Apr 3, 2019 at 10:45 AM Akshay Balwally
>>> >         <[email protected] <mailto:[email protected]>> wrote:
>>> >
>>> >             Should have added- I'm using Python sdk, Flink runner
>>> >
>>> >             On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally
>>> >             <[email protected] <mailto:[email protected]>> wrote:
>>> >
>>> >                 Hi,
>>> >                 I'm hoping to get metrics on the amount of time spent
>>> >                 on each operator, so it seems like the stat
>>> >
>>> >                 {organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean
>>> >
>>> >                 would be pretty helpful.
>>> >                 But in practice, this stat always shows 0, which I
>>> >                 interpret as 0 milliseconds spent per bundle, which
>>> >                 can't be correct (other stats show that the operators
>>> >                 are running, and timers within the operators show
>>> >                 more reasonable times). Is this a known bug?
>>> >
>>> >
>>> >                 --
>>> >                 *Akshay Balwally*
>>> >                 Software Engineer
>>> >                 937.271.6469 <tel:+19372716469>
>>> >                 Lyft <http://www.lyft.com/>
>>> >
>>> >
>>> >
>>> >             --
>>> >             *Akshay Balwally*
>>> >             Software Engineer
>>> >             937.271.6469 <tel:+19372716469>
>>> >             Lyft <http://www.lyft.com/>
>>> >
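
For anyone who lands on this thread with the same zero-valued
process_bundle_msecs stat, here is a minimal pre-flight sketch. It assumes,
based only on the fix reported at the top of this thread, that the missing
piece is cython in the environment where the Python SDK harness runs. The
helpers module_available and missing_modules are hypothetical and not part
of Beam. "Cython" is the import name of the cython package; the module path
apache_beam.runners.worker.statesampler_fast is my assumption about where
the compiled state sampler lives and may differ between Beam versions.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A dotted name whose parent package is missing raises
        # ModuleNotFoundError (a subclass of ImportError).
        return False


def missing_modules(names):
    """Return the subset of `names` that cannot be located, in order."""
    return [name for name in names if not module_available(name)]


if __name__ == "__main__":
    # Both names below are assumptions for illustration (see the note above).
    required = [
        "Cython",
        "apache_beam.runners.worker.statesampler_fast",
    ]
    for name in missing_modules(required):
        print(f"not found in this environment: {name}")
```

Running this inside the application environment (not just on the machine
that submits the job) should show whether the compiled pieces are visible
there. An empty output does not prove the timing metrics will be non-zero;
it only rules out the cause discussed in this thread.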
