Pablo, what about setting a lower sampling rate? Or would that lead to poor results?
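
For concreteness, here is a rough sketch of the trade-off behind that question. It is not Beam's actual state sampler: the class `SimpleStateSampler`, its methods, and the `period_ms` parameter are hypothetical names made up for illustration. The only thing taken from the thread is the idea of a background check every 200 ms that needs a lock.

```python
import threading
import time


class SimpleStateSampler:
    """Hypothetical pure-Python sampler; NOT Beam's implementation."""

    def __init__(self, period_ms=200):
        # Assumption: 200 ms default, matching the interval quoted in the thread.
        self.period_ms = period_ms
        self.msecs_by_state = {}
        self._current_state = None
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def set_state(self, name):
        # Runs on the processing thread at every state transition.
        with self._lock:
            self._current_state = name

    def sample_once(self, elapsed_ms):
        # Attribute the whole elapsed interval to whichever state is current.
        with self._lock:
            state = self._current_state
            if state is not None:
                self.msecs_by_state[state] = (
                    self.msecs_by_state.get(state, 0) + elapsed_ms)

    def _run(self):
        last = time.monotonic()
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.period_ms / 1000.0):
            now = time.monotonic()
            self.sample_once((now - last) * 1000.0)
            last = now

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

In this sketch a larger `period_ms` means the sampling thread takes the lock less often, but each sample assigns a bigger chunk of time to a single state, so short-lived states get noisier numbers. The lock in `set_state` is paid on every state transition regardless of the period, so lowering the sampling rate would only reduce the sampler thread's share of the cost.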

Kyle Weaver | Software Engineer | github.com/ibzib | [email protected] | +16502035555

On Mon, Jul 15, 2019 at 4:44 PM Pablo Estrada <[email protected]> wrote:

> @Thomas do you think this is a problem of documentation, or a missing
> feature?
>
> We did not add support for it without Cython because the cost of locking
> and checking every 200ms in Python would be too high - that's why this is
> only implemented in the optimized Cython codepath. I think it makes sense
> to document this, rather than adding the support, as it would be really
> expensive. What are your thoughts?
>
> Best
> -P.
>
> On Mon, Jul 15, 2019, 1:48 PM Thomas Weise <[email protected]> wrote:
>
>> That's great, but I think the JIRA needs to remain open since w/o Cython
>> the metric still doesn't work.
>>
>> It would however be helpful to add a comment regarding your findings.
>>
>> On Mon, Jul 15, 2019 at 1:46 PM Rakesh Kumar <[email protected]> wrote:
>>
>>> Installing Cython in the application environment fixed the issue. Now I
>>> am able to see the operator metrics
>>> ({organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean)
>>>
>>> Thanks Ankur for looking into it and providing support.
>>>
>>> I am going to close https://issues.apache.org/jira/browse/BEAM-7058 if
>>> no one has any objection?
>>>
>>> On Thu, Apr 11, 2019 at 7:13 AM Thomas Weise <[email protected]> wrote:
>>>
>>>> Tracked as https://issues.apache.org/jira/browse/BEAM-7058
>>>>
>>>> On Wed, Apr 10, 2019 at 11:38 AM Pablo Estrada <[email protected]> wrote:
>>>>
>>>>> This sounds like a bug then? +Alex Amato <[email protected]>
>>>>>
>>>>> On Wed, Apr 10, 2019 at 3:59 AM Maximilian Michels <[email protected]> wrote:
>>>>>
>>>>>> Hi @all,
>>>>>>
>>>>>> From a quick debugging session, I conclude that the wiring is in
>>>>>> place for the Flink Runner. There is a ProgressReporter that reports
>>>>>> MonitoringInfos to Flink, in a similar fashion as the "legacy" Runner.
>>>>>>
>>>>>> The bundle duration metrics are 0, but the element count gets
>>>>>> reported correctly. It appears to be an issue of the Python/Java
>>>>>> harness because "ProcessBundleProgressResponse" contains only 0
>>>>>> values for the bundle duration.
>>>>>>
>>>>>> Thanks,
>>>>>> Max
>>>>>>
>>>>>> On 04.04.19 19:54, Mikhail Gryzykhin wrote:
>>>>>> > Hi everyone,
>>>>>> >
>>>>>> > Quick summary on python and Dataflow Runner:
>>>>>> > Python SDK already reports:
>>>>>> > - MSec
>>>>>> > - User metrics (int64 and distribution)
>>>>>> > - PCollection Element Count
>>>>>> > - Work on MeanByteCount for pcollection is ongoing here
>>>>>> >   <https://github.com/apache/beam/pull/8062>.
>>>>>> >
>>>>>> > Dataflow Runner:
>>>>>> > - all metrics listed above are passed through to Dataflow.
>>>>>> >
>>>>>> > Ryan can give more information on Flink Runner. I also see
>>>>>> > Maximilian on some of relevant PRs, so he might comment on this
>>>>>> > as well.
>>>>>> >
>>>>>> > Regards,
>>>>>> > Mikhail.
>>>>>> >
>>>>>> > On Thu, Apr 4, 2019 at 10:43 AM Pablo Estrada <[email protected]
>>>>>> > <mailto:[email protected]>> wrote:
>>>>>> >
>>>>>> >     Hello guys!
>>>>>> >     Alex, Mikhail and Ryan are working on support for metrics in the
>>>>>> >     portability framework. The support on the SDK is pretty advanced
>>>>>> >     AFAIK*, and the next step is to get the metrics back into the
>>>>>> >     runner. Lukazs and myself are working on a project that depends
>>>>>> >     on this too, so I'm adding everyone so we can get an idea of
>>>>>> >     what's missing.
>>>>>> >
>>>>>> >     I believe:
>>>>>> >     - User metrics are fully wired up in the SDK
>>>>>> >     - State sampler (timing) metrics are wired up as well (is that
>>>>>> >       right, +Alex Amato <mailto:[email protected]>?)
>>>>>> >     - Work is ongoing to send the updates back to Flink.
>>>>>> >     - What is the plan for making metrics queryable from Flink?
>>>>>> >       +Ryan Williams <mailto:[email protected]>
>>>>>> >
>>>>>> >     Thanks!
>>>>>> >     -P.
>>>>>> >
>>>>>> >     On Wed, Apr 3, 2019 at 12:02 PM Thomas Weise <[email protected]
>>>>>> >     <mailto:[email protected]>> wrote:
>>>>>> >
>>>>>> >         I believe this is where the metrics are supplied:
>>>>>> >         https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py
>>>>>> >
>>>>>> >         git grep process_bundle_msecs yields results for dataflow
>>>>>> >         worker only
>>>>>> >
>>>>>> >         There isn't any test coverage for the Flink runner:
>>>>>> >         https://github.com/apache/beam/blob/d38645ae8758d834c3e819b715a66dd82c78f6d4/sdks/python/apache_beam/runners/portability/flink_runner_test.py#L181
>>>>>> >
>>>>>> >         On Wed, Apr 3, 2019 at 10:45 AM Akshay Balwally
>>>>>> >         <[email protected] <mailto:[email protected]>> wrote:
>>>>>> >
>>>>>> >             Should have added: I'm using Python sdk, Flink runner
>>>>>> >
>>>>>> >             On Wed, Apr 3, 2019 at 10:32 AM Akshay Balwally
>>>>>> >             <[email protected] <mailto:[email protected]>> wrote:
>>>>>> >
>>>>>> >                 Hi,
>>>>>> >                 I'm hoping to get metrics on the amount of time
>>>>>> >                 spent on each operator, so it seems like the stat
>>>>>> >
>>>>>> >                 {organization_specific_prefix}.operator.beam-metric-pardo_execution_time-process_bundle_msecs-v1.gauge.mean
>>>>>> >
>>>>>> >                 would be pretty helpful. But in practice, this stat
>>>>>> >                 always shows 0, which I interpret as 0 milliseconds
>>>>>> >                 spent per bundle, which can't be correct (other
>>>>>> >                 stats show that the operators are running, and
>>>>>> >                 timers within the operators show more reasonable
>>>>>> >                 times). Is this a known bug?
>>>>>> >
>>>>>> >                 --
>>>>>> >                 *Akshay Balwally*
>>>>>> >                 Software Engineer
>>>>>> >                 937.271.6469 <tel:+19372716469>
>>>>>> >                 Lyft <http://www.lyft.com/>
>>>>>> >
>>>>>> >             --
>>>>>> >             *Akshay Balwally*
>>>>>> >             Software Engineer
>>>>>> >             937.271.6469 <tel:+19372716469>
>>>>>> >             Lyft <http://www.lyft.com/>
