Re: Percentile metrics in Beam

2021-10-01 Thread Ajo Thomas
new URNs is to specifically be able to add new formats and > specifications over time. > > On Fri, Oct 1, 2021 at 10:34 AM Ajo Thomas wrote: > >> Thanks for the pointers, Luke and sorry for replying late on this thread. >> >> Distribution metric's - *void update(long

Re: Percentile metrics in Beam

2021-10-01 Thread Ajo Thomas
/beam/pull/7183 > 2: > https://github.com/apache/beam/blob/a7b706cb9d1e84709f89fe98d1dda94d4eb1243b/model/pipeline/src/main/proto/metrics.proto#L110 > 3: > https://lists.apache.org/thread.html/rfc1ff850ed2eaa9057ba9fb34286c19a802bc2720424afc0dffa3b1b%40%3Cdev.beam.apache.org%3E > > On Mon,

Re: Percentile metrics in Beam

2021-09-20 Thread Ajo Thomas
they see fit. I'd love to see a proposal / PR for this. > > fyi @Robert Bradshaw > > On Wed, Sep 15, 2021 at 10:37 AM Ajo Thomas > wrote: > >> Thanks for the response, Alexey and Ke. >> Agree with your point to introduce a new metric type (say Percentiles) >> instead o

Re: Percentile metrics in Beam

2021-09-17 Thread Ajo Thomas
Thanks for the link to the doc. I think it should be okay to include percentiles in Distribution, given that it was intended to be extensible. As for the user-facing Metrics API, there will be no changes unless we want to allow the user to specify custom percentiles in addition to a set of defaults.

Re: Percentile metrics in Beam

2021-09-15 Thread Ajo Thomas
/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/Distribution.java > > > On Sep 7, 2021, at 1:28 PM, Ajo Thomas wrote: > > Hi All, > > I am working on adding support for some additional distribution metrics > like std dev, percentiles to the Metrics API. T

Percentile metrics in Beam

2021-09-07 Thread Ajo Thomas
Hi All, I am working on adding support for some additional distribution metrics, such as standard deviation and percentiles, to the Metrics API. The runner of interest here is the Samza runner. I wanted to get the opinion of fellow Beam devs on this. One way to do this would be to make changes to the existing

Re: Portable Python pipeline not splitting reads across executors

2021-06-28 Thread Ajo Thomas
in SparkRunner. Switching to ReadAllFromAvro worked as it relies on filebasedsource.ReadAllFiles which uses a different approach to splitting the work. Thanks Ajo On Fri, Jun 11, 2021 at 8:18 AM Ajo Thomas wrote: > Hi folks, > > I am working on running a Portable Python pipeline on Spark.

Portable Python pipeline not splitting reads across executors

2021-06-11 Thread Ajo Thomas
Hi folks, I am working on running a Portable Python pipeline on Spark. The test pipeline is very straightforward: I am trying to read some Avro data in HDFS using avroio (native IO and not an external transform) and write it back to HDFS. Here is the pipeline: Pipeline: pipeline_options =

Requesting contributor permission for Beam JIRA tickets

2019-05-08 Thread Ajo Thomas
Hello, I am Ajo Thomas and I was hoping to work on making some improvements to the Beam KinesisIO Java SDK. I have created a ticket for it [ https://issues.apache.org/jira/browse/BEAM-7240 ] and was hoping to assign it to myself. Requesting the admins to please add me as a contributor for Beam's