Still seeing these wheel build emails on my fork

2020-08-18 Thread Alex Amato
I asked about this a few weeks ago and rebased from master as suggested, but I am still seeing these. I am guessing this is wasting our resources somehow? :( On Tue, Aug 18, 2020 at 7:28 PM Alex Amato wrote: > Run failed for master (010adc5) > > Repository: ajamato/beam > Workflow: Build

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-18 Thread Saikat Maitra
Hi Val, Thank you for your response. I like the idea of a reactive, event-based processing engine for fault tolerance. As you mentioned, it will be up to the underlying system to manage job execution and offer fault tolerance, and we will need to build it in the Ignite compute execution model. I looked into

Re: BEAM-9045 Implement an Ignite runner using Apache Ignite compute grid

2020-08-18 Thread Valentin Kulichenko
Hi Saikat, Thanks for clarifying. Is there a Beam component that monitors the state, or is this up to the application? If something fails, will the application have to retry the whole pipeline? My concern is that Ignite compute actually provides very limited guarantees, especially for the async

Re: python AfterCount() trigger behavior not aligned with java

2020-08-18 Thread Leiyi Zhang
Thank you Robert! On Mon, Aug 17, 2020 at 5:52 PM Robert Bradshaw wrote: > Correct, everything is per-key. To allow triggering after n events you > would have to give them all the same key. (Note that this would > potentially introduce a bottleneck, as they would all be shuffled to > the same
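
The per-key behavior described in the quoted reply can be illustrated with a toy simulation in plain Python. This is not Beam code and not how Beam implements triggers; fire_after_count is a hypothetical helper that simply emits a pane each time a key has buffered n elements. It only shows why elements with distinct keys never reach the count, while elements sharing one key do.

```python
from collections import defaultdict


def fire_after_count(events, n):
    """Toy model of a repeated per-key count trigger (hypothetical helper).

    events: iterable of (key, value) pairs.
    Returns a list of (key, values) panes, one per n buffered elements of a key.
    """
    buffers = defaultdict(list)
    panes = []
    for key, value in events:
        buffers[key].append(value)
        if len(buffers[key]) == n:
            # The count is tracked separately for every key.
            panes.append((key, buffers.pop(key)))
    return panes


values = [1, 2, 3, 4]

# Each element has its own key, so no key ever accumulates 2 elements.
keyed_distinct = [(f"k{v}", v) for v in values]
panes_distinct = fire_after_count(keyed_distinct, 2)

# All elements share one key, so a pane is emitted after every 2 elements.
keyed_same = [(None, v) for v in values]
panes_same = fire_after_count(keyed_same, 2)

print(panes_distinct)
print(panes_same)
```

As the quoted reply notes, putting everything under one key concentrates all elements in one place, which is the potential bottleneck.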

Re: [DISCUSS][BEAM-10670] Migrating BoundedSource/UnboundedSource to execute as a Splittable DoFn for non-portable Java runners

2020-08-18 Thread Pulasthi Supun Wickramasinghe
Hi Luke, I will take a look at this as soon as possible and get back to you. Best Regards, Pulasthi On Tue, Aug 18, 2020 at 2:30 PM Luke Cwik wrote: > I have made some good progress here and have gotten to the following state > for non-portable runners: > > DirectRunner[1]: Merged. Supports

Re: Percentile metrics in Beam

2020-08-18 Thread Luke Cwik
getPMForCDF[1] seems to return a CDF and you can choose the split points (b0, b1, b2, ...). 1: https://github.com/stanford-futuredata/msketch/blob/cf4e49e860761f48ebdeb00f650ce997c46073e2/javamsketch/quantilebench/src/main/java/yahoo/DoublesPmfCdfImpl.java#L16 On Tue, Aug 18, 2020 at 11:20 AM

Re: [DISCUSS][BEAM-10670] Migrating BoundedSource/UnboundedSource to execute as a Splittable DoFn for non-portable Java runners

2020-08-18 Thread Luke Cwik
I have made some good progress here and have gotten to the following state for non-portable runners: DirectRunner[1]: Merged. Supports Read.Bounded and Read.Unbounded. Twister2[2]: Ready for review. Supports Read.Bounded, the current runner doesn't support unbounded pipelines. Spark[3]: WIP.

Re: Percentile metrics in Beam

2020-08-18 Thread Alex Amato
I'm a bit confused. Are you sure that it is possible to derive the CDF using the moments variables? The linked implementation on GitHub seems not to use a derived CDF equation, but instead uses some sampling technique (which I can't fully grasp yet) to estimate how many elements are in each

Re: Percentile metrics in Beam

2020-08-18 Thread Ke Wu
Hi Alex, It is great to know you are working on the metrics. Do you have any concerns if we add a Histogram-type metric in the Samza Runner itself for now, so we can start using it before a generic histogram metric can be introduced in the Metrics class? Best, Ke > On Aug 18, 2020, at 12:57 AM,

Re: Percentile metrics in Beam

2020-08-18 Thread Gleb Kanterov
Hi Alex, I'm not sure about restoring a histogram, because the use case I had in the past used percentiles. As I understand it, you can approximate a histogram if you know the percentiles and the total count. E.g. 5% of values fall into the [P95, +INF) bucket, another 5% into [P90, P95), etc. I don't understand the
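
The approximation mentioned in this message (a histogram from known percentiles plus the total count) can be sketched as follows. This is a minimal illustration under assumptions: histogram_from_percentiles is a hypothetical helper, not part of Beam, and the percentile values and count are made up.

```python
def histogram_from_percentiles(percentiles, total_count):
    """Approximate bucket counts from percentiles (hypothetical helper).

    percentiles: dict mapping a percentile rank (0-100) to its value,
                 e.g. {90: 120.0, 95: 200.0}.
    Returns a list of (lower_bound, upper_bound, estimated_count) tuples,
    where each bucket is [lower_bound, upper_bound).
    """
    buckets = []
    lower_rank, lower_value = 0.0, float("-inf")
    for rank in sorted(percentiles):
        value = percentiles[rank]
        # The share of values between two ranks is the rank difference.
        estimated = (rank - lower_rank) * total_count / 100.0
        buckets.append((lower_value, value, estimated))
        lower_rank, lower_value = rank, value
    # Everything above the highest known percentile.
    estimated = (100.0 - lower_rank) * total_count / 100.0
    buckets.append((lower_value, float("inf"), estimated))
    return buckets


# Made-up example: P90 = 120.0, P95 = 200.0, 1000 values in total.
buckets = histogram_from_percentiles({90: 120.0, 95: 200.0}, 1000)
for lower, upper, count in buckets:
    print(f"[{lower}, {upper}): ~{count} values")
```

The bucket boundaries are fixed by whichever percentiles were recorded, so the resolution of the resulting histogram depends entirely on that choice.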

Re: Percentile metrics in Beam

2020-08-18 Thread Luke Cwik
You can use a cumulative distribution function over the sketch at b0, b1, b2, b3, ... which will tell you the probability that any given value is <= X. You multiply that probability against the total count (which is also recorded as part of the sketch) to get an estimate for the number of values
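
The idea in this message, evaluating a CDF at split points b0, b1, b2, ... and scaling by the total count, can be sketched as below. This is an illustration under assumptions, not the msketch or Beam implementation: bucket_counts is a hypothetical helper, and a uniform CDF stands in for whatever CDF the sketch would actually provide.

```python
from typing import Callable, List, Sequence


def bucket_counts(
    cdf: Callable[[float], float],
    total_count: int,
    split_points: Sequence[float],
) -> List[float]:
    """Estimate per-bucket counts from a CDF (hypothetical helper).

    split_points must be sorted ascending. Returns len(split_points) + 1
    estimates for (-inf, b0], (b0, b1], ..., (b_last, +inf).
    """
    counts = []
    previous = 0.0
    for b in split_points:
        # cdf(b) * total_count estimates how many values are <= b.
        cumulative = cdf(b) * total_count
        counts.append(cumulative - previous)
        previous = cumulative
    counts.append(total_count - previous)
    return counts


def uniform_cdf(x: float) -> float:
    """Stand-in CDF for values uniformly distributed on [0, 100]."""
    return min(max(x / 100.0, 0.0), 1.0)


estimates = bucket_counts(uniform_cdf, 1000, [25.0, 50.0, 75.0])
print(estimates)
```

Because the split points are chosen at query time, the same recorded sketch can serve different bucket layouts.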

Re: Welcome Sruthi Sree Kumar - Season of Docs tech writer

2020-08-18 Thread Sruthi Sree Kumar
Thank you. Looking forward to working with the Beam community. :) On Mon, Aug 17, 2020 at 11:57 PM Pablo Estrada wrote: > Welcome Sruthi! : ) > > On Mon, Aug 17, 2020 at 2:41 PM Gris Cuevas wrote: > >> Welcome Sruthi! >> >> On 2020/08/17 20:56:40, Aizhamal Nurmamat kyzy >> wrote: >> > Hi all,

Re: [BEAM-10292] change proposal to DefaultFilenamePolicy.ParamsCoder

2020-08-18 Thread David Janíček
I looked into fixing the underlying filesystem, and it turns out that only the local filesystem couldn't handle decoding correctly; HDFS and some other filesystems, e.g. S3, already have a check for that. So I added a similar check to the local filesystem too. The implementation is in the