Spark Structured Streaming runner migrated to Spark 3

2021-08-05 Thread Etienne Chauchot
Hi all, just to let you know that the Spark Structured Streaming runner has been migrated to Spark 3. Enjoy! Etienne

Re: Spark Structured Streaming Runner Roadmap

2021-08-03 Thread Etienne Chauchot
Hi, sorry for the late answer: streaming mode in the Spark Structured Streaming runner is blocked by the way the Spark Structured Streaming framework implements watermarks on the Apache Spark project side. See https://echauchot.blogspot.com/2020/11/watermark-architecture-proposal-for.html

Re: A new reworked Elasticsearch 7+ IO module

2020-03-31 Thread Etienne Chauchot
Etienne Chauchot. On 06/03/2020 11:26, Etienne Chauchot wrote: Hi all, it's been 3 weeks since the survey on ES versions the users use. The survey received very few responses: only 9 responses for now (multiple versions possible of course). The responses are the following: ES2: 0 clients

Re: A new reworked Elasticsearch 7+ IO module

2020-03-06 Thread Etienne Chauchot
support but for now it is still not very representative. I'm cross-posting to @users to let you know that I'm closing the survey within 1 or 2 weeks. So please respond if you're using ESIO. Best Etienne On 13/02/2020 12:37, Etienne Chauchot wrote: Hi Cham, thanks for your comments ! I just

Elasticsearch use in Apache Beam

2020-02-13 Thread Etienne Chauchot
Hi everyone, the Apache Beam community is currently working on refactoring the current ElasticsearchIO (see the thread [1] on the dev mailing list). To determine which Elasticsearch versions to support, we are running a survey among users of Apache Beam. Can you please tell us more about your use of

Re: Feedback on how we use Apache Beam in my company

2019-10-09 Thread Etienne Chauchot
Very nice! Thanks. CCing the dev list. Etienne On 09/10/2019 16:55, Pierre Vanacker wrote: Hi Apache Beam community, We’ve been working with Apache Beam in production for a few years now in my company (Dailymotion). If you’re interested to know how we use Apache Beam in combination with

Re: 2019 Beam Events

2018-12-13 Thread Etienne Chauchot
Great work! Thanks for sharing, Gris! Etienne On Wednesday 05 December 2018 at 07:47 +, Matthias Baetens wrote: > Great stuff, Gris! Looking forward to what 2019 will bring! > The Beam meetup in London will have a new get together early next year as > well :-) >

Re: Beam Metrics using FlinkRunner

2018-12-11 Thread Etienne Chauchot
nk cluster that would not have network access to your local machine. Etienne On Tuesday 11 December 2018 at 15:07 +0100, Etienne Chauchot wrote: > Hi Phil, > Your setup looks good to me and you are not using detached mode. > MetricsPusher in streaming mode on Flink works so we > need to fi

Re: Beam Metrics using FlinkRunner

2018-12-11 Thread Etienne Chauchot
options for MetricsHttpSink: > > options.setMetricsHttpSinkUrl("http://localhost:3000"); > options.setMetricsSink(MetricsHttpSink.class); > This works when I test SparkRunner, so I believe I have it set up correctly > for MetricsPusher to capture the metrics > from Flink as well. > -Phil >

Re: Beam Metrics using FlinkRunner

2018-12-07 Thread Etienne Chauchot
Hi Phil, MetricsPusher is tested on all the runners in both batch and streaming mode. I just ran this test on Flink in streaming mode and it works. What command line are you using, and which version of Beam? Please also remember that, as discussed, metrics (other Flink features) do not

Re: Beam Metrics questions

2018-12-03 Thread Etienne Chauchot
Hi Phil, thanks for the update. I was checking the code and could not understand how the filtering could fail. Etienne On Friday 30 November 2018 at 10:53 -0600, Phil Franklin wrote: > Etienne, I’ve just discovered that the code I used for my tests overrides the > command-line arguments,

Re: Beam Metrics questions

2018-11-30 Thread Etienne Chauchot
Hi Phil, thanks for using MetricsPusher and Beam in general! - MetricsHttpSink works this way: it filters out committed metrics from the JSON output when committed metrics are not supported. I checked: the Flink runner still does not support committed metrics. So there should be no committed

Re: [Call for items] November Beam Newsletter

2018-11-13 Thread Etienne Chauchot
Hi, I just added some things that were done. Etienne On Monday 12 November 2018 at 12:22 +, Matthias Baetens wrote: > Looks great, thanks for the effort and for including the Summit blogpost, > Rose! > On Thu, 8 Nov 2018 at 22:55 Rose Nguyen wrote: > > Hi Beamers: > > > > > > Time to sync

Re: Apache Beam Newsletter - August 2018

2018-08-22 Thread Etienne Chauchot
Hi Rose, I know the newsletter has already been sent, but may I add some of my ongoing subjects: What's been done: - CI improvement: for each new commit on master, the Nexmark suite is run in both batch and streaming mode on Spark, Flink and Dataflow (thanks to Andrew), and dashboard graphs are produced

Re: Apache Beam Summit in Europe

2018-07-05 Thread Etienne Chauchot
Hi, just a comment: I'm not sure 28-29/09 is very practical because some of the Beam community will be at ApacheCon in Montreal, which ends on Sept 27th. Etienne On Wednesday 04 July 2018 at 17:13 +0100, Matthias Baetens wrote: > Hi everyone! > Thanks for filling out the survey. We are currently

Re: Metrics: Non-cumulative values for Distribution

2018-06-19 Thread Etienne Chauchot
Hi Scott and Jozef, Sorry for the late answer, I missed the email. Well, MetricsPusher will aggregate the metrics just as PipelineResult.metrics() does but it will do so at given configurable intervals and export the values. It means that if you configure the export to be every 5s, you will get

Re: Apache Beam June Newsletter

2018-06-14 Thread Etienne Chauchot
Thanks Gris, this is very cool! Also, we did not include the scheduled talks for ApacheCon (end of September) because they are still a long way off; maybe they'll be announced in the next newsletter? Etienne On Wednesday 13 June 2018 at 16:41 -0700, Pablo Estrada wrote: > Thanks Gris!

Re: Bundling in ParDos

2018-05-23 Thread Etienne Chauchot
Hi Abdul, going back to your use case: if the goal is to batch the elements of an unbounded source, then you can use the GroupIntoBatches transform, which groups elements into batches (Iterables) of the size you specify. You can then process each batch downstream in your pipeline. PS: to
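To make the batching idea above concrete, here is a minimal plain-Java sketch of per-key batching. It is an illustration only and does not use the Beam API: the class `BatchSketch` and the method `groupIntoBatches` are hypothetical names chosen for this example. In Beam itself, GroupIntoBatches is applied to a keyed collection and emits each key with an Iterable of values; the sketch only mimics that shape on an in-memory list and ignores windowing and state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone illustration; NOT the Beam GroupIntoBatches implementation.
class BatchSketch {

    /** Groups values by key, then cuts each key's values into batches of at most batchSize. */
    static <K, V> Map<K, List<List<V>>> groupIntoBatches(List<Map.Entry<K, V>> input, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // LinkedHashMap keeps keys in first-seen order so the output is deterministic.
        Map<K, List<List<V>>> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> element : input) {
            List<List<V>> batches = result.computeIfAbsent(element.getKey(), k -> new ArrayList<>());
            // Start a new batch when the key has none yet or its last batch is full.
            if (batches.isEmpty() || batches.get(batches.size() - 1).size() == batchSize) {
                batches.add(new ArrayList<>());
            }
            batches.get(batches.size() - 1).add(element.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
                Map.entry("a", 1), Map.entry("b", 10), Map.entry("a", 2),
                Map.entry("a", 3), Map.entry("b", 20));
        // prints {a=[[1, 2], [3]], b=[[10, 20]]}
        System.out.println(groupIntoBatches(input, 2));
    }
}
```

Note that the last batch of a key may be smaller than the requested size, which is also something downstream processing of batches generally has to tolerate.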

Re: Monitoring and Management Tools for Beam and Friends

2018-03-26 Thread Etienne Chauchot
Hi Benjamin, please note that there is an ongoing PR for a runner-agnostic metrics feature: https://github.com/apache/beam/pull/4548 On Thursday 02 March 2017 at 16:26 +, Stas Levin wrote: > Hi Benjamin, > > This is somewhat of a hot topic lately, visibility FTW :) > > My experience comes from doing

Re: [ANNOUNCE] Apache Beam 2.4.0 released

2018-03-22 Thread Etienne Chauchot
Great! On Thursday 22 March 2018 at 08:24 +, Robert Bradshaw wrote: > We are pleased to announce the release of Apache Beam 2.4.0. Thanks goes to > the many people who made this possible. > > Apache Beam is an open source unified programming model to define and > execute data processing

Re: Does ElasticsearchIO in the latest RC support adding document IDs?

2017-11-16 Thread Etienne Chauchot
e know, I’m pretty new to this. I'll create the ticket and we will loop on design in the comments. Best Etienne Chet On Nov 15, 2017, at 12:53 AM, Etienne Chauchot <echauc...@apache.org> wrote: Hi Chet, What you say is totally true, docs written usi

Re: Does ElasticsearchIO in the latest RC support adding document IDs?

2017-11-15 Thread Etienne Chauchot
1/15/2017 09:53 AM, Etienne Chauchot wrote: Hi Chet, What you say is totally true: docs written using ElasticsearchIO will always have an ES-generated id. But that might change in the future; it might indeed be a good thing to allow the user to pass an id. After just 5 seconds of thinking, I see 3 poss

Re: ElasticSearch with RestHighLevelClient

2017-10-25 Thread Etienne Chauchot
. Regards, Etienne Chauchot On 23/10/2017 at 23:21, Ryan Bobko wrote: Thanks Tim, I believe I'm doing what Jean-Baptiste recommends, so I guess I'll have a look at the snapshot and see what's different. I don't mind waiting a bit if it means I don't have to duplicate working code. ry On Mon, Oct