Hello,
It's better to ask your question on the Spark operator GitHub than on this
mailing list. For the answer, try:
type: Always
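For context, a minimal sketch of where that setting lives, assuming the spark-on-k8s-operator's SparkApplication CRD (the app name and image are placeholders):

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: my-spark-app                  # placeholder name
spec:
  type: Scala
  mode: cluster
  image: "my-registry/spark:3.1.1"    # placeholder image
  restartPolicy:
    type: Always                      # restart the application whenever it terminates
```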
Best regards,
Ali Gouta.
On Sun, May 2, 2021 at 6:15 PM Sachit Murarka
wrote:
> Hi All,
>
> I am using Spark with Kubernetes, Can anyone please tell me
Thanks Mich !
Ali Gouta.
On Sun, Apr 4, 2021 at 6:44 PM Mich Talebzadeh
wrote:
> Hi Ali,
>
> The old saying of one experiment is worth a hundred hypotheses, still
> stands.
>
> As per the test-driven approach, have a go at it and see what comes out. Forum
> members including my
Great, so SSS also provides an API that allows handling RDDs through
DataFrames using foreachBatch. Still, I am not sure this is
good practice in general, right? Well, it depends on the use case
anyway.
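For what it's worth, a minimal sketch of that pattern, assuming a streaming DataFrame `streamingDF` already built from some source (all names are placeholders):

```scala
// Sketch: foreachBatch hands each micro-batch to you as a plain DataFrame,
// so you can drop down to the RDD API only where you really need it.
import org.apache.spark.sql.DataFrame

streamingDF.writeStream
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    val rdd = batchDF.rdd   // back to RDD land; use sparingly
    // ... legacy RDD logic here ...
  }
  .start()
```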
Thank you so much for the hints !
Best regards,
Ali Gouta.
On Sun, Apr 4, 2021 at
Thank you guys for your answers. I will dig more into this new way of doing
things, and why not consider leaving the old DStreams and using
structured streaming instead. I hope that structured streaming + Spark on
Kubernetes works well and that the combination is production ready.
Best regards,
Ali Gouta.
On
ark structured streaming to the concerned consumer group?
Best regards,
Ali Gouta.
I think this is the simplest way to achieve what you want to do.
Best regards,
Ali Gouta.
On Tue, Mar 9, 2021 at 11:30 AM forece85 wrote:
> We are doing batch processing using Spark Streaming with Kinesis with a
> batch
> size of 5 minutes. We want to send all events with the same eventI
The Spark UI is misleading in Spark 2.4.4. I moved to Spark 2.4.5 and that
fixed it. Now, your problem should be somewhere else, probably related to
memory consumption, but not the one you see in the UI.
Best regards,
Ali Gouta.
On Sun, May 17, 2020 at 7:36 PM András Kolbert
wrote:
> Hi,
>
. Then
use pod anti-affinity to make sure they are not running on the same node.
You may achieve this by running an NFS filesystem and then creating a PV/PVC
that mounts that shared file system. The persistentVolumeClaim defined
in your YAML should reference the PVC you created.
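A minimal sketch of that wiring, assuming an NFS export is already available (server address, export path, and sizes are placeholders):

```yaml
# An NFS-backed PersistentVolume and a claim that binds to it.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteMany"]    # NFS allows many pods to mount read-write
  nfs:
    server: 10.0.0.5                # placeholder NFS server address
    path: /exports/shared           # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-nfs-pvc              # reference this name in your pod spec
spec:
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 10Gi
```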
Best regards,
Ali Gouta
about it.
Ali Gouta.
On Feb 3, 2017, at 22:24, "Jacek Laskowski" wrote:
Hi,
An interesting case. You don't use Spark resources whatsoever.
Creating a SparkConf does not use YARN...yet. I think any run mode
would have the same effect. So, although spark-submit could have
returned e
For Hive, you may use Sqoop to achieve this. In my opinion, you could also
run a Spark job to do it.
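A minimal sketch of the Spark-job variant, assuming Hive support is enabled on the session; database and table names are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: read a Hive table and write it back out as another Hive table.
val spark = SparkSession.builder()
  .appName("copy-hive-table")
  .enableHiveSupport()
  .getOrCreate()

spark.table("prod_db.my_table")      // source table (placeholder)
  .write
  .mode("overwrite")
  .saveAsTable("uat_db.my_table")    // destination table (placeholder)
```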
On Apr 9, 2016, at 00:25, "Ashok Kumar" wrote:
Hi,
Anyone has suggestions how to create and copy Hive and Spark tables from
Production to UAT.
One way would be to copy table data to external fil
Something like:

    stream.foreachRDD { rdd =>
      rdd.collect()           // force evaluation so the accumulator is updated
      println(accum.value)    // print the accumulator value
    }

should answer your question. You get things printed in each batch interval.
Ali Gouta
On Dec 25, 2015, at 04:22, "Roberto Coluccio" wrote:
> Hello,
>
> I have a batch and a streaming driver using sam
nerateRDD, "foo");
That's it...
At last, be careful while defining the settings in your "conf". For instance,
you may end up changing localhost to the real IP address of your
Elasticsearch node...
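A hedged sketch of what that looks like with the elasticsearch-hadoop connector settings (the IP address and port are placeholders):

```scala
import org.apache.spark.SparkConf

// Sketch: point es-hadoop at the real node address instead of localhost.
val conf = new SparkConf()
  .setAppName("es-writer")
  .set("es.nodes", "192.168.1.20")   // real IP of the Elasticsearch node
  .set("es.port", "9200")
```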
Ali Gouta.
On Mon, Dec 14, 2015 at 1:52 PM, Spark Enthusiast
wrote:
>
Indeed, you are right! I felt like I was missing or misunderstanding
something.
Thank you so much!
Ali Gouta.
On Thu, Dec 10, 2015 at 10:04 PM, Cody Koeninger wrote:
> I'm a little confused as to why you have fake events rather than just
> doing foreachRDD or foreachPartition on
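For reference, a minimal sketch of the foreachRDD/foreachPartition pattern being referred to (the per-record handling is a placeholder):

```scala
// Sketch: do side effects once per partition instead of injecting fake events.
stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // open any connection once per partition, then drain the iterator
    records.foreach { record =>
      // ... handle each record here ...
    }
    // close the connection here
  }
}
```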