Re: Spark Streaming - Routing rdd to Executor based on Key

2021-03-09 Thread forece85
Not sure if Kinesis has such flexibility. What other possibilities are there at the transformations level? -- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

Re: Spark Streaming - Routing rdd to Executor based on Key

2021-03-09 Thread forece85
Any example for this, please? - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: Spark Streaming - Routing rdd to Executor based on Key

2021-03-09 Thread Sean Owen
You can also group by the key in the transformation on each batch. But yes, that's faster/easier if it's already partitioned that way. On Tue, Mar 9, 2021 at 7:30 AM Ali Gouta wrote: > Do not know Kinesis, but it looks like it works like Kafka. Your producer > should implement a partitioner that
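Sean's suggestion, grouping by key within each individual batch, can be illustrated outside of Spark. The following is a minimal Python simulation of per-batch grouping by eventId (the names `group_batch_by_key` and `event_id` are illustrative, not Spark API):

```python
from collections import defaultdict

def group_batch_by_key(batch):
    """Group one micro-batch of (event_id, payload) pairs by key,
    mimicking what a per-batch groupByKey would produce."""
    groups = defaultdict(list)
    for event_id, payload in batch:
        groups[event_id].append(payload)
    return dict(groups)

batch = [("a", 1), ("b", 2), ("a", 3)]
# All events sharing an eventId end up in one group for this batch only;
# no state is carried across batches.
print(group_batch_by_key(batch))  # {'a': [1, 3], 'b': [2]}
```

Because the grouping is recomputed per batch, this matches the original requirement that "no previous batch or future batch data is concerned."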

Re: Spark Streaming - Routing rdd to Executor based on Key

2021-03-09 Thread Ali Gouta
Do not know Kinesis, but it looks like it works like Kafka. Your producer should implement a partitioner that makes it possible to send your data with the same key to the same partition. Thus, each task in your Spark Streaming app will load data from the same partition in the same executor. I
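The producer-side partitioner Ali describes comes down to deterministic hashing: the same key must always map to the same partition. A minimal sketch of that idea, not tied to the actual Kafka or Kinesis client APIs (the function name `partition_for` is hypothetical):

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a record key to a partition, as a
    producer-side partitioner would: equal keys always yield the
    same partition, so one consuming task sees all of a key's data."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# The same key maps to the same partition on every call, on every host.
assert partition_for("event-42", 8) == partition_for("event-42", 8)
```

MD5 is used here only because it is stable across processes and machines, which is what makes the routing reproducible on both the producer and consumer side.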

Spark Streaming - Routing rdd to Executor based on Key

2021-03-09 Thread forece85
We are doing batch processing using Spark Streaming with Kinesis, with a batch size of 5 minutes. We want to send all events with the same eventId to the same executor for a batch so that we can do grouping operations across multiple events based on eventId. No previous batch or future batch data is concerned.

Spark Streaming: routing by key without groupByKey

2016-01-15 Thread Lin Zhao
I have a requirement to route a paired DStream through a series of map and flatMap operations such that entries with the same key go to the same thread within the same batch. The closest I can come up with is groupByKey().flatMap(_._2), but this cuts throughput by 50%. When I think about it, groupByKey is more

Spark streaming routing

2016-01-07 Thread Lin Zhao
I have a need to route the DStream through the streaming pipeline by some key, such that data with the same key always goes through the same executor. There doesn't seem to be a way to do manual routing with Spark Streaming. The closest I can come up with is: stream.foreachRDD {rdd =>

Re: Spark streaming routing

2016-01-07 Thread Lin Zhao
cpu:memory ratio. From: Tathagata Das <t...@databricks.com> Date: Thursday, January 7, 2016 at 1:56 PM To: Lin Zhao <l...@exabeam.com> Cc: user <user@spark.apache.org> Subject: Re:

Re: Spark streaming routing

2016-01-07 Thread Tathagata Das
You cannot guarantee that each key will forever be on the same executor. That is a flawed approach to designing an application if you have to ensure fault tolerance toward executor failures. On Thu, Jan 7, 2016 at 9:34 AM, Lin Zhao wrote: > I have a need to route the