You can also group by the key in a transformation on each batch. But yes, that's faster/easier if the data is already partitioned that way.
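To illustrate the idea, here is a minimal plain-Python sketch of what grouping by key within one micro-batch amounts to (the `eventId` field and the sample records are hypothetical; in a Spark Streaming job this would be a `keyBy`/`groupByKey`-style transformation applied to each batch, not this literal code):

```python
from collections import defaultdict

def group_batch_by_event_id(batch):
    """Group one micro-batch of events by their eventId.

    `batch` is a list of dicts with a hypothetical 'eventId' field.
    This mirrors what keying the stream by eventId and grouping
    within a single batch would produce: all events sharing an
    eventId end up processed together.
    """
    groups = defaultdict(list)
    for event in batch:
        groups[event["eventId"]].append(event)
    return dict(groups)

# Hypothetical sample batch
batch = [
    {"eventId": "a", "value": 1},
    {"eventId": "b", "value": 2},
    {"eventId": "a", "value": 3},
]
grouped = group_batch_by_event_id(batch)
```

Since only the current batch matters here, no state needs to carry over between batches; each batch is grouped independently.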
On Tue, Mar 9, 2021 at 7:30 AM Ali Gouta <ali.go...@gmail.com> wrote:

> I do not know Kinesis, but it looks like it works like Kafka. Your producer
> should implement a partitioner that makes it possible to send your data
> with the same key to the same partition. Then each task in your Spark
> Streaming app will load data from the same partition in the same executor.
> I think this is the simplest way to achieve what you want to do.
>
> Best regards,
> Ali Gouta.
>
> On Tue, Mar 9, 2021 at 11:30 AM forece85 <forec...@gmail.com> wrote:
>
>> We are doing batch processing using Spark Streaming with Kinesis, with a
>> batch size of 5 mins. We want to send all events with the same eventId to
>> the same executor for a batch, so that we can do multiple grouping
>> operations based on eventId. No previous or future batch data is
>> involved; only the current batch's keyed operations are needed.
>>
>> Please help me with how to achieve this.
>>
>> Thanks.
>>
>> --
>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
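The producer-side partitioner suggested above boils down to a stable key-to-partition mapping: hash the record's key and take it modulo the partition count, so the same key always lands in the same partition. A plain-Python sketch of that idea (the key names and partition count are made up for illustration; a real Kinesis producer would instead pass the key as the record's partition key):

```python
import hashlib

def partition_for(key, num_partitions):
    """Route a record key to a partition index.

    Uses a stable hash (md5) rather than Python's built-in hash(),
    which is randomized per process. The same key always maps to
    the same partition, which is the property a keyed producer-side
    partitioner relies on.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Hypothetical example: the same eventId always routes to one partition
p1 = partition_for("event-42", 8)
p2 = partition_for("event-42", 8)
```

Because the mapping is deterministic, every event carrying `event-42` is written to the same partition, and the consumer task reading that partition sees all of them together.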