subject:"Re\: MapWithState partitioning"

Re: MapWithState partitioning

2016-10-31 Thread Andrii Biletskyi

Thanks, As I understand for Kafka case the way to do it is to define my kafka.Partitioner that is used when data is produced to Kafka and just reuse this partitioner as spark.Partitioner in mapWithState spec. I think I'll stick with that. Thanks, Andrii 2016-10-31 16:55 GMT+02:00 Cody Koeninger

Re: MapWithState partitioning

2016-10-31 Thread Cody Koeninger

You may know that those streams share the same keys, but Spark doesn't unless you tell it. mapWithState takes a StateSpec, which should allow you to specify a partitioner. On Mon, Oct 31, 2016 at 9:40 AM, Andrii Biletskyi wrote: > Thanks for response, > > So as I understand there is no way to "t

Re: MapWithState partitioning

2016-10-31 Thread Cody Koeninger

If you call a transformation on an rdd using the same partitioner as that rdd, no shuffle will occur. KafkaRDD doesn't have a partitioner, there's no consistent partitioning scheme that works for all kafka uses. You can wrap each kafkardd with an rdd that has a custom partitioner that you write to

Re: MapWithState partitioning

Re: MapWithState partitioning

Re: MapWithState partitioning

3 matches

Site Navigation

Mail list logo

Footer information