[ https://issues.apache.org/jira/browse/SPARK-19873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kunal Khamar updated SPARK-19873: --------------------------------- Description: If the user changes the shuffle partition number between batches, Streaming aggregation will fail. Here are some possible cases: - Change "spark.sql.shuffle.partitions" - Use "repartition" and change the partition number in codes - RangePartitioner doesn't generate deterministic partitions. Right now it's safe as we disallow sort before aggregation. Not sure if we will add some operators using RangePartitioner in future. Fix: Record # shuffle partition in offset log and enforce in next batch was: It the user changes the shuffle partition number between batches, Streaming aggregation will fail. Here are some possible cases: - Change "spark.sql.shuffle.partitions" - Use "repartition" and change the partition number in codes - RangePartitioner doesn't generate deterministic partitions. Right now it's safe as we disallow sort before aggregation. Not sure if we will add some operators using RangePartitioner in future. Fix: Record # shuffle partition in offset log and enforce in next batch > If the user changes the shuffle partition number between batches, Streaming > aggregation will fail. > -------------------------------------------------------------------------------------------------- > > Key: SPARK-19873 > URL: https://issues.apache.org/jira/browse/SPARK-19873 > Project: Spark > Issue Type: Bug > Components: Structured Streaming > Affects Versions: 2.1.0 > Reporter: Kunal Khamar > > If the user changes the shuffle partition number between batches, Streaming > aggregation will fail. > Here are some possible cases: > - Change "spark.sql.shuffle.partitions" > - Use "repartition" and change the partition number in codes > - RangePartitioner doesn't generate deterministic partitions. Right now it's > safe as we disallow sort before aggregation. Not sure if we will add some > operators using RangePartitioner in future. > Fix: > Record # shuffle partition in offset log and enforce in next batch -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org