[GitHub] [spark] koertkuipers commented on pull request #27986: [SPARK-31220][SQL] repartition obeys initialPartitionNum when adaptiveExecutionEnabled

GitBox Sun, 21 Jun 2020 19:55:49 -0700


koertkuipers commented on pull request #27986:
URL: https://github.com/apache/spark/pull/27986#issuecomment-647240428



   @wangyum i have `spark.sql.adaptive.coalescePartitions.enabled=true` and the 
data size is small.
   how can i tell whether the step actually coalesces? by the number of tasks 
(i always see 2048)? by the number of output files (i always see 2048)?
   
   i have `spark.sql.shuffle.partitions=2048` and 
`spark.sql.adaptive.coalescePartitions.initialPartitionNum=2048`.
   
   when i do a groupBy instead of a repartition, then the number of tasks 
varies with data size (and is less than 2048), and the number of output files 
varies too (and is less than 2048). with repartition the number of tasks and 
output files is always fixed at 2048.
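
   a minimal sketch of how the comparison above could be reproduced, for 
reference. the object name, the local master, the generated `id` column and the 
row count are all assumptions made for illustration, as is the choice of 
`repartition(col)` without an explicit partition count; only the three 
`spark.sql.*` settings named above (plus `spark.sql.adaptive.enabled`, which 
coalescing presumably requires) come from this thread. it prints the partition 
count of each plan's resulting RDD, which should correspond to the task count of 
the final stage.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical reproduction; names and data are illustrative only.
object CoalesceCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("coalesce-check")
      .master("local[*]") // assumption: local run for a quick check
      .config("spark.sql.adaptive.enabled", "true")
      .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
      .config("spark.sql.adaptive.coalescePartitions.initialPartitionNum", "2048")
      .config("spark.sql.shuffle.partitions", "2048")
      .getOrCreate()

    // Small input, so coalescing would be expected to reduce the partition count.
    val df = spark.range(0, 100000).toDF("id")

    // Shuffle introduced by repartition on a column (no explicit partition count).
    val repartitioned = df.repartition(df.col("id"))

    // Shuffle introduced by an aggregation.
    val grouped = df.groupBy("id").count()

    // Accessing .rdd executes the plan, so the counts reflect the post-shuffle partitioning.
    println(s"repartition: ${repartitioned.rdd.getNumPartitions} partitions")
    println(s"groupBy:     ${grouped.rdd.getNumPartitions} partitions")

    spark.stop()
  }
}
```

   comparing the two printed numbers, or the final-stage task counts in the 
Spark UI, is one way to check whether either shuffle was coalesced.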


