Re: spark.sql.shuffle.partitions=auto

2024-04-30 Thread Mich Talebzadeh
spark.sql.shuffle.partitions=auto is not supported in vanilla Spark, because Apache Spark itself does not build or manage clusters. This configuration value is specific to Databricks and its managed Spark offering, where it allows Databricks to automatically determine an optimal number of shuffle partitions for your workload. HTH, Mich Talebzadeh
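
To illustrate, here is a minimal sketch on vanilla (open-source) Spark, where the value must be an integer; the app name, local master, and the value 64 are purely illustrative:

    from pyspark.sql import SparkSession

    # Vanilla Spark: spark.sql.shuffle.partitions must be an integer
    # (the default is 200); passing "auto" is rejected.
    spark = (
        SparkSession.builder
        .appName("shuffle-partitions-demo")   # illustrative name
        .master("local[*]")                   # illustrative local master
        .config("spark.sql.shuffle.partitions", "64")
        .getOrCreate()
    )

    print(spark.conf.get("spark.sql.shuffle.partitions"))  # prints: 64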

spark.sql.shuffle.partitions=auto

2024-04-30 Thread second_co...@yahoo.com.INVALID
May I know, is spark.sql.shuffle.partitions=auto only available on Databricks? What about on vanilla Spark? When I set this, it gives an error saying the value needs to be an int. Is there any open source library that automatically finds the best partition count and block size for a DataFrame?
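
For reference, the closest built-in analogue on vanilla Spark 3.x is Adaptive Query Execution (AQE), which can coalesce shuffle partitions at runtime based on actual data sizes. A minimal sketch, with an illustrative grouping job:

    from pyspark.sql import SparkSession

    # AQE (Spark 3.x) coalesces shuffle partitions at runtime based on
    # the real size of the shuffled data; this is the nearest
    # open-source equivalent to an "auto" setting.
    spark = (
        SparkSession.builder
        .appName("aqe-demo")    # illustrative name
        .master("local[*]")     # illustrative local master
        .config("spark.sql.adaptive.enabled", "true")
        .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
        .getOrCreate()
    )

    df = spark.range(1_000_000)
    agg = df.groupBy((df.id % 100).alias("bucket")).count()
    agg.explain()  # plan is wrapped in AdaptiveSparkPlan when AQE is on

Note that spark.sql.shuffle.partitions still matters here: it effectively serves as the starting partition count that AQE coalesces down from.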