Reynold Xin created SPARK-22160:
-----------------------------------

             Summary: Allow changing sample points per partition in range shuffle exchange
                 Key: SPARK-22160
                 URL: https://issues.apache.org/jira/browse/SPARK-22160
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Reynold Xin
            Assignee: Reynold Xin

Spark's RangePartitioner hard-codes the number of sampling points per partition at 20, which is sometimes too low. This ticket makes the value configurable via spark.sql.execution.rangeExchange.sampleSizePerPartition and raises the default in Spark SQL to 100.
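
To illustrate why the number of sample points per partition matters, here is a minimal, self-contained sketch of sampling-based range partitioning in plain Python. It is not Spark's RangePartitioner implementation: the helper names `sample_range_bounds` and `range_sizes`, the synthetic data, and the seeds are all assumptions made for illustration. The idea is only that range boundaries are estimated from a fixed-size sample of each input partition, so a larger sample should generally give more balanced output ranges.

```python
import bisect
import random


def sample_range_bounds(partitions, num_ranges, sample_points_per_partition, seed=0):
    """Estimate num_ranges - 1 split points from a fixed-size sample of each partition.

    Hypothetical helper for illustration; not Spark's actual algorithm.
    """
    rng = random.Random(seed)
    sample = []
    for part in partitions:
        # Draw at most sample_points_per_partition values from this partition.
        k = min(sample_points_per_partition, len(part))
        sample.extend(rng.sample(part, k))
    if not sample:
        return []
    sample.sort()
    # Evenly spaced quantiles of the combined sample become the range boundaries.
    return [sample[(i * len(sample)) // num_ranges] for i in range(1, num_ranges)]


def range_sizes(partitions, bounds):
    """Count how many values fall into each range defined by the sorted bounds."""
    sizes = [0] * (len(bounds) + 1)
    for part in partitions:
        for x in part:
            sizes[bisect.bisect_left(bounds, x)] += 1
    return sizes


# Skewed synthetic data: 8 input partitions of exponentially distributed values.
data_rng = random.Random(42)
partitions = [[data_rng.expovariate(1.0) for _ in range(5000)] for _ in range(8)]

for points in (20, 100):
    bounds = sample_range_bounds(partitions, num_ranges=16,
                                 sample_points_per_partition=points)
    sizes = range_sizes(partitions, bounds)
    # Ratio of the largest range to the average range; closer to 1.0 is more balanced.
    skew = max(sizes) / (sum(sizes) / len(sizes))
    print(f"{points} sample points per partition: max/avg range size = {skew:.2f}")
```

In Spark itself the corresponding knob is the spark.sql.execution.rangeExchange.sampleSizePerPartition setting named in this ticket; the sketch only mimics the trade-off between sampling cost and the balance of the resulting ranges.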