The number of partitions (which determines the number of tasks) is fixed after
any shuffle and can be configured using 'spark.sql.shuffle.partitions'
through SQLConf (i.e. sqlContext.set(...) or
"SET spark.sql.shuffle.partitions=..." in SQL). It is possible we will
auto-select this based on statistics.
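
For example, a minimal sketch of both routes (sqlContext here is assumed to be an existing SQLContext; setConf is the public spelling of the setter in Spark 1.1+):

  // programmatic route: set the post-shuffle partition count to 50
  sqlContext.setConf("spark.sql.shuffle.partitions", "50")
  // or equivalently, through SQL:
  sqlContext.sql("SET spark.sql.shuffle.partitions=50")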
Hi,
I am trying to understand the query plan and the number of tasks / execution
time for a join query.
Consider the following example: two tables, emp and sal, are created with 100
records each and a matching key for joining them.
EmpRDDRelation.scala
case class EmpRecord(key: Int, value: String)
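
(The original snippet is truncated after "key: I", so everything past that point above, and all of the sketch below, is a reconstruction rather than the poster's actual code. A minimal sketch of how such an EmpRDDRelation.scala might look, assuming the Spark 1.3-era DataFrame API; the value field, the SalRecord class, and the query text are assumptions:)

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class SalRecord(key: Int, salary: Int)  // assumed shape of the sal table

object EmpRDDRelation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("EmpRDDRelation"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // 100 records in each table, sharing the join keys 1..100.
    sc.parallelize(1 to 100).map(i => EmpRecord(i, "name_" + i)).toDF()
      .registerTempTable("emp")
    sc.parallelize(1 to 100).map(i => SalRecord(i, i * 1000)).toDF()
      .registerTempTable("sal")

    // The equi-join shuffles both sides, so the number of post-shuffle
    // partitions (and hence tasks) comes from spark.sql.shuffle.partitions.
    val joined = sqlContext.sql(
      "SELECT e.key, e.value, s.salary FROM emp e JOIN sal s ON e.key = s.key")
    joined.explain(true)             // print the query plan in question
    joined.collect().foreach(println)
  }
}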