Spark SQL JDBC Source data skew

Sathish Kumaran Vairavelu Sat, 20 Jun 2015 06:48:41 -0700

Hi,

In the Spark SQL JDBC data source there is an option to specify upper/lower
bounds and the number of partitions. How does Spark handle data distribution
if we do not give the upper/lower bounds or the number of partitions? Will
all data from the external data source end up in one executor?


In many situations we do not know the upper/lower bounds of the underlying
dataset until the query is executed, so it is not possible to pass
upper/lower bound values in advance.
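
To make the situation concrete, here is a minimal sketch in Python of one
possible workaround: ask the database for the MIN/MAX of the partition column
first, then pass the result as the bounds. The helper names (bounds_query,
jdbc_partition_options), the table "orders", the column "id" and the URL are
hypothetical, used only for illustration. The option keys (partitionColumn,
lowerBound, upperBound, numPartitions) are the ones the JDBC data source
documents. The sketch only builds strings and does not call Spark itself.

```python
def bounds_query(table, column):
    """Build a subquery that returns the min and max of the partition column.

    The result is meant to be passed as the `dbtable` option of a first,
    small JDBC read whose single row supplies the bounds.
    """
    return (
        f"(SELECT MIN({column}) AS lo, MAX({column}) AS hi "
        f"FROM {table}) AS bounds"
    )


def jdbc_partition_options(url, table, column, lower, upper, num_partitions):
    """Assemble JDBC data source options for a partitioned read.

    Spark options are passed as strings, so the numeric values are converted.
    """
    # Sanity checks in this sketch only; not a statement about Spark's own checks.
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    }


# Hypothetical usage: the bounds 1 and 1000 would come from running
# bounds_query("orders", "id") against the database first.
opts = jdbc_partition_options(
    "jdbc:postgresql://dbhost/sales", "orders", "id", 1, 1000, 4
)
```

The resulting dictionary could then be handed to the JDBC reader's options.
This costs an extra round trip to the database before the real read, which
is the part I would like to avoid if Spark can handle it on its own.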


Thanks

Sathish
