Is the Hive SKEWED feature supported in Spark SQL?

The Watcher Thu, 19 Feb 2015 02:27:08 -0800

I have done some testing of inserting into tables defined in Hive using
Spark 1.2, and I can see that the PARTITION clause is honored: data files
are created correctly in multiple subdirectories.


I also tried the SKEWED BY ... ON ... STORED AS DIRECTORIES clause in the
CREATE TABLE statement, but I didn't see subdirectories being created in
that case.
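
For reference, the kind of DDL in question can be sketched as follows. The
table name, column names, and skewed values are hypothetical; only the
clause shape (SKEWED BY (col) ON (values) STORED AS DIRECTORIES) follows
Hive's documented syntax. The statement is built as a Python string on the
assumption that it would then be submitted through something like a
HiveContext's sql() method.

```python
def skewed_table_ddl(table, skew_col, skew_values):
    """Build a Hive CREATE TABLE statement that declares skewed values.

    The column list (key, value) is a hypothetical example schema.
    Values are quoted naively; no escaping is attempted in this sketch.
    """
    values = ", ".join("'{}'".format(v) for v in skew_values)
    return (
        "CREATE TABLE {t} (key STRING, value STRING) "
        "SKEWED BY ({c}) ON ({v}) "
        "STORED AS DIRECTORIES"
    ).format(t=table, c=skew_col, v=values)


# Hypothetical table with three heavily skewed key values.
ddl = skewed_table_ddl("skew_test", "key", ["1", "5", "6"])
print(ddl)
```

If the clause were honored on insert, one would expect a separate
subdirectory per listed value under the table location, which is the
behavior not observed here.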

1) Is SKEWED BY honored? If so, has anyone run into the directories not
being created?

2) If it is not honored, does it matter? Hive introduced this feature to
better handle joins on tables whose join keys have a skewed distribution,
so that the single mapper handling one of the heavily skewed keys doesn't
hold up the whole job. Could the same thing happen in Spark / Spark SQL?
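
The concern in question 2 can be illustrated with a toy model. This is not
Spark code: it only assumes that rows are routed to tasks by a hash of the
join key, as in a shuffle-based join, and the key distribution is invented
for illustration.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition when each row goes to key % num_partitions."""
    counts = Counter(k % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical data: key 7 is "hot" (9,000 rows); keys 0..999 appear once each.
keys = [7] * 9000 + list(range(1000))
sizes = partition_sizes(keys, 8)

for partition, size in enumerate(sizes):
    print(partition, size)
```

In this sketch every row for the hot key lands in the same partition, so
one task receives 9,125 of the 10,000 rows while each of the others
receives 125. Adding partitions does not spread a single hot key, which is
the situation the question is asking about.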

Thanks
