I'm a little confused about Hive and Spark. Can someone shed some light?
Using Spark, I can access the Hive metastore and run Hive queries. Since I am able to do this in standalone mode, it can't be using MapReduce to run the Hive queries, so I suppose it's building a query plan and executing it entirely in Spark.

So, is this the same as Hive on Spark (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)? If not, why not, and aren't they likely to merge at some point?

If Spark really builds its own query plan, joins, etc. without Hive's, is everything that requires special SQL syntax in Hive supported: window functions, cubes, rollups, skewed tables, etc.?

Thanks