>
> Yes, we recently improved ParquetRelation2 quite a bit. Spark SQL uses its
> own Parquet support to read partitioned Parquet tables declared in the Hive
> metastore. Writing to partitioned tables, however, is not covered yet. These
> improvements will be included in Spark 1.3.0.
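For reference, a minimal sketch of that read path (hypothetical table name
`logs`, partitioned by a string column `dt`; assumes Spark 1.3):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("ParquetReadSketch"))
    val hiveContext = new HiveContext(sc)

    // With spark.sql.hive.convertMetastoreParquet=true (the default), this
    // metastore Parquet table is scanned by Spark SQL's own Parquet support
    // rather than the Hive SerDe, and the `dt` predicate is used for
    // partition pruning.
    hiveContext.sql("SELECT * FROM logs WHERE dt = '2015-02-01'").show()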
>
> Just created SPARK-5
>
>
>1. In Spark 1.3.0, timestamp support was added. Also, Spark SQL now uses
>its own Parquet support to handle both the read path and the write path
>when dealing with Parquet tables declared in the Hive metastore, as long
>as you’re not writing to a partitioned table. So yes, you can.
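A minimal sketch of what that enables (hypothetical table names; assumes an
existing SparkContext `sc` and Spark 1.3):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)

    // A non-partitioned Parquet table declared in the Hive metastore;
    // TIMESTAMP columns work here as of 1.3.
    hiveContext.sql(
      "CREATE TABLE IF NOT EXISTS events (id BIGINT, ts TIMESTAMP) " +
      "STORED AS PARQUET")

    // Both the write path and the read path below go through Spark SQL's
    // own Parquet support, not the Hive SerDe.
    hiveContext.sql("INSERT INTO TABLE events SELECT id, ts FROM staging_events")
    hiveContext.sql("SELECT COUNT(*) FROM events").show()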
>
> Ah, I h
Still trying to get my head around Spark SQL & Hive.
1) Let's assume I *only* use Spark SQL to create and insert data into Hive
tables declared in a Hive metastore.
Does it matter at all whether Hive supports the data types I need with
Parquet, or is all that matters what Catalyst & Spark's Parquet support can
handle?
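For concreteness, the kind of thing I mean (made-up names), where everything
goes through a HiveContext and Hive itself never touches the data:

    // Created and populated purely from Spark SQL. Does Hive's own Parquet
    // type support matter here, or only what Catalyst & Spark's Parquet
    // support can handle?
    hiveContext.sql(
      "CREATE TABLE ts_test (id BIGINT, created TIMESTAMP) STORED AS PARQUET")
    hiveContext.sql("INSERT INTO TABLE ts_test SELECT id, created FROM src")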
I have done some testing of inserting into tables defined in Hive using Spark
1.2, and I can see that the PARTITION clause is honored: data files get
created correctly in multiple subdirectories.
I tried the SKEWED BY ... ON ... STORED AS DIRECTORIES clause on the CREATE
TABLE statement, but I didn't see subdirectories being created.
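Roughly what I tested (hypothetical names; `hiveContext` is an existing
HiveContext on Spark 1.2):

    // Partitioned table: the PARTITION clause on INSERT is honored.
    hiveContext.sql(
      "CREATE TABLE part_test (id BIGINT) PARTITIONED BY (dt STRING) " +
      "STORED AS PARQUET")
    hiveContext.sql(
      "INSERT OVERWRITE TABLE part_test PARTITION (dt = '2015-02-01') " +
      "SELECT id FROM src")
    // Data files show up under .../part_test/dt=2015-02-01/ as expected;
    // the SKEWED BY ... STORED AS DIRECTORIES variant produced no per-value
    // subdirectories for me.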
I'm a little confused about Hive & Spark; can someone shed some light?
Using Spark, I can access the Hive metastore and run Hive queries. Since I
am able to do this in stand-alone mode, it can't be using MapReduce to run
the Hive queries, so I suppose it's building a query plan and executing it
itself.
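One way to see this is to ask Spark SQL for the plan; the output shows the
logical and physical plans Catalyst builds, with Spark operators rather than
MapReduce stages (assumes an existing HiveContext `hiveContext` and some
metastore table `events`):

    // EXPLAIN EXTENDED prints the plans Catalyst produces; the query then
    // runs as ordinary Spark stages, no MapReduce involved.
    hiveContext.sql("EXPLAIN EXTENDED SELECT dt, COUNT(*) FROM events GROUP BY dt")
      .collect()
      .foreach(println)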