Yin Huai created SPARK-12682:
--------------------------------

             Summary: Hive will fail if the schema of a parquet table has a very wide schema
                 Key: SPARK-12682
                 URL: https://issues.apache.org/jira/browse/SPARK-12682
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Yin Huai


To reproduce, create a table with a very large number of columns, such that 
the data type strings of all columns combined exceed 4000 characters (the 
strings are generated by HiveMetastoreTypes.toMetastoreType). Then save the 
table as parquet. Because we try to store the metadata in a Hive-compatible 
way, the serde is set to the parquet serde. When you then load the table, a 
{{java.lang.IllegalArgumentException}} is thrown from Hive's 
{{TypeInfoUtils}}. I believe the cause is the same as SPARK-6024: Hive's 
parquet support does not handle wide schemas well, and the data type string is 
truncated. 
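
As a rough aid for sizing a reproduction table, the sketch below estimates how 
many columns of a single type are needed before the combined type strings pass 
the 4000-character threshold mentioned above. It is only an illustration: the 
helper names are hypothetical, it assumes every column has the same type, and 
it ignores any separators or column names that the stored string may also 
contain (the report does not say how the strings are combined).

```python
# Threshold quoted in the report; the exact counting rule is an assumption.
LIMIT = 4000


def combined_type_length(type_strings):
    """Total number of characters across all column type strings."""
    return sum(len(t) for t in type_strings)


def columns_needed(type_string, limit=LIMIT):
    """Smallest number of identical columns whose combined type
    strings are strictly longer than `limit` characters."""
    return limit // len(type_string) + 1


if __name__ == "__main__":
    for hive_type in ("int", "string"):
        n = columns_needed(hive_type)
        total = combined_type_length([hive_type] * n)
        print(f"{hive_type}: {n} columns -> {total} characters")
```

Because separators are not counted here, a real table may cross the threshold 
with fewer columns than this estimate suggests.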

Once you hit this problem, you will not be able to drop the table, because 
Hive fails to evaluate the drop table command. To at least provide a better 
workaround, we should see if we should have a native drop table call to the 
metastore and if we should add a flag to disable saving a data source table's 
metadata in a Hive-compatible way.


