[ https://issues.apache.org/jira/browse/SPARK-22329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22329:
------------------------------------

    Assignee:     (was: Apache Spark)

> Use NEVER_INFER for `spark.sql.hive.caseSensitiveInferenceMode` by default
> --------------------------------------------------------------------------
>
>                 Key: SPARK-22329
>                 URL: https://issues.apache.org/jira/browse/SPARK-22329
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Dongjoon Hyun
>            Priority: Critical
>
> In Spark 2.2.0, the default value of 
> `spark.sql.hive.caseSensitiveInferenceMode` has a critical issue.
> - SPARK-19611 changed the default to `INFER_AND_SAVE` in 2.2.0 because Spark 
> 2.1.0 broke some Hive tables backed by case-sensitive data files (see the 
> sketch after this list).
> bq. This situation will occur for any Hive table that wasn't created by Spark 
> or that was created prior to Spark 2.1.0. If a user attempts to run a query 
> over such a table containing a case-sensitive field name in the query 
> projection or in the query filter, the query will return 0 results in every 
> case.
> - However, SPARK-22306 reports that this also corrupts the Hive Metastore 
> schema by removing bucketing information (BUCKETING_COLS, SORT_COLS) and 
> changing the table owner.
> - Since Spark 2.3.0 supports bucketing, BUCKETING_COLS and SORT_COLS look 
> okay there at least. However, we still need to figure out the owner-change 
> issue, and we cannot backport the bucketing patch to `branch-2.2`. We need 
> more testing before releasing 2.3.0.
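> As a hypothetical sketch of that breakage (table, column, and path names are 
> illustrative, not from this issue), run in a Hive-enabled spark-shell: 
> Parquet preserves the physical column name `eventTime`, while the Hive 
> Metastore lowercases schemas to `eventtime`, so reading with the Metastore 
> schema and no inference matches nothing.
> {code:scala}
> // Data written with a mixed-case column; Parquet keeps the exact casing.
> spark.range(10).selectExpr("id", "id AS eventTime")
>   .write.parquet("/tmp/events")
>
> // A table created outside Spark (or before Spark 2.1.0): the Metastore
> // stores its schema in lower case.
> spark.sql(
>   """CREATE EXTERNAL TABLE events (id BIGINT, eventtime BIGINT)
>     |STORED AS PARQUET LOCATION '/tmp/events'""".stripMargin)
>
> // With NEVER_INFER, Spark resolves `eventtime` against the case-sensitive
> // Parquet field `eventTime`; the projection and filter match nothing, so
> // this returns 0 rows.
> spark.sql("SELECT * FROM events WHERE eventtime > 5").show()
> {code}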
> The Hive Metastore is a shared resource and Spark should not corrupt it by 
> default. This issue proposes restoring the default to `NEVER_INFER`, the 
> pre-2.2.0 behavior. Users who accept the risk can enable `INFER_AND_SAVE` 
> themselves, as in the sketch below.
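> As a minimal sketch of that opt-in (the session-builder pattern is standard; 
> the setting is the one discussed above), a user who accepts the risk can 
> re-enable inference per application. A third value, `INFER_ONLY`, infers 
> without writing back to the Metastore.
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder()
>   .appName("CaseSensitiveInference")
>   .enableHiveSupport()
>   // Explicit opt-in: with NEVER_INFER as the default, no schema inference
>   // (and no Metastore write-back) happens unless requested.
>   .config("spark.sql.hive.caseSensitiveInferenceMode", "INFER_AND_SAVE")
>   .getOrCreate()
> {code}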



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
