This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new c27caea  [SPARK-26932][DOC] Add a warning for Hive 2.1.1 ORC reader issue
c27caea is described below

commit c27caead43423d1f994f42502496d57ea8389dc0
Author: Bo Hai <haibo-s...@163.com>
AuthorDate: Tue Mar 5 11:57:04 2019 -0800

    [SPARK-26932][DOC] Add a warning for Hive 2.1.1 ORC reader issue
    
    Hive 2.1.1 cannot read ORC tables created by Spark 2.4.0 by default, so I added this information to sql-migration-guide-upgrade.md. For details, see [SPARK-26932](https://issues.apache.org/jira/browse/SPARK-26932).
    
    doc build
    
    Closes #23944 from haiboself/SPARK-26932.
    
    Authored-by: Bo Hai <haibo-s...@163.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 docs/sql-migration-guide-upgrade.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-migration-guide-upgrade.md b/docs/sql-migration-guide-upgrade.md
index c201056..af3d03e 100644
--- a/docs/sql-migration-guide-upgrade.md
+++ b/docs/sql-migration-guide-upgrade.md
@@ -169,7 +169,7 @@ displayTitle: Spark SQL Upgrading Guide
 
  - Since Spark 2.4, Spark will display table description column Last Access value as UNKNOWN when the value was Jan 01 1970.
 
-  - Since Spark 2.4, Spark maximizes the usage of a vectorized ORC reader for ORC files by default. To do that, `spark.sql.orc.impl` and `spark.sql.orc.filterPushdown` change their default values to `native` and `true` respectively.
+  - Since Spark 2.4, Spark maximizes the usage of a vectorized ORC reader for ORC files by default. To do that, `spark.sql.orc.impl` and `spark.sql.orc.filterPushdown` change their default values to `native` and `true` respectively. ORC files created by the native ORC writer cannot be read by some old Apache Hive releases. Use `spark.sql.orc.impl=hive` to create files shared with Hive 2.1.1 and older.
 
  - In PySpark, when Arrow optimization is enabled, previously `toPandas` just failed when Arrow optimization is unable to be used whereas `createDataFrame` from Pandas DataFrame allowed the fallback to non-optimization. Now, both `toPandas` and `createDataFrame` from Pandas DataFrame allow the fallback by default, which can be switched off by `spark.sql.execution.arrow.fallback.enabled`.
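For readers of this list, the workaround recommended in the `+` line above is applied when building a session. A minimal PySpark sketch, assuming a standard SparkSession; the app name and output path are hypothetical:

    from pyspark.sql import SparkSession

    # Use the Hive-based ORC implementation instead of the "native" one
    # that is the default since Spark 2.4, so that older Hive releases
    # (2.1.1 and earlier) can read the resulting files.
    spark = (SparkSession.builder
             .appName("orc-hive-compat")            # hypothetical app name
             .config("spark.sql.orc.impl", "hive")
             .getOrCreate())

    # Any DataFrame written this way stays readable by Hive <= 2.1.1.
    spark.range(10).write.orc("/tmp/orc_for_old_hive")  # hypothetical path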
 
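For the Arrow fallback behavior described in the unchanged context line above, a short sketch of how the switch behaves in Spark 2.4, assuming pyarrow is installed; the DataFrame is illustrative:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("arrow-fallback").getOrCreate()

    # Turn on Arrow-based conversion for toPandas()/createDataFrame().
    spark.conf.set("spark.sql.execution.arrow.enabled", "true")

    # With the fallback left at its Spark 2.4 default (true), an Arrow
    # failure silently falls back to the non-Arrow path. Setting it to
    # false restores the old fail-fast behavior.
    spark.conf.set("spark.sql.execution.arrow.fallback.enabled", "false")

    pdf = spark.range(5).toPandas()  # fails here if Arrow cannot be used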


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
