This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-2.4 by this push: new 694ebb4 [MINOR][DOCS] Fix Spark hive example. 694ebb4 is described below commit 694ebb493fa8c7d3a7ad1b6927af5d1b617999b7 Author: Prashant Sharma <prash...@apache.org> AuthorDate: Tue May 21 18:23:38 2019 +0900 [MINOR][DOCS] Fix Spark hive example. ## What changes were proposed in this pull request? The documentation has an error; see https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#hive-tables. The example: ```scala scala> val dataDir = "/tmp/parquet_data" dataDir: String = /tmp/parquet_data scala> spark.range(10).write.parquet(dataDir) scala> sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'") res6: org.apache.spark.sql.DataFrame = [] scala> sql("SELECT * FROM hive_ints").show() +----+ | key| +----+ |null| |null| |null| |null| |null| |null| |null| |null| |null| |null| +----+ ``` `range` does not emit a `key` column, but `id` instead. Closes #24657 from ScrapCodes/fix_hive_example. 
Lead-authored-by: Prashant Sharma <prash...@apache.org> Co-authored-by: Prashant Sharma <prash...@in.ibm.com> Signed-off-by: HyukjinKwon <gurwls...@apache.org> (cherry picked from commit 5f4b50513cd34cd3dcf7f72972bfcd1f51031723) Signed-off-by: HyukjinKwon <gurwls...@apache.org> --- .../org/apache/spark/examples/sql/hive/SparkHiveExample.scala | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala b/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala index 70fb5b2..a832276 100644 --- a/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala +++ b/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala @@ -122,16 +122,16 @@ object SparkHiveExample { val dataDir = "/tmp/parquet_data" spark.range(10).write.parquet(dataDir) // Create a Hive external Parquet table - sql(s"CREATE EXTERNAL TABLE hive_ints(key int) STORED AS PARQUET LOCATION '$dataDir'") + sql(s"CREATE EXTERNAL TABLE hive_bigints(id bigint) STORED AS PARQUET LOCATION '$dataDir'") // The Hive external table should already have data - sql("SELECT * FROM hive_ints").show() + sql("SELECT * FROM hive_bigints").show() // +---+ - // |key| + // | id| // +---+ // | 0| // | 1| // | 2| - // ... + // ... Order may vary, as spark processes the partitions in parallel. // Turn on flag for Hive Dynamic Partitioning spark.sqlContext.setConf("hive.exec.dynamic.partition", "true") --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org