[GitHub] spark pull request #19779: [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Supp...

vinodkc Sat, 18 Nov 2017 01:02:19 -0800

GitHub user vinodkc opened a pull request:

    https://github.com/apache/spark/pull/19779


    [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Support writing to Hive table which uses Avro schema url 'avro.schema.url'

    ## What changes were proposed in this pull request?
    Support writing to a Hive table whose Avro schema is specified through the 'avro.schema.url' table property.
    For example:
    create external table avro_in (a string) stored as avro location '/avro-in/' tblproperties ('avro.schema.url'='/avro-schema/avro.avsc');
    
    create external table avro_out (a string) stored as avro location '/avro-out/' tblproperties ('avro.schema.url'='/avro-schema/avro.avsc');
    
    insert overwrite table avro_out select * from avro_in;  -- fails with java.lang.NullPointerException
    
    WARN AvroSerDe: Encountered exception determining schema. Returning signal schema to indicate problem
    java.lang.NullPointerException
        at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:182)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:174)
    
    ## Changes proposed in this fix
    Currently a 'null' value is passed to the serializer, which causes an NPE during the insert operation. This fix passes the Hadoop configuration object instead.
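
A minimal sketch of the failure mode described above, not the actual Spark change. All classes here (`Conf`, `SchemaUrlSerializer`, `SerializerConfSketch`) are hypothetical stand-ins for the Hadoop configuration and the Hive serializer, and the `hdfs://namenode:8020` address is an assumed placeholder. The sketch only illustrates why an initializer that needs the configuration to locate the schema file fails on `null` and succeeds once a configuration object is supplied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only: none of these classes are Spark, Hive, or Hadoop APIs.
class SerializerConfSketch {

    // Plays the role of a Hadoop configuration object: a bag of string settings.
    static final class Conf {
        private final Map<String, String> values = new HashMap<>();

        Conf set(String key, String value) {
            values.put(key, value);
            return this;
        }

        String get(String key, String defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }
    }

    // Plays the role of a serializer that must read a schema file during
    // initialization and needs the configuration to locate the file system.
    static final class SchemaUrlSerializer {
        private String schemaLocation;

        void initialize(Conf conf, Map<String, String> tableProps) {
            String schemaUrl = tableProps.get("avro.schema.url");
            // Dereferencing conf throws NullPointerException when the caller passes null.
            String defaultFs = conf.get("fs.defaultFS", "file://");
            schemaLocation = defaultFs + schemaUrl;
        }

        String schemaLocation() {
            return schemaLocation;
        }
    }

    // Returns the resolved schema location, or "NPE" if initialization
    // failed because no configuration was supplied.
    static String tryInitialize(Conf conf) {
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("avro.schema.url", "/avro-schema/avro.avsc");
        SchemaUrlSerializer serializer = new SchemaUrlSerializer();
        try {
            serializer.initialize(conf, tableProps);
            return serializer.schemaLocation();
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        // Before the fix: no configuration is supplied to the serializer.
        System.out.println(tryInitialize(null));
        // After the fix: a configuration object is supplied.
        System.out.println(tryInitialize(new Conf().set("fs.defaultFS", "hdfs://namenode:8020")));
    }
}
```

In this sketch the first call reports the NPE and the second resolves the schema location against the configured default file system, which mirrors the shape of the stack trace above where the configuration is read while resolving the default file system URI.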
    ## How was this patch tested?
    Added a new test case in VersionsSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vinodkc/spark br_Fix_SPARK-17920

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19779
    
----
commit 034b2466d073c008b71eae072ee98353df56cbf2
Author: vinodkc <vinod.kc...@gmail.com>
Date:   2017-11-18T07:52:59Z

    pass hadoopConfiguration to Serializer

----


