GitHub user vinodkc opened a pull request: https://github.com/apache/spark/pull/19779
[SPARK-17920][SPARK-19580][SPARK-19878][SQL] Support writing to Hive table which uses Avro schema url 'avro.schema.url'

## What changes were proposed in this pull request?

Support writing to a Hive table whose schema is supplied through the Avro schema URL property 'avro.schema.url'.

For example:

    create external table avro_in (a string) stored as avro location '/avro-in/' tblproperties ('avro.schema.url'='/avro-schema/avro.avsc');

    create external table avro_out (a string) stored as avro location '/avro-out/' tblproperties ('avro.schema.url'='/avro-schema/avro.avsc');

    insert overwrite table avro_out select * from avro_in; // fails with java.lang.NullPointerException

    WARN AvroSerDe: Encountered exception determining schema. Returning signal schema to indicate problem
    java.lang.NullPointerException
        at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:182)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:174)

## Changes proposed in this fix

Currently a 'null' value is passed to the serializer, which causes an NPE during the insert operation. Instead, pass the Hadoop configuration object.

## How was this patch tested?

Added a new test case in VersionsSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vinodkc/spark br_Fix_SPARK-17920

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19779.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19779

----
commit 034b2466d073c008b71eae072ee98353df56cbf2
Author: vinodkc <vinod.kc...@gmail.com>
Date:   2017-11-18T07:52:59Z

    pass hadoopConfiguration to Serializer

----
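The failure mode described in this pull request can be illustrated with a minimal, self-contained sketch. It is not the Spark or Hive code: `SerializerConfSketch`, `Conf`, `resolveSchemaUrl`, `initialize`, and the `hdfs://namenode.example:8020/` address are all hypothetical stand-ins, chosen only to show why a serializer that resolves 'avro.schema.url' against the default filesystem falls back to a signal schema when it receives 'null' instead of a configuration object.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class SerializerConfSketch {

    /** Hypothetical stand-in for a Hadoop configuration object: a bag of string properties. */
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf set(String key, String value) {
            props.put(key, value);
            return this;
        }

        String get(String key) {
            return props.get(key);
        }
    }

    /** Placeholder for the "signal schema" mentioned in the AvroSerDe warning. */
    static final String SIGNAL_SCHEMA = "SIGNAL_SCHEMA";

    /**
     * Resolves the schema URL against the default filesystem named in conf.
     * Dereferences conf, so it throws NullPointerException when conf is null.
     */
    static URI resolveSchemaUrl(Conf conf, String schemaUrl) {
        URI defaultFs = URI.create(conf.get("fs.defaultFS"));
        return defaultFs.resolve(schemaUrl);
    }

    /**
     * Mimics serializer initialization: on failure it returns the signal schema
     * rather than propagating the exception, as the warning in the log suggests.
     */
    static String initialize(Conf conf, Map<String, String> tableProps) {
        try {
            return resolveSchemaUrl(conf, tableProps.get("avro.schema.url")).toString();
        } catch (NullPointerException e) {
            return SIGNAL_SCHEMA;
        }
    }

    public static void main(String[] args) {
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("avro.schema.url", "/avro-schema/avro.avsc");

        // Before the fix: null is passed, so schema resolution fails.
        System.out.println(initialize(null, tableProps));

        // After the fix: a real configuration object is passed through.
        Conf conf = new Conf().set("fs.defaultFS", "hdfs://namenode.example:8020/");
        System.out.println(initialize(conf, tableProps));
    }
}
```

In this sketch the first call yields the signal schema and the second yields a fully qualified schema location, which mirrors the change proposed here: pass the configuration object through to the serializer rather than 'null'.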