Yunjian Zhang created SPARK-20973:
-------------------------------------

             Summary: insert table fails because the data definition file cannot be fetched from remote HDFS
                 Key: SPARK-20973
                 URL: https://issues.apache.org/jira/browse/SPARK-20973
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Yunjian Zhang
I implemented my own Hive SerDe to handle special data files; it needs to read a data definition (DML) file during processing. The process is:
1. read the definition-file location from TBLPROPERTIES
2. read the content of the file found in step 1
3. initialize the SerDe based on the content from step 2

// DDL of the table as below:
---------------------------------------------
CREATE EXTERNAL TABLE dw_user_stg_txt_out
ROW FORMAT SERDE 'com.ebay.dss.gdr.hive.serde.abvro.AbvroSerDe'
STORED AS
  INPUTFORMAT 'com.ebay.dss.gdr.mapred.AbAsAvroInputFormat'
  OUTPUTFORMAT 'com.ebay.dss.gdr.hive.ql.io.ab.AvroAsAbOutputFormat'
LOCATION 'hdfs://${remote_hdfs}/user/data'
TBLPROPERTIES (
  'com.ebay.dss.dml.file' = 'hdfs://${remote_hdfs}/dml/user.dml'
)

// insert statement
insert overwrite table dw_user_stg_txt_out select * from dw_user_stg_txt_avro;

// fails with ERROR
17/06/02 15:46:34 ERROR SparkSQLDriver: Failed in [insert overwrite table dw_user_stg_txt_out select * from dw_user_stg_txt_avro]
java.lang.RuntimeException: FAILED to get dml file from: hdfs://${remote-hdfs}/dml/user.dml
	at com.ebay.dss.gdr.hive.serde.abvro.AbvroSerDe.initialize(AbvroSerDe.java:109)
	at org.apache.spark.sql.hive.SparkHiveWriterContainer.newSerializer(hiveWriterContainers.scala:160)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:258)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:170)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:347)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
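The three initialization steps described above can be sketched as follows. This is a minimal, self-contained illustration only: the class and method names (SerdeInitSketch, dmlLocation, fetchDefinition) are hypothetical, not the actual AbvroSerDe API, and the real step 2 would open the path through the Hadoop FileSystem API rather than the stub shown here.

```java
import java.util.Properties;

// Hypothetical sketch of the SerDe initialization flow from the report.
public class SerdeInitSketch {
    // Table property holding the DML file location (from the TBLPROPERTIES above).
    static final String DML_PROP = "com.ebay.dss.dml.file";

    // Step 1: read the definition-file location from the table properties.
    static String dmlLocation(Properties tblProps) {
        String loc = tblProps.getProperty(DML_PROP);
        if (loc == null) {
            throw new RuntimeException("missing table property: " + DML_PROP);
        }
        return loc;
    }

    // Step 2: fetch the file content. The real SerDe reads the HDFS path;
    // stubbed here so the sketch stays self-contained and runnable.
    static String fetchDefinition(String location) {
        return "record user { string name; }"; // placeholder definition content
    }

    // Step 3: initialize SerDe state from the fetched definition.
    static String initialize(Properties tblProps) {
        return fetchDefinition(dmlLocation(tblProps));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(DML_PROP, "hdfs://remote-hdfs/dml/user.dml");
        System.out.println(initialize(props));
    }
}
```

The stack trace shows the failure happens when this initialization runs inside SparkHiveWriterContainer.newSerializer, i.e. the step-2 fetch from the remote HDFS path fails at write time.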