Hi, I am trying to write data from Spark to a Hive partitioned table:
    DataFrame dataFrame = sqlContext.createDataFrame(rdd, schema);
    dataFrame.write().partitionBy("YEAR", "MONTH", "DAY").saveAsTable(tableName);

The data is not being written to the Hive table (HDFS location: /user/hive/warehouse/<table_name>/). Below are the logs from a Spark executor. As the logs show, the data is written to /tmp/spark-a3c7ed0f-76c6-4c3c-b80c-0734e33390a2/metastore/case_logs, but I could not find this directory in HDFS.

    16/01/23 02:15:03 INFO datasources.DynamicPartitionWriterContainer: Sorting complete. Writing out partition files one at a time.
    16/01/23 02:15:03 INFO compress.CodecPool: Got brand-new compressor [.gz]
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/parquet-pig-bundle-1.5.0-cdh5.5.1.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/parquet-hadoop-bundle-1.5.0-cdh5.5.1.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/parquet-format-2.1.0-cdh5.5.1.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-exec-1.1.0-cdh5.5.1.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-jdbc-1.1.0-cdh5.5.1-standalone.jar!/shaded/parquet/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [shaded.parquet.org.slf4j.helpers.NOPLoggerFactory]
    16/01/23 02:15:05 INFO compress.CodecPool: Got brand-new compressor [.gz]
    (the line above repeats many times between 02:15:05 and 02:15:06; repeats omitted)
    16/01/23 02:15:06 INFO output.FileOutputCommitter: Saved output of task 'attempt_201601230214_0023_m_000000_0' to file:/tmp/spark-a3c7ed0f-76c6-4c3c-b80c-0734e33390a2/metastore/case_logs
    16/01/23 02:15:06 INFO mapred.SparkHadoopMapRedUtil: attempt_201601230214_0023_m_000000_0: Committed
    16/01/23 02:15:06 INFO executor.Executor: Finished task 0.0 in stage 23.0 (TID 23). 2013 bytes result sent to driver

I am using CDH 5.5.1 and Spark 1.5.0. Does anybody have an idea what is happening here?

Thanks,
Akhilesh
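One thing I noticed while digging: the output lands under file:/tmp/spark-.../metastore/, which looks like the local warehouse of an embedded (Derby) metastore rather than the cluster's Hive metastore. My guess (unconfirmed) is that this happens when the job either uses a plain SQLContext, or uses a HiveContext that cannot find hive-site.xml on the classpath and so falls back to a local metastore. Below is a minimal sketch of the HiveContext variant I plan to try; the app name, table name, columns, and sample row are all placeholders, not my real data:

```java
// Sketch only. My guess (unconfirmed): saveAsTable writes to a local embedded
// metastore when a plain SQLContext is used, or when hive-site.xml is not on
// the classpath, which would explain the file:/tmp/spark-.../metastore path.
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.hive.HiveContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class PartitionedHiveWrite {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("partitioned-hive-write");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // HiveContext reads hive-site.xml from the classpath and talks to the
        // cluster metastore; without it Spark falls back to a local Derby
        // metastore under the working directory.
        HiveContext hiveContext = new HiveContext(sc.sc());

        // Placeholder schema containing the three partition columns.
        StructType schema = DataTypes.createStructType(Arrays.asList(
            DataTypes.createStructField("message", DataTypes.StringType, false),
            DataTypes.createStructField("YEAR", DataTypes.IntegerType, false),
            DataTypes.createStructField("MONTH", DataTypes.IntegerType, false),
            DataTypes.createStructField("DAY", DataTypes.IntegerType, false)));

        // Placeholder data standing in for my real rdd.
        List<Row> rows = Arrays.asList(RowFactory.create("hello", 2016, 1, 23));
        JavaRDD<Row> rdd = sc.parallelize(rows);

        DataFrame dataFrame = hiveContext.createDataFrame(rdd, schema);
        dataFrame.write()
                 .partitionBy("YEAR", "MONTH", "DAY")
                 .saveAsTable("case_logs"); // placeholder table name
    }
}
```

If this guess is right, the partitions should then appear under /user/hive/warehouse/case_logs/YEAR=.../MONTH=.../DAY=... instead of /tmp.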