Re: Spark SQL drops the HIVE table in "overwrite" mode while writing into table

2016-03-05 Thread Dhaval Modi
Added.

Regards,
Dhaval Modi
dhavalmod...@gmail.com

On 5 March 2016 at 20:47, Ted Yu  wrote:

> Please add the stack trace, code snippet, etc. to the JIRA you created so
> that people can reproduce what you saw.
>


Re: Spark SQL drops the HIVE table in "overwrite" mode while writing into table

2016-03-05 Thread Ted Yu
Please add the stack trace, code snippet, etc. to the JIRA you created so
that people can reproduce what you saw.



Fwd: Spark SQL drops the HIVE table in "overwrite" mode while writing into table

2016-03-05 Thread Dhaval Modi
Regards,
Dhaval Modi
dhavalmod...@gmail.com

-- Forwarded message --
From: Dhaval Modi 
Date: 5 March 2016 at 20:31
Subject: Spark SQL drops the HIVE table in "overwrite" mode while writing
into table
To: u...@spark.apache.org


Hi Team,

I am facing an issue while writing a DataFrame back to a Hive table.

When using the "SaveMode.Overwrite" option, the table is dropped and Spark
is then unable to recreate it, throwing the error below.

JIRA: https://issues.apache.org/jira/browse/SPARK-13699


E.g.
tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")

Error:
++
16/03/05 13:57:26 INFO spark.SparkContext: Created broadcast 138 from run at ThreadPoolExecutor.java:1145
16/03/05 13:57:26 INFO log.PerfLogger: <PERFLOG ... from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
java.lang.RuntimeException: serious problem
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)

Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/tgt_table
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:188)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
    ... 138 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/tgt_table
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
++
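
For anyone hitting the same error, one possible interim workaround (a sketch only, not verified against this particular bug) is to overwrite the data with insertInto() rather than saveAsTable(): insertInto() writes into the existing table definition, whereas saveAsTable() drops and recreates the table.

```scala
// Hedged workaround sketch; assumes Spark 1.x with Hive support and an
// existing Hive table "tgt_table" whose schema matches tgtFinal.
// Unlike saveAsTable(), insertInto() keeps the table's metadata intact
// and only replaces its rows when the mode is Overwrite.
import org.apache.spark.sql.SaveMode

tgtFinal.write
  .mode(SaveMode.Overwrite)   // replace existing rows
  .insertInto("tgt_table")    // do not drop/recreate the table
```

Note that insertInto() matches columns by position, not by name, so the DataFrame's column order must match the table's.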


Regards,
Dhaval Modi
dhavalmod...@gmail.com