Re: Spark SQL drops the HIVE table in "overwrite" mode while writing into table
Added.

Regards,
Dhaval Modi
dhavalmod...@gmail.com

On 5 March 2016 at 20:47, Ted Yu wrote:
> Please add the stack trace, code snippet, etc. to the JIRA you created so
> that people can reproduce what you saw.
>
> [quoted original message trimmed; the full text appears in the forwarded
> message below]
Re: Spark SQL drops the HIVE table in "overwrite" mode while writing into table
Please add the stack trace, code snippet, etc. to the JIRA you created so
that people can reproduce what you saw.

On Sat, Mar 5, 2016 at 7:02 AM, Dhaval Modi wrote:
> [quoted original message trimmed; the full text appears in the forwarded
> message below]
Fwd: Spark SQL drops the HIVE table in "overwrite" mode while writing into table
Regards,
Dhaval Modi
dhavalmod...@gmail.com

-- Forwarded message --
From: Dhaval Modi
Date: 5 March 2016 at 20:31
Subject: Spark SQL drops the HIVE table in "overwrite" mode while writing into table
To: u...@spark.apache.org

Hi Team,

I am facing an issue while writing a dataframe back to a Hive table.

When using the SaveMode.Overwrite option, the table gets dropped and Spark
is then unable to recreate it, throwing the error below.

JIRA: https://issues.apache.org/jira/browse/SPARK-13699

E.g.:
tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table")

Error:
++
16/03/05 13:57:26 INFO spark.SparkContext: Created broadcast 138 from run at ThreadPoolExecutor.java:1145
16/03/05 13:57:26 INFO log.PerfLogger: from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
java.lang.RuntimeException: serious problem
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/tgt_table
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
        ... 138 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/tgt_table
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
++
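[Editor's note on the trace: the FileNotFoundException is raised while Hive's OrcInputFormat computes splits for tgt_table during the write, which suggests that tgtFinal is itself derived from a read of tgt_table. That is an inference from the stack trace, not something the thread confirms. Under that assumption, the overwrite drops the table before the lazy read plan executes, so the source files vanish mid-job. A commonly suggested sketch of a workaround is to materialize the result to a separate staging table first; the name tgt_table_staging below is hypothetical, and sqlContext is the Spark 1.x entry point assumed by the snippet:]

```scala
import org.apache.spark.sql.SaveMode

// Assumption: tgtFinal was built from a read of tgt_table itself.
// Writing it straight back with Overwrite drops the table before the
// lazy read runs, producing the FileNotFoundException in the trace.

// Step 1: materialize the result somewhere other than the source table.
// "tgt_table_staging" is a hypothetical name, not from the thread.
tgtFinal.write.mode(SaveMode.Overwrite).saveAsTable("tgt_table_staging")

// Step 2: now the overwrite of tgt_table reads only from the staging
// table, so dropping tgt_table no longer invalidates the input files.
sqlContext.table("tgt_table_staging")
  .write.mode(SaveMode.Overwrite)
  .saveAsTable("tgt_table")
```

Persisting or checkpointing tgtFinal before the write is another way to break the read/overwrite cycle; either approach trades extra I/O for not reading and replacing the same HDFS location in one job.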