Re: [PR] [SPARK-47197] Failed to connect HiveMetastore when using iceberg with HiveCatalog on spark-sql or spark-shell [spark]

via GitHub Wed, 28 Feb 2024 07:22:38 -0800


eubnara commented on PR #45309:
URL: https://github.com/apache/spark/pull/45309#issuecomment-1969212746


   Even with this patch, `insert into` is broken. (describe extended, select * 
from queries are okay)
   Maybe https://issues.apache.org/jira/browse/SPARK-30885 is related?
   
   ```
   Driver stacktrace:
           at 
org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2454)
           at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2403)
           at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2402)
           at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
           at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
           at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
           at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2402)
           at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1160)
           at 
org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1160)
           at scala.Option.foreach(Option.scala:407)
           at 
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1160)
           at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2642)
           at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2584)
           at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2573)
           at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
           at 
org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:938)
           at org.apache.spark.SparkContext.runJob(SparkContext.scala:2214)
           at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:228)
           ... 61 more
   Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
           at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:274)
           at 
org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:132)
           at 
org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:105)
           at 
org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
           at 
org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
           at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:300)
           at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$17(FileFormatWriter.scala:239)
           at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
           at org.apache.spark.scheduler.Task.run(Task.scala:131)
           at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
           at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1492)
           at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
           at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:750)
   Caused by: java.lang.NullPointerException
           at 
org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper.<init>(TezUtil.java:105)
           at 
org.apache.iceberg.mr.hive.TezUtil.taskAttemptWrapper(TezUtil.java:78)
           at 
org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.writer(HiveIcebergOutputFormat.java:73)
           at 
org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.getHiveRecordWriter(HiveIcebergOutputFormat.java:58)
           at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:286)
           at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:271)
           ... 14 more
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Re: [PR] [SPARK-47197] Failed to connect HiveMetastore when using iceberg with HiveCatalog on spark-sql or spark-shell [spark]

Reply via email to