PaulLiang1 opened a new issue, #5104:
URL: https://github.com/apache/iceberg/issues/5104
- Spark Version: 3.2.0
- Iceberg Version: 0.13.1
When `spark.sql.datetime.java8API.enabled=true` is set, running a rewrite-manifests action on a date-partitioned table throws the following exception:
```
Job aborted due to stage failure: Task 0 in stage 36333.0 failed 5 times,
most recent failure: Lost task 0.4 in stage 36333.0 (TID 140410)
(ip-123.us-west-2.compute.internal executor 77): java.lang.ClassCastException:
java.time.LocalDate cannot be cast to java.sql.Date
	at org.apache.iceberg.spark.SparkValueConverter.convert(SparkValueConverter.java:77)
	at org.apache.iceberg.spark.SparkStructLike.get(SparkStructLike.java:48)
	at org.apache.iceberg.PartitionSummary.updateFields(PartitionSummary.java:59)
	at org.apache.iceberg.PartitionSummary.update(PartitionSummary.java:51)
	at org.apache.iceberg.ManifestWriter.addEntry(ManifestWriter.java:87)
	at org.apache.iceberg.ManifestWriter.existing(ManifestWriter.java:135)
	at org.apache.iceberg.spark.actions.BaseRewriteManifestsSparkAction.writeManifest(BaseRewriteManifestsSparkAction.java:332)
	at org.apache.iceberg.spark.actions.BaseRewriteManifestsSparkAction.lambda$toManifests$afb7bc39$1(BaseRewriteManifestsSparkAction.java:354)
	at org.apache.spark.sql.Dataset.$anonfun$mapPartitions$1(Dataset.scala:2867)
	at org.apache.spark.sql.execution.MapPartitionsExec.$anonfun$doExecute$3(objects.scala:201)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:133)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
```
---
This is caused by the cast to `java.sql.Date` at
https://github.com/apache/iceberg/blob/0.13.x/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkValueConverter.java#L77
When the Java 8 time API is enabled, Spark represents date values as `java.time.LocalDate` instead of `java.sql.Date`, so the cast fails.
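To illustrate the shape of a fix: Iceberg stores `DATE` values internally as an `int` count of days since the Unix epoch, so the converter could accept either Spark representation instead of casting unconditionally. This is a minimal standalone sketch, not Iceberg's actual patch; the class and method names (`DateConversionSketch`, `convertDate`) are hypothetical.

```java
import java.sql.Date;
import java.time.LocalDate;

public class DateConversionSketch {
  // Hypothetical converter: tolerate both the legacy java.sql.Date and the
  // java.time.LocalDate produced when spark.sql.datetime.java8API.enabled=true.
  // Returns the Iceberg internal representation: days since 1970-01-01.
  static int convertDate(Object sparkDate) {
    if (sparkDate instanceof LocalDate) {
      // Java 8 time API path
      return (int) ((LocalDate) sparkDate).toEpochDay();
    } else if (sparkDate instanceof Date) {
      // Legacy path: java.sql.Date -> LocalDate -> epoch days
      return (int) ((Date) sparkDate).toLocalDate().toEpochDay();
    }
    throw new IllegalArgumentException(
        "Unsupported date type: " + sparkDate.getClass().getName());
  }

  public static void main(String[] args) {
    System.out.println(convertDate(LocalDate.of(1970, 1, 2)));   // 1
    System.out.println(convertDate(Date.valueOf("1970-01-11"))); // 10
  }
}
```

Both branches land on the same epoch-day value, so callers such as `PartitionSummary` see a consistent `int` regardless of the Spark session configuration.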