manuzhang opened a new issue, #8655:
URL: https://github.com/apache/iceberg/issues/8655
### Apache Iceberg version
1.2.1
### Query engine
Spark
### Please describe the bug 🐞
1. Import a Parquet table into an Iceberg table with the `add_files` procedure.
2. Reading the Iceberg table fails with the following exception:
```
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 2 times, most recent failure: Lost task 7.1 in stage 0.0 (TID 152) (hdc34-lvs05-01-0310-6208-042-tess0172.stratus.lvs.ebay.com executor 3): java.lang.IllegalStateException: Value at index is null
    at org.apache.iceberg.shaded.org.apache.arrow.vector.TimeStampVector.get(TimeStampVector.java:74)
    at org.apache.iceberg.arrow.vectorized.GenericArrowVectorAccessorFactory$TimestampMicroTzAccessor.getLong(GenericArrowVectorAccessorFactory.java:501)
    at org.apache.iceberg.spark.data.vectorized.IcebergArrowColumnVector.getLong(IcebergArrowColumnVector.java:101)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:756)
```
3. I narrowed the issue down to a timestamp column that contains null values.
4. Inserting the Parquet table's data into an Iceberg table (instead of importing it with `add_files`) does not have this issue.
5. I compared the metrics of the two tables. The only difference is that this column's `lower_bound` and `upper_bound` are null for the imported file.
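
The steps above can be sketched as a sequence of Spark SQL statements. This is only an illustrative reproduction attempt, not a confirmed minimal repro: the catalog, database, and table names (`spark_catalog`, `default`, `ts_parquet`, `ts_iceberg`), the two-column schema, and the sample rows are all assumptions that do not come from the report, and whether these exact rows trigger the failure has not been verified. The helper `repro_statements` is hypothetical; it only builds the statements so they can be inspected before being run.

```python
def repro_statements(catalog="spark_catalog", db="default",
                     source="ts_parquet", target="ts_iceberg"):
    """Build Spark SQL statements mirroring steps 1-5 of the report.

    All identifiers and sample values are hypothetical placeholders.
    """
    src = f"{db}.{source}"
    tgt = f"{catalog}.{db}.{target}"
    return [
        # A plain Parquet table whose timestamp column holds a null (step 3).
        f"CREATE TABLE {src} (id INT, ts TIMESTAMP) USING parquet",
        f"INSERT INTO {src} VALUES (1, NULL), (2, TIMESTAMP '2023-09-01 00:00:00')",
        # An empty Iceberg table with the same schema.
        f"CREATE TABLE {tgt} (id INT, ts TIMESTAMP) USING iceberg",
        # Step 1: import the existing Parquet files with add_files.
        f"CALL {catalog}.system.add_files("
        f"table => '{db}.{target}', source_table => '{db}.{source}')",
        # Step 2: read the imported data back.
        f"SELECT id, ts FROM {tgt}",
        # Step 5: inspect per-file column bounds via the files metadata table.
        f"SELECT file_path, lower_bounds, upper_bounds FROM {tgt}.files",
    ]
```

In a Spark session that already has the Iceberg runtime and SQL extensions configured, the statements could be run in order with something like `for stmt in repro_statements(): spark.sql(stmt).show()`. Running the last query against a table populated with `INSERT INTO` instead of `add_files` would give the comparison described in step 5.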