oku95 commented on issue #8655:
URL: https://github.com/apache/iceberg/issues/8655#issuecomment-2096066510
Hi @manuzhang
I'm getting a similar error in an AWS Glue 4.0 Spark environment:
```
24/05/06 00:49:40 ERROR Executor: Exception in task 1.0 in stage 11.0 (TID 20)
java.lang.IllegalStateException: Value at index is null
    at org.apache.iceberg.shaded.org.apache.arrow.vector.BigIntVector.get(BigIntVector.java:112) ~[iceberg-spark-runtime-3.3_2.12-1.0.0.jar:?]
    at org.apache.iceberg.arrow.vectorized.GenericArrowVectorAccessorFactory$LongAccessor.getLong(GenericArrowVectorAccessorFactory.java:257) ~[iceberg-spark-runtime-3.3_2.12-1.0.0.jar:?]
    at org.apache.iceberg.spark.data.vectorized.IcebergArrowColumnVector.getLong(IcebergArrowColumnVector.java:101) ~[iceberg-spark-runtime-3.3_2.12-1.0.0.jar:?]
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) ~[?:?]
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hasNext(Unknown Source) ~[?:?]
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:968) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:314) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$21(FileFormatWriter.scala:257) ~[spark-sql_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.scheduler.Task.run(Task.scala:138) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) ~[spark-core_2.12-3.3.0-amzn-1.jar:?]
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1516) ~[spark-core_2.12-3.3.0-amzn-1.jar:3.3.0-amzn-1]
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) ~[spark-core_2.12-3.3.0-amzn-1.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_402]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_402]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_402]
```
I tried disabling the null check with `arrow.enable_null_check_for_get=false`, but the job then failed at launch with:
```
--conf
spark.executor.extraJavaOptions=-Darrow.enable_null_check_for_get=false
--enable-continuous-cloudwatch-log true --scriptLocation
s3://stage-pipeline-glue-assets-493140057280/scripts/load_source_iceberg.py
--job-language python --JOB_NAME stage-pipeline-load-source-iceberg
--
openjdk version "1.8.0_402"
OpenJDK Runtime Environment Corretto-8.402.08.1 (build 1.8.0_402-b08)
OpenJDK 64-Bit Server VM Corretto-8.402.08.1 (build 25.402-b08, mixed mode)
1715002240878
LAUNCH ERROR | Invalid input to --confPlease refer logs for details.
Exception in thread "main" java.lang.IllegalArgumentException: Invalid input to --conf
    at com.amazonaws.services.glue.ArgsParserForSparkProperties.$anonfun$parse$2(ConfigParam.scala:458)
    at com.amazonaws.services.glue.ArgsParserForSparkProperties.$anonfun$parse$2$adapted(ConfigParam.scala:445)
    at scala.collection.immutable.Range.foreach(Range.scala:158)
    at com.amazonaws.services.glue.ArgsParserForSparkProperties.parse(ConfigParam.scala:445)
    at com.amazonaws.services.glue.PrepareLaunchProperties.<init>(PrepareLaunch.scala:222)
    at com.amazonaws.services.glue.PrepareLaunch.<init>(PrepareLaunch.scala:528)
    at com.amazonaws.services.glue.PrepareLaunch.<init>(PrepareLaunch.scala:525)
    at com.amazonaws.services.glue.PrepareLaunch$.main(PrepareLaunch.scala:54)
    at com.amazonaws.services.glue.PrepareLaunch.main(PrepareLaunch.scala)
```
Any ideas what might cause it?
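
For context, the first trace goes through Iceberg's vectorized Arrow read path (`IcebergArrowColumnVector`), so one thing that might be worth trying instead of the JVM flag is turning vectorized reads off. Below is a minimal sketch of that idea, not something verified against this Glue job. It assumes the Iceberg session property `spark.sql.iceberg.vectorization.enabled` and the table property `read.parquet.vectorization.enabled` are honored by the Iceberg version in use; the helper names and the table name in the usage note are hypothetical.

```python
# Sketch only: turn off Iceberg's vectorized (Arrow) Parquet reads so that the
# IcebergArrowColumnVector code path seen in the stack trace is not used.
# Assumption: both properties below are honored by the Iceberg runtime in use.

# Session-level property read by Iceberg's Spark integration.
VECTORIZATION_CONF = "spark.sql.iceberg.vectorization.enabled"

# Table-level Iceberg property controlling vectorized Parquet reads.
VECTORIZATION_TABLE_PROP = "read.parquet.vectorization.enabled"


def disable_vectorization_sql(table_name: str) -> str:
    """Build an ALTER TABLE statement that disables vectorized Parquet reads."""
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('{VECTORIZATION_TABLE_PROP}'='false')"
    )


def apply_workaround(spark, table_name: str) -> None:
    """Apply both settings on an existing SparkSession (hypothetical helper)."""
    # Affects reads in the current session only.
    spark.conf.set(VECTORIZATION_CONF, "false")
    # Persists the setting on the table itself.
    spark.sql(disable_vectorization_sql(table_name))
```

In the job script this would be called as `apply_workaround(spark, "glue_catalog.db.source_table")` before the read, with the table name replaced by the real one. Non-vectorized reads are generally slower, so this is a diagnostic step rather than a fix for the underlying null handling.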