mcagriaktas commented on issue #12846: URL: https://github.com/apache/iceberg/issues/12846#issuecomment-2832219920
Same error for me! Here are my Iceberg jars:

```bash
#!/bin/bash
FLINK_LIB_DIR=/opt/flink/lib
FLINK_VERSION=1.20.0
HADOOP_VERSION=3.4.1
ICEBERG_VERSION=1.8.1
HIVE_VERSION=3.1.3

# Core Iceberg dependencies
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-core/$ICEBERG_VERSION/iceberg-core-$ICEBERG_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-flink-runtime-1.20/$ICEBERG_VERSION/iceberg-flink-runtime-1.20-$ICEBERG_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-flink-1.20/$ICEBERG_VERSION/iceberg-flink-1.20-$ICEBERG_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-hive-metastore/$ICEBERG_VERSION/iceberg-hive-metastore-$ICEBERG_VERSION.jar -P $FLINK_LIB_DIR

# Core Hadoop dependencies
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/$HADOOP_VERSION/hadoop-common-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-hdfs/$HADOOP_VERSION/hadoop-hdfs-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-hdfs-client/$HADOOP_VERSION/hadoop-hdfs-client-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client/$HADOOP_VERSION/hadoop-client-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-auth/$HADOOP_VERSION/hadoop-auth-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-annotations/$HADOOP_VERSION/hadoop-annotations-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR

# Hadoop MapReduce dependencies
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/$HADOOP_VERSION/hadoop-mapreduce-client-core-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-common/$HADOOP_VERSION/hadoop-mapreduce-client-common-$HADOOP_VERSION.jar -P $FLINK_LIB_DIR

# Hive Metastore dependencies
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-metastore/$HIVE_VERSION/hive-metastore-$HIVE_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-exec/$HIVE_VERSION/hive-exec-$HIVE_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-common/$HIVE_VERSION/hive-common-$HIVE_VERSION.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-service-rpc/$HIVE_VERSION/hive-service-rpc-$HIVE_VERSION.jar -P $FLINK_LIB_DIR

# Additional common dependencies
wget -q https://repo1.maven.org/maven2/commons-io/commons-io/2.13.0/commons-io-2.13.0.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/commons-codec/commons-codec/1.16.0/commons-codec-1.16.0.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/3.14.0/commons-lang3-3.14.0.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/commons/commons-collections4/4.4/commons-collections4-4.4.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/com/google/guava/guava/33.4.8-jre/guava-33.4.8-jre.jar -P $FLINK_LIB_DIR

# Hadoop's missing dependencies
wget -q https://repo1.maven.org/maven2/org/apache/commons/commons-configuration2/2.10.0/commons-configuration2-2.10.0.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/commons-logging/commons-logging/1.2/commons-logging-1.2.jar -P $FLINK_LIB_DIR

# Download Woodstox XML parser dependencies
wget -q https://repo1.maven.org/maven2/com/fasterxml/woodstox/woodstox-core/6.5.1/woodstox-core-6.5.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/codehaus/woodstox/stax2-api/4.2.1/stax2-api-4.2.1.jar -P $FLINK_LIB_DIR

# Additional Hadoop XML dependencies
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/thirdparty/hadoop-shaded-guava/1.1.1/hadoop-shaded-guava-1.1.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/thirdparty/hadoop-shaded-protobuf/1.1.1/hadoop-shaded-protobuf-1.1.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-core/2.15.3/jackson-core-2.15.3.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.15.3/jackson-databind-2.15.3.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.15.3/jackson-annotations-2.15.3.jar -P $FLINK_LIB_DIR

# Download Caffeine cache library
wget -q https://repo1.maven.org/maven2/com/github/ben-manes/caffeine/caffeine/3.1.8/caffeine-3.1.8.jar -P $FLINK_LIB_DIR

# Also, let's add some other potentially missing dependencies for Iceberg catalog
wget -q https://repo1.maven.org/maven2/org/apache/parquet/parquet-hadoop/1.13.1/parquet-hadoop-1.13.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/parquet/parquet-column/1.13.1/parquet-column-1.13.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/parquet/parquet-common/1.13.1/parquet-common-1.13.1.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/parquet/parquet-format-structures/1.13.1/parquet-format-structures-1.13.1.jar -P $FLINK_LIB_DIR

# Add ORC support which Iceberg might need
wget -q https://repo1.maven.org/maven2/org/apache/orc/orc-core/1.9.2/orc-core-1.9.2.jar -P $FLINK_LIB_DIR
wget -q https://repo1.maven.org/maven2/org/apache/orc/orc-shims/1.9.2/orc-shims-1.9.2.jar -P $FLINK_LIB_DIR

# Additional Iceberg-related libraries
wget -q https://repo1.maven.org/maven2/org/apache/avro/avro/1.11.3/avro-1.11.3.jar -P $FLINK_LIB_DIR

# Download the libfb303 dependency
wget -q https://repo1.maven.org/maven2/org/apache/thrift/libfb303/0.9.3/libfb303-0.9.3.jar -P /opt/flink/lib

# Download the Thrift libraries
wget -q https://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.16.0/libthrift-0.16.0.jar -P /opt/flink/lib

# Additional Hive Metastore dependencies
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-exec/3.1.3/hive-exec-3.1.3.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-common/3.1.3/hive-common-3.1.3.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-serde/3.1.3/hive-serde-3.1.3.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/hive/hive-service-rpc/3.1.3/hive-service-rpc-3.1.3.jar -P /opt/flink/lib

# Download the Hadoop Protobuf dependency
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/thirdparty/hadoop-shaded-protobuf_3_7/1.1.1/hadoop-shaded-protobuf_3_7-1.1.1.jar -P /opt/flink/lib

# You might also need these
wget -q https://repo1.maven.org/maven2/com/google/protobuf/protobuf-java/3.23.4/protobuf-java-3.23.4.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/thirdparty/hadoop-shaded-guava/1.1.1/hadoop-shaded-guava-1.1.1.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-client-runtime/3.4.1/hadoop-client-runtime-3.4.1.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-parquet/1.8.1/iceberg-parquet-1.8.1.jar -P /opt/flink/lib
wget -q https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-arrow/1.8.1/iceberg-arrow-1.8.1.jar -P /opt/flink/lib

echo "All dependencies downloaded to $FLINK_LIB_DIR"
```

Error logs:

```text
2025-04-26 13:48:49,365 INFO org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager [] - Matching resource requirements against available resources.
Missing resources:
    Job 6c0538cd93f8081b838802c8ad10f916
        ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}
Current resources:
    TaskManager 172.80.0.71:43227-5a210b
        Available: ResourceProfile{cpuCores=8, taskHeapMemory=8.000gb (8589934592 bytes), taskOffHeapMemory=0 bytes, managedMemory=6.000gb (6442450944 bytes), networkMemory=1.681gb (1804482817 bytes)}
        Total: ResourceProfile{cpuCores=8, taskHeapMemory=8.000gb (8589934592 bytes), taskOffHeapMemory=0 bytes, managedMemory=6.000gb (6442450944 bytes), networkMemory=1.681gb (1804482817 bytes)}
2025-04-26 13:48:49,367 INFO org.apache.flink.runtime.resourcemanager.slotmanager.DefaultSlotStatusSyncer [] - Starting allocation of slot d9a5050ac4262332fb67a2a37a7e1b15 from 172.80.0.71:43227-5a210b for job 6c0538cd93f8081b838802c8ad10f916 with resource profile ResourceProfile{cpuCores=1, taskHeapMemory=1024.000mb (1073741824 bytes), taskOffHeapMemory=0 bytes, managedMemory=768.000mb (805306368 bytes), networkMemory=215.111mb (225560352 bytes)}.
2025-04-26 13:48:49,404 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: raw_data[1] -> Calc[2] -> ConstraintEnforcer[3] (1/1) (7e96e560b20421628daebadd735a80da_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from SCHEDULED to DEPLOYING.
2025-04-26 13:48:49,405 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying Source: raw_data[1] -> Calc[2] -> ConstraintEnforcer[3] (1/1) (attempt #0) with attempt id 7e96e560b20421628daebadd735a80da_cbc357ccb763df2852fee8c4fc7d55f2_0_0 and vertex id cbc357ccb763df2852fee8c4fc7d55f2_0 to 172.80.0.71:43227-5a210b @ taskmanager.dahbest (dataPort=41545) with allocation id d9a5050ac4262332fb67a2a37a7e1b15
2025-04-26 13:48:49,407 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - IcebergStreamWriter (1/1) (7e96e560b20421628daebadd735a80da_306d8342cb5b2ad8b53f1be57f65bee8_0_0) switched from SCHEDULED to DEPLOYING.
2025-04-26 13:48:49,407 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying IcebergStreamWriter (1/1) (attempt #0) with attempt id 7e96e560b20421628daebadd735a80da_306d8342cb5b2ad8b53f1be57f65bee8_0_0 and vertex id 306d8342cb5b2ad8b53f1be57f65bee8_0 to 172.80.0.71:43227-5a210b @ taskmanager.dahbest (dataPort=41545) with allocation id d9a5050ac4262332fb67a2a37a7e1b15
2025-04-26 13:48:49,411 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - IcebergFilesCommitter -> IcebergSink iceberg_catalog.etl_db.item_extract: Writer (1/1) (7e96e560b20421628daebadd735a80da_fbb4ef531e002f8fb3a2052db255adf5_0_0) switched from SCHEDULED to DEPLOYING.
2025-04-26 13:48:49,412 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Deploying IcebergFilesCommitter -> IcebergSink iceberg_catalog.etl_db.item_extract: Writer (1/1) (attempt #0) with attempt id 7e96e560b20421628daebadd735a80da_fbb4ef531e002f8fb3a2052db255adf5_0_0 and vertex id fbb4ef531e002f8fb3a2052db255adf5_0 to 172.80.0.71:43227-5a210b @ taskmanager.dahbest (dataPort=41545) with allocation id d9a5050ac4262332fb67a2a37a7e1b15
2025-04-26 13:48:49,498 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: raw_data[1] -> Calc[2] -> ConstraintEnforcer[3] (1/1) (7e96e560b20421628daebadd735a80da_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to INITIALIZING.
2025-04-26 13:48:49,500 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - IcebergFilesCommitter -> IcebergSink iceberg_catalog.etl_db.item_extract: Writer (1/1) (7e96e560b20421628daebadd735a80da_fbb4ef531e002f8fb3a2052db255adf5_0_0) switched from DEPLOYING to INITIALIZING.
2025-04-26 13:48:49,501 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - IcebergStreamWriter (1/1) (7e96e560b20421628daebadd735a80da_306d8342cb5b2ad8b53f1be57f65bee8_0_0) switched from DEPLOYING to INITIALIZING.
2025-04-26 13:48:49,655 INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Source Source: raw_data[1] registering reader for parallel task 0 (#0) @ 172.80.0.71
2025-04-26 13:48:49,656 INFO org.apache.iceberg.flink.source.enumerator.AbstractIcebergEnumerator [] - Added reader: 0
2025-04-26 13:48:49,656 INFO org.apache.iceberg.flink.source.enumerator.AbstractIcebergEnumerator [] - Received request split event from subtask 0
2025-04-26 13:48:49,656 INFO org.apache.iceberg.flink.source.enumerator.AbstractIcebergEnumerator [] - Assigning splits for 1 awaiting readers
2025-04-26 13:48:49,656 INFO org.apache.iceberg.flink.source.enumerator.AbstractIcebergEnumerator [] - No more splits available for subtask 0
2025-04-26 13:48:49,662 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: raw_data[1] -> Calc[2] -> ConstraintEnforcer[3] (1/1) (7e96e560b20421628daebadd735a80da_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from INITIALIZING to RUNNING.
2025-04-26 13:48:49,970 INFO org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to trigger checkpoint for job 6c0538cd93f8081b838802c8ad10f916 since Checkpoint triggering task IcebergStreamWriter (1/1) of job 6c0538cd93f8081b838802c8ad10f916 is not being executed at the moment. Aborting checkpoint. Failure reason: Not all required tasks are currently running..
2025-04-26 13:48:50,012 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - IcebergStreamWriter (1/1) (7e96e560b20421628daebadd735a80da_306d8342cb5b2ad8b53f1be57f65bee8_0_0) switched from INITIALIZING to FAILED on 172.80.0.71:43227-5a210b @ taskmanager.dahbest (dataPort=41545).
java.lang.ClassCastException: class org.apache.iceberg.shaded.org.apache.parquet.schema.MessageType cannot be cast to class org.apache.parquet.schema.MessageType (org.apache.iceberg.shaded.org.apache.parquet.schema.MessageType and org.apache.parquet.schema.MessageType are in unnamed module of loader 'app')
    at org.apache.iceberg.parquet.ParquetWriter.<init>(ParquetWriter.java:92) ~[iceberg-flink-runtime-1.20-1.8.1.jar:?]
    at org.apache.iceberg.parquet.Parquet$WriteBuilder.build(Parquet.java:393) ~[iceberg-flink-runtime-1.20-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.FlinkAppenderFactory.newAppender(FlinkAppenderFactory.java:131) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.FlinkAppenderFactory.newDataWriter(FlinkAppenderFactory.java:145) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.newWriter(BaseTaskWriter.java:383) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.newWriter(BaseTaskWriter.java:376) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.openCurrent(BaseTaskWriter.java:334) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.<init>(BaseTaskWriter.java:296) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$RollingFileWriter.<init>(BaseTaskWriter.java:378) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.io.BaseTaskWriter$BaseEqualityDeltaWriter.<init>(BaseTaskWriter.java:134) ~[iceberg-core-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.BaseDeltaTaskWriter$RowDataDeltaWriter.<init>(BaseDeltaTaskWriter.java:113) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.UnpartitionedDeltaWriter.<init>(UnpartitionedDeltaWriter.java:57) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.RowDataTaskWriterFactory.create(RowDataTaskWriterFactory.java:191) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.iceberg.flink.sink.IcebergStreamWriter.open(IcebergStreamWriter.java:62) ~[iceberg-flink-1.20-1.8.1.jar:?]
    at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateAndGates(StreamTask.java:858) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$restoreInternal$5(StreamTask.java:812) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:812) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:771) ~[flink-streaming-java-1.20.0.jar:1.20.0]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:970) ~[flink-dist-1.20.0.jar:1.20.0]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:939) ~[flink-dist-1.20.0.jar:1.20.0]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:763) ~[flink-dist-1.20.0.jar:1.20.0]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) ~[flink-dist-1.20.0.jar:1.20.0]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]
```
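
For anyone comparing setups: the trace has frames loaded from both `iceberg-flink-runtime-1.20-1.8.1.jar` and the standalone `iceberg-flink-1.20-1.8.1.jar` / `iceberg-core-1.8.1.jar`, and the cast fails between the shaded and unshaded `MessageType`. That suggests, though it is not confirmed here, that the runtime bundle and the standalone module jars overlap on the classpath. Below is a minimal sketch for spotting that overlap by file name only; `check_iceberg_jar_mix` is a hypothetical helper, not part of Iceberg or Flink.

```shell
#!/bin/bash
# Hypothetical helper (not an Iceberg or Flink tool): report whether a lib
# directory holds both the iceberg-flink-runtime bundle and standalone
# iceberg-* module jars. It inspects file names only, not jar contents.
check_iceberg_jar_mix() {
  local lib_dir="$1"
  local -a runtime=() modules=()
  local jar

  for jar in "$lib_dir"/iceberg-*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    case "$(basename "$jar")" in
      iceberg-flink-runtime-*) runtime+=("$jar") ;;
      *)                       modules+=("$jar") ;;
    esac
  done

  if [ "${#runtime[@]}" -gt 0 ] && [ "${#modules[@]}" -gt 0 ]; then
    echo "MIXED: runtime bundle and standalone Iceberg modules are both present:"
    printf '  %s\n' "${runtime[@]}" "${modules[@]}"
    return 1
  fi
  echo "OK: no mix of runtime bundle and standalone Iceberg modules"
}
```

Run it as `check_iceberg_jar_mix /opt/flink/lib`. Because it only looks at file names, it will not notice other jars that happen to bundle the same classes.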
