[ https://issues.apache.org/jira/browse/HUDI-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614121#comment-17614121 ]
Raymond Xu commented on HUDI-4975:
----------------------------------

Root-caused the issue: when datahub-sync-bundle is built with the spark3.3 profile, it expects to work with the class
{code}
https://github.com/apache/parquet-mr/blob/apache-parquet-1.12.2/parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
{code}
which does not exist in parquet 1.10.1, the version used by spark 3.1.

This means datahub-sync-bundle (and possibly other sync bundles) is not fully decoupled from spark profiles. This can be mitigated by putting spark-bundle first in the classpath, but we should eliminate the root issue.

> datahub sync bundle causes class loading issue
> ----------------------------------------------
>
>                 Key: HUDI-4975
>                 URL: https://issues.apache.org/jira/browse/HUDI-4975
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: dependencies
>            Reporter: Raymond Xu
>            Assignee: Raymond Xu
>            Priority: Critical
>             Fix For: 0.12.2
>
>
> run utilities-slim.jar as the main jar for deltastreamer
> set --jars /tmp/hudi-datahub-sync-bundle-0.12.1-rc1.jar,/tmp/hudi-spark3.1-bundle_2.12-0.12.1-rc1.jar
> putting datahub sync bundle before spark bundle resulted in a class loader issue.
> works fine if spark bundle goes first
> {code:bash}
> Caused by: java.lang.NoClassDefFoundError: org/apache/parquet/schema/LogicalTypeAnnotation
>     at org.apache.hudi.io.storage.HoodieFileWriterFactory.newParquetFileWriter(HoodieFileWriterFactory.java:78)
>     at org.apache.hudi.io.storage.HoodieFileWriterFactory.newParquetFileWriter(HoodieFileWriterFactory.java:70)
>     at org.apache.hudi.io.storage.HoodieFileWriterFactory.getFileWriter(HoodieFileWriterFactory.java:54)
>     at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:104)
>     at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:76)
>     at org.apache.hudi.io.CreateHandleFactory.create(CreateHandleFactory.java:46)
>     at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:83)
>     at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:40)
>     at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
>     at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:135)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     ... 3 more
> Caused by: java.lang.ClassNotFoundException: org.apache.parquet.schema.LogicalTypeAnnotation
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>     ... 14 more
> {code}
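
For anyone trying to confirm the same conflict locally, a rough diagnostic sketch follows. It is not part of Hudi: the class name ClasspathProbe and the method locate are made up for illustration. It reports which jar (if any) serves a given class under the current classpath order. It only reflects the classpath of the JVM it runs in, so it would need to be launched with the same jar ordering as the failing job to be meaningful.

```java
import java.security.CodeSource;

// Hypothetical helper, not a Hudi or Spark API: reports where a class is loaded from.
public class ClasspathProbe {

    /** Returns the jar/directory URL that serves the class, or a marker string. */
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK bootstrap classes carry no code source
            if (src == null || src.getLocation() == null) {
                return "<bootstrap>";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // the class, or something it links against, is missing from the classpath
            return "<not found>";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in this issue; override via args if needed.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hudi.io.storage.HoodieFileWriterFactory",
            "org.apache.parquet.schema.LogicalTypeAnnotation"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If HoodieFileWriterFactory resolves to the datahub sync bundle while LogicalTypeAnnotation reports <not found>, that would be consistent with the root cause described in the comment above.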