[ 
https://issues.apache.org/jira/browse/HUDI-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614121#comment-17614121
 ] 

Raymond Xu commented on HUDI-4975:
----------------------------------

Root-caused the issue:

When datahub-sync-bundle is built with the spark3.3 profile, it expects to 
work with the Parquet 1.12.2 class

{code}
https://github.com/apache/parquet-mr/blob/apache-parquet-1.12.2/parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java
{code}

which does not exist in Parquet 1.10.1, the version used by Spark 3.1.

This means datahub-sync-bundle (and possibly other sync bundles) is not fully 
decoupled from the Spark profiles. The problem can be mitigated by putting 
spark-bundle first in the classpath, but we should eliminate the root cause.
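
To check which jar (if any) supplies the class on a given classpath, a small 
probe along these lines may help. This is only a sketch: the class name 
ClassOriginProbe and the method locate are made up for illustration and are 
not part of Hudi.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper, not part of Hudi: reports where a class is loaded from.
class ClassOriginProbe {

    /** Returns the location a class was loaded from, or empty if it cannot be loaded. */
    static Optional<String> locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClassOriginProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            URL location = (src == null) ? null : src.getLocation();
            // JDK bootstrap classes typically have no code source location
            return Optional.of(location == null ? "<bootstrap>" : location.toString());
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it links against, is missing from this classpath
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace of this issue
        String name = args.length > 0 ? args[0] : "org.apache.parquet.schema.LogicalTypeAnnotation";
        System.out.println(name + " -> " + locate(name).orElse("NOT FOUND on this classpath"));
    }
}
```

Running it with java -cp set to the same jars passed to --jars (plus the 
directory containing the compiled probe) should show whether the class 
resolves at all and, if so, from which jar.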

> datahub sync bundle causes class loading issue
> ----------------------------------------------
>
>                 Key: HUDI-4975
>                 URL: https://issues.apache.org/jira/browse/HUDI-4975
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: dependencies
>            Reporter: Raymond Xu
>            Assignee: Raymond Xu
>            Priority: Critical
>             Fix For: 0.12.2
>
>
> Run utilities-slim.jar as the main jar for deltastreamer and set
> --jars /tmp/hudi-datahub-sync-bundle-0.12.1-rc1.jar,/tmp/hudi-spark3.1-bundle_2.12-0.12.1-rc1.jar
> Putting the datahub sync bundle before the spark bundle results in a class 
> loading issue. It works fine if the spark bundle goes first.
> {code:bash}
> Caused by: java.lang.NoClassDefFoundError: org/apache/parquet/schema/LogicalTypeAnnotation
>       at org.apache.hudi.io.storage.HoodieFileWriterFactory.newParquetFileWriter(HoodieFileWriterFactory.java:78)
>       at org.apache.hudi.io.storage.HoodieFileWriterFactory.newParquetFileWriter(HoodieFileWriterFactory.java:70)
>       at org.apache.hudi.io.storage.HoodieFileWriterFactory.getFileWriter(HoodieFileWriterFactory.java:54)
>       at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:104)
>       at org.apache.hudi.io.HoodieCreateHandle.<init>(HoodieCreateHandle.java:76)
>       at org.apache.hudi.io.CreateHandleFactory.create(CreateHandleFactory.java:46)
>       at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:83)
>       at org.apache.hudi.execution.CopyOnWriteInsertHandler.consumeOneRecord(CopyOnWriteInsertHandler.java:40)
>       at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37)
>       at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:135)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       ... 3 more
> Caused by: java.lang.ClassNotFoundException: org.apache.parquet.schema.LogicalTypeAnnotation
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>       ... 14 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
