hudi-bot opened a new issue, #15526:
URL: https://github.com/apache/hudi/issues/15526
The Docker demo fails with the following exception when running step 2:
{code:java}
docker exec -it adhoc-2 /bin/bash
root@adhoc-2:/opt# spark-submit \
> --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE \
> --table-type COPY_ON_WRITE \
> --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
> --source-ordering-field ts \
> --target-base-path /user/hive/warehouse/stock_ticks_cow \
> --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties \
> --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
22/10/31 15:14:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/10/31 15:14:41 WARN deploy.SparkSubmit$$anon$2: Failed to load org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.
java.lang.ClassNotFoundException: org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:806)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/10/31 15:14:41 INFO util.ShutdownHookManager: Shutdown hook called
22/10/31 15:14:41 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-c2d663e6-ff44-462a-beb0-bae5d73d3669
{code}
Comparing the sizes of the bundle jars shows that the utilities jar is
significantly smaller than the others (about 18 KB versus tens of megabytes),
which suggests it is not the complete bundle:
{code:java}
root@adhoc-2:/opt# ls -l /var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/
total 173136
drwxr-xr-x 3 root root 96 Oct 31 15:07 antrun
-rw-r--r-- 1 root root 40597210 Oct 31 15:07 hoodie-hadoop-mr-bundle.jar
-rw-r--r-- 1 root root 36576220 Oct 31 15:07 hoodie-hive-sync-bundle.jar
-rw-r--r-- 1 root root 100091870 Oct 31 15:07 hoodie-spark-bundle.jar
-rw-r--r-- 1 root root 18336 Oct 31 15:07 hoodie-utilities.jar
drwxr-xr-x 3 root root 96 Oct 31 15:07 maven-shared-archive-resources
{code}
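One way to confirm that the small jar is missing the class named in the `ClassNotFoundException` is to list its entries. The following is a minimal Python sketch, assuming Python 3 is available wherever the jar can be read; `jar_contains_class` is a hypothetical helper name and is not part of Hudi or the demo.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for class_name.

    Hypothetical helper for illustration; not part of Hudi.
    """
    # A fully qualified class name maps to a path inside the jar,
    # e.g. a.b.C -> a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Called with `/var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar` and `org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer`, it would be expected to return `False` for the undersized jar, assuming the class is indeed absent from it.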
A quick workaround to run the demo is to replace that jar with a copy of the
complete bundle by running:
{code:java}
docker cp \
  packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.13.0-SNAPSHOT.jar \
  adhoc-2:/var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar
{code}
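The jar name in the workaround hardcodes a Scala version and a Hudi version, so it may differ on other checkouts. The sketch below locates the bundle with a glob and builds the same `docker cp` command. It is an illustration only: `find_utilities_bundle` and `docker_cp_command` are hypothetical names, and the list of skipped suffixes (`-sources.jar`, `-javadoc.jar`, `-tests.jar`) is an assumption about what else might be in the `target` directory.

```python
import glob
import os

# Destination path taken from the workaround command in this issue.
TARGET_JAR = "/var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar"


def find_utilities_bundle(repo_root: str) -> str:
    """Return the single utilities bundle jar under repo_root (hypothetical helper)."""
    pattern = os.path.join(
        repo_root, "packaging", "hudi-utilities-bundle", "target",
        "hudi-utilities-bundle_*.jar",
    )
    # Assumption: auxiliary artifacts, if present, use these suffixes.
    skip = ("-sources.jar", "-javadoc.jar", "-tests.jar")
    candidates = [p for p in glob.glob(pattern) if not p.endswith(skip)]
    if len(candidates) != 1:
        raise RuntimeError(f"expected exactly one bundle jar, found: {candidates}")
    return candidates[0]


def docker_cp_command(bundle_jar: str, container: str = "adhoc-2") -> list:
    """Build the docker cp argument list that replaces the jar in the container."""
    return ["docker", "cp", bundle_jar, f"{container}:{TARGET_JAR}"]
```

From the repository root, the command could then be run with `subprocess.run(docker_cp_command(find_utilities_bundle(".")), check=True)`.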
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-5110
- Type: Bug