[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-3961:
-----------------------------
    Story Points: 1

> Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
> ------------------------------------------------------------------------------------
>
>                 Key: HUDI-3961
>                 URL: https://issues.apache.org/jira/browse/HUDI-3961
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: dependencies
>            Reporter: Ethan Guo
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.12.1
>
> When running DeltaStreamer with both the Spark 3.1 bundle and the utilities slim bundle (compiled with the Spark 3.2 profile), the following exception is thrown:
> {code:bash}
> export SPARK_HOME=/Users/ethan/Work/lib/spark-3.1.3-bin-hadoop3.2
> export HUDI_SPARK_BUNDLE_JAR=/Users/ethan/Work/lib/hudi_releases/0.11.0-rc3/hudi-spark3.1-bundle_2.12-0.11.0-rc3.jar
> export HUDI_UTILITIES_SLIM_JAR=/Users/ethan/Work/lib/hudi_releases/0.11.0-rc3/hudi-utilities-slim-bundle_2.12-0.11.0-rc3.jar
>
> /Users/ethan/Work/lib/spark-3.1.3-bin-hadoop3.2/bin/spark-submit \
>   --master local[4] \
>   --driver-memory 4g --executor-memory 2g --num-executors 4 --executor-cores 1 \
>   --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain \
>   --conf spark.sql.catalogImplementation=hive \
>   --conf spark.driver.maxResultSize=1g \
>   --conf spark.speculation=true \
>   --conf spark.speculation.multiplier=1.0 \
>   --conf spark.speculation.quantile=0.5 \
>   --conf spark.ui.port=6680 \
>   --conf spark.eventLog.enabled=true \
>   --conf spark.eventLog.dir=/Users/ethan/Work/data/hudi/spark-logs \
>   --packages org.apache.spark:spark-avro_2.12:3.1.3 \
>   --jars /Users/ethan/Work/repo/hudi-benchmarks/target/hudi-benchmarks-0.1-SNAPSHOT.jar,$HUDI_SPARK_BUNDLE_JAR \
>   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
>   $HUDI_UTILITIES_SLIM_JAR \
>   --props $TEST_ROOT_DIR/ds_mor.properties \
>   --source-class BenchmarkDataSource \
>   --source-ordering-field ts \
>   --target-base-path $TEST_ROOT_DIR/test_table \
>   --target-table test_table \
>   --table-type MERGE_ON_READ \
>   --op UPSERT \
>   --continuous{code}
>
> {code:java}
> Exception in thread "main" org.apache.hudi.exception.HoodieException: java.lang.NoClassDefFoundError: org/apache/avro/AvroMissingFieldException
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:191)
>   at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:186)
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:549)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>   at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
>   at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>   at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/avro/AvroMissingFieldException
>   at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:189)
>   ... 15 more
> Caused by: java.lang.NoClassDefFoundError: org/apache/avro/AvroMissingFieldException
>   at org.apache.hudi.avro.model.HoodieCleanerPlan.newBuilder(HoodieCleanerPlan.java:246)
>   at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:104)
> {code}
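For context on why this particular class goes missing: `AvroMissingFieldException` first appeared in Avro 1.9, and generated Avro model classes such as `HoodieCleanerPlan` built against newer Avro reference it, while Spark 3.1 ships an older Avro (1.8.x). A slim bundle compiled with the Spark 3.2 profile (which builds against Avro 1.10.x) therefore references a class the Spark 3.1 runtime cannot supply. The following minimal probe is illustrative only (the `AvroProbe` class and `isLoadable` helper are hypothetical, not part of the reported job); it shows how to check which Avro generation a given classpath actually provides:

```java
// Hypothetical probe: report whether a class is visible to the current
// classloader. A "NOT found" result for org.apache.avro.AvroMissingFieldException
// indicates the Avro jar winning on the classpath predates 1.9 (e.g. the
// 1.8.x that ships with Spark 3.1), matching the NoClassDefFoundError above.
public class AvroProbe {

    // Returns true if the fully-qualified class name can be loaded.
    static boolean isLoadable(String fqcn) {
        try {
            Class.forName(fqcn);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.avro.AvroMissingFieldException";
        System.out.println(cls + (isLoadable(cls)
                ? " found: Avro 1.9+ is on the classpath"
                : " NOT found: an older Avro (e.g. 1.8.x) is on the classpath"));
    }
}
```

Running such a probe with the same `--jars`/`--packages` arrangement as the failing `spark-submit` would reveal which Avro generation the driver actually resolves; the class name probed here is the one from the stack trace.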
Raymond Xu updated HUDI-3961:
    Story Points: 2  (was: 1)
Raymond Xu updated HUDI-3961:
    Sprint: 2022/09/05  (was: 2022/08/22)
Raymond Xu updated HUDI-3961:
    Reviewers: Raymond Xu
Raymond Xu updated HUDI-3961:
    Story Points: 1  (was: 2)
Raymond Xu updated HUDI-3961:
    Sprint: 2022/09/05, 2022/09/19  (was: 2022/09/05)
Raymond Xu updated HUDI-3961:
    Fix Version/s: 0.11.1  (was: 0.11.0)
[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3961: - Component/s: dependencies
[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3961: - Labels: pull-request-available (was: )
[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3961: Priority: Critical (was: Blocker)
[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3961: - Fix Version/s: 0.12.1 (was: 0.12.0)
[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3961: Fix Version/s: 0.12.0 (was: 0.11.1)