I used Ambari to configure and install Hive and Spark. I want to insert into a Hive table using the Spark execution engine, but I am running into a weird error.
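For context, the failure happens on an ordinary INSERT once the engine is switched to Spark; roughly the shape of statement sketched below (the database, table, and column names here are placeholders, not my real ones):

    -- Illustrative only: placeholder names, but the same kind of statement that fails for me
    SET hive.execution.engine=spark;

    INSERT INTO TABLE test_db.target_table
    SELECT id, name
    FROM test_db.source_table;

The error is: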
    Job failed with java.lang.ClassNotFoundException: ive_20231017100559_301568f9-bdfa-4f7c-89a6-f69a65b30aaf:1
    2023-10-17 10:07:42,972 ERROR [c4aeb932-743e-4736-b00f-6b905381fa03 main] status.SparkJobMonitor: Job failed with java.lang.ClassNotFoundException: ive_20231017100559_301568f9-bdfa-4f7c-89a6-f69a65b30aaf:1
    com.esotericsoftware.kryo.KryoException: Unable to find class: ive_20231017100559_301568f9-bdfa-4f7c-89a6-f69a65b30aaf:1
    Serialization trace:
    invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
        at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
        at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
        at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
        at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
        at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
        at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:709)
        at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:206)
        at org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:60)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:329)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:378)
        at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.ClassNotFoundException: ive_20231017100559_301568f9-bdfa-4f7c-89a6-f69a65b30aaf:1
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
        ... 15 more
    2023-10-17 10:07:43,067 INFO [c4aeb932-743e-4736-b00f-6b905381fa03 main] reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
    FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.

The weird part is that Hive generated this class name itself and is now asking me where to find it! I would appreciate any help locating and solving the problem.

Note: Ambari, Hadoop, Hive, ZooKeeper, and Spark all work well according to the Ambari service health check.
Note: Since I didn't find any Spark-specific hive-site.xml, I added the following configuration to hive-site.xml:

    <property>
      <name>hive.execution.engine</name>
      <value>spark</value>
    </property>
    <property>
      <name>hive.spark.warehouse.location</name>
      <value>/tmp/spark/warehouse</value>
    </property>
    <property>
      <name>hive.spark.sql.execution.mode</name>
      <value>adaptive</value>
    </property>
    <property>
      <name>hive.spark.sql.shuffle.partitions</name>
      <value>200</value>
    </property>
    <property>
      <name>hive.spark.sql.shuffle.partitions.pernode</name>
      <value>2</value>
    </property>
    <property>
      <name>hive.spark.sql.memory.fraction</name>
      <value>0.6</value>
    </property>
    <property>
      <name>hive.spark.sql.codegen.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>spark.sql.hive.hiveserver2.jdbc.url</name>
      <value>jdbc:hive2://my.ambari.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2</value>
    </property>
    <property>
      <name>spark.datasource.hive.warehouse.load.staging.dir</name>
      <value>/tmp</value>
    </property>
    <property>
      <name>spark.hadoop.hive.zookeeper.quorum</name>
      <value>my.ambari.com:2181</value>
    </property>
    <property>
      <name>spark.datasource.hive.warehouse.write.path.strictColumnNamesMapping</name>
      <value>true</value>
    </property>
    <property>
      <name>spark.sql.hive.conf.list</name>
      <value>hive.vectorized.execution.filesink.arrow.native.enabled=true;hive.vectorized.execution.enabled=true</value>
    </property>
    <property>
      <name>hive.spark.client.connect.timeout</name>
      <value>30000ms</value>
    </property>
    <property>
      <name>hive.spark.client.server.connect.timeout</name>
      <value>300000ms</value>
    </property>
    <property>
      <name>hive.hook.proto.base-directory</name>
      <value>/tmp/hive/hooks</value>
    </property>
    <property>
      <name>hive.spark.sql.shuffle.partitions</name>
      <value>200</value>
    </property>
    <property>
      <name>hive.strict.managed.tables</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.stats.fetch.partition.stats</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.spark.sql.memory.fraction</name>
      <value>0.6</value>
    </property>
    <property>
      <name>hive.spark.sql.execution.mode</name>
      <value>spark</value>
    </property>
    <property>
      <name>hive.spark.sql.codegen.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.heapsize</name>
      <value>2g</value>
    </property>
    <property>
      <name>hive.spark.sql.shuffle.partitions.pernode</name>
      <value>100</value>
    </property>
    <property>
      <name>hive.spark.warehouse.location</name>
      <value>/user/hive/warehouse</value>
    </property>
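In case it matters, a simple way to confirm which of these values HiveServer2 actually picks up is to print them from a Beeline session; a minimal check along these lines (using the same property names as in the config above):

    -- Illustrative session check: prints the values Hive resolves at runtime
    SET hive.execution.engine;
    SET hive.spark.client.connect.timeout;
    SET hive.spark.client.server.connect.timeout;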