Hi Team,

We are trying Hive on Spark in our cluster, and we hit the exception below whenever a Hive query involves a reducer phase in its execution (for example, GROUP BY or a UDAF). Could you please help us understand the compatibility of Hive on Spark with UDAF execution and the root cause of this exception?
We are using Spark 1.1.0 and built Hive from the code downloaded from https://github.com/apache/hive/tree/spark, using the hadoop-2 profile (mvn clean install -DskipTests -Phadoop-2).

hive (default)> select count(*) from employee;
Query ID = phodisvc_20141103032121_978e1f48-6290-4e5d-8a57-955edc98b7cd
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.foreachAsync(Lorg/apache/spark/api/java/function/VoidFunction;)Lorg/apache/spark/api/java/JavaFutureAction;
        at org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:189)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:76)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1366)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1178)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1005)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:995)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. org.apache.spark.api.java.JavaPairRDD.foreachAsync(Lorg/apache/spark/api/java/function/VoidFunction;)Lorg/apache/spark/api/java/JavaFutureAction;

Thanks & Regards,
Prabu
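P.S. Since NoSuchMethodError usually means the class that is actually on the classpath lacks the method the caller was compiled against, one way to check is to pull the JavaPairRDD class out of the deployed Spark assembly and disassemble it with javap. This is only a sketch: the assembly jar name below is an assumption and depends on how Spark was built, so adjust the path to your deployment.

```shell
# Extract the JavaPairRDD class file from the Spark assembly jar
# (jar name is illustrative; use the assembly jar your cluster runs).
unzip -p spark-assembly-1.1.0-hadoop2.jar \
    org/apache/spark/api/java/JavaPairRDD.class > JavaPairRDD.class

# List the class's methods and look for foreachAsync; if grep finds
# nothing, this Spark build does not expose the method Hive is calling.
javap JavaPairRDD.class | grep foreachAsync \
    || echo "foreachAsync not found in this Spark build"
```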