Hi all,

I ran into errors that I cannot explain when using a Java user-defined function (UDF):
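For context, the failing class (com.we7.warehouse.udf.DateCompare, per the logs below) follows the standard Hive UDF shape. The snippet below is a simplified, hypothetical sketch of that shape, not the actual production class; the real class extends org.apache.hadoop.hive.ql.exec.UDF (from hive-exec), and that dependency is left out here so the logic stands alone:

```java
// Hypothetical sketch only -- not the production DateCompare class.
// The real class extends org.apache.hadoop.hive.ql.exec.UDF, and Hive
// resolves the evaluate() method by reflection at query time.
public class DateCompareSketch {

    // Compare two ISO-8601 date strings (yyyy-MM-dd), which order
    // correctly under plain lexicographic comparison.
    public int evaluate(String a, String b) {
        if (a == null || b == null) {
            return 0; // Hive UDFs must tolerate NULL inputs
        }
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        DateCompareSketch udf = new DateCompareSketch();
        System.out.println(udf.evaluate("2013-06-28", "2013-07-01")); // prints -1
    }
}
```

A class of this shape would be registered per session with `ADD JAR /path/to/udfs.jar;` followed by `CREATE TEMPORARY FUNCTION date_compare AS 'com.we7.warehouse.udf.DateCompare';` (the function name `date_compare` and jar path here are illustrative).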
- The UDF runs fine on small queries, so I am confident in my ADD JAR / CREATE TEMPORARY FUNCTION code;
- The error is raised only on complex queries / high-volume data sets;
- The jobs complain first about a "java.lang.ClassNotFoundException: com", followed by a "Continuing ..." and then successive "FAILED/ATTEMPT" messages, but keep going;
- They fail later with a "java.lang.NoClassDefFoundError".

Any idea about that odd pattern of errors? What could explain why it happens with this particular UDF? (My project uses many other UDFs with no problem.)

Thanks in advance,
Gui

## Version
hive@melissa:~$ hadoop version
Hadoop 2.0.0-cdh4.1.2
Subversion file:///data/1/jenkins/workspace/generic-package-debian64-6-0/CDH4.1.2-Packaging-Hadoop-2012-11-01_17-01-07/hadoop-2.0.0+552-1.cdh4.1.2.p0.27~squeeze/src/hadoop-common-project/hadoop-common -r f0b53c81cbf56f5955e403b49fcd27afd5f082de
Compiled by jenkins on Thu Nov 1 17:33:24 PDT 2012
From source with checksum c5d56e606a3aa6dd5399cee3b2b8054f

## Libs
hive@melissa:~$ ls /usr/lib/hive/lib/ | grep hive
hive-builtins-0.9.0-cdh4.1.2.jar
hive-cli-0.9.0-cdh4.1.2.jar
hive-common-0.9.0-cdh4.1.2.jar
hive-contrib-0.9.0-cdh4.1.2.jar
hive_contrib.jar
hive-exec-0.9.0-cdh4.1.2.jar
hive-hbase-handler-0.9.0-cdh4.1.2.jar
hive-hwi-0.9.0-cdh4.1.2.jar
hive-jdbc-0.9.0-cdh4.1.2.jar
hive-json-serde-0.2.jar
hive-metastore-0.9.0-cdh4.1.2.jar
hive-pdk-0.9.0-cdh4.1.2.jar
hive-serde-0.9.0-cdh4.1.2.jar
hive-service-0.9.0-cdh4.1.2.jar
hive-shims-0.9.0-cdh4.1.2.jar

## Logs
Total MapReduce jobs = 6
Ended Job = 1061943637, job is filtered out (removed at runtime).
Ended Job = -251883494, job is filtered out (removed at runtime).
Execution log at: /tmp/hive/hive_20130628162727_a4662ea1-6f20-4e19-9af9-f49f13874e85.log
2013-06-28 04:27:14 Starting to launch local task to process map join; maximum memory = 932118528
2013-06-28 04:27:15 Processing rows: 3 Hashtable size: 3 Memory usage: 3766480 rate: 0.004
2013-06-28 04:27:15 Dump the hashtable into file: file:/tmp/hive/hive_2013-06-28_16-27-04_597_7794600918247628371/-local-10007/HashTable-Stage-8/MapJoin-mapfile21--.hashtable
2013-06-28 04:27:15 Upload 1 File to: file:/tmp/hive/hive_2013-06-28_16-27-04_597_7794600918247628371/-local-10007/HashTable-Stage-8/MapJoin-mapfile21--.hashtable File size: 546
2013-06-28 04:27:15 End of local task; Time Taken: 1.662 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 2 out of 6
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201306201551_0655, Tracking URL = http://hadoop-master.we7.local:50030/jobdetails.jsp?jobid=job_201306201551_0655
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=hadoop-master.we7.local:8021 -kill job_201306201551_0655
Hadoop job information for Stage-8: number of mappers: 1; number of reducers: 0
2013-06-28 16:27:20,383 Stage-8 map = 0%, reduce = 0%
2013-06-28 16:27:27,430 Stage-8 map = 100%, reduce = 0%, Cumulative CPU 4.27 sec
2013-06-28 16:27:28,447 Stage-8 map = 100%, reduce = 100%, Cumulative CPU 4.27 sec
MapReduce Total cumulative CPU time: 4 seconds 270 msec
Ended Job = job_201306201551_0655
Ended Job = 2140631413, job is filtered out (removed at runtime).
Ended Job = 1738046100, job is filtered out (removed at runtime).
Execution log at: /tmp/hive/hive_20130628162727_a4662ea1-6f20-4e19-9af9-f49f13874e85.log
java.lang.ClassNotFoundException: com.we7.warehouse.udf.DateCompare
Continuing ...
java.lang.NullPointerException: target should not be null
Continuing ...
java.lang.ClassNotFoundException: com.we7.warehouse.udf.DateCompare
Continuing ...
java.lang.NullPointerException: target should not be null
Continuing ...
2013-06-28 04:27:30 Starting to launch local task to process map join; maximum memory = 932118528
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:76)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:166)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
	at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:385)
	at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:267)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:672)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1124)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.<init>(ExprNodeGenericFuncEvaluator.java:107)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.get(ExprNodeEvaluatorFactory.java:48)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.<init>(ExprNodeGenericFuncEvaluator.java:97)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.get(ExprNodeEvaluatorFactory.java:48)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:70)
	... 13 more
Execution failed with exit status: 2
Obtaining error information
Task failed!
Task ID: Stage-11
Logs: /tmp/hive/hive.log
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapredLocalTask
ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.MapRedTask
Launching Job 4 out of 6
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201306201551_0656, Tracking URL = http://hadoop-master.we7.local:50030/jobdetails.jsp?jobid=job_201306201551_0656
Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=hadoop-master.we7.local:8021 -kill job_201306201551_0656
Hadoop job information for Stage-2: number of mappers: 2; number of reducers: 1
2013-06-28 16:27:34,525 Stage-2 map = 0%, reduce = 0%
2013-06-28 16:27:39,551 Stage-2 map = 50%, reduce = 0%, Cumulative CPU 3.49 sec
...
2013-06-28 16:27:58,736 Stage-2 map = 50%, reduce = 17%, Cumulative CPU 3.49 sec
2013-06-28 16:27:59,745 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 3.49 sec
MapReduce Total cumulative CPU time: 3 seconds 490 msec
Ended Job = job_201306201551_0656 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201306201551_0656_m_000003 (and more) from job job_201306201551_0656
Exception in thread "Thread-59" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/util/HostUtil
	at org.apache.hadoop.hive.shims.Hadoop23Shims.getTaskAttemptLogUrl(Hadoop23Shims.java:51)
	at org.apache.hadoop.hive.ql.exec.JobDebugger$TaskInfoGrabber.getTaskInfos(JobDebugger.java:186)
	at org.apache.hadoop.hive.ql.exec.JobDebugger$TaskInfoGrabber.run(JobDebugger.java:142)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.util.HostUtil
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	... 4 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 4.27 sec HDFS Read: 0 HDFS Write: 0 SUCCESS
Job 1: Map: 2 Reduce: 1 Cumulative CPU: 3.49 sec HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 7 seconds 760 msec

Guillaume Allain
Senior Development Engineer
t: +44 20 7117 0809
m:

blinkbox music - the easiest way to listen to the music you love, for free
www.blinkboxmusic.com