Differences in building Giraph
Hi Giraph Experts,

I'm trying to build Giraph (trunk) for Hadoop 2.4.0. I know there are two profiles I can build with: *hadoop_yarn* and *hadoop_2*. When I build with *hadoop_yarn*, the build is fine; when I build with *hadoop_2*, it fails. The commands I used are:

    mvn -Phadoop_yarn -Dhadoop.version=2.4.0 -DskipTests clean install
    mvn -Phadoop_2 -Dhadoop.version=2.4.0 -DskipTests clean install

I get the following error while building with the *hadoop_2* profile:

    [INFO] ------------------------------------------------------------------------
    [INFO] Reactor Summary:
    [INFO]
    [INFO] Apache Giraph Parent ............................... SUCCESS [  5.758 s]
    [INFO] Apache Giraph Core ................................. FAILURE [  9.259 s]
    [INFO] Apache Giraph Examples ............................. SKIPPED
    [INFO] Apache Giraph Accumulo I/O ......................... SKIPPED
    [INFO] Apache Giraph HBase I/O ............................ SKIPPED
    [INFO] Apache Giraph HCatalog I/O ......................... SKIPPED
    [INFO] Apache Giraph Hive I/O ............................. SKIPPED
    [INFO] Apache Giraph Gora I/O ............................. SKIPPED
    [INFO] Apache Giraph Rexster I/O .......................... SKIPPED
    [INFO] Apache Giraph Rexster Kibble ....................... SKIPPED
    [INFO] Apache Giraph Rexster I/O Formats .................. SKIPPED
    [INFO] Apache Giraph Distribution ......................... SKIPPED
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 15.722 s
    [INFO] Finished at: 2014-11-07T11:06:36+05:30
    [INFO] Final Memory: 48M/558M
    [INFO] ------------------------------------------------------------------------
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project giraph-core: Compilation failure: Compilation failure:
    [ERROR] /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyClient.java:[93,32] getDefaultProperties() has protected access in org.apache.hadoop.security.SaslPropertiesResolver
    [ERROR] /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyServer.java:[113,32] getDefaultProperties() has protected access in org.apache.hadoop.security.SaslPropertiesResolver
    [ERROR] -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn <goals> -rf :giraph-core

Can someone help me understand why this happens when only the profile changes? I would understand it if the build failed when the Hadoop version changed.

--
Thanks,
Sundar
Re: ShortestPath Code execution on Hadoop 2.4.0
Hi,

You built Giraph for Hadoop version 1.2.1, which is evident from your command line:

    hadoop jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar

You have to build Giraph against the Hadoop version you'll be using. If you are using Hadoop 2.4.0, the command to build would be:

    mvn -Phadoop_2 -Dhadoop.version=2.4.0 -DskipTests ...

or

    mvn -Phadoop_yarn -Dhadoop.version=2.4.0 -DskipTests ...

You can find this in the *README* file in the Giraph trunk. Hope this helps.

On Tue, Aug 12, 2014 at 10:10 AM, Vikalp Handa handa.vik...@gmail.com wrote:

Hi everyone,

I am new to Apache Giraph and would like to execute the ShortestPaths and PageRank example code on a *Hadoop 2.4.0 single node cluster* (my machine) running CentOS 6.5. I have successfully built Giraph on my machine but am unable to execute the ShortestPaths code. Please let me know if there are any dependencies to be resolved before execution.

*P.S.:*

*Command used:*

    hadoop jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/hadoop/input/tiny_graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/hduser/output/shortestpaths -w 1 -ca giraph.SplitMasterWorker=false

*Execution Result:*

    14/08/11 18:48:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    14/08/11 18:48:40 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
    14/08/11 18:48:40 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
    14/08/11 18:48:40 INFO utils.ConfigurationUtils: Setting custom argument [giraph.SplitMasterWorker] to [false] in GiraphConfiguration
    14/08/11 18:48:40 INFO Configuration.deprecation: mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
    14/08/11 18:48:40 INFO Configuration.deprecation: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
    14/08/11 18:48:40 INFO Configuration.deprecation: mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
    14/08/11 18:48:40 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
    14/08/11 18:48:40 INFO Configuration.deprecation: mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
    14/08/11 18:48:40 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
    14/08/11 18:48:40 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
    14/08/11 18:48:40 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
    14/08/11 18:48:41 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
    14/08/11 18:48:41 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
    Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:44)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
        at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:250)
        at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards,
Vikalp Handa
Re: string parameter from command line
I'm not sure of the use case here, but I'll give an example which might help you. The parameter is given on the command line with -ca (as you said). For example:

    hadoop jar sample.jar -vip *-ca sample.option1=value1 -ca sample.option2=value2*

You can then get the value from the configuration in your computation class:

    *String option1 = getConf().get("sample.option1"); // will get "value1"*
    *String option2 = getConf().get("sample.option2"); // will get "value2"*

On Wed, Jul 16, 2014 at 2:49 PM, Carmen Manzulli carmenmanzu...@gmail.com wrote:

Hi experts,

I'm trying to pass a string parameter from the command line with -ca, so in my Computation code I've created a StrConfOption. When running, I get an NPE because the value of this string seems to be set to null. How can I fix this problem?
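As a side note on how those -ca pairs are handled: each -ca flag carries one key=value string that ends up in the job's configuration. Below is a minimal, self-contained sketch of that parsing, with a plain HashMap standing in for Hadoop's Configuration; the CustomArgs class and its parse method are hypothetical illustrations, not Giraph API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of parsing "-ca key=value" pairs, roughly the way
// GiraphRunner turns custom arguments into configuration entries.
public class CustomArgs {
    public static Map<String, String> parse(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        for (int i = 0; i < args.length - 1; i++) {
            if ("-ca".equals(args[i])) {
                // Split only on the first '=' so values may themselves contain '='.
                String[] kv = args[i + 1].split("=", 2);
                if (kv.length == 2) {
                    conf.put(kv[0], kv[1]);
                }
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = parse(new String[] {
            "-ca", "sample.option1=value1", "-ca", "sample.option2=value2"});
        System.out.println(conf.get("sample.option1")); // value1
        System.out.println(conf.get("sample.option2")); // value2
    }
}
```

If a key is never passed on the command line, the lookup returns null, which would explain the NPE Carmen describes when the -ca argument is missing or mistyped.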
Re: Giraph Basic Job Run
Hi,

I've had this error before. My code was compiled on Java 7 and my .bashrc set JAVA_HOME to Java 7, but I still got the error. For me, the reason was that the Hadoop processes were still running on Java 6: in hadoop-env.sh, I had set JAVA_HOME to point to Java 6. So the processes were running on Java 6, which cannot run code compiled for Java 7. Check which Java your Hadoop processes are running on.

Hope this helps.

Regards,
Sundar

On Wed, Jul 2, 2014 at 12:18 PM, Vineet Mishra clearmido...@gmail.com wrote:

Hi Ritesh,

As far as I have checked, this error

    Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/giraph/GiraphRunner : Unsupported major.minor version 51.0

usually comes when the code is compiled in one version and run in another (usually lower). But I have explicitly set Java 7 for compilation while packaging the project, I have Java 7 set on the classpath (referenced in .bashrc), and the "Unsupported major.minor version 51.0" error itself says that the compilation was done with Java 7. So does that mean my Java is using a lower version than Java 7, even though I can see the classpath set to Java 7?

    java version "1.7.0_21"
    Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

Although I have faced this class version error several times while running code from Eclipse, this is the first time I am facing it with a Maven build, and I don't see any reason for it to behave this way. Expert suggestions will be highly appreciated. Thanks!

On Tue, Jul 1, 2014 at 8:29 PM, Ritesh Kumar Singh riteshoneinamill...@gmail.com wrote:

I guess you have both Java 6 and Java 7 installed on your PC, and when you run Hadoop, it's forcing Giraph, which was built for Java 7, to run on a Java 6 JVM.
I suggest you have a look in your .bashrc file at the default Java version and try to make Java 7 the default JVM.

Hope this helps.

On Tue, Jul 1, 2014 at 5:45 PM, Vineet Mishra clearmido...@gmail.com wrote:

Hi All,

I am new to Giraph and am running the quick-start example from http://giraph.apache.org/quick_start.html. While running the Giraph job it throws a Java class version error, although I have both java and javac set to version 7, and I have no idea why this is happening.

*Running command:*

    hadoop jar /usr/local/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /home/user/input/tiny_graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /home/user/output/shortestpaths -w 1

*Error Trace:*

    Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/giraph/GiraphRunner : Unsupported major.minor version 51.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
        at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:266)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

Versions used: Hadoop 1.1.2, Java 1.7

Any help would be highly appreciated. Thanks in advance!
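For reference on the error discussed above: the "major.minor version 51.0" in the trace is the class-file format version, where major 50 corresponds to Java 6 and 51 to Java 7, and a JVM refuses to load class files newer than itself. A small sketch decoding the major version (the class and method names here are mine, for illustration only):

```java
// Maps a class-file major version to the Java release it was compiled for.
// Major versions start at 45 (JDK 1.1): 45=1.1, 46=1.2, 47=1.3, 48=1.4,
// then 49=Java 5, 50=Java 6, 51=Java 7. A JVM older than the class file's
// version throws UnsupportedClassVersionError, as seen in this thread.
public class ClassVersion {
    public static String javaVersionFor(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("not a valid class-file major version");
        }
        return major < 49 ? "1." + (major - 44) : "Java " + (major - 44);
    }

    public static void main(String[] args) {
        System.out.println(javaVersionFor(51)); // Java 7
        System.out.println(javaVersionFor(50)); // Java 6
    }
}
```

So "Unsupported major.minor version 51.0" means some JVM in the chain (here, the one Hadoop's processes run on) is older than Java 7, regardless of which javac produced the jar.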
Re: Taking string as vertex value..
I'm not sure of the problem, but I can suggest an alternative: use the writeUTF and readUTF methods to write and read strings.

On Thu, Apr 17, 2014 at 4:02 PM, Jyoti Yadav rao.jyoti26ya...@gmail.com wrote:

Hi folks,

While implementing my algorithm, I am using a string as the vertex value. Following is the sample input format:

    [0,hello world,[[1,1],[3,3]]]
    [1,a,[[0,1],[2,2],[3,1]]]
    [2,b,[[1,2],[4,4]]]
    [3,c,[[0,3],[1,1],[4,4]]]
    [4,d,[[3,4],[2,4]]]

Suppose the above is the input file. For this I implemented a vertex value class, but some error is thrown at some point.

*Below is the code for the vertex value writable class:*

    package org.apache.giraph.examples.utils;

    import java.io.*;
    import org.apache.hadoop.io.Writable;

    public class StringValueWritable implements Writable {
        String s;

        public StringValueWritable() {
            s = "";
        }

        public StringValueWritable(String s1) {
            s = s1;
        }

        public String get_string() {
            return s;
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            s = in.readLine();    // <-- some error is shown here in the log files
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeBytes(s);
        }

        @Override
        public String toString() {
            return "vertex value is " + s + "\n";
        }
    }

In the marked part some error is shown in the log files. Please correct me where I went wrong.

Thanks,
Best Regards,
Jyoti

--
*Sundara Raghavan Sankaran*
sun...@crayondata.com
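To illustrate the writeUTF/readUTF suggestion above: writeUTF length-prefixes the string, so readUTF knows exactly how many bytes to consume, whereas writeBytes followed by readLine only works if every value happens to end in a newline. Here is a self-contained round-trip sketch using the same DataInput/DataOutput interfaces a Hadoop Writable sees; the class and method names are illustrative, not Giraph code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Round-trips a String through DataOutput/DataInput using writeUTF/readUTF,
// which frame the value with a length prefix. An unframed writeBytes/readLine
// pair can read past the value (or block) because nothing marks where it ends.
public class UtfRoundTrip {
    public static String roundTrip(String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(s);   // length-prefixed write, as in a Writable's write()
        out.flush();
        DataInputStream in = new DataInputStream(
            new ByteArrayInputStream(bytes.toByteArray()));
        return in.readUTF();   // reads exactly one value, as in readFields()
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello world")); // hello world
    }
}
```

In the StringValueWritable above, replacing readLine/writeBytes with readUTF/writeUTF in readFields and write follows this same pattern.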
RE: Running ConnectedComponents in a cluster.
I'd like to know how the directed graph is converted to an undirected graph. Do we just create an edge in the other direction for every existing edge, or is there some other way?

On Apr 17, 2014 9:32 PM, Yu, Jaewook jaewook...@intel.com wrote:

Ghufran,

It looks like the graph loading is failing, from your log:

    14/04/17 16:12:31 INFO job.JobProgressTracker: Data from 3 workers - Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded; min free memory on worker 2 - 141.96MB, average 142.66MB

If you have access to the JobTracker web interface (port 50030), or you know where the log files are located, take a look at the log for this failing job. That would be a good starting point to debug the issue.

Thanks,
Jae

*From:* ghufran malik [mailto:ghufran1ma...@gmail.com]
*Sent:* Thursday, April 17, 2014 8:28 AM
*To:* user@giraph.apache.org
*Subject:* Re: Running ConnectedComponents in a cluster.

I would appreciate it if you could lend me your assistance with another problem of mine. I have an implementation of the TriangleCounting algorithm that runs correctly on the smaller dataset I used to test ConnectedComponents, but fails when computing this larger dataset. The map seems to fail and I do not know why. Full output is below.
    14/04/17 16:12:30 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer ricotta.eecs.qmul.ac.uk:2181 --zkNode /_hadoopBsp/job_1381849812331_2770/_haltComputation'
    14/04/17 16:12:31 INFO mapreduce.Job: Running job: job_1381849812331_2770
    14/04/17 16:12:31 INFO job.JobProgressTracker: Data from 3 workers - Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded; min free memory on worker 2 - 141.96MB, average 142.66MB
    14/04/17 16:12:32 INFO mapreduce.Job: Job job_1381849812331_2770 running in uber mode : false
    14/04/17 16:12:32 INFO mapreduce.Job: map 100% reduce 0%
    14/04/17 16:12:36 INFO job.JobProgressTracker: Data from 3 workers - Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges loaded, 0 edge input splits loaded; min free memory on worker 2 - 141.96MB, average 142.66MB
    14/04/17 16:12:41 INFO job.JobProgressTracker: Data from 1 workers - Compute superstep 1: 0 out of 378222 vertices computed; 0 out of 3 partitions computed; min free memory on worker 2 - 24.77MB, average 103.6MB
    14/04/17 16:12:46 INFO job.JobProgressTracker: Data from 3 workers - Compute superstep 1: 0 out of 1134723 vertices computed; 0 out of 9 partitions computed; min free memory on worker 1 - 22.5MB, average 23.36MB
    14/04/17 16:12:48 INFO mapreduce.Job: Job job_1381849812331_2770 failed with state FAILED due to: Task failed task_1381849812331_2770_m_02
    Job failed as tasks failed. failedMaps:1 failedReduces:0
    14/04/17 16:12:48 INFO mapreduce.Job: Counters: 46
        File System Counters
            FILE: Number of bytes read=0
            FILE: Number of bytes written=143668
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=37028489
            HDFS: Number of bytes written=0
            HDFS: Number of read operations=3
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=0
        Job Counters
            Failed map tasks=1
            Launched map tasks=3
            Other local map tasks=3
            Total time spent by all maps in occupied slots (ms)=24219
            Total time spent by all reduces in occupied slots (ms)=0
        Map-Reduce Framework
            Map input records=2
            Map output records=0
            Input split bytes=88
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=22209
            CPU time spent (ms)=77200
            Physical memory (bytes) snapshot=659660800
            Virtual memory (bytes) snapshot=1657229312
            Total committed heap usage (bytes)=372899840
        Giraph Stats
            Aggregate edges=0
            Aggregate finished vertices=0
            Aggregate sent message message bytes=0
            Aggregate sent messages=0
            Aggregate vertices=0
            Current master task partition=0
            Current workers=0
            Last checkpointed superstep=0
            Sent message bytes=0
            Sent messages=0
            Superstep=0
        Giraph Timers
            Initialize (ms)=0
            Setup (ms)=0
            Shutdown (ms)=0
            Total (ms)=0
        Zookeeper base path /_hadoopBsp/job_1381849812331_2770=0
        Zookeeper halt node /_hadoopBsp/job_1381849812331_2770/_haltComputation=0
        Zookeeper server:port ricotta.eecs.qmul.ac.uk:2181=0
        File Input Format Counters
            Bytes Read=0
        File Output Format Counters
            Bytes Written=0

Thanks,
Ghufran

On Thu, Apr
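On the undirected-conversion question at the top of this thread: I can't say what Giraph's example does internally, but the usual way to treat a directed edge list as undirected is exactly as asked — emit the reverse of every edge and deduplicate. A toy sketch in plain Java (not Giraph API; edges are encoded as "u->v" strings just for brevity):

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of symmetrizing a directed edge list: for every edge (u, v),
// also add (v, u). The Set deduplicates edges whose reverse already exists.
public class Symmetrize {
    public static Set<String> undirected(int[][] edges) {
        Set<String> out = new HashSet<String>();
        for (int[] e : edges) {
            out.add(e[0] + "->" + e[1]);
            out.add(e[1] + "->" + e[0]); // reverse edge
        }
        return out;
    }

    public static void main(String[] args) {
        // {2,1} is the reverse of {1,2}, so it adds nothing new.
        Set<String> g = undirected(new int[][] {{1, 2}, {2, 1}, {2, 3}});
        System.out.println(g.size()); // 4: 1->2, 2->1, 2->3, 3->2
    }
}
```

In a vertex-centric setting the same idea is often expressed per vertex: on loading, each vertex adds an out-edge back to every source that points at it.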
Re: Giraph Hadoop 2.2.0 (YARN)
I recently compiled Giraph (trunk) with Hadoop 2.2.0. The command I used is:

    mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean compile

It compiled fine.

On Thu, Mar 27, 2014 at 10:49 PM, chadi jaber chadijaber...@hotmail.com wrote:

Hello,

I have tried several commands to compile Giraph for Hadoop 2.2.0, without success. I am wondering if Giraph can work with Hadoop 2.2.0, and if so, how it can be compiled (which Maven command to use, and whether any pom modification is needed).

Best Regards,
Chadi
OUT OF CORE options creating problems
Hi,

Is something wrong with the out-of-core options in Giraph? I wrote a program which does nothing: it just saves the graph info [number of edges] to a file. When using the useOutOfCoreGraph option, the tasks time out while saving the vertices, but without the option, saving works fine. From my experience with Giraph over the last few weeks, I think a similar problem exists with the useOutOfCoreMessages option as well. Has anyone else faced the same or a similar problem?

--
Thanks,
Sundar
StackOverflowError
Hi,

I'm trying to run a Giraph job on a graph with 60,000 nodes and 20,000,000 edges [highly connected]. The cluster consists of 4 nodes, each with 17 GB RAM. I'm running only one worker per node, plus a master [so that'll be 1 master and 3 workers]. I keep getting java.lang.StackOverflowError in one of the workers. Here is the log:

    2013-12-02 12:14:48,698 WARN org.apache.hadoop.mapred.Child: Error running child
    java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@928dc74
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@928dc74
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151)
        at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111)
        at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73)
        at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192)
        at org.apache.giraph.graph.GraphTaskManager.processGraphPartitions(GraphTaskManager.java:753)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:273)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92)
        ... 7 more
    Caused by: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:262)
        at java.util.concurrent.FutureTask.get(FutureTask.java:119)
        at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271)
        at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143)
        ... 13 more
    Caused by: java.lang.StackOverflowError
        at com.google.common.collect.Iterables$3.hasNext(Iterables.java:504)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:543)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:543)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:543)
        [goes on for about 1000 times]
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:543)
    2013-12-02 12:14:48,740 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Any help appreciated.

--
Thanks,
Sundar
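A guess at the mechanism behind that trace: the ~1000 repeated Iterators$5.hasNext frames look like deeply chained concatenated iterators, where each wrapper's hasNext() delegates to the layer below it, so the call depth grows with the number of layers. A toy sketch in plain Java (not Guava or Giraph code; all names here are mine) of how such a delegation chain builds up:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Each call to chain() wraps the previous iterator in another delegating
// layer, so a single hasNext() call recurses once per layer. With enough
// layers that delegation alone can overflow the stack, which matches the
// repeated hasNext frames in the StackOverflowError trace above.
public class IteratorChains {
    static <T> Iterator<T> chain(final Iterator<T> inner) {
        return new Iterator<T>() {
            public boolean hasNext() { return inner.hasNext(); } // one frame per layer
            public T next() { return inner.next(); }
            public void remove() { inner.remove(); }
        };
    }

    static int countThroughChain(int layers) {
        List<Integer> data = new ArrayList<Integer>();
        data.add(1);
        data.add(2);
        Iterator<Integer> it = data.iterator();
        for (int i = 0; i < layers; i++) {
            it = chain(it); // builds a `layers`-deep delegation chain
        }
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Only 2 elements come out, but every hasNext() call is ~1000 frames deep.
        System.out.println(countThroughChain(1000));
    }
}
```

If something like this is the cause, raising the worker JVM's thread stack size (the -Xss option) is a workaround, though collecting the elements into a single flat structure avoids the deep chain altogether.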