Re: Moving Files to Distributed Cache in MapReduce
On Jul 29, 2011, at 1:18 PM, Mapred Learn wrote:
I hope my previous reply helps...

On Fri, Jul 29, 2011 at 11:11 AM, Roger Chen <rogc...@ucdavis.edu> wrote:
After moving it to the distributed cache, how would I call it within my MapReduce program?

On Fri, Jul 29, 2011 at 11:09 AM, Mapred Learn <mapred.le...@gmail.com> wrote:
Did you try using the -files option in your "hadoop jar" command, as in:

/usr/bin/hadoop jar <jar name> <main class name> -files <absolute path of file to be added to distributed cache> <input dir> <output dir>

On Fri, Jul 29, 2011 at 11:05 AM, Roger Chen <rogc...@ucdavis.edu> wrote:
Slight modification: I now know how to add files to the distributed cache, which can be done via this call placed in the main or run method:

DistributedCache.addCacheFile(new URI("/user/hadoop/thefile.dat"), conf);

However, I am still having trouble locating the file in the distributed cache. *How do I get the path of thefile.dat in the distributed cache as a string?* I am using Hadoop 0.20.2.

On Fri, Jul 29, 2011 at 10:26 AM, Roger Chen <rogc...@ucdavis.edu> wrote:
Hi all,
Does anybody have examples of how one moves files from the local file structure/HDFS to the distributed cache in MapReduce? A Google search turned up examples in Pig but not MR.

--
Roger Chen
UC Davis Genome Center
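One way to answer the open question above, based on the Hadoop 0.20 DistributedCache API: files registered with addCacheFile() are copied to each task's local disk, and getLocalCacheFiles() returns their local paths from within a task. A minimal sketch, assuming the new (org.apache.hadoop.mapreduce) API; the class and field names here are hypothetical:

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheExample {
    public static class CacheMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        private String cachedPath;  // local path of thefile.dat, as a String

        @Override
        protected void setup(Context context) throws IOException {
            // Files added with addCacheFile() in the driver are localized
            // before the task starts; this returns their on-disk paths.
            Path[] cached = DistributedCache.getLocalCacheFiles(
                    context.getConfiguration());
            if (cached != null && cached.length > 0) {
                cachedPath = cached[0].toString();
            }
        }
    }

    // Driver side, before submitting the job:
    //   Configuration conf = new Configuration();
    //   DistributedCache.addCacheFile(new URI("/user/hadoop/thefile.dat"), conf);
    //   Job job = new Job(conf);
}
```

Note that addCacheFile() must be called on the Configuration before the Job is constructed, as in Roger's snippet; adding it afterwards has no effect on the submitted job.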
Re: Please help with hadoop configuration parameter set and get
I did something like this using a global static boolean variable (flag) while I was implementing breadth-first IDA*. In my case, I set the flag to something else if a solution was found, which was examined in the reducer. I guess your case is similar, since if the mappers don't produce anything, the reducers won't have anything as input, if I am not wrong. And I had chained map-reduce jobs ( http://developer.yahoo.com/hadoop/tutorial/module4.html ) running until a solution was found.

Kind regards,
Arindam Khaled

On Dec 17, 2010, at 12:58 AM, Peng, Wei wrote:
Hi,
I am a newbie to Hadoop. Today I was struggling with a Hadoop problem for several hours.

I initialize a parameter by setting the job configuration in main, e.g.:

Configuration con = new Configuration();
con.set("test", "1");
Job job = new Job(con);

Then in the mapper class, I want to set "test" to 2. I did it by:

context.getConfiguration().set("test", "2");

Finally in the main method, after the job is finished, I check "test" again by:

job.getConfiguration().get("test");

However, the value of "test" is still 1.

The reason why I want to change the parameter inside the Mapper class is that I want to determine when to stop an iteration in the main method. For example, when doing breadth-first search, the iteration should stop when no new nodes are added for further expansion.

Your help will be deeply appreciated. Thank you.
Wei
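Some context on why Wei's approach cannot work: each map task runs in its own JVM with a copy of the job configuration, so neither Configuration edits nor static variables set inside a task propagate back to the driver. The channel the framework does aggregate back to the client is a counter. A minimal sketch of a counter-driven termination loop, assuming the Hadoop 0.20 (org.apache.hadoop.mapreduce) API; the class and counter names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class IterativeDriver {
    // Hypothetical counter; tasks increment it, the driver reads it.
    public static enum SearchCounters { NODES_EXPANDED }

    public static void main(String[] args) throws Exception {
        long expanded;
        do {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "bfs-iteration");
            // ... set mapper/reducer/input/output for this iteration ...
            job.waitForCompletion(true);
            // Counters from all tasks are summed and shipped back to
            // the client, unlike statics or in-task Configuration edits.
            Counters counters = job.getCounters();
            expanded = counters
                    .findCounter(SearchCounters.NODES_EXPANDED).getValue();
        } while (expanded > 0);  // no new nodes expanded: stop iterating
    }
}
// In the mapper, each expansion does:
//   context.getCounter(IterativeDriver.SearchCounters.NODES_EXPANDED)
//          .increment(1);
```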
Re: Please help with hadoop configuration parameter set and get
Wei,

I implemented one of the algorithms, BFIDA*, included in the paper "Out-of-Core Parallel Frontier Search with MapReduce" by Alexander Reinefeld and Thorsten Schütt. FYI, they implemented BFS for the 15-puzzle using map-reduce and MPI, not Hadoop.

For brevity, I am not including the whole source code. Here is my pseudo-code:

public class standAloneIDA {
    static boolean solved = false;

    public static class BFIDAMapClass
            extends Mapper<Object, Text, Text, IntWritable> {
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            while (there can be more new moves) {
                emit(the new board, move);  // key, value pair
            }
        }
    }

    public static class BFIDAReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            String line = "";
            ArrayList<Integer> previousMoves = new ArrayList<Integer>();
            for (IntWritable val : values) {
                String valInt = val.toString();
                int[] moves = stringToArrayInt(valInt);
                for (int bit : moves) {
                    if (!previousMoves.contains(bit))
                        previousMoves.add(bit);
                }
            }
            for (int val : previousMoves) {
                line = line + Integer.toString(val);
            }
            if (isSolved(key.toString(), size)) {
                solved = true;
            }
            result.set(Integer.parseInt(line));
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        // logger.isDebugEnabled();
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: testUnit2 <in> <out>");
            System.exit(2);
        }
        while (!solved) {
            set up the file system; include the input and output filenames
            Job job = new Job(conf, "some name");
            job.setJarByClass(standAloneIDA.class);
            job.setMapperClass(BFIDAMapClass.class);
            job.setReducerClass(BFIDAReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            job.waitForCompletion(true);
        }
    }
}

Please excuse me if there are missing braces. There might be more efficient ways to set up the jobs and file system. I didn't have much time -- so, I ended up with something that worked for me then. Let me know if you have more questions.

Kind regards,
Arindam Khaled

On Dec 17, 2010, at 10:31 AM, Peng, Wei wrote:
Arindam,
How do you set this global static Boolean variable? I tried to do something similar yesterday, along these lines:

public class BFSearch {
    private static boolean expansion;
    public static class MapperClass { if no nodes, expansion = false; }
    public static class ReducerClass
    public static void main { expansion = true; run job; print(expansion); }
}

In this case, expansion is still true. I will look at the Hadoop counter and report back here later.

Thank you for all your help.
Wei

-----Original Message-----
From: Arindam Khaled [mailto:akha...@utdallas.edu]
Sent: Friday, December 17, 2010 10:35 AM
To: common-user@hadoop.apache.org
Subject: Re: Please help with hadoop configuration parameter set and get

I did something like this using a global static boolean variable (flag) while I was implementing breadth first IDA*. In my case, I set the flag to something else if a solution was found, which was examined in the reducer. I guess in your case, since you know [...]
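The elided "set up the file system" step in the pseudo-code usually amounts to giving each iteration its own input and output paths, with each pass reading the frontier the previous pass wrote. A sketch of that plumbing, assuming the same 0.20 API (the path layout is hypothetical, and the static solved flag it relies on only works in a single-JVM standalone run):

```java
// Inside main(), in place of "set up the file system".
Path input = new Path(otherArgs[0]);
int iteration = 0;
while (!solved) {
    Path output = new Path(otherArgs[1] + "/iter-" + iteration);
    Job job = new Job(conf, "bfida-iter-" + iteration);
    job.setJarByClass(standAloneIDA.class);
    // ... mapper/reducer/output-class setup as above ...
    FileInputFormat.addInputPath(job, input);
    FileOutputFormat.setOutputPath(job, output);  // must not already exist
    job.waitForCompletion(true);
    input = output;  // next iteration consumes this pass's output
    iteration++;
}
```

FileInputFormat and FileOutputFormat here are the org.apache.hadoop.mapreduce.lib.input and .output classes; the output directory has to be fresh each pass because Hadoop refuses to overwrite an existing one.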
Re: Please help with hadoop configuration parameter set and get
There's no guarantee that it will work in a distributed environment, as a couple of developers suggested. I didn't have the time to play with it in distributed mode; rather, I executed it in a standalone environment.

Arindam

On Dec 17, 2010, at 12:07 PM, Arindam Khaled wrote:
Wei, I implemented one of the algorithms, BFIDA*, included in the paper "Out-of-Core Parallel Frontier Search with MapReduce" by Alexander Reinefeld and Thorsten Schütt. [...]
wrong value class error
Hello,

I am new to Hadoop. I am getting the following error in my reducer:

10/11/15 15:29:11 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.IntWritable

Here is my reduce class:

public static class BFIDAReducer
        extends Reducer<Text, IntWritable, Text, Text> {
    private Text result = new Text();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        Text result = new Text();
        GameFunctions gf = GameFunctions.getInstance();
        String line = "";
        for (IntWritable val : values) {
            line = line + val.toString() + ",";
        }
        if (line.length() > 1)
            line = (String) line.subSequence(0, line.length() - 1);
        if (gf.isSolved(key.toString(), size))
            solved = true;
        result.set(line);
        context.write(key, result);
    }
}

And here is my partial code from the job configuration:

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);

Can anyone help me? Thanks in advance.

Arindam
Re: wrong value class error
This website has answered my question somewhat: http://blog.pfa-labs.com/2010/01/first-stab-at-hadoop-and-map-reduce.html

When I comment out the combiner class, it seems to work fine. Thanks.

----- Original Message -----
From: Arindam Khaled <akha...@utdallas.edu>
To: common-user@hadoop.apache.org
Sent: Monday, November 15, 2010 6:05:58 PM
Subject: wrong value class error

Hello, I am new to Hadoop and I think I'm doing something silly. I sent this e-mail from another account which isn't registered to the hadoop user group. I am getting the following error in my reducer:

10/11/15 15:29:11 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.IntWritable [...]

Can anyone help me? I know I'll have more questions in the near future. Thanks in advance.

Arindam
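Why commenting out the combiner fixes it: the combiner runs on map output and its own output is what reaches the reducer, so a combiner must consume and re-emit the declared map output types, here (Text, IntWritable). The reducer above emits (Text, Text), so reusing it via setCombinerClass() hands the framework Text values where IntWritable is declared, which is exactly the "wrong value class" IOException. A sketch of a type-correct combiner for this job, under the assumption that duplicate move values can safely be dropped early (the class name is hypothetical):

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Consumes and re-emits the map output types (Text, IntWritable),
// unlike BFIDAReducer, which changes the value type to Text.
public class BFIDACombiner
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        Set<Integer> seen = new HashSet<Integer>();
        for (IntWritable val : values) {
            // Forward each distinct value; the value type stays IntWritable.
            if (seen.add(val.get())) {
                context.write(key, val);
            }
        }
    }
}
// Driver: job.setCombinerClass(BFIDACombiner.class);  // types now match
```

Since Hadoop may run a combiner zero, one, or several times per key, it must only perform transformations the reducer is indifferent to; simply omitting the combiner, as above, is the safest fix when the reduce logic is not associative.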