setInt getInt
I have no problem with the old org.apache.hadoop.mapred API, using JobConf to setInt integers and pass them to my maps for getInt, as shown in the first program below. However, when I use the org.apache.hadoop.mapreduce API, using Configuration to setInt, these values are invisible to my map's getInt calls. Please tell me what I am doing wrong. Thanks. Both programs expect to see a file with a line or two of text in a directory named testIn.

Alan Ratner

This program uses JobConf and setInt/getInt and works fine. It outputs:

  number = 12345   (from map)

package cbTest;

import java.io.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class ConfTest extends Configured implements Tool {

  @SuppressWarnings("deprecation")
  public static class MapClass extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public static int number;

    public void configure(JobConf job) {
      number = job.getInt("number", -999);
    }

    public void map(LongWritable key, Text t,
        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      System.out.println("number = " + number);
    }
  }

  @SuppressWarnings("deprecation")
  public int run(String[] args) throws Exception {
    Path InputDirectory = new Path("testIn");
    Path OutputDirectory = new Path("testOut");
    System.out.println("Running ConfTest Program");
    JobConf conf = new JobConf(getConf(), ConfTest.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);
    conf.setMapperClass(MapClass.class);
    conf.setInt("number", 12345);
    FileInputFormat.setInputPaths(conf, InputDirectory);
    FileOutputFormat.setOutputPath(conf, OutputDirectory);
    FileSystem fs = OutputDirectory.getFileSystem(conf);
    fs.delete(OutputDirectory, true); // remove output of prior run
    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new ConfTest(), args);
    System.exit(res);
  }
}

This program uses Configuration and setInt/getInt, but getInt works in neither map() nor configure(). It outputs:

  Passing integer 12345 in configuration   (from run)
  map numbers: -999 -1   (from map, as intMapConf and intConfConf)

package cbTest;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Conf2Test extends Configured implements Tool {

  public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
    public int intConfConf = -1;

    public void configure(Configuration job) {
      intConfConf = job.getInt("number", -2);
    }

    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      int intMapConf = context.getConfiguration().getInt("number", -999);
      System.out.println("map numbers: " + intMapConf + " " + intConfConf);
    }
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new Conf2Test(), args);
    System.exit(res);
  }

  public int run(String[] arg0) throws Exception {
    Path Input_Directory = new Path("testIn");
    Path Output_Directory = new Path("testOut");
    Configuration conf = new Configuration();
    Job job = new Job(conf, Conf2Test.class.getSimpleName());
    job.setJarByClass(Conf2Test.class);
    job.setMapperClass(MapClass.class);
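For what it's worth, two common causes of this symptom (not necessarily yours, since the end of run() is cut off above): first, org.apache.hadoop.mapreduce.Job takes a private copy of the Configuration handed to its constructor, so a setInt() made on the original Configuration after new Job(conf, ...) is invisible to the tasks; second, the new-API Mapper has no configure() hook - the framework calls setup(Context) instead, so a method merely named configure is never invoked (which would explain why intConfConf stays at -1 rather than becoming -2). The copy behaviour can be sketched with a plain map standing in for Configuration; FakeJob and all names below are hypothetical stand-ins, not Hadoop classes:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopyDemo {

    // Stand-in for org.apache.hadoop.mapreduce.Job, which takes a private
    // copy of the Configuration handed to its constructor.
    static class FakeJob {
        private final Map<String, String> conf;

        FakeJob(Map<String, String> conf) {
            this.conf = new HashMap<>(conf); // defensive copy, like new Job(conf, name)
        }

        int getInt(String key, int defaultValue) {
            String v = conf.get(key);
            return v == null ? defaultValue : Integer.parseInt(v);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        FakeJob tooLate = new FakeJob(conf);  // copy taken here
        conf.put("number", "12345");          // set AFTER construction: invisible
        System.out.println(tooLate.getInt("number", -999)); // -999

        Map<String, String> conf2 = new HashMap<>();
        conf2.put("number", "12345");         // set BEFORE construction
        FakeJob inTime = new FakeJob(conf2);
        System.out.println(inTime.getInt("number", -999)); // 12345
    }
}
```

If this is what is happening, the fix is to finish all conf.setInt(...) calls before constructing the Job (or call job.getConfiguration().setInt(...) on the job's own copy).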
Version Mismatch
We have a version mismatch problem which may be Hadoop related, but may be due to a third-party product we are using that requires us to run Zookeeper and Hadoop. This product is rumored to soon be an Apache incubator project. As I am not sure what I can reveal about this third-party program prior to its release to Apache, I will refer to it as XXX.

We are running Hadoop 0.20.203.0 and have no problems running Hadoop itself: it runs our Hadoop programs and our hadoop fs commands without any version mismatch complaints, and localhost:50030 and 50070 both report we are running 0.20.203.0, r1099333. But when we try to initialize XXX we get:

org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client = 60, server = 61)

The developers of XXX tell me that this error is coming from HDFS and is unrelated to their program. (XXX does not include any Hadoop or Zookeeper jar files - as HBase does - but simply grabs these from HADOOP_HOME, which points to our 0.20.203.0 installation, and ZOOKEEPER_HOME.)

1. What exactly does client = 60 mean? Which Hadoop version is this referring to?
2. What exactly does server = 61 mean? Which Hadoop version is this referring to?
3. Any ideas on whether this is a problem with my Hadoop configuration or whether this is a problem with XXX?

17 15:20:56,564 [security.Groups] INFO : Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
17 15:20:56,704 [conf.Configuration] WARN : mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
17 15:20:56,771 [util.Initialize] FATAL: org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client = 60, server = 61)
org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client = 60, server = 61)
    at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:231)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:224)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:156)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:255)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:222)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1734)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:74)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1768)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1750)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:234)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:131)

Alan
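On questions 1 and 2: as far as I know, the numbers are not Hadoop release numbers but the versionID constants compiled into the ClientProtocol interface on each side of the RPC connection - the jar XXX loaded was built with versionID 60, while the namenode it contacted reports 61. That would point to some hadoop jar on XXX's classpath differing from the 0.20.203.0 jars the cluster runs. A rough sketch of the check (simplified; the real logic lives in Hadoop's RPC proxy code, and the class names here are stand-ins):

```java
public class RpcVersionCheckDemo {

    // Thrown when the two sides disagree, analogous to
    // org.apache.hadoop.ipc.RPC$VersionMismatch.
    static class VersionMismatch extends RuntimeException {
        VersionMismatch(String protocol, long client, long server) {
            super("Protocol " + protocol + " version mismatch. (client = "
                  + client + ", server = " + server + ")");
        }
    }

    // Each protocol interface carries a static versionID constant; the client
    // announces its compiled-in value and it is compared with the server's.
    static void checkVersion(String protocol, long clientVersion, long serverVersion) {
        if (clientVersion != serverVersion) {
            throw new VersionMismatch(protocol, clientVersion, serverVersion);
        }
    }

    public static void main(String[] args) {
        // Matching constants: proxy creation proceeds.
        checkVersion("org.apache.hadoop.hdfs.protocol.ClientProtocol", 61, 61);
        // Mismatched constants: the error from the log above.
        try {
            checkVersion("org.apache.hadoop.hdfs.protocol.ClientProtocol", 60, 61);
        } catch (VersionMismatch e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, checking which hadoop-core jar actually ends up on XXX's classpath (versus the one in HADOOP_HOME) would be the first thing to try.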
RE: Problem running a Hadoop program with external libraries
Thanks to all who suggested solutions to our problem of running a Java MR job using both external Java and C++ libraries. We got it to work by moving all our .so files into an archive (http://hadoop.apache.org/mapreduce/docs/r0.21.0/hadoop_archives.html) and publishing it to our MR app with a single statement: DistributedCache.createSymlink(conf). We found that we had to use Eclipse to generate a runnable jar file in extract mode; running an ordinary jar did not work. (We tried putting our external jars in the archive file but a plain jar still did not work - perhaps I haven't assembled the complete set of jars into the archive.)

We had tried putting all the libraries directly in HDFS with a pointer in mapred-site.xml:

<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/user/ngc/lib</value>
</property>

as described in https://issues.apache.org/jira/browse/HADOOP-2838, but this did not work for us.

The bottom line of all this is that we managed to write a Hadoop job in Java that invokes the OpenCV (Open Computer Vision) C++ libraries (http://opencv.willowgarage.com/wiki/) using the JavaCV Java wrapper (http://code.google.com/p/javacv/). OpenCV includes over 500 image processing algorithms.

-Original Message-
From: Ratner, Alan S (IS) [mailto:alan.rat...@ngc.com]
Sent: Friday, March 04, 2011 3:53 PM
To: common-user@hadoop.apache.org
Subject: EXT :Problem running a Hadoop program with external libraries

[quoted message trimmed]
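What DistributedCache.createSymlink(conf) arranges, roughly, is that each cached file or archive is exposed as a symlink in the task's working directory, so task code can refer to it by bare name. The effect can be sketched with plain java.nio (the paths and the helper name here are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkDemo {

    // Roughly what the framework does on each task node once createSymlink
    // is enabled: link the distributed file into the task's working directory.
    public static Path linkIntoWorkingDir(Path cachedFile, Path workDir) throws IOException {
        Path link = workDir.resolve(cachedFile.getFileName());
        Files.createSymbolicLink(link, cachedFile); // task code can now open ./<name>
        return link;
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("work");          // stand-in working dir
        Path cached = Files.createTempFile("libdemo", ".so");      // stand-in cached .so
        Path link = linkIntoWorkingDir(cached, workDir);
        System.out.println(Files.isSymbolicLink(link));            // true
    }
}
```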
RE: Problem running a Hadoop program with external libraries
One other thing: we were getting out-of-memory errors with these external libraries, and we had to reduce the value of mapred.child.java.opts in mapred-site.xml. We had originally been using 2 GB (our servers have 24-48 GB RAM) and eliminated the out-of-memory errors by reducing this value to 1.28 GB.

Alan

-Original Message-
From: Ratner, Alan S (IS) [mailto:alan.rat...@ngc.com]
Sent: Tuesday, March 08, 2011 4:22 PM
To: common-user@hadoop.apache.org
Cc: Gerlach, Hannah L (IS); Andrew Levine
Subject: EXT :RE: Problem running a Hadoop program with external libraries

[quoted message trimmed]

-Original Message-
From: Ratner, Alan S (IS) [mailto:alan.rat...@ngc.com]
Sent: Friday, March 04, 2011 3:53 PM
To: common-user@hadoop.apache.org
Subject: EXT :Problem running a Hadoop program with external libraries

[quoted message trimmed]
Problem running a Hadoop program with external libraries
We are having difficulties running a Hadoop program making calls to external libraries - but this occurs only when we run the program on our cluster, and not from within Eclipse, where we are apparently running in Hadoop's standalone mode. This program invokes the Open Computer Vision libraries (OpenCV and JavaCV). (I don't think there is a problem with our cluster - we've run many Hadoop jobs on it without difficulty.)

1. I normally use Eclipse to create jar files for our Hadoop programs, but I inadvertently hit the "run as Java application" button and the program ran fine, reading the input file from the Eclipse workspace rather than HDFS and writing the output file to the same place. Hadoop's output appears below. (This occurred on the master Hadoop server.)

2. I then exported from Eclipse a runnable jar which extracted required libraries into the generated jar - presumably producing a jar file that incorporated all the required library functions. (The plain jar file for this program is 17 kB while the runnable jar is 30 MB.) When I try to run this on my Hadoop cluster (including my master and slave servers) the program reports that it is unable to locate libopencv_highgui.so.2.2: cannot open shared object file: No such file or directory. Now, in addition to this library being incorporated inside the runnable jar file, it is also present on each of my servers at hadoop-0.21.0/lib/native/Linux-amd64-64/, where we have loaded the same libraries (to give Hadoop 2 shots at finding them). These include:

...
libopencv_highgui_pch_dephelp.a
libopencv_highgui.so
libopencv_highgui.so.2.2
libopencv_highgui.so.2.2.0
...

When I poke around inside the runnable jar I find javacv_linux-x86_64.jar, which contains com/googlecode/javacv/cpp/linux-x86_64/libjniopencv_highgui.so.

3. I then tried adding the following to mapred-site.xml, as suggested in HADOOP-2838 (https://issues.apache.org/jira/browse/HADOOP-2838), a patch that's supposed to be included in Hadoop 0.21:

<property>
  <name>mapred.child.env</name>
  <value>LD_LIBRARY_PATH=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64</value>
</property>

The log is included at the bottom of this email, with Hadoop now complaining about a different missing library with an out-of-memory error. Does anyone have any ideas as to what is going wrong here? Any help would be appreciated. Thanks.

Alan

BTW: Each of our servers has 4 hard drives, and many of the errors below refer to the 3 drives (/media/hd2 or hd3 or hd4) reserved exclusively for HDFS and thus perhaps not a good place for Hadoop to be looking for a library file. My slaves have 24 GB RAM, the jar file is 30 MB, and the sequence file being read is 400 KB - so I hope I am not running out of memory.

1. RUNNING DIRECTLY FROM ECLIPSE IN HADOOP'S STANDALONE MODE - SUCCESS

Running Face Program
11/03/04 12:44:10 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
11/03/04 12:44:10 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/03/04 12:44:10 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/03/04 12:44:10 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
11/03/04 12:44:10 INFO mapred.FileInputFormat: Total input paths to process : 1
11/03/04 12:44:10 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/03/04 12:44:10 INFO mapreduce.JobSubmitter: number of splits:1
11/03/04 12:44:10 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/03/04 12:44:10 WARN security.TokenCache: Overwriting existing token storage with # keys=0
11/03/04 12:44:10 INFO mapreduce.Job: Running job: job_local_0001
11/03/04 12:44:10 INFO mapred.LocalJobRunner: Waiting for map tasks
11/03/04 12:44:10 INFO mapred.LocalJobRunner: Starting task: attempt_local_0001_m_00_0
11/03/04 12:44:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11/03/04 12:44:10 INFO compress.CodecPool: Got brand-new decompressor
11/03/04 12:44:10 INFO mapred.MapTask: numReduceTasks: 1
11/03/04 12:44:10 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
11/03/04 12:44:10 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
11/03/04 12:44:10 INFO mapred.MapTask: soft limit at 83886080
11/03/04 12:44:10 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
11/03/04 12:44:10 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
11/03/04 12:44:11 INFO mapreduce.Job: map 0% reduce 0%
11/03/04 12:44:16 INFO mapred.LocalJobRunner: file:/home/ngc/eclipse_workspace/HadoopPrograms/Images2/JPGSequenceFile.001:0+411569 map
11/03/04 12:44:17 INFO
RE: EXT :Re: Problem running a Hadoop program with external libraries
Aaron,

Thanks for the rapid responses.

* ulimit -u unlimited is in .bashrc.
* HADOOP_HEAPSIZE is set to 4000 MB in hadoop-env.sh.
* mapred.child.ulimit is set to 2048000 in mapred-site.xml.
* mapred.child.java.opts is set to -Xmx1536m in mapred-site.xml.

I take it you are suggesting that I change the java.opts value to:

<value>-Xmx1536m -Djava.library.path=/path/to/native/libs</value>

Alan Ratner
Northrop Grumman Information Systems
Manager of Large-Scale Computing
9020 Junction Drive
Annapolis Junction, MD 20701
(410) 707-8605 (cell)

From: Aaron Kimball [mailto:akimbal...@gmail.com]
Sent: Friday, March 04, 2011 4:30 PM
To: common-user@hadoop.apache.org
Cc: Ratner, Alan S (IS)
Subject: EXT :Re: Problem running a Hadoop program with external libraries

Actually, I just misread your email and missed the difference between your 2nd and 3rd attempts. Are you enforcing min/max JVM heap sizes on your tasks? Are you enforcing a ulimit (either through your shell configuration, or through Hadoop itself)? I don't know where these "cannot allocate memory" errors are coming from. If they're from the OS, could it be because it needs to fork() and momentarily exceed the ulimit before loading the native libs?
- Aaron

On Fri, Mar 4, 2011 at 1:26 PM, Aaron Kimball <akimbal...@gmail.com> wrote:
I don't know if putting native-code .so files inside a jar works. A native-code .so is not classloaded in the same way .class files are. So the correct .so files probably need to exist in some physical directory on the worker machines. You may want to doublecheck that the correct directory on the worker machines is identified in the JVM property 'java.library.path' (instead of / in addition to $LD_LIBRARY_PATH). This can be manipulated in the Hadoop configuration setting mapred.child.java.opts (include '-Djava.library.path=/path/to/native/libs' in the string there.)
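For reference, the combined mapred-site.xml entry along the lines Aaron suggests might look like this (the library path shown is the one used elsewhere in this thread; substitute your own):

```xml
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1536m -Djava.library.path=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64</value>
</property>
```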
Also, if you added your .so files to a directory that is already used by the tasktracker (like hadoop-0.21.0/lib/native/Linux-amd64-64/), you may need to restart the tasktracker instance for it to take effect. (This is true of .jar files in the $HADOOP_HOME/lib directory; I don't know if it is true for native libs as well.)
- Aaron

On Fri, Mar 4, 2011 at 12:53 PM, Ratner, Alan S (IS) <alan.rat...@ngc.com> wrote:
[quoted message trimmed]
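Aaron's point about java.library.path can be checked from plain Java: System.loadLibrary consults that property (it cannot load a .so from inside a jar), and System.mapLibraryName shows the exact file name it will search for. Note that any libraries the JNI library itself links against (such as libopencv_highgui.so.2.2) are resolved separately by the OS dynamic linker, which is where LD_LIBRARY_PATH comes in.

```java
public class NativeLookupDemo {
    public static void main(String[] args) {
        // Directories the JVM will search when System.loadLibrary is called:
        System.out.println(System.getProperty("java.library.path"));

        // The platform-specific file name System.loadLibrary("opencv_highgui")
        // would look for in those directories (libopencv_highgui.so on Linux):
        System.out.println(System.mapLibraryName("opencv_highgui"));
    }
}
```

Printing these two values from a small map task is a quick way to see what the child JVMs on the worker nodes actually search.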
Running C++ WordCount
I am trying to run the wordcount example in C/C++ given on http://wiki.apache.org/hadoop/C%2B%2BWordCount with Hadoop 0.18.3, but the instructions seem to make a number of assumptions. When I run Ant using the specified command ant -Dcompile.c++=yes examples I get a BUILD FAILED error:

... Cannot run program "c:\...\hadoop-0.18.3\src\c++\pipes\configure" (in directory "...\hadoop-0.18.3\build\c++-build\Windows_XP-x86-32\pipes"): CreateProcess error=193, %1 is not a valid Win32 application

Question 1: Where in the directory path do I put the wordcount code, and should it get a .cpp extension or something else?
Question 2: Where should I be when I execute the Ant command? Ant complains that it cannot find build.xml unless I am in the ...\hadoop-0.18.3 directory.
Question 3: The include statements in the wordcount code are of the form #include "Hadoop/xxx.hh". These include files reside both in ...\hadoop-0.18.3\c++\Linux-i386-32\include\hadoop and in ...\hadoop-0.18.3\c++\Linux-amd64-64\include\hadoop. Does Ant produce both a 32-bit and a 64-bit version of my compiled code?
Question 4: Where does Ant put the compiled code?

Thanks, Alan Ratner