too-many fetch failures
Hi all. I'm running Hadoop 0.17.2.1 and trying to set up a three-node cluster. When I run applications, the map tasks complete but the reduce tasks just halt, and I get "Too many fetch-failures" errors:

[EMAIL PROTECTED] hadoop-0.17.2.1]# bin/hadoop jar word/word.jar org.myorg.WordCount input output2
08/10/01 10:56:26 INFO mapred.FileInputFormat: Total input paths to process : 1
08/10/01 10:56:26 INFO mapred.JobClient: Running job: job_200810011054_0005
08/10/01 10:56:27 INFO mapred.JobClient: map 0% reduce 0%
08/10/01 10:56:31 INFO mapred.JobClient: map 40% reduce 0%
08/10/01 10:56:32 INFO mapred.JobClient: map 60% reduce 0%
08/10/01 10:56:33 INFO mapred.JobClient: map 100% reduce 0%
08/10/01 10:56:42 INFO mapred.JobClient: map 100% reduce 16%
08/10/01 10:56:47 INFO mapred.JobClient: map 100% reduce 20%
08/10/01 11:37:04 INFO mapred.JobClient: map 89% reduce 20%
08/10/01 11:37:04 INFO mapred.JobClient: Task Id : task_200810011054_0005_m_05_0, Status : FAILED
Too many fetch-failures
08/10/01 11:37:04 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_05_0&filter=stdout
08/10/01 11:37:04 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_05_0&filter=stderr
08/10/01 11:37:06 INFO mapred.JobClient: map 100% reduce 20%
08/10/01 11:37:14 INFO mapred.JobClient: map 100% reduce 23%
08/10/01 11:44:34 INFO mapred.JobClient: Task Id : task_200810011054_0005_m_07_0, Status : FAILED
Too many fetch-failures
08/10/01 11:44:34 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_07_0&filter=stdout
08/10/01 11:44:34 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_07_0&filter=stderr
08/10/01 11:44:46 INFO mapred.JobClient: map 100% reduce 26%
08/10/01 11:52:04 INFO mapred.JobClient: Task Id : task_200810011054_0005_m_04_0, Status : FAILED
Too many fetch-failures
08/10/01 11:52:04 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_04_0&filter=stdout
08/10/01 11:52:04 WARN mapred.JobClient: Error reading task output http://localhost:50060/tasklog?plaintext=true&taskid=task_200810011054_0005_m_04_0&filter=stderr
08/10/01 11:52:06 INFO mapred.JobClient: map 89% reduce 26%
08/10/01 11:52:07 INFO mapred.JobClient: map 100% reduce 26%
08/10/01 11:52:22 INFO mapred.JobClient: map 100% reduce 30%

I'm not sure where I went wrong. Any help in resolving this would be appreciated.

Best Regards
S.Chandravadana
Re: LZO and native hadoop libraries
Are you seeing HADOOP-2009? Thanks Amareshwari Nathan Marz wrote: Unfortunately, setting those environment variables did not help my issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not defined in both LzoCompressor.c and LzoDecompressor.c. Where is this variable supposed to be set? On Sep 30, 2008, at 12:33 PM, Colin Evans wrote: Hi Nathan, You probably need to add the Java headers to your build path as well - I don't know why the Mac doesn't ship with this as a default setting: export CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include " export CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" Nathan Marz wrote: Thanks for the help. I was able to get past my previous issue, but the native build is still failing. Here is the end of the log output: [exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi [exec] mkdir .libs [exec] gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo -I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c -fno-common -DPIC -o .libs/LzoCompressor.o [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: In function 'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs': [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error before ',' token [exec] make[2]: *** [LzoCompressor.lo] Error 1 [exec] make[1]: *** [all-recursive] Error 1 [exec] make: *** [all] Error 2 Any ideas? On Sep 30, 2008, at 11:53 AM, Colin Evans wrote: There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: rename return values
It's not ignored; it returns failure. Further, the point wasn't that the File API is good, but rather that the File API doesn't provide a cause for FileSystem to convert into a descriptive exception if the error originates from there (as it does in many FileSystem implementations). Finally, unlike File, the FileSystem API *permits* rename to throw IOExceptions (IIRC, it's possible to get quota and permission exceptions from rename in HDFS), but not all failures result in exceptions; application code that relies on a set of exceptions from a particular FileSystem assumes more than the interface claims. Ignoring the return value from rename causes silent failures. I'm sympathetic to the debugging burden, but if you were debugging this with LocalFileSystem, an exception would be no more descriptive than checking the return value.

All that said, you can certainly emulate your preferred behavior with a FilterFileSystem that throws instead of returning false. -C

On Sep 30, 2008, at 4:47 PM, Bryan Duxbury wrote:

It's very interesting that the Java File API doesn't return exceptions, but that doesn't mean it's a good interface. The fact that there IS further exceptional information somewhere in the system but that it is currently ignored is sort of troubling. Perhaps, at least, we could add an overload that WILL throw an exception if there is one to report. -Bryan

On Sep 30, 2008, at 1:53 PM, Chris Douglas wrote:

FileSystem::rename doesn't always have the cause, per java.io.File::renameTo: http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File) Even if it did, it's not clear to FileSystem that the failure to rename is fatal/exceptional to the application. -C

On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:

Hey all, Why is it that FileSystem.rename returns true or false instead of throwing an exception? It seems incredibly inconvenient to get a false result and then have to go poring over the namenode logs looking for the actual error message. I had this case recently where I'd forgotten to create the parent directories, but I had no idea it was failing since there were no exceptions. -Bryan
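A minimal sketch of the FilterFileSystem approach Chris describes, written against the 0.18-era API; the class name and exception message are illustrative only, not from any existing patch. The wrapper delegates everything to the underlying FileSystem and only converts a false return from rename() into an IOException.

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical wrapper: same behavior as the wrapped FileSystem, except that
// a silent boolean rename failure becomes an exception.
public class StrictRenameFileSystem extends FilterFileSystem {

  public StrictRenameFileSystem(FileSystem fs) {
    super(fs);
  }

  public boolean rename(Path src, Path dst) throws IOException {
    // The underlying implementation may still throw on its own (e.g. quota
    // or permission problems in HDFS); this only covers the false case.
    if (!super.rename(src, dst)) {
      throw new IOException("rename failed: " + src + " -> " + dst);
    }
    return true;
  }
}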
Re: LZO and native hadoop libraries
Hi Nathan, This is defined in build/native//config.h. It is generated by autoconf during the build, and if it is missing or incorrect then you probably need to make sure that the LZO libraries and headers are in your search paths and then do a clean build. -Colin Nathan Marz wrote: Unfortunately, setting those environment variables did not help my issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not defined in both LzoCompressor.c and LzoDecompressor.c. Where is this variable supposed to be set? On Sep 30, 2008, at 12:33 PM, Colin Evans wrote: Hi Nathan, You probably need to add the Java headers to your build path as well - I don't know why the Mac doesn't ship with this as a default setting: export CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include " export CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" Nathan Marz wrote: Thanks for the help. I was able to get past my previous issue, but the native build is still failing. Here is the end of the log output: [exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi [exec] mkdir .libs [exec] gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo -I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c -fno-common -DPIC -o .libs/LzoCompressor.o [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: In function 'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs': [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error before ',' token [exec] make[2]: *** [LzoCompressor.lo] Error 1 [exec] make[1]: *** [all-recursive] Error 1 [exec] make: *** [all] Error 2 Any ideas? On Sep 30, 2008, at 11:53 AM, Colin Evans wrote: There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: rename return values
It's very interesting that the Java File API doesn't return exceptions, but that doesn't mean it's a good interface. The fact that there IS further exceptional information somewhere in the system but that it is currently ignored is sort of troubling. Perhaps, at least, we could add an overload that WILL throw an exception if there is one to report. -Bryan

On Sep 30, 2008, at 1:53 PM, Chris Douglas wrote:

FileSystem::rename doesn't always have the cause, per java.io.File::renameTo: http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File) Even if it did, it's not clear to FileSystem that the failure to rename is fatal/exceptional to the application. -C

On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:

Hey all, Why is it that FileSystem.rename returns true or false instead of throwing an exception? It seems incredibly inconvenient to get a false result and then have to go poring over the namenode logs looking for the actual error message. I had this case recently where I'd forgotten to create the parent directories, but I had no idea it was failing since there were no exceptions. -Bryan
Configuring Hadoop to use S3 for Nutch
I am using Nutch for crawling and would like to configure Hadoop to use S3. I made the appropriate changes to the Hadoop configuration and that appears to be O.K. However, I *think* the problem I am hitting is that Hadoop now expects ALL paths to be locations in S3. Below is a typical error I am seeing. I think Hadoop expects there to be a /tmp folder in the S3 bucket. Also, any parameters to Nutch that are directories are expected to be available in S3.

This makes me think there are things I need to do to "prepare" the S3 bucket that I've specified in the Hadoop configuration so that Hadoop has everything it needs to function. For example, I somehow have to copy my seed urls file to the S3 bucket in a way that Hadoop can find it. Can anyone point me in the right direction on how to do this?

2008-09-30 13:31:49,926 WARN httpclient.RestS3Service - Response '/%2Ftmp%2Fhadoop-Kevin%2Fmapred%2Fsystem%2Fjob_local_1' - Unexpected response code 404, expected 200

Thanks
Kevin
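As a sketch of one way to "prepare" the bucket (my own illustration, not a tested recipe): with the s3:// block filesystem configured as the default, the seed urls can be copied in through the FileSystem API. Everything below -- the bucket name, the key placeholders, the class name, and the paths -- is hypothetical; substitute your own values or keep the properties in hadoop-site.xml. Note that the s3:// scheme stores data in Hadoop's block format, so files generally need to go in through Hadoop itself rather than a plain S3 upload tool.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Copies a local seed-urls file into the S3 bucket Hadoop is configured to
// use as its default filesystem, so later jobs can reference it as /urls/seed.txt.
public class SeedUploader {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "s3://my-nutch-bucket");      // placeholder bucket
    conf.set("fs.s3.awsAccessKeyId", "YOUR_ACCESS_KEY");      // placeholder key
    conf.set("fs.s3.awsSecretAccessKey", "YOUR_SECRET_KEY");  // placeholder key

    FileSystem fs = FileSystem.get(conf);              // resolves to the S3 filesystem
    fs.mkdirs(new Path("/urls"));                      // "directory" inside the bucket
    fs.copyFromLocalFile(new Path("seed.txt"),         // local file
                         new Path("/urls/seed.txt"));  // destination in the bucket
  }
}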
Re: rename return values
FileSystem::rename doesn't always have the cause, per java.io.File::renameTo: http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File) Even if it did, it's not clear to FileSystem that the failure to rename is fatal/exceptional to the application. -C On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote: Hey all, Why is it that FileSystem.rename returns true or false instead of throwing an exception? It seems incredibly inconvenient to get a false result and then have to go poring over the namenode logs looking for the actual error message. I had this case recently where I'd forgotten to create the parent directories, but I had no idea it was failing since there were no exceptions. -Bryan
Re: LZO and native hadoop libraries
Unfortunately, setting those environment variables did not help my issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not defined in both LzoCompressor.c and LzoDecompressor.c. Where is this variable supposed to be set? On Sep 30, 2008, at 12:33 PM, Colin Evans wrote: Hi Nathan, You probably need to add the Java headers to your build path as well - I don't know why the Mac doesn't ship with this as a default setting: export CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/ CurrentJDK/Home/include " export CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/ Versions/CurrentJDK/Home/include" Nathan Marz wrote: Thanks for the help. I was able to get past my previous issue, but the native build is still failing. Here is the end of the log output: [exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/ LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi [exec] mkdir .libs [exec] gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/ hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo - I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/ Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g - O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c / Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/ hadoop/io/compress/lzo/LzoCompressor.c -fno-common -DPIC -o .libs/ LzoCompressor.o [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ apache/hadoop/io/compress/lzo/LzoCompressor.c: In function 'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs': [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error before ',' token [exec] make[2]: *** [LzoCompressor.lo] Error 1 [exec] make[1]: *** [all-recursive] Error 1 [exec] make: *** [all] Error 2 Any ideas? On Sep 30, 2008, at 11:53 AM, Colin Evans wrote: There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: rename return values
On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:

Hey all, Why is it that FileSystem.rename returns true or false instead of throwing an exception? It seems incredibly inconvenient to get a false result and then have to go poring over the namenode logs looking for the actual error message. I had this case recently where I'd forgotten to create the parent directories, but I had no idea it was failing since there were no exceptions.

Speculating... to be consistent with java.io.File.renameTo? (http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File)) Arun
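For comparison, java.io.File.renameTo reports failure only as a boolean, which is presumably the precedent Arun means. A trivial illustration (the paths are made up):

import java.io.File;

public class RenameToDemo {
  public static void main(String[] args) {
    File src = new File("/tmp/does-not-exist");
    File dst = new File("/tmp/new-name");
    // renameTo() signals failure only through its return value; there is no
    // exception and no cause to inspect, which is the contract
    // FileSystem.rename mirrors.
    boolean ok = src.renameTo(dst);
    System.out.println("rename succeeded? " + ok);
  }
}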
rename return values
Hey all, Why is it that FileSystem.rename returns true or false instead of throwing an exception? It seems incredibly inconvenient to get a false result and then have to go poring over the namenode logs looking for the actual error message. I had this case recently where I'd forgotten to create the parent directories, but I had no idea it was failing since there were no exceptions. -Bryan
Re: Getting Hadoop Working on EC2/S3
I think we've identified a bug with the create-image parameter in the EC2 scripts under src/contrib. This was my workaround:

1) Start a single instance of the Hadoop AMI you want to modify using the ElasticFox Firefox plugin (or the ec2-tools).
2) Modify the /root/hadoop-init script and change the fs.default.name property to point to the FULL s3 path to your bucket (after doing this, make sure you do not make your image public!).
3) Follow the instructions at http://docs.amazonwebservices.com/AWSEC2/2008-05-05/GettingStartedGuide/ for bundling, uploading and registering your new AMI.
4) On your local machine, in the hadoop-ec2-env.sh file, change S3_BUCKET to point to the private S3 bucket where you uploaded your new image, and change HADOOP_VERSION to your new AMI name.

You can now go to your command prompt and say "bin/hadoop-ec2 launch-cluster myClusterName 5", and it will bring up 5 instances in a Hadoop cluster, all running off your S3 bucket instead of HDFS.

Kind regards
Steve Watt
IBM Certified IT Architect
Open Group Certified Master IT Architect
Tel: (512) 286 - 9170 Tie: 363 - 9170
Emerging Technologies, Austin, TX
IBM Software Group

From: "Alexander Aristov" <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Date: 09/30/2008 01:24 AM
Subject: Re: Getting Hadoop Working on EC2/S3

Does your AWS (S3) key contain the "?" sign? If so, then it can be a cause; regenerate the key in this case. I have also tried to use the create-image command, but I stopped all attempts after constant failures. It was easier to make the AMI by hand.

Alexander

2008/9/29 Stephen Watt <[EMAIL PROTECTED]>

> Hi Folks
>
> Before I get started, I just want to state that I've done the due
> diligence and read Tom White's articles as well as the EC2 and S3 pages on
> the Hadoop Wiki and done some searching on this.
>
> Thus far I have successfully got Hadoop running on EC2 with no problems.
> In my local hadoop 0.18 environment I simply add my AWS keys to the
> hadoop-ec2-env.sh and kick off the src/contrib/ec2/bin/hadoop-ec2
> launch-cluster script and it works great.
>
> Now, I'm trying to use the public Hadoop EC2 images to run over S3 instead
> of HDFS. They are set up to use variables passed in from a parameterized
> launch for all the config options -- everything EXCEPT the
> fs.default.filesystem. So in order to bring a cluster of 20 Hadoop
> instances up that run over S3, I need to mod the config file to point to
> my S3 bucket for the fs.default.filesystem and keep the rest the same.
> Thus I need my own image to do this. I am attempting this by using the
> local src/contrib/ec2/bin/hadoop-ec2 create-image script. I've tried this
> both on a Windows system (Cygwin environment) AND on my Ubuntu 8 system,
> and with each one it gets all the way to the end and then fails as it
> attempts to save the new image to my bucket, saying the bucket does not
> exist, with a Server.NoSuchBucket (404) error.
>
> The S3 bucket definitely does exist. I have block data inside of it that
> are results of my Hadoop jobs. I can go to a single Hadoop image on EC2
> that I've launched and manually set up to use S3, say bin/hadoop dfs
> -ls /, and see the contents of my S3 bucket. I can also successfully
> use that S3 bucket as an input and output of my jobs for a single EC2
> Hadoop instance. I've tried creating new buckets using the Firefox S3
> Organizer plugin and specifying the scripts to save my new image to those,
> and it's still the same error.
>
> Any ideas? Is anyone having similar problems?
>
> Regards
> Steve Watt

--
Best Regards
Alexander Aristov
Re: LZO and native hadoop libraries
Hi Nathan, You probably need to add the Java headers to your build path as well - I don't know why the Mac doesn't ship with this as a default setting: export CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include " export CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" Nathan Marz wrote: Thanks for the help. I was able to get past my previous issue, but the native build is still failing. Here is the end of the log output: [exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi [exec] mkdir .libs [exec] gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo -I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c -fno-common -DPIC -o .libs/LzoCompressor.o [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: In function 'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs': [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error before ',' token [exec] make[2]: *** [LzoCompressor.lo] Error 1 [exec] make[1]: *** [all-recursive] Error 1 [exec] make: *** [all] Error 2 Any ideas? On Sep 30, 2008, at 11:53 AM, Colin Evans wrote: There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: LZO and native hadoop libraries
Thanks for the help. I was able to get past my previous issue, but the native build is still failing. Here is the end of the log output: [exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/ LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi [exec] mkdir .libs [exec] gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/ hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo - I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/ Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g -O2 - MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c /Users/ nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/ compress/lzo/LzoCompressor.c -fno-common -DPIC -o .libs/LzoCompressor.o [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ apache/hadoop/io/compress/lzo/LzoCompressor.c: In function 'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs': [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error before ',' token [exec] make[2]: *** [LzoCompressor.lo] Error 1 [exec] make[1]: *** [all-recursive] Error 1 [exec] make: *** [all] Error 2 Any ideas? On Sep 30, 2008, at 11:53 AM, Colin Evans wrote: There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/ native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: LZO and native hadoop libraries
On Sep 30, 2008, at 11:46 AM, Doug Cutting wrote:

Arun C Murthy wrote: You need to add libhadoop.so to your java.library.path. libhadoop.so is available in the corresponding release in the lib/native directory.

I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases.

Ah, good point. Unfortunately the work on getting native libs on Mac OS X stalled... http://issues.apache.org/jira/browse/HADOOP-3659 Arun
Re: LZO and native hadoop libraries
There's a patch to get the native targets to build on Mac OS X: http://issues.apache.org/jira/browse/HADOOP-3659 You probably will need to monkey with LDFLAGS as well to get it to work, but we've been able to build the native libs for the Mac without too much trouble. Doug Cutting wrote: Arun C Murthy wrote: You need to add libhadoop.so to your java.library.patch. libhadoop.so is available in the corresponding release in the lib/native directory. I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: LZO and native hadoop libraries
Arun C Murthy wrote: You need to add libhadoop.so to your java.library.path. libhadoop.so is available in the corresponding release in the lib/native directory.

I think he needs to first build libhadoop.so, since he appears to be running on OS X and we only provide Linux builds of this in releases. Doug
Re: LZO and native hadoop libraries
Nathan, You need to add libhadoop.so to your java.library.path. libhadoop.so is available in the corresponding release in the lib/native directory. Arun

On Sep 30, 2008, at 11:14 AM, Nathan Marz wrote:

I am trying to use SequenceFiles with LZO compression outside the context of a MapReduce application. However, when I try to use the LZO codec, I get the following errors in the log:

08/09/30 11:09:56 DEBUG conf.Configuration: java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.(Configuration.java:157)
at com.rapleaf.formats.stream.TestSequenceFileStreams.setUp(TestSequenceFileStreams.java:22)
at junit.framework.TestCase.runBare(TestCase.java:125)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)
08/09/30 11:09:56 DEBUG security.UserGroupInformation: Unix Login: nathan,staff,_lpadmin,com.apple.sharepoint.group.1,_appserveradm,_appserverusr,admin,com.apple.access_ssh
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
08/09/30 11:09:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
08/09/30 11:09:56 ERROR compress.LzoCodec: Cannot load native-lzo without native-hadoop

What is the native hadoop library and how should I configure things to use it?

Thanks,
Nathan Marz
RapLeaf
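A quick way to confirm the setting took effect (my own snippet, not from the thread): run something like the following with -Djava.library.path pointing at the lib/native directory for your platform and check that the loader reports true before worrying about the LZO codec.

import org.apache.hadoop.util.NativeCodeLoader;

// Prints the library path the JVM actually sees and whether libhadoop loaded.
public class NativeCheck {
  public static void main(String[] args) {
    System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    System.out.println("native hadoop loaded? " + NativeCodeLoader.isNativeCodeLoaded());
  }
}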
LZO and native hadoop libraries
I am trying to use SequenceFiles with LZO compression outside the context of a MapReduce application. However, when I try to use the LZO codec, I get the following errors in the log:

08/09/30 11:09:56 DEBUG conf.Configuration: java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.(Configuration.java:157)
at com.rapleaf.formats.stream.TestSequenceFileStreams.setUp(TestSequenceFileStreams.java:22)
at junit.framework.TestCase.runBare(TestCase.java:125)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)
08/09/30 11:09:56 DEBUG security.UserGroupInformation: Unix Login: nathan,staff,_lpadmin,com.apple.sharepoint.group.1,_appserveradm,_appserverusr,admin,com.apple.access_ssh
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
08/09/30 11:09:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
08/09/30 11:09:56 ERROR compress.LzoCodec: Cannot load native-lzo without native-hadoop

What is the native hadoop library and how should I configure things to use it?

Thanks,
Nathan Marz
RapLeaf
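For context, writing an LZO-compressed SequenceFile outside MapReduce looks roughly like the sketch below (my own illustration against the 0.18-era API; the output path, record types, and class name are placeholders). As the ERROR line above indicates, the codec only works once both the native hadoop library and the native LZO library can be loaded.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.LzoCodec;

public class LzoSequenceFileDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    Path out = new Path("/tmp/demo.seq");   // placeholder output path

    // Requires libhadoop and the native LZO library on java.library.path,
    // otherwise codec initialization fails as in the log above.
    LzoCodec codec = new LzoCodec();
    codec.setConf(conf);

    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, out, Text.class, IntWritable.class,
        SequenceFile.CompressionType.BLOCK, codec);
    try {
      writer.append(new Text("key"), new IntWritable(1));
    } finally {
      writer.close();
    }
  }
}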
Re: ClassNotFoundException from Jython
On Tue, Sep 30, 2008 at 12:34 AM, Karl Anderson <[EMAIL PROTECTED]> wrote: > I recommend using streaming instead if you can, much easier to develop and > debug. It's also nice to not get that "stop doing that, jythonc is going > away" message each time you compile :) Also check out the recently > announced Happy framework for Hadoop and Jython, it looks interesting. If you want to use cpython/streaming instead, you might be interested in: https://issues.apache.org/jira/browse/HADOOP-4304 -Klaas
A quick question about partioner and reducer
Hello, I am slightly confused about the number of reducers executed and the size of the data each receives.

Setup: I have 5 task trackers. In my hadoop-site.xml:

(1) mapred.reduce.tasks = 7 -- "The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is 'local'."

(2) mapred.tasktracker.map.tasks.maximum = 7 -- "The maximum number of map tasks that will be run simultaneously by a task tracker."

(3) However, from http://hadoop.apache.org/core/docs/r0.18.1/api/index.html: "The total number of partitions is the same as the number of reduce tasks for the job."

Q: Does that mean (from (1) and (2)) there will be a total of 7 reduce tasks distributed across the 5 machines, such that no machine receives more than 7 reduce tasks? If so, suppose I have millions of unique keys which need to be reduced (e.g. urls/hashes); these will be partitioned into 7 groups (from (3)) and distributed across 5 machines? Which is equivalent to saying that the number of reduce tasks run across all machines will be equal to 7? Wouldn't that be too large a number of keys for each reduce task?

Are these possible solutions:
1) Keep the machines fixed (5) but increase mapred.reduce.tasks (loss in performance?).
2) Increase the number of machines (not possible for me, but a theoretical solution) and set mapred.reduce.tasks to a commensurate number.

Many thanks for your time
Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
More people are flattered into virtue than bullied out of vice. -- R. S. Surtees
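To make the knobs concrete, here is a small sketch using the 0.18 org.apache.hadoop.mapred API (my own illustration; the class names are made up). The partitioner mirrors the default HashPartitioner behavior: every key is hashed into one of numPartitions buckets, each bucket feeds exactly one reduce task, so 7 partitions means 7 reduce tasks regardless of how many machines they are scheduled on.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class PartitionExample {

  // Equivalent to the default HashPartitioner: millions of unique keys are
  // folded into numPartitions groups; one group per reduce task.
  public static class UrlPartitioner implements Partitioner<Text, IntWritable> {
    public void configure(JobConf job) { }

    public int getPartition(Text key, IntWritable value, int numPartitions) {
      return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
  }

  public static void configure(JobConf conf) {
    // Overrides mapred.reduce.tasks for this job: 7 reduce tasks in total,
    // scheduled across however many tasktrackers are available.
    conf.setNumReduceTasks(7);
    conf.setPartitionerClass(UrlPartitioner.class);
  }
}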
Re: Jobtracker config?
Hi, Thanks, it worked. Correct me if I'm wrong, but isn't this a configuration defect? E.g. the location of the secondary namenode is in conf/masters, and if I run start-dfs.sh, the secondary namenode starts on B. Similarly, given that the jobtracker is specified to run on C, shouldn't start-all.sh start the jobtracker on C?

Regards
Saptarshi

On Sep 29, 2008, at 6:37 PM, Arun C Murthy wrote:

On Sep 29, 2008, at 2:52 PM, Saptarshi Guha wrote:

Setup: I am running the namenode on A, the sec. namenode on B and the jobtracker on C. The datanodes and tasktrackers are on Z1, Z2, Z3. Problem: However, the jobtracker is starting up on A. Here are my configs for the jobtracker:

This would happen if you ran 'start-all.sh' on A rather than start-dfs.sh on A and start-mapred.sh on B. Is that what you did? If not, please post the commands you used to start the HDFS and Map-Reduce clusters... Arun

mapred.job.tracker = C:54311 -- "The host and port that the MapReduce job tracker runs at. If 'local', then jobs are run in-process as a single map and reduce task."

mapred.job.tracker.http.address = C:50030 -- "The job tracker http server address and port the server will listen on. If the port is 0 then the server will start on a free port."

Also, my masters file contains an entry for B (so that the sec. namenode starts on B) and my slaves file contains Z1, Z2, Z3. The config files are synchronized across all machines. Any help would be appreciated.

Thank you
Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
I think I'm schizophrenic. One half of me's paranoid and the other half's out to get him.
Re: Question about Hadoop 's Feature(s)
> However, HDFS uses HTTP to serve blocks up -that needs to be locked down > too. Would the signing work there? I am not familiar with HDFS over HTTP. Could it simply sign the stream and include the signature at the end of the HTTP message returned? On Tue, Sep 30, 2008 at 8:56 AM, Steve Loughran <[EMAIL PROTECTED]> wrote: > Jason Rutherglen wrote: >> >> I implemented an RMI protocol using Hadoop IPC and implemented basic >> HMAC signing. It is I believe faster than public key private key >> because it uses a secret key and does not require public key >> provisioning like PKI would. Perhaps it would be a baseline way to >> sign the data. > > That should work for authenticating messages between (trusted) nodes. > Presumably the ipc.key value could be set in the Conf and all would be well. > > External job submitters shouldn't be given those keys; they'd need an > HTTP(S) front end that could authenticate them however the organisation > worked. > > Yes, that would be simpler. I am not enough of a security expert to say if > it will work, but the keys should be easier to work with. As long as the > configuration files are kept secure, your cluster will be locked. > > However, HDFS uses HTTP to serve blocks up -that needs to be locked down > too. Would the signing work there? > > -steve >
Re: Question about Hadoop 's Feature(s)
Jason Rutherglen wrote: I implemented an RMI protocol using Hadoop IPC and implemented basic HMAC signing. It is I believe faster than public key private key because it uses a secret key and does not require public key provisioning like PKI would. Perhaps it would be a baseline way to sign the data. That should work for authenticating messages between (trusted) nodes. Presumably the ipc.key value could be set in the Conf and all would be well. External job submitters shouldn't be given those keys; they'd need an HTTP(S) front end that could authenticate them however the organisation worked. Yes, that would be simpler. I am not enough of a security expert to say if it will work, but the keys should be easier to work with. As long as the configuration files are kept secure, your cluster will be locked. However, HDFS uses HTTP to serve blocks up -that needs to be locked down too. Would the signing work there? -steve
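For readers unfamiliar with the approach Jason describes, here is a bare-bones illustration of HMAC signing with a shared secret key using the standard javax.crypto API (my own sketch, not code from his IPC work). A receiver holding the same secret verifies a message by recomputing the tag and comparing; for the HTTP case discussed above, the tag could in principle be appended to the response and checked by the client.

import java.security.MessageDigest;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class HmacExample {

  // Compute an HMAC-SHA1 tag over a message with a shared secret key.
  static byte[] sign(byte[] secretKey, byte[] message) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(secretKey, "HmacSHA1"));
    return mac.doFinal(message);
  }

  // The receiver, holding the same secret, recomputes the tag and compares.
  static boolean verify(byte[] secretKey, byte[] message, byte[] tag) throws Exception {
    return MessageDigest.isEqual(sign(secretKey, message), tag);
  }

  public static void main(String[] args) throws Exception {
    byte[] key = "shared-cluster-secret".getBytes("UTF-8");   // placeholder secret
    byte[] payload = "block data or RPC payload".getBytes("UTF-8");

    byte[] tag = sign(key, payload);
    System.out.println("verified? " + verify(key, payload, tag));
  }
}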