too-many fetch failures

2008-09-30 Thread chandra

Hi all,

I'm using Hadoop 0.17.2 and trying to set up a 3-node cluster.
When I run applications, the map tasks complete but the reduce tasks just stall,
and I get this "Too many fetch-failures" error.

[EMAIL PROTECTED] hadoop-0.17.2.1]# bin/hadoop jar word/word.jar
org.myorg.WordCount input output2
08/10/01 10:56:26 INFO mapred.FileInputFormat: Total input paths to
process : 1
08/10/01 10:56:26 INFO mapred.JobClient: Running job:
job_200810011054_0005
08/10/01 10:56:27 INFO mapred.JobClient:  map 0% reduce 0%
08/10/01 10:56:31 INFO mapred.JobClient:  map 40% reduce 0%
08/10/01 10:56:32 INFO mapred.JobClient:  map 60% reduce 0%
08/10/01 10:56:33 INFO mapred.JobClient:  map 100% reduce 0%
08/10/01 10:56:42 INFO mapred.JobClient:  map 100% reduce 16%
08/10/01 10:56:47 INFO mapred.JobClient:  map 100% reduce 20%
08/10/01 11:37:04 INFO mapred.JobClient:  map 89% reduce 20%
08/10/01 11:37:04 INFO mapred.JobClient: Task Id :
task_200810011054_0005_m_05_0, Status : FAILED
Too many fetch-failures
08/10/01 11:37:04 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_05_0&filter=stdout
08/10/01 11:37:04 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_05_0&filter=stderr
08/10/01 11:37:06 INFO mapred.JobClient:  map 100% reduce 20%
08/10/01 11:37:14 INFO mapred.JobClient:  map 100% reduce 23%
08/10/01 11:44:34 INFO mapred.JobClient: Task Id :
task_200810011054_0005_m_07_0, Status : FAILED
Too many fetch-failures
08/10/01 11:44:34 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_07_0&filter=stdout
08/10/01 11:44:34 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_07_0&filter=stderr
08/10/01 11:44:46 INFO mapred.JobClient:  map 100% reduce 26%
08/10/01 11:52:04 INFO mapred.JobClient: Task Id :
task_200810011054_0005_m_04_0, Status : FAILED
Too many fetch-failures
08/10/01 11:52:04 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_04_0&filter=stdout
08/10/01 11:52:04 WARN mapred.JobClient: Error reading task
outputhttp://localhost:50060/tasklog?
plaintext=true&taskid=task_200810011054_0005_m_04_0&filter=stderr
08/10/01 11:52:06 INFO mapred.JobClient:  map 89% reduce 26%
08/10/01 11:52:07 INFO mapred.JobClient:  map 100% reduce 26%
08/10/01 11:52:22 INFO mapred.JobClient:  map 100% reduce 30%

I'm not sure where I went wrong.
Kindly help me solve this.
-- 
Best Regards
S.Chandravadana



Re: LZO and native hadoop libraries

2008-09-30 Thread Amareshwari Sriramadasu

Are you seeing HADOOP-2009?

Thanks
Amareshwari
Nathan Marz wrote:
Unfortunately, setting those environment variables did not help my 
issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not
defined in either LzoCompressor.c or LzoDecompressor.c. Where is this
variable supposed to be set?




On Sep 30, 2008, at 12:33 PM, Colin Evans wrote:


Hi Nathan,
You probably need to add the Java headers to your build path as well 
- I don't know why the Mac doesn't ship with this as a default setting:


export 
CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include 
"
export 
CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" 






Nathan Marz wrote:
Thanks for the help. I was able to get past my previous issue, but 
the native build is still failing. Here is the end of the log output:


[exec] then mv -f ".deps/LzoCompressor.Tpo" 
".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; 
exit 1; fi

[exec] mkdir .libs
[exec]  gcc -DHAVE_CONFIG_H -I. 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo 
-I../../../../../../.. -I/Library/Java/Home//include 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall 
-fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF 
.deps/LzoCompressor.Tpo -c 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c  
-fno-common -DPIC -o .libs/LzoCompressor.o
[exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: 
In function 
'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
[exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: 
error: syntax error before ',' token

[exec] make[2]: *** [LzoCompressor.lo] Error 1
[exec] make[1]: *** [all-recursive] Error 1
[exec] make: *** [all] Error 2


Any ideas?



On Sep 30, 2008, at 11:53 AM, Colin Evans wrote:


There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to 
work, but we've been able to build the native libs for the Mac 
without too much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the 
lib/native directory.


I think he needs to first build libhadoop.so, since he appears to 
be running on OS X and we only provide Linux builds of this in 
releases.


Doug












Re: rename return values

2008-09-30 Thread Chris Douglas
It's not ignored; it returns failure. Further, the point wasn't that  
the File API is good, but rather that the File API doesn't provide a  
cause for FileSystem to convert into a descriptive exception if the  
error originates from there (as it does in many FileSystem  
implementations). Finally, unlike File, the FileSystem API *permits*  
rename to throw IOExceptions (IIRC, it's possible to get quota and  
permission exceptions from rename in HDFS), but not all failures  
result in exceptions; application code that relies on a set of  
exceptions from a particular FileSystem assumes more than the  
interface claims.


Ignoring the return value from rename causes silent failures. I'm  
sympathetic to the debugging burden, but if you were debugging this  
with LocalFileSystem, an exception would be no more descriptive than  
checking the return value. All that said, you can certainly emulate  
your preferred behavior with a FilterFileSystem that throws instead of  
returning false. -C
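
For reference, a minimal sketch of the FilterFileSystem wrapper Chris describes
(the class name and the error message are made up, and this is untested):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

// Wraps any FileSystem and turns rename()'s false return into an exception.
public class StrictRenameFileSystem extends FilterFileSystem {
  public StrictRenameFileSystem(FileSystem fs) {
    super(fs);
  }

  public boolean rename(Path src, Path dst) throws IOException {
    if (!super.rename(src, dst)) {
      // The underlying cause is not available here; check the namenode log.
      throw new IOException("rename failed: " + src + " -> " + dst);
    }
    return true;
  }
}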


On Sep 30, 2008, at 4:47 PM, Bryan Duxbury wrote:

It's very interesting that the Java File API doesn't return  
exceptions, but that doesn't mean it's a good interface. The fact  
that there IS further exceptional information somewhere in the  
system but that it is currently ignored is sort of troubling.  
Perhaps, at least, we could add an overload that WILL throw an  
exception if there is one to report.


-Bryan

On Sep 30, 2008, at 1:53 PM, Chris Douglas wrote:

FileSystem::rename doesn't always have the cause, per  
java.io.File::renameTo:


http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File)

Even if it did, it's not clear to FileSystem that the failure to  
rename is fatal/exceptional to the application. -C


On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:


Hey all,

Why is it that FileSystem.rename returns true or false instead of  
throwing an exception? It seems incredibly inconvenient to get a  
false result and then have to go poring over the namenode logs  
looking for the actual error message. I had this case recently  
where I'd forgotten to create the parent directories, but I had no  
idea it was failing since there were no exceptions.


-Bryan








Re: LZO and native hadoop libraries

2008-09-30 Thread Colin Evans

Hi Nathan,
This is defined in build/native//config.h.  It is 
generated by autoconf during the build, and if it is missing or 
incorrect then you probably need to make sure that the LZO libraries and 
headers are in your search paths and then do a clean build.


-Colin


Nathan Marz wrote:
Unfortunately, setting those environment variables did not help my 
issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not
defined in either LzoCompressor.c or LzoDecompressor.c. Where is this
variable supposed to be set?




On Sep 30, 2008, at 12:33 PM, Colin Evans wrote:


Hi Nathan,
You probably need to add the Java headers to your build path as well 
- I don't know why the Mac doesn't ship with this as a default setting:


export 
CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include 
"
export 
CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" 






Nathan Marz wrote:
Thanks for the help. I was able to get past my previous issue, but 
the native build is still failing. Here is the end of the log output:


[exec] then mv -f ".deps/LzoCompressor.Tpo" 
".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; 
exit 1; fi

[exec] mkdir .libs
[exec]  gcc -DHAVE_CONFIG_H -I. 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo 
-I../../../../../../.. -I/Library/Java/Home//include 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall 
-fPIC -O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF 
.deps/LzoCompressor.Tpo -c 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c  
-fno-common -DPIC -o .libs/LzoCompressor.o
[exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: 
In function 
'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
[exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: 
error: syntax error before ',' token

[exec] make[2]: *** [LzoCompressor.lo] Error 1
[exec] make[1]: *** [all-recursive] Error 1
[exec] make: *** [all] Error 2


Any ideas?



On Sep 30, 2008, at 11:53 AM, Colin Evans wrote:


There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to 
work, but we've been able to build the native libs for the Mac 
without too much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the 
lib/native directory.


I think he needs to first build libhadoop.so, since he appears to 
be running on OS X and we only provide Linux builds of this in 
releases.


Doug












Re: rename return values

2008-09-30 Thread Bryan Duxbury
It's very interesting that the Java File API doesn't return  
exceptions, but that doesn't mean it's a good interface. The fact  
that there IS further exceptional information somewhere in the system  
but that it is currently ignored is sort of troubling. Perhaps, at  
least, we could add an overload that WILL throw an exception if there  
is one to report.


-Bryan

On Sep 30, 2008, at 1:53 PM, Chris Douglas wrote:

FileSystem::rename doesn't always have the cause, per  
java.io.File::renameTo:


http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo 
(java.io.File)


Even if it did, it's not clear to FileSystem that the failure to  
rename is fatal/exceptional to the application. -C


On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:


Hey all,

Why is it that FileSystem.rename returns true or false instead of  
throwing an exception? It seems incredibly inconvenient to get a  
false result and then have to go poring over the namenode logs  
looking for the actual error message. I had this case recently  
where I'd forgotten to create the parent directories, but I had no  
idea it was failing since there were no exceptions.


-Bryan






Configuring Hadoop to use S3 for Nutch

2008-09-30 Thread Kevin MacDonald

I am using Nutch for crawling and would like to configure Hadoop to use S3. I
made the appropriate changes to the Hadoop configuration and that appears to
be O.K. However, I *think* the problem I am hitting is that Hadoop now
expects ALL paths to be locations in S3. Below is a typical error I am
seeing. I think Hadoop expects there to be a /tmp folder in the S3 bucket.
Also, any parameters to Nutch that are directories are expected to be
available in S3. This makes me think there are things I need to do to
"prepare" the S3 bucket that I've specified in the Hadoop configuration so
that Hadoop has everything it needs to function. For example, I somehow have
to copy my seed urls file to the S3 bucket in a way that Hadoop can find it.
Can anyone point me in the right direction on how to do this?

2008-09-30 13:31:49,926 WARN  httpclient.RestS3Service - Response
'/%2Ftmp%2Fhadoop-Kevin%2Fmapred%2Fsystem%2Fjob_local_1' - Unexpected
response code 404, expected 200

Thanks

Kevin


-- 
View this message in context: 
http://www.nabble.com/Configuring-Hadoop-to-use-S3-for-Nutch-tp19750758p19750758.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
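
For what it's worth, a minimal sketch of copying a local seed file into an
S3-backed FileSystem from Java; the bucket name and paths are placeholders,
and fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey are assumed to be set in
the configuration. The same thing can also be done from the command line with
bin/hadoop fs -put against an s3:// destination.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3SeedCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes the S3 credentials are already configured in hadoop-site.xml;
    // the bucket and paths below are made up for illustration.
    FileSystem s3 = FileSystem.get(URI.create("s3://my-bucket"), conf);
    // Copy the local seed urls file into the bucket so jobs can read it.
    s3.copyFromLocalFile(new Path("/local/urls/seed.txt"),
                         new Path("/urls/seed.txt"));
  }
}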



Re: rename return values

2008-09-30 Thread Chris Douglas
FileSystem::rename doesn't always have the cause, per  
java.io.File::renameTo:


http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File)

Even if it did, it's not clear to FileSystem that the failure to  
rename is fatal/exceptional to the application. -C


On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:


Hey all,

Why is it that FileSystem.rename returns true or false instead of  
throwing an exception? It seems incredibly inconvenient to get a  
false result and then have to go poring over the namenode logs  
looking for the actual error message. I had this case recently where  
I'd forgotten to create the parent directories, but I had no idea it  
was failing since there were no exceptions.


-Bryan




Re: LZO and native hadoop libraries

2008-09-30 Thread Nathan Marz
Unfortunately, setting those environment variables did not help my  
issue. It appears that the "HADOOP_LZO_LIBRARY" variable is not
defined in either LzoCompressor.c or LzoDecompressor.c. Where is this
variable supposed to be set?




On Sep 30, 2008, at 12:33 PM, Colin Evans wrote:


Hi Nathan,
You probably need to add the Java headers to your build path as well  
- I don't know why the Mac doesn't ship with this as a default  
setting:


export CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/ 
CurrentJDK/Home/include "
export CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/ 
Versions/CurrentJDK/Home/include"





Nathan Marz wrote:
Thanks for the help. I was able to get past my previous issue, but  
the native build is still failing. Here is the end of the log output:


[exec] then mv -f ".deps/LzoCompressor.Tpo" ".deps/ 
LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi

[exec] mkdir .libs
[exec]  gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/ 
hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo - 
I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/ 
Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g - 
O2 -MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c / 
Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/ 
hadoop/io/compress/lzo/LzoCompressor.c  -fno-common -DPIC -o .libs/ 
LzoCompressor.o
[exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ 
apache/hadoop/io/compress/lzo/LzoCompressor.c: In function  
'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
[exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ 
apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax  
error before ',' token

[exec] make[2]: *** [LzoCompressor.lo] Error 1
[exec] make[1]: *** [all-recursive] Error 1
[exec] make: *** [all] Error 2


Any ideas?



On Sep 30, 2008, at 11:53 AM, Colin Evans wrote:


There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to  
work, but we've been able to build the native libs for the Mac  
without too much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the  
lib/native directory.


I think he needs to first build libhadoop.so, since he appears to  
be running on OS X and we only provide Linux builds of this in  
releases.


Doug










Re: rename return values

2008-09-30 Thread Arun C Murthy


On Sep 30, 2008, at 1:37 PM, Bryan Duxbury wrote:


Hey all,

Why is it that FileSystem.rename returns true or false instead of  
throwing an exception? It seems incredibly inconvenient to get a  
false result and then have to go poring over the namenode logs  
looking for the actual error message. I had this case recently where  
I'd forgotten to create the parent directories, but I had no idea it  
was failing since there were no exceptions.




Speculating...

To be consistent with java.io.File.renameTo? (http://java.sun.com/javase/6/docs/api/java/io/File.html#renameTo(java.io.File))


Arun



rename return values

2008-09-30 Thread Bryan Duxbury

Hey all,

Why is it that FileSystem.rename returns true or false instead of  
throwing an exception? It seems incredibly inconvenient to get a  
false result and then have to go poring over the namenode logs  
looking for the actual error message. I had this case recently where  
I'd forgotten to create the parent directories, but I had no idea it  
was failing since there were no exceptions.


-Bryan


Re: Getting Hadoop Working on EC2/S3

2008-09-30 Thread Stephen Watt
I think we've identified a bug with the create-image parameter on the ec2
scripts under src/contrib.

This was my workaround. 

1) Start a single instance of the Hadoop AMI you want to modify using the 
ElasticFox firefox plugin (or the ec2-tools)
2) Modify the /root/hadoop-init script and change the fs.default.name 
property to point to the FULL s3 path to your bucket (after doing this 
make sure you do not make your image public!)
3) Follow the instructions at 
http://docs.amazonwebservices.com/AWSEC2/2008-05-05/GettingStartedGuide/ 
for bundling, uploading and registering your new AMI.
4) On your local machine, in the hadoop-ec2-env.sh file, change the 
S3_BUCKET to point to your private s3 bucket where you uploaded your new 
image.  Change the HADOOP_VERSION to your new AMI name.

You can now go to your cmd prompt and say "bin/hadoop-ec2 launch-cluster 
myClusterName 5"  and it will bring up 5 instances in a hadoop cluster all 
running off your S3 Bucket instead of HDFS.

Kind regards


Steve Watt
IBM Certified IT Architect
Open Group Certified Master IT Architect

Tel: (512) 286 - 9170
Tie: 363 - 9170
Emerging Technologies, Austin, TX
IBM Software Group




From:
"Alexander Aristov" <[EMAIL PROTECTED]>
To:
core-user@hadoop.apache.org
Date:
09/30/2008 01:24 AM
Subject:
Re: Getting Hadoop Working on EC2/S3



Does your AWS (S3) key contain the "?" sign? If so, that can be the cause.
Regenerate the key in that case.

I have also tried to use the create-image command, but I stopped all attempts
after constant failures. It was easier to make the AMI by hand.

Alexander

2008/9/29 Stephen Watt <[EMAIL PROTECTED]>

> Hi Folks
>
> Before I get started, I just want to state that I've done the due
> diligence and read Tom White's articles as well as EC2 and S3 pages on the
> Hadoop Wiki and done some searching on this.
>
> Thus far I have successfully got Hadoop running on EC2 with no problems.
> In my local hadoop 0.18 environment I simply add my AWS keys to the
> hadoop-ec2-env.sh and kickoff the src/contrib/ec2/bin/hadoop-ec2 launch
> cluster script and it works great.
>
> Now, I'm trying to use the public Hadoop EC2 images to run over S3 instead
> of HDFS. They are set up to use variables passed in from a parameterized
> launch for all the config options, everything EXCEPT fs.default.name. So in
> order to bring up a cluster of 20 Hadoop instances that run over S3, I need
> to modify the config file to point fs.default.name at my S3 bucket and keep
> the rest the same.
> Thus I need my own image to do this.  I am attempting this by using the
> local src/contrib/ec2/bin/hadoop-ec2 create-image script. I've tried this
> both on a Windows system (Cygwin environment) AND on my Ubuntu 8 system,
> and with each one it gets all the way to the end and fails as it attempts
> to save the new image to my bucket, saying the bucket does not exist with
> a Server.NoSuchBucket (404) error.
>
> The S3 bucket definitely does exist. I have block data inside of it that
> are results of my Hadoop Jobs. I can go to a single hadoop image on EC2
> that I've launched and manually set up to use S3 and say bin/hadoop dfs
> -ls / and I can see the contents of my S3 bucket. I can also successfully
> use that s3 bucket as an input and output of my jobs for a single EC2
> hadoop instance. I've tried creating new buckets using the Firefox S3
> Organizer plugin and specifying the scripts to save my new image to those,
> and it's still the same error.
>
> Any ideas ? Is anyone having similar problems ?
>
> Regards
> Steve Watt




-- 
Best Regards
Alexander Aristov




Re: LZO and native hadoop libraries

2008-09-30 Thread Colin Evans

Hi Nathan,
You probably need to add the Java headers to your build path as well - I 
don't know why the Mac doesn't ship with this as a default setting:


export 
CPATH="/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include 
"
export 
CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/include" 






Nathan Marz wrote:
Thanks for the help. I was able to get past my previous issue, but the 
native build is still failing. Here is the end of the log output:


 [exec] then mv -f ".deps/LzoCompressor.Tpo" 
".deps/LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 
1; fi

 [exec] mkdir .libs
 [exec]  gcc -DHAVE_CONFIG_H -I. 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo 
-I../../../../../../.. -I/Library/Java/Home//include 
-I/Users/nathan/Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC 
-O2 -m32 -g -O2 -MT LzoCompressor.lo -MD -MP -MF 
.deps/LzoCompressor.Tpo -c 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c  
-fno-common -DPIC -o .libs/LzoCompressor.o
 [exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c: 
In function 
'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
 [exec] 
/Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo/LzoCompressor.c:135: 
error: syntax error before ',' token

 [exec] make[2]: *** [LzoCompressor.lo] Error 1
 [exec] make[1]: *** [all-recursive] Error 1
 [exec] make: *** [all] Error 2


Any ideas?



On Sep 30, 2008, at 11:53 AM, Colin Evans wrote:


There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to 
work, but we've been able to build the native libs for the Mac 
without too much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the 
lib/native directory.


I think he needs to first build libhadoop.so, since he appears to be 
running on OS X and we only provide Linux builds of this in releases.


Doug








Re: LZO and native hadoop libraries

2008-09-30 Thread Nathan Marz
Thanks for the help. I was able to get past my previous issue, but the  
native build is still failing. Here is the end of the log output:


 [exec] 	then mv -f ".deps/LzoCompressor.Tpo" ".deps/ 
LzoCompressor.Plo"; else rm -f ".deps/LzoCompressor.Tpo"; exit 1; fi

 [exec] mkdir .libs
 [exec]  gcc -DHAVE_CONFIG_H -I. -I/Users/nathan/Downloads/ 
hadoop-0.18.1/src/native/src/org/apache/hadoop/io/compress/lzo - 
I../../../../../../.. -I/Library/Java/Home//include -I/Users/nathan/ 
Downloads/hadoop-0.18.1/src/native/src -g -Wall -fPIC -O2 -m32 -g -O2 - 
MT LzoCompressor.lo -MD -MP -MF .deps/LzoCompressor.Tpo -c /Users/ 
nathan/Downloads/hadoop-0.18.1/src/native/src/org/apache/hadoop/io/ 
compress/lzo/LzoCompressor.c  -fno-common -DPIC -o .libs/LzoCompressor.o
 [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ 
apache/hadoop/io/compress/lzo/LzoCompressor.c: In function  
'Java_org_apache_hadoop_io_compress_lzo_LzoCompressor_initIDs':
 [exec] /Users/nathan/Downloads/hadoop-0.18.1/src/native/src/org/ 
apache/hadoop/io/compress/lzo/LzoCompressor.c:135: error: syntax error  
before ',' token

 [exec] make[2]: *** [LzoCompressor.lo] Error 1
 [exec] make[1]: *** [all-recursive] Error 1
 [exec] make: *** [all] Error 2


Any ideas?



On Sep 30, 2008, at 11:53 AM, Colin Evans wrote:


There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to  
work, but we've been able to build the native libs for the Mac  
without too much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the lib/ 
native directory.


I think he needs to first build libhadoop.so, since he appears to  
be running on OS X and we only provide Linux builds of this in  
releases.


Doug






Re: LZO and native hadoop libraries

2008-09-30 Thread Arun C Murthy


On Sep 30, 2008, at 11:46 AM, Doug Cutting wrote:


Arun C Murthy wrote:
You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the lib/ 
native directory.


I think he needs to first build libhadoop.so, since he appears to be  
running on OS X and we only provide Linux builds of this in releases.




Ah, good point.

Unfortunately the work on getting native libs on Mac OS X stalled... 
http://issues.apache.org/jira/browse/HADOOP-3659

Arun



Re: LZO and native hadoop libraries

2008-09-30 Thread Colin Evans

There's a patch to get the native targets to build on Mac OS X:

http://issues.apache.org/jira/browse/HADOOP-3659

You probably will need to monkey with LDFLAGS as well to get it to work, 
but we've been able to build the native libs for the Mac without too 
much trouble.



Doug Cutting wrote:

Arun C Murthy wrote:
 You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the 
lib/native directory.


I think he needs to first build libhadoop.so, since he appears to be 
running on OS X and we only provide Linux builds of this in releases.


Doug




Re: LZO and native hadoop libraries

2008-09-30 Thread Doug Cutting

Arun C Murthy wrote:
 You need to add libhadoop.so to your java.library.path. libhadoop.so
is available in the corresponding release in the lib/native directory.


I think he needs to first build libhadoop.so, since he appears to be 
running on OS X and we only provide Linux builds of this in releases.


Doug


Re: LZO and native hadoop libraries

2008-09-30 Thread Arun C Murthy

Nathan,

 You need to add libhadoop.so to your java.library.path.
libhadoop.so is available in the corresponding release in the lib/ 
native directory.


Arun

On Sep 30, 2008, at 11:14 AM, Nathan Marz wrote:

I am trying to use SequenceFiles with LZO compression outside the  
context of a MapReduce application. However, when I try to use the  
LZO codec, I get the following errors in the log:


08/09/30 11:09:56 DEBUG conf.Configuration: java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:157)
    at com.rapleaf.formats.stream.TestSequenceFileStreams.setUp(TestSequenceFileStreams.java:22)
    at junit.framework.TestCase.runBare(TestCase.java:125)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)


08/09/30 11:09:56 DEBUG security.UserGroupInformation: Unix Login:  
nathan,staff,_lpadmin,com.apple.sharepoint.group. 
1,_appserveradm,_appserverusr,admin,com.apple.access_ssh
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Trying to load the  
custom-built native-hadoop library...
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Failed to load native- 
hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in  
java.library.path
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: java.library.path=.:/ 
Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
08/09/30 11:09:56 WARN util.NativeCodeLoader: Unable to load native- 
hadoop library for your platform... using builtin-java classes where  
applicable
08/09/30 11:09:56 ERROR compress.LzoCodec: Cannot load native-lzo  
without native-hadoop



What is the native hadoop library and how should I configure things  
to use it?




Thanks,

Nathan Marz
RapLeaf





LZO and native hadoop libraries

2008-09-30 Thread Nathan Marz
I am trying to use SequenceFiles with LZO compression outside the  
context of a MapReduce application. However, when I try to use the LZO  
codec, I get the following errors in the log:


08/09/30 11:09:56 DEBUG conf.Configuration: java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:157)
    at com.rapleaf.formats.stream.TestSequenceFileStreams.setUp(TestSequenceFileStreams.java:22)
    at junit.framework.TestCase.runBare(TestCase.java:125)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:36)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)


08/09/30 11:09:56 DEBUG security.UserGroupInformation: Unix Login:  
nathan,staff,_lpadmin,com.apple.sharepoint.group. 
1,_appserveradm,_appserverusr,admin,com.apple.access_ssh
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Trying to load the  
custom-built native-hadoop library...
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: Failed to load native- 
hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in  
java.library.path
08/09/30 11:09:56 DEBUG util.NativeCodeLoader: java.library.path=.:/ 
Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
08/09/30 11:09:56 WARN util.NativeCodeLoader: Unable to load native- 
hadoop library for your platform... using builtin-java classes where  
applicable
08/09/30 11:09:56 ERROR compress.LzoCodec: Cannot load native-lzo  
without native-hadoop



What is the native hadoop library and how should I configure things to  
use it?




Thanks,

Nathan Marz
RapLeaf
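
For reference, a minimal sketch of the kind of standalone SequenceFile-with-LZO
usage described above. It assumes the native libraries have been built and can
be found via -Djava.library.path (otherwise the "Cannot load native-lzo without
native-hadoop" error above appears); the output path and key/value types are
made up for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.LzoCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class LzoSequenceFileExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    // ReflectionUtils hands the Configuration to the codec for us.
    LzoCodec codec = (LzoCodec) ReflectionUtils.newInstance(LzoCodec.class, conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/tmp/test.seq"), Text.class, Text.class,
        SequenceFile.CompressionType.BLOCK, codec);
    writer.append(new Text("key"), new Text("value"));
    writer.close();
  }
}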



Re: ClassNotFoundException from Jython

2008-09-30 Thread Klaas Bosteels
On Tue, Sep 30, 2008 at 12:34 AM, Karl Anderson <[EMAIL PROTECTED]> wrote:
> I recommend using streaming instead if you can, much easier to develop and
> debug.  It's also nice to not get that "stop doing that, jythonc is going
> away" message each time you compile :)  Also check out the recently
> announced Happy framework for Hadoop and Jython, it looks interesting.

If you want to use cpython/streaming instead, you might be interested in:

https://issues.apache.org/jira/browse/HADOOP-4304


-Klaas


A quick question about partitioner and reducer

2008-09-30 Thread Saptarshi Guha

Hello,
I am slightly confused about the number of reducers executed and the  
size of data each receives.

Setup:
I have a setup of 5 task trackers.
In my hadoop-site:
(1)
<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
  <description>The default number of reduce tasks per job.  Typically set
  to a prime close to the number of available hosts.  Ignored when
  mapred.job.tracker is "local".
  </description>
</property>

(2)
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>7</value>
  <description>The maximum number of map tasks that will be run
  simultaneously by a task tracker.
  </description>
</property>

(3)
However  from http://hadoop.apache.org/core/docs/r0.18.1/api/index.html
"The total number of partitions is the same as the number of reduce  
tasks for the job.."


Q:
So does that mean (from (1) & (2)) that there will be a total of 7 reduce
tasks distributed across 5 machines, such that no machine receives more
than 7 reduce tasks?
If so, suppose I have millions of unique keys which need to be reduced
(e.g. URLs/hashes): these will be partitioned into 7 groups (from (3))
and distributed across 5 machines?
Which is equivalent to saying that the number of reduce tasks run
across all machines will be equal to 7?

Wouldn't that be too large a number of keys for each reduce task?

Are these possible solutions?
1) Fixed machines (5), but increase mapred.reduce.tasks (loss in performance?)
2) Increase the number of machines (not possible for me, but a theoretical
solution) and set mapred.reduce.tasks to a commensurate number.



Many thanks for your time
Saptarshi


Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
More people are flattered into virtue than bullied out of vice.
-- R. S. Surtees
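
For reference, the default partitioning behaves roughly as the question above
assumes: every key is hashed into one of mapred.reduce.tasks partitions (7
here), and each reduce task consumes exactly one partition, however the tasks
are spread over the 5 machines. A minimal sketch, illustrative only since the
built-in HashPartitioner already does essentially this:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// With mapred.reduce.tasks = 7, numReduceTasks is 7, so millions of unique
// keys end up split across just 7 reduce partitions.
public class ModPartitioner implements Partitioner<Text, Text> {
  public void configure(JobConf job) {
    // no configuration needed for this sketch
  }

  public int getPartition(Text key, Text value, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}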



Re: Jobtracker config?

2008-09-30 Thread Saptarshi Guha

Hi,
Thanks, it worked. Correct me if I'm wrong, but isn't this a
configuration defect? E.g. the location of the secondary namenode is in
conf/masters, and if I run start-dfs.sh, the secondary namenode starts on B.


Similarly, given that the Jobtracker is specified to run on C,  
shouldn't start-all.sh start the jobtracker on C?


Regards
Saptarshi

On Sep 29, 2008, at 6:37 PM, Arun C Murthy wrote:



On Sep 29, 2008, at 2:52 PM, Saptarshi Guha wrote:

Setup:
I am running the namenode on A, the sec. namenode on B and the  
jobtracker on C. The datanodes and tasktrackers are on Z1,Z2,Z3.


Problem:
However, the jobtracker is starting up on A. Here are my configs  
for Jobtracker


This would happen if you ran 'start-all.sh' on A rather than
start-dfs.sh on A and start-mapred.sh on C. Is that what you did?


If not, please post the commands you used to start the HDFS and Map- 
Reduce clusters...


Arun




<property>
  <name>mapred.job.tracker</name>
  <value>C:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

<property>
  <name>mapred.job.tracker.http.address</name>
  <value>C:50030</value>
  <description>The job tracker http server address and port the server will
  listen on. If the port is 0 then the server will start on a free port.
  </description>
</property>

Also, my masters file contains an entry for B (so that the sec. namenode
starts on B) and my slaves file contains Z1, Z2, Z3.

The config files are synchronized across all machines.

Any help would be appreciated.
Thank you
Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha







Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
I think I'm schizophrenic.  One half of me's
paranoid and the other half's out to get him.



Re: Question about Hadoop 's Feature(s)

2008-09-30 Thread Jason Rutherglen
> However, HDFS uses HTTP to serve blocks up -that needs to be locked down
>  too. Would the signing work there?

I am not familiar with HDFS over HTTP.  Could it simply sign the
stream and include the signature at the end of the HTTP message
returned?

On Tue, Sep 30, 2008 at 8:56 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> Jason Rutherglen wrote:
>>
>> I implemented an RMI protocol using Hadoop IPC and implemented basic
>> HMAC signing.  It is I believe faster than public key private key
>> because it uses a secret key and does not require public key
>> provisioning like PKI would.  Perhaps it would be a baseline way to
>> sign the data.
>
> That should work for authenticating messages between (trusted) nodes.
> Presumably the ipc.key value could be set in the Conf and all would be well.
>
> External job submitters shouldn't be given those keys; they'd need an
> HTTP(S) front end that could authenticate them however the organisation
> worked.
>
> Yes, that would be simpler. I am not enough of a security expert to say if
> it will work, but the keys should be easier to work with. As long as the
> configuration files are kept secure, your cluster will be locked.
>
> However, HDFS uses HTTP to serve blocks up -that needs to be locked down
>  too. Would the signing work there?
>
> -steve
>


Re: Question about Hadoop 's Feature(s)

2008-09-30 Thread Steve Loughran

Jason Rutherglen wrote:

I implemented an RMI protocol using Hadoop IPC and implemented basic
HMAC signing.  It is I believe faster than public key private key
because it uses a secret key and does not require public key
provisioning like PKI would.  Perhaps it would be a baseline way to
sign the data.


That should work for authenticating messages between (trusted) nodes. 
Presumably the ipc.key value could be set in the Conf and all would be well.


External job submitters shouldn't be given those keys; they'd need an 
HTTP(S) front end that could authenticate them however the organisation 
worked.


Yes, that would be simpler. I am not enough of a security expert to say 
if it will work, but the keys should be easier to work with. As long as 
the configuration files are kept secure, your cluster will be locked.


However, HDFS uses HTTP to serve blocks up -that needs to be locked down 
 too. Would the signing work there?


-steve
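
For reference, a minimal sketch of the kind of shared-secret HMAC signing
discussed above. This is plain javax.crypto usage, not the actual
implementation from the thread; the class name and algorithm choice are
illustrative.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Signs a message payload with a shared secret key; both ends hold the same
// key, so no public-key provisioning (PKI) is involved.
public class HmacSigner {
  private final SecretKeySpec key;

  public HmacSigner(byte[] sharedSecret) {
    this.key = new SecretKeySpec(sharedSecret, "HmacSHA1");
  }

  public byte[] sign(byte[] payload) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(key);
    // Append this signature to the message; the receiver recomputes it with
    // the same secret and compares.
    return mac.doFinal(payload);
  }
}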