Tasks seem to fail randomly with nonzero status of 1

2011-03-02 Thread Marc Sturlese
Hey there, My cluster was working fine, but suddenly lots and lots of tasks started failing like: java.lang.Throwable: Child Error at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:472) Caused by: java.io.IOException: Task process exit with nonzero status of 1. at

Re: Tasks seem to fail randomly with nonzero status of 1

2011-03-02 Thread Hari Sreekumar
Did this happen just once, or does it happen every time? This usually happens when the Child processes are forcibly killed. If it was a one-off thing, it is possible that someone else working on your machine at the same time killed the processes. If it happens every time, then it could be due to lack

Re: Tasks seem to fail randomly with nonzero status of 1

2011-03-02 Thread Marc Sturlese
Well, I've been running these jobs for days. It has only been happening since last night, and now the error keeps happening even if I restart. I'm the only one using the cluster.

FileSystem.exists() returns false when querying for the directory (Linux)

2011-03-02 Thread xypod-ii
Hello, I've got the following problem: I'm extending HadoopTestCase to test some functionality. The test creates some files and directories in the tmp directory, then performs some actions, and the fileSystem.exists() method is used. This works perfectly for files and under Windows also for

Get the number of map reduce tasks

2011-03-02 Thread xypod-ii
Hello, Is it possible to get the actual numbers of map and reduce tasks from the level of JobClient or RunningJob? Jobtracker webapp gets this information directly from JobTracker using JobInProgress class (desiredMaps and desiredReduces), but is it possible to retrieve this information from

conceptual question regarding slots

2011-03-02 Thread bikash sharma
Hi, Could someone shed some light on how, intuitively, fixed-type slots in Hadoop have a negative impact on cluster utilization, as mentioned in Arun's blog? http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/ Thanks, Bikash

hadoop installation problem(single-node)

2011-03-02 Thread Manish Yadav
Dear Sir/Madam, I'm very new to Hadoop. I'm trying to install Hadoop on my computer. I followed a weblink and tried to install it. I want to install Hadoop on my single-node cluster. I'm using Ubuntu 10.04 64-bit as my operating system. I have installed Java in /usr/java/jdk1.6.0_24. The step

RE: hadoop installation problem(single-node)

2011-03-02 Thread Habermaas, William
If you are interested in a quick-start Hadoop setup and don't mind if HBase is included, take a look at the dashboard application at www.habermaas.com. It is a free packaged Hadoop setup. Just unzip it and run it. Bill -Original Message- From: Manish Yadav [mailto:manish.ya...@orkash.com]

Re: hadoop installation problem(single-node)

2011-03-02 Thread Matthew John
Hey Manish, Are you running the commands from the Hadoop_home directory? If yes, please run bin/hadoop namenode -format. Don't forget to prepend bin/ to your commands, because all the scripts reside in the bin directory. Matthew On Wed, Mar 2, 2011 at 2:29 PM, Manish Yadav manish.ya...@orkash.com

Hadoop C++ Task Fails during runtime

2011-03-02 Thread Kumar, Amit H.
Hi All, I am trying to follow the first steps for getting a simple C++ program to work using Hadoop Pipes, and I get the following error while running it. Can anybody help me understand what I could be doing wrong? I used the following code. http://wiki.apache.org/hadoop/C%2B%2BWordCount # hadoop

Small file Map performance

2011-03-02 Thread Aaron Baff
So, the problem is we have a crap ton of small files, and a limited sized cluster (only 4 nodes, just up from 2, yay!) as we are just starting to use Hadoop. With our current hardware, we have 32 Map slots, and 1500 files. The Task startup time is, frankly, killing us, and at this time we can't
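A rough way to see why task startup dominates here is to count scheduling "waves". The sketch below assumes one map task per small file (the usual outcome when each file is smaller than a block) and uses the 1500 files and 32 map slots from the message; the per-task startup cost is a hypothetical parameter, not a measured value.

```python
import math


def map_waves(num_tasks, map_slots):
    """Number of rounds needed to run num_tasks when map_slots run at once."""
    return math.ceil(num_tasks / map_slots)


def startup_overhead_seconds(num_tasks, map_slots, startup_s):
    """Wall-clock time spent on task startup alone, assuming every wave
    pays the same startup cost (startup_s is a hypothetical figure)."""
    return map_waves(num_tasks, map_slots) * startup_s


if __name__ == "__main__":
    # 1500 files on 32 slots, as in the message above.
    print(map_waves(1500, 32))
    # Hypothetical 3 seconds of startup per task.
    print(startup_overhead_seconds(1500, 32, 3))
```

Under these assumptions the startup cost scales with the number of waves, so fewer, larger inputs per task would reduce it.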

Re: hadoop installation problem(single-node)

2011-03-02 Thread Tom White
The instructions at http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html should be what you need. Cheers, Tom On Wed, Mar 2, 2011 at 12:59 AM, Manish Yadav manish.ya...@orkash.com wrote: Dear Sir/Madam  I'm very new to hadoop. I'm trying to install hadoop on my computer. I followed a

Re: FileSystem.exists() returns false when querying for the directory (Linux)

2011-03-02 Thread Harsh J
Could you post your test case demonstrating this? TestLocalFileSystem entirely passes on my Linux desktop box, via ant - and it does contain various directory exists tests. Also, are you sure that your created FS instance is LocalFileSystem and not something else? On Wed, Mar 2, 2011 at 4:23

ToolRunner run function

2011-03-02 Thread maha
Hi, Assuming my program implements the ToolRunner, my question is where does the run function execute? ie. which daemon (DataNode/TT) ? or is it on the local machine where it is run? Thank you, Maha

Re: Hadoop Case Studies?

2011-03-02 Thread Ted Pedersen
Greetings all, Since posting my original request I ran across the following, which is a nice example of what I'd call a case study. Gives a few details at least and is kind of an interesting or creative use of Hadoop...

Re: Get the number of map reduce tasks

2011-03-02 Thread Harsh J
The submitted job file (job.xml) contains the 'right' mapred.map.tasks amount set into it (Use JobContext.NUM_MAPS instead in new API, I guess). If you can read that file back into a configuration object (I believe RunningJob lets you get hold of the submitted job file), you can get hold of what
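The idea of reading the task counts back from the submitted job file can be sketched outside Hadoop as plain XML parsing. This is a minimal sketch, assuming the standard Hadoop configuration layout of property elements with name and value children; the sample content and the helper name read_task_counts are hypothetical, and a real job.xml carries many more properties.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down job.xml content for illustration only.
SAMPLE_JOB_XML = """<?xml version="1.0"?>
<configuration>
  <property><name>mapred.map.tasks</name><value>12</value></property>
  <property><name>mapred.reduce.tasks</name><value>3</value></property>
</configuration>
"""


def read_task_counts(job_xml_text):
    """Return (maps, reduces) from a Hadoop-style configuration XML string."""
    root = ET.fromstring(job_xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    # Raises KeyError if either property is missing from the file.
    return int(props["mapred.map.tasks"]), int(props["mapred.reduce.tasks"])


if __name__ == "__main__":
    print(read_task_counts(SAMPLE_JOB_XML))
```

With the old-API property names used in the message, the same lookup would apply to a job.xml fetched from the cluster, provided the file can be read back.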

Re: Comparison between Gzip and LZO

2011-03-02 Thread Niels Basjes
Question: Are you 100% sure that nothing else was running on that system during the tests? No cron jobs, no makewhatis or updatedb? P.S. There is a permission issue with downloading one of the files. 2011/3/2 José Vinícius Pimenta Coletto jvcole...@gmail.com: Hi, I'm making a comparison

RE: ToolRunner run function

2011-03-02 Thread Michael Segel
Run is local to your edge machine where you launched your job. It then connects to the cluster / job tracker ... HTH -Mike From: m...@umail.ucsb.edu Subject: ToolRunner run function Date: Wed, 2 Mar 2011 12:10:05 -0800 To: common-user@hadoop.apache.org Hi, Assuming my program

Speculative execution

2011-03-02 Thread Keith Wiley
I realize that the intended purpose of speculative execution is to overcome individual slow tasks...and I have read that it explicitly is *not* intended to start copies of a task simultaneously and to then race them, but rather to start copies of tasks that seem slow after running for a while.

Re: ToolRunner run function

2011-03-02 Thread maha
Thanks Mike :) I was also wondering what if: hdfs.CopyToLocal( src-file, dst-file) ; // is executed on node N and there exists a copy of src-file from the replication process in that same node(N) local file system ? Will hdfs recognize that there is already a copy in there and hence

RE: Hadoop C++ Task Fails during runtime

2011-03-02 Thread Kumar, Amit H.
Just wanted to add to this: my setup is working correctly for the Java examples without any issues or errors. Thank you for any help! Amit -Original Message- From: Kumar, Amit H. [mailto:ahku...@odu.edu] Sent: Wednesday, March 02, 2011 10:51 AM To: common-user@hadoop.apache.org

Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Mark Kerzner
Hi, I am doing a pseudo-distributed mode on my laptop, following the same steps I used for all configurations on my regular cluster, but I get this error 2011-03-02 16:45:13,651 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=mapred ip=/ 192.168.1.150 cmd=delete

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Bibek Paudel
On Wed, Mar 2, 2011 at 11:57 PM, Mark Kerzner markkerz...@gmail.com wrote: Hi, I am doing a pseudo-distributed mode on my laptop, following the same steps I used for all configurations on my regular cluster, but I get this error 2011-03-02 16:45:13,651 INFO

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Mark Kerzner
It has just one entry hadoop-sony and ping hadoop-sony PING hadoop-sony (192.168.1.150) 56(84) bytes of data. 64 bytes from ubuntu (192.168.1.150): icmp_req=1 ttl=64 time=0.024 ms On Wed, Mar 2, 2011 at 4:59 PM, Bibek Paudel eternalyo...@gmail.com wrote: On Wed, Mar 2, 2011 at 11:57 PM, Mark

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Mark Kerzner
All other daemons are alive, but the namenode daemon is dying. On Wed, Mar 2, 2011 at 5:01 PM, Mark Kerzner markkerz...@gmail.com wrote: It has just one entry hadoop-sony and ping hadoop-sony PING hadoop-sony (192.168.1.150) 56(84) bytes of data. 64 bytes from ubuntu (192.168.1.150): icmp_req=1

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Bibek Paudel
On Thu, Mar 3, 2011 at 12:01 AM, Mark Kerzner markkerz...@gmail.com wrote: It has just one entry hadoop-sony and ping hadoop-sony PING hadoop-sony (192.168.1.150) 56(84) bytes of data. 64 bytes from ubuntu (192.168.1.150): icmp_req=1 ttl=64 time=0.024 ms In that case, I think you should

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Eric Sammer
Check your /etc/hosts file and make sure the hostname of the machine is not on the loopback device. This is almost always the cause of this. On Wed, Mar 2, 2011 at 5:57 PM, Mark Kerzner markkerz...@gmail.com wrote: Hi, I am doing a pseudo-distributed mode on my laptop, following the same
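The /etc/hosts check described above can be automated. This is a minimal sketch that takes hosts-file text and reports whether a given hostname is mapped to a loopback address; the helper names and the sample file content are hypothetical, and only the hostname hadoop-sony comes from the thread.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the set of names that hosts-file text maps to a loopback address."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines whose first field is not an IP address
        if addr.is_loopback:
            names.update(fields[1:])
    return names


def hostname_on_loopback(hosts_text, hostname):
    """True if hostname resolves to a loopback address in hosts_text."""
    return hostname in loopback_names(hosts_text)


# Hypothetical file content showing the problematic layout.
SAMPLE_HOSTS = """127.0.0.1 localhost
127.0.1.1 hadoop-sony   # machine name on the loopback device
192.168.1.150 other-host
"""

if __name__ == "__main__":
    print(hostname_on_loopback(SAMPLE_HOSTS, "hadoop-sony"))
```

To check a real machine, the text could be read from /etc/hosts and the name taken from socket.gethostname().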

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Bibek Paudel
On Thu, Mar 3, 2011 at 12:07 AM, Mark Kerzner markkerz...@gmail.com wrote: all other daemons are alive, but namenode daemon dying In particular, please check this setting: dfs.datanode.ipc.address -b On Wed, Mar 2, 2011 at 5:01 PM, Mark Kerzner markkerz...@gmail.com wrote: It has just one

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Bibek Paudel
On Thu, Mar 3, 2011 at 12:08 AM, Eric Sammer esam...@cloudera.com wrote: Check your /etc/hosts file and make sure the hostname of the machine is not on the loopback device. This is almost always the cause of this. +1 -b On Wed, Mar 2, 2011 at 5:57 PM, Mark Kerzner markkerz...@gmail.com

Reminder: SF Hadoop meetup in 1 week

2011-03-02 Thread Aaron Kimball
Hadoop fans, As a reminder -- the third SF Hadoop meetup is one week away! We will meet on March 9th, from 6pm to 8pm. (We will hopefully continue using the 2nd Wednesday of the month for successive meetups). This meetup will be hosted by Yelp, at their office at 706 Mission St, San Francisco

Module import problem

2011-03-02 Thread Luca Aiello
Hello everyone, I am experiencing some problems in importing external non-java libraries into my hadoop code. I am trying to run my code on a grid. I followed the instructions here: http://hadoop.apache.org/common/docs/current/native_libraries.html#Native+Shared+Libraries but I failed. I have

Re: Namenode trying to connect to localhost instead of the name and dying

2011-03-02 Thread Mark Kerzner
Thank you, Eric, thank you, Bibek. /etc/hosts was part of the problem, and then after some re-install commands it just started working :) Pleasure == working Hadoop cluster (even if it is pseudo-pleasure) Sincerely, Mark On Wed, Mar 2, 2011 at 5:09 PM, Bibek Paudel eternalyo...@gmail.com

Specify File Name to mappers

2011-03-02 Thread maha
Hi, If FileInputFormat is used with File.splitable(false) then each mapper will be getting a full file. I want the mapper to also know the path or at least name of the file it's assigned to. Please help, any ideas are appreciated. Thank you, Maha

Re: Comparison between Gzip and LZO

2011-03-02 Thread Brian Bockelman
I think some profiling is in order: claiming LZO decompresses at 1.0MB/s and is more than 3x faster at compression than decompression (especially when it's a well known asymmetric algorithm in favor of decompression speed) is somewhat unbelievable. I see that you use small files. Maybe

Re: Comparison between Gzip and LZO

2011-03-02 Thread James Seigel
Slightly off point for this conversation, but I thought it worth mentioning: LZO is splittable, which makes it a good fit for Hadoopy things. Just something to remember when you do get some final results on performance. Cheers James. On 2011-03-02, at 8:12 PM, Brian Bockelman wrote:

Performance Test

2011-03-02 Thread liupei
Hi, I'd like to tune params in the Hadoop config for my job. But my current cluster runs a lot of other processes such as mongod, PHP gateways and some other routine Hadoop jobs. It is impossible to stop them all to get a clean environment for testing. Is there any way to get reliable results for my

Re: Performance Test

2011-03-02 Thread Ted Dunning
It will be very difficult to do. If you have n machines running 4 different things, you will probably get better results segregating tasks as much as possible. Interactions can be very subtle and can have major impact on performance in a few cases. Hadoop, in general, will use a lot of the

Re: Specify File Name to mappers

2011-03-02 Thread Harsh J
The property 'map.input.file' should be what you're looking for. If you have your own RecordReader, then you can get the Path from the FileSplit object. On Thu, Mar 3, 2011 at 8:14 AM, maha m...@umail.ucsb.edu wrote: Hi, If FileInputFormat is used with File.splitable(false) then each mapper
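The reply above concerns Java jobs. For comparison, here is a sketch of the same lookup from a Hadoop Streaming mapper, which is a different technique from the one in the thread. Streaming exports job configuration to the task environment with non-alphanumeric characters replaced by underscores, so map.input.file appears as map_input_file. The second key checked below is an assumption about newer releases, and the helper names are hypothetical.

```python
import os
import sys


def input_file_name(environ=os.environ):
    """Return the input file path exposed to a streaming task, or None."""
    # map_input_file is the environment form of map.input.file; the second
    # key is an assumption covering releases that renamed the property.
    for key in ("map_input_file", "mapreduce_map_input_file"):
        value = environ.get(key)
        if value:
            return value
    return None


def tag_lines(lines, path):
    """Prefix each input line with the base name of the file it came from."""
    name = os.path.basename(path) if path else "unknown"
    for line in lines:
        yield name + "\t" + line.rstrip("\n")


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Mapper entry point: emit '<file name><TAB><line>' for every input line."""
    for out in tag_lines(stdin, input_file_name()):
        stdout.write(out + "\n")
```

To use it as a mapper script, call main() at the bottom of the file; it is left uncalled here so the helpers can be imported and tried on their own.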

Hadoop 0.21 running problems , no namenode to stop

2011-03-02 Thread Shivani Rao
Problems running a local installation of Hadoop on a single-node cluster. I followed the instructions given by tutorials to run hadoop-0.21 on a single-node cluster. The first problem I encountered was that of HADOOP-6953. Thankfully that has been fixed. The other problem I am facing is that the

Re: Hadoop 0.21 running problems , no namenode to stop

2011-03-02 Thread rahul patodi
Hi, Please check the logs; some error might have occurred while starting the daemons. Please post the error. On Thu, Mar 3, 2011 at 10:24 AM, Shivani Rao sg...@purdue.edu wrote: Problems running local installation of hadoop on single-node cluster I followed instructions given by tutorials to run

Re: hadoop installation problem(single-node)

2011-03-02 Thread manish.yadav
Matthew, thanks for the help. The command is working now, but I got the following errors. Will you help me solve them? Here is the error list. I just used the command hadoop@ws40-man-lin:~/project/hadoop-0.20.0$ bin/hadoop namenode -format and I get the following result: Exception in thread

Re: hadoop installation problem(single-node)

2011-03-02 Thread manish.yadav
Thanks for the help. The command is working now, but I got the following errors. Will you help me solve them? Here is the list of errors I faced while installing Hadoop on a single-node cluster. All the configuration files are attached to the earlier post. I just used the command

Re: Performance Test

2011-03-02 Thread liupei
Ted, thanks very much. I'll try to split these things. Ted Dunning wrote: It will be very difficult to do. If you have n machines running 4 different things, you will probably get better results segregating tasks as much as possible. Interactions can be very subtle and can have major impact

Some questions about streaming

2011-03-02 Thread liupei
Hi, I used to write some map/reduce jobs with the streaming interface and found it very friendly to those of us not familiar with Java. But I wonder whether the streaming interface can do as much as the Java interface. I have two questions: 1. Does there exist a 1-1 mapping between Java usage and streaming usage, so I

Re: hadoop installation problem(single-node)

2011-03-02 Thread Matthew John
Hey Manish, I am not sure you have got your configuration correct, including the Java path. Can you try re-installing Hadoop, following the guidelines given in the following link step by step? That would take care of any possible glitches.

Re: hadoop installation problem(single-node)

2011-03-02 Thread manish.yadav
Hi, thanks for replying. I already attached the configuration files in my earlier post. Please check them and tell me what I'm doing wrong.