Re: Practical limits on number of blocks per datanode.

2008-11-21 Thread Johan Oskarsson
Hi Rick, unfortunately 4,800,000 blocks per node is going to be too much. Ideally you'd want to merge your files into as few as possible; even 1MB per file is quite small for Hadoop. Would it be possible to merge them into hundreds of MBs or, preferably, gigabyte-sized files? In newer Hadoop versions
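A minimal sketch (not from the original thread) of one common way to do that merge: pack the small files into a single SequenceFile, keyed by file name, so the NameNode tracks a handful of large blocks instead of millions of tiny ones. The paths and the key/value layout here are assumptions, and the API shown is the 0.18-era one.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    // Packs every file under an input directory into one SequenceFile:
    // key = original file name, value = raw file contents.
    public class SmallFilePacker {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path inDir = new Path(args[0]);    // e.g. a directory of many small files
        Path outFile = new Path(args[1]);  // the single packed SequenceFile

        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, outFile, Text.class, BytesWritable.class);
        try {
          for (FileStatus stat : fs.listStatus(inDir)) {
            if (stat.isDir()) continue;
            byte[] buf = new byte[(int) stat.getLen()];  // files are small, so read them whole
            FSDataInputStream in = fs.open(stat.getPath());
            try {
              in.readFully(0, buf);
            } finally {
              in.close();
            }
            writer.append(new Text(stat.getPath().getName()), new BytesWritable(buf));
          }
        } finally {
          writer.close();
        }
      }
    }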

Re: Map merge part makes the task timeout

2008-11-21 Thread David Alves
Hi again. Browsing the source code (Merger.class) I see that the merger actually calls reporter.progress(), so shouldn't this make the task be reported as still working? Regards David Alves On Nov 20, 2008, at 6:29 PM, David Alves wrote: Hi all I have a big map task that takes a long time
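For context, a minimal sketch (old org.apache.hadoop.mapred API, hypothetical mapper) of the usual way a long-running task keeps itself from being timed out: call reporter.progress() while doing slow per-record work, so the framework sees a heartbeat before mapred.task.timeout expires.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical mapper doing slow per-record work; reporter.progress()
    // (or setStatus()/incrCounter()) tells the framework the task is alive.
    public class SlowMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, LongWritable> output, Reporter reporter)
          throws IOException {
        // ... expensive processing of 'value' would go here ...
        reporter.progress();  // heartbeat so the task is not killed as hung
        output.collect(value, new LongWritable(1));
      }
    }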

ls command output format

2008-11-21 Thread Alexander Aristov
Hello, I wonder if the hadoop shell command ls has changed its output format. Trying hadoop-0.18.2 I got the following output: [root]# hadoop fs -ls / Found 2 items drwxr-xr-x - root supergroup 0 2008-11-21 08:08 /mnt drwxr-xr-x - root supergroup 0 2008-11-21 08:19 /repos Though according

Re: ls command output format

2008-11-21 Thread Tsz Wo (Nicholas), Sze
Hi Alex, Yes, the doc about ls is outdated. Thanks for pointing this out. Would you mind filing a JIRA? Nicholas Sze - Original Message From: Alexander Aristov [EMAIL PROTECTED] To: core-user@hadoop.apache.org Sent: Friday, November 21, 2008 6:08:08 AM Subject: Re: ls

Re: ls command output format

2008-11-21 Thread Allen Wittenauer
On 11/21/08 6:03 AM, Alexander Aristov [EMAIL PROTECTED] wrote: Trying hadoop-0.18.2 I got the following output: [root]# hadoop fs -ls / Found 2 items drwxr-xr-x - root supergroup 0 2008-11-21 08:08 /mnt drwxr-xr-x - root supergroup 0 2008-11-21 08:19 /repos ... which

Re: Hadoop Installation

2008-11-21 Thread Mithila Nagendra
I tried the 0.18.2 as well.. it gave me the same exception.. so I tried the lower version.. I should check if this works.. Thanks! On Fri, Nov 21, 2008 at 5:06 AM, Alex Loddengaard [EMAIL PROTECTED] wrote: Maybe try downloading the Apache Commons - Logging jars (

Re: Hadoop Installation

2008-11-21 Thread Mithila Nagendra
Hey Alex, which file do I download from the Apache Commons website? Thanks Mithila On Fri, Nov 21, 2008 at 8:15 PM, Mithila Nagendra [EMAIL PROTECTED] wrote: I tried the 0.18.2 as well.. it gave me the same exception.. so I tried the lower version.. I should check if this works.. Thanks! On

Redirecting the logs to remote log server?

2008-11-21 Thread Erik Holstad
Hi! I have been trying to redirect the logs from Hadoop to a remote log server. I tried adding the socket appender to the log4j.properties file in the conf directory, and also adding the commons-logging + log4j jars + the same log4j.properties file into the WEB-INF of the master, but I still get
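For what it's worth, a minimal log4j.properties sketch of the kind of SocketAppender setup being attempted; the remote host, the port, and the DRFA appender name (from the stock Hadoop conf) are assumptions here, not details from this thread.

    # Keep the existing local appender (DRFA in the stock Hadoop conf) and add a remote one.
    log4j.rootLogger=INFO, DRFA, REMOTE

    # loghost.example.com is a placeholder; 4560 is log4j's default SocketNode port.
    log4j.appender.REMOTE=org.apache.log4j.net.SocketAppender
    log4j.appender.REMOTE.RemoteHost=loghost.example.com
    log4j.appender.REMOTE.Port=4560
    log4j.appender.REMOTE.ReconnectionDelay=10000

The receiving side then needs a log4j SocketNode (e.g. org.apache.log4j.net.SimpleSocketServer) listening on that port.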

combiner without reducer

2008-11-21 Thread Amogh Vasekar
Hi, I believe currently a combiner is not run unless you have at least one reducer set. Not getting into the Hadoop-18 semantics of the combiner running on both sides (the number of reducers is 0 anyway, so I guess the merge-combine doesn't come into the picture at all), I have a use case where I
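A minimal sketch of the behaviour being described, using stock 0.18-era library classes (paths and job wiring are illustrative only): with setNumReduceTasks(0) the map output is written straight to the OutputFormat, so the combiner configured here is never invoked; it only runs on the map-side spill/merge path when there is at least one reducer.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.LongSumReducer;
    import org.apache.hadoop.mapred.lib.TokenCountMapper;

    public class CombinerDemo {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(CombinerDemo.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(TokenCountMapper.class);
        conf.setCombinerClass(LongSumReducer.class);  // only applied when reduces > 0
        conf.setReducerClass(LongSumReducer.class);
        // Map-only job: output goes straight to the OutputFormat, so the combiner
        // above is skipped. Set this to >= 1 if the combiner is supposed to run.
        conf.setNumReduceTasks(0);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
      }
    }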

Re: Hadoop Installation

2008-11-21 Thread Alex Loddengaard
Download the 1.1.1.tar.gz binaries. This file will have a bunch of JAR files; drop the JAR files into $HADOOP_HOME/lib and see what happens. Alex On Fri, Nov 21, 2008 at 9:19 AM, Mithila Nagendra [EMAIL PROTECTED] wrote: Hey Alex, which file do I download from the Apache Commons website?

Re: Hadoop Installation

2008-11-21 Thread Mithila Nagendra
I tried dropping the jar files into the lib. It still doesn't work.. The following is how the lib looks after the new files were put in: [EMAIL PROTECTED] hadoop-0.17.2.1]$ cd bin [EMAIL PROTECTED] bin]$ ls hadoop hadoop-daemon.sh rcc start-all.sh start-dfs.sh stop-all.sh

Re: Hung in DFSClient$DFSOutputStream.writeChunk

2008-11-21 Thread stack
Trying to get more data on the issue reported below, I tripped over the following, where a datanode died and the DFSClient is trying to transition to getting the wanted block from another. The transition attempt is unsuccessful, but what's odd is that we do not proceed to the datanode carrying the 3rd replica.

Re: too many open files? Isn't 4K enough???

2008-11-21 Thread Yuri Pradkin
We created a JIRA for this as well as a patch; please see http://issues.apache.org/jira/browse/HADOOP-4614. I hope it'll make it into svn soon (it's been kind of slow lately). Are you able to create a reproducible setup for this? I haven't been able to. Yes, we did see consistent

NN JVM process takes a lot more memory than assigned

2008-11-21 Thread Raghu Angadi
There is one instance of the NN where the JVM process takes 40GB of memory though the JVM is started with 24GB. The Java heap is still 24GB; it looks like it ends up taking a lot of memory outside the heap. There are a lot of entries in pmap, similar to the one below, that account for the difference. Does anyone know what this might be?

Re: Conf Object witout hadoop-default.xml and hadoop-site.xml

2008-11-21 Thread Aaron Kimball
You'll want to set fs.default.name, which identifies the filesystem to use. Based on the protocol prefix of the URI (e.g., hdfs://, file://, etc.), it will choose a different filesystem implementation. So you'll also need to set the relevant fs.*.impl keys. See hadoop-default.xml for the correct
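A minimal sketch of constructing a Configuration that skips the default resources entirely; the NameNode URI is a placeholder and the fs.hdfs.impl class name assumes a 0.18-era build (it lived under org.apache.hadoop.dfs in earlier releases).

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class NoDefaultsConf {
      public static void main(String[] args) throws IOException {
        // 'false' skips loading hadoop-default.xml / hadoop-site.xml,
        // so every key the client relies on has to be set by hand.
        Configuration conf = new Configuration(false);
        conf.set("fs.default.name", "hdfs://namenode.example.com:9000");           // placeholder host:port
        conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");  // 0.18-era impl class
        FileSystem fs = FileSystem.get(conf);
        System.out.println("root exists? " + fs.exists(new Path("/")));
      }
    }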

Re: Any suggestion on performance improvement ?

2008-11-21 Thread Aaron Kimball
It's worth pointing out that Hadoop really isn't designed to run at such a small scale. Hadoop's performance doesn't really begin to kick in until you've got tens of GBs of data. The question is sort of like asking how to make an 18-wheeler run faster when carrying only a single bag of