Re: How does reducer get intermediate output?

2009-08-26 Thread Inifok Song
Hello Harish, I find that taskLogUrl.openConnection() often causes an IOException, and I suspect that the connection pool is too small. Could you tell me how I can get the Jetty settings for Hadoop? Thank you. Inifok 2009/8/27 Harish Mallipeddi > On Thu, Aug 27, 2009 at 8:34 AM, inifok.song wrote: >

Re: Concatenating files on HDFS

2009-08-26 Thread Ankur Goel
HDFS files are write-once, so you cannot append to them (at the moment). What you can do is copy your local file to the HDFS directory containing the file you want to append to. Once that is done you can run a simple (Identity Mapper & Identity Reducer) mapreduce job with input as this directory and nu
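
The write-once workaround described above (build a new file from a directory of inputs rather than appending in place) can be sketched on a local filesystem. This is only an illustration: it uses local files as a stand-in for HDFS, the function name concat_into_new_file is hypothetical, and it is not the MapReduce job the message refers to. A real MapReduce job sorts records by key during the shuffle, so the order of lines in its output may differ from this sketch.

```python
import os
import shutil


def concat_into_new_file(input_dir, output_path):
    """Write a NEW file holding the contents of every file in input_dir.

    Hypothetical helper: local files stand in for HDFS files here.
    Nothing is appended in place; the inputs are left untouched.
    output_path should live outside input_dir.
    """
    # Fix the input list before the output file is created.
    names = sorted(os.listdir(input_dir))
    with open(output_path, "wb") as out:
        for name in names:
            path = os.path.join(input_dir, name)
            if os.path.isfile(path):
                with open(path, "rb") as src:
                    # Stream the bytes so large files are not loaded whole.
                    shutil.copyfileobj(src, out)
    return output_path
```

Files are concatenated in name order, so naming the existing file and the new file appropriately decides which content comes first.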

Re: How does reducer get intermediate output?

2009-08-26 Thread Harish Mallipeddi
On Thu, Aug 27, 2009 at 8:34 AM, inifok.song wrote: > Hi all, > > In my cluster, the reducer often can't fetch the mapper's output. I know there are many reasons for this situation, and I think it's necessary to find out how the reducer gets the intermediate output. I have read the source code. Howe

How does reducer get intermediate output?

2009-08-26 Thread inifok.song
Hi all, In my cluster, the reducer often can't fetch the mapper's output. I know there are many reasons for this situation, and I think it's necessary to find out how the reducer gets the intermediate output. I have read the source code. However, I'm not clear about the whole process. Could you tell me th

control map to split assignment

2009-08-26 Thread Rares Vernica
Hello, I wonder if there is a way to control how maps are assigned to splits in order to balance the load across the cluster. Here is a simplified example. I have two types of inputs: "long" and "short". Each input is in a different file and will be processed by a single map task. Suppose the "lo
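
The balancing goal in the question above can be illustrated with a small scheduling sketch. It assumes hypothetical per-task costs for "long" and "short" inputs and a fixed number of map slots, and it uses a common greedy heuristic (longest task first, always onto the least-loaded slot). This is not how Hadoop assigns maps to splits, and it is not something the message proposes; it only shows why starting the long tasks first tends to even out the load.

```python
import heapq


def assign_longest_first(task_costs, num_slots):
    """Greedy longest-first assignment of tasks to slots.

    task_costs: dict of task name -> estimated cost (hypothetical values).
    Returns (assignment, loads), both keyed by slot index.
    """
    # Min-heap of (current load, slot index): the least-loaded slot pops first.
    slots = [(0, i) for i in range(num_slots)]
    heapq.heapify(slots)
    assignment = {i: [] for i in range(num_slots)}
    # Most expensive tasks first; ties are broken by name for determinism.
    ordered = sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, cost in ordered:
        load, idx = heapq.heappop(slots)
        assignment[idx].append(name)
        heapq.heappush(slots, (load + cost, idx))
    loads = {i: sum(task_costs[n] for n in names)
             for i, names in assignment.items()}
    return assignment, loads


# Assumed example: two long inputs (cost 10) and four short ones (cost 1).
costs = {"long1": 10, "long2": 10,
         "short1": 1, "short2": 1, "short3": 1, "short4": 1}
assignment, loads = assign_longest_first(costs, num_slots=2)
```

With these assumed costs each slot receives one long task and two short ones, so both slots finish with the same total load.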

Symlink support

2009-08-26 Thread Yasuyuki Watanabe
Hi! Could someone tell me about the status of symbolic link support in HDFS (HDFS-245)? It looks like a patch has been merged into the latest trunk, so I would like to know how well it works and whether or not the patch is applicable to the current release of Hadoop. We have just started testing HDFS as a part

Re: Seattle / NW Hadoop, HBase Lucene, etc. Meetup , Wed August 26th, 6:45pm

2009-08-26 Thread Bradford Stephens
Hello, My apologies, but there was a mix-up reserving our meeting location, and we don't have access to it. I'm very sorry, and beer is on me next month. Promise :) Sent from my Internets On Aug 25, 2009, at 4:21 PM, Bradford Stephens wrote: Hey there, Apologies for this not going out

MapReduce read patterns

2009-08-26 Thread hadooprcoks
Hi, I wanted to know about the usual read patterns of MapReduce applications against HDFS, based on users' experiences here. I think large requests (32K and above) are more common, but I wanted to know whether small reads (512 bytes - 1K) are common too. Thanks.

Re: Intra-datanode balancing?

2009-08-26 Thread Kris Jirapinyo
Hmm, then in that case it is possible for me to manually balance the load on those datanodes by moving most of the files onto the new, larger partition. I will try it. Thanks! -- Kris J. On Wed, Aug 26, 2009 at 10:13 AM, Raghu Angadi wrote: > Kris Jirapinyo wrote: > >> But I mean, then how does that

Re: Intra-datanode balancing?

2009-08-26 Thread Raghu Angadi
Kris Jirapinyo wrote: But I mean, then how does that datanode know that these files were copied from one partition to another, in this new directory? I'm not sure of the inner workings of how a datanode knows what files are on itself...I was assuming that it knows by keeping track of the subdir di

Re: Intra-datanode balancing?

2009-08-26 Thread Kris Jirapinyo
But I mean, then how does that datanode know that these files were copied from one partition to another, in this new directory? I'm not sure of the inner workings of how a datanode knows what files are on itself...I was assuming that it knows by keeping track of the subdir directory...or is that jus

Concatenating files on HDFS

2009-08-26 Thread Turner Kunkel
Is there any way to concatenate/append a local file to a file on HDFS without copying down the HDFS file locally first? I tried: bin/hadoop dfs -cat file:///[local file] >> hdfs://[hdfs file] But it just tries to look for hdfs://[hdfs file] as a local file, since I suppose the dfs -cat command doe

RE: 0.19.1 infinite loop

2009-08-26 Thread Jeremy Pinkham
Thanks Brian. I'm trying to find a way to reliably replicate it, and will certainly update this list if I manage to do so. It is happening with more frequency in our QA environment, which is a much smaller cluster (only 2 nodes), but still not deterministically. Hopefully we can hone in on somet

Re: Testing Hadoop job

2009-08-26 Thread Jason Venner
I put together a framework for the Pro Hadoop book that I use quite a bit, and it has some documentation in the book examples ;) I haven't tried it with 0.20.0, however. The nicest thing that I did with the framework was provide a way to run a persistent mini virtual cluster for running multiple tests

Re: 0.19.1 infinite loop

2009-08-26 Thread Brian Bockelman
Hey Jeremy, Glad someone else has run into this! I always thought this specific infinite loop was in my code. I had an issue open for it earlier, but I ultimately was not sure if it was in my code or HDFS, so we closed it: https://issues.apache.org/jira/browse/HADOOP-4866 We [and others]

0.19.1 infinite loop

2009-08-26 Thread Jeremy Pinkham
I'm using Hadoop 0.19.1 on a 60-node cluster; each node has 8 GB of RAM and 4 cores. I have several jobs that run every day, and last night one of them triggered an infinite loop that rendered the cluster inoperable. As the job finishes, the following is logged to the job tracker logs: 2009-08-

Testing Hadoop job

2009-08-26 Thread Nikhil Sawant
Hi, can you suggest a Hadoop unit testing framework other than MRUnit? I have used MRUnit, but I'm not sure about its feasibility and its support for Hadoop 0.20. I could not find proper documentation for MRUnit; is it available anywhere? -- cheers nikhil

HBase master not starting

2009-08-26 Thread ilayaraja
Hello, I am trying to set up HBase-0.20 with Hadoop-0.20 in fully distributed mode. I have a problem while starting the HBase master. The stack trace is as follows: 2009-08-26 01:18:31,454 INFO org.apache.hadoop.hbase.master.HMaster: My address is domU-12-31-39-00-0A-52.compute-1.internal:6 200