Re: java.io.FileNotFoundException

2010-05-09 Thread Yang Li
I met the same problem on Windows + Cygwin. I don't know the root cause, but it can be worked around with an explicit "mapred.child.tmp" property. Try adding these lines to your /mapred-site.xml: mapred.child.tmp /tmp/hadoop/mapred/mapred.child.tmp The examples then worked on my laptop. Be
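Reassembled from the flattened text above, the workaround would look roughly like this in mapred-site.xml (a sketch; the property name and value are from the message, while the enclosing <configuration> element is assumed):

```xml
<!-- Sketch of the workaround described above: set mapred.child.tmp
     explicitly. The enclosing <configuration> element is assumed. -->
<configuration>
  <property>
    <name>mapred.child.tmp</name>
    <value>/tmp/hadoop/mapred/mapred.child.tmp</value>
  </property>
</configuration>
```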

Re: Fundamental question

2010-05-09 Thread Joseph Stein
1) Only the namenode is "formatted"; what happens is basically that the image file is created and prepped. The image file holds the metadata about how your files are stored on the cluster. 2) The datanodes are not formatted in the conventional sense. Their (datanode) disk usage will grow only wh

Re: Fundamental question

2010-05-09 Thread Bill Habermaas
These questions are usually answered once you start using the system but I'll provide some quick answers. 1. Hadoop uses the local file system at each node to store blocks. The only part of the system that needs to be formatted is the namenode which is where Hadoop keeps track of the logical H

Re: java.io.FileNotFoundException

2010-05-09 Thread Carlos Eduardo Moreira dos Santos
I've already tried that. Are you using 0.20.2? Thank you, Carlos Eduardo On Sun, May 9, 2010 at 8:34 AM, Yang Li wrote: > I met the same problem on window+cygwin, don't know the root cause, but it > can be worked around by a explicit "mapred.child.tmp" property. Try add > these lines to your /

Re: Data-Intensive Text Processing with MapReduce

2010-05-09 Thread Mark Kerzner
Dear Jimmy and Chris: I am reading your book (thank you for providing the pre-release version) and I find it great in contents and in style. Thank you! Sincerely, Mark On Sat, May 8, 2010 at 1:25 PM, Jimmy Lin wrote: > Hi everyone, > > I'm pleased to announce the publication of a new book on MapR

where job executed

2010-05-09 Thread Alan Miller
Hi, I have a class that runs on the master node and submits a bunch of MR jobs to my cluster, but how can I tell where each job actually executed? I'm using Cloudera's 0.20.2+228. I don't see any commands or pages in the GUI that tell me this. Looks like there are some classes that might provide

Re: where job executed

2010-05-09 Thread Ted Yu
InetAddress.getLocalHost() should give you the hostname for each mapper/reducer On Sun, May 9, 2010 at 12:16 PM, Alan Miller wrote: > Hi, > > I have class that I run on the master node and submits > a bunch of MR jobs to my cluster but how can I tell where > each job actually executed? I'm using
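Ted's suggestion can be sketched as a small helper: call it from inside a task (e.g. a Mapper's setup()) and log the result, then check the task logs to see which node each task ran on. The class and method names here are hypothetical:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: report the host the current JVM (i.e. the task)
// is executing on, via InetAddress.getLocalHost() as Ted suggests.
public class TaskHost {
    public static String name() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // From a real Mapper/Reducer you would log this instead.
        System.out.println("running on: " + name());
    }
}
```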

Re: Silly question about 'rack aware'...

2010-05-09 Thread Eli Collins
Hey Michael, The script specified by dfs.network.script is passed both hostnames and IPs. In most cases an IP is passed; however, in some cases (e.g. when using dfs.hosts files) a hostname is passed. Thanks, Eli ps - useful pointers: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200
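Because both forms can arrive, a rack script has to match on hostnames as well as IPs. A minimal sketch of such a script (the rack names, subnets, and hostname patterns here are made up for illustration):

```shell
#!/bin/sh
# Hypothetical rack-mapping script. Hadoop invokes it with one or more
# hostnames/IPs as arguments and expects one rack path per argument on
# stdout, in order. Subnets and hostname patterns are assumptions.
rack_for() {
  case "$1" in
    10.1.*|rack1-*) echo "/rack1" ;;
    10.2.*|rack2-*) echo "/rack2" ;;
    *)              echo "/default-rack" ;;
  esac
}

for host in "$@"; do
  rack_for "$host"
done
```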

Re: java.io.FileNotFoundException

2010-05-09 Thread Yang Li
Yes, I'm using 0.20.2. I run it in pseudo-distributed mode on Windows XP + Cygwin as a private playground. All default settings, with the following changes: core-site.xml: - fs.default.name hdfs://localhost:9000 hadoop.tmp.dir /tmp/hadoop hdfs-site.xm
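Reassembled from the flattened list above, the core-site.xml changes would look roughly like this (the <configuration> wrapper is assumed; the hdfs-site.xml portion of the message is cut off, so it is not reconstructed here):

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
  </property>
</configuration>
```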

Re: where job executed

2010-05-09 Thread Jeff Zhang
Do you mean where each task ran? You can look at the JobTracker web UI, where you can find each job's status and info. On Mon, May 10, 2010 at 3:16 AM, Alan Miller wrote: > Hi, > > I have class that I run on the master node and submits > a bunch of MR jobs to my cluster but how can I tell wher