How to speed up the copy phase?

2009-08-23 Thread yang song
Hello, everyone. When I submit a big job (e.g. maptasks:1, reducetasks:500), I find that the copy phase lasts for a very long time. From the WebUI, the message "reduce > copy ( of 1 at 0.01 MB/s) >" tells me the transfer speed is just 0.01 MB/s. Is that a normal value? How can I solve

Some issues related to cluster invocation!

2009-08-23 Thread Sugandha Naolekar
Hello! It takes a lot of time to invoke all the nodes in a cluster and run the corresponding daemons. Why is that? It seems to be a very tedious job! -- Regards! Sugandha

Re: Datanode-Failure..!

2009-08-23 Thread Sugandha Naolekar
Hello! I am still getting the same status: 09/08/24 09:46:37 WARN datanode.DataNode: Invalid directory in dfs.data.dir: can not create directory: /home/hadoop/Softwares/hadoop-0.19.0/temp/dfs/data 09/08/24 09:46:37 ERROR datanode.DataNode: All directories in dfs.data.dir are invalid. 09/08/24 09

Re: NLineInputFormat - Map always left out one split

2009-08-23 Thread Anh Nguyen
Fixed. I restarted Hadoop and everything works again. It could be that there are some hidden bugs in the Hadoop code that get triggered. Anh On Sun, Aug 23, 2009 at 3:39 PM, Anh Nguyen wrote: > I am using Hadoop for one of my research. I use NLineInputFormat for Map, > which take a few lines as o

Re: utilizing all cores on single-node hadoop

2009-08-23 Thread Vasilis Liaskovitis
Hi, thanks to everyone for the valuable suggestions. What would be the default number of map and reduce tasks for the sort-rand example described at: http://wiki.apache.org/hadoop/Sort This is one of the simplest possible examples and uses identity mapper/reducers. I am seeing 160 map tasks and 2

problem using Hadoop 0.18.3 with lzo 2.03

2009-08-23 Thread Bill Au
I am using Hadoop 0.18.3 with lzo 2.03. I am able to compile Hadoop's native code and load lzo's native library. I am trying to run the grep example in examples.jar on a lzo-compressed file. I am getting an OutOfMemoryError on the Java heap space. My input file is 1628727 bytes which compressed

Re: Getting free space percentage on DFS

2009-08-23 Thread Arvind Sharma
The APIs work for the user with which Hadoop was started. And moreover I don't think the User level authentication is there yet in Hadoop (not sure here though) for APIs... From: Stas Oskin To: common-user@hadoop.apache.org Sent: Sunday, August 23, 2009 1:33:

NLineInputFormat - Map always left out one split

2009-08-23 Thread Anh Nguyen
I am using Hadoop for one of my research projects. I use NLineInputFormat for Map, which takes a few lines as one split. Each line specifies a filename. So if I have 10 input files 1..10 in my hdfs home, I would have an input file like this: *~/1* *~/2* *.* *.* *.* *~/10* It used to work fine but recently I

Re: Getting free space percentage on DFS

2009-08-23 Thread Stas Oskin
Hi. Thank you both for the advice. Any idea whether these approaches work for a non-super user? Regards.

Re: Getting free space percentage on DFS

2009-08-23 Thread Arvind Sharma
You can try something like this: if (_FileSystem instanceof DistributedFileSystem) { DistributedFileSystem dfs = (DistributedFileSystem) _FileSystem; DiskStatus ds = dfs.getDiskStatus(); long capacity = ds.getCapacity(); long used = ds.getD
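Once capacity and used figures like the ones in the snippet above are in hand, the percentage itself is simple arithmetic. The following is only a sketch: `DfsSpaceSketch` and `freePercent` are hypothetical names, not part of the Hadoop API, and treating free space as capacity minus used is an assumption that ignores space taken by non-DFS files on the same disks.

```java
// Hypothetical helper for illustration; not part of the Hadoop API.
class DfsSpaceSketch {

    /**
     * Returns free space as a percentage of capacity.
     * Assumes free = capacity - used; returns 0.0 when capacity is not positive.
     */
    static double freePercent(long capacityBytes, long usedBytes) {
        if (capacityBytes <= 0) {
            return 0.0; // avoid division by zero on an empty or unreported capacity
        }
        // Clamp at zero in case the reported usage exceeds the reported capacity.
        long freeBytes = Math.max(0L, capacityBytes - usedBytes);
        return 100.0 * freeBytes / capacityBytes;
    }

    public static void main(String[] args) {
        // Example figures only; real values would come from the file system object.
        long capacity = 1000L;
        long used = 250L;
        System.out.println("Free: " + freePercent(capacity, used) + "%");
    }
}
```

Keeping the arithmetic in a function that takes plain `long` values means it can be checked without a running cluster.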

hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException

2009-08-23 Thread Xie, Tao
I found this exception below in my datanode log. Anybody know why this happens? 09/08/23 22:13:50 WARN hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected loc

Re: Getting free space percentage on DFS

2009-08-23 Thread Edward Capriolo
On Sun, Aug 23, 2009 at 7:22 AM, Stas Oskin wrote: > Hi. > > How can I get the free / used space on DFS, via Java? > > What are the functions that can be used for that? > > Note, I'm using a regular (non-super) user, so I need to do it in a similar > way to dfshealth.jsp, which AFAIK doesn't requir

Hadoop streaming: How is data distributed from mappers to reducers?

2009-08-23 Thread Nipun Saggar
Hi all, I have recently started using Hadoop streaming. From the documentation, I understand that by default, each line output from a mapper up to the first tab becomes the key and the rest of the line is the value. I wanted to know that between the mapper and reducer, is there a shuffling (sorting) ph
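The default rule described in the question (everything up to the first tab is the key, the remainder is the value) can be illustrated with a small sketch. `StreamingLineSketch` and `splitKeyValue` are hypothetical names, and this is not the code Hadoop streaming itself runs. Treating a line without a tab as a key with an empty value is an assumption made here for illustration.

```java
// Hypothetical illustration of the "split at first tab" rule; not Hadoop's own code.
class StreamingLineSketch {

    /**
     * Splits one mapper output line into {key, value} at the first tab.
     * Assumption: a line with no tab is treated as a key with an empty value.
     */
    static String[] splitKeyValue(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { line, "" };
        }
        // Only the first tab separates key from value; later tabs stay in the value.
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        String[] kv = splitKeyValue("word\t1\textra");
        System.out.println("key=" + kv[0] + " value=" + kv[1]);
    }
}
```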

Getting free space percentage on DFS

2009-08-23 Thread Stas Oskin
Hi. How can I get the free / used space on DFS, via Java? What are the functions that can be used for that? Note, I'm using a regular (non-super) user, so I need to do it in a similar way to dfshealth.jsp, which AFAIK doesn't require any permissions. Thanks in advance.