fair scheduler in 0.19
Hi, It seems to me that anyone can change a job's pool assignment via the scheduler's web UI. In other words, administrators cannot strictly enforce pool assignments. Is this still true in 0.20? Thanks, Rong-En Fan
Re: Accessing local files
On Tue, Dec 23, 2008 at 9:12 AM, Rodrigo Schmidt rschm...@facebook.com wrote: Hi, I want to use a local file (present on the file system of a machine in my cluster) as the input for the mappers of my job. Is there an easy way to do that? I think you can use a file:// URI (i.e., the LocalFileSystem), or you can use the DistributedCache. Regards, Rong-En Fan
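For the file:// route, a minimal sketch against the old mapred API (the class and path names are illustrative; note the path must exist on every node that may run a map task):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.JobConf;

  public class LocalInputExample {
    public static void main(String[] args) {
      JobConf conf = new JobConf(LocalInputExample.class);
      // A file:// URI selects the LocalFileSystem instead of HDFS.
      FileInputFormat.setInputPaths(conf, new Path("file:///data/local/input"));
      // ... set mapper/reducer/output as usual, then JobClient.runJob(conf).
    }
  }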
Re: hadoop 0.18.2 Checksum ok was sent and should not be sent again
I believe it was for debugging purposes and was removed after 0.18.2 was released. On Mon, Nov 17, 2008 at 8:57 PM, Alexander Aristov [EMAIL PROTECTED] wrote: Hi all, I upgraded Hadoop to 0.18.2 and tried to run a test job, a distcp from S3 to HDFS. I got a lot of INFO-level errors, although the job finished successfully. Any ideas? Can I simply suppress INFOs in log4j and forget about the error?

08/11/17 07:43:09 INFO fs.FSInputChecker: java.io.IOException: Checksum ok was sent and should not be sent again
at org.apache.hadoop.dfs.DFSClient$BlockReader.read(DFSClient.java:863)
at org.apache.hadoop.dfs.DFSClient$DFSInputStream.readBuffer(DFSClient.java:1392)
at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1428)
at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1377)
at java.io.DataInputStream.readInt(DataInputStream.java:372)
at org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1898)
at org.apache.hadoop.io.SequenceFile$Reader.nextRaw(SequenceFile.java:1961)
at org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:2399)
at org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:2335)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2285)
at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2326)
at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1032)
at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1013)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:618)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:768)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:788)

-- Best Regards, Alexander Aristov
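On the suppression question: one possibility is to raise the log level for just that class in conf/log4j.properties, so the rest of the INFO output is kept (the logger name below is inferred from the fs.FSInputChecker prefix in the message, not confirmed against this exact release):

  log4j.logger.org.apache.hadoop.fs.FSInputChecker=WARN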
network topology script, datanode rejoin
Hi, Recently I've been playing with the network topology script and the scenario where a datanode comes back from the dead with its rack location changed. I did a few experiments:
1) just stop the datanode
2) just stop the datanode and remove its data storage dir
3) decommission the node
4) decommission the node and remove its data storage dir
The time from death to rejoin was about one week. In all cases, the namenode somehow kept using the old rack location for those nodes; I had to restart the namenode to get the rack information right. However, from a rough check of the source, it seems that under some circumstances the namenode will re-query the topology script. Would someone please explain this in more detail? (The HDFS docs on the Hadoop site are not very clear about the network topology part.) Thanks, Rong-En Fan
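For context, rack resolution on the namenode goes through the org.apache.hadoop.net.DNSToSwitchMapping interface, and the script-based default caches its answers, which may be why stale rack locations survive until a restart. A minimal sketch of a custom mapping (the IP-prefix rule and rack names are illustrative assumptions; it would be plugged in via topology.node.switch.mapping.impl):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.net.DNSToSwitchMapping;

  public class StaticRackMapping implements DNSToSwitchMapping {
    // Map each hostname/IP to a rack path; /default-rack is the
    // conventional fallback for unknown nodes.
    public List<String> resolve(List<String> names) {
      List<String> racks = new ArrayList<String>(names.size());
      for (String name : names) {
        if (name.startsWith("10.1.")) {
          racks.add("/rack1");
        } else if (name.startsWith("10.2.")) {
          racks.add("/rack2");
        } else {
          racks.add("/default-rack");
        }
      }
      return racks;
    }
  }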
slow copy makes reduce hang
Hi, I'm using 0.17.2.1 and am seeing a reduce hang in the shuffle phase due to an unresponsive node. From the reduce log (sorry, I didn't keep it around), it was stuck copying map output from a dead node (I cannot even ssh to it). At that point, all maps had already finished. I'm wondering why this slowness does not trigger a fetch failure, mark the corresponding map as failed (even though it finished), and re-run the map task on another node so that the reduce can proceed. Thanks, Rong-En Fan
Re: slow copy makes reduce hang
Replying to myself: I'm using streaming, and the task timeout was set to 0, so that's why. On Fri, Sep 19, 2008 at 3:34 AM, Rong-en Fan [EMAIL PROTECTED] wrote: ...
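For reference, a sketch of what re-enabling the timeout looks like on a streaming command line (the jar path and the 10-minute value are illustrative; mapred.task.timeout is in milliseconds):

  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
    -input in -output out \
    -mapper mapper.pl -reducer reducer.pl \
    -jobconf mapred.task.timeout=600000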
Re: slow copy makes reduce hang
This time I set the task timeout to 10 minutes via -jobconf mapred.task.timeout=600000 (the value is in milliseconds). However, I still see the hang at the shuffle stage, and lots of messages like the following appear in the log:
2008-09-19 12:34:02,289 INFO org.apache.hadoop.mapred.ReduceTask: task_200809190308_0007_r_01_1 Need 6 map output(s)
2008-09-19 12:34:02,290 INFO org.apache.hadoop.mapred.ReduceTask: task_200809190308_0007_r_01_1: Got 0 new map-outputs 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
2008-09-19 12:34:02,290 INFO org.apache.hadoop.mapred.ReduceTask: task_200809190308_0007_r_01_1 Got 6 known map output location(s); scheduling...
2008-09-19 12:34:02,290 INFO org.apache.hadoop.mapred.ReduceTask: task_200809190308_0007_r_01_1 Scheduled 0 of 6 known outputs (6 slow hosts and 0 dup hosts)
When fetching map output from one misbehaving node (it has a dead disk), the HTTP daemon returns 500 Internal Server Error. It seems to me that the reducer is stuck in an infinite retry loop... I'm wondering whether this behavior is fixed in 0.18.x, or whether there are configuration parameters I should tune? Thanks, Rong-En Fan On Fri, Sep 19, 2008 at 9:42 AM, Rong-en Fan [EMAIL PROTECTED] wrote: ...
Re: [Streaming] How to pass arguments to a map/reduce script
On Thu, Aug 21, 2008 at 3:14 PM, Gopal Gandhi [EMAIL PROTECTED] wrote: I am using Hadoop streaming and I need to pass arguments to my map/reduce script. A map/reduce script is launched by Hadoop, like hadoop -file MAPPER -mapper $MAPPER -file REDUCER -reducer $REDUCER ... How can I pass arguments to MAPPER? I tried -cmdenv name=val, but it does not work. Can anybody help me? Thanks a lot. I use -jobconf, for example: hadoop ... -jobconf my.mapper.arg1=foobar In the map script, I read the value from the environment variable my_mapper_arg1 (dots in the property name become underscores). Hope this helps, Rong-En Fan
Re: [Streaming] How to pass arguments to a map/reduce script
On Fri, Aug 22, 2008 at 12:51 AM, Steve Gao [EMAIL PROTECTED] wrote: That's interesting. Suppose your mapper script is a Perl script; how do you assign my.mapper.arg1's value to a variable $x? I tried $x = $my.mapper.arg1, but my Perl script does not recognize $my.mapper.arg1. Use $ENV{my_mapper_arg1}, e.g. my $x = $ENV{my_mapper_arg1}; --- On Thu, 8/21/08, Rong-en Fan [EMAIL PROTECTED] wrote: ...
access jobconf in streaming job
I'm using streaming with a mapper written in Perl, and I want to pass it some arguments via the command line. In a regular Java mapper, I can access the JobConf inside the Mapper. Is there a way to do the same in streaming? Thanks, Rong-En Fan
Re: access jobconf in streaming job
After looking into the streaming source: the answer is via environment variables. For example, mapred.task.timeout shows up in the mapred_task_timeout environment variable. On Fri, Aug 8, 2008 at 4:26 PM, Rong-en Fan [EMAIL PROTECTED] wrote: ...
different dfs block size
Hi, I'm wondering how dfs.block.size affects NameNode memory consumption for a fixed set of data. I know it is determined by the number of blocks and the number of replicas, but how much memory does one block use in the NameNode? In addition, what are the pros and cons of bigger versus smaller block sizes? Thanks, Rong-En Fan
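As a back-of-the-envelope illustration (the per-block cost is a commonly cited rule of thumb, not an exact figure): 1 TB of data in 64 MB blocks is 16,384 blocks, versus 8,192 blocks at 128 MB. If each block object takes on the order of 150 bytes of NameNode heap, plus a small reference per replica, then doubling the block size roughly halves that slice of NameNode memory. The usual trade-off is that larger blocks mean less NameNode metadata and fewer, longer map tasks, while smaller blocks give finer-grained parallelism at the cost of more metadata and more per-task overhead.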
Re: How to set up rack awareness?
On Thu, Apr 17, 2008 at 2:41 AM, Nate Carlson [EMAIL PROTECTED] wrote: I'm setting up a Hadoop cluster across two data centers (with gigabit bandwidth between them). I'd like to use the rack awareness features to help Hadoop know which nodes are local. I see that it's possible, but haven't found any guides on how to set it up. If anyone's got a quick primer, I'd appreciate it! -nc I think you have to configure a network topology script in hadoop-site.xml. This script tells Hadoop which rack a host belongs to. Regards, Rong-En Fan
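A sketch of the relevant hadoop-site.xml entry (the script path is an illustrative assumption; topology.script.file.name is the standard property):

  <property>
    <name>topology.script.file.name</name>
    <value>/etc/hadoop/topology.sh</value>
  </property>

The script is handed one or more hostnames or IP addresses as arguments and should print one rack path per argument, e.g. /dc1/rack1; nodes it cannot classify conventionally map to /default-rack.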
MapFile and MapFileOutputFormat
Hi, I have two questions about MapFiles in Hadoop/HDFS. First, when using MapFileOutputFormat as the reducer's output format, is there any way to change the index interval (i.e., to call setIndexInterval() on the output MapFile.Writer)? Second, is it possible to tell the position in the data file for a given key, assuming the index interval is 1 and the number of keys is small? Thanks, Rong-En Fan
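On the first question, one possible workaround is a custom output format that builds the MapFile.Writer itself so it can call setIndexInterval(). A sketch against the 0.19-era mapred API (the my.index.interval property name is made up for illustration):

  import java.io.IOException;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.MapFile;
  import org.apache.hadoop.io.SequenceFile;
  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.io.WritableComparable;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.RecordWriter;
  import org.apache.hadoop.mapred.Reporter;
  import org.apache.hadoop.util.Progressable;

  public class IndexedMapFileOutputFormat
      extends FileOutputFormat<WritableComparable, Writable> {
    public RecordWriter<WritableComparable, Writable> getRecordWriter(
        FileSystem ignored, JobConf job, String name, Progressable progress)
        throws IOException {
      Path file = FileOutputFormat.getTaskOutputPath(job, name);
      FileSystem fs = file.getFileSystem(job);
      final MapFile.Writer out = new MapFile.Writer(
          job, fs, file.toString(),
          job.getOutputKeyClass().asSubclass(WritableComparable.class),
          job.getOutputValueClass().asSubclass(Writable.class),
          SequenceFile.getCompressionType(job), progress);
      // Must be set before the first append().
      out.setIndexInterval(job.getInt("my.index.interval", 128));
      return new RecordWriter<WritableComparable, Writable>() {
        public void write(WritableComparable key, Writable value)
            throws IOException {
          out.append(key, value);
        }
        public void close(Reporter reporter) throws IOException {
          out.close();
        }
      };
    }
  }

On the second question: the MapFile's index file is itself a SequenceFile of key to LongWritable position entries, so with an index interval of 1 you should be able to read the data-file position of every key directly from the index.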