NNbench and MRBench

2011-05-06 Thread stanley.shi
Hi guys, I have a cluster of 16 machines running Hadoop. Now I want to run some benchmarks on this cluster with "nnbench" and "mrbench". I'm new to Hadoop and have no one to ask, so I don't know what results I should expect. For mrbench I currently get an average time of 22 seconds…
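
For anyone else starting out: both benchmarks ship in the Hadoop test jar. A minimal invocation sketch for a 0.20-era install (the jar path and option values here are illustrative, not prescriptive):

    # mrbench: run 50 small MapReduce jobs and report the average runtime
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar mrbench -numRuns 50

    # nnbench: stress the NameNode with many small-file create/write operations
    hadoop jar $HADOOP_HOME/hadoop-*-test.jar nnbench \
      -operation create_write \
      -maps 12 -reduces 6 \
      -numberOfFiles 1000 -bytesToWrite 0 \
      -replicationFactorPerFile 3 \
      -baseDir /benchmarks/NNBench

There is no canonical "right" number for either tool; they are mainly useful for comparing the same cluster before and after a change, so a 22-second mrbench average is best read against your own baseline rather than an absolute target.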

passing classpath through to datanodes?

2011-05-06 Thread Tom Melendez
Hi folks, I'm having trouble getting a custom classpath through to the datanodes in my cluster. I'm using libhdfs and pipes, and the hdfsConnect call in libhdfs requires that the classpath be set. My code executes fine on a standalone machine, but when I take it to the cluster, I can see that the…
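
For reference, hdfsConnect starts an embedded JVM that takes its classpath from the CLASSPATH environment variable, so the variable must be set in the environment of every process that calls libhdfs. A hedged sketch of assembling it on a 0.20-era layout (paths and the client binary name are illustrative):

    # Assemble the Hadoop jars, lib/ dependencies, and conf dir into CLASSPATH
    export HADOOP_HOME=/usr/local/hadoop
    CLASSPATH=$HADOOP_HOME/conf
    for f in $HADOOP_HOME/hadoop-*.jar $HADOOP_HOME/lib/*.jar; do
      CLASSPATH=$CLASSPATH:$f
    done
    export CLASSPATH
    ./my_pipes_client   # hypothetical binary linked against libhdfs

When the code runs as a task on the cluster, the same variable has to reach the task's child process; some releases let you pass environment settings to task JVMs via mapred.child.env (format VAR1=val1,VAR2=val2), but verify against your version before relying on it.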

Re: How do I create per-reducer temporary files?

2011-05-06 Thread Harsh J
Bryan, On Fri, May 6, 2011 at 10:50 PM, Bryan Keller wrote: > I wanted to be able to use the same local directory that the reducer is using so, if there are multiple reducers running, I can take advantage of all of the drives I have configured in mapred.local.dir. If I was unclear before…

How masters and slaves work in a Hadoop cluster

2011-05-06 Thread hadoopfan
Is the master the central controller, and are the slaves the worker nodes? What happens if the master crashes? Thanks
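
Short version: in this release the "master" daemons are the NameNode (HDFS) and the JobTracker (MapReduce), and each slave runs a DataNode and a TaskTracker. Somewhat confusingly, the conf/masters file lists the host where start-dfs.sh launches the SecondaryNameNode, while conf/slaves lists the worker nodes. A sketch of the two files (hostnames are illustrative):

    # conf/masters: where start-dfs.sh starts the SecondaryNameNode
    snn.example.com

    # conf/slaves: where the start scripts launch DataNode + TaskTracker
    slave01.example.com
    slave02.example.com

The NameNode and JobTracker themselves start on whichever machine runs start-dfs.sh / start-mapred.sh. If the NameNode host crashes, HDFS is unavailable until it is restored (for example from the SecondaryNameNode's checkpoint); there is no automatic failover in this release.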

Re: How do I create per-reducer temporary files?

2011-05-06 Thread Bryan Keller
Thanks for the info, Matt. My use case is this: I have a fairly large amount of data being passed to my reducer. I need to load all of the data into a matrix and run a linear program. The most efficient way to do this from the LP side is to write all of the data to a file and pass the file to the…

Re: Cluster hard drive ratios

2011-05-06 Thread Matthew Foley
Ah, so you're suggesting there should be some hysteresis in the system, delaying the response to large-scale events for a while? In particular, are you suggesting that for anticipated events, like "I'm taking this rack offline for 30 minutes, but it will be back with data intact"…

Getting Job Name from CLI

2011-05-06 Thread Quinn Gil
Is there a command that will allow you to retrieve a setting's value for a specific job? I'm looking specifically for the job name (mapred.job.name), but a more general-purpose 'get setting' command would be very nice. I went through the 'hadoop job' options and could get the job id, and…
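
As far as I know there is no built-in "get one setting" command in this release; one workaround is to scrape the job's configuration page from the JobTracker web UI. This sketch assumes the stock web UI on port 50030 and its jobconf.jsp page; hostname and job id are illustrative:

    # Dump the job's configuration as rendered by the JobTracker UI,
    # then crudely pick out mapred.job.name from the HTML table
    curl -s "http://jobtracker.example.com:50030/jobconf.jsp?jobid=job_201105060000_0001" \
      | grep -A1 'mapred.job.name'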

Re: can a `hadoop jar streaming.jar` command return when a job is packaged and submitted?

2011-05-06 Thread Bharath Mundlapudi
One option is changing the code in streaming.jar to not wait for job completion, but then you are on your own to check the job status, failures, etc. of these asynchronous jobs. -Bharath
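
Whichever way the jobs end up submitted asynchronously, something still has to watch them. A crude monitor using only the stock CLI, as a hedged sketch (the grep assumes no unrelated jobs of yours are on the cluster):

    # Block until the cluster reports no running jobs left
    while hadoop job -list | grep -q 'job_'; do
      sleep 30
    done
    echo "all submitted jobs have left the running state"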

Re: How do I configure a Partitioner in the new API?

2011-05-06 Thread W.P. McNeill
Here is a configurable custom partitioner template, along with a discussion of when the configurable interface methods are called: http://cornercases.wordpress.com/2011/05/06/an-example-configurable-partitioner/. On Thu, May 5, 2011 at 9:03 AM, W.P. McNeill wrote: > The other thing you want to…

Re: can a `hadoop jar streaming.jar` command return when a job is packaged and submitted?

2011-05-06 Thread Dieter Plaetinck
That will cause 200 regenerate-files processes running on the same files at the same time. Not good. Dieter On Fri, 6 May 2011 07:49:45 -0700 (PDT) Bharath Mundlapudi wrote: > how about this? for i in $(seq 1 200); do exec_stream_job.sh $dir $i & … exec_stream_job.sh…
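
One way to avoid that while still overlapping the jobs is to keep regenerate-files in the loop's foreground and background only the submission. A hedged sketch (the output path and the cap of 10 are illustrative):

    for i in $(seq 1 200); do
      regenerate-files $dir $i    # serial: only one instance touches the files
      hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
        -D mapred.job.name="$i" \
        -file $dir \
        -mapper "..." -reducer "..." \
        -input $i-input -output $i-output &
      # cap the number of in-flight streaming clients at 10
      while [ "$(jobs -rp | wc -l)" -ge 10 ]; do sleep 10; done
    done
    wait   # let the last backgrounded clients finish

Caveat: the backgrounded client uploads $dir via -file after the loop has moved on, so if regenerate-files rewrites $dir in place, the next iteration can race with that upload; snapshotting $dir per job (not shown) avoids this.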

Re: can a `hadoop jar streaming.jar` command return when a job is packaged and submitted?

2011-05-06 Thread Bharath Mundlapudi
how about this?

    for i in $(seq 1 200); do
      exec_stream_job.sh $dir $i &
    done

    # exec_stream_job.sh:
    regenerate-files $dir $i
    hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
      -D mapred.job.name="$i" \
      -file $dir \
      -mapper "..." -reducer "..." …

can a `hadoop jar streaming.jar` command return when a job is packaged and submitted?

2011-05-06 Thread Dieter Plaetinck
Hi, I have a script something like this (simplified):

    for i in $(seq 1 200); do
      regenerate-files $dir $i
      hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
        -D mapred.job.name="$i" \
        -file $dir \
        -mapper "..." -reducer "..." \
        -input $i-input -output…

Re: Cluster hard drive ratios

2011-05-06 Thread Steve Loughran
On 05/05/11 19:14, Matthew Foley wrote: "a node (or rack) is going down, don't replicate" == DataNode Decommissioning. This feature is available. The current usage is to add the hosts to be decommissioned to the exclusion file named in dfs.hosts.exclude, then use DFSAdmin to invoke "-refreshNo