Hello Mark,
On Wed, Apr 27, 2011 at 12:19 AM, Mark question wrote:
> Hi,
>
> My mapper opens a file and reads records using next(). However, I want to
> stop reading if there is no memory available. What confuses me here is that
> even though I'm reading record by record with next(), Hadoop actually
OK, I will try to answer this question myself.
This is caused by the env variable HADOOP_CLIENT_OPTS being double-quoted
in mapred/org/apache/hadoop/mapred/TaskRunner.java,
which in turn makes the command line of a streaming job look like this:
nohup /dist/JAVA_HOME/bin/java -Dproc_fs -Xmx4000
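A minimal sketch of the quoting problem described above, with hypothetical names rather than the real TaskRunner code: if the whole HADOOP_CLIENT_OPTS value is appended to the child command as a single token, the JVM sees "-Xmx4000m -Dproc_fs" as one unrecognised option, whereas splitting on whitespace gives each flag its own argv slot.

import java.util.ArrayList;
import java.util.List;

public class ChildJvmCommand {
    public static void main(String[] args) {
        String javaHome = System.getProperty("java.home");
        String clientOpts = "-Xmx4000m -Dproc_fs"; // stand-in for HADOOP_CLIENT_OPTS

        // Problematic: the whole opts string becomes ONE argv element,
        // so the child JVM rejects it as a single unrecognised option.
        List<String> broken = new ArrayList<String>();
        broken.add(javaHome + "/bin/java");
        broken.add(clientOpts);

        // Fix: split the opts string so each flag is its own argv element.
        List<String> fixed = new ArrayList<String>();
        fixed.add(javaHome + "/bin/java");
        for (String opt : clientOpts.trim().split("\\s+")) {
            fixed.add(opt);
        }
        fixed.add("-version"); // placeholder so the sketch actually runs

        System.out.println("broken: " + broken);
        System.out.println("fixed:  " + fixed);
        // new ProcessBuilder(fixed).inheritIO().start(); // would launch the child JVM
    }
}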
On Tue, Apr 26, 2011 at 11:30 PM, Michel Segel
wrote:
> Hi,
> Actually, if you have two 4-core Xeon CPUs... you will become I/O bound
> with 4 drives.
> The rule of thumb tends to be 2 disks per core, so you would want 16 drives
> per node... at least in theory.
> 24 1TB drives would be interesting
Hi,
Actually, if you have two 4-core Xeon CPUs... you will become I/O bound with
4 drives.
The rule of thumb tends to be 2 disks per core, so you would want 16 drives per
node... at least in theory.
24 1TB drives would be interesting, but I'm not sure what sort of problems you
could expect to
Hi,
My mapper opens a file and reads records using next(). However, I want to
stop reading if there is no memory available. What confuses me here is that
even though I'm reading record by record with next(), Hadoop actually reads
them in dfs.block.size chunks. So, I have two questions:
1. Is it true
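A minimal sketch of the kind of guard being asked about, assuming the records come through the old-API RecordReader and using an arbitrary 64 MB free-heap threshold (whether HDFS prefetches a full dfs.block.size buffer underneath is a separate question):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.RecordReader;

public class BoundedReader {
    // Stop reading when less than ~64 MB of heap remains (arbitrary threshold).
    private static final long MIN_FREE_BYTES = 64L * 1024 * 1024;

    /** Reads records until the reader is exhausted or the heap runs low. */
    public static long readWhileMemoryAvailable(RecordReader<LongWritable, Text> reader)
            throws IOException {
        LongWritable key = reader.createKey();
        Text value = reader.createValue();
        long count = 0;
        while (freeHeap() > MIN_FREE_BYTES && reader.next(key, value)) {
            // ... buffer or process the record here ...
            count++;
        }
        return count;
    }

    private static long freeHeap() {
        Runtime rt = Runtime.getRuntime();
        // max heap minus the part of the committed heap that is already used
        return rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
    }
}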
Still digging through it all and playing with multiple versions of
Hadoop (both the Apache and Cloudera flavors). Looks like it's a recent
bug in Cloudera from what we're seeing...
Thanks!
On 04/26/2011 07:15 AM, James Seigel wrote:
I know Cloudera has a bug in their version. They should have filed a Jira for it.
Here is the version we're currently running. Pulled from the mapred
admin page:
Version: 0.20.3-CDH3-SNAPSHOT, r${cloudera.hash}
Compiled: Fri Dec 10 13:33:50 PST 2010 by hadoop
Thanks!
On 04/26/2011 06:52 AM, Saurabh bhutyani wrote:
Which version of hadoop are you referring to?
Thanks & Regards,
Hey everyone,
I tried to apply the HADOOP-4667 patch to branch-0.20, but it always failed.
My cluster is based on branch-0.20, and I want to test the performance of the
delay scheduling method without re-formatting HDFS, so I tried to apply
HADOOP-4667 to branch-0.20. Has anyone done this before?
The JVM reuse policy seems to have an effect here. All map JVMs exit soon after
their individual map tasks finish execution if JVM reuse is disabled. However,
when JVM reuse is enabled, there is no code which checks whether all map tasks
assigned to a particular JVM process have finished.
Thanks.
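For reference, JVM reuse in the old API is controlled by mapred.job.reuse.jvm.num.tasks; a minimal sketch of toggling it through JobConf (all other job setup omitted):

import org.apache.hadoop.mapred.JobConf;

public class JvmReuseConfig {
    public static JobConf configure(boolean reuse) {
        JobConf conf = new JobConf();
        // 1  -> a fresh JVM per task (reuse disabled)
        // -1 -> reuse the JVM for an unlimited number of tasks from the same job
        conf.setNumTasksToExecutePerJvm(reuse ? -1 : 1);
        return conf;
    }
}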
On Tue, Apr 26, 2011 at 10:48 PM, Owen O'Malley wrote:
> On Tue, Apr 26, 2011 at 6:46 AM, Xiaobo Gu wrote:
>
>> How can I download the patched version of hadoop, I only know the
>> initial versions of each release from the official download website.
>
>
> The 0.20.204 version is still being tested.
On Tue, Apr 26, 2011 at 6:46 AM, Xiaobo Gu wrote:
> How can I download the patched version of hadoop, I only know the
> initial versions of each release from the official download website.
The 0.20.204 version is still being tested. I'd expect a release next month.
You can look at the sources a
Hi, list,
I have this very old, simple question, which I cannot figure
out in a short time, so I turn to you guys.
OK, in my Perl Hadoop streaming job, I want to access a file in
HDFS. What I did is as follows:
1. fork a subprocess and try to dump the file
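One alternative to forking a subprocess from the streaming script is to read the file through the Java FileSystem API (shown here in Java for comparison; the path is a placeholder and the file is assumed to be line-oriented text):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCat {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();   // picks up core-site.xml etc.
        Path path = new Path(args[0]);              // e.g. /user/me/lookup.txt
        FileSystem fs = path.getFileSystem(conf);
        BufferedReader in =
            new BufferedReader(new InputStreamReader(fs.open(path), "UTF-8"));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            in.close();
        }
    }
}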
Hello Arun,
On 20.04.2011 05:53, Arun Ramakrishnan wrote:
> https://repository.cloudera.com/content/repositories/releases/org/apache/hadoop/hadoop-core/0.20.2-cdh3u0/
>
> http://repo2.maven.org/maven2/org/apache/hadoop/hadoop-core/0.20.2/
>
> at least the cloudera repo seems to have the source
Hi,
People say a balanced server configuration is as follows:
2x 4-core CPUs, 24 GB RAM, 4x 1 TB SATA disks.
But we have been using storage servers with 24x 1 TB SATA disks,
and we are wondering whether Hadoop will be CPU bound if this kind of server
is used. Does anybody have experience with Hadoop ru
How can I download the patched version of hadoop, I only know the
initial versions of each release from the official download website.
On Tue, Apr 26, 2011 at 2:34 AM, Owen O'Malley wrote:
> On Mon, Apr 25, 2011 at 9:17 AM, Mathias Herberts <
> mathias.herbe...@gmail.com> wrote:
>
>> You can conf
Thanks a lot. I have managed to do it.
My final-year project is on power-aware Hadoop. I do realise it's against
ethics to get the code that way.. :)
On Tue, Apr 26, 2011 at 4:24 PM, Steve Loughran wrote:
> On 20/04/11 10:28, real great.. wrote:
>
>> Hi,
>> I had asked a question about predicti
I know Cloudera has a bug in their version. They should have filed a
Jira for it.
Are you getting NPE in the logs?
James
Sent from my mobile. Please excuse the typos.
On 2011-04-26, at 6:53 AM, Saurabh bhutyani wrote:
> Which version of hadoop are you referring to?
>
> Thanks & Regards,
> Sau
Which version of hadoop are you referring to?
Thanks & Regards,
Saurabh Bhutyani
Call : 9820083104
Gtalk: s4saur...@gmail.com
On Tue, Apr 26, 2011 at 5:59 AM, hadoopman wrote:
> Has anyone had problems with the latest version of hadoop and the fair
> scheduler not placing jobs into pools co
On 26/04/11 05:20, Bharath Mundlapudi wrote:
Right, if you have hardware which supports hot-swappable disks, this might be the
easiest option. But you will still need to restart the datanode to detect the new
disk. There is an open Jira on this.
-Bharath
Correction, there is a patch up there now.
On 26/04/11 05:20, Bharath Mundlapudi wrote:
Right, if you have hardware which supports hot-swappable disks, this might be the
easiest option. But you will still need to restart the datanode to detect the new
disk. There is an open Jira on this.
-Bharath
That'll be HDFS-664
https://issues.apache.org/jira/browse/HDFS-664
On 21/04/11 18:33, Geoffry Roberts wrote:
What will give me the most bang for my buck?
- Should I bring all machines up to 8 GB of memory, or is 4 GB good enough?
(8 GB is the max.)
Depends on whether your code is running out of memory.
- Should I double up the NICs and use LACP?
I wo
On 20/04/11 10:28, real great.. wrote:
Hi,
I had asked a question about predicting map times in Hadoop.
Thanks a lot for the encouraging response.
I want to know if anybody has code or any idea on how to calculate the
execution time? I mean a small estimation.
1. Surely this is what your f
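As a rough starting point for the estimate asked about above, a back-of-the-envelope model in which every throughput and per-record number is a placeholder assumption: map time is roughly split size divided by effective read throughput, plus record count times per-record CPU cost.

public class MapTimeEstimate {
    /** Very rough single-map-task estimate; all inputs must be measured per cluster. */
    public static double estimateSeconds(long splitBytes, double readMbPerSec,
                                         long records, double secsPerRecord) {
        double readTime = (splitBytes / (1024.0 * 1024.0)) / readMbPerSec;
        double cpuTime = records * secsPerRecord;
        return readTime + cpuTime;
    }

    public static void main(String[] args) {
        // Example: 64 MB split, 50 MB/s effective read, 500000 records, 10 us each.
        System.out.printf("~%.1f s%n",
            estimateSeconds(64L * 1024 * 1024, 50.0, 500000L, 10e-6));
    }
}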