Re: about CombineFileInputFormat

2010-05-04 Thread Amareshwari Sri Ramadasu
See patch on https://issues.apache.org/jira/browse/MAPREDUCE-364 as an example. -Amareshwari On 5/5/10 1:52 AM, "Zhenyu Zhong" wrote: Hi, I tried to use CombineFileInputFormat in 0.20.2. It seems I need to extend it because it is an abstract class. However, I need to implement getRecordReader

Re: new to hadoop

2010-05-04 Thread Tamas Jambor
great. thank you. I'll set it up that way. Tom On 05/05/2010 00:37, Ravi Phulari wrote: How much RAM ? With 6-8GB RAM you can go for 4 mappers and 2 reducers (this is my personal guess). - Ravi On 5/4/10 4:33 PM, "Tamas Jambor" wrote: thank you. so what would be the optimal setting fo

Re: new to hadoop

2010-05-04 Thread Tamas Jambor
thank you. so what would be the optimal setting for mapred.map.tasks and mapred.reduce.tasks, say, on a dual-core machine? Tom On 05/05/2010 00:12, Ravi Phulari wrote: You can configure (conf/hadoop-env.sh) configuration files on each node to specify -Xmx values. You can use conf/mapred-site.

Re: new to hadoop

2010-05-04 Thread Ravi Phulari
How much RAM ? With 6-8GB RAM you can go for 4 mappers and 2 reducers (this is my personal guess). - Ravi On 5/4/10 4:33 PM, "Tamas Jambor" wrote: thank you. so what would be the optimal setting for mapred.map.tasks and mapred.reduce.tasks, say, on a dual-core machine? Tom On 05/05/2010 00:

Re: new to hadoop

2010-05-04 Thread Ravi Phulari
You can configure the configuration files (conf/hadoop-env.sh) on each node to specify -Xmx values. You can use conf/mapred-site.xml to configure default mappers and reducers running on a node. Property mapred.map.tasks, value 2, description: The default number of map tasks per job. Ignored when mapred.job.tracker is "l
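As a rough illustration of the trade-off discussed in this thread (heap size per task versus the number of tasks a node can run at once), here is a minimal sketch in plain Java. TaskMemoryBudget and maxTaskJvms are hypothetical names, not part of Hadoop, and the 512 MB reserved for the operating system and Hadoop daemons is an assumption. The arithmetic only counts heap, so real JVMs will use somewhat more memory than this estimate.

```java
// Hypothetical helper, not part of Hadoop: estimates how many task JVMs
// of a given heap size fit into a node's RAM.
final class TaskMemoryBudget {

    /**
     * Returns how many child JVMs with a heap of heapMb fit into nodeRamMb
     * after setting aside reservedMb for the OS and daemons.
     */
    static int maxTaskJvms(int nodeRamMb, int reservedMb, int heapMb) {
        if (heapMb <= 0) {
            throw new IllegalArgumentException("heapMb must be positive");
        }
        int availableMb = nodeRamMb - reservedMb;
        return availableMb <= 0 ? 0 : availableMb / heapMb;
    }

    public static void main(String[] args) {
        int reservedMb = 512; // assumption: memory kept free for OS and daemons
        int heapMb = 512;     // the -Xmx512m value mentioned in the thread
        for (int ramMb : new int[] {1024, 2048, 8192}) {
            System.out.println(ramMb + " MB node: at most "
                    + maxTaskJvms(ramMb, reservedMb, heapMb) + " task JVMs");
        }
    }
}
```

Under these assumptions a 1 GB node has room for a single 512 MB task, which is why mixed clusters usually need per-node settings rather than one value for every machine.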

Accepting contributions for the "Hadoop in Practice" book

2010-05-04 Thread Mark Kerzner
Hi, guys, I am working on this book for Manning, and I need your solutions. If you had a specific problem that you solved with Hadoop, and you can share your solution, even in general terms, I will accept it from you and put it in the book. You will be mentioned as the pe

new to hadoop

2010-05-04 Thread jamborta
Hi, I am trying to set up a small Hadoop cluster with 6 machines. The problem I have now is that if I set the memory allocated to a task low (e.g. -Xmx512m) the application does not run; if I set it higher, some machines in the cluster do not have much memory (1 or 2GB) and when the comput

about CombineFileInputFormat

2010-05-04 Thread Zhenyu Zhong
Hi, I tried to use CombineFileInputFormat in 0.20.2. It seems I need to extend it because it is an abstract class. However, I need to implement getRecordReader method in the extended class. May I ask how to implement this getRecordReader method? I tried to do something like this: public RecordR

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Thank you. Udaya. On Wed, May 5, 2010 at 12:23 AM, Allen Wittenauer wrote: > > On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote: > > > Hi, > > I am given an account on a cluster which uses OpenPBS as the cluster > > management software. The only way I can run a job is by submitting it to > > Ope

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Allen Wittenauer
On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote: > Hi, > I am given an account on a cluster which uses OpenPBS as the cluster > management software. The only way I can run a job is by submitting it to > OpenPBS. How to run mapreduce programs on it? Is there any possible work > around? Take a

RE: Hadoop User Group - May 19th at Yahoo!

2010-05-04 Thread Dekel Tankel
Hi Agenda is available for the upcoming HUG. Hope to see you all there. http://www.meetup.com/hadoop/calendar/13048582/ thanks Dekel Register today for Hadoop Summit 2010 June 29th, Hyatt, Santa Clara, CA http://hadoopsummit2010.eventbrite.com/ Presentation submission deadli

Re: Applying HDFS-630 patch to hadoop-0.20.2 tarball release?

2010-05-04 Thread Joseph Chiu
Thanks! On Tue, May 4, 2010 at 11:14 AM, Owen O'Malley wrote: > On Tue, May 4, 2010 at 10:03 AM, Joseph Chiu wrote: > > Thanks Todd. Where I really need help is to get up to speed on that > > process of recompiling (and re-installing the build outputs) with ant. > > The place to look is in th

Re: Applying HDFS-630 patch to hadoop-0.20.2 tarball release?

2010-05-04 Thread Owen O'Malley
On Tue, May 4, 2010 at 10:03 AM, Joseph Chiu wrote: > Thanks Todd.    Where I really need help is to get up to speed on that > process of recompiling (and re-installing the build outputs) with ant. The place to look is in the wiki: http://wiki.apache.org/hadoop/HowToRelease It walks through the

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Peeyush Bishnoi
Udaya, The following link will help you with HOD on Torque. http://hadoop.apache.org/common/docs/r0.20.0/hod_user_guide.html Thanks, --- Peeyush On Tue, 2010-05-04 at 22:49 +0530, Udaya Lakshmi wrote: > Thank you Craig. My cluster has got Torque. Can you please point me > something which will have

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Thank you Craig. My cluster has got Torque. Can you please point me to something with a detailed explanation of using HOD on Torque? On Tue, May 4, 2010 at 10:17 PM, Craig Macdonald wrote: > HOD supports a PBS environment, namely Torque. Torque is the vastly > improved fork of OpenPBS. Y

Re: Applying HDFS-630 patch to hadoop-0.20.2 tarball release?

2010-05-04 Thread Joseph Chiu
Thanks Todd. Where I really need help is to get up to speed on that process of recompiling (and re-installing the build outputs) with ant. Cheers, Joseph On Tue, May 4, 2010 at 9:48 AM, Todd Lipcon wrote: > Hi Joseph, > > You'll have to apply the patch with patch -p0 < foo.patch and then > r

Re: Applying HDFS-630 patch to hadoop-0.20.2 tarball release?

2010-05-04 Thread Todd Lipcon
Hi Joseph, You'll have to apply the patch with patch -p0 < foo.patch and then recompile using ant. If you want to avoid this you can grab the CDH2 tarball here: http://archive.cloudera.com/cdh/2/ - it includes the HDFS-630 patch. Thanks -Todd On Tue, May 4, 2010 at 9:38 AM, Joseph Chiu wrote:

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Craig Macdonald
HOD supports a PBS environment, namely Torque. Torque is the vastly improved fork of OpenPBS. You may be able to get HOD working on OpenPBS, or better still persuade your cluster admins to upgrade to a more recent version of Torque (e.g. at least 2.1.x) Craig On 22/07/28164 20:59, Udaya Laks

Applying HDFS-630 patch to hadoop-0.20.2 tarball release?

2010-05-04 Thread Joseph Chiu
I am currently testing out a rollout of HBase 0.20.3 on top of Hadoop 0.20.2. The HBase doc recommends HDFS-630 patch be applied. I realize this is a newbieish question, but has anyone done this to the tarball Hadoop-0.20.2 release? Since this is a specific recommendation by the HBase release, I

RE: Need a Jira?

2010-05-04 Thread Michael Segel
> Date: Tue, 4 May 2010 11:03:48 -0400 > Subject: Re: Need a Jira? > From: esam...@cloudera.com > To: common-user@hadoop.apache.org > The reason / problem here is because JobClient is from the old (0.18) > API and thus has no understanding of Configuration. You can initialize > a JobConf from a

Re: Need a Jira?

2010-05-04 Thread Eric Sammer
On Tue, May 4, 2010 at 10:50 AM, Michael Segel wrote: > > Hi, > > Came across something "ugly". > > I'm using the latest Hadoop version in Cloudera's CH2 :Hadoop 0.20.1+169.68 > (At least I think its the latest version in CH2) > > Noticed that when I instantiate a JobClient() passing in a Configur

Re: Need a Jira?

2010-05-04 Thread Todd Lipcon
On Tue, May 4, 2010 at 7:50 AM, Michael Segel wrote: > > Hi, > > Came across something "ugly". > > I'm using the latest Hadoop version in Cloudera's CH2 :Hadoop 0.20.1+169.68 > (At least I think its the latest version in CH2) > > Noticed that when I instantiate a JobClient() passing in a Configura

Re: having a directory as input split

2010-05-04 Thread Sonal Goyal
One way to do this will be: Create a DirectoryInputFormat which accepts the list of directories as inputs and emits each directory path in one split. Your custom RecordReader can then read this split and generate appropriate input for your mapper. Thanks and Regards, Sonal www.meghsoft.com On F
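The idea in this reply (one split per directory, with a reader that expands the directory into records) can be sketched in plain Java without any Hadoop classes. This is only an illustration under stated assumptions: DirectorySplits, splits and readRecords are hypothetical names, the "records" are simply sorted file names, and a real implementation would extend Hadoop's InputFormat and RecordReader types instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a DirectoryInputFormat plus its RecordReader.
final class DirectorySplits {

    /** One "split" per input directory: each split carries only the directory path. */
    static List<Path> splits(List<Path> inputDirs) {
        return new ArrayList<>(inputDirs);
    }

    /** Plays the RecordReader role: turns one directory split into records (sorted file names). */
    static List<String> readRecords(Path dirSplit) {
        try (Stream<Path> entries = Files.list(dirSplit)) {
            return entries.filter(Files::isRegularFile)
                          .map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a temporary directory holding two empty files.
        Path dir = Files.createTempDirectory("split-demo");
        Files.createFile(dir.resolve("b.txt"));
        Files.createFile(dir.resolve("a.txt"));
        for (Path split : splits(Collections.singletonList(dir))) {
            System.out.println(split + " -> " + readRecords(split));
        }
    }
}
```

Keeping the split as just a path means the split itself stays small; the work of listing or reading the directory happens only when the reader runs for that split.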

Need a Jira?

2010-05-04 Thread Michael Segel
Hi, I came across something "ugly". I'm using the latest Hadoop version in Cloudera's CDH2: Hadoop 0.20.1+169.68 (at least I think it's the latest version in CDH2). I noticed that when I instantiate a JobClient, passing in a Configuration object, I have to cast it to the deprecated class (JobConf).

Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Hi, I have been given an account on a cluster which uses OpenPBS as the cluster management software. The only way I can run a job is by submitting it to OpenPBS. How can I run MapReduce programs on it? Is there any possible workaround? Thanks, Udaya.

Re: java.io.FileNotFoundException

2010-05-04 Thread Nick Jones
Carlos, I'm using 0.18 still, but I have the following set on a Cygwin-based machine: hadoop.tmp.dir = /cygdrive/e/hadoop/hadoop-${user.name},/cygdrive/d/hadoop/hadoop-${user.name}; mapred.child.tmp = ./tmp. Nick Jones On 5/3/2010 7:27 PM, Carlos Eduardo Moreira dos Santos wrote: I tried E:\t

Re: Hadoop Cookbook?

2010-05-04 Thread Mark Kerzner
Thank you On Tue, May 4, 2010 at 4:52 AM, Steve Loughran wrote: > Mark Kerzner wrote: > >> Hi, guys, >> >> I think that there is a need for a collection of Hadoop exercises. The >> great >> books out there teach you how to use Hadoop, but the Hadoop Cookbook is >> missing, If people can submit t

Re: Hadoop Cookbook?

2010-05-04 Thread Steve Loughran
Mark Kerzner wrote: Hi, guys, I think that there is a need for a collection of Hadoop exercises. The great books out there teach you how to use Hadoop, but the Hadoop Cookbook is missing. If people can submit their solutions, I can become an editor - or a group of editors can do it - but there a