What's a valid combiner?

2009-09-11 Thread Harish Mallipeddi
Hi, I was looking at how the Combiner gets run inside Hadoop. Since there's no separate Combiner interface, it basically seems to give the user the impression any valid Reducer can be used as a Combiner. There's a little "warning" note added to the docs under the setCombinerClass() method as to wha

Re: building hdfs-fuse

2009-09-11 Thread Anthony Urso
fuse.h should come with the FUSE software, not Hadoop. It should be somewhere like /usr/include/fuse.h on a Linux machine, or possibly /usr/local/include/fuse.h. Did you install FUSE from source? If not, you probably need something like Debian's libfuse-dev package installed by your operating system.

building hdfs-fuse

2009-09-11 Thread Ted Yu
I use the following command line to build fuse: "ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1". My ant version is 1.7.1. I got the following error: [exec] if gcc -DPACKAGE_NAME=\"fuse_dfs\" -DPACKAGE_TARNAME=\"fuse_dfs\" -DPACKAGE_VERSION=\"0.1.0\" -DPACKAGE_STRING=\"fuse_dfs\ 0.1.0\" -DPACKAGE_BU

Re: Hadoop Input Files Directory

2009-09-11 Thread Amandeep Khurana
You can give something like /path/to/directories/*/*/* On Fri, Sep 11, 2009 at 2:10 PM, Boyu Zhang wrote: > Dear All, > I have input directories of depth 3; the actual files are at the deepest > level (something like /data/user/dir_0/file0, /data/user/dir_1/file0, > /data/user/dir_2
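
For reference, the same glob can be handed to the job programmatically; FileInputFormat expands glob patterns at submission time. A sketch using the old org.apache.hadoop.mapred API, with the paths from this thread (the class name is illustrative):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.JobConf;

    public class GlobInput {
        public static void main(String[] args) {
            JobConf conf = new JobConf(GlobInput.class);
            // FileInputFormat expands the glob when the job is submitted,
            // so one pattern covers /data/user/dir_0/file0,
            // /data/user/dir_1/file0, /data/user/dir_2/file0, ...
            FileInputFormat.setInputPaths(conf, new Path("/data/user/dir_*/*"));
            // ... set mapper, reducer, and output path, then submit ...
        }
    }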

Hadoop Input File Directory

2009-09-11 Thread Boyu Zhang
Dear all, I have an input file hierarchy of depth 3, something like /data/user/dir_0/file0, /data/user/dir_1/file0, /data/user/dir_2/file0. I want to run a mapreduce job to process all the files in the deepest levels. One way of doing so is to specify the input path like /data/user/dir_0, /

Hadoop Input Files Directory

2009-09-11 Thread Boyu Zhang
Dear All, I have input directories of depth 3; the actual files are at the deepest level (something like /data/user/dir_0/file0, /data/user/dir_1/file0, /data/user/dir_2/file0), and I want to write a mapreduce job to process these files. One way of doing so is to

Question about mapred.child.java.opts

2009-09-11 Thread Mayuran Yogarajah
Is there any sense in setting mapred.child.java.opts to a high value if all we're using is Hadoop streaming? We've set it to 512MB, but I don't know if it even matters... thanks
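
For what it's worth, mapred.child.java.opts sizes the heap of the child JVM that hosts each task; under streaming the user script runs as a separate process with its own memory, so the JVM setting matters mainly for framework-side buffers. A hedged sketch of where the property lives in a Java driver (the value is just an example):

    import org.apache.hadoop.mapred.JobConf;

    public class ChildOpts {
        public static void main(String[] args) {
            JobConf conf = new JobConf(ChildOpts.class);
            // Heap for the child JVM running each map/reduce task. Under
            // streaming, this JVM mostly pipes records over stdin/stdout to
            // the external script (which has its own process memory), so a
            // large -Xmx mainly pays off if sort buffers (io.sort.mb) are
            // raised to match.
            conf.set("mapred.child.java.opts", "-Xmx512m");
        }
    }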

Re: Decommissioning Individual Disks

2009-09-11 Thread Edward Capriolo
On Fri, Sep 11, 2009 at 12:23 PM, Allen Wittenauer wrote: > On 9/10/09 8:06 PM, "David B. Ritch" wrote: >> Thank you both.  That's what we did today.  It seems fairly reasonable >> when a node has a few disks, say 3-5.  However, at some sites, with >> larger nodes, it seems more awkward. > > Hmm.

RE: s3n intermediate storage problem

2009-09-11 Thread zjffdu
This is interesting. Does that mean Hadoop can use S3 as the DistributedFileSystem and EC2 machines as the compute nodes? If so, how does the namenode communicate with the datanode (S3)? -Original Message- From: Irfan Mohammed [mailto:irfan...@gmail.com] Sent: September 9, 2009 15:03 To:
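
For context, pointing fs.default.name at an s3n:// bucket removes the namenode/datanode layer entirely: every client and task talks to S3 directly over HTTP, so there is no namenode-to-datanode traffic to worry about. A minimal sketch (bucket name and credentials are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3nDefaultFs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // With s3n:// as the default filesystem there is no NameNode or
            // DataNode process at all; reads and writes go straight to S3.
            conf.set("fs.default.name", "s3n://mybucket");
            conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
            conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.exists(new Path("/some/key")));
        }
    }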

Hadoop User Group (Bay Area) - Sep 23rd at Yahoo!

2009-09-11 Thread Dekel Tankel
Hi all, I'd like to remind everyone that RSVP is open for the next monthly Bay Area Hadoop user group organized by Yahoo!. Agenda and registration are available here: http://www.meetup.com/hadoop/calendar/11166700/ Looking forward to seeing you on September 23rd. Dekel

Re: Decommissioning Individual Disks

2009-09-11 Thread Allen Wittenauer
On 9/10/09 8:06 PM, "David B. Ritch" wrote: > Thank you both. That's what we did today. It seems fairly reasonable > when a node has a few disks, say 3-5. However, at some sites, with > larger nodes, it seems more awkward. Hmm. The vast majority of sites are using 4 disk configurations, that

Re: Thrift HDFS interface problems

2009-09-11 Thread Anthony Urso
For the Thrift server bug, the best way to get it fixed is to file a bug report at http://issues.apache.org/jira HBase 0.20 is out, download here: http://hadoop.apache.org/hbase/releases.html There is an HBase mailing list, hbase-u...@hadoop.apache.org. And yes, I believe you do still need to ke

Thrift HDFS interface problems

2009-09-11 Thread Bryn Divey
Hi all, I'm accessing an HDFS filesystem (version 0.19.2) over the Python Thrift API, and I'm noticing quite a few situations in which the Java Thrift server freaks out and breaks the existing connection. For example, listing a non-existent directory, or doing a read with an offset outside of the ac

Re: Using Ganglia with hadoop 0.19.0 on Amazon EC2

2009-09-11 Thread John Clarke
Hadoop comes with a number of scripts to configure EC2 instances; see "HADOOP_HOME/src/contrib/ec2/bin". If you take a look at "src/contrib/ec2/bin/image/hadoop-init" you will see that it sets up Ganglia. I currently use the Cloudera scripts from http://www.cloudera.com/hadoop-ec2; these do not ho

Fwd: a problem in hadoop cluster: reduce task couldn't find map tasks' output.

2009-09-11 Thread qiu tian
--- On Fri, Sep 11, 2009, qiu tian wrote: From: qiu tian Subject: a problem in hadoop cluster: reduce task couldn't find map tasks' output. To: common-user@hadoop.apache.org, common-user-h...@hadoop.apache.org Date: Friday, September 11, 2009, 1:57 PM Hi everyone! I tried a hadoop cluster setup on 4 PCs. I ran into a problem abo