Re: Secondary NameNodes or NFS exports?

2009-12-04 Thread Jason Venner
I have dug into this more; it turns out the problem is unrelated to NFS or Solaris. The issue is that if there is a metadata change while the secondary is rebuilding the fsimage, the rebuilt image is rejected. On our production cluster, there is almost never a moment where there is not a file bei…

Re: Namenode crashes while rolling edit log from secondary namenode

2009-12-04 Thread Eli Collins
Hey Zhang, Thanks for the info. Can you be more specific, e.g. by NN crash do you mean an NPE? Relevant logs would be helpful. Thanks, Eli On Fri, Dec 4, 2009 at 2:59 PM, Zhang, Zhang wrote: > > Eli, > > Thanks for responding to my question. I guess what you're mentioning are > dfs.name.edits.dir an…

return in map

2009-12-04 Thread Gang Luo
Hi all, I have a tricky problem. I input a small file manually to do some filtering work on each line in the map function. I check whether the line satisfies the constraint; if it does I output it, otherwise I return without doing any of the work below. Since the map function will be called on each line, I think th…
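What Gang describes is ordinary filter semantics: a map call that returns without emitting simply produces no output for that record, and the framework moves on to the next line. A minimal sketch of that control flow (pure simulation; the names here are illustrative, not the Hadoop API):

```python
# map_record() plays the role of Hadoop's map(): it is called once per input
# line; returning early just means "emit nothing for this record".
def map_record(line, collector):
    if "ERROR" not in line:       # constraint not satisfied: emit nothing
        return                    # the framework simply moves on to the next line
    collector.append((line, 1))   # constraint satisfied: emit a key/value pair

lines = ["ERROR disk full", "INFO all good", "ERROR timeout"]
output = []
for line in lines:                # the framework's per-record loop
    map_record(line, output)
```

Only the records that pass the predicate appear in `output`; the early return costs nothing beyond skipping that one record.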

Re: Problem while writing a file from Reducer

2009-12-04 Thread Ted Xu
Hello Parth, I once met a similar problem while using a user-defined RecordWriter in a reducer. That time I had forgotten to close the RecordWriter in the close phase of the reducer. 2009/12/5 Parth J. Brahmbhatt > Hello All, > > I am creating 2 files in the constructor of my reducer and I store the file > Ha…
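The symptom Ted describes usually comes down to buffering: bytes written through a file handle sit in a user-space buffer until the handle is flushed or closed, so a small file can stay empty on disk while a large one (whose writes exceeded the buffer repeatedly) has most of its data. A quick self-contained demonstration, assuming the usual block-buffered default:

```python
import os
import tempfile

# Reducer-style pattern: open the handle up front, keep it as a member,
# write a little data, and observe that nothing reaches disk until close().
path = os.path.join(tempfile.mkdtemp(), "small.txt")
f = open(path, "w")               # handle kept open across many reduce() calls
f.write("a few bytes\n")          # well under the default buffer size
size_before_close = os.path.getsize(path)   # typically 0: data still buffered
f.close()                         # flushes the buffer; data is now on disk
size_after_close = os.path.getsize(path)
```

Closing (or at least flushing) the handles in the reducer's close phase is the fix, which matches Ted's diagnosis.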

Re: unable to write in hbase using mapreduce hadoop 0.20 and hbase 0.20

2009-12-04 Thread Amandeep Khurana
Try outputting the data you are parsing from the XML to stdout. Maybe it's not getting any data at all? One more thing you can try is not using vectors, to see whether the individual Puts are getting committed or not. Use sysouts to see what's happening in the program. The code seems correct. On Fri, D…

Problem while writing a file from Reducer

2009-12-04 Thread Parth J. Brahmbhatt
Hello All, I am creating 2 files in the constructor of my reducer and I store the file handles as member variables. I write some data to these files on each call to the reduce method. For some reason the files are created, but only the larger file has data (the larger file is 400 MB). The smaller file…

Re: Combiner phase question

2009-12-04 Thread Owen O'Malley
The combiner runs when the map task is spilling the intermediate output to disk. So the flow looks like this: in the map task, map writes into a buffer; when the buffer is "full", do a quick sort, combine, and write to disk; then merge sort the partial outputs from disk, combine, and write to disk. In the reduce task, fetch output from m…
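Owen's flow can be simulated outside Hadoop: because the combiner applies the same logic as the reducer and may run once per spill and again while merging spills, applying it repeatedly must not change the final result. A small word-count sketch (pure simulation, not the Hadoop API):

```python
from collections import Counter
from itertools import chain

def combine(pairs):
    """Word-count combiner: sums values per key (same logic as the reducer)."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

map_output = [("a", 1), ("b", 1), ("a", 1), ("a", 1), ("b", 1)]
# The buffer filled twice, so the combiner ran once per spill...
spills = [combine(map_output[:3]), combine(map_output[3:])]
# ...and once more while merging the spills. Running it again is harmless.
merged = combine(chain.from_iterable(spills))
```

This is also why the number of combiner runs is not fixed: it depends on how often the map task spills, which answers Raymond's "how many times" question elsewhere in the thread.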

unable to write in hbase using mapreduce hadoop 0.20 and hbase 0.20

2009-12-04 Thread Vipul Sharma
Hi all, I am developing an application to populate an HBase table with some data that I get from parsing XML files. I have a mapreduce job using the new Hadoop 0.20 API and I am using HBase 0.20.2. Here is my mapreduce job: public class MsgEventCollector { private static Logger logge…

Re: Combiner phase question

2009-12-04 Thread Mike Kendall
From what I understand, the combiner runs when nodes are idle and you're waiting on a few processes that are taking too long... so the cluster tries to optimize by putting those idle nodes to work on optional preprocessing... On Fri, Dec 4, 2009 at 2:02 PM, Raymond Jennings III wrote: > I…

Re: streaming job written in c++

2009-12-04 Thread Chris Dyer
I've written plenty of apps in C++ using the streaming interface -- you just need to read from std::cin and write to std::cout. Keys and values are separated by a tab. Since Pipes doesn't give you access to very much of the Hadoop/MR runtime (e.g., HDFS), and it's not a very idiomatic C++ interfac…

Re: streaming job written in c++

2009-12-04 Thread Allen Wittenauer
Absolutely none. You just read stdin and write stdout as you would in any other language. But since C/C++ has its own interface, it is just more common to use that interface than the one built for 'everything else'. On 12/4/09 3:29 PM, "Upendra Dadi" wrote: > Thank you Allen for your reply. I am…

Re: streaming job written in c++

2009-12-04 Thread Upendra Dadi
Thank you Allen for your reply. I am trying to use MapReduce on Amazon EC2. EC2 doesn't seem to support Pipes through its simple web GUI interface (is it possible to use Pipes via the CLI or API?). What is the problem with using C++ with streaming? Upendra - Original Message - From…

Re: streaming job written in c++

2009-12-04 Thread Allen Wittenauer
For C/C++, you should be using the Pipes interface. On 12/4/09 3:09 PM, "Upendra Dadi" wrote: > Hi, > Can anybody please give an example of a streaming mapper/reducer written in > C++? I don't seem to find even a single example on the web. Thanks. > Upendra

streaming job written in c++

2009-12-04 Thread Upendra Dadi
Hi, Can anybody please give an example of a streaming mapper/reducer written in C++? I don't seem to find even a single example on the web. Thanks. Upendra

RE: Namenode crashes while rolling edit log from secondary namenode

2009-12-04 Thread Zhang, Zhang
Eli, Thanks for responding to my question. I guess what you're mentioning are dfs.name.edits.dir and dfs.namenode.name.dir. They do not contain any NFS-mounted directories. They were all accessible (they can be seen by UNIX shell commands such as ls and du) when the namenode went down. The error messa…

Re: Namenode crashes while rolling edit log from secondary namenode

2009-12-04 Thread Eli Collins
Hey Zhang, > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Fatal Error : All > storage directories are inaccessible. Are the directories specified by dfs.namenode.[name|edits].dir accessible? Perhaps they're NFS mounts that are flaking out? Thanks, Eli

Re: Combiner phase question

2009-12-04 Thread Raymond Jennings III
I still would like to know how many times it will run given how many mappers run. I realize it may never run, but what determines how many times, if any? --- On Fri, 12/4/09, Mike Kendall wrote: > From: Mike Kendall > Subject: Re: Combiner phase question > To: common-user@hadoop.apache.org > Da…

Re: Combiner phase question

2009-12-04 Thread Mike Kendall
Are you sure it can be run in the reduce task? Even if it does, it's still before the reducer is called... so the flow of your data will still be: data -> mapper(s) -> optional combiner(s) -> reducer(s) -> output_data. On Fri, Dec 4, 2009 at 1:42 PM, Owen O'Malley wrote: > On Fri, Dec 4, 2009…

Re: how to run programs present in the test folder

2009-12-04 Thread Eli Collins
Hey Siddu, Use the testcase flag, e.g. ant -Dtestcase=TestHDFSCLI test to run TestHDFSCLI.java. Thanks, Eli On Tue, Dec 1, 2009 at 10:36 AM, Siddu wrote: > Hi all, > > I am interested in exploring the test folder which is present in > src/test/org/apache/hadoop/hdfs/* > > Please ca…

Re: Combiner phase question

2009-12-04 Thread Owen O'Malley
On Fri, Dec 4, 2009 at 12:32 PM, Raymond Jennings III wrote: > Does the combiner run once per data node or once per map task? (Or can it > run multiple times on the same data node after each map task?) Thanks. > The combiner can run 0, 1, or many times on each data value. It can run in both t…

Re: Tasktracker getting blacklisted

2009-12-04 Thread Amandeep Khurana
Seems like the reducer isn't able to read from the mapper node. Do you see anything in the datanode logs? Also, check the namenode logs. Make sure you have DEBUG logging enabled. -Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Fri, Dec 4, 2…

Combiner phase question

2009-12-04 Thread Raymond Jennings III
Does the combiner run once per data node or once per map task? (Or can it run multiple times on the same data node, after each map task?) Thanks.

Tasktracker getting blacklisted

2009-12-04 Thread Madhur Khandelwal
Hi all, I have a 3-node cluster running a Hadoop (0.20.1) job. I am noticing the following exception during the SHUFFLE phase, because of which the tasktracker on one of the nodes is getting blacklisted (after 4 occurrences of the exception). I have the config set to run 8 maps and 8 reduces simultaneo…

ANN: First Munich OpenHUG Meeting

2009-12-04 Thread Lars George
First Munich OpenHUG Meeting We are trying to gauge the interest in a southern-Germany Hadoop User Group meeting. After seeing quite a big interest in the Berlin meetings, a few of us got together and decided to test the waters for another meeting at the other end of the country. We are therefore…

Re: DFSClient write error when DN down

2009-12-04 Thread Edward Capriolo
On Fri, Dec 4, 2009 at 12:01 PM, Arvind Sharma wrote: > Thanks Todd! > > Just wanted another confirmation I guess :-) > > Arvind > > From: Todd Lipcon > To: common-user@hadoop.apache.org > Sent: Fri, December 4, 2009 8:35:56 AM > Subject: Re: DFSClient wr…

Re: DFSClient write error when DN down

2009-12-04 Thread Arvind Sharma
Thanks Todd! Just wanted another confirmation I guess :-) Arvind From: Todd Lipcon To: common-user@hadoop.apache.org Sent: Fri, December 4, 2009 8:35:56 AM Subject: Re: DFSClient write error when DN down Hi Arvind, Looks to me like you've identified the JI…

Re: DFSClient write error when DN down

2009-12-04 Thread Todd Lipcon
Hi Arvind, Looks to me like you've identified the JIRAs that are causing this. Hopefully they will be fixed soon. -Todd On Fri, Dec 4, 2009 at 4:43 AM, Arvind Sharma wrote: > Any suggestions would be welcome :-) > > Arvind > > From: Arvind Sharma >…

Re: DFSClient write error when DN down

2009-12-04 Thread Arvind Sharma
Any suggestions would be welcome :-) Arvind From: Arvind Sharma To: common-user@hadoop.apache.org Sent: Wed, December 2, 2009 8:02:39 AM Subject: DFSClient write error when DN down I have seen similar error logs in the Hadoop JIRA (HADOOP-2691, HDFS-795)…