Re: How to manage a large cluster?

2008-09-11 Thread 叶双明
er... so the idea is: set up a DNS server and use hostnames instead of raw IPs? List all nodes in the slaves file, and keep a copy of that file on both the namenode and the secondary namenode to guard against accidents? Use a real system configuration management package to sync software across all nodes of the cluster? Thanks for all, a
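A minimal sketch of the sync step described above, pushing the conf/ directory to every host listed in the slaves file. The hostnames, paths, and choice of rsync are illustrative assumptions, not details from this thread:

```shell
# Illustrative only: generate a slaves file, then dry-run the rsync that
# would push conf/ to each slave (hostnames and paths are made up).
mkdir -p conf
printf 'slave001\nslave002\nslave003\n' > conf/slaves

while read -r host; do
  [ -z "$host" ] && continue
  # Drop the leading "echo" to actually push the configuration.
  echo rsync -az conf/ "$host:/opt/hadoop/conf/"
done < conf/slaves
```

In practice this loop is exactly what a configuration management package replaces, which is why the suggestion above scales better than hand-rolled scripts.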

Re: Reduce task failed: org.apache.hadoop.fs.FSError: java.io.IOException

2008-09-11 Thread Prasad Pingali
Thanks Arun, yeah, I think it was a disk problem. Redoing the task went through fine. Since disk corruption may be common, doesn't hadoop pick a replicated block? - Prasad. On Thursday 11 September 2008 11:38:53 pm Arun C Murthy wrote: > On Sep 11, 2008, at 9:10 AM, pvvpr wrote: > > Hello, >

namenode multithreaded

2008-09-11 Thread Dmitry Pushkarev
Hi. My namenode runs on an 8-core server with lots of RAM, but it only uses one core (100%). Is it possible to tell the namenode to use all available cores? Thanks.

Re: How to manage a large cluster?

2008-09-11 Thread Alex Loddengaard
My inexperience has been revealed ;). I've taken your comments, James and Allen, and added them to the wiki: Alex On Fri, Sep 12, 2008 at 2:01 AM, James Moore <[EMAIL PROTECTED]>wrote: > On Thu, Sep 11, 2008 at 5:46 AM, Allen Wittenauer <[EMAIL

Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-11 Thread 叶双明
Thanks. It seems this isn't the right way, but I learned a lot from you. 2008/9/12 Pete Wyckoff <[EMAIL PROTECTED]> > > You may want to look at hadoop's proposal for snapshotting, where one can > take a snapshot's metadata and store it in some disaster resilient place(s) > for a rainy day: > > htt

Amazon Node Dead! Help! Urgent!

2008-09-11 Thread Xing
Hi All, I made a big mistake and right now some of the nodes are already dead... I have already terminated all the programs, but those nodes are still dead... Has anyone experienced the same problem? What should I do right now to recover those nodes at minimum cost? Thanks a lot for you h

[ANNOUNCE] Pig Release 0.1.0 available

2008-09-11 Thread Olga Natkovich
Hi, Pig release 0.1.0 is now available. This is the first Pig release from the incubator! For release details and downloads, visit: http://incubator.apache.org/pig/releases.html Thanks to everybody who helped with this release! Olga

Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-11 Thread Pete Wyckoff
You may want to look at hadoop's proposal for snapshotting, where one can take a snapshot's metadata and store it in some disaster resilient place(s) for a rainy day: https://issues.apache.org/jira/browse/HADOOP-3637 On 9/11/08 10:06 AM, "Dhruba Borthakur" <[EMAIL PROTECTED]> wrote: > My op

Re: Reduce task failed: org.apache.hadoop.fs.FSError: java.io.IOException

2008-09-11 Thread Arun C Murthy
On Sep 11, 2008, at 9:10 AM, pvvpr wrote: Hello, Never came across this error before. Upgraded to 0.18.0 this morning and ran a nutch fetch job. Got this exception in both the reduce attempts of a task and they failed. All other reducers seemed to work fine, except for one task. Any idea

Re: How to manage a large cluster?

2008-09-11 Thread James Moore
On Thu, Sep 11, 2008 at 5:46 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote: > On 9/11/08 2:39 AM, "Alex Loddengaard" <[EMAIL PROTECTED]> wrote: >> I've never dealt with a large cluster, though I'd imagine it is managed the >> same way as small clusters: > >Maybe. :) Add me to the "maybe :)" c

Re: Thinking about retriving DFS metadata from datanodes!!!

2008-09-11 Thread Dhruba Borthakur
My opinion is to not store file-namespace related metadata on the datanodes. When a file is renamed, one would have to contact all datanodes to update that metadata. Worse still, if one renames an entire subdirectory, all blocks that belong to all files in the subdirectory have to be updated. Similar

Reduce task failed: org.apache.hadoop.fs.FSError: java.io.IOException

2008-09-11 Thread pvvpr
Hello, Never came across this error before. Upgraded to 0.18.0 this morning and ran a nutch fetch job. Got this exception in both the reduce attempts of a task and they failed. All other reducers seemed to work fine, except for one task. Any ideas what could be the problem? - Prasad Pingali. II

Re: How to manage a large cluster?

2008-09-11 Thread Allen Wittenauer
On 9/11/08 2:39 AM, "Alex Loddengaard" <[EMAIL PROTECTED]> wrote: > I've never dealt with a large cluster, though I'd imagine it is managed the > same way as small clusters: Maybe. :) > -Use hostnames or ips, whichever is more convenient for you Use hostnames. Seriously. Who are you pe

Re: How to manage a large cluster?

2008-09-11 Thread Alex Loddengaard
I've never dealt with a large cluster, though I'd imagine it is managed the same way as small clusters: -Use hostnames or IPs, whichever is more convenient for you -All the slaves need to go into the slaves file -You can update software by using bin/hadoop-daemons.sh. Something like: #bin/hadoop-d
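A hedged sketch of the update cycle hinted at above (stop the daemons, sync the software, start them again), using scripts that ship with Hadoop. The path and hostname are placeholders, and the commands are only echoed into a plan file rather than executed:

```shell
# Dry-run sketch: write out the update steps instead of executing them.
# stop-all.sh and start-all.sh are stock Hadoop scripts; /opt/hadoop
# and slaveNNN are illustrative placeholders, not from this thread.
HADOOP_HOME=/opt/hadoop
{
  echo "$HADOOP_HOME/bin/stop-all.sh"                    # stop HDFS + MapReduce
  echo "rsync -az $HADOOP_HOME/ slaveNNN:$HADOOP_HOME/"  # repeat for each slave
  echo "$HADOOP_HOME/bin/start-all.sh"                   # bring the cluster back
} > update-plan.txt
cat update-plan.txt
```

The ordering matters: daemons should be down before the software underneath them is replaced.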

How to manage a large cluster?

2008-09-11 Thread 叶双明
Hi, all! How do you manage a large cluster, eg. more than 2000 nodes? How to config hostnames and IPs, use DNS? How to config slaves, all in the slaves file? How to update software on all nodes? Any practices, articles, or suggestions are appreciated! Thanks. -- Sorry for my english!! 明 Please help me to correc

Re: Issue in reduce phase with SortedMapWritable and custom Writables as values

2008-09-11 Thread Shengkai Zhu
AFAIK, the tasktracker will load your job archive automatically while running the map/reduce task. On Tue, Sep 9, 2008 at 10:28 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote: > Based on some similar problems that I found others were having in the > mailing lists, it looks like the solution was to list

Re: hadoop hanging (probably misconfiguration) assistance

2008-09-11 Thread Amar Kamat
Shengkai Zhu wrote: The logs will probably tell you what happened. On Thu, Sep 11, 2008 at 3:20 PM, <[EMAIL PROTECTED]> wrote: Hi All, I have been trying to move from pseudo distributed hadoop cluster which worked perfectly well, to a real hadoop cluster. I was able to execute the wordcount example

Re: hadoop hanging (probably misconfiguration) assistance

2008-09-11 Thread Shengkai Zhu
The logs will probably tell you what happened. On Thu, Sep 11, 2008 at 3:20 PM, <[EMAIL PROTECTED]> wrote: > Hi All, > I have been trying to move from pseudo distributed hadoop cluster which > worked perfectly well, to a real hadoop cluster. I was able to execute > the wordcount example on my pseudo clus

hadoop hanging (probably misconfiguration) assistance

2008-09-11 Thread damien . cooke
Hi All, I have been trying to move from a pseudo-distributed hadoop cluster, which worked perfectly well, to a real hadoop cluster. I was able to execute the wordcount example on my pseudo cluster, but my real cluster hangs at this point: # bin/hadoop jar hadoop*jar wordcount /myinput /myoutput 08
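Following the "check the logs" advice earlier in this thread, a minimal sketch of where to start looking on each node. The directory and file below are stand-ins for the default $HADOOP_HOME/logs layout, created only so the scan has something to run against; nothing here is taken from the poster's setup:

```shell
# Sketch: scan all daemon logs for errors when a job hangs. The directory
# and file below are stand-ins for $HADOOP_HOME/logs on a real node.
mkdir -p logs
: > logs/hadoop-user-tasktracker-host.log   # placeholder daemon log

# List any log that mentions an exception or a refused connection;
# on a hung cluster these usually point at the misconfigured daemon.
grep -ril -e 'Exception' -e 'Connection refused' logs \
  || echo "no errors found in logs/"
```

For a hang like the one above, the tasktracker and jobtracker logs are the usual suspects: a tasktracker that cannot reach the jobtracker (or the namenode) will leave the job sitting at 0% with no error on the console.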