Desperate!!!! Expanding, shrinking cluster or replacing failed nodes.

2011-12-20 Thread Sloot, Hans-Peter
Hello all, I asked this question a couple of days ago but no one responded. I built a 6-node Hadoop cluster, guided by Michael Noll's tutorial, starting with a single node and expanding it one node at a time. Every time I expanded the cluster I ran into the error: java.io.IOException: Incompatible namespaceIDs

Re: Desperate!!!! Expanding, shrinking cluster or replacing failed nodes.

2011-12-20 Thread Dejan Menges
Hi, here is quick info in section 1.5 of http://wiki.apache.org/hadoop/FAQ. So just briefly: when you add a new node and you are sure the configuration on that one is fine, before you start anything you need to issue hadoop dfsadmin -refreshNodes, after which you need to start the datanode/MR services. Hope
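A minimal sketch of that sequence, assuming a 0.20-era CLI and that the new host is already listed in the NameNode's dfs.hosts include file (the ordering and the fact that the include file exists are assumptions, not quoted from the thread):

    hadoop dfsadmin -refreshNodes        # make the NameNode re-read its include/exclude lists
    hadoop-daemon.sh start datanode      # run on the new node: bring up HDFS storage
    hadoop-daemon.sh start tasktracker   # run on the new node: bring up the MapReduce service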

RE: Desperate!!!! Expanding, shrinking cluster or replacing failed nodes.

2011-12-20 Thread Sloot, Hans-Peter
But I ran into the java.io.IOException: Incompatible namespaceIDs error every time. Should I configure the files dfs/data/current/VERSION, dfs/name/current/VERSION and conf/*site.xml based on the other existing nodes? -Original Message- From: Harsh J [mailto:ha...@cloudera.com]
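For reference, a DataNode's dfs/data/current/VERSION file looks roughly like the sketch below (values are illustrative, not from the thread). The "Incompatible namespaceIDs" error means the DataNode's namespaceID no longer matches the NameNode's; the usual workarounds from Michael Noll's tutorial are either wiping the DataNode's data directory or editing this one field to match dfs/name/current/VERSION on the NameNode:

    # dfs/data/current/VERSION (illustrative values only)
    namespaceID=1234567890          # must match the NameNode's namespaceID
    storageID=DS-1234567890-192.168.1.11-50010-1324400000000
    cTime=0
    storageType=DATA_NODE
    layoutVersion=-18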

Hadoop and Ubuntu / Java

2011-12-20 Thread hadoopman
http://www.omgubuntu.co.uk/2011/12/java-to-be-removed-from-ubuntu-uninstalled-from-user-machines/ I'm curious what this will mean for Hadoop on Ubuntu systems going forward. I tried OpenJDK with Hadoop nearly two years ago. Needless to say, it was a real problem. Hopefully we can still

Configure hadoop scheduler

2011-12-20 Thread Merto Mertek
Hi, I am having problems changing the default Hadoop scheduler (I assume that the default scheduler is the FIFO scheduler). I am following the guide located in the hadoop/docs directory; however, I am not able to run it. The link for scheduling administration returns an HTTP error 404 (

Re: Configure hadoop scheduler

2011-12-20 Thread Matei Zaharia
Are you trying to use the capacity scheduler or the fair scheduler? Your mapred-site.xml says to use the capacity scheduler but then points to a fair scheduler allocation file. Take a look at http://hadoop.apache.org/common/docs/r0.20.204.0/fair_scheduler.html for setting up the fair scheduler

Re: Configure hadoop scheduler

2011-12-20 Thread Prashant Kommireddi
I am guessing you are trying to use the FairScheduler but you have specified CapacityScheduler in your configuration. You need to change mapreduce.jobtracker.scheduler to FairScheduler. Sent from my iPhone On Dec 20, 2011, at 8:51 AM, Merto Mertek masmer...@gmail.com wrote: Hi, I am having
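A minimal mapred-site.xml sketch for enabling the fair scheduler on a 0.20-era cluster (the property names have changed across versions, and the allocation-file path is only an example):

    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>
    <property>
      <name>mapred.fairscheduler.allocation.file</name>
      <value>/etc/hadoop/conf/fair-scheduler.xml</value>
    </property>

The JobTracker has to be restarted after the change for the new scheduler to take effect.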

Custom Writables in MapWritable

2011-12-20 Thread Kyle Renfro
Hadoop 0.22.0-RC0 I have the following reducer: public static class MergeRecords extends Reducer<Text, MapWritable, Text, MapWritable> The MapWritables that are handled by the reducer all have Text 'keys' and contain different 'value' classes including Text, DoubleWritable, and a custom Writable
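A self-contained sketch of such a reducer (not the poster's code; the merge logic is an assumption). One thing worth checking: any custom Writable stored as a MapWritable value needs a public no-argument constructor so it can be re-instantiated during deserialization.

    import java.io.IOException;
    import org.apache.hadoop.io.MapWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MergeRecords extends Reducer<Text, MapWritable, Text, MapWritable> {
      @Override
      protected void reduce(Text key, Iterable<MapWritable> values, Context context)
          throws IOException, InterruptedException {
        MapWritable merged = new MapWritable();
        for (MapWritable record : values) {
          merged.putAll(record);   // later records overwrite earlier ones for the same map key
        }
        context.write(key, merged);
      }
    }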

Re: Desperate!!!! Expanding, shrinking cluster or replacing failed nodes.

2011-12-20 Thread Merto Mertek
I followed the same tutorial as you. If I am not wrong, the problem arises because you first tried to run the node as a single node and then joined it to the cluster (like Arpit mentioned). After testing that the new node works OK, try to delete the content of the directory /app/hadoop/tmp/ and insert a new
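A sketch of that cleanup, assuming Michael Noll's default paths (they come from his tutorial, not from Hadoop itself). Run it on the new node only, since it discards that node's local HDFS blocks:

    rm -rf /app/hadoop/tmp/*            # wipe the node's stale DataNode state (old namespaceID)
    hadoop-daemon.sh start datanode     # on restart it registers with the cluster's namespaceID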

Re: collecting CPU, mem, iops of hadoop jobs

2011-12-20 Thread He Chen
You may need Ganglia. It is cluster monitoring software. On Tue, Dec 20, 2011 at 2:44 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Hi Hadoopers, We're running Hadoop 0.20 on CentOS 5.5. I am looking for a way to collect the CPU time, memory usage, and IOPS of each Hadoop job. What would be

Re: collecting CPU, mem, iops of hadoop jobs

2011-12-20 Thread Patai Sangbutsarakum
Thanks for the reply, but I don't think the metrics exposed to Ganglia are what I am really looking for. What I am looking for is something like this (but not limited to): Job__ CPU time: 10204 sec. -- aggregated from all tasknodes; IOPS: 2344 -- aggregated from all datanodes; MEM: 30G --

Re: collecting CPU, mem, iops of hadoop jobs

2011-12-20 Thread Arun C Murthy
Take a look at the JobHistory files produced for each job. With 0.20.205 you get CPU (slot millis). With 0.23 (alpha quality) you get CPU and JVM metrics (GC etc.). I believe you also get Memory, but not IOPS. Arun On Dec 20, 2011, at 1:11 PM, Patai Sangbutsarakum wrote: Thanks for reply,
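A quick way to pull those counters out of the history files with the 0.20-era CLI (the output path here is just an example):

    hadoop job -history /user/someone/job-output        # per-job counters, incl. slot millis
    hadoop job -history all /user/someone/job-output    # adds per-task details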

Release 1.0.0 RC3 - Ant Build Fails with JSP-Compile Error

2011-12-20 Thread Royston Sellman
Hi, we have just checked out the latest version of the Hadoop source from http://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.0-rc3 and have attempted to build it using the Ant build.xml script. However, we are getting errors relating to the jsp-compile task. We are getting the

Re: collecting CPU, mem, iops of hadoop jobs

2011-12-20 Thread Patai Sangbutsarakum
Thanks again Arun, you saved me again. :-) This is a great starting point for CPU and possibly memory. For the IOPS, I would just like to ask whether the tasknode/datanode collects the number or whether we should dig into the OS level, like /proc/PID_OF_tt/io (hope this makes sense). -P On Tue, Dec 20, 2011 at 1:22
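Since the 0.20 daemons don't publish IOPS themselves, an OS-level sketch along those lines might look like this (the PID lookup is illustrative, and reading /proc/<pid>/io needs sufficient privileges):

    TT_PID=$(pgrep -f org.apache.hadoop.mapred.TaskTracker)   # TaskTracker process on this node
    cat /proc/$TT_PID/io                                       # read_bytes, write_bytes, syscr, syscw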

Re: collecting CPU, mem, iops of hadoop jobs

2011-12-20 Thread Arun C Murthy
Go ahead and open an MR JIRA (would appreciate a patch too! ;) ). Thanks, Arun On Dec 20, 2011, at 2:55 PM, Patai Sangbutsarakum wrote: Thanks again Arun, you saved me again. :-) This is a great starting point for CPU and possibly memory. For the IOPS, I would just like to ask if the

network configuration (etc/hosts) ?

2011-12-20 Thread MirrorX
Dear all, I have been trying for many days to get a simple Hadoop cluster (with 2 nodes) to work, but I have trouble configuring the network parameters. I have properly configured the SSH keys, and the /etc/hosts files are: master - 127.0.0.1 localhost6.localdomain6 localhost 127.0.1.1
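For comparison, a layout that usually works is sketched below (hostnames and addresses are examples). The key point is to map each node's hostname to its real LAN address on every node, and not to 127.0.1.1, which otherwise makes the daemons bind to loopback:

    127.0.0.1     localhost
    192.168.1.10  master
    192.168.1.11  slave1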

Re: How to create Output files of about fixed size

2011-12-20 Thread Mapred Learn
Hi Shevek/others, I tried this. The first job created about 78 files of 15 MB each. I tried a second map-only job with IdentityMapper and -Dmapred.min.split.size=1073741824, but it did not make the output files 1 GB each; I got the same output as above, i.e. 78 files of 15 MB each. Is there a way
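One thing to keep in mind: input splits never span files, so a map-only job cannot reduce the file count below the number of input files regardless of mapred.min.split.size. One alternative (an assumption, not from the thread) is to add an identity reduce pass and size the outputs with the reducer count, e.g. with streaming (the jar path varies by distribution):

    # ~78 x 15 MB ~= 1.2 GB total, so a single reducer yields roughly one 1 GB-class file
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
        -Dmapred.reduce.tasks=1 \
        -input /data/small-files -output /data/merged \
        -mapper cat -reducer cat

Note that the reduce pass re-sorts the lines by their leading field, so it only suits data where record order does not matter.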

Fastest HDFS loader

2011-12-20 Thread Edmon Begoli
Hi, we are going to be loading a 4-5 GB delimited text file from a RHEL file system into HDFS, to be managed as an external table by Hive. What is the recommended, fastest loading mechanism? Thank you, Edmon

Re: Fastest HDFS loader

2011-12-20 Thread Konstantin Boudnik
Do you have some strict performance requirement or something? Because 5 GB is pretty much nothing, really. I'd say copyFromLocal will do just fine. Cos On Tue, Dec 20, 2011 at 10:32 PM, Edmon Begoli wrote: Hi, we are going to be loading a 4-5 GB delimited text file from a RHEL file system into
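For example (the paths are illustrative), copying straight into the directory that backs the Hive external table:

    hadoop fs -copyFromLocal /data/export.txt /user/hive/warehouse/external/mytable/
    # hadoop fs -put works the same way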