Hello all,
I asked this question a couple of days ago but no one responded.
I built a 6-node Hadoop cluster following Michael Noll's guide, starting with a
single node and expanding it one node at a time.
Every time I expanded the cluster I ran into the error: java.io.IOException:
Incompatible namespaceIDs.
Hi,
There is quick info in section 1.5 of http://wiki.apache.org/hadoop/FAQ
So just briefly: when you add a new node, and you are sure the configuration on
that one is fine, before you start anything you need to issue hadoop
dfsadmin -refreshNodes, after which you need to start the datanode/MR services.
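For example, something like this (a rough sketch; it assumes a 0.20-style
$HADOOP_HOME/bin layout and that the new host is already listed in your
slaves/include files):

  # on the namenode/jobtracker host
  hadoop dfsadmin -refreshNodes

  # on the new node itself
  $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
  $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker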
Hope this helps.
But I ran into the java.io.IOException: Incompatible namespaceIDs error every
time.
Should I configure the files dfs/data/current/VERSION and
dfs/name/current/VERSION and conf/*site.xml
from the other existing nodes?
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
http://www.omgubuntu.co.uk/2011/12/java-to-be-removed-from-ubuntu-uninstalled-from-user-machines/
I'm curious what this will mean for Hadoop on Ubuntu systems going
forward. I tried OpenJDK with Hadoop nearly two years ago; needless
to say, it was a real problem.
Hopefully we can still
Hi,
I am having problems changing the default Hadoop scheduler (I assume
that the default scheduler is a FIFO scheduler).
I am following the guide located in the hadoop/docs directory, however I am not
able to run it. The link for scheduling administration returns an HTTP 404
error (
Are you trying to use the capacity scheduler or the fair scheduler? Your
mapred-site.xml says to use the capacity scheduler but then points to a fair
scheduler allocation file. Take a look at
http://hadoop.apache.org/common/docs/r0.20.204.0/fair_scheduler.html for
setting up the fair scheduler.
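For the 0.20.x fair scheduler, the relevant mapred-site.xml entries look
roughly like this (the allocation file path below is only an example):

  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>/path/to/conf/fair-scheduler.xml</value>
  </property>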
I am guessing you are trying to use the FairScheduler but you have
specified CapacityScheduler in your configuration. You need to change
mapreduce.jobtracker.scheduler to FairScheduler.
Sent from my iPhone
On Dec 20, 2011, at 8:51 AM, Merto Mertek masmer...@gmail.com wrote:
Hi,
I am having
Hadoop 0.22.0-RC0
I have the following reducer:
public static class MergeRecords extends
    Reducer<Text, MapWritable, Text, MapWritable>
The MapWritables that are handled by the reducer all have Text 'keys'
and contain different 'value' classes including Text, DoubleWritable,
and a custom Writable
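If it helps, here is a minimal sketch of such a reducer that simply merges the
incoming MapWritables per key (shown as a standalone class here; in the
original it is a nested public static class, and the merge logic is only a
guess at the intent):

  import java.io.IOException;
  import org.apache.hadoop.io.MapWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;

  public class MergeRecords
      extends Reducer<Text, MapWritable, Text, MapWritable> {

    @Override
    protected void reduce(Text key, Iterable<MapWritable> values, Context context)
        throws IOException, InterruptedException {
      // MapWritable implements java.util.Map<Writable, Writable>, so the
      // per-key maps can be folded into a single output map.
      MapWritable merged = new MapWritable();
      for (MapWritable m : values) {
        merged.putAll(m);
      }
      context.write(key, merged);
    }
  }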
I followed the same tutorial as you. If I am not wrong, the problem arises
because you first tried to run the node as a single node and then joined it to
the cluster (as Arpit mentioned). After testing that the new node works
OK, try to delete the content of the directory /app/hadoop/tmp/ and insert a new
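Concretely, something along these lines on the newly added datanode (a sketch
only; it assumes hadoop.tmp.dir is /app/hadoop/tmp as in the tutorial, and
deleting the data directory throws away any blocks already stored on that
node):

  # on the new datanode, with its daemons stopped
  rm -rf /app/hadoop/tmp/dfs/data

  # or, instead of deleting, edit the namespaceID line in
  # /app/hadoop/tmp/dfs/data/current/VERSION so that it matches the
  # namespaceID in the namenode's dfs/name/current/VERSION

  $HADOOP_HOME/bin/hadoop-daemon.sh start datanode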
You may need Ganglia. It is cluster monitoring software.
On Tue, Dec 20, 2011 at 2:44 PM, Patai Sangbutsarakum
silvianhad...@gmail.com wrote:
Hi Hadoopers,
We're running Hadoop 0.20 on CentOS 5.5. I am looking for a way to collect the
CPU time, memory usage, and IOPS of each Hadoop job.
What would be
Thanks for the reply, but I don't think the metrics exposed to Ganglia are
what I am really looking for.
What I am looking for is something like this (but not limited to):
Job__
  CPU time: 10204 sec.  -- aggregated from all task nodes
  IOPS: 2344            -- aggregated from all datanodes
  MEM: 30G              --
Take a look at the JobHistory files produced for each job.
With 0.20.205 you get CPU (slot millis).
With 0.23 (alpha quality) you get CPU and JVM metrics (GC etc.). I believe you
also get Memory, but not IOPS.
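For example, with the 0.20.x CLI (the job id and output path below are made
up):

  # counters (including slot-millis) for a finished job, by job id
  hadoop job -status job_201112201234_0042

  # or from the job history, given the job's output directory
  hadoop job -history /user/patai/wordcount-out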
Arun
On Dec 20, 2011, at 1:11 PM, Patai Sangbutsarakum wrote:
Thanks for reply,
Hi,
We have just checked out the latest version of the Hadoop source from
http://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.0-rc3 and we
have attempted to build it using the ant build.xml script. However, we are
getting errors relating to the jsp-compile command.
We are getting the
Thanks again Arun, you saved me again. :-)
This is a great starting point for CPU and possibly memory.
For the IOPS, I would just like to ask whether the tasktracker/datanode
collects the number, or whether we should dig into the OS level, e.g.
/proc/PID_OF_tt/io.
Hope this makes sense.
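To be concrete, this is the kind of OS-level read I mean (rough sketch; it
needs a kernel with per-process I/O accounting enabled, and the pgrep pattern
is just a guess at how the TaskTracker appears in the process list):

  # cumulative I/O counters for the tasktracker process
  TT_PID=$(pgrep -f org.apache.hadoop.mapred.TaskTracker)
  cat /proc/$TT_PID/io   # read_bytes, write_bytes, syscr, syscw, ...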
-P
On Tue, Dec 20, 2011 at 1:22
Go ahead and open a MR jira (would appreciate a patch too! ;) ).
thanks,
Arun
On Dec 20, 2011, at 2:55 PM, Patai Sangbutsarakum wrote:
Thanks again Arun, you save me again.. :-)
This is a great starting point. for CPU and possibly Mem.
For the IOPS, just would like to ask if the
Dear all,
I have been trying for many days to get a simple Hadoop cluster (with 2 nodes)
to work, but I am having trouble configuring the network parameters. I have
properly configured the SSH keys, and the /etc/hosts files are:
master-
127.0.0.1 localhost6.localdomain6 localhost
127.0.1.1
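For reference, a layout that is known to work for a small two-node cluster
looks roughly like this (the addresses and hostnames below are made-up
examples; on Ubuntu the usual catch is the 127.0.1.1 line mapping the
machine's own hostname, which is not what the Hadoop daemons should resolve
it to):

  # /etc/hosts on both nodes
  127.0.0.1    localhost
  192.168.0.1  master
  192.168.0.2  slave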
Hi Shevek/others,
I tried this.
The first job created about 78 files of 15 MB each.
I tried a second map-only job with IdentityMapper and
-Dmapred.min.split.size=1073741824, but it did not cause the output files to be
1 GB each; I got the same output as above, i.e. 78 files of 15 MB each.
Is there a way
Hi,
We are going to be loading a 4-5 GB delimited text file from a RHEL file
system into HDFS, to be managed
as an external table by Hive.
What is the recommended, fastest loading mechanism?
Thank you,
Edmon
Do you have some strict performance requirement or something? Because 5 GB is
pretty much nothing, really. I'd say copyFromLocal will do just fine.
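A minimal sketch, assuming the file is dropped into an HDFS directory that the
external table then points at (paths, table, and column names below are made
up):

  hadoop fs -mkdir /user/hive/external/mytable
  hadoop fs -copyFromLocal /local/path/data.txt /user/hive/external/mytable/

  -- then, in Hive
  CREATE EXTERNAL TABLE mytable (col1 STRING, col2 STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/user/hive/external/mytable';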
Cos
On Tue, Dec 20, 2011 at 10:32PM, Edmon Begoli wrote:
Hi,
We are going to be loading 4-5 GB text, delimited file from a RHEL file
system into