Michael Noll has a good description of the upgrade process here:
http://www.michael-noll.com/blog/2011/08/23/performing-an-hdfs-upgrade-of-an-hadoop-cluster/
It may not quite reflect the versions of Hadoop you plan to upgrade, but it
has some good pointers.
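In outline (a sketch from memory of the 1.x-era procedure, so treat
the exact flags as something to verify against the docs for your
versions):

  # back up the contents of dfs.name.dir before anything else, then:
  stop-all.sh                               # stop the cluster on the old version
  # install the new Hadoop release, then bring HDFS up in upgrade mode
  start-dfs.sh -upgrade
  hadoop dfsadmin -upgradeProgress status   # poll until the upgrade completes
  hadoop dfsadmin -finalizeUpgrade          # only once you're happy it all works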
Chris
On 30 May 2012 09:12, wrote:
M,
See http://wiki.apache.org/hadoop/FAQ - "3.6. I want to make a large
cluster smaller by taking out a bunch of nodes simultaneously. How can this
be done?"
This explains how to decommission nodes by moving the data off of the
existing node. It's fairly easy and painless (just add the nodename to
the excludes file and refresh the node list).
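In outline, assuming hdfs-site.xml already points dfs.hosts.exclude at
an excludes file (the hostname below is made up):

  # add the node you want to retire to the excludes file
  echo "datanode07.example.com" >> /etc/hadoop/conf/excludes
  # tell the Name Node to re-read it and start draining the node
  hadoop dfsadmin -refreshNodes

The node then shows as decommission-in-progress in the Name Node web
UI until its blocks have been re-replicated elsewhere, after which it
is safe to shut down.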
Have a look at OpenTSDB (http://opentsdb.net/overview.html), as it
does not have the same downsampling issue as Ganglia, and it stores the
metrics in HBase, making the data easier to access and process.
It's also pretty easy to add your own metrics.
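As a rough illustration of how simple pushing a point is: the TSD
accepts a plain-text "put" over a TCP socket (the metric name, host
and timestamp below are invented):

  echo "put hdfs.datanode.bytes.written 1338364800 12345 host=dn07" \
    | nc tsd.example.com 4242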
Another useful utility is 'collectl'
(http://collectl.sourceforge.net/).
Madhu,
Try working your way through the MapReduce tutorial here:
http://hadoop.apache.org/common/docs/r0.20.205.0/mapred_tutorial.html#Example%3A+WordCount+v1.0
that covers most of the concepts you require to do a distributed
sort.
Search for the word, "combiner", in the tutorial to understand how
map output can be pre-aggregated before the shuffle.
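To make that concrete, here's a minimal sketch in the old
org.apache.hadoop.mapred API that tutorial is written against; as in
WordCount v1.0, the same Reduce class is registered as the combiner so
partial sums happen map-side before the shuffle:

  import java.io.IOException;
  import java.util.Iterator;
  import java.util.StringTokenizer;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.*;

  public class WordCount {

    // Emits (word, 1) for every token in the input line.
    public static class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
      private final static IntWritable one = new IntWritable(1);
      private final Text word = new Text();
      public void map(LongWritable key, Text value,
                      OutputCollector<Text, IntWritable> output,
                      Reporter reporter) throws IOException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
          word.set(itr.nextToken());
          output.collect(word, one);
        }
      }
    }

    // Sums the counts for a key; usable both as combiner and as reducer.
    public static class Reduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
      public void reduce(Text key, Iterator<IntWritable> values,
                         OutputCollector<Text, IntWritable> output,
                         Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) sum += values.next().get();
        output.collect(key, new IntWritable(sum));
      }
    }

    public static void main(String[] args) throws Exception {
      JobConf conf = new JobConf(WordCount.class);
      conf.setJobName("wordcount");
      conf.setOutputKeyClass(Text.class);
      conf.setOutputValueClass(IntWritable.class);
      conf.setMapperClass(Map.class);
      conf.setCombinerClass(Reduce.class); // the key line: combine map output locally
      conf.setReducerClass(Reduce.class);
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));
      JobClient.runJob(conf);
    }
  }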
Jorn,
If you've configured the Name Node to replicate the fsimage and edit
log to both NFS and the Secondary Name Node, and you regularly back up
the fsimage and edit logs, you would do better investing time in
understanding exactly how the Name Node builds up its internal
database and how it applies its edit logs.
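For anyone following along: that replication is just a comma-separated
dfs.name.dir in hdfs-site.xml (the 1.x property name; the paths here
are examples). The Name Node writes the fsimage and edit log to every
listed directory:

  <property>
    <name>dfs.name.dir</name>
    <!-- local disk plus an NFS mount; fsimage/edits go to both -->
    <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
  </property>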
On 29 September 2011 18:39, lessonz wrote:
> I'm new to Hadoop, and I'm trying to understand the implications of a 64M
> block size in the HDFS. Is there a good reference that enumerates the
> implications of this decision and its effects on files stored in the system
> as well as map-reduce jobs?
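For reference, the block size in that era is controlled by
dfs.block.size in hdfs-site.xml, in bytes, and can also be overridden
per file at write time; the value below is only an example:

  <property>
    <name>dfs.block.size</name>
    <value>134217728</value> <!-- 128 MB; the default is 67108864, i.e. 64 MB -->
  </property>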
Elton,
Rapleaf's blog has an interesting posting on their experience that's
worth a read:
http://blog.rapleaf.com/dev/2010/08/26/analyzing-some-interesting-networks-for-mapreduce-clusters/
And if you want to get an idea of the interaction between CPU, disk
and network, there's nothing like a picture.
Worth a look at OpenTSDB ( http://opentsdb.net/ ) as it doesn't lose
precision on the historical data.
It also has some neat tricks around the collection and display of data.
Another useful tool is 'collectl' ( http://collectl.sourceforge.net/ ),
which is a lightweight Perl script that both captures and plays back
system performance data.
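For example (the flags here are from memory, so check the man page):
record the CPU, disk and network subsystems every five seconds to a
file, then play the capture back later:

  collectl -scdn -i 5 -f /var/log/collectl      # record cpu/disk/network
  collectl -p /var/log/collectl/*.raw.gz -scdn  # play back a capture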
2010/9/6 褚 鵬兵 :
>
> Hi, my Hadoop friends. I have 3 questions about Hadoop. They are:
>
> 1. The speed between the datanodes. With terabytes of data on one
> datanode, the data transfers from one datanode to another datanode. If
> the speed is bad, Hadoop will be slow, I think. I heard t
Some thoughts on how to restrict the temporary data, but I have only
tried (a) in anger:
a) Partition your disks into HDFS and intermediate temp partitions
of the relevant size (see the config sketch below). This gives a
fixed separation but is difficult/impossible to modify on a busy
cluster, especially as there may be no
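To make (a) concrete, the split is expressed by pointing the DFS data
directories and the MapReduce local directories at the separate
partitions (1.x property names; the mount points are invented):

  <!-- hdfs-site.xml: HDFS block storage on its own partitions -->
  <property>
    <name>dfs.data.dir</name>
    <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
  </property>

  <!-- mapred-site.xml: intermediate map output on the temp partitions -->
  <property>
    <name>mapred.local.dir</name>
    <value>/temp/1/mapred/local,/temp/2/mapred/local</value>
  </property>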