from:"Chris Smith"

Re: Best Practices for Upgrading Hadoop Version?

2012-05-30 Thread Chris Smith

Michael Noll has a good description of the upgrade process here: http://www.michael-noll.com/blog/2011/08/23/performing-an-hdfs-upgrade-of-an-hadoop-cluster/ It may not quite reflect the versions of Hadoop you plan to upgrade, but it has some good pointers. Chris On 30 May 2012 09:12, wrote: >

Re: Moving blocks from a datanode

2012-05-22 Thread Chris Smith

M, See http://wiki.apache.org/hadoop/FAQ - "3.6. I want to make a large cluster smaller by taking out a bunch of nodes simultaneously. How can this be done?" This explains how to decommission nodes by moving the data off the existing node. It's fairly easy and painless (just add the nodename t
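As a rough illustration of the decommissioning step the FAQ entry describes, the sketch below appends hostnames to an exclude file and records the command that asks the NameNode to re-read it. It assumes the cluster already points `dfs.hosts.exclude` at that file; the helper name `add_to_exclude_file` and the hostnames are hypothetical, and the truncated message above may describe the steps differently.

```python
from pathlib import Path

# Command that asks the NameNode to re-read its include/exclude files
# (Hadoop 0.20/1.x style CLI; run it yourself after editing the file).
REFRESH_COMMAND = ["hadoop", "dfsadmin", "-refreshNodes"]


def add_to_exclude_file(exclude_path, hostnames):
    """Add hostnames to the exclude file, skipping ones already listed.

    Returns the list of hostnames that were actually added.
    Assumes dfs.hosts.exclude in the cluster config points at exclude_path.
    """
    path = Path(exclude_path)
    text = path.read_text() if path.exists() else ""
    listed = set(text.split())

    added = []
    for host in hostnames:
        if host not in listed:
            listed.add(host)
            added.append(host)

    if added:
        # Make sure new entries start on their own line.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        path.write_text(text + prefix + "\n".join(added) + "\n")
    return added
```

After updating the file, running `REFRESH_COMMAND` (for example via `subprocess.run`) starts decommissioning, and the nodes can be removed once the NameNode reports them as decommissioned.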

Re: collecting CPU, mem, iops of hadoop jobs

2012-01-03 Thread Chris Smith

Have a look at OpenTSDB (http://opentsdb.net/overview.html) as this does not have the same downsampling issue as Ganglia and stores the metrics in HBase, making it easier to access and process the data. It's also pretty easy to add your own metrics. Another useful utility is 'collectl' (http://col
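To give a feel for how little is needed to add a custom metric, here is a minimal sketch that formats a data point in OpenTSDB's text `put` format (`put <metric> <timestamp> <value> <tag=value ...>`). The function name `format_put`, the metric name, and the tag values are hypothetical; actually sending the line to a TSD is left out.

```python
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one OpenTSDB 'put' line for a single data point.

    tags is a dict of tag name -> tag value; OpenTSDB expects at
    least one tag on every data point.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # Sort tags so the output is stable regardless of dict ordering.
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {ts} {value} {tag_str}"


if __name__ == "__main__":
    # Hypothetical per-job CPU metric tagged with the host it ran on.
    print(format_put("hadoop.job.cpu", 42.5, {"host": "dn01"}))
```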

Re: Distributed sorting using Hadoop

2011-11-29 Thread Chris Smith

Madhu, Try working your way through the MapReduce tutorial here: http://hadoop.apache.org/common/docs/r0.20.205.0/mapred_tutorial.html#Example%3A+WordCount+v1.0 that covers most of the concepts you require to do a distributed sort. Search for the word "combiner" in the tutorial to understand a
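For readers who want the combiner idea before opening the tutorial, the following is a small in-memory simulation in plain Python, not Hadoop code. It assumes a WordCount-style job; all function names are hypothetical. The combiner pre-aggregates each mapper's output locally so fewer pairs cross the network to the reducer.

```python
from collections import Counter, defaultdict


def map_words(line):
    """Map step: emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sum counts per word within a single mapper's output."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def reduce_counts(pairs):
    """Reduce step: sum counts per word; keys come out in sorted order."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(sorted(totals.items()))


def word_count(splits, use_combiner=True):
    """Run the simulated job; return (result, number of shuffled pairs)."""
    shuffled = []
    for split in splits:  # each split stands in for one map task
        pairs = [p for line in split for p in map_words(line)]
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return reduce_counts(shuffled), len(shuffled)
```

With `splits = [["a b a", "b a"], ["a c"]]`, both runs give the same counts, but the combiner run shuffles 4 pairs instead of 7.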

Re: Running more than one secondary namenode

2011-10-12 Thread Chris Smith

Jorn, If you've configured the Name Node fsimage and edit log replication to both NFS and Secondary Name Node and regularly back up the fsimage and edit logs, you would do better investing time in understanding exactly how the Name Node builds up its internal database and how it applies its edit

Re: Block Size

2011-09-29 Thread Chris Smith

On 29 September 2011 18:39, lessonz wrote: > I'm new to Hadoop, and I'm trying to understand the implications of a 64M > block size in the HDFS. Is there a good reference that enumerates the > implications of this decision and its effects on files stored in the system > as well as map-reduce jobs?
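One way to reason about the question is simple arithmetic on block counts. The sketch below assumes the 64 MB block size mentioned above and that, by default, a file-based MapReduce job gets roughly one map task per block; the function name `block_count` is hypothetical.

```python
MB = 1024 * 1024


def block_count(file_size_bytes, block_size_bytes=64 * MB):
    """Number of HDFS blocks a file of the given size occupies."""
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: the last block may be partially filled.
    return -(-file_size_bytes // block_size_bytes)


if __name__ == "__main__":
    # One 1 GB file: few blocks, so few block entries and map tasks.
    print(block_count(1024 * MB))
    # Ten thousand 1 KB files: one block each, so many more block
    # entries for the NameNode to track for far less data.
    print(10_000 * block_count(1024))
```

A file smaller than the block size still counts as one block in the NameNode's metadata, although it only uses its actual size on disk, which is why many small files are usually a poor fit for HDFS.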

Re: Why inter-rack communication in mapreduce slow?

2011-06-06 Thread Chris Smith

Elton, Rapleaf's blog has an interesting posting on their experience that's worth a read: http://blog.rapleaf.com/dev/2010/08/26/analyzing-some-interesting-networks-for-mapreduce-clusters/ And if you want to get an idea of the interaction between CPU, Disk and Network there's nothing like a pictu

Re: tips and tools to optimize cluster

2011-05-24 Thread Chris Smith

Worth a look at OpenTSDB ( http://opentsdb.net/ ) as it doesn't lose precision on the historical data. It also has some neat tricks around the collection and display of data. Another useful tool is 'collectl' ( http://collectl.sourceforge.net/ ) which is a lightweight Perl script that both captur

Re: the question of hadoop

2010-09-08 Thread Chris Smith

2010/9/6 褚鵬兵 : > > hi ,my hadoop friends:i have the 3 questions about hadoop.there are > > 1 the speed between the datanodes. Tera data in one datanodes , the data > transfers from one datanode to the another datanode. if the speed is bad, > Hadoop will be slow, i think. i heard t

RE: Question about disk space allocation in hadoop

2010-06-30 Thread Chris Smith

Some thoughts on how to restrict the temporary data, but I have only tried (a) in anger: a) Partition your disks into HDFS and intermediate temp partitions of the relevant size. This gives a fixed separation but is difficult/impossible to modify on a busy cluster especially as there may be no

10 matches
