Re: dfs.name.dir capacity for namenode backup?
On Mon, May 17, 2010 at 5:10 PM, jiang licht licht_ji...@yahoo.com wrote: I am considering using a machine to save a redundant copy of HDFS metadata by setting dfs.name.dir in hdfs-site.xml like this (as in YDN):

<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
  <final>true</final>
</property>

where the two folders are on different machines, so that /mnt/namenode-backup keeps a copy of the HDFS file system information and its machine can be used to replace the first machine if it fails as namenode. So, my question is how much space this HDFS metadata will consume. I guess it is proportional to the HDFS capacity. What ratio is that, or what size will it be for a 150TB HDFS?

On the order of a few GB, max (you really need double the size of your image, so it has tmp space when downloading a checkpoint or performing an upgrade). But on any disk you can buy these days you'll have plenty of space.

-Todd

-- Todd Lipcon Software Engineer, Cloudera
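As a rough back-of-the-envelope check (using the commonly cited rule of thumb of very roughly 150 bytes of namenode memory per file or block object, an approximation rather than a measured figure): 150TB of data at the default 64MB block size is on the order of 2-3 million blocks, which works out to a few hundred MB of block metadata, plus a comparable amount for the file and directory objects themselves. The on-disk fsimage is of the same order, so a few GB of disk comfortably covers the image plus the temporary space Todd mentions for checkpoints and upgrades.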
Hadoop User Group UK Meetup - June 3rd
Hi all, I've picked up where Johan left off with the HUGUK meetups and the next one is planned for June 3rd. The main talks will be: “Introduction to Sqoop” by Aaron Kimball (Cloudera) “Hive at Last.fm” by Tim Sell (Last.fm) More details are available at: http://dumbotics.com/2010/05/18/huguk-4/ -Klaas
Any possible to set hdfs block size to a value smaller than 64MB?
Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea, but I'm not the one making the technical decisions and the port has to be done for yesterday... Of course such small files are a problem; loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB), block size? Thank you. Pierre ANCELOT.
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT.
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be split, it will be split when larger than 64MB?

On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT.

-- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
Re: Any possible to set hdfs block size to a value smaller than 64MB?
... and by slices of 64MB then I mean... ? On Tue, May 18, 2010 at 2:38 PM, Pierre ANCELOT pierre...@gmail.com wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.eduwrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT. -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
Re: Data node decommission doesn't seem to be working correctly
Hey Scott, Hadoop tends to get confused by nodes with multiple hostnames or multiple IP addresses. Is this your case? I can't remember precisely what our admin does, but I think he puts in the IP address which Hadoop listens on in the exclude-hosts file. Look in the output of hadoop dfsadmin -report to determine precisely which IP address your datanode is listening on. Brian

On May 17, 2010, at 11:32 PM, Scott White wrote: I followed the steps mentioned here: http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to decommission a data node. What I see from the namenode is the hostname of the machine that I decommissioned shows up in both the list of dead nodes but also live nodes where its admin status is marked as 'In Service'. It's been twelve hours and there is no sign in the namenode logs that the node has been decommissioned. Any suggestions of what might be the problem and what to try to ensure that this node gets safely taken down? thanks in advance, Scott
Re: Any possible to set hdfs block size to a value smaller than 64MB?
On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB?

For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian

On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT.

-- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
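For what it's worth, here is a minimal sketch of how to confirm this from client code, assuming the Hadoop 0.20 Java API (the class name and the path passed on the command line are hypothetical); it prints how many blocks HDFS actually allocated for a given file:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlocks {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml on the classpath
    FileSystem fs = FileSystem.get(conf);
    Path p = new Path(args[0]);                 // e.g. /user/pierre/somefile (hypothetical)
    FileStatus stat = fs.getFileStatus(p);
    // ask the namenode which blocks back this file
    BlockLocation[] blocks = fs.getFileBlockLocations(stat, 0, stat.getLen());
    System.out.println(p + ": " + stat.getLen() + " bytes in " + blocks.length
        + " block(s), block size " + stat.getBlockSize());
  }
}

A 32KB file should report one block; a 65MB file with the default 64MB block size should report two.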
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)...

- If you have small files that take up less than 64MB, you do not actually use the entire 64MB block on disk.
- You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode.
- Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead.
- The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field (see the sketch below).

In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick

On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :)

On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian

On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT.

-- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
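A minimal sketch of the SequenceFile consolidation Patrick describes, assuming the Hadoop 0.20 API (the class name and the input/output paths are hypothetical); it packs every small file in a directory into one SequenceFile, with the filename as the key and the raw bytes as the value:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inDir = new Path(args[0]);    // directory of small files, e.g. /user/pierre/small (hypothetical)
    Path outFile = new Path(args[1]);  // packed output, e.g. /user/pierre/packed.seq (hypothetical)
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, outFile, Text.class, BytesWritable.class);
    try {
      for (FileStatus stat : fs.listStatus(inDir)) {
        if (stat.isDir()) continue;
        // assumes each small file fits comfortably in memory
        byte[] buf = new byte[(int) stat.getLen()];
        FSDataInputStream in = fs.open(stat.getPath());
        try {
          in.readFully(0, buf);
        } finally {
          in.close();
        }
        writer.append(new Text(stat.getPath().getName()), new BytesWritable(buf));
      }
    } finally {
      IOUtils.closeStream(writer);
    }
  }
}

A job can then read the single packed file with SequenceFileInputFormat, so you get a handful of map tasks instead of one per small file.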
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Thank you, Any way I can measure the startup overhead in terms of time? On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.comwrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :) On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT. -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Should be evident in the total job running time... that's the only metric that really matters :) On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.comwrote: Thank you, Any way I can measure the startup overhead in terms of time? On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :) On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT. -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
Re: Any possible to set hdfs block size to a value smaller than 64MB?
If you know how to use AspectJ to do aspect oriented programming. You can write a aspect class. Let it just monitors the whole process of MapReduce On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles patr...@cloudera.comwrote: Should be evident in the total job running time... that's the only metric that really matters :) On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com wrote: Thank you, Any way I can measure the startup overhead in terms of time? On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :) On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT. 
-- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- Best Wishes! 顺送商祺! -- Chen He (402)613-9298 PhD. student of CSE Dept. Holland Computing Center University of Nebraska-Lincoln Lincoln NE 68588
Re: what's the mechanism to determine the reducer number and reduce progress
Thanks PanFeng, do you have a more detailed explanation of this? Is it calculated from how many reduce tasks have completed each phase? Also, what's the answer to my second question? Thanks!

On Mon, May 17, 2010 at 12:44 PM, 原攀峰 ypf...@163.com wrote: For a reduce task, the execution is divided into three phases, each of which accounts for 1/3 of the score:
• The copy phase, when the task fetches map outputs.
• The sort phase, when map outputs are sorted by key.
• The reduce phase, when a user-defined function is applied to the list of map outputs with each key.
-- Yuan Panfeng(原攀峰) | BeiHang University TEL: +86-13426166934 MSN: ypf...@hotmail.com EMAIL: ypf...@gmail.com QQ: 362889262

On 2010-05-17 09:44:38, stan lee lee.stan...@gmail.com wrote: When I run the sort job, I found that when there are 70 reduce tasks running and none completed, the progress bar shows that it has finished about 80%, so how does the mapreduce mechanism calculate this? Also, when I run a job, as we know, we can determine the total number of reduce tasks through the setNumReduceTasks() function, but how is the number of reducers being used (I mean the number of tasktrackers which run the reduce tasks) determined? Thanks! Stan. Lee
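To make the 1/3-per-phase scoring concrete with a purely illustrative calculation (not a measured figure): if all 70 running reduce tasks in stan's example have finished the copy and sort phases but none has finished the reduce phase, each task already reports roughly 2/3 (about 67%) progress, so the aggregate reduce progress can sit well above 67% with zero tasks completed; a reading around 80% simply means the tasks are, on average, about 40% of the way through their final phase (2/3 + 1/3 x 0.4 = 0.80).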
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Hey Hassan, 1) The overhead is pretty small, measured in a small number of milliseconds on average 2) HDFS is not designed for online latency. Even though the average is small, if something bad happens, your clients might experience a lot of delays while going through the retry stack. The initial design was for batch processing, and latency-sensitive applications came later. Additionally since the NN is a SPOF, you might want to consider your uptime requirements. Each organization will have to balance these risks with the advantages (such as much cheaper hardware). There's a nice interview with the GFS authors here where they touch upon the latency issues: http://queue.acm.org/detail.cfm?id=1594206 As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you. As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware. Our HDFS routinely serves 5-10 Gbps. Brian On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote: This is a very interesting thread to us, as we are thinking about deploying HDFS as a massive online storage for a on online university, and then serving the video files to students who want to view them. We cannot control the size of the videos (and some class work files), as they will mostly be uploaded by the teachers providing the classes. How would the overall through put of HDFS be affected in such a solution? Would HDFS be feasible at all for such a setup? Regards HASSAN On Tue, May 18, 2010 at 21:11, He Chen airb...@gmail.com wrote: If you know how to use AspectJ to do aspect oriented programming. You can write a aspect class. Let it just monitors the whole process of MapReduce On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles patr...@cloudera.com wrote: Should be evident in the total job running time... that's the only metric that really matters :) On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com wrote: Thank you, Any way I can measure the startup overhead in terms of time? On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :) On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? 
For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a
Do we need to install both 32 and 64 bit lzo2 to enable lzo compression and how can we use gzip compression codec in hadoop
Hi Guys, I am trying to use compression to reduce the IO workload when running a job, but failed. I have several questions which need your help. For lzo compression, I found a guide at http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ; why does it say "Note that you must have both 32-bit and 64-bit liblzo2 installed"? I am not sure whether it means that we also need 32bit liblzo2 installed even when we are on a 64bit system. If so, why?

Also, if I don't use lzo compression and instead try to use gzip to compress the final reduce output file, I just set the value below in mapred-site.xml, but it seems it doesn't work (how can I find the final .gz file? I used hadoop dfs -ls dir and didn't find it). My question: can we use gzip to compress the final result when it's not a streaming job? How can we ensure that compression has been enabled during a job execution?

<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>

Thanks! Stan Lee
Re: Do we need to install both 32 and 64 bit lzo2 to enable lzo compression and how can we use gzip compression codec in hadoop
32bit liblzo2 isn't needed on 64-bit systems. On Tue, May 18, 2010 at 8:44 AM, stan lee lee.stan...@gmail.com wrote: Hi Guys, I am trying to use compression to reduce the IO workload when trying to run a job but failed. I have several questions which needs your help. For lzo compression, I found a guide http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ, why it said Note that you must have both 32-bit and 64-bit liblzo2 installed ? I am not sure whether it means that we also need 32bit liblzo2 installed even when we are on 64bit system. If so, why? Also if I don't use lzo compression and tried to use gzip to compress the final reduce output file, I just set below value in mapred-site.xml, but seems it doesn't work(how can I find the final .gz file compressed? I used hadoop dfs -l dir and didn't find that.). My question: can we use gzip to compress the final result when it's not streaming job? How can we ensure that the compression has been enabled during a job execution? property namemapred.output.compress/name valuetrue/value /property Thanks! Stan Lee
Re: Data node decommission doesn't seem to be working correctly
Hi Scott, You might be hitting two different issues. 1) Decommission not finishing. https://issues.apache.org/jira/browse/HDFS-694 explains decommission never finishing due to open files in 0.20 2) Nodes showing up both in live and dead nodes. I remember Suresh taking a look at this. It was something about same node registered with hostname and IP separately (when datanode is rejumped and started fresh (?)). Cc-ing Suresh. Koji On 5/17/10 9:32 PM, Scott White scottbl...@gmail.com wrote: I followed the steps mentioned here: http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to decommission a data node. What I see from the namenode is the hostname of the machine that I decommissioned shows up in both the list of dead nodes but also live nodes where its admin status is marked as 'In Service'. It's been twelve hours and there is no sign in the namenode logs that the node has been decommissioned. Any suggestions of what might be the problem and what to try to ensure that this node gets safely taken down? thanks in advance, Scott
Re: Data node decommission doesn't seem to be working correctly
Dfsadmin -report reports the hostname for that machine and not the ip. That machine happens to be the master node which is why I am trying to decommission the data node there since I only want the data node running on the slave nodes. Dfs admin -report reports all the ips for the slave nodes. One question: I believe that the namenode was accidentally restarted during the 12 hours or so I was waiting for the decommission to complete. Would this put things into a bad state? I did try running dfsadmin -refreshNodes after it was restarted. Scott On Tue, May 18, 2010 at 5:44 AM, Brian Bockelman bbock...@cse.unl.eduwrote: Hey Scott, Hadoop tends to get confused by nodes with multiple hostnames or multiple IP addresses. Is this your case? I can't remember precisely what our admin does, but I think he puts in the IP address which Hadoop listens on in the exclude-hosts file. Look in the output of hadoop dfsadmin -report to determine precisely which IP address your datanode is listening on. Brian On May 17, 2010, at 11:32 PM, Scott White wrote: I followed the steps mentioned here: http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to decommission a data node. What I see from the namenode is the hostname of the machine that I decommissioned shows up in both the list of dead nodes but also live nodes where its admin status is marked as 'In Service'. It's been twelve hours and there is no sign in the namenode logs that the node has been decommissioned. Any suggestions of what might be the problem and what to try to ensure that this node gets safely taken down? thanks in advance, Scott
Re: Do we need to install both 32 and 64 bit lzo2 to enable lzo compression and how can we use gzip compression codec in hadoop
Hi stan, You can do something of this sort if you use FileOutputFormat, from within your Job Driver:

FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
// GzipCodec from org.apache.hadoop.io.compress.
// and where 'job' is either JobConf or Job object.

This will write the simple file output in Gzip format. You also have BZip2Codec.

On Tue, May 18, 2010 at 9:14 PM, stan lee lee.stan...@gmail.com wrote: Hi Guys, I am trying to use compression to reduce the IO workload when trying to run a job but failed. I have several questions which needs your help. For lzo compression, I found a guide http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ, why it said Note that you must have both 32-bit and 64-bit liblzo2 installed ? I am not sure whether it means that we also need 32bit liblzo2 installed even when we are on 64bit system. If so, why? Also if I don't use lzo compression and tried to use gzip to compress the final reduce output file, I just set below value in mapred-site.xml, but seems it doesn't work(how can I find the final .gz file compressed? I used hadoop dfs -ls dir and didn't find that.). My question: can we use gzip to compress the final result when it's not streaming job? How can we ensure that the compression has been enabled during a job execution?

<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>

Thanks! Stan Lee

-- Harsh J www.harshj.com
Re: preserve JobTracker information
Preserved JobTracker history is already available at /jobhistory.jsp. There is a link at the end of the /jobtracker.jsp page that leads to this. There's also free analysis to go with that! :)

On Tue, May 18, 2010 at 11:00 PM, Alan Miller someb...@squareplanet.de wrote: Hi, Is there a way to preserve previous job information (Completed Jobs, Failed Jobs) when the hadoop cluster is restarted? Every time I start up my cluster (start-dfs.sh, start-mapred.sh) the JobTracker interface at http://myhost:50020/jobtracker.jsp is always empty. Thanks, Alan

-- Harsh J www.harshj.com
Re: Any possible to set hdfs block size to a value smaller than 64MB?
I had an experiment with block size of 10 bytes (sic!). This was _very_ slow on NN side. Like writing 5 Mb was happening for 25 minutes or so :( No fun to say the least... On Tue, May 18, 2010 at 10:56AM, Konstantin Shvachko wrote: You can also get some performance numbers and answers to the block size dilemma problem here: http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html I remember some people were using Hadoop for storing or streaming videos. Don't know how well that worked. It would be interesting to learn about your experience. Thanks, --Konstantin On 5/18/2010 8:41 AM, Brian Bockelman wrote: Hey Hassan, 1) The overhead is pretty small, measured in a small number of milliseconds on average 2) HDFS is not designed for online latency. Even though the average is small, if something bad happens, your clients might experience a lot of delays while going through the retry stack. The initial design was for batch processing, and latency-sensitive applications came later. Additionally since the NN is a SPOF, you might want to consider your uptime requirements. Each organization will have to balance these risks with the advantages (such as much cheaper hardware). There's a nice interview with the GFS authors here where they touch upon the latency issues: http://queue.acm.org/detail.cfm?id=1594206 As GFS and HDFS share many design features, the theoretical parts of their discussion might be useful for you. As far as overall throughput of the system goes, it depends heavily upon your implementation and hardware. Our HDFS routinely serves 5-10 Gbps. Brian On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote: This is a very interesting thread to us, as we are thinking about deploying HDFS as a massive online storage for a on online university, and then serving the video files to students who want to view them. We cannot control the size of the videos (and some class work files), as they will mostly be uploaded by the teachers providing the classes. How would the overall through put of HDFS be affected in such a solution? Would HDFS be feasible at all for such a setup? Regards HASSAN On Tue, May 18, 2010 at 21:11, He Chenairb...@gmail.com wrote: If you know how to use AspectJ to do aspect oriented programming. You can write a aspect class. Let it just monitors the whole process of MapReduce On Tue, May 18, 2010 at 10:00 AM, Patrick Angelespatr...@cloudera.com wrote: Should be evident in the total job running time... that's the only metric that really matters :) On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOTpierre...@gmail.com wrote: Thank you, Any way I can measure the startup overhead in terms of time? On Tue, May 18, 2010 at 4:27 PM, Patrick Angelespatr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... 
put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOTpierre...@gmail.com wrote: Okay, thank you :) On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this:
Re: Data node decommission doesn't seem to be working correctly
Hey Scott, If the node shows up in the dead nodes and the live nodes as you say, it's definitely not even attempting to be decommissioned. If HDFS was attempting decommissioning and you restart the namenode, then it would only show up in the dead nodes list. Another option is to just turn off HDFS on that node alone, and don't physically delete the data from the node until HDFS completely recovers. This is not recommended for production usage, as it creates a period where the cluster is in danger of losing files. However, it can be used as a one-off to get over this speed-hump. Brian

On May 18, 2010, at 12:02 PM, Scott White wrote: Dfsadmin -report reports the hostname for that machine and not the ip. That machine happens to be the master node which is why I am trying to decommission the data node there since I only want the data node running on the slave nodes. Dfs admin -report reports all the ips for the slave nodes. One question: I believe that the namenode was accidentally restarted during the 12 hours or so I was waiting for the decommission to complete. Would this put things into a bad state? I did try running dfsadmin -refreshNodes after it was restarted. Scott

On Tue, May 18, 2010 at 5:44 AM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Scott, Hadoop tends to get confused by nodes with multiple hostnames or multiple IP addresses. Is this your case? I can't remember precisely what our admin does, but I think he puts in the IP address which Hadoop listens on in the exclude-hosts file. Look in the output of hadoop dfsadmin -report to determine precisely which IP address your datanode is listening on. Brian

On May 17, 2010, at 11:32 PM, Scott White wrote: I followed the steps mentioned here: http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to decommission a data node. What I see from the namenode is the hostname of the machine that I decommissioned shows up in both the list of dead nodes but also live nodes where its admin status is marked as 'In Service'. It's been twelve hours and there is no sign in the namenode logs that the node has been decommissioned. Any suggestions of what might be the problem and what to try to ensure that this node gets safely taken down? thanks in advance, Scott
Re: Do we need to install both 32 and 64 bit lzo2 to enable lzo compression and how can we use gzip compression codec in hadoop
Stan, See my comments inline. Thanks, Hong On May 18, 2010, at 8:44 AM, stan lee wrote: Hi Guys, I am trying to use compression to reduce the IO workload when trying to run a job but failed. I have several questions which needs your help. For lzo compression, I found a guide http://code.google.com/p/hadoop-gpl-compression/wiki/FAQ, why it said Note that you must have both 32-bit and 64-bit liblzo2 installed ? I am not sure whether it means that we also need 32bit liblzo2 installed even when we are on 64bit system. If so, why? The answer on the wiki page is to the question of how to set up the native libraries so that both 32-bit AND 64-bit java would work. If you adhere to an environment with the same flavor of java across the whole cluster, then the solution would not apply to you. Also if I don't use lzo compression and tried to use gzip to compress the final reduce output file, I just set below value in mapred-site.xml, but seems it doesn't work(how can I find the final .gz file compressed? I used hadoop dfs -l dir and didn't find that.). My question: can we use gzip to compress the final result when it's not streaming job? How can we ensure that the compression has been enabled during a job execution? property namemapred.output.compress/name valuetrue/value /property The truth is, this option is honored by the implementation of OutputFormat classes. If you use TextOutputFormat, then you should see files like part-.gz in the output directory. If you write your own output format class, then you should follow the implementations of TextOutputFormat or SequenceFileOutputFormat to set up compression properly.
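For reference, a minimal sketch of setting the same output compression per job from the driver instead of in mapred-site.xml, assuming the Hadoop 0.20 mapred API (the class name is hypothetical); this has the same effect as the FileOutputFormat calls shown earlier in the thread:

import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapred.JobConf;

public class CompressionConfigExample {
  public static JobConf configure(JobConf job) {
    // same properties you would put in mapred-site.xml, but scoped to this job
    job.setBoolean("mapred.output.compress", true);
    job.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);
    return job;
  }
}

Either way, if TextOutputFormat is in use you can confirm the setting took effect by checking that the job's output files carry a .gz extension.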
Re: dfs.name.dir capacity for namenode backup?
Sorry to hijack, but after following this thread, I had a related question about the secondary location of dfs.name.dir. Is the approach outlined below the preferred/suggested way to do this? Is this what people mean when they say "stick it on NFS"? Thanks!

On May 17, 2010, at 11:14 PM, Todd Lipcon wrote: On Mon, May 17, 2010 at 5:10 PM, jiang licht licht_ji...@yahoo.com wrote: I am considering to use a machine to save a redundant copy of HDFS metadata through setting dfs.name.dir in hdfs-site.xml like this (as in YDN):

<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
  <final>true</final>
</property>

where the two folders are on different machines so that /mnt/namenode-backup keeps a copy of hdfs file system information and its machine can be used to replace the first machine that fails as namenode. So, my question is how big this hdfs metadata will consume? I guess it is proportional to the hdfs capacity. What ratio is that or what size will be for 150TB hdfs? On the order of a few GB, max (you really need double the size of your image, so it has tmp space when downloading a checkpoint or performing an upgrade). But on any disk you can buy these days you'll have plenty of space. -Todd -- Todd Lipcon Software Engineer, Cloudera
Re: dfs.name.dir capacity for namenode backup?
Yes, we recommend at least one local directory and one NFS directory for dfs.name.dir in production environments. This allows an up-to-date recovery of NN metadata if the NN should fail. In future versions the BackupNode functionality will move us one step closer to not needing NFS for production deployments. Note that the NFS directory does not need to be anything fancy - you can simply use an NFS mount on another normal Linux box. -Todd On Tue, May 18, 2010 at 11:19 AM, Andrew Nguyen and...@ucsfcti.org wrote: Sorry to hijack but after following this thread, I had a related question to the secondary location of dfs.name.dir. Is the approach outlined below the preferred/suggested way to do this? Is this people mean when they say, stick it on NFS ? Thanks! On May 17, 2010, at 11:14 PM, Todd Lipcon wrote: On Mon, May 17, 2010 at 5:10 PM, jiang licht licht_ji...@yahoo.com wrote: I am considering to use a machine to save a redundant copy of HDFS metadata through setting dfs.name.dir in hdfs-site.xml like this (as in YDN): property namedfs.name.dir/name value/home/hadoop/dfs/name,/mnt/namenode-backup/value finaltrue/final /property where the two folders are on different machines so that /mnt/namenode-backup keeps a copy of hdfs file system information and its machine can be used to replace the first machine that fails as namenode. So, my question is how big this hdfs metatdata will consume? I guess it is proportional to the hdfs capacity. What ratio is that or what size will be for 150TB hdfs? On the order of a few GB, max (you really need double the size of your image, so it has tmp space when downloading a checkpoint or performing an upgrade). But on any disk you can buy these days you'll have plenty of space. -Todd -- Todd Lipcon Software Engineer, Cloudera -- Todd Lipcon Software Engineer, Cloudera
Re: Any possible to set hdfs block size to a value smaller than 64MB?
Thanks for the sarcasm, but with 30k small files and so 30k Mapper instantiations, even though it's not (and never did I say it was) the only metric that matters, it seems to me like something very interesting to check out... I have hierarchy over me and they will be happy to understand my choices with real numbers to base their understanding on. Thanks.

On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com wrote: Should be evident in the total job running time... that's the only metric that really matters :)

On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com wrote: Thank you, Any way I can measure the startup overhead in terms of time?

On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick

On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :)

On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all. I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian

On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT.
-- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect -- http://www.neko-consulting.com Ego sum quis ego servo Je suis ce que je protège I am what I protect
Re: JAVA_HOME not set
Are you using Cloudera's hadoop 0.20.2? There's some logic in bin/hadoop-config.sh that seems to be failing if JAVA_HOME isn't set, and it runs before hadoop-env.sh. If you think it might be the same problem, please weigh in: http://getsatisfaction.com/cloudera/topics/java_home_setting_in_hadoop_env_sh_not_respected_in_cdh_3 - David

On Tue, May 18, 2010 at 12:30 PM, Erik Test erik.shi...@gmail.com wrote: Hi All, I continually get this error when trying to run start-all.sh for hadoop 0.20.2 on ubuntu. What confuses me is I DO have JAVA_HOME set in hadoop-env.sh to /usr/lib/jvm/jdk1.6.0_17. I've double checked to see that JAVA_HOME is set to this by echoing the path before running the start script but still no luck. I then tried adding bin to the path but then got errors saying /usr/lib/jvm/jdk1.6.0_17/bin/bin/java couldn't be found. Can someone give me suggestions on how to fix this problem please? Erik
Re: JAVA_HOME not set
Hm. I actually just changed to this version Erik On 18 May 2010 15:59, David Howell dehow...@gmail.com wrote: Are you using Cloudera's hadoop 0.20.2? There's some logic in bin/hadoop-config.sh that seems to be failing if JAVA_HOME isn't set, and it runs before hadoop-env.sh. If you think it might be the same problem, please weigh in: http://getsatisfaction.com/cloudera/topics/java_home_setting_in_hadoop_env_sh_not_respected_in_cdh_3 - David On Tue, May 18, 2010 at 12:30 PM, Erik Test erik.shi...@gmail.com wrote: Hi All, I continually get this error when trying to run start-all.sh for hadoop 0.20.2 on ubuntu. What confuses me is I DO have JAVA_HOME set in hadoop-env.sh to /usr/lib/jvm/jdk1.6.0_17. I've double checked to see that JAVA_HOME is set to this by echoing the path before running the start script but still now luck. I then tried adding bin to the path but then got errors saying /usr/lib/jvm/jdk1.6.0_17/bin/bin/java couldn't be found. Can someone give me suggestions on how to fix this problem please? Erik
Re: Any possible to set hdfs block size to a value smaller than 64MB?
That wasn't sarcasm. This is what you do:
- Run your mapreduce job on 30k small files.
- Consolidate your 30k small files into larger files.
- Run mapreduce on the larger files.
- Compare the running time.

The difference in runtime is made up by your task startup and seek overhead. If you want to get the 'average' overhead per task, divide the total times for each job by the number of map tasks. This won't be a true average because with larger chunks of data, you will have longer running map tasks that will hold up the shuffle phase. But the average doesn't really matter here because you always have that trade-off going from small to large chunks of data.

On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT pierre...@gmail.com wrote: Thanks for the sarcasm but with 3 small files and so, 3 Mapper instatiations, even though it's not (and never did I say it was) he only metric that matters, it seem to me lie something very interresting to check out... I have hierarchy over me and they will be happy to understand my choices with real numbers to base their understanding on. Thanks.

On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com wrote: Should be evident in the total job running time... that's the only metric that really matters :)

On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com wrote: Thank you, Any way I can measure the startup overhead in terms of time?

On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote: Pierre, Adding to what Brian has said (some things are not explicitly mentioned in the HDFS design doc)... - If you have small files that take up 64MB you do not actually use the entire 64MB block on disk. - You *do* use up RAM on the NameNode, as each block represents meta-data that needs to be maintained in-memory in the NameNode. - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is optimized for high sustained throughput per single file/block. There is a penalty for doing too many seeks to get to the beginning of each block. Additionally, you will have a MapReduce task per small file. Each MapReduce task has a non-trivial startup overhead. - The recommendation is to consolidate your small files into large files. One way to do this is via SequenceFiles... put the filename in the SequenceFile key field, and the file's bytes in the SequenceFile value field. In addition to the HDFS design docs, I recommend reading this blog post: http://www.cloudera.com/blog/2009/02/the-small-files-problem/ Happy Hadooping, - Patrick

On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote: Okay, thank you :)

On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote: Hi, thanks for this fast answer :) If so, what do you mean by blocks? If a file has to be splitted, it will be splitted when larger than 64MB? For every 64MB of the file, Hadoop will create a separate block. So, if you have a 32KB file, there will be one block of 32KB. If the file is 65MB, then it will have one block of 64MB and another block of 1MB. Splitting files is very useful for load-balancing and distributing I/O across multiple nodes. At 32KB / file, you don't really need to split the files at all.
I recommend reading the HDFS design document for background issues like this: http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html Brian On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pierre, These are not traditional filesystem blocks - if you save a file smaller than 64MB, you don't lose 64MB of file space.. Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), not 64MB. Brian On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote: Hi, I'm porting a legacy application to hadoop and it uses a bunch of small files. I'm aware that having such small files ain't a good idea but I'm not doing the technical decisions and the port has to be done for yesterday... Of course such small files are a problem, loading 64MB blocks for a few lines of text is an evident loss. What will happen if I set a smaller, or even way smaller (32kB) blocks? Thank you. Pierre ANCELOT. -- http://www.neko-consulting.com Ego sum quis
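As a purely illustrative calculation along the lines Patrick describes (the numbers are made up, not measured): if the job over 30k small files takes 50 minutes with roughly 30,000 map tasks, and the same data consolidated into large files runs in 10 minutes with about 50 map tasks, the extra 40 minutes spread over roughly 30,000 extra tasks puts the per-task startup-plus-seek overhead somewhere around 80 ms, subject to the caveat above that longer-running map tasks also shift when the shuffle can start.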
RE: Any possible to set hdfs block size to a value smaller than 64MB?
I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop Archive) work well in this situation? I thought it was designed to collect several small files together through another level of indirection, avoiding the NameNode load without decreasing the HDFS block size.
Nick Jones
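For reference, a Hadoop Archive along the lines Nick suggests is created with the command-line archive tool; the exact options vary between Hadoop versions, and the paths below are only examples:

  hadoop archive -archiveName files.har /user/pierre/small-files /user/pierre/archives
  hadoop fs -ls har:///user/pierre/archives/files.har

One caveat: a HAR cuts down the number of objects the NameNode has to track, but MapReduce still generates roughly one split per original file inside the archive, so by itself it does not remove the per-task startup overhead discussed above.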
Re: Any possible to set hdfs block size to a value smaller than 64MB?
On Tue, May 18, 2010 at 2:50 PM, Jones, Nick nick.jo...@amd.com wrote: I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop Archive) work well in this situation? I thought it was designed to collect several small files together through another level of indirection, avoiding the NameNode load without decreasing the HDFS block size.

Yes, or CombineFileInputFormat. JVM reuse also helps somewhat, so long as you're not talking about hundreds of thousands of files (in which case it starts to hurt JobTracker load, with that many tasks per job). There are a number of ways to combat the issue, but the rule of thumb is that you shouldn't try to use HDFS to store tons of small files :)
-Todd
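On the JVM reuse Todd mentions: in the 0.20-era mapred API it is controlled by the mapred.job.reuse.jvm.num.tasks property. A minimal, hypothetical example (the job class name is a placeholder):

  import org.apache.hadoop.mapred.JobConf;

  public class JvmReuseExample {
    public static JobConf configure() {
      JobConf conf = new JobConf(JvmReuseExample.class);
      // -1 lets a task JVM be reused for any number of tasks of the same job;
      // the default of 1 launches a fresh JVM per task, which is what makes
      // thousands of tiny map tasks so expensive.
      conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);
      return conf;
    }
  }

CombineFileInputFormat (org.apache.hadoop.mapred.lib.CombineFileInputFormat) is the other option Todd names: it packs several small files into a single split, but it is an abstract class that needs a custom RecordReader, so it is not sketched here.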