Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-20 Thread Nyamul Hassan
Thank you Brian and Konstantin for the two very interesting reads.  I had to
read some lines quite a few times to properly understand what they were
trying to explain.  That interview was in a class of its own.

It does seem that Google has heavily used BigTable to overcome some of the
shortcomings of GFS.  Is that the same thing Konstantin is referring to with
HBase over HDFS?

Regards
HASSAN



On Thu, May 20, 2010 at 04:11, Konstantin Shvachko s...@yahoo-inc.com wrote:

 Hi Brian,

 Interesting observations.
 This is probably in line with the client side mount table approach, [soon
 to be] proposed in HDFS-1053.
 Another way to provide a personalized view of the file system would be to use
 symbolic links, which are
 available now in 0.21/22.
 For 4K files I would probably use h-archives, especially if the data, as
 you describe it, is not changing.
 But high-energy physicists should consider using HBase over HDFS.

 --konst



 On 5/18/2010 11:57 AM, Brian Bockelman wrote:


 Hey Konstantin,

 Interesting paper :)

 One thing which I've been kicking around lately is at what scale does the
 file/directory paradigm break down?

 At some point, I think the human mind can no longer comprehend so many
 files (certainly, I can barely organize the few thousand files on my
 laptop).  Thus, file growth comes from (a) having lots of humans use a
  single file system or (b) automated programs generating the files.  For (a),
 you don't need a central global namespace, you just need the ability to have
 a local namespace per-person that can be shared among friends.  For (b), a
 program isn't going to be upset if you replace a file system with a database
 / dataset object / bucket.

 Two examples:
 - structural biology: I've seen a lot of different analysis workflows
  (such as autodock) that compare a protein against a database of ligands,
 where the database is 80,000 O(4KB) files.  Each file represents a known
 ligand that the biologist might come back and examine if it is relevant to
 the study of their protein.
  - high-energy physics: Each detector can produce millions of events a
 night, and experiments will produce many billions of events.  These are
 saved into files (each file containing hundreds or thousands of events);
 these files are kept in collections (called datasets, data blocks, lumi
 sections, or runs, depending on what you're doing).  Tasks are run against
 datasets; they will output smaller datasets which the physicists will
 iterate upon until they get some dataset which fits onto their laptop.
 Here's my Claim: The biologists have a small enough number of objects to
 manage each one as a separate file; they do this because it's easier for
 humans navigating around in a terminal.  The physicists have such a huge
 number of objects that there's no way to manage them using one file per
 object, so they utilize files only as a mechanism to serialize bytes of data
 and have higher-order data structures for management
 Here's my Question: at what point do you move from the biologist's model
 (named objects, managed independently, single files) to the physicist's
 model (anonymous objects, managed in large groups, files are only used
 because we save data on file systems)?

 Another way to look at this is to consider DNS.  DNS maintains the
 namespace of the globe, but appears to do this just fine without a single
 central catalog.  If you start with a POSIX filesystem namespace (and the
 guarantees it implies), what rules must you relax in order to arrive at DNS?
  On the scale of managing a million (billion? ten billion? trillion?) files,
 are any of the assumptions relevant?

 I don't know the answers to these questions, but I suspect they become
 important over the next 10 years.

 Brian

  PS - I started thinking along these lines during MSST when the LLNL guy
 was speculating about what it meant to fsck a file system with 1 trillion
 files.

 On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:

  You can also get some performance numbers and answers to the block size
 dilemma problem here:


 http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html

 I remember some people were using Hadoop for storing or streaming videos.
 Don't know how well that worked.
 It would be interesting to learn about your experience.

 Thanks,
 --Konstantin


 On 5/18/2010 8:41 AM, Brian Bockelman wrote:

 Hey Hassan,

 1) The overhead is pretty small, measured in a small number of
 milliseconds on average
 2) HDFS is not designed for online latency.  Even though the average
 is small, if something bad happens, your clients might experience a lot 
 of
 delays while going through the retry stack.  The initial design was for
 batch processing, and latency-sensitive applications came later.

 Additionally since the NN is a SPOF, you might want to consider your
 uptime requirements.  Each organization will have to balance these risks
 with the advantages (such as much cheaper 

Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-19 Thread Pierre ANCELOT
Okay, sorry then, I misunderstood.
I think I could as well run it on empty files; I would only get task startup
overhead.
Thank you.

On Tue, May 18, 2010 at 11:36 PM, Patrick Angeles patr...@cloudera.com wrote:

 That wasn't sarcasm. This is what you do:

 - Run your mapreduce job on 30k small files.
 - Consolidate your 30k small files into larger files.
 - Run mapreduce on the larger files.
 - Compare the running times.

 The difference in runtime is accounted for by your task startup and seek
 overhead.

 If you want to get the 'average' overhead per task, divide the total times
 for each job by the number of map tasks. This won't be a true average
 because with larger chunks of data, you will have longer running map tasks
 that will hold up the shuffle phase. But the average doesn't really matter
 here because you always have that trade-off going from small to large
 chunks
 of data.


 On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT pierre...@gmail.com
 wrote:

  Thanks for the sarcasm, but with 30k small files and so 30k Mapper
  instantiations, even though it's not the only metric that matters (and I
  never said it was), it seems to me like something very interesting to
  check out...
  I have a hierarchy above me, and they will be happy to have real numbers on
  which to base their understanding of my choices.
  Thanks.
 
 
  On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com
  wrote:
 
   Should be evident in the total job running time... that's the only
 metric
   that really matters :)
  
   On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
   wrote:
  
Thank you,
Any way I can measure the startup overhead in terms of time?
   
   
On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles 
 patr...@cloudera.com
wrote:
   
 Pierre,

 Adding to what Brian has said (some things are not explicitly
  mentioned
in
 the HDFS design doc)...

 - If you have small files that take up  64MB you do not actually
 use
   the
 entire 64MB block on disk.
 - You *do* use up RAM on the NameNode, as each block represents
   meta-data
 that needs to be maintained in-memory in the NameNode.
 - Hadoop won't perform optimally with very small block sizes.
 Hadoop
   I/O
is
 optimized for high sustained throughput per single file/block.
 There
  is
   a
 penalty for doing too many seeks to get to the beginning of each
  block.
 Additionally, you will have a MapReduce task per small file. Each
MapReduce
 task has a non-trivial startup overhead.
 - The recommendation is to consolidate your small files into large
   files.
 One way to do this is via SequenceFiles... put the filename in the
 SequenceFile key field, and the file's bytes in the SequenceFile
  value
 field.

 In addition to the HDFS design docs, I recommend reading this blog
   post:
 http://www.cloudera.com/blog/2009/02/the-small-files-problem/

 Happy Hadooping,

 - Patrick

 On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT 
 pierre...@gmail.com
  
 wrote:

  Okay, thank you :)
 
 
  On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
   bbock...@cse.unl.edu
  wrote:
 
  
   On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
  
Hi, thanks for this fast answer :)
If so, what do you mean by blocks? If a file has to be
  splitted,
   it
  will
   be
splitted when larger than 64MB?
   
  
   For every 64MB of the file, Hadoop will create a separate
 block.
So,
 if
   you have a 32KB file, there will be one block of 32KB.  If the
  file
is
  65MB,
   then it will have one block of 64MB and another block of 1MB.
  
   Splitting files is very useful for load-balancing and
  distributing
I/O
   across multiple nodes.  At 32KB / file, you don't really need
 to
split
  the
   files at all.
  
   I recommend reading the HDFS design document for background
  issues
like
   this:
  
   http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
  
   Brian
  
   
   
   
On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
 bbock...@cse.unl.edu
   wrote:
   
Hey Pierre,
   
These are not traditional filesystem blocks - if you save a
  file
  smaller
than 64MB, you don't lose 64MB of file space..
   
Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
metadata
  or
so), not 64MB.
   
Brian
   
On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
   
Hi,
I'm porting a legacy application to hadoop and it uses a
  bunch
   of
  small
files.
I'm aware that having such small files ain't a good idea
 but
   I'm
 not
doing
the technical decisions and the port has to be done 

Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-19 Thread Konstantin Shvachko

Hi Brian,

Interesting observations.
This is probably in line with the client side mount table approach, [soon to 
be] proposed in HDFS-1053.
Another way to provide a personalized view of the file system would be to use
symbolic links, which are available now in 0.21/22.
For 4K files I would probably use h-archives, especially if the data, as you
describe it, is not changing.
But high-energy physicists should consider using HBase over HDFS.
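
As a rough sketch of the symlink option (0.21+ FileContext API; untested, and
the paths below are made up purely for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

public class SymlinkView {
  public static void main(String[] args) throws Exception {
    // Build a per-user "view" directory whose entries link back to shared
    // data, giving a personalized namespace without a central mount table.
    FileContext fc = FileContext.getFileContext(new Configuration());
    fc.mkdir(new Path("/user/brian/view"), FileContext.DEFAULT_PERM, true);
    fc.createSymlink(new Path("/data/shared/ligands"),     // existing target
                     new Path("/user/brian/view/ligands"), // new link
                     false);                               // don't create parent dirs
  }
}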

--konst


On 5/18/2010 11:57 AM, Brian Bockelman wrote:


Hey Konstantin,

Interesting paper :)

One thing which I've been kicking around lately is at what scale does the 
file/directory paradigm break down?

At some point, I think the human mind can no longer comprehend so many files (certainly, 
I can barely organize the few thousand files on my laptop).  Thus, file growth comes from 
(a) having lots of humans use a single file system or (b) automated programs generating the 
files.  For (a), you don't need a central global namespace, you just need the ability to 
have a local namespace per-person that can be shared among friends.  For (b), 
a program isn't going to be upset if you replace a file system with a database / dataset 
object / bucket.

Two examples:
- structural biology: I've seen a lot of different analysis workflows (such as autodock) 
that compare a protein against a database of ligands, where the database is 
80,000 O(4KB) files.  Each file represents a known ligand that the biologist might come 
back and examine if it is relevant to the study of their protein.
- high-energy physics: Each detector can produce millions of events a night, 
and experiments will produce many billions of events.  These are saved into 
files (each file containing hundreds or thousands of events); these files are 
kept in collections (called datasets, data blocks, lumi sections, or runs, 
depending on what you're doing).  Tasks are run against datasets; they will 
output smaller datasets which the physicists will iterate upon until they get 
some dataset which fits onto their laptop.
Here's my Claim: The biologists have a small enough number of objects to manage 
each one as a separate file; they do this because it's easier for humans 
navigating around in a terminal.  The physicists have such a huge number of 
objects that there's no way to manage them using one file per object, so they 
utilize files only as a mechanism to serialize bytes of data and have 
higher-order data structures for management
Here's my Question: at what point do you move from the biologist's model (named 
objects, managed independently, single files) to the physicist's model 
(anonymous objects, managed in large groups, files are only used because we 
save data on file systems)?

Another way to look at this is to consider DNS.  DNS maintains the namespace of 
the globe, but appears to do this just fine without a single central catalog.  
If you start with a POSIX filesystem namespace (and the guarantees it implies), 
what rules must you relax in order to arrive at DNS?  On the scale of managing 
a million (billion? ten billion? trillion?) files, are any of the assumptions 
relevant?

I don't know the answers to these questions, but I suspect they become 
important over the next 10 years.

Brian

PS - I started thinking along these lines during MSST when the LLNL guy was speculating 
about what it meant to fsck a file system with 1 trillion files.

On May 18, 2010, at 12:56 PM, Konstantin Shvachko wrote:


You can also get some performance numbers and answers to the block size dilemma 
problem here:

http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html

I remember some people were using Hadoop for storing or streaming videos.
Don't know how well that worked.
It would be interesting to learn about your experience.

Thanks,
--Konstantin


On 5/18/2010 8:41 AM, Brian Bockelman wrote:

Hey Hassan,

1) The overhead is pretty small, measured in a small number of milliseconds on 
average
2) HDFS is not designed for online latency.  Even though the average is small, if 
something bad happens, your clients might experience a lot of delays while going 
through the retry stack.  The initial design was for batch processing, and latency-sensitive 
applications came later.

Additionally since the NN is a SPOF, you might want to consider your uptime 
requirements.  Each organization will have to balance these risks with the 
advantages (such as much cheaper hardware).

There's a nice interview with the GFS authors here where they touch upon the 
latency issues:

http://queue.acm.org/detail.cfm?id=1594206

As GFS and HDFS share many design features, the theoretical parts of their 
discussion might be useful for you.

As far as overall throughput of the system goes, it depends heavily upon your 
implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.

Brian

On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:


This is a very interesting thread to us, as we are thinking 

Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Pierre ANCELOT
Hi,
I'm porting a legacy application to hadoop and it uses a bunch of small
files.
I'm aware that having such small files ain't a good idea, but I'm not making
the technical decisions and the port has to be done for yesterday...
Of course such small files are a problem; loading 64MB blocks for a few
lines of text is an evident waste.
What will happen if I set a smaller, or even way smaller (32kB), block size?

Thank you.

Pierre ANCELOT.


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Brian Bockelman
Hey Pierre,

These are not traditional filesystem blocks - if you save a file smaller than 
64MB, you don't lose 64MB of file space..

Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so), 
not 64MB.

Brian

On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:

 Hi,
 I'm porting a legacy application to hadoop and it uses a bunch of small
 files.
 I'm aware that having such small files ain't a good idea but I'm not doing
 the technical decisions and the port has to be done for yesterday...
 Of course such small files are a problem, loading 64MB blocks for a few
 lines of text is an evident loss.
 What will happen if I set a smaller, or even way smaller (32kB) blocks?
 
 Thank you.
 
 Pierre ANCELOT.





Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Pierre ANCELOT
Hi, thanks for this fast answer :)
If so, what do you mean by blocks? If a file has to be splitted, it will be
splitted when larger than 64MB?




On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote:

 Hey Pierre,

 These are not traditional filesystem blocks - if you save a file smaller
 than 64MB, you don't lose 64MB of file space..

 Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
 so), not 64MB.

 Brian

 On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:

  Hi,
  I'm porting a legacy application to hadoop and it uses a bunch of small
  files.
  I'm aware that having such small files ain't a good idea but I'm not
 doing
  the technical decisions and the port has to be done for yesterday...
  Of course such small files are a problem, loading 64MB blocks for a few
  lines of text is an evident loss.
  What will happen if I set a smaller, or even way smaller (32kB) blocks?
 
  Thank you.
 
  Pierre ANCELOT.




-- 
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Pierre ANCELOT
... and by slices of 64MB then I mean...
?

On Tue, May 18, 2010 at 2:38 PM, Pierre ANCELOT pierre...@gmail.com wrote:

 Hi, thanks for this fast answer :)
 If so, what do you mean by blocks? If a file has to be splitted, it will be
 splitted when larger than 64MB?





 On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote:

 Hey Pierre,

 These are not traditional filesystem blocks - if you save a file smaller
 than 64MB, you don't lose 64MB of file space..

 Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
 so), not 64MB.

 Brian

 On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:

  Hi,
  I'm porting a legacy application to hadoop and it uses a bunch of small
  files.
  I'm aware that having such small files ain't a good idea but I'm not
 doing
  the technical decisions and the port has to be done for yesterday...
  Of course such small files are a problem, loading 64MB blocks for a few
  lines of text is an evident loss.
  What will happen if I set a smaller, or even way smaller (32kB) blocks?
 
  Thank you.
 
  Pierre ANCELOT.




 --
 http://www.neko-consulting.com
 Ego sum quis ego servo
 Je suis ce que je protège
 I am what I protect




-- 
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Brian Bockelman

On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:

 Hi, thanks for this fast answer :)
 If so, what do you mean by blocks? If a file has to be splitted, it will be
 splitted when larger than 64MB?
 

For every 64MB of the file, Hadoop will create a separate block.  So, if you 
have a 32KB file, there will be one block of 32KB.  If the file is 65MB, then 
it will have one block of 64MB and another block of 1MB.

Splitting files is very useful for load-balancing and distributing I/O across 
multiple nodes.  At 32KB / file, you don't really need to split the files at 
all.
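
To make that concrete, here is a rough, untested sketch (0.20-era API; the
paths and sizes are only illustrative) showing that the block size is simply a
per-file parameter chosen at write time, not a fixed on-disk allocation:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Client-side default block size for files this client creates
    // ("dfs.block.size" is the 0.20 property name).
    conf.setLong("dfs.block.size", 64L * 1024 * 1024);
    FileSystem fs = FileSystem.get(conf);

    // The block size can also be chosen per file at create() time; either way,
    // a 32KB file only occupies ~32KB (plus metadata) on the datanodes.
    FSDataOutputStream out = fs.create(new Path("/tmp/tiny.txt"),
        true,               // overwrite
        4096,               // io buffer size
        (short) 3,          // replication
        1L * 1024 * 1024);  // 1MB block size for this file only
    out.writeBytes("a few lines of text\n");
    out.close();
  }
}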

I recommend reading the HDFS design document for background issues like this:

http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html

Brian

 
 
 
 On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote:
 
 Hey Pierre,
 
 These are not traditional filesystem blocks - if you save a file smaller
 than 64MB, you don't lose 64MB of file space..
 
 Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or
 so), not 64MB.
 
 Brian
 
 On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
 
 Hi,
 I'm porting a legacy application to hadoop and it uses a bunch of small
 files.
 I'm aware that having such small files ain't a good idea but I'm not
 doing
 the technical decisions and the port has to be done for yesterday...
 Of course such small files are a problem, loading 64MB blocks for a few
 lines of text is an evident loss.
 What will happen if I set a smaller, or even way smaller (32kB) blocks?
 
 Thank you.
 
 Pierre ANCELOT.
 
 
 
 
 -- 
 http://www.neko-consulting.com
 Ego sum quis ego servo
 Je suis ce que je protège
 I am what I protect





Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Patrick Angeles
Pierre,

Adding to what Brian has said (some things are not explicitly mentioned in
the HDFS design doc)...

- If you have small files that take up less than 64MB, you do not actually use the
entire 64MB block on disk.
- You *do* use up RAM on the NameNode, as each block represents meta-data
that needs to be maintained in-memory in the NameNode.
- Hadoop won't perform optimally with very small block sizes. Hadoop I/O is
optimized for high sustained throughput per single file/block. There is a
penalty for doing too many seeks to get to the beginning of each block.
Additionally, you will have a MapReduce task per small file. Each MapReduce
task has a non-trivial startup overhead.
- The recommendation is to consolidate your small files into large files.
One way to do this is via SequenceFiles... put the filename in the
SequenceFile key field, and the file's bytes in the SequenceFile value
field.
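
A rough sketch of that SequenceFile packing step (old 0.20 API, untested; the
local and HDFS paths are made up for illustration):

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/user/pierre/packed.seq"),
        Text.class, BytesWritable.class);
    try {
      for (File f : new File("/local/small-files").listFiles()) {
        byte[] bytes = new byte[(int) f.length()];
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        in.readFully(bytes);
        in.close();
        // Filename as the key, raw file contents as the value.
        writer.append(new Text(f.getName()), new BytesWritable(bytes));
      }
    } finally {
      writer.close();
    }
  }
}

A job can then read the packed file with SequenceFileInputFormat, so each map
task processes many records instead of one tiny file.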

In addition to the HDFS design docs, I recommend reading this blog post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/

Happy Hadooping,

- Patrick

On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com wrote:

 Okay, thank you :)


 On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu
 wrote:

 
  On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
 
   Hi, thanks for this fast answer :)
   If so, what do you mean by blocks? If a file has to be splitted, it
 will
  be
   splitted when larger than 64MB?
  
 
  For every 64MB of the file, Hadoop will create a separate block.  So, if
  you have a 32KB file, there will be one block of 32KB.  If the file is
 65MB,
  then it will have one block of 64MB and another block of 1MB.
 
  Splitting files is very useful for load-balancing and distributing I/O
  across multiple nodes.  At 32KB / file, you don't really need to split
 the
  files at all.
 
  I recommend reading the HDFS design document for background issues like
  this:
 
  http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
 
  Brian
 
  
  
  
   On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu
  wrote:
  
   Hey Pierre,
  
   These are not traditional filesystem blocks - if you save a file
 smaller
   than 64MB, you don't lose 64MB of file space..
  
   Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata
 or
   so), not 64MB.
  
   Brian
  
   On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
  
   Hi,
   I'm porting a legacy application to hadoop and it uses a bunch of
 small
   files.
   I'm aware that having such small files ain't a good idea but I'm not
   doing
   the technical decisions and the port has to be done for yesterday...
   Of course such small files are a problem, loading 64MB blocks for a
 few
   lines of text is an evident loss.
   What will happen if I set a smaller, or even way smaller (32kB)
 blocks?
  
   Thank you.
  
   Pierre ANCELOT.
  
  
  
  
   --
   http://www.neko-consulting.com
   Ego sum quis ego servo
   Je suis ce que je protège
   I am what I protect
 
 


 --
 http://www.neko-consulting.com
 Ego sum quis ego servo
 Je suis ce que je protège
 I am what I protect



Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Pierre ANCELOT
Thank you,
Any way I can measure the startup overhead in terms of time?


On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com wrote:

 Pierre,

 Adding to what Brian has said (some things are not explicitly mentioned in
 the HDFS design doc)...

 - If you have small files that take up less than 64MB, you do not actually use the
 entire 64MB block on disk.
 - You *do* use up RAM on the NameNode, as each block represents meta-data
 that needs to be maintained in-memory in the NameNode.
 - Hadoop won't perform optimally with very small block sizes. Hadoop I/O is
 optimized for high sustained throughput per single file/block. There is a
 penalty for doing too many seeks to get to the beginning of each block.
 Additionally, you will have a MapReduce task per small file. Each MapReduce
 task has a non-trivial startup overhead.
 - The recommendation is to consolidate your small files into large files.
 One way to do this is via SequenceFiles... put the filename in the
 SequenceFile key field, and the file's bytes in the SequenceFile value
 field.

 In addition to the HDFS design docs, I recommend reading this blog post:
 http://www.cloudera.com/blog/2009/02/the-small-files-problem/

 Happy Hadooping,

 - Patrick

 On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
 wrote:

  Okay, thank you :)
 
 
  On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu
  wrote:
 
  
   On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
  
Hi, thanks for this fast answer :)
If so, what do you mean by blocks? If a file has to be splitted, it
  will
   be
splitted when larger than 64MB?
   
  
   For every 64MB of the file, Hadoop will create a separate block.  So,
 if
   you have a 32KB file, there will be one block of 32KB.  If the file is
  65MB,
   then it will have one block of 64MB and another block of 1MB.
  
   Splitting files is very useful for load-balancing and distributing I/O
   across multiple nodes.  At 32KB / file, you don't really need to split
  the
   files at all.
  
   I recommend reading the HDFS design document for background issues like
   this:
  
   http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
  
   Brian
  
   
   
   
On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
 bbock...@cse.unl.edu
   wrote:
   
Hey Pierre,
   
These are not traditional filesystem blocks - if you save a file
  smaller
than 64MB, you don't lose 64MB of file space..
   
Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata
  or
so), not 64MB.
   
Brian
   
On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
   
Hi,
I'm porting a legacy application to hadoop and it uses a bunch of
  small
files.
I'm aware that having such small files ain't a good idea but I'm
 not
doing
the technical decisions and the port has to be done for
 yesterday...
Of course such small files are a problem, loading 64MB blocks for a
  few
lines of text is an evident loss.
What will happen if I set a smaller, or even way smaller (32kB)
  blocks?
   
Thank you.
   
Pierre ANCELOT.
   
   
   
   
--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect
  
  
 
 
  --
  http://www.neko-consulting.com
  Ego sum quis ego servo
  Je suis ce que je protège
  I am what I protect
 




-- 
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Patrick Angeles
Should be evident in the total job running time... that's the only metric
that really matters :)

On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com wrote:

 Thank you,
 Any way I can measure the startup overhead in terms of time?


 On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
 wrote:

  Pierre,
 
  Adding to what Brian has said (some things are not explicitly mentioned
 in
  the HDFS design doc)...
 
  - If you have small files that take up less than 64MB, you do not actually use the
  entire 64MB block on disk.
  - You *do* use up RAM on the NameNode, as each block represents meta-data
  that needs to be maintained in-memory in the NameNode.
  - Hadoop won't perform optimally with very small block sizes. Hadoop I/O
 is
  optimized for high sustained throughput per single file/block. There is a
  penalty for doing too many seeks to get to the beginning of each block.
  Additionally, you will have a MapReduce task per small file. Each
 MapReduce
  task has a non-trivial startup overhead.
  - The recommendation is to consolidate your small files into large files.
  One way to do this is via SequenceFiles... put the filename in the
  SequenceFile key field, and the file's bytes in the SequenceFile value
  field.
 
  In addition to the HDFS design docs, I recommend reading this blog post:
  http://www.cloudera.com/blog/2009/02/the-small-files-problem/
 
  Happy Hadooping,
 
  - Patrick
 
  On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
  wrote:
 
   Okay, thank you :)
  
  
   On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman bbock...@cse.unl.edu
   wrote:
  
   
On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
   
 Hi, thanks for this fast answer :)
 If so, what do you mean by blocks? If a file has to be splitted, it
   will
be
 splitted when larger than 64MB?

   
For every 64MB of the file, Hadoop will create a separate block.  So,
  if
you have a 32KB file, there will be one block of 32KB.  If the file
 is
   65MB,
then it will have one block of 64MB and another block of 1MB.
   
Splitting files is very useful for load-balancing and distributing
 I/O
across multiple nodes.  At 32KB / file, you don't really need to
 split
   the
files at all.
   
I recommend reading the HDFS design document for background issues
 like
this:
   
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
   
Brian
   



 On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
  bbock...@cse.unl.edu
wrote:

 Hey Pierre,

 These are not traditional filesystem blocks - if you save a file
   smaller
 than 64MB, you don't lose 64MB of file space..

 Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
 metadata
   or
 so), not 64MB.

 Brian

 On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:

 Hi,
 I'm porting a legacy application to hadoop and it uses a bunch of
   small
 files.
 I'm aware that having such small files ain't a good idea but I'm
  not
 doing
 the technical decisions and the port has to be done for
  yesterday...
 Of course such small files are a problem, loading 64MB blocks for
 a
   few
 lines of text is an evident loss.
 What will happen if I set a smaller, or even way smaller (32kB)
   blocks?

 Thank you.

 Pierre ANCELOT.




 --
 http://www.neko-consulting.com
 Ego sum quis ego servo
 Je suis ce que je protège
 I am what I protect
   
   
  
  
   --
   http://www.neko-consulting.com
   Ego sum quis ego servo
   Je suis ce que je protège
   I am what I protect
  
 



 --
 http://www.neko-consulting.com
 Ego sum quis ego servo
 Je suis ce que je protège
 I am what I protect



Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread He Chen
If you know how to use AspectJ to do aspect-oriented programming, you can
write an aspect class and let it monitor the whole MapReduce process.
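
A rough sketch of what such an aspect could look like (annotation-style
AspectJ; the pointcut target and the weaving setup, e.g. an aspectjweaver
javaagent on the task JVM, are illustrative assumptions rather than a recipe I
have actually run against a cluster):

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class MapTaskTimer {
  // Time every execution of MapTask.run() and log the elapsed wall clock.
  @Around("execution(* org.apache.hadoop.mapred.MapTask.run(..))")
  public Object time(ProceedingJoinPoint pjp) throws Throwable {
    long start = System.currentTimeMillis();
    try {
      return pjp.proceed();   // run the actual map task
    } finally {
      System.err.println("MapTask.run took "
          + (System.currentTimeMillis() - start) + " ms");
    }
  }
}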

On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles patr...@cloudera.com wrote:

 Should be evident in the total job running time... that's the only metric
 that really matters :)

 On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
 wrote:

  Thank you,
  Any way I can measure the startup overhead in terms of time?
 
 
  On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
  wrote:
 
   Pierre,
  
   Adding to what Brian has said (some things are not explicitly mentioned
  in
   the HDFS design doc)...
  
   - If you have small files that take up  64MB you do not actually use
 the
   entire 64MB block on disk.
   - You *do* use up RAM on the NameNode, as each block represents
 meta-data
   that needs to be maintained in-memory in the NameNode.
   - Hadoop won't perform optimally with very small block sizes. Hadoop
 I/O
  is
   optimized for high sustained throughput per single file/block. There is
 a
   penalty for doing too many seeks to get to the beginning of each block.
   Additionally, you will have a MapReduce task per small file. Each
  MapReduce
   task has a non-trivial startup overhead.
   - The recommendation is to consolidate your small files into large
 files.
   One way to do this is via SequenceFiles... put the filename in the
   SequenceFile key field, and the file's bytes in the SequenceFile value
   field.
  
   In addition to the HDFS design docs, I recommend reading this blog
 post:
   http://www.cloudera.com/blog/2009/02/the-small-files-problem/
  
   Happy Hadooping,
  
   - Patrick
  
   On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
   wrote:
  
Okay, thank you :)
   
   
On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
 bbock...@cse.unl.edu
wrote:
   

 On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:

  Hi, thanks for this fast answer :)
  If so, what do you mean by blocks? If a file has to be splitted,
 it
will
 be
  splitted when larger than 64MB?
 

 For every 64MB of the file, Hadoop will create a separate block.
  So,
   if
 you have a 32KB file, there will be one block of 32KB.  If the file
  is
65MB,
 then it will have one block of 64MB and another block of 1MB.

 Splitting files is very useful for load-balancing and distributing
  I/O
 across multiple nodes.  At 32KB / file, you don't really need to
  split
the
 files at all.

 I recommend reading the HDFS design document for background issues
  like
 this:

 http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html

 Brian

 
 
 
  On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
   bbock...@cse.unl.edu
 wrote:
 
  Hey Pierre,
 
  These are not traditional filesystem blocks - if you save a file
smaller
  than 64MB, you don't lose 64MB of file space..
 
  Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
  metadata
or
  so), not 64MB.
 
  Brian
 
  On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
 
  Hi,
  I'm porting a legacy application to hadoop and it uses a bunch
 of
small
  files.
  I'm aware that having such small files ain't a good idea but
 I'm
   not
  doing
  the technical decisions and the port has to be done for
   yesterday...
  Of course such small files are a problem, loading 64MB blocks
 for
  a
few
  lines of text is an evident loss.
  What will happen if I set a smaller, or even way smaller (32kB)
blocks?
 
  Thank you.
 
  Pierre ANCELOT.
 
 
 
 
  --
  http://www.neko-consulting.com
  Ego sum quis ego servo
  Je suis ce que je protège
  I am what I protect


   
   
--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect
   
  
 
 
 
  --
  http://www.neko-consulting.com
  Ego sum quis ego servo
  Je suis ce que je protège
  I am what I protect
 




-- 
Best Wishes!
顺送商祺!

--
Chen He
(402)613-9298
PhD. student of CSE Dept.
Holland Computing Center
University of Nebraska-Lincoln
Lincoln NE 68588


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Brian Bockelman
Hey Hassan,

1) The overhead is pretty small, measured in a small number of milliseconds on 
average
2) HDFS is not designed for online latency.  Even though the average is 
small, if something bad happens, your clients might experience a lot of 
delays while going through the retry stack.  The initial design was for batch 
processing, and latency-sensitive applications came later.

Additionally since the NN is a SPOF, you might want to consider your uptime 
requirements.  Each organization will have to balance these risks with the 
advantages (such as much cheaper hardware).

There's a nice interview with the GFS authors here where they touch upon the 
latency issues:

http://queue.acm.org/detail.cfm?id=1594206

As GFS and HDFS share many design features, the theoretical parts of their 
discussion might be useful for you.

As far as overall throughput of the system goes, it depends heavily upon your 
implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.

Brian

On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:

 This is a very interesting thread to us, as we are thinking about deploying
 HDFS as massive online storage for an online university, and then
 serving the video files to students who want to view them.
 
 We cannot control the size of the videos (and some class work files), as
 they will mostly be uploaded by the teachers providing the classes.
 
 How would the overall throughput of HDFS be affected in such a solution?
 Would HDFS be feasible at all for such a setup?
 
 Regards
 HASSAN
 
 
 
 On Tue, May 18, 2010 at 21:11, He Chen airb...@gmail.com wrote:
 
 If you know how to use AspectJ to do aspect oriented programming. You can
 write a aspect class. Let it just monitors the whole process of MapReduce
 
 On Tue, May 18, 2010 at 10:00 AM, Patrick Angeles patr...@cloudera.com
 wrote:
 
 Should be evident in the total job running time... that's the only metric
 that really matters :)
 
 On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
 wrote:
 
 Thank you,
 Any way I can measure the startup overhead in terms of time?
 
 
 On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
 wrote:
 
 Pierre,
 
 Adding to what Brian has said (some things are not explicitly
 mentioned
 in
 the HDFS design doc)...
 
 - If you have small files that take up  64MB you do not actually use
 the
 entire 64MB block on disk.
 - You *do* use up RAM on the NameNode, as each block represents
 meta-data
 that needs to be maintained in-memory in the NameNode.
 - Hadoop won't perform optimally with very small block sizes. Hadoop
 I/O
 is
 optimized for high sustained throughput per single file/block. There
 is
 a
 penalty for doing too many seeks to get to the beginning of each
 block.
 Additionally, you will have a MapReduce task per small file. Each
 MapReduce
 task has a non-trivial startup overhead.
 - The recommendation is to consolidate your small files into large
 files.
 One way to do this is via SequenceFiles... put the filename in the
 SequenceFile key field, and the file's bytes in the SequenceFile
 value
 field.
 
 In addition to the HDFS design docs, I recommend reading this blog
 post:
 http://www.cloudera.com/blog/2009/02/the-small-files-problem/
 
 Happy Hadooping,
 
 - Patrick
 
 On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
 
 wrote:
 
 Okay, thank you :)
 
 
 On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
 bbock...@cse.unl.edu
 wrote:
 
 
 On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
 
 Hi, thanks for this fast answer :)
 If so, what do you mean by blocks? If a file has to be
 splitted,
 it
 will
 be
 splitted when larger than 64MB?
 
 
 For every 64MB of the file, Hadoop will create a separate block.
 So,
 if
 you have a 32KB file, there will be one block of 32KB.  If the
 file
 is
 65MB,
 then it will have one block of 64MB and another block of 1MB.
 
 Splitting files is very useful for load-balancing and
 distributing
 I/O
 across multiple nodes.  At 32KB / file, you don't really need to
 split
 the
 files at all.
 
 I recommend reading the HDFS design document for background
 issues
 like
 this:
 
 http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
 
 Brian
 
 
 
 
 On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
 bbock...@cse.unl.edu
 wrote:
 
 Hey Pierre,
 
 These are not traditional filesystem blocks - if you save a
 file
 smaller
 than 64MB, you don't lose 64MB of file space..
 
 Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
 metadata
 or
 so), not 64MB.
 
 Brian
 
 On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
 
 Hi,
 I'm porting a legacy application to hadoop and it uses a
 bunch
 of
 small
 files.
 I'm aware that having such small files ain't a good idea but
 I'm
 not
 doing
 the technical decisions and the port has to be done for
 yesterday...
 Of course such small files are a problem, loading 64MB blocks
 for
 a
 few
 lines of text is an evident loss.
 What will happen if I set a 

Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Konstantin Boudnik
I once ran an experiment with a block size of 10 bytes (sic!). It was _very_ slow
on the NN side: writing 5 MB took 25 minutes or so :( No fun, to
say the least...

On Tue, May 18, 2010 at 10:56AM, Konstantin Shvachko wrote:
 You can also get some performance numbers and answers to the block size 
 dilemma problem here:
 
 http://developer.yahoo.net/blogs/hadoop/2010/05/scalability_of_the_hadoop_dist.html
 
 I remember some people were using Hadoop for storing or streaming videos.
 Don't know how well that worked.
 It would be interesting to learn about your experience.
 
 Thanks,
 --Konstantin
 
 
 On 5/18/2010 8:41 AM, Brian Bockelman wrote:
  Hey Hassan,
 
  1) The overhead is pretty small, measured in a small number of milliseconds 
  on average
  2) HDFS is not designed for online latency.  Even though the average is 
  small, if something bad happens, your clients might experience a lot of 
  delays while going through the retry stack.  The initial design was for 
  batch processing, and latency-sensitive applications came later.
 
  Additionally since the NN is a SPOF, you might want to consider your uptime 
  requirements.  Each organization will have to balance these risks with the 
  advantages (such as much cheaper hardware).
 
  There's a nice interview with the GFS authors here where they touch upon 
  the latency issues:
 
  http://queue.acm.org/detail.cfm?id=1594206
 
  As GFS and HDFS share many design features, the theoretical parts of their 
  discussion might be useful for you.
 
  As far as overall throughput of the system goes, it depends heavily upon 
  your implementation and hardware.  Our HDFS routinely serves 5-10 Gbps.
 
  Brian
 
  On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
 
  This is a very interesting thread to us, as we are thinking about deploying
  HDFS as a massive online storage for a on online university, and then
  serving the video files to students who want to view them.
 
  We cannot control the size of the videos (and some class work files), as
  they will mostly be uploaded by the teachers providing the classes.
 
  How would the overall through put of HDFS be affected in such a solution?
  Would HDFS be feasible at all for such a setup?
 
  Regards
  HASSAN
 
 
 
  On Tue, May 18, 2010 at 21:11, He Chenairb...@gmail.com  wrote:
 
  If you know how to use AspectJ to do aspect oriented programming. You can
  write a aspect class. Let it just monitors the whole process of MapReduce
 
  On Tue, May 18, 2010 at 10:00 AM, Patrick Angelespatr...@cloudera.com
  wrote:
 
  Should be evident in the total job running time... that's the only metric
  that really matters :)
 
  On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOTpierre...@gmail.com
  wrote:
 
  Thank you,
  Any way I can measure the startup overhead in terms of time?
 
 
  On Tue, May 18, 2010 at 4:27 PM, Patrick Angelespatr...@cloudera.com
  wrote:
 
  Pierre,
 
  Adding to what Brian has said (some things are not explicitly
  mentioned
  in
  the HDFS design doc)...
 
  - If you have small files that take up  64MB you do not actually use
  the
  entire 64MB block on disk.
  - You *do* use up RAM on the NameNode, as each block represents
  meta-data
  that needs to be maintained in-memory in the NameNode.
  - Hadoop won't perform optimally with very small block sizes. Hadoop
  I/O
  is
  optimized for high sustained throughput per single file/block. There
  is
  a
  penalty for doing too many seeks to get to the beginning of each
  block.
  Additionally, you will have a MapReduce task per small file. Each
  MapReduce
  task has a non-trivial startup overhead.
  - The recommendation is to consolidate your small files into large
  files.
  One way to do this is via SequenceFiles... put the filename in the
  SequenceFile key field, and the file's bytes in the SequenceFile
  value
  field.
 
  In addition to the HDFS design docs, I recommend reading this blog
  post:
  http://www.cloudera.com/blog/2009/02/the-small-files-problem/
 
  Happy Hadooping,
 
  - Patrick
 
  On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOTpierre...@gmail.com
 
  wrote:
 
  Okay, thank you :)
 
 
  On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman
  bbock...@cse.unl.edu
  wrote:
 
 
  On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
 
  Hi, thanks for this fast answer :)
  If so, what do you mean by blocks? If a file has to be
  splitted,
  it
  will
  be
  splitted when larger than 64MB?
 
 
  For every 64MB of the file, Hadoop will create a separate block.
  So,
  if
  you have a 32KB file, there will be one block of 32KB.  If the
  file
  is
  65MB,
  then it will have one block of 64MB and another block of 1MB.
 
  Splitting files is very useful for load-balancing and
  distributing
  I/O
  across multiple nodes.  At 32KB / file, you don't really need to
  split
  the
  files at all.
 
  I recommend reading the HDFS design document for background
  issues
  like
  this:
 
  

Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Pierre ANCELOT
Thanks for the sarcasm, but with 30k small files and so 30k Mapper
instantiations, even though it's not the only metric that matters (and I
never said it was), it seems to me like something very interesting to check
out...
I have a hierarchy above me, and they will be happy to have real numbers on
which to base their understanding of my choices.
Thanks.


On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com wrote:

 Should be evident in the total job running time... that's the only metric
 that really matters :)

 On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
 wrote:

  Thank you,
  Any way I can measure the startup overhead in terms of time?
 
 
  On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
  wrote:
 
   Pierre,
  
   Adding to what Brian has said (some things are not explicitly mentioned
  in
   the HDFS design doc)...
  
   - If you have small files that take up less than 64MB, you do not actually use
 the
   entire 64MB block on disk.
   - You *do* use up RAM on the NameNode, as each block represents
 meta-data
   that needs to be maintained in-memory in the NameNode.
   - Hadoop won't perform optimally with very small block sizes. Hadoop
 I/O
  is
   optimized for high sustained throughput per single file/block. There is
 a
   penalty for doing too many seeks to get to the beginning of each block.
   Additionally, you will have a MapReduce task per small file. Each
  MapReduce
   task has a non-trivial startup overhead.
   - The recommendation is to consolidate your small files into large
 files.
   One way to do this is via SequenceFiles... put the filename in the
   SequenceFile key field, and the file's bytes in the SequenceFile value
   field.
  
   In addition to the HDFS design docs, I recommend reading this blog
 post:
   http://www.cloudera.com/blog/2009/02/the-small-files-problem/
  
   Happy Hadooping,
  
   - Patrick
  
   On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
   wrote:
  
Okay, thank you :)
   
   
On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
 bbock...@cse.unl.edu
wrote:
   

 On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:

  Hi, thanks for this fast answer :)
  If so, what do you mean by blocks? If a file has to be splitted,
 it
will
 be
  splitted when larger than 64MB?
 

 For every 64MB of the file, Hadoop will create a separate block.
  So,
   if
 you have a 32KB file, there will be one block of 32KB.  If the file
  is
65MB,
 then it will have one block of 64MB and another block of 1MB.

 Splitting files is very useful for load-balancing and distributing
  I/O
 across multiple nodes.  At 32KB / file, you don't really need to
  split
the
 files at all.

 I recommend reading the HDFS design document for background issues
  like
 this:

 http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html

 Brian

 
 
 
  On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
   bbock...@cse.unl.edu
 wrote:
 
  Hey Pierre,
 
  These are not traditional filesystem blocks - if you save a file
smaller
  than 64MB, you don't lose 64MB of file space..
 
  Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
  metadata
or
  so), not 64MB.
 
  Brian
 
  On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
 
  Hi,
  I'm porting a legacy application to hadoop and it uses a bunch
 of
small
  files.
  I'm aware that having such small files ain't a good idea but
 I'm
   not
  doing
  the technical decisions and the port has to be done for
   yesterday...
  Of course such small files are a problem, loading 64MB blocks
 for
  a
few
  lines of text is an evident loss.
  What will happen if I set a smaller, or even way smaller (32kB)
blocks?
 
  Thank you.
 
  Pierre ANCELOT.
 
 
 
 
  --
  http://www.neko-consulting.com
  Ego sum quis ego servo
  Je suis ce que je protège
  I am what I protect


   
   
--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect
   
  
 
 
 
  --
  http://www.neko-consulting.com
  Ego sum quis ego servo
  Je suis ce que je protège
  I am what I protect
 




-- 
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect


Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Patrick Angeles
That wasn't sarcasm. This is what you do:

- Run your mapreduce job on 30k small files.
- Consolidate your 30k small files into larger files.
- Run mapreduce on the larger files.
- Compare the running times.

The difference in runtime is accounted for by your task startup and seek overhead.

If you want to get the 'average' overhead per task, divide the total times
for each job by the number of map tasks. This won't be a true average
because with larger chunks of data, you will have longer running map tasks
that will hold up the shuffle phase. But the average doesn't really matter
here because you always have that trade-off going from small to large chunks
of data.
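
As a purely hypothetical illustration of that arithmetic (all numbers invented
for the example):

  small-file job  : 30,000 map tasks, 50,000 s total  ->  ~1.7 s per task
  consolidated job:    250 map tasks,  5,000 s total  ->   ~20 s per task
  overhead        : (50,000 s - 5,000 s) / 30,000 tasks  ~=  1.5 s per small file

so roughly 1.5 s of startup-plus-seek cost would be paid once per tiny file,
which is the trade-off described above.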


On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT pierre...@gmail.com wrote:

 Thanks for the sarcasm but with 3 small files and so, 3 Mapper
 instatiations, even though it's not (and never did I say it was) he only
 metric that matters, it seem to me lie something very interresting to check
 out...
 I have hierarchy over me and they will be happy to understand my choices
 with real numbers to base their understanding on.
 Thanks.


 On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com
 wrote:

  Should be evident in the total job running time... that's the only metric
  that really matters :)
 
  On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
  wrote:
 
   Thank you,
   Any way I can measure the startup overhead in terms of time?
  
  
   On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
   wrote:
  
Pierre,
   
Adding to what Brian has said (some things are not explicitly
 mentioned
   in
the HDFS design doc)...
   
 - If you have small files that take up less than 64MB, you do not actually use
  the
entire 64MB block on disk.
- You *do* use up RAM on the NameNode, as each block represents
  meta-data
that needs to be maintained in-memory in the NameNode.
- Hadoop won't perform optimally with very small block sizes. Hadoop
  I/O
   is
optimized for high sustained throughput per single file/block. There
 is
  a
penalty for doing too many seeks to get to the beginning of each
 block.
Additionally, you will have a MapReduce task per small file. Each
   MapReduce
task has a non-trivial startup overhead.
- The recommendation is to consolidate your small files into large
  files.
One way to do this is via SequenceFiles... put the filename in the
SequenceFile key field, and the file's bytes in the SequenceFile
 value
field.
   
In addition to the HDFS design docs, I recommend reading this blog
  post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/
   
Happy Hadooping,
   
- Patrick
   
On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
 
wrote:
   
 Okay, thank you :)


 On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
  bbock...@cse.unl.edu
 wrote:

 
  On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
 
   Hi, thanks for this fast answer :)
   If so, what do you mean by blocks? If a file has to be
 splitted,
  it
 will
  be
   splitted when larger than 64MB?
  
 
  For every 64MB of the file, Hadoop will create a separate block.
   So,
if
  you have a 32KB file, there will be one block of 32KB.  If the
 file
   is
 65MB,
  then it will have one block of 64MB and another block of 1MB.
 
  Splitting files is very useful for load-balancing and
 distributing
   I/O
  across multiple nodes.  At 32KB / file, you don't really need to
   split
 the
  files at all.
 
  I recommend reading the HDFS design document for background
 issues
   like
  this:
 
  http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
 
  Brian
 
  
  
  
   On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
bbock...@cse.unl.edu
  wrote:
  
   Hey Pierre,
  
   These are not traditional filesystem blocks - if you save a
 file
 smaller
   than 64MB, you don't lose 64MB of file space..
  
   Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
   metadata
 or
   so), not 64MB.
  
   Brian
  
   On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
  
   Hi,
   I'm porting a legacy application to hadoop and it uses a
 bunch
  of
 small
   files.
   I'm aware that having such small files ain't a good idea but
  I'm
not
   doing
   the technical decisions and the port has to be done for
yesterday...
   Of course such small files are a problem, loading 64MB blocks
  for
   a
 few
   lines of text is an evident loss.
   What will happen if I set a smaller, or even way smaller
 (32kB)
 blocks?
  
   Thank you.
  
   Pierre ANCELOT.
  
  
  
  
   --
   http://www.neko-consulting.com
   Ego sum quis 

RE: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Jones, Nick
I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop
Archive) work well in this situation?  I thought it was designed to collect
several small files together through another level of indirection, avoiding the
NN load without decreasing the HDFS block size.
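
For completeness, creating one is a single command with the archive tool
(syntax as documented for the 0.20 line; worth verifying for your version, and
the paths here are made up):

hadoop archive -archiveName small.har -p /user/pierre small-files /user/pierre/archived

The result can then be read back through the har:// filesystem scheme, while
the underlying part files keep the normal block size.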

Nick Jones
-Original Message-
From: patrickange...@gmail.com [mailto:patrickange...@gmail.com] On Behalf Of 
Patrick Angeles
Sent: Tuesday, May 18, 2010 4:36 PM
To: common-user@hadoop.apache.org
Subject: Re: Any possible to set hdfs block size to a value smaller than 64MB?

That wasn't sarcasm. This is what you do:

- Run your mapreduce job on 30k small files.
- Consolidate your 30k small files into larger files.
- Run mapreduce on the larger files.
- Compare the running times.

The difference in runtime is accounted for by your task startup and seek overhead.

If you want to get the 'average' overhead per task, divide the total times
for each job by the number of map tasks. This won't be a true average
because with larger chunks of data, you will have longer running map tasks
that will hold up the shuffle phase. But the average doesn't really matter
here because you always have that trade-off going from small to large chunks
of data.


On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT pierre...@gmail.com wrote:

 Thanks for the sarcasm but with 3 small files and so, 3 Mapper
 instatiations, even though it's not (and never did I say it was) he only
 metric that matters, it seem to me lie something very interresting to check
 out...
 I have hierarchy over me and they will be happy to understand my choices
 with real numbers to base their understanding on.
 Thanks.


 On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com
 wrote:

  Should be evident in the total job running time... that's the only metric
  that really matters :)
 
  On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
  wrote:
 
   Thank you,
   Any way I can measure the startup overhead in terms of time?
  
  
   On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles patr...@cloudera.com
   wrote:
  
Pierre,
   
Adding to what Brian has said (some things are not explicitly
 mentioned
   in
the HDFS design doc)...
   
 - If you have small files that take up less than 64MB, you do not actually use
  the
entire 64MB block on disk.
- You *do* use up RAM on the NameNode, as each block represents
  meta-data
that needs to be maintained in-memory in the NameNode.
- Hadoop won't perform optimally with very small block sizes. Hadoop
  I/O
   is
optimized for high sustained throughput per single file/block. There
 is
  a
penalty for doing too many seeks to get to the beginning of each
 block.
Additionally, you will have a MapReduce task per small file. Each
   MapReduce
task has a non-trivial startup overhead.
- The recommendation is to consolidate your small files into large
  files.
One way to do this is via SequenceFiles... put the filename in the
SequenceFile key field, and the file's bytes in the SequenceFile
 value
field.
   
In addition to the HDFS design docs, I recommend reading this blog
  post:
http://www.cloudera.com/blog/2009/02/the-small-files-problem/
   
Happy Hadooping,
   
- Patrick
   
On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT pierre...@gmail.com
 
wrote:
   
 Okay, thank you :)


 On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
  bbock...@cse.unl.edu
 wrote:

 
  On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
 
   Hi, thanks for this fast answer :)
   If so, what do you mean by blocks? If a file has to be
 splitted,
  it
 will
  be
   splitted when larger than 64MB?
  
 
  For every 64MB of the file, Hadoop will create a separate block.
   So,
if
  you have a 32KB file, there will be one block of 32KB.  If the
 file
   is
 65MB,
  then it will have one block of 64MB and another block of 1MB.
 
  Splitting files is very useful for load-balancing and
 distributing
   I/O
  across multiple nodes.  At 32KB / file, you don't really need to
   split
 the
  files at all.
 
  I recommend reading the HDFS design document for background
 issues
   like
  this:
 
  http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
 
  Brian
 
  
  
  
   On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman 
bbock...@cse.unl.edu
  wrote:
  
   Hey Pierre,
  
   These are not traditional filesystem blocks - if you save a
 file
 smaller
   than 64MB, you don't lose 64MB of file space..
  
   Hadoop will use 32KB to store a 32KB file (ok, plus a KB of
   metadata
 or
   so), not 64MB.
  
   Brian
  
   On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
  
   Hi,
   I'm porting a legacy application to hadoop and it uses a
 bunch

Re: Any possible to set hdfs block size to a value smaller than 64MB?

2010-05-18 Thread Todd Lipcon
On Tue, May 18, 2010 at 2:50 PM, Jones, Nick nick.jo...@amd.com wrote:

 I'm not familiar with how to use/create them, but shouldn't a HAR (Hadoop
 Archive) work well in this situation?  I thought it was designed to collect
 several small files together through another level indirection to avoid the
 NN load and decreasing the HDFS block size.


Yes, or CombineFileInputFormat. JVM reuse also helps somewhat, so long as
you're not talking about hundreds of thousands of files (in which case it
starts to hurt JT load with that many tasks in jobs)

There are a number of ways to combat the issue, but the rule of thumb is that
you shouldn't try to use HDFS to store tons of small files :)
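
A rough sketch of the JVM-reuse knob mentioned above (old mapred API; -1 means
reuse each task JVM for an unlimited number of tasks within a job, and this is
illustrative rather than a tuning recommendation):

import org.apache.hadoop.mapred.JobConf;

public class JvmReuseConfig {
  public static JobConf enableReuse(JobConf job) {
    // Equivalent to setting mapred.job.reuse.jvm.num.tasks = -1.
    job.setNumTasksToExecutePerJvm(-1);
    return job;
  }
}

This only amortizes JVM startup, though; the per-file map task and the NameNode
metadata cost discussed earlier in the thread remain.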

-Todd

-Original Message-
 From: patrickange...@gmail.com [mailto:patrickange...@gmail.com] On Behalf
 Of Patrick Angeles
 Sent: Tuesday, May 18, 2010 4:36 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Any possible to set hdfs block size to a value smaller than
 64MB?

 That wasn't sarcasm. This is what you do:

 - Run your mapreduce job on 30k small files.
 - Consolidate your 30k small files into larger files.
 - Run mapreduce on the larger files.
 - Compare the running times.

 The difference in runtime is accounted for by your task startup and seek
 overhead.

 If you want to get the 'average' overhead per task, divide the total times
 for each job by the number of map tasks. This won't be a true average
 because with larger chunks of data, you will have longer running map tasks
 that will hold up the shuffle phase. But the average doesn't really matter
 here because you always have that trade-off going from small to large
 chunks
 of data.


 On Tue, May 18, 2010 at 7:31 PM, Pierre ANCELOT pierre...@gmail.com
 wrote:

  Thanks for the sarcasm but with 3 small files and so, 3 Mapper
  instatiations, even though it's not (and never did I say it was) he only
  metric that matters, it seem to me lie something very interresting to
 check
  out...
  I have hierarchy over me and they will be happy to understand my choices
  with real numbers to base their understanding on.
  Thanks.
 
 
  On Tue, May 18, 2010 at 5:00 PM, Patrick Angeles patr...@cloudera.com
  wrote:
 
   Should be evident in the total job running time... that's the only
 metric
   that really matters :)
  
   On Tue, May 18, 2010 at 10:39 AM, Pierre ANCELOT pierre...@gmail.com
   wrote:
  
Thank you,
Any way I can measure the startup overhead in terms of time?
   
   
On Tue, May 18, 2010 at 4:27 PM, Patrick Angeles 
 patr...@cloudera.com
wrote:
   
 Pierre,

 Adding to what Brian has said (some things are not explicitly
  mentioned
in
 the HDFS design doc)...

 - If you have small files that take up  64MB you do not actually
 use
   the
 entire 64MB block on disk.
 - You *do* use up RAM on the NameNode, as each block represents
   meta-data
 that needs to be maintained in-memory in the NameNode.
 - Hadoop won't perform optimally with very small block sizes.
 Hadoop
   I/O
is
 optimized for high sustained throughput per single file/block.
 There
  is
   a
 penalty for doing too many seeks to get to the beginning of each
  block.
 Additionally, you will have a MapReduce task per small file. Each
MapReduce
 task has a non-trivial startup overhead.
 - The recommendation is to consolidate your small files into large
   files.
 One way to do this is via SequenceFiles... put the filename in the
 SequenceFile key field, and the file's bytes in the SequenceFile
  value
 field.

 In addition to the HDFS design docs, I recommend reading this blog
   post:
 http://www.cloudera.com/blog/2009/02/the-small-files-problem/

 Happy Hadooping,

 - Patrick

 On Tue, May 18, 2010 at 9:11 AM, Pierre ANCELOT 
 pierre...@gmail.com
  
 wrote:

  Okay, thank you :)
 
 
  On Tue, May 18, 2010 at 2:48 PM, Brian Bockelman 
   bbock...@cse.unl.edu
  wrote:
 
  
   On May 18, 2010, at 7:38 AM, Pierre ANCELOT wrote:
  
Hi, thanks for this fast answer :)
If so, what do you mean by blocks? If a file has to be
  splitted,
   it
  will
   be
splitted when larger than 64MB?
   
  
   For every 64MB of the file, Hadoop will create a separate
 block.
So,
 if
   you have a 32KB file, there will be one block of 32KB.  If the
  file
is
  65MB,
   then it will have one block of 64MB and another block of 1MB.
  
   Splitting files is very useful for load-balancing and
  distributing
I/O
   across multiple nodes.  At 32KB / file, you don't really need
 to
split
  the
   files at all.
  
   I recommend reading the HDFS design document for background
  issues
like
   this:
  
   http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
  
   Brian
  
   
   
   
On Tue