Perfect, thanks. It's what I was looking for. I have a few nodes, all with 2TB drives, but one with 2x1TB, which means that in the end, for Hadoop, it's almost the same thing.
JM

2012/12/28, Robert Molina <rmol...@hortonworks.com>:
> Hi Jean,
> Hadoop will not factor in the number of disks or directories, but rather
> mainly the allocated free space. Hadoop will do its best to spread the
> data evenly amongst the nodes. For instance, let's say you had 3 datanodes
> (replication factor 1) that have each allocated 10GB, but one of the
> nodes split its 10GB into two directories. Now if we try to store a file
> that takes up 3 blocks, Hadoop will just place 1 block on each node.
>
> Hope that helps.
>
> Regards,
> Robert
>
> On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari <
> jean-m...@spaggiari.org> wrote:
>
>> Hi,
>>
>> Quick question regarding hard drive space usage.
>>
>> Hadoop will distribute the data evenly on the cluster, so all the
>> nodes are going to receive almost the same quantity of data to store.
>>
>> Now, if on one node I have 2 directories configured, is Hadoop going
>> to assign twice the quantity to this node? Or is each directory going
>> to receive half the load?
>>
>> Thanks,
>>
>> JM
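
For reference, multiple storage directories on a datanode are configured as a
comma-separated list in the dfs.data.dir property (renamed
dfs.datanode.data.dir in Hadoop 2.x) in hdfs-site.xml; the datanode then
round-robins new blocks across the listed directories, so each directory
receives roughly half the node's load. A minimal sketch, assuming two
hypothetical mount points /disk1 and /disk2 for the 2x1TB case:

    <!-- hdfs-site.xml: one data directory on each of the two 1TB disks -->
    <property>
      <name>dfs.data.dir</name>
      <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
    </property>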