If you specify a list of directories in the property "dfs.datanode.data.dir", Hadoop
will distribute the data blocks among all those disks; it will not
replicate data between them. If you want to use the disks as a single
volume, you would need to build an LVM array (or use some other solution) to present
them as a single disk to the OS.
However, benchmarks show that specifying a list of disks and letting
Hadoop distribute data among them gives better performance.
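As a rough sketch, assuming the usual hdfs-site.xml override file and the mount points from your example (adjust the paths to your actual mounts), the property would look like this:

    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/vol1/hadoop/data,/vol2/hadoop/data,/vol3/hadoop/data</value>
    </property>

With that, the datanode spreads new blocks across the listed directories (round-robin by default), so the capacity of the disks is aggregated rather than mirrored.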
On 13/05/14 17:12, Marcos Sousa wrote:
Yes,
I don't want to replicate, I just want to use them as one disk. Isn't it possible to
make this work?
Best regards,
Marcos
On Tue, May 13, 2014 at 6:55 AM, Rahul Chaudhari
<rahulchaudhari0...@gmail.com>
wrote:
Marcos,
While configuring Hadoop, the "dfs.datanode.data.dir" property
in hdfs-default.xml should have this list of disks specified on
separate lines. If you specify a comma-separated list, it will
replicate on all those disks/partitions.
_Rahul
Sent from my iPad
> On 13-May-2014, at 12:22 am, Marcos Sousa
<falecom...@marcossousa.com>
wrote:
>
> Hi,
>
> I have 20 servers, each with 10 HDs of 400GB SATA. I'd like to use
them as my datanodes:
>
> /vol1/hadoop/data
> /vol2/hadoop/data
> /vol3/hadoop/data
> /volN/hadoop/data
>
> How do I use those distinct disks without replicating between them?
>
> Best regards,
>
> --
> Marcos Sousa
--
Marcos Sousa
www.marcossousa.com Enjoy it!
--
Aitor PĂ©rez
Big Data System Engineer
Telf.: +34 917 680 490
Fax: +34 913 833 301
C/Manuel Tovar, 49-53 - 28034 Madrid - Spain
http://www.bidoop.es