On Mon, Oct 28, 2013 at 4:24 PM, Kyle Sletmoe
<kyle.slet...@urbanrobotics.net> wrote:
> I have written a WebHDFSClient and I do not believe that reusing
> connections is enough to noticeably speed up transfers in my case. I did
> some tests and on average it took roughly 14 minutes to transfer a 3.6 GB
> file to an HDFS on my local network (I tried the same operation using cURL,
> with similar results). I tried transferring the exact same file with the
> hdfs->dfs->copyFromLocal command, and it took on average 40 seconds. I need
> to be able to reliably transfer files that are in the 250 GB - 1TB range,
> and I really need the speed afforded by the "direct" transferring method
> that libhdfs uses. Does libhdfs work with Hadoop 2.2.0 (if I use it in
> Linux)?

libhdfs is the basis of a lot of software built on top of HDFS, such
as Impala and fuse_dfs, and yes, it works.
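
For a sense of scale, here is a rough back-of-the-envelope sketch using only
the figures quoted above (3.6 GB in about 14 minutes over WebHDFS versus about
40 seconds with copyFromLocal). The function names are made up for this
illustration, decimal units (1 GB = 1000 MB) are assumed, and the 1 TB
estimate assumes the measured rate would be sustained, which may not hold on a
real cluster.

```c
#include <assert.h>
#include <stdio.h>

/* Average throughput in MB/s for `megabytes` moved in `seconds`. */
double throughput_mb_per_s(double megabytes, double seconds) {
    assert(seconds > 0.0);
    return megabytes / seconds;
}

/* Hours needed to move `megabytes` at a sustained rate of `mb_per_s`. */
double hours_to_transfer(double megabytes, double mb_per_s) {
    assert(mb_per_s > 0.0);
    return megabytes / mb_per_s / 3600.0;
}

/* Prints the comparison for the numbers reported in this thread. */
void print_transfer_estimates(void) {
    const double file_mb   = 3600.0;           /* 3.6 GB test file */
    const double one_tb_mb = 1000.0 * 1000.0;  /* 1 TB, decimal units */
    const double webhdfs   = throughput_mb_per_s(file_mb, 14.0 * 60.0);
    const double native    = throughput_mb_per_s(file_mb, 40.0);

    printf("WebHDFS:       %.1f MB/s, 1 TB in about %.1f hours\n",
           webhdfs, hours_to_transfer(one_tb_mb, webhdfs));
    printf("copyFromLocal: %.1f MB/s, 1 TB in about %.1f hours\n",
           native, hours_to_transfer(one_tb_mb, native));
}
```

That works out to roughly 4.3 MB/s versus 90 MB/s, or about 65 hours versus
about 3 hours for a 1 TB file, which is why the native client matters at the
sizes you mention.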

Patches that improve portability are welcome.  However, instead of
#ifdefs, I would prefer platform-specific files that implement
whatever functionality is platform-specific.
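
To make the idea concrete, here is a minimal sketch of that layout. It is not
taken from libhdfs: the file names (platform.h, platform_posix.c, caller.c) and
the platform_mutex_* functions are hypothetical, and the three "files" are shown
in one block only so that it compiles on its own. A Windows build would supply
its own file implementing the same declarations, and the build system, not the
preprocessor, would pick which one to compile.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* ---- platform.h (hypothetical): shared interface, no #ifdefs ---- */
typedef struct platform_mutex platform_mutex;   /* opaque to callers */
platform_mutex *platform_mutex_create(void);
int platform_mutex_lock(platform_mutex *m);
int platform_mutex_unlock(platform_mutex *m);
void platform_mutex_destroy(platform_mutex *m);

/* ---- platform_posix.c (hypothetical): built only for POSIX targets ---- */
struct platform_mutex {
    pthread_mutex_t impl;
};

platform_mutex *platform_mutex_create(void) {
    platform_mutex *m = malloc(sizeof *m);
    if (m != NULL && pthread_mutex_init(&m->impl, NULL) != 0) {
        free(m);
        return NULL;
    }
    return m;
}

int platform_mutex_lock(platform_mutex *m) {
    return pthread_mutex_lock(&m->impl);
}

int platform_mutex_unlock(platform_mutex *m) {
    return pthread_mutex_unlock(&m->impl);
}

void platform_mutex_destroy(platform_mutex *m) {
    if (m != NULL) {
        pthread_mutex_destroy(&m->impl);
        free(m);
    }
}

/* ---- caller.c (hypothetical): portable code sees only platform.h ---- */
/* Increments *counter under the lock; returns 0 on success. */
int guarded_increment(platform_mutex *m, int *counter) {
    assert(m != NULL && counter != NULL);
    if (platform_mutex_lock(m) != 0) {
        return -1;
    }
    ++*counter;
    return platform_mutex_unlock(m);
}
```

The point of the opaque struct is that caller code never mentions pthreads, so
porting means adding one new implementation file rather than touching every
call site.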

Another option for you is to use the new NFS v3 gateway included in
Hadoop 2.  I have heard that newer versions of Windows finally include
some kind of NFS support.  (However, older versions, such as Windows
XP, do not have this support.)

best,
Colin


>
> --
> Kyle Sletmoe
>
> *Urban Robotics Inc.**
> *Software Engineer
>
> 33 NW First Avenue, Suite 200 | Portland, OR 97209
> c: (541) 621-7516 | e: kyle.slet...@urbanrobotics.net
>
> http://www.urbanrobotics.net
>
>
> On Mon, Oct 28, 2013 at 4:14 PM, Haohui Mai <h...@hortonworks.com> wrote:
>
>> I believe that the WebHDFS API is your best bet for now. The current
>> implementation of WebHDFSClient does not reuse the HTTP connections, which
>> leads to a large part of the performance penalty.
>>
>> You might want to implement your own version that reuses HTTP connection to
>> see whether it meets your performance requirements.
>>
>> Thanks,
>> Haohui
>>
>>
>> On Mon, Oct 28, 2013 at 3:38 PM, Kyle Sletmoe <
>> kyle.slet...@urbanrobotics.net> wrote:
>>
>> > Now that Hadoop 2.2.0 is Windows compatible, is there going to be work on
>> > creating a portable version of libhdfs for C/C++ interaction with HDFS? I
>> > know I can use the WebHDFS REST API, but the data transfer rates are
>> > abysmally slow compared to the direct interaction via libhdfs.
>> >
>> > Regards,
>> > --
>> > Kyle Sletmoe
>> >
>> > --
>> > *Information contained herein is subject to the Code of Federal Regulations
>> > Chapter 22 International Traffic in Arms Regulations. This data may not be
>> > resold, diverted, transferred, transshipped, made available to a foreign
>> > national within the United States, or otherwise disposed of in any other
>> > country outside of its intended destination, either in original form or
>> > after being incorporated through an intermediate process into other data
>> > without the prior written approval of the US Department of State.  **Penalties
>> > for violation include bans on defense and military work, fines and
>> > imprisonment.*
>> >
>>
>> --
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
>
