On Mon, Oct 28, 2013 at 4:24 PM, Kyle Sletmoe <kyle.slet...@urbanrobotics.net> wrote:
> I have written a WebHDFSClient and I do not believe that reusing
> connections is enough to noticeably speed up transfers in my case. I did
> some tests and on average it took roughly 14 minutes to transfer a 3.6 GB
> file to an HDFS on my local network (I tried the same operation using cURL,
> with similar results). I tried transferring the exact same file with the
> hdfs->dfs->copyFromLocal command, and it took on average 40 seconds. I need
> to be able to reliably transfer files that are in the 250 GB - 1TB range,
> and I really need the speed afforded by the "direct" transferring method
> that libhdfs uses. Does libhdfs work with Hadoop 2.2.0 (if I use it in
> Linux)?

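For scale, here is a rough sketch of the arithmetic behind the figures quoted above. It assumes decimal units (1 GB = 1000 MB) and that throughput stays constant as files grow, which may not hold on a real cluster. The function names are made up for illustration.

```python
# Figures taken from the quoted message; decimal units are an assumption.
FILE_MB = 3.6 * 1000            # 3.6 GB test file
WEBHDFS_SECONDS = 14 * 60       # roughly 14 minutes via WebHDFS / cURL
COPYFROMLOCAL_SECONDS = 40      # roughly 40 seconds via copyFromLocal


def throughput_mb_s(size_mb, seconds):
    """Average transfer rate in MB/s."""
    return size_mb / seconds


def projected_hours(size_gb, rate_mb_s):
    """Hours needed to move size_gb at a constant rate (linear scaling assumed)."""
    return size_gb * 1000 / rate_mb_s / 3600


webhdfs_rate = throughput_mb_s(FILE_MB, WEBHDFS_SECONDS)
direct_rate = throughput_mb_s(FILE_MB, COPYFROMLOCAL_SECONDS)

if __name__ == "__main__":
    print(f"WebHDFS:       {webhdfs_rate:.2f} MB/s")
    print(f"copyFromLocal: {direct_rate:.2f} MB/s")
    print(f"ratio:         {direct_rate / webhdfs_rate:.0f}x")
    for size_gb in (250, 1000):
        print(f"{size_gb} GB: {projected_hours(size_gb, webhdfs_rate):.1f} h via WebHDFS, "
              f"{projected_hours(size_gb, direct_rate):.1f} h via copyFromLocal")
```

Under those assumptions the gap is about 21x, so a 1 TB file would take on the order of days over WebHDFS versus a few hours with the direct path.
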
libhdfs is the basis of a lot of software built on top of HDFS, such as
Impala and fuse_dfs, and yes, it works.

Patches that improve portability are welcome. However, rather than #ifdefs,
I would prefer to see platform-specific files that implement whatever
functionality is platform-specific.

Another option for you is to use the new NFS v3 gateway included in Hadoop 2.
I have heard that newer versions of Windows finally include some kind of NFS
support. (However, older versions, such as Windows XP, do not have this
support.)

best,
Colin

>
> --
> Kyle Sletmoe
>
> Urban Robotics Inc.
> Software Engineer
>
> 33 NW First Avenue, Suite 200 | Portland, OR 97209
> c: (541) 621-7516 | e: kyle.slet...@urbanrobotics.net
>
> http://www.urbanrobotics.net
>
>
> On Mon, Oct 28, 2013 at 4:14 PM, Haohui Mai <h...@hortonworks.com> wrote:
>
>> I believe that the WebHDFS API is your best bet for now. The current
>> implementation of WebHDFSClient does not reuse the HTTP connections, which
>> leads to a large part of the performance penalty.
>>
>> You might want to implement your own version that reuses HTTP connection to
>> see whether it meets your performance requirements.
>>
>> Thanks,
>> Haohui
>>
>>
>> On Mon, Oct 28, 2013 at 3:38 PM, Kyle Sletmoe <
>> kyle.slet...@urbanrobotics.net> wrote:
>>
>> > Now that Hadoop 2.2.0 is Windows compatible, is there going to be work on
>> > creating a portable version of libhdfs for C/C++ interaction with HDFS? I
>> > know I can use the WebHDFS REST API, but the data transfer rates are
>> > abysmally slow compared to the direct interaction via libhdfs.
>> >
>> > Regards,
>> > --
>> > Kyle Sletmoe
>> >
>> > --
>> > Information contained herein is subject to the Code of Federal
>> > Regulations Chapter 22 International Traffic in Arms Regulations. This
>> > data may not be resold, diverted, transferred, transshipped, made
>> > available to a foreign national within the United States, or otherwise
>> > disposed of in any other country outside of its intended destination,
>> > either in original form or after being incorporated through an
>> > intermediate process into other data without the prior written approval
>> > of the US Department of State. Penalties for violation include bans on
>> > defense and military work, fines and imprisonment.
>> >
>>
>> --
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity to
>> which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
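
For anyone who wants to try the connection-reuse suggestion made earlier in the thread, the following is a minimal sketch and not the actual WebHDFSClient code. It assumes the documented two-step WebHDFS CREATE flow (a PUT to the NameNode that answers with a 307 redirect, then a PUT of the data to the DataNode named in the Location header) and simple user.name authentication. The class name, helper name, host, and port are hypothetical. As reported above, reuse alone may not close the gap for very large files, since the bulk of the time is spent streaming the data itself.

```python
import http.client
import os
from urllib.parse import quote, urlencode, urlsplit


def create_url_path(hdfs_path, user, overwrite=True):
    """Build the request target for a WebHDFS CREATE call (hypothetical helper)."""
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + query


class ReusingWebHdfsClient:
    """Keeps one HTTPConnection per (host, port) and reuses it across uploads."""

    def __init__(self, namenode_host, namenode_port, user):
        self.namenode = (namenode_host, namenode_port)
        self.user = user
        self._pool = {}

    def _connection(self, host, port):
        # Nothing is sent until the first request; later requests to the
        # same host and port go through the same connection object.
        key = (host, port)
        if key not in self._pool:
            self._pool[key] = http.client.HTTPConnection(host, port)
        return self._pool[key]

    def upload(self, local_path, hdfs_path):
        # Step 1: ask the NameNode where to write; a 307 redirect is expected.
        nn = self._connection(*self.namenode)
        nn.request("PUT", create_url_path(hdfs_path, self.user))
        resp = nn.getresponse()
        resp.read()  # drain the body so the connection can be reused
        if resp.status != 307:
            raise RuntimeError(f"expected 307 from NameNode, got {resp.status}")
        location = urlsplit(resp.getheader("Location"))

        # Step 2: stream the file to the DataNode given in the redirect.
        dn = self._connection(location.hostname, location.port)
        target = location.path + "?" + location.query
        headers = {"Content-Length": str(os.path.getsize(local_path))}
        with open(local_path, "rb") as f:
            dn.request("PUT", target, body=f, headers=headers)
            resp = dn.getresponse()
            resp.read()
        if resp.status != 201:
            raise RuntimeError(f"expected 201 from DataNode, got {resp.status}")

    def close(self):
        for conn in self._pool.values():
            conn.close()
        self._pool.clear()
```

Usage would look roughly like `ReusingWebHdfsClient("namenode.example", 50070, "kyle").upload("big.bin", "/data/big.bin")`, where the host, port, and user are placeholders for your own cluster.
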