Explore usage of the sendfile api via 
java.nio.channels.FileChannel.transfer{To|From} for i/o in datanodes
----------------------------------------------------------------------------------------------------------

                 Key: HADOOP-2312
                 URL: https://issues.apache.org/jira/browse/HADOOP-2312
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Arun C Murthy


We could potentially gain a lot of performance by using the *sendfile* system 
call:

$ man sendfile
{noformat}
DESCRIPTION
       This call copies data between one file descriptor and another.
       Either or both of these file descriptors may refer to a socket
       (but see below).  in_fd should be a file descriptor opened for
       reading and out_fd should be a descriptor opened for writing.
       offset is a pointer to a variable holding the input file pointer
       position from which sendfile() will start reading data.  When
       sendfile() returns, this variable will be set to the offset of
       the byte following the last byte that was read.  count is the
       number of bytes to copy between file descriptors.

       Because this copying is done within the kernel, sendfile() does
       not need to spend time transferring data to and from user space.
{noformat}

The nio package offers this via the 
java.nio.channels.FileChannel.transfer{To|From} apis:
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferFrom(java.nio.channels.ReadableByteChannel,%20long,%20long)
http://java.sun.com/j2se/1.5.0/docs/api/java/nio/channels/FileChannel.html#transferTo(long,%20long,%20java.nio.channels.WritableByteChannel)

From the javadocs:
{noformat}
     This method is potentially much more efficient than a simple loop that 
reads from this channel and writes to the target channel. Many operating 
systems can transfer bytes directly from the filesystem cache to the target 
channel without actually copying them.
{noformat}
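
A minimal sketch of what a transferTo-based send might look like. The class name TransferToSketch and the helper sendRange are hypothetical and are not existing Hadoop code. The sketch assumes a blocking target channel; because transferTo may move fewer bytes than requested, the call is made in a loop.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TransferToSketch {

    // Hypothetical helper: sends `length` bytes of `src`, starting at
    // `offset`, to `dst` and returns the number of bytes actually sent.
    // Assumes `dst` is a blocking channel.
    static long sendRange(FileChannel src, long offset, long length,
                          WritableByteChannel dst) throws IOException {
        long sent = 0;
        while (sent < length) {
            // transferTo returns the number of bytes moved by this call,
            // which may be less than the number requested.
            long n = src.transferTo(offset + sent, length - sent, dst);
            if (n <= 0) {
                break; // nothing moved, e.g. the position is past end of file
            }
            sent += n;
        }
        return sent;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a block file and a client connection: both are
        // temporary local files here, purely for illustration.
        Path src = Files.createTempFile("block", ".dat");
        Path dst = Files.createTempFile("copy", ".dat");
        Files.write(src, new byte[64 * 1024]);

        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            long sent = sendRange(in, 0, in.size(), out);
            System.out.println("transferred " + sent + " bytes");
        }
    }
}
```

In a datanode the target would presumably be a SocketChannel rather than a second file; whether the JVM actually maps this call to sendfile depends on the platform and the channel types involved.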

----

Hence, this could be well worth exploring for doing I/O at the datanodes.
