[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759591#comment-13759591
 ] 

Owen O'Malley commented on HDFS-4953:
-------------------------------------

Ok, after a bit more thought, I realize that we can't assume the user is done 
with the ByteBuffer by the time they call read again. I'd like to propose 
changing the API in FSDataInputStream to the following:

{code}
/**
 * Read the file from offset to offset+length-1 into a series of ByteBuffers.
 * If there are fewer than length bytes available in the file, an IOException
 * will be thrown. In all other cases, the combined length of all of the
 * returned buffers will precisely match the requested length.
 * The ByteBuffers may be backed by either mmapped memory or a byte[],
 * depending on whether the data is available locally.
 * @param offset the first offset of the file to read
 * @param length the total combined length of the file to read
 * @return An array of ByteBuffers that precisely cover the requested region
 *         of the file.
 */
public ByteBuffer[] readByteBuffers(long offset, int length) throws IOException;

/**
 * Add the provided buffers to the FSDataInputStream's pool of byte buffers
 * that are available to be reused. The client should not hold references to
 * buffers after they have been released.
 * @param buffers buffers returned from previous calls to readByteBuffers
 */
public void releaseByteBuffers(ByteBuffer... buffers);
{code}
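
To make the intended calling pattern concrete, here is a minimal sketch of how a client might read a range and then hand the buffers back. Only the two method signatures come from the proposal above; the `ByteBufferReadable` interface, the `InMemoryStream` class, and its `chunkSize` parameter are hypothetical stand-ins that keep the example self-contained, and they are not part of Hadoop.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

public class ReadByteBuffersSketch {
    /** Hypothetical interface holding only the two proposed methods. */
    interface ByteBufferReadable {
        ByteBuffer[] readByteBuffers(long offset, int length) throws IOException;
        void releaseByteBuffers(ByteBuffer... buffers);
    }

    /** Toy in-memory stand-in; chunkSize exists only to force several buffers. */
    static class InMemoryStream implements ByteBufferReadable {
        private final byte[] data;
        private final int chunkSize;
        private final Deque<ByteBuffer> pool = new ArrayDeque<>();

        InMemoryStream(byte[] data, int chunkSize) {
            this.data = data;
            this.chunkSize = chunkSize;
        }

        @Override
        public ByteBuffer[] readByteBuffers(long offset, int length) throws IOException {
            // Fewer than length bytes available: fail rather than return a short read.
            if (offset < 0 || length < 0 || offset + length > data.length) {
                throw new IOException("requested range exceeds file length");
            }
            int count = (length + chunkSize - 1) / chunkSize;
            ByteBuffer[] result = new ByteBuffer[count];
            int pos = (int) offset;
            int remaining = length;
            for (int i = 0; i < count; i++) {
                int n = Math.min(chunkSize, remaining);
                // Reuse a released buffer when one is available.
                ByteBuffer buf = pool.isEmpty() ? ByteBuffer.allocate(chunkSize) : pool.pop();
                buf.clear();
                buf.put(data, pos, n);
                buf.flip();
                result[i] = buf;
                pos += n;
                remaining -= n;
            }
            return result;
        }

        @Override
        public void releaseByteBuffers(ByteBuffer... buffers) {
            for (ByteBuffer b : buffers) {
                pool.push(b);
            }
        }

        int pooledCount() {
            return pool.size();
        }
    }

    /** Sum of remaining bytes across all buffers; should equal the requested length. */
    static int totalRemaining(ByteBuffer[] buffers) {
        int total = 0;
        for (ByteBuffer b : buffers) {
            total += b.remaining();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "hello, hdfs mmap".getBytes(StandardCharsets.US_ASCII);
        InMemoryStream in = new InMemoryStream(file, 4);

        ByteBuffer[] bufs = in.readByteBuffers(2, 10);
        System.out.println("buffers=" + bufs.length + " bytes=" + totalRemaining(bufs));

        // The caller decides when it is done; after this it must drop its references.
        in.releaseByteBuffers(bufs);
        System.out.println("pooled=" + in.pooledCount());
    }
}
```

The point of the explicit release call is that buffer lifetime is controlled by the caller, not by the next read.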
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: HDFS-4949
>
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.
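
The difference between the copying read path and an mmap-based read that the description refers to can be illustrated with plain `java.nio`. This is only a sketch on an ordinary local file, not the HDFS implementation; the class and method names (`MmapReadSketch`, `readWithCopy`, `readWithMmap`) are made up for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadSketch {
    /** Copying path: a positional read fills a heap buffer with a copy of the bytes. */
    static ByteBuffer readWithCopy(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, pos);
                if (n < 0) {
                    throw new IOException("unexpected end of file");
                }
                pos += n;
            }
            buf.flip();
            return buf;
        }
    }

    /** mmap path: the returned buffer is backed by the mapped file region itself. */
    static ByteBuffer readWithMmap(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A mapping stays valid after the channel that created it is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));

        ByteBuffer copied = readWithCopy(file, 4, 8);
        ByteBuffer mapped = readWithMmap(file, 4, 8);
        System.out.println("same contents: " + copied.equals(mapped));
    }
}
```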

