[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727274#comment-13727274
 ] 

Tsuyoshi OZAWA commented on HDFS-4953:
--------------------------------------

bq. We might run out of file descriptors long before a GC is triggered.

I see. That way of thinking seems natural to me.

bq. That consumes page table entries and may prevent us from unmapping a memory 
map which really has not been used for a long time.

I agree. However, ref-count-based memory management for advanced users could 
coexist with PhantomReference-based memory management for casual users, offered 
purely as an option to prevent memory leaks. What do you think?
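
As a rough illustration of the ref-count half of this idea, here is a minimal 
Java sketch. It is not code from any of the attached patches: the class name 
RefCountedMapping, the methods ref() and unref(), and the unmapAction callback 
(standing in for the real unmap call) are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical ref-counted handle; not the API from the HDFS-4953 patches. */
final class RefCountedMapping {
    // The creator of the mapping holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    // Stands in for whatever actually unmaps the region.
    private final Runnable unmapAction;

    RefCountedMapping(Runnable unmapAction) {
        this.unmapAction = unmapAction;
    }

    /** Takes an extra reference; fails if the mapping was already released. */
    void ref() {
        while (true) {
            int current = refCount.get();
            if (current == 0) {
                throw new IllegalStateException("mapping already released");
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    /** Drops a reference; returns true if this call released the mapping. */
    boolean unref() {
        while (true) {
            int current = refCount.get();
            if (current == 0) {
                throw new IllegalStateException("unref() without matching ref()");
            }
            if (refCount.compareAndSet(current, current - 1)) {
                if (current == 1) {
                    unmapAction.run(); // last reference gone: unmap exactly once
                    return true;
                }
                return false;
            }
        }
    }
}
```

With this shape the mapping is released deterministically on the last unref(), 
so it does not depend on when a GC happens to run.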

bq. My understanding of PR (correct me if I'm wrong) is that you generally have 
to have a thread that keeps polling the PR to see if it's ready to be disposed 
of. This is extra overhead for the users who do remember to correctly call 
close().

That is correct. Basically, we need a new thread to poll the ReferenceQueue, 
the event queue for PhantomReference. If this overhead turns out to be a 
problem, we can add a new configuration option to switch it on and off.
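
A minimal Java sketch of such a polling thread follows, only to make the 
overhead concrete. It is an assumption-laden illustration, not code from the 
attached patches: the class LeakReaper, its track() method, the thread name, 
and the boolean "enabled" flag (standing in for the proposed configuration 
switch) are hypothetical. Only PhantomReference, ReferenceQueue, and the other 
JDK classes are real APIs.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical safety net for owners that were never close()d. */
final class LeakReaper {
    /** Phantom reference that remembers how to release the resource. */
    private static final class Tracked extends PhantomReference<Object> {
        final Runnable cleanup;

        Tracked(Object owner, ReferenceQueue<Object> queue, Runnable cleanup) {
            super(owner, queue);
            this.cleanup = cleanup;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps Tracked objects strongly reachable until they are processed.
    private final Set<Tracked> pending = ConcurrentHashMap.newKeySet();
    private final boolean enabled;

    /** enabled=false models switching the feature off: no thread is started. */
    LeakReaper(boolean enabled) {
        this.enabled = enabled;
        if (enabled) {
            Thread t = new Thread(this::pollLoop, "mmap-leak-reaper");
            t.setDaemon(true);
            t.start();
        }
    }

    /**
     * Registers owner and returns the action its close() should run.
     * cleanup must not reference owner, or owner never becomes unreachable.
     */
    Runnable track(Object owner, Runnable cleanup) {
        AtomicBoolean done = new AtomicBoolean();
        Runnable once = () -> {
            if (done.compareAndSet(false, true)) {
                cleanup.run();
            }
        };
        if (!enabled) {
            return once;
        }
        Tracked ref = new Tracked(owner, queue, once);
        pending.add(ref);
        return () -> {
            pending.remove(ref); // explicit close: stop tracking
            once.run();
        };
    }

    /** Blocks on the queue; runs cleanup for owners the GC found unreachable. */
    private void pollLoop() {
        try {
            while (true) {
                Reference<?> ref = queue.remove();
                if (pending.remove(ref) && ref instanceof Tracked) {
                    ((Tracked) ref).cleanup.run();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Users who call close() correctly pay for one set insertion and removal per 
object plus one mostly idle daemon thread. With the flag off, no thread is 
created and nothing is tracked.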
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.

