[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847873#comment-13847873 ]

Colin Patrick McCabe commented on HDFS-5634:
--------------------------------------------

bq. Alright. It seems odd for datanodeID and block to ever be null since 
they're used to key the FISCache, but okay. Still, we're currently not setting 
the caching strategy in DFSInputStream#getBlockReader when pulling something 
out of the FISCache, should we?

Definitely.  Fixed.

bq. In fillDataBuf(boolean), it looks like we try to fill maxReadaheadLength 
modulo the slop. If this is zero, what happens? I think we need to take the 
checksum chunk size into account here somewhere.

This is a bug.  Basically, it's a corner case: when doing a checksummed read, 
we need to fudge the user's no-readahead request into a short-readahead 
request, since we can only verify whole checksum chunks.  Will fix...

bq. It'd also be good to have test coverage for all these different read paths 
and parameter combinations. It looks like TestParallel* already cover the 
ByteBuffer and array variants of read with and without checksums, but we also 
need to permute on different readahead sizes (e.g. 0, chunk-1, chunk+1). 
Unfortunately even though the array and BB versions are almost the same, they 
don't share any code, so we should probably still permute on that axis.

I think the main thing is to have coverage for the no-readahead + checksum 
case, which we don't really have now.  Having a test that did chunk-1 
readahead, to verify that the round-up functionality worked, would also be a 
good idea.
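The parameter matrix we'd be permuting over looks roughly like this (a sketch only; the dimension names are illustrative, not actual test code):

{code}
import java.util.ArrayList;
import java.util.List;

/** Sketch: enumerate the read-path parameter combinations to cover. */
public class ReadParamMatrix {
  static List<String> combos(int bytesPerChecksum) {
    // Readahead sizes straddling the chunk boundary, per the review.
    int[] readaheads = { 0, bytesPerChecksum - 1, bytesPerChecksum + 1 };
    boolean[] checksum = { true, false };
    // The array and ByteBuffer read variants don't share code, so both
    // axes need covering.
    boolean[] byteBuffer = { true, false };
    List<String> out = new ArrayList<>();
    for (int ra : readaheads) {
      for (boolean cs : checksum) {
        for (boolean bb : byteBuffer) {
          out.add("readahead=" + ra + ",checksum=" + cs
              + ",byteBuffer=" + bb);
        }
      }
    }
    return out;
  }
}
{code}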

bq. With the new getter, we could use that instead of passing around 
canSkipChecksum everywhere, but I'm unsure if mlock toggling has implications 
here. Your call.

Accessing the atomic boolean does have a performance overhead that I wanted to 
avoid.  Also, as you said, it might change, which could create problems.
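The trade-off is essentially this (a minimal sketch; {{ChecksumPolicy}} is a hypothetical name, not a class in the patch):

{code}
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch: snapshot the mlock state once instead of per read. */
class ChecksumPolicy {
  private final boolean canSkipChecksum;  // immutable snapshot

  ChecksumPolicy(AtomicBoolean blockIsMlocked) {
    // One atomic read at construction; subsequent reads avoid the
    // volatile access, at the cost of never seeing later toggles.
    this.canSkipChecksum = blockIsMlocked.get();
  }

  boolean canSkipChecksum() {
    return canSkipChecksum;
  }
}
{code}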

bq. I was thinking user documentation in hdfs-default.xml for 
"dfs.client.read.shortcircuit.buffer.size" and "dfs.datanode.readahead.bytes" 
but I see you filed a follow-on for that. It also feels inconsistent that the 
datanode defaults to 4MB readahead while the client looks like it defaults to 
0, but maybe there's some reasoning there.

Actually, the client defaults to "null" readahead which usually means the 
datanode gets to decide.  I guess it's a bit of a confusing construct, but here 
it is:

{code}
    Long readahead = (conf.get(DFS_CLIENT_CACHE_READAHEAD) == null) ?
        null : conf.getLong(DFS_CLIENT_CACHE_READAHEAD, 0);
{code}

In {{BlockReaderLocal}} I am defaulting to {{dfs.datanode.readahead.bytes}} 
when the client's value is null.  If we wanted to be totally consistent with 
how remote reads work, we'd have to get the readahead default from the DataNode 
as part of the RPC.  That seems a little bit like overkill, though.
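In other words, the fallback is just (a sketch; {{ReadaheadResolver}} is a hypothetical name):

{code}
/** Sketch: resolve the effective readahead for BlockReaderLocal. */
class ReadaheadResolver {
  static long resolve(Long clientReadahead, long datanodeReadaheadBytes) {
    // A null client value means "let the datanode decide"; here we fall
    // back to the dfs.datanode.readahead.bytes default.
    return (clientReadahead != null)
        ? clientReadahead : datanodeReadaheadBytes;
  }
}
{code}

Note that an explicit client value of 0 is honored as "no readahead", which is distinct from null.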

bq. There's still BlockReaderFactory#newBlockReader using verifyChecksum rather 
than skipChecksum. Do you want to change this one too? There are only 6 usages 
from what I see.

OK.

> allow BlockReaderLocal to switch between checksumming and not
> -------------------------------------------------------------
>
>                 Key: HDFS-5634
>                 URL: https://issues.apache.org/jira/browse/HDFS-5634
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, 
> HDFS-5634.003.patch, HDFS-5634.004.patch
>
>
> BlockReaderLocal should be able to switch between checksumming and 
> non-checksumming, so that when we get notifications that something is mlocked 
> (see HDFS-5182), we can avoid checksumming when reading from that block.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
