[ 
https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847854#comment-13847854
 ] 

Andrew Wang commented on HDFS-5634:
-----------------------------------

Thanks for being patient with me, I think I finally grok what's going on.

bq. <BRL builder>

Alright. It seems odd for datanodeID and block to ever be null since they're 
used to key the FISCache, but okay. Still, we're currently not setting the 
caching strategy in DFSInputStream#getBlockReader when pulling something out of 
the FISCache, should we?

bq. We always buffer at least a single chunk, even if readahead is turned off. 
The mechanics of checksumming require this.

In {{fillDataBuf(boolean)}}, it looks like we try to fill 
{{maxReadaheadLength}} modulo the slop. If this is zero, what happens? I think 
we need to take the checksum chunk size into account here somewhere.

It'd also be good to have test coverage for all these different read paths and 
parameter combinations. It looks like TestParallel* already cover the 
ByteBuffer and array variants of read with and without checksums, but we also 
need to permute on different readahead sizes (e.g. 0, chunk-1, chunk+1). 
Unfortunately even though the array and BB versions are almost the same, they 
don't share any code, so we should probably still permute on that axis.

With the new getter, we could use that instead of passing around 
{{canSkipChecksum}} everywhere, but I'm unsure if mlock toggling has 
implications here. Your call.

There's still BlockReaderFactory#newBlockReader using verifyChecksum rather 
than skipChecksum. Do you want to change this one too? There are only 6 usages 
from what I see.

bq. <deque>

Yea, let's just skip this idea for now. We really should use a profiler for 
guidance before trying optimizations anyway. That'd be interesting to do later.

bq. <readahead and buffer size linkage>

I was thinking user documentation in hdfs-default.xml for 
"dfs.client.read.shortcircuit.buffer.size" and "dfs.datanode.readahead.bytes" 
but I see you filed a follow-on for that. It also feels inconsistent that the 
datanode defaults to 4MB readahead while the client looks like it defaults to 
0, but maybe there's some reasoning there.

> allow BlockReaderLocal to switch between checksumming and not
> -------------------------------------------------------------
>
>                 Key: HDFS-5634
>                 URL: https://issues.apache.org/jira/browse/HDFS-5634
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, 
> HDFS-5634.003.patch, HDFS-5634.004.patch
>
>
> BlockReaderLocal should be able to switch between checksumming and 
> non-checksumming, so that when we get notifications that something is mlocked 
> (see HDFS-5182), we can avoid checksumming when reading from that block.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Reply via email to