[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847854#comment-13847854 ]
Andrew Wang commented on HDFS-5634: ----------------------------------- Thanks for being patient with me, I think I finally grok what's going on. bq. <BRL builder> Alright. It seems odd for datanodeID and block to ever be null since they're used to key the FISCache, but okay. Still, we're currently not setting the caching strategy in DFSInputStream#getBlockReader when pulling something out of the FISCache, should we? bq. We always buffer at least a single chunk, even if readahead is turned off. The mechanics of checksumming require this. In {{fillDataBuf(boolean)}}, it looks like we try to fill {{maxReadaheadLength}} modulo the slop. If this is zero, what happens? I think we need to take the checksum chunk size into account here somewhere. It'd also be good to have test coverage for all these different read paths and parameter combinations. It looks like TestParallel* already cover the ByteBuffer and array variants of read with and without checksums, but we also need to permute on different readahead sizes (e.g. 0, chunk-1, chunk+1). Unfortunately even though the array and BB versions are almost the same, they don't share any code, so we should probably still permute on that axis. With the new getter, we could use that instead of passing around {{canSkipChecksum}} everywhere, but I'm unsure if mlock toggling has implications here. Your call. There's still BlockReaderFactory#newBlockReader using verifyChecksum rather than skipChecksum. Do you want to change this one too? There are only 6 usages from what I see. bq. <deque> Yea, let's just skip this idea for now. We really should use a profiler for guidance before trying optimizations anyway. That'd be interesting to do later. bq. <readahead and buffer size linkage> I was thinking user documentation in hdfs-default.xml for "dfs.client.read.shortcircuit.buffer.size" and "dfs.datanode.readahead.bytes" but I see you filed a follow-on for that. It also feels inconsistent that the datanode defaults to 4MB readahead while the client looks like it defaults to 0, but maybe there's some reasoning there. > allow BlockReaderLocal to switch between checksumming and not > ------------------------------------------------------------- > > Key: HDFS-5634 > URL: https://issues.apache.org/jira/browse/HDFS-5634 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client > Affects Versions: 3.0.0 > Reporter: Colin Patrick McCabe > Assignee: Colin Patrick McCabe > Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, > HDFS-5634.003.patch, HDFS-5634.004.patch > > > BlockReaderLocal should be able to switch between checksumming and > non-checksumming, so that when we get notifications that something is mlocked > (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)