[ https://issues.apache.org/jira/browse/HDFS-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549400#comment-13549400 ]
Todd Lipcon commented on HDFS-4356:
-----------------------------------

- The patch still adds {{available()}} to {{BlockReader}}, but it's not actually used, as best I can tell.

{code}
+      String msg = "access control error while " +
+          "attempting to set up short-circuit access to " +
+          file + resp.getMessage();
+      DFSClient.LOG.error(msg);
+      throw new InvalidBlockTokenException(msg);
{code}

No need to log this here, especially at ERROR level. The exception will already get logged by the catch clause at the call site.

----

- A couple of unused imports (e.g. LoadingCache).

----

{code}
+        DFSClient.LOG.warn("error while attempting to set up short-circuit " +
+            "access to " + file + ": " + resp.getMessage());
{code}

Can you add an {{isDebugEnabled}} guard here? It's on the hot path for random reads.

----

- Can you add a warning log if the short-circuit flag is enabled, but there is no path configured? This will be a common case for people upgrading from HDFS-2246 support to HDFS-347.

----

{code}
+  public static final String DFS_DATANODE_DOMAIN_SOCKET_PATH = "dfs.datanode.domain.socket.path";
{code}

Rename the constant to end in {{_KEY}} like the others. Also, there's an extra space in this line.

----

{code}
+    this.fileInputStreamCache = new FileInputStreamCache(5, 30000);
{code}

Add config keys for these constants. I would also say the default number of streams should be higher -- e.g. a seeky workload in a 1GB file with 64MB blocks would need 20 entries in order to avoid churning the cache.

----

{code}
-      // Don't use the cache on the last attempt - it's possible that there
-      // are arbitrarily many unusable sockets in the cache, but we don't
-      // want to fail the read.
{code}

Why'd you remove this comment?
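To illustrate the {{isDebugEnabled}} suggestion above, here is a minimal sketch (not from the patch; {{FakeLog}} is a hypothetical stand-in for the commons-logging {{Log}} used by DFSClient). The point is that the message string is never concatenated when DEBUG is off, which matters on the random-read hot path:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedDebugLog {
    // Hypothetical minimal Log stand-in; counts how often a message is built.
    public static class FakeLog {
        private final boolean debugEnabled;
        public final AtomicInteger messagesBuilt = new AtomicInteger(0);
        public FakeLog(boolean debugEnabled) { this.debugEnabled = debugEnabled; }
        public boolean isDebugEnabled() { return debugEnabled; }
        public void debug(String msg) { /* would append msg to the log */ }
    }

    // Hot-path setup failure: only pay for string concatenation under DEBUG.
    public static void logShortCircuitFailure(FakeLog log, String file,
            String reason) {
        if (log.isDebugEnabled()) {
            log.messagesBuilt.incrementAndGet();
            log.debug("error while attempting to set up short-circuit "
                + "access to " + file + ": " + reason);
        }
    }
}
```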
----

{code}
+    boolean allowShortCircuitcLocalReads =
{code}

typo: circuitc

----

{code}
+    if (conf.domainSocketPath == null) return null;
+    // UNIX domain sockets can only be used to talk to local peers
+    if (!DFSClient.isLocalAddress(addr)) return null;
+    // If the DomainSocket code is not loaded, we can't create
+    // DomainSocket objects.
+    if (DomainSocket.getLoadingFailureReason() != null) return null;
...
...
+    // If we don't want to pass data over domain sockets, and we don't want
+    // to pass file descriptors over them either, we have no use for domain
+    // sockets.
+    return null;
{code}

Add guarded DEBUG logs in these cases. At DFSClient configuration time, if domain sockets are supposed to be enabled and {{DomainSocket.getLoadingFailureReason()}} is non-null, you should also WARN at that point (but only once, not once per read). This will help users be sure they've got their setup right.

----

- FileInputStreamCache is missing a license. Also please add javadoc.
- Please add some javadoc on the members of this class as well.

----

{code}
+  private final static ScheduledThreadPoolExecutor executor
+      = new ScheduledThreadPoolExecutor(1);
{code}

Fix the indentation. Also, pass a ThreadFactory so that it makes daemon threads with reasonable names.

----

{code}
+        map.remove(entry.getKey(), entry.getValue());
{code}

This will throw a ConcurrentModificationException. You need to remove from the iterator. This bug shows up in a few places.
----

{code}
+      return (datanodeID.equals(otherKey.datanodeID) &&
+          (block.equals(otherKey.block)));
{code}

The block {{equals}} function doesn't compare generation stamps, but I think you should do so here by adding {{block.getGenerationStamp() == otherKey.block.getGenerationStamp()}}.

----

A couple of minor performance nits on FileInputStreamCache.Key:
- Change {{equals}} to compare blocks first, then datanodes (datanodes usually won't differ, since there is only one local DN).
- Change {{hashCode}} to only use the block's hash code, since datanodes won't differ.

----

{code}
+      for (FileInputStream f : fis) {
+        IOUtils.cleanup(LOG, f);
+      }
{code}

You can just use {{IOUtils.cleanup(LOG, fis)}} here, since it takes an array, no?

----

{code}
+  static final int CURRENT_BLOCK_FORMAT_VERSION = 1;
{code}

Find a better spot for this? This doesn't seem like quite the right place.

----

{code}
+    String domainSocketPath =
+        conf.get(DFSConfigKeys.DFS_DATANODE_DOMAIN_SOCKET_PATH);
+    if (domainSocketPath == null) return null;
{code}

Should WARN if the flag says enabled, but there is no path configured.

> BlockReaderLocal should use passed file descriptors rather than paths
> ---------------------------------------------------------------------
>
>                 Key: HDFS-4356
>                 URL: https://issues.apache.org/jira/browse/HDFS-4356
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, hdfs-client, performance
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: 04b-cumulative.patch, 04b.patch, 04-cumulative.patch, 04d-cumulative.patch, 04f-cumulative.patch, 04g-cumulative.patch
>
>
> {{BlockReaderLocal}} should use file descriptors passed over UNIX domain sockets rather than paths. We also need some configuration options for these UNIX domain sockets.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
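The Key suggestions above (generation stamp in {{equals}}, block-first comparison order, block-only {{hashCode}}) can be sketched as follows. The field types here are hypothetical simplifications -- a String datanode id and long block id / generation stamp standing in for DatanodeID and ExtendedBlock -- not the actual FileInputStreamCache.Key:

```java
public class CacheKeyExample {
    public static final class Key {
        final String datanodeID;
        final long blockId;
        final long genStamp;

        public Key(String datanodeID, long blockId, long genStamp) {
            this.datanodeID = datanodeID;
            this.blockId = blockId;
            this.genStamp = genStamp;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            // Compare the block fields first (blocks usually differ) and
            // include the generation stamp; the datanode check goes last,
            // since there is typically only one local DN.
            return blockId == other.blockId
                && genStamp == other.genStamp
                && datanodeID.equals(other.datanodeID);
        }

        @Override
        public int hashCode() {
            // Hash only the block: the datanode won't differ across keys
            // for a single local DN, so it adds no discrimination.
            return Long.hashCode(blockId) ^ Long.hashCode(genStamp);
        }
    }
}
```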