[
https://issues.apache.org/jira/browse/HDFS-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549400#comment-13549400
]
Todd Lipcon commented on HDFS-4356:
-----------------------------------
- The patch still adds {{available()}} to {{BlockReader}}, but it's not
actually used, best I can tell.
{code}
+ String msg = "access control error while " +
+ "attempting to set up short-circuit access to " +
+ file + resp.getMessage();
+ DFSClient.LOG.error(msg);
+ throw new InvalidBlockTokenException(msg);
{code}
No need to log this here, especially at ERROR level. The exception will get
logged in the catch clause at the call site already.
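A minimal sketch of the suggested shape, throwing without logging so the call site's catch clause is the single log point ({{InvalidBlockTokenException}} here is a local stand-in, not the real HDFS class):
{code}
// Sketch only: build the message once, throw, and let the caller log it.
public class ThrowNotLog {
  static class InvalidBlockTokenException extends Exception {
    InvalidBlockTokenException(String msg) { super(msg); }
  }

  static void setupShortCircuit(String file, String respMessage)
      throws InvalidBlockTokenException {
    // No ERROR-level log here; the exception already carries the message.
    throw new InvalidBlockTokenException(
        "access control error while attempting to set up short-circuit "
            + "access to " + file + ": " + respMessage);
  }
}
{code}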
----
- A couple of unused imports (e.g. {{LoadingCache}})
----
{code}
+ DFSClient.LOG.warn("error while attempting to set up short-circuit " +
+ "access to " + file + ": " + resp.getMessage());
{code}
Can you guard this log with {{isDebugEnabled}}? It's on the hot path for random
read.
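The point of the guard is to skip the string concatenation entirely unless debug logging is on. A self-contained sketch, with a boolean standing in for {{LOG.isDebugEnabled()}}:
{code}
// Illustration only: debugEnabled stands in for LOG.isDebugEnabled().
public class GuardedDebug {
  static String debugMessage(boolean debugEnabled, String file, String reason) {
    if (!debugEnabled) {
      return null;  // hot path: no string concatenation at all
    }
    return "error while attempting to set up short-circuit access to "
        + file + ": " + reason;
  }
}
{code}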
----
- Can you add a warning log if the shortcircuit flag is enabled, but there is
no path configured? This will be a common case for people upgrading from
HDFS-2246 support to HDFS-347.
----
{code}
+ public static final String DFS_DATANODE_DOMAIN_SOCKET_PATH =
"dfs.datanode.domain.socket.path";
{code}
Rename constant to end in {{_KEY}} like the others. Also, there's an extra
space in this line.
----
{code}
+ this.fileInputStreamCache = new FileInputStreamCache(5, 30000);
{code}
Add config keys for these constants. I would also say the default number of
streams should be higher -- e.g. a seeky workload in a 1GB file with 64MB blocks
would need 16 entries in order to avoid churning the cache.
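One possible shape for this, with made-up key names and defaults (the final names would need to follow the existing {{DFSConfigKeys}} conventions):
{code}
// Hypothetical key names and defaults, for illustration only.
public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
    "dfs.client.read.shortcircuit.streams.cache.size";
public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 32;
public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
    "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 30000L;
...
this.fileInputStreamCache = new FileInputStreamCache(
    conf.getInt(DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY,
        DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT),
    conf.getLong(DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY,
        DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT));
{code}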
----
{code}
- // Don't use the cache on the last attempt - it's possible that there
- // are arbitrarily many unusable sockets in the cache, but we don't
- // want to fail the read.
{code}
Why'd you remove this comment?
----
{code}
+ boolean allowShortCircuitcLocalReads =
{code}
typo: circuitc
----
{code}
+ if (conf.domainSocketPath == null) return null;
+ // UNIX domain sockets can only be used to talk to local peers
+ if (!DFSClient.isLocalAddress(addr)) return null;
+ // If the DomainSocket code is not loaded, we can't create
+ // DomainSocket objects.
+ if (DomainSocket.getLoadingFailureReason() != null) return null;
...
...
+ // If we don't want to pass data over domain sockets, and we don't want
+ // to pass file descriptors over them either, we have no use for domain
+ // sockets.
+ return null;
{code}
Add guarded DEBUG logs in these cases. At DFSClient configuration time, if
domain sockets are supposed to be enabled, and
DomainSocket.getLoadingFailureReason is non-null, you should also WARN at that
point (but only once, not once per read). This will help users be sure they've
got their setup right.
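A minimal warn-once sketch (names are illustrative): a {{compareAndSet}} on a static flag ensures the caller logs exactly one WARN at configuration time, rather than one per read.
{code}
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: the caller does the actual LOG.warn when this returns true.
public class WarnOnce {
  private static final AtomicBoolean warned = new AtomicBoolean(false);

  /** Returns true exactly once, and only if loading actually failed. */
  static boolean shouldWarn(String loadingFailureReason) {
    return loadingFailureReason != null && warned.compareAndSet(false, true);
  }
}
{code}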
----
- FileInputStreamCache is missing a license header. Please add class javadoc.
- Please add some javadoc on the members of this class as well.
----
{code}
+ private final static ScheduledThreadPoolExecutor executor
+ = new ScheduledThreadPoolExecutor(1);
{code}
Indentation. Also, pass a ThreadFactory so that it makes daemon threads with
reasonable names.
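Something like the following (thread name is just a suggestion): daemon threads won't block JVM exit, and a descriptive name makes the expiry thread easy to spot in a jstack.
{code}
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the suggested ThreadFactory for the cache's expiry executor.
public class CacheExecutor {
  private static final AtomicInteger COUNTER = new AtomicInteger(0);

  static final ScheduledThreadPoolExecutor EXECUTOR =
      new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
          Thread t = new Thread(r,
              "FileInputStreamCache-Cleaner-" + COUNTER.incrementAndGet());
          t.setDaemon(true);  // never keep the JVM alive for cache cleanup
          return t;
        }
      });
}
{code}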
----
{code}
+ map.remove(entry.getKey(), entry.getValue());
{code}
This will throw a {{ConcurrentModificationException}}. You need to remove
through the iterator instead. This bug shows up in a few places.
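A minimal illustration of the fix: {{Iterator.remove()}} is safe during iteration, while calling {{Map.remove()}} from inside the loop risks a {{ConcurrentModificationException}} on a plain {{HashMap}}.
{code}
import java.util.Iterator;
import java.util.Map;

// Toy version of the expiry loop: drop entries below a cutoff.
public class IteratorRemove {
  static void expireStale(Map<String, Integer> map, int cutoff) {
    Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Integer> entry = it.next();
      if (entry.getValue() < cutoff) {
        it.remove();  // safe: removes the current entry via the iterator
      }
    }
  }
}
{code}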
----
{code}
+ return (datanodeID.equals(otherKey.datanodeID) &&
(block.equals(otherKey.block)));
{code}
The block {{equals}} function doesn't compare generation stamps, but I think you
should do so here by adding {{block.getGenerationStamp() ==
otherKey.block.getGenerationStamp()}}.
----
A couple of small performance nits on {{FileInputStreamCache.Key}}:
- change {{equals}} to compare blocks first, then datanodes (datanodes usually
won't differ, since there is only one local DN)
- change {{hashCode}} to use only the block's hash code, since datanodes won't differ
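Putting the two points together, a self-contained sketch of the suggested {{Key}} ({{Block}} and the datanode field here are simplified stand-ins, not the real Hadoop classes):
{code}
// Sketch: blocks compared first (including generation stamp), datanode
// last; hashCode uses only the block, since the local DN rarely differs.
public class CacheKey {
  static final class Block {
    final long blockId;
    final long generationStamp;
    Block(long blockId, long generationStamp) {
      this.blockId = blockId;
      this.generationStamp = generationStamp;
    }
  }

  private final Block block;
  private final String datanodeID;

  CacheKey(Block block, String datanodeID) {
    this.block = block;
    this.datanodeID = datanodeID;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof CacheKey)) return false;
    CacheKey other = (CacheKey) o;
    // Blocks almost always differ, so comparing them first usually
    // short-circuits; the generation stamp is compared explicitly.
    return block.blockId == other.block.blockId
        && block.generationStamp == other.block.generationStamp
        && datanodeID.equals(other.datanodeID);
  }

  @Override
  public int hashCode() {
    // Block only: the datanode is effectively constant per client.
    return Long.hashCode(block.blockId)
        ^ Long.hashCode(block.generationStamp);
  }
}
{code}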
----
{code}
+ for (FileInputStream f : fis) {
+ IOUtils.cleanup(LOG, f);
+ }
{code}
You can just use {{IOUtils.cleanup(LOG, fis)}} here, since it takes an array,
no?
----
{code}
+ static final int CURRENT_BLOCK_FORMAT_VERSION = 1;
{code}
Find a better spot for this? Seems like not quite the right place.
----
{code}
+ String domainSocketPath =
+ conf.get(DFSConfigKeys.DFS_DATANODE_DOMAIN_SOCKET_PATH);
+ if (domainSocketPath == null) return null;
{code}
Should WARN if the flag says enabled, but there is no path configured.
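A minimal sketch of that check, returning the warning text so the caller can log it (the wording and the enable-flag parameter are illustrative, not the patch's):
{code}
// Sketch only: enabled-but-unconfigured is the case worth a WARN, since
// it's what an HDFS-2246-era config looks like after upgrading.
public class DomainSocketCheck {
  /** Returns a warning message, or null if the config is consistent. */
  static String warningFor(boolean shortCircuitEnabled, String domainSocketPath) {
    boolean pathMissing =
        (domainSocketPath == null || domainSocketPath.isEmpty());
    if (shortCircuitEnabled && pathMissing) {
      return "short-circuit local reads are enabled, but "
          + "dfs.datanode.domain.socket.path is not configured; "
          + "falling back to reads over TCP";
    }
    return null;
  }
}
{code}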
> BlockReaderLocal should use passed file descriptors rather than paths
> ---------------------------------------------------------------------
>
> Key: HDFS-4356
> URL: https://issues.apache.org/jira/browse/HDFS-4356
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, hdfs-client, performance
> Affects Versions: 2.0.3-alpha
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: 04b-cumulative.patch, _04b.patch, 04-cumulative.patch,
> 04d-cumulative.patch, 04f-cumulative.patch, 04g-cumulative.patch
>
>
> {{BlockReaderLocal}} should use file descriptors passed over UNIX domain
> sockets rather than paths. We also need some configuration options for these
> UNIX domain sockets.