[ 
https://issues.apache.org/jira/browse/HDFS-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549400#comment-13549400
 ] 

Todd Lipcon commented on HDFS-4356:
-----------------------------------

- The patch still adds {{available()}} to {{BlockReader}}, but it's not 
actually used, as best I can tell.

{code}
+      String msg = "access control error while " +
+          "attempting to set up short-circuit access to " +
+          file + resp.getMessage();
+      DFSClient.LOG.error(msg);
+      throw new InvalidBlockTokenException(msg);
{code}
No need to log this here, especially at ERROR level. The exception will get 
logged in the catch clause at the callsite already.

----
- A couple unused imports (eg LoadingCache)

----
{code}
+      DFSClient.LOG.warn("error while attempting to set up short-circuit " +
+          "access to " + file + ": " + resp.getMessage());
{code}
Can you guard this with an {{isDebugEnabled}} check here? It's on the hot path 
for random read.
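A sketch of the guard I have in mind (the {{Log}} interface here is just a stand-in for the commons-logging one; names are illustrative). The point is that the string concatenation for the message is skipped entirely when debug logging is off, which is what matters on the random-read hot path:

```java
public class GuardedLog {
  // Stand-in for org.apache.commons.logging.Log, so this sketch is
  // self-contained.
  public interface Log {
    boolean isDebugEnabled();
    void debug(String msg);
  }

  // The guard means the message string is never built unless debug
  // logging is actually enabled.
  public static void logShortCircuitFallback(Log log, String file,
      String reason) {
    if (log.isDebugEnabled()) {
      log.debug("error while attempting to set up short-circuit access to "
          + file + ": " + reason);
    }
  }
}
```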

----
- Can you add a warning log if the shortcircuit flag is enabled, but there is 
no path configured? This will be a common case for people upgrading from 
HDFS-2246 support to HDFS-347.

----
{code}
+  public static final String  DFS_DATANODE_DOMAIN_SOCKET_PATH = 
"dfs.datanode.domain.socket.path";
{code}

Rename constant to end in {{_KEY}} like the others. Also, there's an extra 
space in this line.

----
{code}
+    this.fileInputStreamCache = new FileInputStreamCache(5, 30000);
{code}

Add config keys for these constants. I would also say the default number of 
streams should be higher -- e.g., a seeky workload in a 1GB file with 64MB 
blocks would need 16 entries in order to avoid churning the cache.
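Something along these lines, with hypothetical key names and defaults (whatever names the patch actually settles on is fine); the point is just to replace the hard-coded {{(5, 30000)}} with configuration:

```java
import java.util.Properties;

public class StreamCacheConf {
  // Hypothetical key names -- illustrative only. The default size is
  // chosen so a seeky workload over ~16 blocks does not churn the cache.
  static final String CACHE_SIZE_KEY =
      "dfs.client.read.shortcircuit.streams.cache.size";
  static final int CACHE_SIZE_DEFAULT = 16;
  static final String CACHE_EXPIRY_MS_KEY =
      "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
  static final long CACHE_EXPIRY_MS_DEFAULT = 30000;

  static int cacheSize(Properties conf) {
    return Integer.parseInt(conf.getProperty(
        CACHE_SIZE_KEY, String.valueOf(CACHE_SIZE_DEFAULT)));
  }

  static long cacheExpiryMs(Properties conf) {
    return Long.parseLong(conf.getProperty(
        CACHE_EXPIRY_MS_KEY, String.valueOf(CACHE_EXPIRY_MS_DEFAULT)));
  }
}
```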

----
{code}
-      // Don't use the cache on the last attempt - it's possible that there
-      // are arbitrarily many unusable sockets in the cache, but we don't
-      // want to fail the read.
{code}

Why'd you remove this comment?

----

{code}
+        boolean allowShortCircuitcLocalReads =
{code}
typo: circuitc

----
{code}
+    if (conf.domainSocketPath == null) return null;
+    // UNIX domain sockets can only be used to talk to local peers
+    if (!DFSClient.isLocalAddress(addr)) return null;
+    // If the DomainSocket code is not loaded, we can't create
+    // DomainSocket objects.
+    if (DomainSocket.getLoadingFailureReason() != null) return null;
...
...
+      // If we don't want to pass data over domain sockets, and we don't want
+      // to pass file descriptors over them either, we have no use for domain
+      // sockets.
+      return null;
{code}

Add guarded DEBUG logs in these cases. At DFSClient configuration time, if 
domain sockets are supposed to be enabled, and 
{{DomainSocket.getLoadingFailureReason()}} is non-null, you should also WARN at 
that point (but only once, not once per read). This will help users be sure 
they've got their setup right.
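A sketch of the warn-once idea ({{loadingFailureReason}} here stands in for {{DomainSocket.getLoadingFailureReason()}}; names are illustrative). The {{AtomicBoolean}} makes the WARN fire once at configuration time instead of once per read:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class DomainSocketCheck {
  // Flips to true the first time we warn, so repeated calls stay quiet.
  private static final AtomicBoolean warned = new AtomicBoolean(false);

  static int warnCount = 0;  // visible for testing only

  public static boolean domainSocketsUsable(boolean enabledInConf,
      String loadingFailureReason) {
    if (!enabledInConf) {
      return false;
    }
    if (loadingFailureReason != null) {
      // compareAndSet ensures exactly one thread emits the warning.
      if (warned.compareAndSet(false, true)) {
        warnCount++;
        System.err.println("WARN: domain sockets are enabled, but cannot "
            + "be used: " + loadingFailureReason);
      }
      return false;
    }
    return true;
  }
}
```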

----
- FileInputStreamCache is missing a license. Also please add javadoc.
- Please add some javadoc on the members in this class as well.

----
{code}
+  private final static ScheduledThreadPoolExecutor executor
+  = new ScheduledThreadPoolExecutor(1);
{code}

Indentation. Also, pass a ThreadFactory so that it makes daemon threads with 
reasonable names.
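Something like this (thread name is illustrative): with a {{ThreadFactory}}, the executor's workers are daemons with recognizable names, instead of the default non-daemon "pool-N-thread-M" threads that keep the client JVM alive:

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheCleanerExecutor {
  private static final ScheduledThreadPoolExecutor EXECUTOR =
      new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
        private final AtomicInteger counter = new AtomicInteger(0);
        @Override
        public Thread newThread(Runnable r) {
          Thread t = new Thread(r,
              "FileInputStreamCache-cleaner-" + counter.incrementAndGet());
          t.setDaemon(true);  // do not keep the client JVM alive
          return t;
        }
      });

  public static ScheduledThreadPoolExecutor get() {
    return EXECUTOR;
  }
}
```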

----

{code}
+          map.remove(entry.getKey(), entry.getValue());
{code}

This will throw a ConcurrentModificationException. You need to remove through 
the iterator ({{Iterator.remove()}}). This bug shows up in a few places.
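A sketch of the fix, with a simplified map (the real entries are streams, but the pattern is the same): mutate the map through the iterator, since calling {{map.remove(...)}} directly while iterating a HashMap's entry set makes the next iterator step throw:

```java
import java.util.Iterator;
import java.util.Map;

public class ExpiryLoop {
  // Removes every entry whose value is older than the cutoff, using
  // Iterator.remove() so the structural modification is legal mid-loop.
  public static int expireOlderThan(Map<String, Long> map, long cutoff) {
    int removed = 0;
    Iterator<Map.Entry<String, Long>> iter = map.entrySet().iterator();
    while (iter.hasNext()) {
      Map.Entry<String, Long> entry = iter.next();
      if (entry.getValue() < cutoff) {
        iter.remove();  // safe; map.remove() here would throw CME
        removed++;
      }
    }
    return removed;
  }
}
```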

----
{code}
+      return (datanodeID.equals(otherKey.datanodeID) && 
(block.equals(otherKey.block)));
{code}

The block's equals function doesn't compare generation stamps, but I think you 
should do so here by adding {{block.getGenerationStamp() == 
otherKey.block.getGenerationStamp()}}.

----
Minor performance nits on {{FileInputStreamCache.Key}}:
- change {{equals}} to compare blocks first, then datanodes (datanodes usually 
won't differ, since there is only one local DN)
- change {{hashCode}} to only use the block's hashcode, since datanodes won't 
differ
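Putting the pieces together, a hypothetical flattened version of the key (field names are illustrative, and the block's id/genstamp are inlined as longs to keep the sketch self-contained): compare the block before the datanode, include the generation stamp that {{Block.equals()}} skips, and base {{hashCode}} on the block alone:

```java
public class StreamCacheKey {
  final String datanodeID;
  final long blockId;
  final long generationStamp;

  StreamCacheKey(String datanodeID, long blockId, long generationStamp) {
    this.datanodeID = datanodeID;
    this.blockId = blockId;
    this.generationStamp = generationStamp;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof StreamCacheKey)) {
      return false;
    }
    StreamCacheKey other = (StreamCacheKey) o;
    // Block fields first: they are the ones that actually discriminate,
    // since there is normally only one local datanode.
    return blockId == other.blockId
        && generationStamp == other.generationStamp
        && datanodeID.equals(other.datanodeID);
  }

  @Override
  public int hashCode() {
    // Only the block contributes; mixing in the (constant) local
    // datanode buys nothing.
    return (int) (blockId ^ (blockId >>> 32));
  }
}
```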

----
{code}
+        for (FileInputStream f : fis) {
+          IOUtils.cleanup(LOG, f);
+        }
{code}

You can just use {{IOUtils.cleanup(LOG, fis)}} here, since it takes an array, 
no?

----
{code}
+  static final int CURRENT_BLOCK_FORMAT_VERSION = 1;
{code}

Find a better spot for this? Seems like not quite the right place.

----
{code}
+    String domainSocketPath =
+        conf.get(DFSConfigKeys.DFS_DATANODE_DOMAIN_SOCKET_PATH);
+    if (domainSocketPath == null) return null;
{code}

This should WARN if the flag says short-circuit is enabled, but there is no 
path configured.
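For concreteness, a hedged sketch of the check (using a plain {{Properties}} in place of the Hadoop {{Configuration}}; the short-circuit key name is illustrative): when the flag is on but no socket path is set -- the common case when upgrading from HDFS-2246-style local reads -- warn instead of silently falling back to remote reads:

```java
import java.util.Properties;

public class DomainSocketPathConf {
  static final String SHORT_CIRCUIT_KEY = "dfs.client.read.shortcircuit";
  static final String SOCKET_PATH_KEY = "dfs.datanode.domain.socket.path";

  static String lastWarning = null;  // visible for testing only

  public static String getDomainSocketPath(Properties conf) {
    String path = conf.getProperty(SOCKET_PATH_KEY);
    if (path == null) {
      // Only worth warning about when the user asked for short-circuit.
      if (Boolean.parseBoolean(conf.getProperty(SHORT_CIRCUIT_KEY, "false"))) {
        lastWarning = "WARN: " + SHORT_CIRCUIT_KEY + " is enabled, but "
            + SOCKET_PATH_KEY + " is not set; short-circuit reads will "
            + "be disabled";
        System.err.println(lastWarning);
      }
      return null;
    }
    return path;
  }
}
```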
                
> BlockReaderLocal should use passed file descriptors rather than paths
> ---------------------------------------------------------------------
>
>                 Key: HDFS-4356
>                 URL: https://issues.apache.org/jira/browse/HDFS-4356
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, hdfs-client, performance
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: 04b-cumulative.patch, _04b.patch, 04-cumulative.patch, 
> 04d-cumulative.patch, 04f-cumulative.patch, 04g-cumulative.patch
>
>
> {{BlockReaderLocal}} should use file descriptors passed over UNIX domain 
> sockets rather than paths.  We also need some configuration options for these 
> UNIX domain sockets.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
