[
https://issues.apache.org/jira/browse/HADOOP-6097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745027#action_12745027
]
Vladimir Klimontovich commented on HADOOP-6097:
-----------------------------------------------
The patch contains:
+    /** Hadoop Archive connections cannot be cached by (scheme, authority,
+     * username) -- the path is significant as well. Come to think of it, they
+     * probably don't need to be cached at all, since they wrap another
+     * connection that does the actual networking.
+     */
+    if (scheme.equals("har")) {
+      return createFileSystem(uri, conf);
+    }
+
     return CACHE.get(uri, conf);
   }
I don't think hard-coding the "har" scheme like this is a good approach.
I'd suggest introducing a property fs.[fs-name].impl.disable.cache, and
bypassing the cache if this property exists and is set to true.
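A minimal standalone sketch of how such a per-scheme lookup might work. The Configuration class below is a simplified stand-in, not the real Hadoop API, and the key format fs.<scheme>.impl.disable.cache follows the naming proposed above:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Hadoop's Configuration; the class and method
// names here are illustrative assumptions, not the actual Hadoop API.
class Configuration {
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) { props.put(key, value); }

    boolean getBoolean(String key, boolean defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }
}

public class FileSystemGet {
    // Consult fs.<scheme>.impl.disable.cache before touching the cache,
    // instead of hard-coding the "har" scheme.
    static boolean cacheDisabled(String scheme, Configuration conf) {
        String key = "fs." + scheme + ".impl.disable.cache";
        return conf.getBoolean(key, false);
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("fs.har.impl.disable.cache", "true");

        System.out.println(cacheDisabled("har", conf));  // bypass the cache
        System.out.println(cacheDisabled("hdfs", conf)); // use the cache
    }
}
```

This keeps FileSystem.get() free of scheme-specific special cases and lets any filesystem opt out of caching via configuration alone.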
> Multiple bugs w/ Hadoop archives
> --------------------------------
>
> Key: HADOOP-6097
> URL: https://issues.apache.org/jira/browse/HADOOP-6097
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
> Reporter: Ben Slusky
> Fix For: 0.20.1
>
> Attachments: HADOOP-6097.patch
>
>
> Found and fixed several bugs involving Hadoop archives:
> - In makeQualified(), the sloppy conversion from Path to URI and back mangles
> the path if it contains an escape-worthy character.
> - It's possible that fileStatusInIndex() may have to read more than one
> segment of the index. The LineReader and count of bytes read need to be reset
> for each block.
> - har:// connections cannot be indexed by (scheme, authority, username) --
> the path is significant as well. Caching them in this way limits a hadoop
> client to opening one archive per filesystem. It seems to be safe not to
> cache them, since they wrap another connection that does the actual
> networking.
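The makeQualified() mangling described in the first bullet can be reproduced with plain java.net.URI; the path below is made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathRoundTrip {
    public static void main(String[] args) throws URISyntaxException {
        // A path containing an escape-worthy character ('%', encoded as %25).
        String raw = "/data/a%25b";

        // Sloppy conversion: the single-argument constructor treats the
        // string as already encoded, so getPath() decodes "%25" to "%".
        URI naive = new URI(raw);
        System.out.println(naive.getPath()); // /data/a%b  -- mangled

        // Careful conversion: the multi-argument constructor escapes the
        // raw path itself, so the round trip preserves it.
        URI safe = new URI(null, null, raw, null);
        System.out.println(safe.getPath()); // /data/a%25b -- preserved
    }
}
```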
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.