[ https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414041#comment-13414041 ]
Todd Lipcon commented on HDFS-3626:
-----------------------------------

bq. Shouldn't the normalization just occur on the server side?

Well, we already normalize the Path(String) constructor, which is the much more commonly used one throughout all the code I've seen. So I think we should be consistent between the URI and String constructors.

bq. Shouldn't the normalization just occur on the server side?

I don't think so, because on the client side you at least need to resolve things like relative paths from the home directory. Or, eg, if you have a viewfs, you need to be able to handle normalization of "/mount1/../mount2/foo" on the client side in order to know that you need to go to mount2 and not mount1's NN.

My other concern with doing normalization on the server side has to do with things like lease renewal. We have to be very careful that the leases are tracked with the correct (canonical) strings and then renewed with the same strings. Of course, if we normalize consistently everywhere, this will work, but I wanted to just fix this bug here rather than try to overhaul where we do normalization.

Given the above, and given that DFSUtil _already_ checks for non-canonical elements like "../" in the path, I think this current approach is the most self-consistent, and if we want to move the normalization to server-side, we should do it separately from the bugfix. If there's a bug with "../" in symlinks, then I think that bug already exists and wouldn't be introduced by this patch, right?
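
To make the viewfs argument above concrete, here is a minimal sketch of client-side normalization followed by a mount-point lookup. It is plain Java and does not use Hadoop's Path, DFSUtil, or viewfs classes; the class name, both methods, and the NameNode addresses are hypothetical and only illustrate why "/mount1/../mount2/foo" has to be normalized before the client can pick a NameNode.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: NOT how Hadoop's Path or viewfs is implemented.
class ClientSideNormalization {

    /** Drops empty and "." components and resolves ".." in an absolute path. */
    static String normalize(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path: " + path);
        }
        Deque<String> parts = new ArrayDeque<>();
        for (String part : path.split("/")) {
            if (part.isEmpty() || part.equals(".")) {
                continue;            // "//" and "/./" contribute nothing
            }
            if (part.equals("..")) {
                parts.pollLast();    // step up one level; ".." at the root stays at the root
            } else {
                parts.addLast(part);
            }
        }
        return "/" + String.join("/", parts);
    }

    /** Returns the target of the first mount point containing the normalized path, or null. */
    static String resolveMount(Map<String, String> mounts, String path) {
        String normalized = normalize(path);
        for (Map.Entry<String, String> entry : mounts.entrySet()) {
            String mount = entry.getKey();
            if (normalized.equals(mount) || normalized.startsWith(mount + "/")) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> mounts = new LinkedHashMap<>();
        mounts.put("/mount1", "hdfs://nn1:8020");  // hypothetical NameNode addresses
        mounts.put("/mount2", "hdfs://nn2:8020");

        System.out.println(normalize("//path/file"));                       // prints /path/file
        System.out.println(resolveMount(mounts, "/mount1/../mount2/foo"));  // prints hdfs://nn2:8020
    }
}
```

Without the normalize step, a naive prefix match on the raw string would send "/mount1/../mount2/foo" to mount1's NameNode, which is the wrong one.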

> Creating file with invalid path can corrupt edit log
> ----------------------------------------------------
>
>                 Key: HDFS-3626
>                 URL: https://issues.apache.org/jira/browse/HDFS-3626
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-3626.txt, hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - hdfs://localhost:8020//path/file
> After this, restarting the namenode will result into the following fatal exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_0000000000000000173-0000000000000000188 expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation MkdirOp [length=0, path=/, timestamp=1341915658216, permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}