[ https://issues.apache.org/jira/browse/HDFS-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bob Hansen resolved HDFS-10574.
-------------------------------
    Resolution: Invalid

Ah, it appears that my test cluster was running an old version of HDFS. My reproducer also succeeds on trunk.

Thanks, [~yuanbo], for looking into it and setting me straight. I apologize for adding to the noise floor.

http://izquotes.com/quotes-pictures/quote-the-boy-cried-wolf-wolf-and-the-villagers-came-out-to-help-him-aesop-205890.jpg

> webhdfs fails with filenames including semicolons
> -------------------------------------------------
>
>                 Key: HDFS-10574
>                 URL: https://issues.apache.org/jira/browse/HDFS-10574
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.7.0
>            Reporter: Bob Hansen
>
> Via webhdfs or native HDFS, we can create files with semicolons in their names:
> {code}
> bhansen@::1 /tmp$ hdfs dfs -copyFromLocal /tmp/data "webhdfs://localhost:50070/foo;bar"
> bhansen@::1 /tmp$ hadoop fs -ls /
> Found 1 items
> -rw-r--r--   2 bhansen supergroup          9 2016-06-24 12:20 /foo;bar
> {code}
> Attempting to fetch the file via webhdfs fails:
> {code}
> bhansen@::1 /tmp$ curl -L "http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"
> {"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File does not exist: /foo\n\tat
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)\n\tat
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)\n\tat
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1891)\n\tat
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1832)\n\tat
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1812)\n\tat
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1784)\n\tat
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:542)\n\tat
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:362)\n\tat
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)\n\tat
> org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)\n\tat
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)\n\tat
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)\n\tat
> java.security.AccessController.doPrivileged(Native Method)\n\tat
> javax.security.auth.Subject.doAs(Subject.java:422)\n\tat
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)\n\tat
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)\n"}}
> {code}
> It appears (from the attached TCP dump in curl_request.txt) that the namenode's redirect unescapes the semicolon, and the DataNode's HTTP server is splitting the request at the semicolon, and failing to find the file "foo".
> Interesting side notes:
> * In the attached dfs_copyfrom_local_traffic.txt, you can see the copyFromLocal command writing the data to "foo;bar_COPYING_", which is then redirected and just writes to "foo". The subsequent rename attempts to rename "foo;bar_COPYING_" to "foo;bar", but has the same parsing bug so effectively renames "foo" to "foo;bar".
> Here is the full range of special characters that we initially started with that led to the minimal reproducer above:
> {code}
> hdfs dfs -copyFromLocal /tmp/data webhdfs://localhost:50070/'~`!@#$%^& ()-_=+|<.>]}",\\\[\{\*\?\;'\''data'
> curl -L "http://localhost:50070/webhdfs/v1/%7E%60%21%40%23%24%25%5E%26+%28%29-_%3D%2B%7C%3C.%3E%5D%7D%22%2C%5C%5B%7B*%3F%3B%27data?user.name=bhansen&op=OPEN&offset=0"
> {code}
> Thanks to [~anatoli.shein] for making a concise reproducer.
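
For readers unfamiliar with the failure mode hypothesized in the report (a redirect that unescapes "%3B", followed by a server that splits the path at ";"), here is a minimal sketch in Python. It is not Hadoop code and does not reproduce anything on a current HDFS build: the helper name hdfs_path, the WEBHDFS_PREFIX constant, and the use of unquote() on the whole URL as a stand-in for the redirect are all assumptions made for illustration. It relies only on the standard library's urlparse, which treats ";" in the last path segment of an http URL as the start of path parameters.

```python
from urllib.parse import quote, unquote, urlparse

# Assumed REST prefix, taken from the URLs in the report.
WEBHDFS_PREFIX = "/webhdfs/v1"


def hdfs_path(url):
    """Hypothetical helper: the file path a ';'-splitting server would look up.

    urlparse() moves anything after ';' in the last path segment into
    .params, so an unescaped semicolon silently truncates .path.
    """
    path = urlparse(url).path
    if path.startswith(WEBHDFS_PREFIX):
        path = path[len(WEBHDFS_PREFIX):]
    return unquote(path)


# The request as issued by curl in the report: the semicolon is escaped.
escaped = "http://localhost:50070/webhdfs/v1/foo%3Bbar?user.name=bhansen&op=OPEN"

# Stand-in for a redirect that unescapes the URL before forwarding it.
unescaped = unquote(escaped)

print(hdfs_path(escaped))    # path is intact while ';' stays escaped
print(hdfs_path(unescaped))  # path is cut at ';' once it is unescaped

# Escaping the file name with no "safe" characters keeps ';' encoded.
print(quote("foo;bar", safe=""))
```

With the semicolon still escaped, the helper recovers "/foo;bar"; once it is unescaped, the same parsing yields "/foo", which matches the "File does not exist: /foo" message quoted above.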