[ https://issues.apache.org/jira/browse/KNOX-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052853#comment-16052853 ]
ASF subversion and git services commented on KNOX-949:
------------------------------------------------------

Commit 680e8a6e4faf6b65dcb81a61525faedb0e937f09 in knox's branch refs/heads/master from [~lmccay]
[ https://git-wip-us.apache.org/repos/asf?p=knox.git;h=680e8a6 ]

KNOX-949 - WeBHDFS proxy replaces %20 encoded spaces in URL with + encoding

> WeBHDFS proxy replaces %20 encoded spaces in URL with + encoding
> ----------------------------------------------------------------
>
> Key: KNOX-949
> URL: https://issues.apache.org/jira/browse/KNOX-949
> Project: Apache Knox
> Issue Type: Bug
> Affects Versions: 0.11.0
> Reporter: Alex Willmer
> Assignee: Larry McCay
> Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: knox-0.13-with-KNOX-949-001-patch.log, KNOX-949-001.patch
>
>
> If a file with spaces in the name (e.g. {{foo bar.txt}}) is requested from
> HDFS, through WebHDFS and Knox - then Knox rewrites the {{%20}} encoding in
> the URL sent by the client, with {{+}} encoding (e.g. {{foo%20bar.txt}} ->
> {{foo+bar.txt}}). This results in an HTTP 404 being returned by WebHDFS, and
> hence by Knox. Requesting the same file directly from WebHDFS works.
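The difference between the two encodings in the description can be reproduced outside Knox. The following is a minimal sketch using Python's standard `urllib.parse`, chosen only for illustration; it is not Knox's Java code or the committed fix, and the variable names are hypothetical. It shows that path-style encoding writes a space as `%20`, form-style encoding writes it as `+`, and that a `+` in a path decodes to a literal plus sign, which is consistent with WebHDFS looking for a file named `filename+with+spaces.pdf`.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical file name, taken from the example in the issue description.
name = "foo bar.txt"

# Path-style percent-encoding: a space becomes %20.
path_encoded = quote(name)

# Form-style (application/x-www-form-urlencoded) encoding: a space becomes +.
form_encoded = quote_plus(name)

# Path decoding leaves "+" untouched, so the form-encoded name
# decodes to a different file name than the one the client asked for.
print(path_encoded, "->", repr(unquote(path_encoded)))
print(form_encoded, "->", repr(unquote(form_encoded)))

# Only form-style decoding turns "+" back into a space.
print(form_encoded, "->", repr(unquote_plus(form_encoded)))
```

A `+` is only equivalent to a space where form-style decoding is applied, which is typically the query string; in the path component the two encodings are not interchangeable.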
> Example
> Client request
> {noformat}
> curl "https://<hostname>:18443/gateway/<cluster>/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" \
>   -u <username>:<password> -k -s
> {noformat}
> Knox response body
> {noformat}
> {"exception":"FileNotFoundException",
> "javaClassName":"java.io.FileNotFoundException",
> "message":"File /docs/filename+with+spaces.pdf not found."}
> {noformat}
> Knox logs
> {noformat}
> ==> /var/log/hadoop/knox/gateway-audit.log <==
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS||||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|unavailable|Request method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Groups: []
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authorization|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|unavailable|Request method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|success|Response status: 404
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Response status: 404
> ==> /var/log/hadoop/knox/gateway.log <==
> 2017-05-24 15:51:05,254 INFO hadoop.gateway (KnoxLdapRealm.java:getUserDn(691)) - Computed userDn: uid=<username>,cn=users,cn=accounts,dc=<cluster> using dnTemplate for principal: <username>
> 2017-05-24 15:51:05,259 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
> {noformat}
> Direct WebHDFS request for the same file
> {noformat}
> # curl -si -u: "http://<namenode>:50070/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" --negotiate -L | head -n40
> HTTP/1.1 401 Authentication required
> Cache-Control: must-revalidate,no-cache,no-store
> Date: Wed, 24 May 2017 19:01:41 GMT
> Pragma: no-cache
> Date: Wed, 24 May 2017 19:01:41 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> WWW-Authenticate: Negotiate
> Set-Cookie: hadoop.auth=; Path=/; HttpOnly
> Content-Type: text/html; charset=iso-8859-1
> Content-Length: 1533
> Server: Jetty(6.1.26.hwx)
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Wed, 24 May 2017 19:01:42 GMT
> Date: Wed, 24 May 2017 19:01:42 GMT
> Pragma: no-cache
> Expires: Wed, 24 May 2017 19:01:42 GMT
> Date: Wed, 24 May 2017 19:01:42 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> WWW-Authenticate: Negotiate YGkGCSqGSIb3EgECAgIAb1owWKADAgEFoQMCAQ+iTDBKoAMCARKiQwRBQM/auuLcl2xey6wMp6EjCPJFSqK3snscxMzW7RvfgxOo7182GzD5N9jf+OWGr+tjpvlRX0c/7iTBfYKSetf4ekU=
> Set-Cookie: hadoop.auth="u=admin&p=admin@CYSAFA&t=kerberos&e=1495688502002&s=b7p35TgaxItAUTkKJuSXuynoq9E="; Path=/; HttpOnly
> Content-Type: application/octet-stream
> Location: http://<datanode3>:1022/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN&delegation=HgAFYWRtaW4FYWRtaW4AigFcO9YJ8ooBXF_ijfJFAxSBYFUnsXY3up11ZNIi4hIi__5RvRJXRUJIREZTIGRlbGVnYXRpb24PMTcyLjE4LjAuOTo4MDIw&namenoderpcaddress=<namenode>:8020&offset=0
> Content-Length: 0
> Server: Jetty(6.1.26.hwx)
> HTTP/1.1 200 OK
> Access-Control-Allow-Methods: GET
> Access-Control-Allow-Origin: *
> Content-Type: application/octet-stream
> Connection: close
> Content-Length: 13365618
>
> %����1.6
> <</Filter/FlateDecode/First 157/Length 5350/N 16/Type/ObjStm>>stream
> ...
> {noformat}
> See also
> - http://mail-archives.apache.org/mod_mbox/knox-user/201705.mbox/%3C335C4DD06CF6C24EAA7A73F44D43D7CB4E6EB300%40SE-EX021.groupinfra.com%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)