[GitHub] [hadoop] steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus scans for directories-only still does a HEAD.

2019-10-14 Thread GitBox
steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus 
scans for directories-only still does a HEAD.
URL: https://github.com/apache/hadoop/pull/1601#issuecomment-541796750
 
 
   thx
   -merged


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus scans for directories-only still does a HEAD.

2019-10-11 Thread GitBox
steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus 
scans for directories-only still does a HEAD.
URL: https://github.com/apache/hadoop/pull/1601#issuecomment-541059565
 
 
   Updated the docs. The only place we don't do the HEAD and dir marker probes 
is in create().
   
   Now, can you create a Path with a trailing "/"? I was about to say no, but 
then remembered https://issues.apache.org/jira/browse/HADOOP-15430: one of the 
constructors of Path does let you get away with it, which is something that 
already breaks S3Guard.
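
To illustrate why a trailing "/" on a key is awkward, here is a minimal sketch in plain Java. It does not use the Hadoop `Path` class or any S3A code; `ProbeKeys`, `normalize` and `naiveProbeKeys` are hypothetical names invented for this example, and the assumption is simply that a status probe looks at the key itself and then at the key with "/" appended as a directory marker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the S3A or S3Guard implementation.
class ProbeKeys {

    /** Drop trailing slashes so that "a/b/" and "a/b" map to the same key. */
    static String normalize(String key) {
        int end = key.length();
        while (end > 0 && key.charAt(end - 1) == '/') {
            end--;
        }
        return key.substring(0, end);
    }

    /**
     * Keys a naive probe would touch: the object itself, then the
     * directory marker formed by appending "/" (an assumption of this sketch).
     */
    static List<String> naiveProbeKeys(String key) {
        List<String> probes = new ArrayList<>();
        probes.add(key);        // HEAD on the object
        probes.add(key + "/");  // HEAD on the directory marker
        return probes;
    }
}
```

With the key "a/b/" the naive probes are "a/b/" and "a/b//", so the file probe lands on the marker key and the marker probe on a key that should never exist. Normalizing first gives "a/b" and "a/b/". Anything that indexes entries by key would likewise see "a/b" and "a/b/" as two different entries unless it normalizes.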





[GitHub] [hadoop] steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus scans for directories-only still does a HEAD.

2019-10-10 Thread GitBox
steveloughran commented on issue #1601: HADOOP-16635. S3A innerGetFileStatus 
scans for directories-only still does a HEAD.
URL: https://github.com/apache/hadoop/pull/1601#issuecomment-540571728
 
 
   Sid, thanks for the comments; I will review and update the patch.
   
   Interesting point about the double list. This code path is how it's always 
been, presumably descended from the s3n code. LIST is slower, costs more and 
is much more prone to eventual consistency, which are all good arguments for 
HEAD first.
   
   I actually plan to tune some of the calls which always seem to get used on 
directory walks (listStatus, listFiles, listLocatedStatus) to do the subtree 
list first, and only go for the HEAD calls if they don't find any children. 
This is to reduce the cost of treewalks where the bias is towards populated 
directories.
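
The trade-off between the two probe orders can be sketched with a toy in-memory store that counts requests. This is only an illustration under stated assumptions: `ToyStore`, `ProbeOrder`, `headFirst` and `listFirst` are hypothetical names, the store is a sorted set of keys rather than S3, and the LIST here reports only keys strictly under the prefix.

```java
import java.util.TreeSet;

// Toy object store that counts requests; not the S3A implementation.
class ToyStore {
    private final TreeSet<String> keys = new TreeSet<>();
    int heads;  // number of HEAD requests issued
    int lists;  // number of LIST requests issued

    void put(String key) {
        keys.add(key);
    }

    /** HEAD: does an object with exactly this key exist? */
    boolean head(String key) {
        heads++;
        return keys.contains(key);
    }

    /** LIST: is there any key strictly longer than, and starting with, the prefix? */
    boolean hasChildren(String prefix) {
        lists++;
        String next = keys.higher(prefix);
        return next != null && next.startsWith(prefix);
    }
}

class ProbeOrder {
    enum Kind { FILE, DIRECTORY, NOT_FOUND }

    /** HEAD first: file probe, then directory marker, then LIST. */
    static Kind headFirst(ToyStore store, String key) {
        if (store.head(key)) {
            return Kind.FILE;
        }
        if (store.head(key + "/")) {
            return Kind.DIRECTORY;
        }
        return store.hasChildren(key + "/") ? Kind.DIRECTORY : Kind.NOT_FOUND;
    }

    /** LIST first: only fall back to the HEAD probes when no children are found. */
    static Kind listFirst(ToyStore store, String key) {
        if (store.hasChildren(key + "/")) {
            return Kind.DIRECTORY;
        }
        if (store.head(key)) {
            return Kind.FILE;
        }
        return store.head(key + "/") ? Kind.DIRECTORY : Kind.NOT_FOUND;
    }
}
```

For a populated directory such as "data" holding "data/part-0000", `headFirst` issues two HEADs and one LIST, while `listFirst` issues a single LIST. For a plain file the balance flips: `headFirst` needs one HEAD, `listFirst` one LIST plus one HEAD. That is why LIST first only pays off on calls biased towards populated directories.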

