[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493671 ]
ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Oct/20 20:12
            Start Date: 01/Oct/20 20:12
    Worklog Time Spent: 10m
      Work Description:
steveloughran commented on pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522

   Looks good. Annoying about the return types which force you to do that
   wrapping/casting. Can't you just forcibly cast the return type of the
   inner iterator? After all, type erasure means all type info will be lost
   in the actual compiled binary. I'd prefer that, as it will give you
   automatic passthrough of the IOStatistics stuff.

   Add text to filesystem.md, something which:
   * specifies that the result is exactly the same as listStatus, provided
     no other caller updates the directory during the list
   * declares that it's not atomic and that performant implementations
     will page
   * and that if a path isn't there, that fact may not surface until
     next/hasNext...that is, we do lazy eval for all file IO

   We need similar new contract tests in AbstractContractGetFileStatusTest
   for all to use:
   * that in a dir with files and subdirectories, you get both returned in
     the listing
   * that you can iterate through with next() to failure as well as
     hasNext/next, and get the same results
   * listStatusIterator(file) returns the file
   * listStatusIterator("/") gives you a listing of root (put that in
     AbstractContractRootDirectoryTest)

   And two for changes partway through the iteration:
   * change the directory during a list to add/delete files
   * delete the actual path

   These tests can't assert on what will happen, and with paged IO aren't
   likely to pick up on changes...they are there just to show it can be
   done and to pick up on any major issues with implementations.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 493671)
    Time Spent: 20m  (was: 10m)

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
>                 Key: HADOOP-17281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17281
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements listStatus() api which returns an
> array. Once we implement the listStatusIterator(), clients can benefit from
> the async listing done recently
> https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks
> on files while iterating them.
>
> CC [~stevel]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
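
One of the contract tests requested in the review comment is that iterating with next() until it fails returns the same entries as the usual hasNext()/next() loop. The following is only a rough sketch of that check, not the test from the pull request. It avoids any Hadoop dependency by declaring a local stand-in interface, RemoteIteratorSketch, with the same two-method shape as org.apache.hadoop.fs.RemoteIterator, and it lists plain strings instead of FileStatus objects. The names RemoteIteratorSketch, ListingIterationSketch, listing, drainWithHasNext and drainToFailure are all hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Local stand-in (assumption) shaped like Hadoop's RemoteIterator. */
interface RemoteIteratorSketch<E> {
  boolean hasNext() throws IOException;

  E next() throws IOException;
}

class ListingIterationSketch {

  /** Hypothetical helper: a fixed, in-memory "directory listing". */
  static RemoteIteratorSketch<String> listing(String... entries) {
    final Iterator<String> inner = Arrays.asList(entries).iterator();
    return new RemoteIteratorSketch<String>() {
      @Override
      public boolean hasNext() {
        return inner.hasNext();
      }

      @Override
      public String next() {
        // Throws NoSuchElementException once the entries are exhausted.
        return inner.next();
      }
    };
  }

  /** Conventional loop: ask hasNext() before every next(). */
  static <E> List<E> drainWithHasNext(RemoteIteratorSketch<E> it)
      throws IOException {
    List<E> out = new ArrayList<>();
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  /** Call next() only, treating NoSuchElementException as end of listing. */
  static <E> List<E> drainToFailure(RemoteIteratorSketch<E> it)
      throws IOException {
    List<E> out = new ArrayList<>();
    try {
      while (true) {
        out.add(it.next());
      }
    } catch (NoSuchElementException expected) {
      // Expected: the iterator has no more entries.
    }
    return out;
  }

  public static void main(String[] args) throws IOException {
    List<String> viaHasNext = drainWithHasNext(listing("file1", "file2", "subdir"));
    List<String> viaFailure = drainToFailure(listing("file1", "file2", "subdir"));
    if (!viaHasNext.equals(viaFailure)) {
      throw new AssertionError("iteration styles returned different listings");
    }
    System.out.println(viaHasNext);
  }
}
```

A real contract test would presumably obtain both iterators from listStatusIterator() on the same directory and compare the resulting entries. As the comment notes, that comparison only holds if no other caller updates the directory while it is being listed.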