[ https://issues.apache.org/jira/browse/HADOOP-15245?focusedWorklogId=732507&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-732507 ]
ASF GitHub Bot logged work on HADOOP-15245:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Feb/22 17:13
            Start Date: 24/Feb/22 17:13
    Worklog Time Spent: 10m
      Work Description: dannycjones commented on a change in pull request #3927:
URL: https://github.com/apache/hadoop/pull/3927#discussion_r814094303

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/statistics/impl/EmptyS3AStatisticsContext.java
##########
@@ -317,6 +322,9 @@ public long getInputPolicy() {
       return 0;
     }
 
+    @Override
+    public long getSkipOperations() { return 0; }

Review comment:
This line introduces a new checkstyle violation. Can you drop the return onto a new line?

```
./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/statistics/impl/EmptyS3AStatisticsContext.java:326: public long getSkipOperations() { return 0; }:37: '{' at column 37 should have line break after. [LeftCurly]
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 732507)
    Time Spent: 1h 10m  (was: 1h)

> S3AInputStream.skip() to use lazy seek
> --------------------------------------
>
>                 Key: HADOOP-15245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15245
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The default skip() does a read and discard of all bytes, no matter how far
> ahead the skip is. This is very inefficient if the skip() is being done on
> S3A random IO, though exactly what to do when in sequential mode is unclear.
> Proposed:
> * add an optimized version of S3AInputStream.skip() which does a lazy seek,
> which itself will decide when to skip() vs issue a new GET.
> * add some more instrumentation to measure how often this gets used

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
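
For illustration only, the proposal in the issue description (a skip() that
does a lazy seek, plus a counter for how often it is used) could be sketched
roughly as below. This is not the code from the pull request and not the real
S3AInputStream. The class LazySkipStream, the forwardSeekLimit threshold, and
the getReopenCount()/getBytesDiscarded() accessors are hypothetical names, and
an in-memory byte array stands in for the remote object. getSkipOperations()
borrows its name from the diff above, but placing it on the stream itself is
an assumption made to keep the sketch self-contained.

```java
import java.io.InputStream;

/** Hypothetical in-memory stand-in for a lazily seeking object-store stream. */
class LazySkipStream extends InputStream {
  private final byte[] object;          // pretend this is the remote object
  private final long forwardSeekLimit;  // assumed threshold for read-and-discard
  private boolean open;                 // true once a simulated GET has been issued
  private long streamPos;               // offset of the simulated open GET
  private long nextReadPos;             // offset the caller expects to read next
  private long skipOperations;          // instrumentation: calls to skip()
  private long reopenCount;             // instrumentation: simulated GET requests
  private long bytesDiscarded;          // instrumentation: bytes read and thrown away

  LazySkipStream(byte[] object, long forwardSeekLimit) {
    this.object = object;
    this.forwardSeekLimit = forwardSeekLimit;
  }

  /** Lazy skip: only records the new target position; no bytes are read. */
  @Override
  public long skip(long n) {
    if (n <= 0) {
      return 0;
    }
    long skipped = Math.min(n, object.length - nextReadPos);
    nextReadPos += skipped;
    skipOperations++;
    return skipped;
  }

  /** The pending seek is resolved only when data is actually requested. */
  @Override
  public int read() {
    if (nextReadPos >= object.length) {
      return -1;
    }
    lazySeek();
    int b = object[(int) streamPos] & 0xff;
    streamPos++;
    nextReadPos++;
    return b;
  }

  /** Decide between discarding bytes on the open stream and a new GET. */
  private void lazySeek() {
    long diff = nextReadPos - streamPos;
    if (!open || diff < 0 || diff > forwardSeekLimit) {
      open = true;
      reopenCount++;            // stands in for issuing a new ranged GET
    } else {
      bytesDiscarded += diff;   // short forward hop: read and discard
    }
    streamPos = nextReadPos;
  }

  public long getSkipOperations() {
    return skipOperations;
  }

  public long getReopenCount() {
    return reopenCount;
  }

  public long getBytesDiscarded() {
    return bytesDiscarded;
  }
}
```

In this sketch a short skip() followed by a read() costs only the discarded
bytes on the already open stream, while a long skip() costs one new simulated
GET and transfers nothing in between. The accessors are written with the
return statement on its own line, which is the layout the LeftCurly check
quoted above asks for.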