[ https://issues.apache.org/jira/browse/HDDS-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bharat Viswanadham resolved HDDS-3223.
--------------------------------------
    Fix Version/s: 0.6.0
       Resolution: Fixed

> Improve s3g read 1GB object efficiency by 100 times
> ----------------------------------------------------
>
>                 Key: HDDS-3223
>                 URL: https://issues.apache.org/jira/browse/HDDS-3223
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.6.0
>
>         Attachments: screenshot-1.png
>
>
> *What's the problem?*
> Reading a 1000M object takes about 470 seconds, i.e. 2.2M/s, which is too slow.
> *What's the reason?*
> Reading a 1000M file takes 50 GET requests, each of which reads 20M.
> For each GET, the call stack is:
> [IOUtils::copyLarge|https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java#L262]
> ->
> [IOUtils::skipFully|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1190]
> ->
> [IOUtils::skip|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L2064]
> ->
> [InputStream::read|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1957].
> This means that the 50th GET request, which should read only 980M-1000M, must
> first skip 0-980M, and it does so by calling
> [InputStream::read|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1957]
> over 0-980M. So the 1st GET request reads 0-20M, the 2nd reads 0-40M, the 3rd
> reads 0-60M, ..., and the 50th reads 0-1000M. As a result, the GET requests
> become slower and slower from the 1st to the 50th.
> For why IOUtils implements skip by reading rather than by a real skip
> (e.g. seek), see [here|https://issues.apache.org/jira/browse/IO-203].
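
The read amplification described above can be reproduced with a small, self-contained sketch. This is not Ozone or commons-io code: RangeReadDemo, CountingStream, consume and totalBytesRead are hypothetical names introduced only for illustration, and sizes are scaled down from M to KB so it runs quickly. It assumes each ranged GET opens a fresh stream at offset 0, and it compares skipping by reading into a scratch buffer with a real seek.

```java
import java.io.InputStream;

/** Hypothetical demo (not Ozone code): bytes read by skip-by-read vs. a real seek. */
class RangeReadDemo {

  /** In-memory stream that counts every byte handed out by read(). */
  static final class CountingStream extends InputStream {
    private final byte[] data;
    private int pos;
    long bytesRead;

    CountingStream(byte[] data) {
      this.data = data;
    }

    @Override
    public int read() {
      if (pos >= data.length) {
        return -1;
      }
      bytesRead++;
      return data[pos++] & 0xff;
    }

    @Override
    public int read(byte[] b, int off, int len) {
      if (len == 0) {
        return 0;
      }
      if (pos >= data.length) {
        return -1;
      }
      int n = Math.min(len, data.length - pos);
      System.arraycopy(data, pos, b, off, n);
      pos += n;
      bytesRead += n;
      return n;
    }

    /** Real seek: moves the position without reading any bytes. */
    void seek(int newPos) {
      pos = newPos;
    }
  }

  /** Pulls count bytes through read(); used both to "skip" and to serve a range. */
  static void consume(CountingStream in, int count) {
    byte[] scratch = new byte[4096];
    int remaining = count;
    while (remaining > 0) {
      int n = in.read(scratch, 0, Math.min(scratch.length, remaining));
      if (n < 0) {
        throw new IllegalStateException("unexpected end of stream");
      }
      remaining -= n;
    }
  }

  /** Total bytes read when the object is fetched as consecutive ranged requests. */
  static long totalBytesRead(int objectSize, int rangeSize, boolean useSeek) {
    byte[] object = new byte[objectSize];
    long total = 0;
    for (int offset = 0; offset < objectSize; offset += rangeSize) {
      CountingStream in = new CountingStream(object); // each request starts at 0
      if (useSeek) {
        in.seek(offset);     // jump straight to the range start
      } else {
        consume(in, offset); // skip-by-read: reads everything before the range
      }
      consume(in, Math.min(rangeSize, objectSize - offset));
      total += in.bytesRead;
    }
    return total;
  }

  public static void main(String[] args) {
    int kb = 1024;
    long bySkip = totalBytesRead(1000 * kb, 20 * kb, false);
    long bySeek = totalBytesRead(1000 * kb, 20 * kb, true);
    System.out.println("skip-by-read: " + bySkip / kb + " KB");
    System.out.println("seek:         " + bySeek / kb + " KB");
  }
}
```

With 50 ranges of 20 units each, skip-by-read pulls 20 * (1 + 2 + ... + 50) = 25,500 units through read(), against 1,000 units with a seek. The sketch only counts bytes; it does not model network or datanode latency, so it says nothing about the wall-clock numbers reported in the issue.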