[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766228#comment-16766228 ]
Michael Osipov commented on WAGON-537:
--------------------------------------

Awesome, thank you very much for retesting. Funnily enough, this change revealed a serious bug in JSch.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus artifact store, 100 MB/s network connection
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>             Fix For: 3.3.0, 3.3.1
>
>         Attachments: wagon-issue.png
>
>
> We are using Maven for build process automation with Docker. This sometimes involves uploading and downloading artifacts a few gigabytes in size. Here, Maven's transfer speed is consistently and reproducibly slow. For instance, an artifact 7.5 GB in size took almost two hours to transfer, despite a 100 MB/s connection with a correspondingly fast, reproducible download speed from the remote Nexus artifact repository when downloading via a browser. The same is true when uploading such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in AbstractWagon's transfer(Resource resource, InputStream input, OutputStream output, int requestType, long maxSize) method used for remote artifacts, and the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data is received, it is pushed to downstream listeners via fireTransferProgress. These listeners (or rather consumers) perform expensive tasks.
> Now, the underlying InputStream implementation used in transfer will return from calls to read(buffer, offset, length) as soon as *some* data is available.
> That is, fireTransferProgress may well be invoked with an average number of bytes less than half the buffer capacity (this varies with the underlying network and hardware architecture). Consequently, fireTransferProgress is invoked *millions of times* for large files. As this is a blocking operation, the time spent in fireTransferProgress dominates and drastically slows down transfers by at least one order of magnitude.
> !wagon-issue.png!
> In our case, we found download speed reduced from a theoretical optimum of ~80 seconds to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers/listeners invoked via fireTransferProgress aware of their potential impact on download speed, but rather refactor the transfer method so that it uses a buffer strategy that reduces the number of fireTransferProgress invocations. This should be done with regard to the expected file size of the transfer, such that fireTransferProgress is invoked often enough but not too frequently.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
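The buffer strategy proposed in the quoted description can be sketched as follows. This is an illustration only, not the actual Wagon fix: the ProgressListener interface and the threshold parameter are hypothetical stand-ins for Wagon's real TransferEventSupport machinery. The idea is to keep reading into the buffer until it is full (or the stream ends) and fire a single progress event per full buffer, instead of one event per partial read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedTransferSketch {

    /** Hypothetical stand-in for Wagon's fireTransferProgress consumer. */
    interface ProgressListener {
        void fireTransferProgress(byte[] data, int length);
    }

    /**
     * Copies in to out, firing at most one progress event per 'threshold'
     * bytes rather than one per read() call. Partial reads are accumulated
     * until the buffer is full, so listeners run far less often.
     */
    static long transfer(InputStream in, OutputStream out,
                         ProgressListener listener, int threshold)
            throws IOException {
        byte[] buffer = new byte[threshold];
        int fill = 0;   // bytes accumulated since the last progress event
        long total = 0;
        int n;
        while ((n = in.read(buffer, fill, buffer.length - fill)) != -1) {
            fill += n;
            total += n;
            if (fill == buffer.length) {      // notify only on a full buffer
                out.write(buffer, 0, fill);
                listener.fireTransferProgress(buffer, fill);
                fill = 0;
            }
        }
        if (fill > 0) {                       // flush and report the remainder
            out.write(buffer, 0, fill);
            listener.fireTransferProgress(buffer, fill);
        }
        return total;
    }
}
```

With a 4 KB threshold and reads that return on average under 2 KB (as described above), this roughly halves or better the number of listener invocations; scaling the threshold with the expected transfer size (the maxSize parameter already passed to transfer) would reduce it by orders of magnitude for multi-gigabyte artifacts.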