[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739361#comment-16739361 ]
Olaf Otto edited comment on WAGON-537 at 1/10/19 12:34 PM:
-----------------------------------------------------------

I have begun testing using Docker on Windows 10 (with Hyper-V). For testing, I ran a local Docker container via

{code}
docker run -i -t -v [local maven installation dir]:/opt/maven -v [local dir with test POM]:/opt/test -v [local .m2 dir]:/root/.m2 openjdk:latest /bin/bash
{code}

Then I performed a curl download of a 4 GB test file from a remote Nexus repository as a reference point. Subsequently, I executed a Maven artifact download using Maven 3.6.0 with and without the patch. Here are the results:

*Reference download via curl:* 11 MB/s
*Download with this patch:* 11 MB/s
*Download without patch:* < 1 MB/s

Moreover, the unpatched version caused massive CPU usage due to the millions of invocations of fireTransferProgress. Thus, when using Docker for Windows, one can see a significant improvement when the remote artifact repository is relatively slow. I will run another test on a Mac to see how this plays out on bare-metal *nix.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>        Environment: Windows 10, JDK 1.8, Nexus artifact store, 100 MB/s network connection
>           Reporter: Olaf Otto
>           Assignee: Michael Osipov
>           Priority: Major
>             Labels: performance
>       Fix For: 3.3.0, 3.3.1
>       Attachments: wagon-issue.png
>
> We are using Maven for build process automation with Docker. This sometimes involves uploading and downloading artifacts a few gigabytes in size. Here, Maven's transfer speed is consistently and reproducibly slow. For instance, an artifact 7.5 GB in size took almost two hours to transfer, in spite of a 100 MB/s connection and a correspondingly fast, reproducible download speed from the remote Nexus artifact repository when using a browser. The same is true when uploading such an artifact.
>
> I have investigated the issue using JProfiler. The result shows an issue in AbstractWagon's transfer( Resource resource, InputStream input, OutputStream output, int requestType, long maxSize ) method used for remote artifacts, and the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
>
> Here, the input stream is read in a loop using a 4 KB buffer.
> Whenever data is received, it is pushed to downstream listeners via fireTransferProgress. These listeners (or rather consumers) perform expensive tasks.
>
> Now, the underlying InputStream implementation used in transfer will return from calls to read(buffer, offset, length) as soon as *some* data is available. That is, fireTransferProgress may well be invoked with an average number of bytes less than half the buffer capacity (this varies with the underlying network and hardware architecture). Consequently, fireTransferProgress is invoked *millions of times* for large files. As this is a blocking operation, the time spent in fireTransferProgress dominates and drastically slows down transfers by at least one order of magnitude (a simplified sketch of this loop follows below).
>
> !wagon-issue.png!
>
> In our case, we found download speed reduced from a theoretical optimum of ~80 seconds to more than 3200 seconds.
>
> From an architectural perspective, I would not want to make the consumers / listeners invoked via fireTransferProgress aware of their potential impact on download speed, but rather refactor the transfer method such that it uses a buffer strategy reducing the number of fireTransferProgress invocations. This should be done with regard to the expected file size of the transfer, such that fireTransferProgress is invoked often enough but not too frequently (a sketch of one such strategy also follows below).
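For illustration, here is a minimal sketch of the read loop described above. The class and method names are simplified stand-ins for this example, not the actual Wagon source:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Simplified sketch of the problematic pattern described in the issue.
class TransferLoopSketch {

    private static final int BUFFER_SIZE = 4096; // fixed 4 KB buffer

    void transfer(InputStream input, OutputStream output) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        int n;
        // read() returns as soon as *some* bytes are available, so n is
        // frequently far below the buffer capacity.
        while ((n = input.read(buffer, 0, buffer.length)) != -1) {
            output.write(buffer, 0, n);
            // Fired once per read: for a multi-gigabyte artifact this happens
            // millions of times, and the listeners perform expensive work.
            fireTransferProgress(buffer, n);
        }
    }

    // Stand-in for Wagon's event machinery, which synchronously notifies all
    // registered transfer listeners with the bytes just read.
    private void fireTransferProgress(byte[] data, int length) {
        // expensive listener work happens here
    }
}
{code}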
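And one possible shape of the proposed size-aware buffer strategy, as a sketch only: it assumes the expected content length is known before the transfer starts, and the constants are illustrative rather than the values chosen for the actual fix in 3.3.0/3.3.1.

{code:java}
// Illustrative sketch, not the actual Wagon 3.3.x implementation: scale the
// buffer with the expected transfer size so that fireTransferProgress is
// invoked often enough for progress reporting, but not once per tiny read.
class BufferStrategySketch {

    private static final int MIN_BUFFER_SIZE = 4 * 1024;   // old fixed size as the floor
    private static final int MAX_BUFFER_SIZE = 512 * 1024; // cap on memory per transfer
    private static final int TARGET_PROGRESS_EVENTS = 100; // hypothetical target event count

    static int bufferSizeFor(long expectedContentLength) {
        if (expectedContentLength <= 0) {
            return MIN_BUFFER_SIZE; // size unknown: fall back to the old default
        }
        long proposed = expectedContentLength / TARGET_PROGRESS_EVENTS;
        return (int) Math.min(MAX_BUFFER_SIZE, Math.max(MIN_BUFFER_SIZE, proposed));
    }
}
{code}

With these example constants, a 7.5 GB download would use the 512 KB cap and fire roughly 15,000 progress events instead of millions, while small artifacts keep the original 4 KB granularity.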