[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676689#comment-16676689 ]
Olaf Otto commented on WAGON-537:
---------------------------------

Hi [~michael-o],

Are you still looking into this? I just saw that I missed one of your questions due to the [~githubbot] spam: I did indeed disable the transfer listeners once, resulting in a more than 10-fold increase in download and upload performance when transferring a 6 GB file over a symmetric 1 Gigabit connection. The performance gain varies with artifact size, remote transfer speed and capacity, and of course computing power.

I have just run a test with a 5.7 GB artifact over a connection that allows ~6-7 MB/s transfer from a Nexus repository; compared to a 1 Gigabit connection, the overhead is partly shifted to network I/O, which reduces the effect of the refactoring. The results are:

*Without the changes:* (5.9 GB at 1.8 MB/s) Total time: 53:24 min
*With these changes:* (5.9 GB at 6.9 MB/s) Total time: 14:28 min

With the patch, the download speed precisely matches the speed I get when using a browser or wget.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store, 100 MB/s network connection
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
>
> We are using Maven for build process automation with Docker. This sometimes involves uploading and downloading artifacts a few gigabytes in size. Here, Maven's transfer speed is consistently and reproducibly slow.
> For instance, an artifact 7.5 GB in size took almost two hours to transfer despite a 100 MB/s connection with a correspondingly reproducible download speed from the remote Nexus artifact repository when downloading via a browser. The same is true when uploading such an artifact.
>
> I have investigated the issue using JProfiler. The result shows an issue in AbstractWagon's transfer(Resource resource, InputStream input, OutputStream output, int requestType, long maxSize) method used for remote artifacts, and the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
>
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data is received, it is pushed to downstream listeners via fireTransferProgress. These listeners (or rather consumers) perform expensive tasks.
>
> Now, the underlying InputStream implementation used in transfer will return from calls to read(buffer, offset, length) as soon as *some* data is available. That is, fireTransferProgress may well be invoked with an average number of bytes less than half the buffer capacity (this varies with the underlying network and hardware architecture). Consequently, fireTransferProgress is invoked *millions of times* for large files. As this is a blocking operation, the time spent in fireTransferProgress dominates and drastically slows down transfers by at least one order of magnitude.
>
> !wagon-issue.png!
>
> In our case, we found download time increased from a theoretical optimum of ~80 seconds to more than 3200 seconds.
>
> From an architectural perspective, I would not want to make the consumers/listeners invoked via fireTransferProgress aware of their potential impact on download speed, but rather refactor the transfer method so that it uses a buffer strategy that reduces the number of fireTransferProgress invocations.
> This should be done with regard to the expected file size of the transfer, such that fireTransferProgress is invoked often enough but not too frequently.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
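The buffer strategy proposed in the quoted description can be sketched roughly as below. This is a minimal illustration under assumed names, not the actual Wagon patch: BufferedTransferSketch, bufferSize, MIN_BUFFER/MAX_BUFFER, and the events counter (standing in for the expensive fireTransferProgress call) are all hypothetical. The two ideas it shows are (1) sizing the buffer from the expected transfer size and (2) firing the progress callback once per *filled* buffer rather than once per short read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedTransferSketch {

    static final int MIN_BUFFER = 4 * 1024;   // the old fixed 4 KB size
    static final int MAX_BUFFER = 512 * 1024; // cap to bound memory use

    // Pick a buffer size proportional to the expected content length, so
    // large artifacts trigger far fewer progress events.
    static int bufferSize(long expectedSize) {
        if (expectedSize <= 0) {
            return MIN_BUFFER; // unknown size: fall back to the small buffer
        }
        long size = expectedSize / 100; // aim for on the order of 100 events
        return (int) Math.max(MIN_BUFFER, Math.min(MAX_BUFFER, size));
    }

    // Copies in to out; returns the number of progress events fired
    // (a stand-in for fireTransferProgress, the expensive call).
    static int transfer(InputStream in, OutputStream out, long expectedSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize(expectedSize)];
        int events = 0;
        int filled = 0;
        int n;
        // Keep reading into the same buffer until it is full (or EOF), so
        // the callback runs once per buffer, not once per partial read.
        while ((n = in.read(buffer, filled, buffer.length - filled)) != -1) {
            filled += n;
            if (filled == buffer.length) {
                out.write(buffer, 0, filled);
                events++; // fireTransferProgress(...) would go here
                filled = 0;
            }
        }
        if (filled > 0) { // flush the final partial buffer
            out.write(buffer, 0, filled);
            events++;
        }
        return events;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1_000_000];
        int events = transfer(new ByteArrayInputStream(data),
                new ByteArrayOutputStream(), data.length);
        System.out.println(events); // prints 100
    }
}
```

With this shape, a 6 GB transfer fires the callback a bounded number of times regardless of how the network fragments the reads, which is exactly the effect the reporter measured when the listeners were disabled.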