[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766228#comment-16766228 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

Awesome, thank you very much for retesting. Funnily, this change revealed a serious bug in JSch.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>             Fix For: 3.3.0, 3.3.1
>
>         Attachments: wagon-issue.png
>
>
> We are using Maven for build process automation with Docker. This sometimes involves uploading and downloading artifacts a few gigabytes in size. Here, Maven's transfer speed is consistently and reproducibly slow. For instance, an artifact 7.5 GB in size took almost two hours to transfer in spite of a 100 MB/s connection and a correspondingly fast, reproducible download speed from the remote Nexus artifact repository when using a browser. The same is true when uploading such an artifact.
>
> I have investigated the issue using JProfiler. The result shows an issue in AbstractWagon's transfer( Resource resource, InputStream input, OutputStream output, int requestType, long maxSize ) method used for remote artifacts and the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
>
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data is received, the received data is pushed to downstream listeners via fireTransferProgress. These listeners (or rather consumers) perform expensive tasks.
>
> Now, the underlying InputStream implementation used in transfer will return from calls to read(buffer, offset, length) as soon as *some* data is available. That is, fireTransferProgress may well be invoked with an average number of bytes less than half the buffer capacity (this varies with the underlying network and hardware architecture). Consequently, fireTransferProgress is invoked *millions of times* for large files. As this is a blocking operation, the time spent in fireTransferProgress dominates and drastically slows down the transfers by at least one order of magnitude.
>
> !wagon-issue.png!
>
> In our case, we found the download time increased from a theoretical optimum of ~80 seconds to more than 3200 seconds.
>
> From an architectural perspective, I would not want to make the consumers / listeners invoked via fireTransferProgress aware of their potential impact on download speed, but rather refactor the transfer method such that it uses a buffer strategy reducing the number of fireTransferProgress invocations. This should be done with regard to the expected file size of the transfer, such that fireTransferProgress is invoked often enough but not too frequently.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
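The buffer strategy asked for in the issue description can be illustrated with a small, self-contained sketch. This is not the patch that went into Wagon: the class name, the ProgressCallback interface and the TrickleInputStream helper are all made up for illustration, and the only idea taken from the description is to keep reading until the buffer is full (or the stream ends) before notifying listeners, so that the number of notifications depends on the buffer size rather than on how many bytes each read() happens to return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class BufferedTransferSketch {

    /** Hypothetical stand-in for a progress listener; not Wagon's TransferListener API. */
    public interface ProgressCallback {
        void progress(byte[] buffer, int length);
    }

    /**
     * Copies in to out. The callback fires once per filled buffer (or once for
     * the final partial buffer), not once per read() call.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize,
                            ProgressCallback callback) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        boolean eof = false;
        while (!eof) {
            int filled = 0;
            // Keep reading until the buffer is full or the stream ends.
            while (filled < buffer.length) {
                int n = in.read(buffer, filled, buffer.length - filled);
                if (n == -1) {
                    eof = true;
                    break;
                }
                filled += n;
            }
            if (filled > 0) {
                out.write(buffer, 0, filled);
                callback.progress(buffer, filled);
                total += filled;
            }
        }
        return total;
    }

    /** Simulates a network stream that returns at most maxChunk bytes per read(). */
    static final class TrickleInputStream extends InputStream {
        private final InputStream delegate;
        private final int maxChunk;

        TrickleInputStream(InputStream delegate, int maxChunk) {
            this.delegate = delegate;
            this.maxChunk = maxChunk;
        }

        @Override
        public int read() throws IOException {
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return delegate.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** Returns how many times the callback fires for the given sizes. */
    public static int countCallbacks(int totalBytes, int chunk, int bufferSize) {
        InputStream in = new TrickleInputStream(
                new ByteArrayInputStream(new byte[totalBytes]), chunk);
        int[] calls = {0};
        try {
            copy(in, new ByteArrayOutputStream(), bufferSize, (buf, len) -> calls[0]++);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls[0];
    }

    public static void main(String[] args) {
        // 1 MiB delivered in chunks of at most 1500 bytes, two buffer sizes.
        System.out.println("4 KiB buffer:  " + countCallbacks(1 << 20, 1500, 4096));
        System.out.println("64 KiB buffer: " + countCallbacks(1 << 20, 1500, 65536));
    }
}
```

The trade-off is that a larger buffer delays each progress notification, which is presumably why the description asks for the buffer to be chosen with regard to the expected file size.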
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765965#comment-16765965 ]

Olaf Otto commented on WAGON-537:
---------------------------------

[~michael-o] It is the exact same situation with uploading artifacts, i.e. this change, at least on Windows / Linux via Docker for Windows, leads to full utilization of the network capacity.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739388#comment-16739388 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

Awesome, that looks like full saturation of the 100 Mbit/s interface. How about upload?
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739361#comment-16739361 ]

Olaf Otto commented on WAGON-537:
---------------------------------

I have begun testing using Docker on Windows 10 (with Hyper-V). For testing, I ran a local Docker container via
{code}
docker run -i -t -v [local maven installation dir]:/opt/maven -v [local dir with test POM]:/opt/test -v [local .m2 dir]:/root/.m2 openjdk:latest /bin/bash
{code}
Then I executed a curl download of a 4 GB test file from a remote Nexus repo as a reference point. Subsequently, I executed a Maven artifact download using Maven 3.6.0 with and without the patch. Here are the results:

*Reference download via curl:* 11 MB/s
*Download with this patch:* 11 MB/s
*Download without patch:* < 1 MB/s

Moreover, the unpatched version caused massive CPU usage due to the millions of invocations of fireTransferProgress. Thus, when using Docker for Windows, one can see a significant improvement even when the remote artifact repo is relatively slow. I will run another test on a Mac to see how this plays out on a bare-metal *nix.
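To put the "millions of invocations" in the comment above into numbers, here is a rough back-of-the-envelope sketch. The figures fed in are assumptions, not measurements: the 4 GB test file is taken as 4,000,000,000 bytes, 2048 bytes per read stands for the "less than half the buffer capacity" case from the issue description, and 1 MB/s is used as an upper bound for the unpatched download rate. The class and method names are invented for this example.

```java
public class ProgressCallEstimate {

    /** Listener invocations if every read() delivers avgBytesPerRead bytes (ceiling division). */
    public static long estimatedCalls(long fileSizeBytes, long avgBytesPerRead) {
        return (fileSizeBytes + avgBytesPerRead - 1) / avgBytesPerRead;
    }

    /** Transfer duration in seconds at a constant rate. */
    public static double transferSeconds(long fileSizeBytes, double bytesPerSecond) {
        return fileSizeBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long fileSize = 4_000_000_000L; // assumed: the 4 GB test file, decimal gigabytes

        // Assumed average read sizes: half-full and completely full 4 KB buffer.
        System.out.println("calls at 2048 B/read: " + estimatedCalls(fileSize, 2048));
        System.out.println("calls at 4096 B/read: " + estimatedCalls(fileSize, 4096));

        // Rates from the comment: 11 MB/s with the patch, below 1 MB/s without.
        System.out.printf("seconds at 11 MB/s: %.0f%n", transferSeconds(fileSize, 11_000_000.0));
        System.out.printf("seconds at 1 MB/s:  %.0f%n", transferSeconds(fileSize, 1_000_000.0));
    }
}
```

Under these assumptions a single 4 GB download triggers roughly one to two million listener invocations, which is consistent with the CPU usage reported for the unpatched version.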
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735547#comment-16735547 ]

Olaf Otto commented on WAGON-537:
---------------------------------

Thanks [~michael-o]! My hypothesis is that when Maven is executed on a *nix derivative (Linux / Mac...), the cost of java.io.PrintStream.print(...) to the console can be significantly lower than on Windows. In that case this change does not contribute much to the download speed, as it is already almost optimal. However, the change may still reduce CPU usage due to the reduced number of calls. I'll try to test this on a *nix machine as soon as I have some time.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735044#comment-16735044 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

[~o.otto], the regression is actually not in your code, but a bug in JSch. My tests ran on a Windows 10, Intel Skylake machine against an old 4-thread Atom machine running FreeBSD 11.2-RELEASE. I can retry at work with my Windows 7 box against an older Xeon-based FreeBSD server. In both cases, server and client are physically very close.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735027#comment-16735027 ]

Olaf Otto commented on WAGON-537:
---------------------------------

Hi [~dantran], thanks a lot for trying this out. So in summary, your tests show no difference between curl, the current Maven release version and the version including this change, correct? If so, you are in a situation where the current implementation already provides the optimal result. This means the buffer in the copy loop is used efficiently and, most importantly, execution of fireTransferProgress is *very* fast. So at least we know that the change does not reduce the transfer speed in this situation.

What OS are you on? [~michael-o], same question to you. I've tested Windows 10 so far. I'd assume the cost of e.g. console output might vary greatly.

Regards,
Olaf
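Whether "the buffer in the copy loop is used efficiently" on a given machine can be checked by measuring how many bytes each read() call actually returns. The following is only a sketch of such a probe: ReadSizeProbe and its drain helper are hypothetical names, not part of Wagon, and the class relies solely on the standard java.io.FilterInputStream wrapper.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Counts read() calls and bytes so the average buffer fill can be inspected. */
public class ReadSizeProbe extends FilterInputStream {

    private long calls;
    private long bytes;

    public ReadSizeProbe(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            calls++;
            bytes++;
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            calls++;
            bytes += n;
        }
        return n;
    }

    public long calls() {
        return calls;
    }

    public long bytes() {
        return bytes;
    }

    /** Average bytes returned per successful read, or 0.0 if nothing was read. */
    public double averageBytesPerRead() {
        return calls == 0 ? 0.0 : (double) bytes / calls;
    }

    /** Reads the source to the end with the given buffer size and returns the probe. */
    public static ReadSizeProbe drain(InputStream source, int bufferSize) {
        ReadSizeProbe probe = new ReadSizeProbe(source);
        byte[] buffer = new byte[bufferSize];
        try {
            while (probe.read(buffer, 0, buffer.length) != -1) {
                // discard the data; only the read sizes matter here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return probe;
    }

    public static void main(String[] args) {
        // An in-memory stream always fills the buffer; a network stream may not.
        ReadSizeProbe probe = drain(new ByteArrayInputStream(new byte[10000]), 4096);
        System.out.println(probe.calls() + " reads, avg "
                + probe.averageBytesPerRead() + " bytes per read");
    }
}
```

Wrapping the stream of a real download in such a probe would show an average close to the buffer size where the loop is efficient, and a much smaller average where reads return early.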
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735030#comment-16735030 ]

Olaf Otto commented on WAGON-537:
---------------------------------

Hi [~michael-o], regarding your pointer to the mailing list ([http://mail-archives.apache.org/mod_mbox/maven-dev/201812.mbox/%3CCAPCjjnGSpwPA3%2BaTgy2MHfY7wMLK6_v1ZHBi0RWP-XnjJd5r2w%40mail.gmail.com%3E]), I take it the regression (WAGON-543) is resolved? [~dantran] do you still recommend trying to isolate this change to the http implementation?
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730818#comment-16730818 ]

Dan Tran commented on WAGON-537:
--------------------------------

To test Wagon 3.3.0 download speed, I built maven-3.6.1-snapshot with the latest Wagon and ran mvn dependency:get to download a single 7G+ file. I also tested with another Artifactory mirror over a WAN link; the download speed for mvn 3.6, 3.6.1, and curl is the same: ~60M/s.

Note: the network traffic during the holiday is very quiet (i.e. no build/test).

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>             Fix For: 3.3.0
>
>         Attachments: wagon-issue.png
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730777#comment-16730777 ]
Michael Osipov commented on WAGON-537:
How can you see an improvement if curl already fully saturates your gigabit link? What does the download speed look like with Wagon 3.3.0?
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730513#comment-16730513 ]
Dan Tran commented on WAGON-537:
I see no gain in my download. Basic single download test with curl:
curl -O
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6779M  100 6779M    0     0   301M      0  0:00:22  0:00:22 --:--:--  343M
Maven 3.6.1 with Wagon 3.3 is around 115M/s.
Maven 3.6.0 with Wagon 3.2 is around 115M/s.
No improvement between 3.6.0 and 3.6.1. I did check to make sure wagon-3.3 landed in my local maven-3.6.1 build.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730483#comment-16730483 ]
Michael Osipov commented on WAGON-537:
[~o.otto], can you have a look at [this](http://mail-archives.apache.org/mod_mbox/maven-dev/201812.mbox/%3CCAPCjjnGSpwPA3%2BaTgy2MHfY7wMLK6_v1ZHBi0RWP-XnjJd5r2w%40mail.gmail.com%3E) message and tell me what you think?
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730466#comment-16730466 ]
Olaf Otto commented on WAGON-537:
Hi [~dantran], the impact of the change greatly depends on the network capabilities and your local setup (i.e., how much time is used by fireTransferProgress(...)). To assess whether Wagon uses the available capacity efficiently, I recommend uploading with curl for comparison (e.g. https://issues.apache.org/jira/browse/WAGON-537?focusedCommentId=16687105&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16687105). If you had the time to do so, I'd greatly appreciate it. Should curl be faster, I'd like to investigate further.
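To make the dependence on listener cost concrete, here is a minimal Java sketch of the general idea discussed in this issue: copy the stream as usual, but notify the progress listener only after a threshold of bytes has accumulated. It is not Wagon's actual code. The class `BatchedProgressCopy`, the methods `copy` and `shortReads`, the `LongConsumer` listener, and the threshold values are all hypothetical names and numbers chosen for illustration; `shortReads` merely simulates a network stream whose read(buffer, offset, length) returns fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.LongConsumer;

/** Hypothetical sketch; this is NOT Wagon's actual transfer code. */
class BatchedProgressCopy {

    /**
     * Copies in to out and calls the listener only once at least
     * notifyThreshold bytes have accumulated, plus one final call for
     * any remainder. Returns the number of listener invocations.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize,
                     long notifyThreshold, LongConsumer listener) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long pending = 0;        // bytes copied since the last notification
        long notifications = 0;  // how often the listener was invoked
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
            pending += n;
            if (pending >= notifyThreshold) {
                listener.accept(pending);
                notifications++;
                pending = 0;
            }
        }
        if (pending > 0) {       // report the tail so no bytes go unreported
            listener.accept(pending);
            notifications++;
        }
        return notifications;
    }

    /** Simulates a network stream that returns at most maxChunk bytes per read. */
    static InputStream shortReads(byte[] data, int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20]; // 1 MiB of zeros as a stand-in artifact
        // Threshold 1 means "notify after every read", i.e. the per-read behaviour.
        long perRead = copy(shortReads(data, 1500), new ByteArrayOutputStream(),
                4096, 1, n -> { });
        // Threshold 64 KiB batches many short reads into one notification.
        long batched = copy(shortReads(data, 1500), new ByteArrayOutputStream(),
                4096, 64 * 1024, n -> { });
        System.out.println("per-read notifications: " + perRead);
        System.out.println("batched notifications:  " + batched);
    }
}
```

If the listeners are cheap or the link is the bottleneck, the two variants should transfer at about the same speed, which would be consistent with measurements that show no difference between Wagon versions.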
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729907#comment-16729907 ]
Michael Osipov commented on WAGON-537:
[~dantran], the upload improvement is not as big as for download, at least with my setup: a low-power device hosting Nexus and uploads from my Windows machine. How does download look for you? Better? How does curl perform with Artifactory? I am convinced that [~o.otto]'s work is a great step forward, but not yet complete in terms of performance.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729902#comment-16729902 ]
Dan Tran commented on WAGON-537:
Tested with the latest Maven 3.6.1-SNAPSHOT with both the latest Wagon snapshot and wagon-2.3.0 currently at the staging repo. I don't see an upload improvement. Here is one of my 7.1 GB .ova uploads (7.1 GB at 35 MB/s). Am I missing anything? Maybe it is because my Artifactory and Jenkins are one network hop apart?
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729287#comment-16729287 ]
Dan Tran commented on WAGON-537:
Sorry about the noise; I am able to cook up a multi-SCM freestyle job to build both Wagon and Maven.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729276#comment-16729276 ]
Dan Tran commented on WAGON-537:
Could you deploy the latest Wagon snapshot to repository.apache.org? That would make it much easier for me to build apache-maven on our Jenkins with -DwagonVersion=3.3.0-SNAPSHOT.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728965#comment-16728965 ]
Michael Osipov commented on WAGON-537:
You can do that already. Build 3.3.0-SNAPSHOT, manually update the POM of 3.6.1-SNAPSHOT, and there you go.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728872#comment-16728872 ]

Dan Tran commented on WAGON-537:

I think not being able to use a snapshot as a dependency is a setback, but I am looking forward to testing Maven 3.6.1-SNAPSHOT with Wagon 3.3.0.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
> Issue Type: Improvement
> Components: wagon-http, wagon-provider-api
> Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
> Reporter: Olaf Otto
> Assignee: Michael Osipov
> Priority: Major
> Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using Maven for build process automation with Docker. This sometimes
> involves uploading and downloading artifacts with a few gigabytes in size.
> Here, Maven's transfer speed is consistently and reproducibly slow. For
> instance, an artifact with 7.5 GB in size took almost two hours to transfer
> in spite of a 100 MB/s connection with respective reproducible download speed
> from the remote Nexus artifact repository when using a browser to download.
> The same is true when uploading such an artifact.
>
> I have investigated the issue using JProfiler. The result shows an issue in
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream
> output, int requestType, long maxSize ) method used for remote artifacts and
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
>
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data
> is received, the received data is pushed to downstream listeners via
> fireTransferProgress. These listeners (or rather consumers) perform expensive
> tasks.
>
> Now, the underlying InputStream implementation used in transfer will return
> calls to read(buffer, offset, length) as soon as *some* data is available.
> That is, fireTransferProgress may well be invoked with an average number of
> bytes less than half the buffer capacity (this varies with the underlying
> network and hardware architecture). Consequently, fireTransferProgress is
> invoked *millions of times* for large files. As this is a blocking operation,
> the time spent in fireTransferProgress dominates and drastically slows down
> the transfers by at least one order of magnitude.
>
> !wagon-issue.png!
>
> In our case, we found download speed reduced from a theoretical optimum of
> ~80 seconds to more than 3200 seconds.
>
> From an architectural perspective, I would not want to make the consumers /
> listeners invoked via fireTransferProgress aware of their potential impact on
> download speed, but rather refactor the transfer method such that it uses a
> buffer strategy reducing the number of fireTransferProgress invocations.
> This should be done with regard to the expected file size of the transfer,
> such that fireTransferProgress is invoked often enough but not too frequently.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
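To make the buffering idea in the quoted description concrete, here is a minimal Java sketch. It is not the Wagon implementation: it only illustrates a copy loop that keeps reading until the buffer is at least half full (or the stream ends) before it notifies a listener and writes. The names `CoalescingCopy` and `ProgressCallback` are hypothetical and exist only for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical illustration of read coalescing; not part of Wagon. */
final class CoalescingCopy
{
    /** Hypothetical stand-in for a transfer listener. */
    interface ProgressCallback
    {
        void progress( byte[] data, int length );
    }

    /**
     * Copies input to output and invokes the callback only once the buffer is
     * at least half full, or when the end of the stream is reached.
     *
     * @return the number of callback invocations
     */
    static long copy( InputStream input, OutputStream output, int bufferSize, ProgressCallback callback )
        throws IOException
    {
        byte[] buffer = new byte[bufferSize];
        int filled = 0;
        long notifications = 0;

        while ( true )
        {
            // read() may return as soon as *some* bytes are available,
            // so append to what is already buffered instead of flushing at once.
            int n = input.read( buffer, filled, buffer.length - filled );
            boolean eof = n == -1;
            if ( n > 0 )
            {
                filled += n;
            }

            // Flush when enough data has accumulated, or when leftovers remain at EOF.
            if ( filled > 0 && ( eof || filled >= buffer.length / 2 ) )
            {
                callback.progress( buffer, filled );
                output.write( buffer, 0, filled );
                notifications++;
                filled = 0;
            }

            if ( eof )
            {
                break;
            }
        }
        output.flush();
        return notifications;
    }
}
```

With a source that delivers only small fragments per `read` call, the callback count in this sketch depends on the buffer size rather than on the fragment size, which is the effect the description asks for.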
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728838#comment-16728838 ]

Michael Osipov commented on WAGON-537:

[~dantran], not really; even snapshot POMs are not allowed to have snapshot dependencies. I think I will roll 3.3.0 this week.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728837#comment-16728837 ]

Dan Tran commented on WAGON-537:

[~michael-o], could you hook up the latest Wagon snapshot with Maven 3.6.1-SNAPSHOT? I would love to test this out early.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687107#comment-16687107 ]

ASF GitHub Bot commented on WAGON-537:

asfgit closed pull request #51: WAGON-537 Maven transfer speed of large artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java b/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
index 4cbf37d7..361390a4 100644
--- a/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
+++ b/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
@@ -42,8 +42,14 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
+import java.nio.ByteBuffer;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
 import java.util.List;
 
+import static java.lang.Math.max;
+import static java.lang.Math.min;
+
 /**
  * Implementation of common facilities for Wagon providers.
  *
@@ -53,6 +59,24 @@
     implements Wagon
 {
     protected static final int DEFAULT_BUFFER_SIZE = 1024 * 4;
+    protected static final int MAXIMUM_BUFFER_SIZE = 1024 * 512;
+
+    /**
+     * To efficiently buffer data, use a multiple of 4k
+     * as this is likely to match the hardware buffer size of certain
+     * storage devices.
+     */
+    protected static final int BUFFER_SEGMENT_SIZE = 4 * 1024;
+
+    /**
+     * The desired minimum amount of chunks in which a {@link Resource} shall be
+     * {@link #transfer(Resource, InputStream, OutputStream, int, long) transferred}.
+     * This corresponds to the minimum times {@link #fireTransferProgress(TransferEvent, byte[], int)}.
+     * 100 notifications is a conservative value that will lead to small chunks for
+     * any artifact less that {@link #BUFFER_SEGMENT_SIZE} * {@link #MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS}
+     * in size.
+     */
+    protected static final int MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS = 100;
 
     protected Repository repository;
 
@@ -560,31 +584,74 @@ protected void transfer( Resource resource, InputStream input, OutputStream outp
     protected void transfer( Resource resource, InputStream input, OutputStream output, int requestType, long maxSize )
         throws IOException
     {
-        byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
+
+        ByteBuffer buffer = ByteBuffer.allocate( getBufferCapacityForTransferring( resource.getContentLength() ) );
+        int halfBufferCapacity = buffer.capacity() / 2;
 
         TransferEvent transferEvent = new TransferEvent( this, resource, TransferEvent.TRANSFER_PROGRESS, requestType );
         transferEvent.setTimestamp( System.currentTimeMillis() );
 
+        ReadableByteChannel in = Channels.newChannel( input );
+
         long remaining = maxSize;
         while ( remaining > 0 )
         {
-            // let's safely cast to int because the min value will be lower than the buffer size.
-            int n = input.read( buffer, 0, (int) Math.min( buffer.length, remaining ) );
+            int read = in.read( buffer );
 
-            if ( n == -1 )
+            if ( read == -1 )
             {
+                // EOF, but some data has not been written yet.
+                if ( buffer.position() != 0 )
+                {
+                    buffer.flip();
+                    fireTransferProgress( transferEvent, buffer.array(), buffer.limit() );
+                    output.write( buffer.array(), 0, buffer.limit() );
+                }
+
                 break;
             }
 
-            fireTransferProgress( transferEvent, buffer, n );
-
-            output.write( buffer, 0, n );
+            // Prevent minichunking / fragmentation: when less than half the buffer is utilized,
+            // read some more bytes before writing and firing progress.
+            if ( buffer.position() < halfBufferCapacity )
+            {
+                continue;
+            }
 
-            remaining -= n;
+            buffer.flip();
+            fireTransferProgress( transferEvent, buffer.array(), buffer.limit() );
+            output.write( buffer.array(), 0, buffer.limit() );
+            remaining -= buffer.limit();
+            buffer.clear();
         }
         output.flush();
     }
 
+    /**
+     * Provides a buffer size for efficiently transferring the given amount of bytes, such that
+     * it is not fragmented into to many chunks. For larger files, larger buffers are provided such that downstream
+     * {@link #fireTransferProgress(TransferEvent, byte[], int
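The diff above is cut off before the body of `getBufferCapacityForTransferring` appears, so the following is only a sketch of how the three new constants could combine into a size-dependent buffer capacity. The class `BufferSizing` and the method `capacityFor` are hypothetical names, and the logic is an assumption that may differ from the merged code: aim for at least `MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS` chunks, round down to whole `BUFFER_SEGMENT_SIZE` segments, and clamp the result between the default and the maximum buffer size.

```java
/** Hypothetical sketch of size-dependent buffer selection; not the code from the pull request. */
final class BufferSizing
{
    // Values copied from the constants shown in the diff.
    static final int DEFAULT_BUFFER_SIZE = 1024 * 4;
    static final int MAXIMUM_BUFFER_SIZE = 1024 * 512;
    static final int BUFFER_SEGMENT_SIZE = 4 * 1024;
    static final int MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS = 100;

    /**
     * Assumed strategy: pick a multiple of the segment size such that the
     * transfer still consists of at least the minimum number of chunks.
     */
    static int capacityFor( long contentLength )
    {
        if ( contentLength <= 0 )
        {
            // Unknown or empty content length: fall back to the default.
            return DEFAULT_BUFFER_SIZE;
        }

        // Number of whole segments per chunk when the content is split into the minimum chunk count.
        long segments = contentLength / ( (long) BUFFER_SEGMENT_SIZE * MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS );
        long candidate = segments * BUFFER_SEGMENT_SIZE;

        // Clamp to [DEFAULT_BUFFER_SIZE, MAXIMUM_BUFFER_SIZE]; the result always fits into an int.
        return (int) Math.min( MAXIMUM_BUFFER_SIZE, Math.max( DEFAULT_BUFFER_SIZE, candidate ) );
    }
}
```

Under these assumptions, small artifacts keep the 4 KB default, while an artifact of several gigabytes, such as the 7.5 GB case from the issue description, is capped at the 512 KB maximum.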
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687105#comment-16687105 ]

Michael Osipov commented on WAGON-537:

Just tried the upload with curl, no issue on our side:

{noformat}
$ curl --upload big.bin http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/toll.bin --verbose -u admin:admin123
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 34 4768M    0     0   34 1633M      0  22.5M  0:03:31  0:01:12  0:02:19 23.1M
{noformat}

Must be some other reason for the speed. I will now go ahead and merge it.
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685265#comment-16685265 ]

Olaf Otto commented on WAGON-537:

Hi [~michael-o]

I agree that it would be great if the progress monitor only made an update if a decimal changes. The only potential issue I see is that if a transfer is slow, users may notice no activity on the console and may believe Maven is "hanging".

From an architectural perspective, I would personally not want to create any dependency between TransferListeners and the transfer loop. The loop should simply make sure not to invoke the listeners overly frequently, as this slows down transfers (the main objective of this change), and the TransferListeners should do whatever they want with the data.

Perhaps one could discuss whether TransferListeners must be invoked synchronously during data transfer. For me, the term "Listener" implies asynchrony. Are there cases where a listener actively aborts a transfer by throwing an exception, perhaps something related to checksums? If not, one could e.g. put progress notifications on a queue and asynchronously notify listeners.
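The queue-based idea raised in the comment above could look roughly like the following Java sketch. It is an illustration under assumptions, not a proposal for the Wagon API: `AsyncProgressNotifier` and `ProgressListener` are hypothetical names, and the sketch passes only a byte count, whereas real listeners may need the transferred bytes themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: progress events are queued and delivered on a separate thread. */
final class AsyncProgressNotifier implements AutoCloseable
{
    /** Hypothetical, simplified listener that only receives a byte count. */
    interface ProgressListener
    {
        void transferProgress( long bytesSoFar );
    }

    private final List<ProgressListener> listeners;

    // A single worker thread backed by an unbounded queue keeps events in submission order.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    AsyncProgressNotifier( List<ProgressListener> listeners )
    {
        this.listeners = new ArrayList<>( listeners );
    }

    /** Called from the transfer loop; enqueues the event and returns immediately. */
    void fire( long bytesSoFar )
    {
        executor.execute( () ->
        {
            for ( ProgressListener listener : listeners )
            {
                listener.transferProgress( bytesSoFar );
            }
        } );
    }

    /** Waits for queued notifications to be delivered, then stops the worker thread. */
    @Override
    public void close() throws InterruptedException
    {
        executor.shutdown();
        executor.awaitTermination( 1, TimeUnit.MINUTES );
    }
}
```

The trade-off is the one the comment already hints at: with asynchronous delivery a listener can no longer abort the transfer by throwing, and any listener that consumes the transferred bytes (the checksum case mentioned above) would need its own copy of the buffer before the loop reuses it.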
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685104#comment-16685104 ]

Michael Osipov commented on WAGON-537:

I will test the upload with curl in the next couple of days. I reworked the [progress monitor|https://github.com/apache/maven/blob/master/maven-embedder/src/main/java/org/apache/maven/cli/transfer/AbstractMavenTransferListener.java] some time ago. Do you think it makes sense to peg the buffer size to the granularity of the output formatter? Why update the progress if the decimal does not change?!
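The question in the comment above (why update if the decimal does not change) can be illustrated with a small Java sketch of a printer that redraws only when the formatted value differs from the last one shown. `ThrottledProgressPrinter` is a hypothetical class, and the one-decimal megabyte format is an assumption chosen for the example rather than the format Maven's transfer listener uses.

```java
import java.util.Locale;

/** Hypothetical sketch: redraw the progress only when the visible text changes. */
final class ThrottledProgressPrinter
{
    private String lastShown = "";
    private int updates;

    /**
     * Formats the byte count with one decimal place and prints it only if
     * the formatted text differs from what was printed last.
     *
     * @return true if the output was updated
     */
    boolean update( long bytesSoFar )
    {
        // Assumed display granularity: megabytes with one decimal place.
        String shown = String.format( Locale.ROOT, "%.1f MB", bytesSoFar / 1_000_000.0 );
        if ( shown.equals( lastShown ) )
        {
            return false;
        }
        lastShown = shown;
        updates++;
        System.out.print( "\r" + shown );
        return true;
    }

    /** Number of times the output was actually redrawn. */
    int updates()
    {
        return updates;
    }
}
```

This moves the throttling into the consumer instead of the transfer loop, so it reduces console work but not the number of listener invocations.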
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683508#comment-16683508 ]

Olaf Otto commented on WAGON-537:

Hi [~michael-o]

Great to hear you could reproduce the improvement. I am not sure why the upload does not improve to the same extent in your case. To make sure it's not network related, could you perhaps compare the upload speed you measured with the upload speed of the respective wget post to your Nexus instance? If there is a gap, we'd still have some work to do.

Regards,
Olaf
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682745#comment-16682745 ]

Dan Tran commented on WAGON-537:

This is Xmas present ... early. Many thanks
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682640#comment-16682640 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

What a great improvement! I ran some simple tests from my Windows box to the Nexus instance in my LAN over a gigabit link:

{noformat}
Before:
Uploaded to nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin (5.0 GB at 5.7 MB/s)
Downloaded from nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin (5.0 GB at 11 MB/s)

After:
Uploaded to nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.215857-3-big.bin (5.0 GB at 20 MB/s)
Downloaded from nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.214908-2-big.bin (5.0 GB at 83 MB/s)
{noformat}

The difference is insane! I have pushed a slightly modified branch. Any idea why upload is way slower than download?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>             Fix For: 3.2.1
>
>         Attachments: wagon-issue.png
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676890#comment-16676890 ]

Olaf Otto commented on WAGON-537:
---------------------------------

Hi [~michael-o]

Sure: I tested the previous solution and found that a similar buffering issue existed for uploading artefacts. The previous solution only included a fix for downloading artefacts. I then implemented a more general solution for both situations. In the process, I switched to using NIO's Buffer, as it provides a convenient API well suited to the purpose. The Buffers also work with the channel API that I used for uploading artefacts.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
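To illustrate why NIO buffers pair naturally with the channel API mentioned in the comment above, here is a minimal sketch of a channel-to-channel copy loop. It is not the code from the branch: the class name `ChannelCopy`, the method `copy`, and the `capacity` parameter are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

/** Hypothetical helper, not part of Wagon: copies one channel to another through a ByteBuffer. */
final class ChannelCopy {

    /** Copies everything from source to target and returns the number of bytes written. */
    static long copy(ReadableByteChannel source, WritableByteChannel target, int capacity)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        long total = 0;
        while (source.read(buffer) != -1) {
            buffer.flip();                 // switch from filling the buffer to draining it
            total += target.write(buffer); // may write only part of the buffer
            buffer.compact();              // keep unwritten bytes, make room for the next read
        }
        buffer.flip();
        while (buffer.hasRemaining()) {    // drain whatever is left after end of stream
            total += target.write(buffer);
        }
        return total;
    }
}
```

The `flip`/`compact` pair is what makes the buffer convenient here: the same object tracks how much has been read and how much still needs to be written, so no manual offset bookkeeping is needed.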
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676712#comment-16676712 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

Haven't noticed this yet, but will look into this in the next couple of days. Can you share how this new implementation differs from your previous PR?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16676689#comment-16676689 ]

Olaf Otto commented on WAGON-537:
---------------------------------

Hi [~michael-o]

Are you still looking into this? I just saw that I missed one of your questions due to the [~githubbot] spam: I did indeed disable the transfer listeners once, resulting in a > 10-fold increase in download & upload performance when transferring a 6 GB file via a symmetric 1 Gigabit connection.

The performance gain varies with artefact size, remote transfer speed and capacity and, of course, computing power. I have just run a test with a 5.7 GB artifact via a connection that allows ~ 6-7 MB/s transfer from a Nexus repo. Compared to a 1 Gigabit connection, the overhead is somewhat shifted to network I/O, which reduces the effect of the refactoring. However, the results are:

*Without the changes:*
(5.9 GB at 1.8 MB/s)
Total time: 53:24 min

*With these changes:*
(5.9 GB at 6.9 MB/s)
Total time: 14:28 min

With the patch, the download speed precisely matches the speed I get when using the browser or wget.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
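As a quick cross-check of the figures reported above, the two total times (53:24 min and 14:28 min) and the two throughput values (1.8 MB/s and 6.9 MB/s) can be turned into speedup ratios. The helper class `TransferSpeedup` below is hypothetical and exists only to make the arithmetic explicit.

```java
/** Hypothetical helper for the arithmetic only; not part of Wagon. */
final class TransferSpeedup {

    /** Converts a "mm:ss" style duration into seconds. */
    static long toSeconds(int minutes, int seconds) {
        return minutes * 60L + seconds;
    }

    /** Ratio of the old value to the new one (for durations) or new to old (for rates). */
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }
}
```

With the reported numbers, 53:24 is 3204 s and 14:28 is 868 s, so the total build time drops by a factor of roughly 3.7, while the throughput ratio 6.9 / 1.8 is roughly 3.8. The two agree closely, which fits the observation that on this slower link the gain is bounded by network I/O and not by listener overhead.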
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664388#comment-16664388 ]

ASF GitHub Bot commented on WAGON-537:
--------------------------------------

olaf-otto commented on issue #51: WAGON-537 Maven transfer speed of large artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51#issuecomment-433232059

This change increases download and upload speed of large artifacts by more than a factor of 10 when applied to Maven 3.5.4 on Windows 10 64-bit on recent hardware.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
[ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664329#comment-16664329 ]

ASF GitHub Bot commented on WAGON-537:
--------------------------------------

olaf-otto opened a new pull request #51: WAGON-537 Maven transfer speed of large artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51

Implemented a buffer strategy such that filling the buffer to at least 50% has priority over frequent writes.
Added dynamic buffer capacity allocation based on the expected number of bytes to transfer.
Used NIO buffers to simplify buffer management.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus Artifact store > 100MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>         Attachments: wagon-issue.png
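The strategy summarized in the pull request above (fill the buffer to at least 50% before writing and notifying, and size the buffer from the expected transfer length) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of pull request #51: the class `HalfFullBufferCopy`, the `ProgressListener` interface, the 1% sizing rule, and the 4 KiB / 512 KiB bounds are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch of a "fill to 50% first" copy loop; not Wagon's implementation. */
final class HalfFullBufferCopy {

    /** Stand-in for a transfer listener; Wagon's real listener API is different. */
    interface ProgressListener {
        void progress(byte[] data, int length);
    }

    // Assumed bounds for illustration; the thread does not state the values used in the patch.
    static final int MIN_CAPACITY = 4 * 1024;
    static final int MAX_CAPACITY = 512 * 1024;

    /** Picks about 1% of the expected size, clamped to the assumed bounds. */
    static int bufferCapacity(long expectedBytes) {
        if (expectedBytes <= 0) {
            return MIN_CAPACITY; // unknown size: fall back to the small default
        }
        return (int) Math.max(MIN_CAPACITY, Math.min(MAX_CAPACITY, expectedBytes / 100));
    }

    /** Copies input to output; the listener fires once per at-least-half-full buffer or at end of stream. */
    static long copy(InputStream input, OutputStream output, long expectedBytes,
                     ProgressListener listener) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(bufferCapacity(expectedBytes));
        long total = 0;
        boolean eof = false;
        while (!eof) {
            // read() may return after only a few bytes, so keep reading until half full.
            while (buffer.position() < buffer.capacity() / 2) {
                int read = input.read(buffer.array(), buffer.position(), buffer.remaining());
                if (read == -1) {
                    eof = true;
                    break;
                }
                buffer.position(buffer.position() + read);
            }
            buffer.flip();
            int length = buffer.remaining();
            if (length > 0) {
                output.write(buffer.array(), 0, length);
                listener.progress(buffer.array(), length);
                total += length;
            }
            buffer.clear();
        }
        return total;
    }
}
```

With a stream that hands back only 100 bytes per `read` call, a 1,000,000-byte transfer in this sketch triggers 200 listener calls (5,000 bytes each) where a loop that notifies after every read would trigger 10,000.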