[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-02-12 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766228#comment-16766228
 ] 

Michael Osipov commented on WAGON-537:
--

Awesome, thank you very much for retesting. Funnily, this changed revealed a 
serious bug in JSch.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-02-12 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765965#comment-16765965
 ] 

Olaf Otto commented on WAGON-537:
-

[~michael-o]

It is the exact same situation with uploading artifacts, i.e. this change - at 
least on Windows / linux via Docker for windows leads to a full utilization of 
the network capacity.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-10 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739388#comment-16739388
 ] 

Michael Osipov commented on WAGON-537:
--

Awesome, that looks like full saturation of the 100 Mbit/s interface. How about 
upload?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-10 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739361#comment-16739361
 ] 

Olaf Otto commented on WAGON-537:
-

I have began testing using docker on win 10 (with hyper-v). For testing, I have 
run a local docker container via

 {code}
docker run -i -t -v [local maven installation dir]:/opt/maven -v [local dir 
with test POM]:/opt/test -v [local .m2 dir]:/root/.m2  openjdk:latest /bin/bash
 {code}

Then, I have executed a CURL download of a 4GB test file from a remote nexus 
repo as a reference point. Subsequently, I executed a maven artifact download 
using maven 3.6.0 with and without the patch. Here are the results:

*Reference download via CURL:* 11 MB/s
*Download with this patch:* 11 MB /s
*Download without patch:* < 1 MB /s

Moreover, the unpatched version caused massive CPU usage due to the millions of 
invocations of fireTransferProgress. Thus, when using docker for windows, one 
can see a significant improvement when the remote artifact repo is relatively 
slow.

I will make another test on a mac to see how this played out on a bare metal 
*nix.


> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-07 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735547#comment-16735547
 ] 

Olaf Otto commented on WAGON-537:
-

Thanks [~michael-o]!

My hypothesis is that when maven is executed on a *nix derivate (Linux / 
Mac...), the costs of java.io.Printstream.print(...) to the console can 
besignificantly lower than on windows, thus this change does not contribute 
much to the download speed, as it is already almost optimal. However, it the 
change may still reduce CPU usage due to the reduced number of calls. I'll try 
and test this on a *nix machine as soon as I have some time.

 

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-05 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735044#comment-16735044
 ] 

Michael Osipov commented on WAGON-537:
--

[~o.otto], the regression is actually not in your code, but a bug in JSch. 

My tests are on a Windows 10, Intel Skylake against an old 4 thread atom 
machine running FreeBSD 11.2-RELEASE. I can retry at work with my Windows 7 box 
against a Xeon based older FreeBSD server. In both cases, server and client are 
physically very close.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-05 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735027#comment-16735027
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~dantran], thanks a lot for trying this out. So in summary, your tests show 
no difference between curl, the current maven release version and the version 
including this change, correct?

If so, you are in a situation where the current implementation already provides 
the optimal result - this means the buffer in the copy loop is used efficiently 
and, most importantly, exeuction of fireTransferProgress is *very* fast. So at 
least we know that the change does not reduce the transfer speed in this 
situation.

What OS are you on? [~michael-o], same question to you. I've tested Windows 10 
so far. I'd assume the costs of e.g. console output might vary greatly.

Regrads,
Olaf

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2019-01-05 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735030#comment-16735030
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~michael-o], regarding your pointer to the mailing list 
([http://mail-archives.apache.org/mod_mbox/maven-dev/201812.mbox/%3CCAPCjjnGSpwPA3%2BaTgy2MHfY7wMLK6_v1ZHBi0RWP-XnjJd5r2w%40mail.gmail.com%3E]),
 I take it the regression (WAGON-543) is resolved? [~dantran] do you still 
recommend trying to isolate this change to the http implementation?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0, 3.3.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-29 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730818#comment-16730818
 ] 

Dan Tran commented on WAGON-537:


To test wagon 3.3.0 download speed, I build maven-3.6.1-snapshot with the 
latest wagon, and run mvn dependency:get to download a single 7G+ file.  

Also tested with another artifactory's mirror over a wan link,   the download 
speed for mvn 3.6, 3.6.1, and curl are the same.  ~60M/s

Note: the network traffic during the holiday is very quiet ( ie no build/test)


> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-29 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730777#comment-16730777
 ] 

Michael Osipov commented on WAGON-537:
--

How can you see an improvement if curl already fully saturates your gigabit 
link. How does the download speed look like with Wagon 3.3.0?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-28 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730513#comment-16730513
 ] 

Dan Tran commented on WAGON-537:


I see no gain in my download. Basic single download test with curl 

curl -O 
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
100 6779M  100 6779M0 0   301M  0  0:00:22  0:00:22 --:--:--  343M

maven 3.6.1 with wagon 3.3 is around 115M/s

maven 3.6.0 with wagon 3.2 is around 115M/s

No improvement between 3.6.0/3.6.1.  Did check to make sure wagon-3.3 does 
landed at my local maven-3.6.1 build

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-28 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730483#comment-16730483
 ] 

Michael Osipov commented on WAGON-537:
--

[~o.otto], can you have a look at 
[this](http://mail-archives.apache.org/mod_mbox/maven-dev/201812.mbox/%3CCAPCjjnGSpwPA3%2BaTgy2MHfY7wMLK6_v1ZHBi0RWP-XnjJd5r2w%40mail.gmail.com%3E)
 message and tell what you think?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-28 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730466#comment-16730466
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~dantran] the impact of the change greatly depends on the network 
capabilities and your local setup (i.e., how much time is used by 
fireTransferProgress(...)). To asses whether wagon efficiently uses the 
available capacity I recommend uploading using CURL for comparison (e.g. 
https://issues.apache.org/jira/browse/WAGON-537?focusedCommentId=16687105=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16687105)
 If you had the time to do so I'd greatly appreciate it. Should curl be faster 
I'd like to investigate further.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-27 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729907#comment-16729907
 ] 

Michael Osipov commented on WAGON-537:
--

[~dantran], the upload improvement is not so big as for download. At least with 
my setup with a low power device hosting Nexus and upload from my Windows 
machine. How does download look for you? Better? How does curl perform with 
Artifactory?

I am convinced that [~o.otto]s work is a great step forward, but not yet 
complete in terms of performance.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-27 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729902#comment-16729902
 ] 

Dan Tran commented on WAGON-537:


Tested with latest maven 3.6.1-SNAPSHOT with both latest wagon snapshot and 
wagon-2.3.0 currently at staging repo.  Dont see upload improvement. Here one 
of my 7.1 GB .ova upload  (7.1 GB at 35 MB/s) . Am I missing anything?

Maybe due to my Artifactory and jenkins are 1 network hop away?


> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-26 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729287#comment-16729287
 ] 

Dan Tran commented on WAGON-537:


sorry about the noise, I am able to cook up a multi SCM freeslyle job to build 
both wagon and maven

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-26 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729276#comment-16729276
 ] 

Dan Tran commented on WAGON-537:


Could you deploy latest wagon snapshot to repository.apache.org?  This is much 
easier for me to build apache-maven at our Jenkins with  
-DwagonVersion=3.3.0-SNAPSHOT.



> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-26 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728965#comment-16728965
 ] 

Michael Osipov commented on WAGON-537:
--

You can do that alreay. Build 3.3.0-SNAPSHOT, manually update them POM of 
3.6.1-SNAPSHOT and there you go.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-25 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728872#comment-16728872
 ] 

Dan Tran commented on WAGON-537:


I think not able to use dependency as snapshot is a setback.  but i am looking 
forward to test maven 3.6.1-SNAPSHOT with wagon 3.3.0

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-25 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728838#comment-16728838
 ] 

Michael Osipov commented on WAGON-537:
--

[~dantran], not really even snapshot POMs are not allowed to have snapshot 
dependencies. I think I will roll 3.3.0 this week.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-12-25 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728837#comment-16728837
 ] 

Dan Tran commented on WAGON-537:


[~michael-o] could you hook up latest wagon snapshot with maven 3.6.1-SNAPSHOT, 
love to test this out early

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.3.0
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687107#comment-16687107
 ] 

ASF GitHub Bot commented on WAGON-537:
--

asfgit closed pull request #51: WAGON-537 Maven transfer speed of large 
artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java 
b/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
index 4cbf37d7..361390a4 100644
--- a/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
+++ b/wagon-provider-api/src/main/java/org/apache/maven/wagon/AbstractWagon.java
@@ -42,8 +42,14 @@
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
+import java.nio.ByteBuffer;
+import java.nio.channels.Channels;
+import java.nio.channels.ReadableByteChannel;
 import java.util.List;
 
+import static java.lang.Math.max;
+import static java.lang.Math.min;
+
 /**
  * Implementation of common facilities for Wagon providers.
  *
@@ -53,6 +59,24 @@
 implements Wagon
 {
 protected static final int DEFAULT_BUFFER_SIZE = 1024 * 4;
+protected static final int MAXIMUM_BUFFER_SIZE = 1024 * 512;
+
+/**
+ * To efficiently buffer data, use a multiple of 4k
+ * as this is likely to match the hardware buffer size of certain
+ * storage devices.
+ */
+protected static final int BUFFER_SEGMENT_SIZE = 4 * 1024;
+
+/**
+ * The desired minimum amount of chunks in which a {@link Resource} shall 
be
+ * {@link #transfer(Resource, InputStream, OutputStream, int, long) 
transferred}.
+ * This corresponds to the minimum times {@link 
#fireTransferProgress(TransferEvent, byte[], int)}.
+ * 100 notifications is a conservative value that will lead to small 
chunks for
+ * any artifact less that {@link #BUFFER_SEGMENT_SIZE} * {@link 
#MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS}
+ * in size.
+ */
+protected static final int MINIMUM_AMOUNT_OF_TRANSFER_CHUNKS = 100;
 
 protected Repository repository;
 
@@ -560,31 +584,74 @@ protected void transfer( Resource resource, InputStream 
input, OutputStream outp
 protected void transfer( Resource resource, InputStream input, 
OutputStream output, int requestType, long maxSize )
 throws IOException
 {
-byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
+
+ByteBuffer buffer = ByteBuffer.allocate( 
getBufferCapacityForTransferring( resource.getContentLength() ) );
+int halfBufferCapacity = buffer.capacity() / 2;
 
 TransferEvent transferEvent = new TransferEvent( this, resource, 
TransferEvent.TRANSFER_PROGRESS, requestType );
 transferEvent.setTimestamp( System.currentTimeMillis() );
 
+ReadableByteChannel in = Channels.newChannel( input );
+
 long remaining = maxSize;
 while ( remaining > 0 )
 {
-// let's safely cast to int because the min value will be lower 
than the buffer size.
-int n = input.read( buffer, 0, (int) Math.min( buffer.length, 
remaining ) );
+int read = in.read( buffer );
 
-if ( n == -1 )
+if ( read == -1 )
 {
+// EOF, but some data has not been written yet.
+if ( buffer.position() != 0 )
+{
+buffer.flip();
+fireTransferProgress( transferEvent, buffer.array(), 
buffer.limit() );
+output.write( buffer.array(), 0, buffer.limit() );
+}
+
 break;
 }
 
-fireTransferProgress( transferEvent, buffer, n );
-
-output.write( buffer, 0, n );
+// Prevent minichunking / fragmentation: when less than half the 
buffer is utilized,
+// read some more bytes before writing and firing progress.
+if ( buffer.position() < halfBufferCapacity  )
+{
+continue;
+}
 
-remaining -= n;
+buffer.flip();
+fireTransferProgress( transferEvent, buffer.array(), 
buffer.limit() );
+output.write( buffer.array(), 0, buffer.limit() );
+remaining -= buffer.limit();
+buffer.clear();
 }
 output.flush();
 }
 
+/**
+ * Provides a buffer size for efficiently transferring the given amount of 
bytes, such that
+ * it is not fragmented into to many chunks. For larger files, larger 
buffers are provided such that downstream
+ * {@link #fireTransferProgress(TransferEvent, byte[], int) listeners} are 

[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-14 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687105#comment-16687105
 ] 

Michael Osipov commented on WAGON-537:
--

Just tried the upload with curl, no issue on our side:

{noformat}
$ curl --upload big.bin 
http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/toll.bin
 --verbose -u admin:admin123
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
 34 4768M0 0   34 1633M  0  22.5M  0:03:31  0:01:12  0:02:19 23.1M
{noformat}

Must be some other reason for the speed. I will now go ahead and merge it.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-13 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685265#comment-16685265
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~michael-o]

I agree that it would be great if the progress monitor only made an update if a 
decimal changes. The only potential issue I see is that if a transfer is slow, 
users may notice no activity on the console and may believe maven is "hanging".

>From an architectural perspective, I would personally not want to create any 
>dependency between TransferListeners and the transfer loop. The loop should 
>simply make sure not to invoke the listeners overly frequently as this slows 
>down transfers (the main objective of this change), and the TransferListeners 
>should do whatever they want with the data.

Perhaps one could discuss whether TransferListeners must be invoked 
synchronously during data transfer. For me, the term "Listener" implies 
asynchronism. Are there cases where a listener actively aborts transfer by 
throwing an exception, perhaps something related to checksums? If not, one 
could e.g. put progress notifications on a queue and asynchronously notify 
listeners.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-13 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685104#comment-16685104
 ] 

Michael Osipov commented on WAGON-537:
--

I will test the upload with curl in the next couple of days. The [progress 
monitor|https://github.com/apache/maven/blob/master/maven-embedder/src/main/java/org/apache/maven/cli/transfer/AbstractMavenTransferListener.java]
 has been reworked by me some time ago. Do you think it makes sense to peg the 
buffer size to the granularity of the output formatter. Why update the if the 
decimal does not change?!

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-12 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683508#comment-16683508
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~michael-o]

Great to hear you could reproduce the improvement. I am not sure why the upload 
does not improve to the same extend in your case - to make sure it's not 
network related, coud you perhaps compare the upload speed you measured with 
the upload speed of the respective wget post to your nexus instance? If there 
is a gap, we'd still have some work to do.

Regards,
Olaf

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-10 Thread Dan Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682745#comment-16682745
 ] 

Dan Tran commented on WAGON-537:


This is Xmas present ... early. Many thanks 

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-10 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682640#comment-16682640
 ] 

Michael Osipov commented on WAGON-537:
--

What a great improvement! I made some simple tests form my Windows box to my 
NExus instance in my LAN with a gigabit link:

{noformat}
Before:
Uploaded to nexus-mika: 
http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin
 (5.0 GB at 5.7 MB/s)
Downloaded from nexus-mika: 
http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin
 (5.0 GB at 11 MB/s)

After:
Uploaded to nexus-mika: 
http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.215857-3-big.bin
 (5.0 GB at 20 MB/s)
Downloaded from nexus-mika: 
http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.214908-2-big.bin
 (5.0 GB at 83 MB/s)
{noformat}

The difference is insane! I have pushed a slighly modified branch. Any idea why 
upload is way slower than download?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Fix For: 3.2.1
>
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-06 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676890#comment-16676890
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~michael-o]

Sure: I tested the previous solution and found that a similar buffering issue 
existed for uploading artefacts. The previous solution only included a fix for 
downloading artefacts. I then implemented a more general solution for both 
situations. In the process, I switched to using nio's Buffer as it provides a 
conveniant API well suited for the purpose. Also, the Buffer's work for the 
channel API that I used for uploading artefacts.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-06 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676712#comment-16676712
 ] 

Michael Osipov commented on WAGON-537:
--

Haven't noticed this yet, but will look into this in the next couple of days. 
Can you share how this new implementation differs from the previous PR from you?

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-11-06 Thread Olaf Otto (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676689#comment-16676689
 ] 

Olaf Otto commented on WAGON-537:
-

Hi [~michael-o]

Are you still looking into this? I just saw that I missed one of your questions 
due to the [~githubbot] spam: I did indeed disable the transfer listeners once, 
resulting in a > 10-fold increase of download & upload performance when 
transferring a 6 Gb file via a symmetric 1 Gigabit connection. The performance 
gain varies with artefact size, remote transfer speed and capacity and of 
course computing power.

I have just run a test with a 5.7 Gb artifact via a connection that allows ~ 
6-7 MB/s transfer from a nexus repo, meaning that when compared to a 1 Gigabit 
connection, the overhead is somewhat shifted to network i/o, thus reducing the 
effect of the refacoring. However, results are:

*Without the changes:*
 (5.9 GB at 1.8 MB/s)
 Total time: 53:24 min

*With these changes:*
 (5.9 GB at 6.9 MB/s)
 Total time: 14:28 min

With the patch, the download speed matches precisely the speed I get when using 
the browser or wget.

> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-http, wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-10-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664388#comment-16664388
 ] 

ASF GitHub Bot commented on WAGON-537:
--

olaf-otto commented on issue #51: WAGON-537 Maven transfer speed of large 
artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51#issuecomment-433232059
 
 
   This change increases download and upload speed of large artifacts by more 
than a factor of 10 when applied to maven 3.5.4 on Windows 10 64 bit on recent 
hardware.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

2018-10-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664329#comment-16664329
 ] 

ASF GitHub Bot commented on WAGON-537:
--

olaf-otto opened a new pull request #51: WAGON-537 Maven transfer speed of 
large artifacts is slow
URL: https://github.com/apache/maven-wagon/pull/51
 
 
   Implemented a buffer strategy such that filling the buffer to at least 50% 
has priority over frequent writes.
   
   Added dynamic buffer capacity allocation based on the expected number of 
bytes to transfer.
   
   Used NIO buffers to simplify buffer management.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Maven transfer speed of large artifacts is slow due to unsuitable buffer 
> strategy
> -
>
> Key: WAGON-537
> URL: https://issues.apache.org/jira/browse/WAGON-537
> Project: Maven Wagon
>  Issue Type: Improvement
>  Components: wagon-provider-api
>Affects Versions: 3.2.0
> Environment: Windows 10, JDK 1.8, Nexus  Artifact store > 100MB/s 
> network connection.
>Reporter: Olaf Otto
>Assignee: Michael Osipov
>Priority: Major
>  Labels: perfomance
> Attachments: wagon-issue.png
>
>
> We are using maven for build process automation with docker. This sometimes 
> involves uploading and downloading artifacts with a few gigabytes in size. 
> Here, maven's transfer speed is consistently and reproducibly slow. For 
> instance, an artifact with 7,5 GB in size took almost two hours to transfer 
> in spite of a 100 MB/s connection with respective reproducible download speed 
> from the remote nexus artifact repository when using a browser to download. 
> The same is true when uploding such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in 
> AbstractWagon's transfer( Resource resource, InputStream input, OutputStream 
> output, int requestType, long maxSize ) method used for remote artifacts and 
> the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 Kb buffer. Whenever data 
> is received, the received data is pushed to downstream listeners via 
> fireTransferProgress. These listeners (or rather consumers) perform expensive 
> tasks.
> Now, the underlying InputStream implementation used in transfer will return 
> calls to read(buffer, offset, length) as soon as *some* data is available. 
> That is, fireTransferProgress may well be invoked with an average number of 
> bytes less than half the buffer capacity (this varies with the underlying 
> network and hardware architecture). Consequently, fireTransferProgress is 
> invoked *millions of times* for large files. As this is a blocking operation, 
> the time spent in fireTransferProgress dominates and drastically slows down 
> the transfers by at least one order of magnitude. 
> !wagon-issue.png! 
> In our case, we found download speed reduced from a theoretical optimum of 
> ~80 seconds to to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / 
> listeners invoked via fireTransferProgress aware of their potential impact on 
> download speed, but rather refactor the transfer method such that it uses a 
> buffer strategy reducing the the number of fireTransferProgress invocations. 
> This should be done with regard to the expected file size of the transfer, 
> such that fireTransferProgress is invoked often enough but not to frequent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)