On Wed, May 15, 2024 at 8:33 PM David Pfitzner <[email protected]> wrote:
> On Wed, May 15, 2024 at 5:31 PM Daniel Stenberg <[email protected]> wrote:
>
>> On Wed, 15 May 2024, David Pfitzner via curl-library wrote:
>>
>>> Perhaps it would be useful for a user of libcurl to be able to (somehow)
>>> control this tradeoff between rate-limiting accuracy and CPU usage?
>>
>> Perhaps getting more data would be a first step. How big difference in
>> rate-limit accuracy does this commit make in your case?
>
> I have not looked at that carefully, but casually I don't see much
> difference if any. But possibly closer inspection may find a systematic
> difference. I also suspect one may see bigger differences in some regime
> different to what I'm looking at - eg, smaller files.

Getting back to this: although various things can affect timing, I think the
maximum expected error for the rate-limiting timing mainly corresponds to the
maximum number of bytes which can be read in one call of
lib/transfer.c:readwrite_data(). (That is because rate-limiting is done
outside such calls, not within them.) This turns out to be, for these two
curl versions:

  8.6.0: max_error_bytes = min(rate, 10*buffersize)
  8.7.1: max_error_bytes = min(rate, buffersize)

(Here rate = max_recv_speed. The first term reflects that we are not allowed
to read more than max_recv_speed in one go; the second that 8.6.0 can do up
to maxloops = 10 iterations of the read loop, while 8.7.1 does only one
iteration when rate-limiting is in effect.)

Dividing by the rate to get the corresponding time (in seconds):

  8.6.0: max_error_time = min(1s, 10*buffersize/rate)
  8.7.1: max_error_time = min(1s, buffersize/rate)

For the curl command-line utility we have:

  buffersize = min(rate, 100KiB)

which gives for this case:

  8.6.0: max_error_time = min(1s, 1000KiB/rate)
  8.7.1: max_error_time = min(1s, 100KiB/rate)

For a few example rates, the maximum expected error in seconds would be:

  rate       8.6.0   8.7.1
  <=100KiB   1       1
  200KiB     1       0.5
  500KiB     1       0.2
  1000KiB    1       0.1
  2000KiB    0.5     0.05
  5000KiB    0.2     0.02
  10000KiB   0.1     0.01

To check this, I did some timing experiments with the curl command-line
utility, and measured the following values (taking the 95th percentile as
the "maximum"):

  rate       8.6.0   8.7.1
  50KiB      0.906   0.921
  100KiB     0.898   0.949
  200KiB     0.954   0.439
  500KiB     0.934   0.178
  1000KiB    0.921   0.093
  2000KiB    0.517   0.042
  5000KiB    0.198   0.028
  10000KiB   0.101   0.011

Given timing variability etc., I would say these match the expected values
above pretty well.

If one is interested in the relative error (that is, the timing error
relative to the total download time), then dividing max_error_time by
(size/rate) gives:

  8.6.0: max_relative_error = min(rate/size, 10*buffersize/size)
  8.7.1: max_relative_error = min(rate/size, buffersize/size)

or, for the curl command-line utility:

  8.6.0: max_relative_error = min(rate/size, 1000KiB/size)
  8.7.1: max_relative_error = min(rate/size, 100KiB/size)

Note that for both versions the relative error can be 1 (that is, 100%) if
the download size is smaller than the number of bytes which can be read in
one go by readwrite_data(), because in that case we effectively do not
rate-limit at all. (The cutoff size below which this happens differs between
the versions.) On the other hand, as the size becomes large the relative
error becomes small, though still smaller for 8.7.1 than for 8.6.0 when
rate > 100KiB.

So what does that all mean? Well, 8.7.1 does indeed have improved
rate-limiting accuracy compared to 8.6.0, at least at high rates - up to 10
times better. And certainly there may be cases where that improvement is
important.
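(As an aside, in case anyone wants to double-check the arithmetic: the small
stand-alone C program below is not libcurl code, it just evaluates the
formulas above, with the 10x maxloops factor and the 100KiB tool buffer size
taken as given, and it reproduces the expected-error table.)

#include <stdio.h>

static double min_d(double a, double b) { return a < b ? a : b; }

int main(void)
{
  /* formulas from the mail above; the 10x factor (8.6.0 maxloops) and the
     100KiB command-line buffer size are taken as given here */
  const double KiB = 1024.0;
  const double rates[] = { 100*KiB, 200*KiB, 500*KiB, 1000*KiB,
                           2000*KiB, 5000*KiB, 10000*KiB };
  size_t i;

  printf("%9s %8s %8s\n", "rate", "8.6.0", "8.7.1");
  for(i = 0; i < sizeof(rates)/sizeof(rates[0]); i++) {
    double rate = rates[i];
    double buffersize = min_d(rate, 100*KiB);          /* tool buffer size */
    double err_860 = min_d(rate, 10*buffersize)/rate;  /* up to 10 reads */
    double err_871 = min_d(rate, buffersize)/rate;     /* a single read */
    printf("%6.0fKiB %8.2f %8.2f\n", rate/KiB, err_860, err_871);
  }
  return 0;
}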
However, in my case I am mostly downloading very large files, and the
accuracy of 8.6.0 was sufficient. So I would still say it could be useful to
have a curl request option which influences that accuracy (and hence the CPU
usage tradeoff).

I have not yet tried adding such an option, but I was thinking one could
perhaps specify the desired accuracy in seconds, and then libcurl would
calculate (based on the rate) the maximum number of bytes to be read in one
go by readwrite_data(), and implement it that way.

-- David
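P.S. To make that last idea a bit more concrete, here is a rough sketch of
the calculation I have in mind. Nothing in it is existing libcurl code or
API; the function name, its parameters and the notion of an "accuracy in
seconds" option are made up purely for illustration:

#include <stdio.h>
#include <curl/curl.h>   /* for curl_off_t */

/* Hypothetical helper: turn a desired rate-limiting accuracy (in seconds)
   into a cap on how many bytes readwrite_data() may consume in one call.
   accuracy_seconds would come from the (not yet existing) option. */
static curl_off_t max_bytes_per_read(curl_off_t max_recv_speed,
                                     double accuracy_seconds,
                                     curl_off_t buffersize)
{
  /* max_error_time ~= bytes_per_call / rate, so invert that relation */
  curl_off_t cap = (curl_off_t)((double)max_recv_speed * accuracy_seconds);

  /* one buffer per call is the finest granularity readwrite_data() works
     in, so there is no point in going below that */
  if(cap < buffersize)
    cap = buffersize;
  return cap;
}

int main(void)
{
  /* example: 2000 KiB/s with the tool's 100 KiB buffer and a requested
     accuracy of 0.5 s gives a cap of 1000 KiB per readwrite_data() call */
  curl_off_t cap = max_bytes_per_read((curl_off_t)2000 * 1024, 0.5,
                                      (curl_off_t)100 * 1024);
  printf("cap = %ld bytes\n", (long)cap);
  return 0;
}

readwrite_data() would then stop reading once it has consumed that many
bytes in the current call; the existing 8.6.0 and 8.7.1 behaviours would
correspond roughly to requested accuracies of 10*buffersize/rate and
buffersize/rate respectively.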
