[Pharo-users] Re: Splitting a single HTTP Request into multiple concurrent requests

Yanni Chiu Mon, 18 Oct 2021 15:18:40 -0700

A good use case is when one of the downloads fails. When it’s just one big
download, you have to start over from the beginning.
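
To make that concrete, here is a minimal sketch of re-fetching only the chunks that are still missing. It assumes a `chunks` array in which a failed request leaves its slot nil; that convention, the two hard-coded ranges, and the variable names are illustrative assumptions, not part of the original code. It also assumes the server answers a Range request with status 206.

```smalltalk
| url ranges chunks |
url := 'https://s3-eu-west-1.amazonaws.com/public-stfx-eu/test-2050.txt'.
"Inclusive {first. last} byte pairs covering all 2050 bytes (assumed split)."
ranges := { { 0. 1024 }. { 1025. 2049 } }.
"All slots start out nil, as if nothing had arrived yet."
chunks := Array new: ranges size.

"One pass: request only the ranges whose slot is still nil."
ranges withIndexDo: [ :range :index |
  (chunks at: index) ifNil: [
    | client |
    client := ZnClient new.
    [ [ client
          url: url;
          headerAt: #Range put: ('bytes={1}-{2}' format: range);
          get.
        "206 Partial Content means the server honoured the range."
        client response code = 206
          ifTrue: [ chunks at: index put: client contents ] ]
      on: Error
      do: [ :error | "leave the slot nil so a later pass tries again" ] ]
      ensure: [ client close ] ] ].

"Join only once every chunk has arrived."
(chunks includes: nil)
  ifFalse: [ (String empty join: chunks) inspect ].
```

Running the pass again after a failure repeats only the missing ranges, which is the saving compared to restarting one big download.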


On Mon, Oct 18, 2021 at 11:05 AM Sven Van Caekenberghe <s...@stfx.eu> wrote:

> Hi,
>
> Somebody asked how you would split a single HTTP request into multiple
> concurrent requests. This is one way to do it.
>
> Upfront I should state that
>
> - I do not think this is worth the trouble
> - It is only applicable to large downloads (even larger than in the
> example)
> - The other side (server) must honour Range requests correctly (and be
> fast)
>
> This one is based on the data used in the ZnHTTPSTest(s)>>#testTransfers
> unit test, more specifically the files available under
> https://s3-eu-west-1.amazonaws.com/public-stfx-eu/ such as
> https://s3-eu-west-1.amazonaws.com/public-stfx-eu/test-2050.txt for the
> smallest one.
>
> sizes := (Integer primesUpTo: 100) collect: [ :each | 1024 * each + each ].
>
> size := sizes last.
> concurrency := 11.
> step := size // concurrency.
>
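> "HTTP byte ranges are inclusive at both ends, so end each chunk one byte
>  before the next one starts and stop at the last valid offset."
> ranges := (0 to: size - 1 by: step) collect: [ :each |
>   { each. (each + step - 1) min: size - 1 } ].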
>
> chunks := Array new: ranges size.
> done := Semaphore new.
> ms := 0.
>
> [
> ms := Time millisecondClockValue.
> ranges withIndexDo: [ :range :index |
>   [ | client |
>      (client := ZnClient new)
>         https;
>         host: 's3-eu-west-1.amazonaws.com';
>         addPath: 'public-stfx-eu'.
>      client addPath: ('test-{1}.txt' format: { size }).
>      client headerAt: #Range put: ('bytes={1}-{2}' format: range).
>      client get.
>      client close.
>      chunks at: index put: client contents.
>      done signal ] forkAt: Processor lowIOPriority ].
> ranges size timesRepeat: [ done wait ].
> ms := Time millisecondsSince: ms.
> (String empty join: chunks) inspect.
> ] fork.
>
> This takes about 2 seconds total for me.
>
> [
>    ZnClient new
>      https;
>      host: 's3-eu-west-1.amazonaws.com';
>      addPath: 'public-stfx-eu';
>      addPath: 'test-99425.txt';
>      get.
> ] timeToRun.
>
> This single request takes roughly the same time (again, for me).
>
> Two things to note: connection time dominates, and in the parallel case 11
> independent requests were executed, so concurrency is definitely
> happening.
>
> The largest file is just 100 KB, split into about 10 parts, which is most
> probably not enough to see much effect from doing things in parallel.
>
> HTH,
>
> Sven
>

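On the range arithmetic: in an HTTP Range header, `bytes=first-last` includes both offsets, so consecutive chunks must not share an end point. The following is a sketch of one way to cut a file of `size` bytes into exactly `concurrency` non-overlapping ranges. Rounding the step up is a choice made here for illustration, not something taken from the quoted code.

```smalltalk
| size concurrency step ranges total |
size := 99425.                          "bytes in test-99425.txt"
concurrency := 11.
"Round up so that no more than concurrency chunks are needed."
step := (size / concurrency) ceiling.   "9039"

"Each pair is {first. last}, both inclusive, clamped to the last offset."
ranges := (0 to: size - 1 by: step) collect: [ :start |
  { start. (start + step - 1) min: size - 1 } ].

"Sanity check: the chunk lengths must add up to the file size."
total := ranges inject: 0 into: [ :sum :range |
  sum + (range last - range first + 1) ].

ranges size.    "11 chunks"
ranges first.   "0 to 9038"
ranges last.    "90390 to 99424"
total = size.   "true"
```

With integer division (`size // concurrency`, giving 9038) the same interval yields 12 start offsets rather than 11, because 11 * 9038 = 99418 is still below the last byte offset.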