I am of the opinion that a case like this is best handled by simply writing
the thing you want.

Concurrency limits are easily managed by using tokens to gate fetches. One
simple technique is to make a channel of struct{} with capacity equal to
the maximum number of concurrent connections you are allowed. You can
either fill it with tokens at startup, then receive from it before a
request and send back to it when done, or start with it empty, send to it
before a request, and receive from it afterwards. The two are equivalent.

I'm not sure what the point of an overall concurrency limit is in general,
but the same technique works for it. IMO it's unlikely to be a problem for
the size of job you describe.

Retries are best done inside each fetch; wrap http.Get with the logic you
want. There is no one-size-fits-all answer here. There is a public backoff
library available, but it's a bit complex, and the code could easily be
simpler if you address exactly what you want directly.

For contexts, just use the http package's (*Request).WithContext method.
That accomplishes timeouts too, since the context can carry a deadline
(see context.WithTimeout).

On Mon, Aug 19, 2019 at 12:20 PM tom via golang-nuts <
golang-nuts@googlegroups.com> wrote:

> tl;dr Do you know of any libraries for parallelizing HTTP requests with
> per-server concurrency control and handling of retries?
>
> I'm writing a service that fetches many independent small binary blobs
> (map tiles) over HTTP from several upstream servers and packages them
> together into a single archive. I want to parallelize the fetching of the
> small binary blobs. Currently there are O(10) upstream servers and O(1000)
> small binary blobs fetched from each.
>
> Making parallel HTTP requests in Go is trivially easy and is demonstrated
> in many Go tutorials and blog posts. However, I'm looking for a "production
> ready" library that supports:
> * Per upstream server concurrency limits.
> * Overall (across all upstream servers) concurrency limits.
> * Controllable retries with exponential backoff in the case of upstream
> server errors.
> * Timeouts for upstream requests.
> * context.Context support.
>
> This would seem to be a common enough task that I would expect to find an
> existing library that does all of the above. Existing Go web scrapers, e.g.
> colly <http://go-colly.org>, likely have this functionality internally
> but do not expose it in their API and are instead focused on crawling web
> pages.
>
> Do you know of any such library?
>
> Many thanks,
> Tom
>
