In my opinion, a case like this is best handled by simply writing the
thing you want.

Per-server concurrency limits are easily managed by using tokens to gate
fetches. One simple technique is to make a channel of struct{} with
capacity equal to the maximum number of concurrent connections you are
allowed. You can either fill it with tokens at startup, then receive from
it before each request and send the token back when done, or start with
it empty, send to it before each request, and receive back after. Either
way is equivalent. I'm not sure what the point of an overall concurrency
limit would be in general, but the same technique works for it; IMO it's
unlikely to be a problem for the size of job you describe.
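For concreteness, here is a minimal sketch of the token technique,
assuming a per-server limit of 4 and a placeholder URL list; the names
are mine, not from any library:

package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"sync"
)

func main() {
	urls := []string{"https://example.com/tiles/0/0/0.png"} // placeholder tile URLs

	sem := make(chan struct{}, 4) // capacity = max concurrent requests to this server
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			sem <- struct{}{}        // take a token before the request
			defer func() { <-sem }() // give it back when done
			resp, err := http.Get(u)
			if err != nil {
				fmt.Println(u, err)
				return
			}
			defer resp.Body.Close()
			io.Copy(ioutil.Discard, resp.Body) // drain the blob; real code would keep it
		}(u)
	}
	wg.Wait()
}

For an overall limit on top of the per-server ones, you would acquire from
a second, shared channel the same way.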
Retries are best done inside each fetch; wrap http.Get with the logic you
want (a sketch follows below). There is no one-size-fits-all here. There
is a public backoff library available, but it's a bit complex, and the
code can easily be simpler if you address exactly what you want directly.

For contexts, just use the http package's (*Request).WithContext method.
That accomplishes timeouts too.
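Here is the retry sketch, hand-rolled with exponential backoff. The
attempt count, the base delay, and the choice to retry on any non-200
status are all assumptions to tune, and fetchWithRetry is a made-up name:

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

// fetchWithRetry GETs url, doubling the delay between attempts until it
// succeeds or runs out of attempts.
func fetchWithRetry(url string, maxAttempts int) ([]byte, error) {
	delay := 100 * time.Millisecond
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if attempt > 0 {
			time.Sleep(delay)
			delay *= 2 // exponential backoff
		}
		resp, err := http.Get(url)
		if err != nil {
			lastErr = err
			continue
		}
		body, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			lastErr = err
			continue
		}
		if resp.StatusCode != http.StatusOK {
			lastErr = fmt.Errorf("GET %s: %s", url, resp.Status)
			continue // you may want to retry only on 5xx
		}
		return body, nil
	}
	return nil, lastErr
}

func main() {
	body, err := fetchWithRetry("https://example.com/tiles/0/0/0.png", 5)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(body), "bytes")
}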
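And a sketch of WithContext with a per-request timeout; the five-second
deadline is arbitrary:

package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// The deadline covers the whole request, connecting included.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	req, err := http.NewRequest("GET", "https://example.com/tiles/0/0/0.png", nil)
	if err != nil {
		panic(err)
	}
	req = req.WithContext(ctx) // the request is canceled when ctx expires

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("fetch failed:", err) // e.g. context deadline exceeded
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}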
On Mon, Aug 19, 2019 at 12:20 PM tom via golang-nuts <
golang-nuts@googlegroups.com> wrote:

> tl;dr Do you know of any libraries for parallelizing HTTP requests with
> per-server concurrency control and handling of retries?
>
> I'm writing a service that fetches many independent small binary blobs
> (map tiles) over HTTP from several upstream servers and packages them
> together into a single archive. I want to parallelize the fetching of the
> small binary blobs. Currently there are O(10) upstream servers and O(1000)
> small binary blobs fetched from each.
>
> Making parallel HTTP requests in Go is trivially easy and is demonstrated
> in many Go tutorials and blog posts. However, I'm looking for a
> "production ready" library that supports:
> * Per-upstream-server concurrency limits.
> * Overall (across all upstream servers) concurrency limits.
> * Controllable retries with exponential backoff in the case of upstream
> server errors.
> * Timeouts for upstream requests.
> * context.Context support.
>
> This would seem to be a common enough task that I would expect to find an
> existing library that does all of the above. Existing Go web scrapers,
> e.g. colly <http://go-colly.org>, likely have this functionality
> internally but do not expose it in their API and are instead focused on
> crawling web pages.
>
> Do you know of any such library?
>
> Many thanks,
> Tom