Dear all,

On 12.09.2018 03:51, wor...@alum.mit.edu wrote:
Tim Rühsen <tim.rueh...@gmx.de> writes:
Thanks for the pointer to coproc, I'd never heard of it ;-) (That means I
never had a problem that needed coproc.)

Anyway, copying and pasting the script results in a file '[1]' with bash 4.4.23.

Yeah, I'm not surprised there are bugs in it.

Also, wget -i - waits to start downloading until stdin has been closed. How
can you circumvent that?
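
(A quick illustration of that behaviour, with made-up URLs: even though the
first URL is printed right away, wget sits idle until the producer exits and
its end of the pipe is closed, and only then fetches both files.)

    # Hypothetical URLs; wget does not start downloading until the
    # subshell exits and the pipe feeding stdin is closed.
    ( echo 'https://example.com/a.txt'
      sleep 30
      echo 'https://example.com/b.txt' ) | wget -q -i -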

The more I think about the original problem, the more puzzled I am. The
OP said that starting wget for each URL took a long time.  But my
experience is that starting processes is quite quick.  (I once modified
tar to compress each file individually with gzip before writing it to an
Exabyte tape.  Even on a processor much slower than modern ones, the
writing was not delayed by starting a process for each file written.)

I suspect the delay is not starting wget but establishing the initial
HTTP connection to the server.

That's what the OP thinks, too. I attributed the slow startup to DNS resolution.

Probably a better approach to the problem is to download the files in
batches of N consecutive URLs, where N is large enough that the HTTP
startup time is well below the total download time.  Process each
batch with a separate invocation of wget, and exit the loop when an
attempted batch doesn't create any new downloaded files (or when the last
file in the batch doesn't exist), indicating there are no more files to
download.
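
For what it's worth, here is a minimal sketch of that batching loop, assuming
a purely hypothetical numeric URL pattern (the base URL, batch size N=100, and
file names are all invented). Note that each wget only starts once the loop
printing its batch has finished and closed the pipe:

    #!/bin/bash
    # Batch downloader sketch: fetch N consecutive URLs per wget invocation
    # and stop when the last file of a batch was not created.
    base='https://example.com/data/file'   # hypothetical URL prefix
    batch=100                              # N: URLs per wget invocation
    start=1

    while :; do
        # Print one batch of N consecutive URLs into a single wget.
        for ((i = start; i < start + batch; i++)); do
            printf '%s-%d.dat\n' "$base" "$i"
        done | wget -q -i -

        # If the last file of the batch was not downloaded, the sequence
        # has ended and there is nothing more to fetch.
        [ -e "file-$((start + batch - 1)).dat" ] || break

        start=$((start + batch))
    done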

Neat idea. In the end, I solved it by estimating the number of chunks from the total running time and the duration of each chunk. But thanks for giving it a thought!
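
With invented numbers, that estimate is simply the total running time divided
by the per-chunk duration, e.g.:

    total_seconds=7200   # hypothetical total running time (~2 hours)
    chunk_seconds=30     # hypothetical duration of one chunk
    echo $((total_seconds / chunk_seconds))   # roughly 240 chunks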

Regards,

Paul

