Re: [Yum-devel] curlmulti based parallel downloads

James Antill Tue, 27 Sep 2011 07:41:47 -0700

On Mon, 2011-09-26 at 18:06 +0200, Zdeněk Pavlas wrote:
> First, fix some issues with MultiFileMeter.
> 
> [PATCH 1/9] Prevent float division by zero
> [PATCH 2/9] MultiFileMeter: show correct finished size
> [PATCH 3/9] TextMultiFileMeter: use 'text' instead of 'basename'.
> [PATCH 4/9] Use re.total instead of total_size.


 These are all fine, although I'm not sure we want to move to using
MultiFileMeter instead of our current progress meter. 

> Then, move parts of PyCurlFileObject code to separate
> functions, so it can be shared by the new code.
> 
> [PATCH 5/9] move pycurl.error handling to _do_perform_exc()
> [PATCH 6/9] move opening of target file to _do_open_fo().
> [PATCH 7/9] move closing of target file to _do_close_fo().

 These are fine.

> Implement parallel downloads, and functions parallel_begin()
> and parallel_end().

 Why do we want these two extra functions instead of just triggering
off a parameter (like the errors one)?
 I'm also not keen on having a giant global flag for switching to the
"new" download model. Even if we can't mix new/old downloads in "0.1",
it seems bad to guarantee that we never can.
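
 To illustrate the parameter-triggered alternative, here is a minimal
sketch (Python 3 for brevity). The Grabber class, the async_ option and
the perform() method are all hypothetical names invented for this
example; none of them is urlgrabber's actual API. The point is only that
a per-call option lets old-style and new-style downloads coexist.

```python
class Grabber:
    """Toy stand-in for a grabber object; NOT urlgrabber's real API."""

    def __init__(self):
        self.queue = []  # requests deferred for a later parallel run
        self.done = []   # requests that have completed, in order

    def _fetch(self, url):
        # Placeholder for the real transfer.
        self.done.append(url)
        return url

    def urlgrab(self, url, async_=False):
        # Hypothetical per-call switch: queue the request instead of
        # blocking, so no global begin/end state is needed.
        if async_:
            self.queue.append(url)
            return None
        return self._fetch(url)

    def perform(self):
        # Drain the queue. A real version would run these in parallel.
        pending, self.queue = self.queue, []
        for url in pending:
            self._fetch(url)
        return pending
```

 With this shape a caller can still do a blocking urlgrab() for one
file while others sit in the queue, which a global flag would rule out.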

> [PATCH 8/9] Implement parallel urlgrab()s
> 
> Implement a useful callback.

 Using CurlMulti() is not very useful. We have already had a significant
number of bugs due to curl hiding global variables behind its curl
objects, weird interactions with NSS, etc. Sticking it in an external
process should get rid of all of those bugs; using CurlMulti is more
likely to give us 666 more variants of them.

> [PATCH 9/9] Implement 'failfunc' callback.

 I think this is fine.

> Open issues:
> 
> 1) Connection limit: This probably should change from simple 'global'
> to 'per-host'.  We should honour limits from metalink files.
> 
> 2) Might be useful to increment the global mirror lists after
> every *started* download, to spread the load over more hosts
> and increase the number of active connections.

 Yeah, we need more integration between urlgrabber and yum here.
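
 As a rough sketch of how issues 1) and 2) could combine: pick, for each
newly started download, the mirror whose host currently has the fewest
active connections, and skip hosts that are at their limit. The function
name pick_mirror() and its arguments are assumptions made up for this
example (Python 3), and the limit is simply passed in, wherever it ends
up coming from.

```python
from collections import Counter
from urllib.parse import urlsplit


def pick_mirror(mirrors, active, per_host_limit):
    """Return the least-loaded mirror URL, or None if every host is full.

    mirrors: list of mirror URLs, in preference order.
    active:  Counter mapping host name -> connections in use (updated here).
    """
    best = None
    for url in mirrors:
        host = urlsplit(url).hostname
        load = active[host]
        if load >= per_host_limit:
            continue  # this host is already at its connection limit
        # Strict "<" keeps the earlier (preferred) mirror on ties.
        if best is None or load < best[0]:
            best = (load, url, host)
    if best is None:
        return None
    active[best[2]] += 1  # count the download as started
    return best[1]
```

 With two mirrors and a limit of 2 per host, successive calls alternate
between the hosts and then return None once both are full. The caller
would decrement active[host] when a download finishes.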

> 3) Downloading from a separate process: Not done yet.
> Nothing really needs it ATM, and there seem to be no implementation
> issues I know of, so I'd like to ack the general concept first.

 There are basically two layers of things that "need" it:

1. Get all the curl/NSS/etc. API usage out of the process. This should
close all the weird/annoying bugs we've had with curl* and make sure we
don't get any more. It should also fix the DNS hang problems, and it
probably fixes C-c (Ctrl-C) handling as well.

2. SELinux+ level containment. Basically, the bit that writes to the
disk doesn't get download (+ talk to untrusted stuff) privileges, and
the bit that talks to the network can't alter random bits on your disk.
Done fully, this is SELinux + chroot + dropped privileges, and we can be
a bit lax in implementing this.
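
 Layer 1 can be sketched roughly as below: the parent never touches the
transfer library, it only talks to a helper process over pipes. This is
an illustration under stated assumptions, not a proposed implementation.
It is Python 3, it uses urllib in the helper as a stand-in for pycurl,
and the one-line "OK/ERR" protocol and the name fetch_in_helper() are
invented for the example. Layer 2 (SELinux, chroot, dropping privileges)
is not shown.

```python
import subprocess
import sys

# Code run by the helper process. It reads one URL per line on stdin,
# fetches it, and reports "OK <url> <nbytes>" or "ERR <url> <reason>".
# urllib stands in for the real transfer library (assumption).
WORKER = r"""
import sys, urllib.request
for line in sys.stdin:
    url = line.strip()
    if not url:
        continue
    try:
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        print("OK", url, len(data), flush=True)
    except Exception as exc:
        print("ERR", url, type(exc).__name__, flush=True)
"""


def fetch_in_helper(urls):
    """Fetch urls in a separate process; return {url: (status, detail)}."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Send every URL, close stdin, and collect the helper's report.
    out, _ = proc.communicate("\n".join(urls) + "\n")
    results = {}
    for line in out.splitlines():
        # Assumes URLs contain no spaces (they should be percent-encoded).
        status, url, detail = line.split(" ", 2)
        results[url] = (status, detail)
    return results
```

 A hang or crash inside the transfer code then only affects the helper,
which the parent can kill and restart without losing its own state.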
