On Thu, Nov 12, 2009 at 2:09 AM, Dan McGee <[email protected]> wrote: > Sorry for this being such a huge patch, but I believe it is necessary for > quite a few reasons which I will attempt to explain herein. I've been > mulling this over for a while, but wasn't super happy with making the > download interface more complex. Instead, if we carefully order things in > the internal download code, we can actually make the interface simpler. > > 1. FS#15657 - This involves `name.db.tar.gz.part` files being left around the > filesystem, and then causing all sorts of issues when someone attempts to > rerun the operation they canceled. We need to ensure that if we resume a > download, we are resuming it on exactly the same file; if we cannot be > almost postive of that then we need to start over. > > 2. http://www.mail-archive.com/[email protected]/msg03536.html - Here > we have a lighttpd bug to ruin the day. If we send both a Range: header and > If-Modified-Since: header across the wire in a GET request, lighttpd doesn't > do what we want in several cases. If the file hadn't been modified, it > returns a '304 Not Modified' instead of a '206 Partial Content'. We need to > do a stat (e.g. HEAD in HTTP terms) operation here, and the proceed > accordingly based off the values we get back from it. > > 3. The mtime stuff was rather ugly, and relied on the called function to > write back to a passed in reference, which isn't the greatest. Instead, use > the power of the filesystem to contain this info. Every file downloaded > internally is now carefully timestamped with the remote file time. This > should allow the resume logic to work. In order to guarantee this, we need > to implement a signal handler that catches interrupts, notifies the running > code, and causes it to set the mtimes on the file. It then rethrows the > signal so the pacman signal handler (or any frontend) works as expected. > > 4. We did a lot of funky stuff in trying to track the DB last modified time. > It is a lot easier to just keep the downloaded DB file around and track the > time on that rather than in a funky dot file. It also kills a lot of code. > > 5. For GPG verification of the databases down the road, we are going to need > the DB file around for at least a short bit of time anyway, so this gets us > closer to that. > > Signed-off-by: Dan McGee <[email protected]> > --- > lib/libalpm/alpm.h | 8 +-- > lib/libalpm/be_files.c | 96 ++------------------------- > lib/libalpm/dload.c | 177 > +++++++++++++++++++++++++++++++++--------------- > lib/libalpm/dload.h | 4 +- > src/pacman/pacman.c | 11 +++- > 5 files changed, 141 insertions(+), 155 deletions(-)
So yes, I know this patch is huge, but I'm really really *really* hoping someone (or a few people!) would like to review it. Xavier, I know you asked me about this stuff a while back, and I finally got back around to it and it is in a much more working form. It is hopefully well commented too. I should note this went through about 3 distinct iterations locally tonight before this final version came about. Function arguments were dropped, added, and dropped again; calls into libfetch were reordered and reworked and changed, etc. It took a good amount of time to get it to work for all of the various partial download cases (both DB *and* packages, yes, it actually works!) as well as keeping the force flag working and just downloading in the most straightforward start-to-end case. Anyway, it may be a tad easier to look at it here: http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=download&id=8a15e0cbebaff0c34d2d837b37dd6279cff44e32 or in a side by side diff format: http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=download&id=8a15e0cbebaff0c34d2d837b37dd6279cff44e32&ss=1 Thanks guys! And don't ask why I am up at 2 AM my time working on this, it feels like the old days of hacking again. :) -Dan
