> Do you have any stats. on this?

# for n in `seq 4`; do
#   sync
#   yum install thunderbird -y
#   yum clean packages
#   rpm -e thunderbird
# done

CHECKSUM TIME: time in misc.checksum()
DOWNLOAD+VERIFY: time in po.repo.getPackage() - includes the above

Checksumming after the download:

CHECKSUM TIME 208.12702179 ms
DOWNLOAD+VERIFY TIME 488.519906998 ms
CHECKSUM TIME 206.007003784 ms
DOWNLOAD+VERIFY TIME 475.650072098 ms
CHECKSUM TIME 204.250097275 ms
DOWNLOAD+VERIFY TIME 473.475933075 ms
CHECKSUM TIME 204.069137573 ms
DOWNLOAD+VERIFY TIME 465.682983398 ms

It takes about 200ms to checksum the 28MB rpm, and that seems constant.
The download alone is about 260-290ms (local gbit lan).

Single pass:

DOWNLOAD+VERIFY TIME 384.808063507 ms
DOWNLOAD+VERIFY TIME 375.88095665 ms
DOWNLOAD+VERIFY TIME 465.562820435 ms
DOWNLOAD+VERIFY TIME 372.400045395 ms

As I see it, about 100ms of the checksumming time (50%) is masked by
network latency.  On a slower network, it'd probably be close to 100%.
Using the best values in both cases (372ms vs 465ms), the net win is 20%.

> fo._hash.__self__.hexdigest()
> ...makes me twitch :).

Yep, not nice at all.  But replacing a member function with a dummy
callback is much easier than crafting a dummy object.

> My first instinct is that instead of doing it this way we could just
> have a generic "csum" member ... and if it's not None, we call update on
> it. Then callers can pass in a hashlib.new() or a yum.misc.Checksums()
> etc.

That's reasonable.  I'd probably rename 'opts.csum_type' to
'opts.csumfunc', and whoever needs to checksum downloaded data would
put a callback there.  That would also need no changes on the
'checkfunc' side.

What about 'reget's?  Should already-stored data be fed to the callback
too?  Could that potentially run into some problems?

--
Zdenek
_______________________________________________
Yum-devel mailing list
[email protected]
http://lists.baseurl.org/mailman/listinfo/yum-devel
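For illustration only, here is a minimal sketch of the generic "csum"
idea discussed above: the download loop calls update() on whatever
object the caller passed in, so the data is hashed in the same pass that
writes it.  The names copy_with_checksum and prime_from_partial are
hypothetical and are not part of urlgrabber or yum, and the choice of
sha256 is just an assumption for the demo.  The second helper shows one
possible answer to the reget question: a digest of the whole file can
only match if the bytes already on disk are fed to the same object
before the download resumes.

```python
import hashlib
import io


def copy_with_checksum(src, dst, csum=None, chunk_size=64 * 1024):
    """Copy src to dst in chunks; feed each chunk to csum.update() if given.

    Hypothetical helper: csum is any object with an update(bytes) method,
    e.g. the result of hashlib.new().
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        if csum is not None:
            csum.update(chunk)  # hashed in the same pass as the write
        total += len(chunk)
    return total


def prime_from_partial(partial, csum, chunk_size=64 * 1024):
    """For a reget: feed the bytes that are already stored to csum."""
    while True:
        chunk = partial.read(chunk_size)
        if not chunk:
            break
        csum.update(chunk)


# Demo with in-memory "files" standing in for the socket and the rpm.
payload = b"rpm-bytes" * 1000

# Fresh download: one pass writes and hashes.
csum = hashlib.new("sha256")
dst = io.BytesIO()
copy_with_checksum(io.BytesIO(payload), dst, csum)
single_pass_digest = csum.hexdigest()

# Reget: the first 3000 bytes are already stored, the rest is downloaded.
stored = payload[:3000]
reget_csum = hashlib.new("sha256")
prime_from_partial(io.BytesIO(stored), reget_csum)
reget_dst = io.BytesIO(stored)
reget_dst.seek(0, io.SEEK_END)  # append after the stored part
copy_with_checksum(io.BytesIO(payload[3000:]), reget_dst, reget_csum)
reget_digest = reget_csum.hexdigest()
```

Note that priming on a reget reads the stored part once more from disk,
so only the newly downloaded part benefits from the single pass.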
