On Thu, 2011-02-17 at 19:56 +0200, Panu Matilainen wrote:
> On Thu, 17 Feb 2011, seth vidal wrote:
>
> > On Thu, 2011-02-17 at 19:09 +0200, Panu Matilainen wrote:
> >> On 02/17/2011 04:17 PM, James Antill wrote:
> >>> On Thu, 2011-02-17 at 11:48 +0200, Panu Matilainen wrote:
> >>>> The open/close file callbacks only need NEVRA information in the "key".
> >>>> Copy the necessary bits into a regular dict and pass that instead
> >>>> of the entire header, the callback wont even know the difference
> >>>> as the header behaves like a dict.
> >>>> Saves (tens of) megabytes of memory on large transactions.
> >>>> ---
> >>>> yum/depsolve.py | 6 +++++-
> >>>> 1 files changed, 5 insertions(+), 1 deletions(-)
> >>>
> >>> ACK.
> >>
> >> Oh crap... just realized this breaks anaconda and possibly something
> >> else too.
> >>
> >> I keep forgetting how completely broken up the python callback is: the
> >> header is never there unless its passed as (a part of) the package
> >> "key", wasting huge amounts of memory. So anaconda (and possibly others)
> >> are relying on it being a header (even testing instanceof(rpm.hdr), and
> >> accessing various things (summary, size, count of 'basenames') from there.
> >>
> >> Anaconda can be fixed easily enough: just make it use the information
> >> from package object (which it can find from the nevra info) instead of
> >> the assumed header. Or we could just copy a bit more data to the
> >> "minimal header" and it'd still be a big win over the full header... but
> >> there's no telling what header data other API users might try to access.
> >> And then there's the object type change from rpm.hdr to dict too.
> >>
> >> So, it's an API change for sure, far more so than I originally thought
> >> it would be :-/
> >>
> >> One possibility is making the new behavior opt-in, such as adding a
> >> fullheader=True keyword argument to populateTs() so that yum itself can
> >> benefit from it now, and give callers the old compatible behavior with
> >> full header until they update their code.
> >>
> >
> > ugh - I don't think that's such a fabulous plan. Decorating the code in
> > more interesting ways doesn't thrill me.
>
> Well, it isn't exactly a whole lot of code. Something like this (untested
> but you'll get the idea):
>
> diff --git a/yum/depsolve.py b/yum/depsolve.py
> index 997c66d..b99265d 100644
> --- a/yum/depsolve.py
> +++ b/yum/depsolve.py
> @@ -179,7 +179,7 @@ class Depsolve(object):
>
> return False
>
> - def populateTs(self, test=0, keepold=1):
> + def populateTs(self, test=0, keepold=1, fullheader=1):
> """take transactionData class and populate transaction set"""
>
> if self.dsCallback: self.dsCallback.transactionPopulation()
> @@ -220,12 +220,18 @@ class Depsolve(object):
> txmbr.ts_state = 'i'
> txmbr.output_state = TS_INSTALL
>
> - # grab essential data for callback use in hdr-like form
> - minhdr = {}
> - for item in ['epoch', 'name', 'version', 'release', 'arch']:
> - minhdr[item] = hdr[item]
> + # Traditionally the entire header was used as part of the
> + # pkg key, by default give this compatible behavior. However
> + # this wastes gobs of memory for no good reason, the callback
> + # really only needs nevra to locate its files.
> + if fullhdr:
> + cbhdr = hdr
> + else:
> + cbhdr = {}
> + for item in ['epoch', 'name', 'version', 'release',
> 'arch']:
> + cbhdr[item] = hdr[item]
>
> - self.ts.addInstall(hdr, (minhdr, rpmfile), txmbr.ts_state)
> + self.ts.addInstall(hdr, (cbhdr, rpmfile), txmbr.ts_state)
> self.verbose_logger.log(logginglevels.DEBUG_1,
> _('Adding Package %s in mode %s'), txmbr.po,
> txmbr.ts_state)
> if self.dsCallback:
>
> This "decoration" is worth tens of megabytes of saved memory on bigger
> installs/updates, me thinks its well worth those couple of lines. It's
> either something to this effect, or revert the API-breaking patch and keep
> the memory bloat until cowboys come home. This isn't something rpm can
> magically fix either - when the rpm-side callback is finally changed to
> something better it's going to require bigger changes than the above.
my concern here is since populateTs() is public we are opening up an api
break by adding the kwarg there. B/c of people subclassing it.
We've killed stuff like that before.
-sv
_______________________________________________
Yum-devel mailing list
[email protected]
http://lists.baseurl.org/mailman/listinfo/yum-devel