On Wed, Feb 15, 2012 at 09:04:03AM -0800, Shawn Walker wrote:
> Greetings,
>
> The following webrev contains general changes to improve the overall
> performance of the package system:
>
> 7145683 explore general pkg performance improvements
>
> webrev:
> https://cr.opensolaris.org/action/browse/pkg/swalker/pkg-7145863/webrev/
>
>
i'm really impressed with all the performance you've squeezed out of
this code. here are my comments:
----------
general:
- for classes which have methods defined via:
YYY.ZZZ = types.MethodType(...)
it might be nice to have a mention of these methods at the top of the
class. (because folks searching for those methods may not find them.)
- in the .c files, since it takes me a while to remember how python c
bindings work, it'd be nice to have some comments that say:
in the case of an error we call "return (NULL)", which will
generate an exception. in the case of success we call
Py_RETURN_NONE.
- i have a bit of concern that some of your performance gains could be
slowly lost in the future due to innocuous looking code changes. i
wonder if perhaps the test suite should have an automated mechanism
for running benchmarks and comparing them against a known baseline?
(then perhaps these benchmarks could be run regularly via hudson for
every putback).
----------
src/modules/_varcet.c
- you could reduce indentation and make the code easier to read by
doing:
if (strncmp(as, "facet.", 6) != 0)
continue
...
if (facet != NULL) {
/* Facets are currently OR'd. */
ret = Py_False;
continue
}
...
if (strncmp(as, "variant.", 8) != 0)
continue
- in _allow_facet(), for consistency why not just do:
if (facet == Py_True) {
CLEANUP_FREFS;
Py_RETURN_TRUE;
}
----------
src/modules/actions/__init__.py
- why do the new InvalidActionError derived error classes use "*args"
as a parameter instead of having usefully named arguments? (i see
that InvalidActionError and other InvalidActionError derived classes
do it, but i don't see any comments explaining why they do this.)
- why not import fromstr() at the top of the file where we normally
import stuff?
----------
src/modules/actions/_actions.c
- in fromstr(), rather than doing:
PyErr_SetString(PyExc_TypeError, "unknown action type");
shouldn't you raise an UnknownActionError?
- in general, i really prefer data driven code. so for example, in both
init_actions() and fromstr() you have hard coded all the action names.
i'd would find it cleaner if there was one sorted array of action
names, and that array was used in init_actions() to generate a table
of pointers that could subsequently be used in fromstr().
also, is this same sorted action list duplicated in depend.py (see
comment below for depend.py)
- in init_actions(), shouldn't you call Py_DECREF(sys_modules) if the
lookup for pkg_actions fails?
- in init_actions(), why Py_DECREF() the error classes right after
looking them up? (seems you'd want to keep a reference to them since
you still have globals pointing to them.
- in init_actions(), shouldn't cache_class() do:
Py_DECREF(action_types);
before calling return?
----------
src/modules/actions/_common.c
- in set_invalid_action_error(), why not move the call to
PyImport_ImportModule() into init_common()?
- in set_invalid_action_error(), instead of calling:
PyObject_GetAttrString(pkg_actions, <class name>)
for each error class when you go to generate an error, i'd recommend
looking up all the error error class in init_common() ahead of time.
the reason is that these errors paths are infrequently used so moving
some of this code to init_common() will make it easier to catch any
errors in the code. (which makes maintenance in the future easier as
well.)
- in set_invalid_action_error(), does the call to PyErr_SetObject() end
up invoking a __init__() routine for the exc type object? if so it'd
be great to get some comments stating that.
- in _generic_init(), CLEANUP_GREFS() seems kinda pointless since it
only expands to one line of code.
- in _generic_init(), all your strncmp() calls should have their 'n'
value incremented by one. ex:
strncmp(ns, "set", 3)
should be:
strncmp(ns, "set", 4)
otherwise you're testing for a prefix substring match and not a full
string match.
- in _generic_init(), could you add a comment mentioning why it's ok
for "set" and "signature" actions to not have their key_value defined?
- _file_init() ends up calling PyArg_UnpackTuple() twice. why not avoid
this by creating _generic_init_common() that kaes a "action" and
"data" pointer and gets called from both _file_init() and
_generic_init() after calling PyArg_UnpackTuple().
----------
src/modules/actions/attribute.py
- i think you broke the "convenience" name=value code by removing the
check for "name" and "value". (it looks like we always raise an
exception if we enter this code path.) if i'm reading the code right
it might be nice to have a test case for this code path.
----------
src/modules/actions/depend.py
- please add a comment above known_types explaining that it is sorted
for performance reasons and mention how you determined the sort order
(so that no one else comes along and resorts it in alphabetic order,
and so that the next person creating a new dependency type can
somehow determine where their new entry should be placed.)
does this sorted list overlap with the sorted actions hard coded into
generic.c`fromstr() (and generic.c`init_common())? if so is there any
way that this could all be reduced into one list?
----------
src/modules/actions/generic.py
- if "hardlinks must follow all filesystem-modifying actions", why are
they followed by "link" actions.
- user/group actions seem like filesystem modifying actions (since they
update files) so why are they before "dir" actions? (or perhaps i
just don't understand the comment?)
- in set_data(), the first thing it does is:
if data is None:
but _generic_init() won't invoke set_data() if data == None.
- in __cmp__(), it'd be nice if you could scope the "except
AttributeError" handling more narrowly (not to include the call to
compare). there's another similar comment below for catalog.py.
- in get_varcet_keys(), walking through self.attrs twice speeds things
up?
- in different(), it seems odd that you raise an assertion if a file
action is comparted to another action, but you don't raise an
assertion if another action is compared to a file action. perhaps you
could make this more consistent by doing something like:
shash = ohash = None
try:
shash = self.hash
ohash = other.hash
...
except AttributeError:
if shash or ohash:
raise AssertionError(...)
- in consolidate_attrs(), i thought you removed the ability for
attribute values to be sets? (it seems you did in __str__().)
----------
src/modules/catalog.py
- in general, feel free to call me paranoid, but putting function calls
in try blocks like this worries me:
try:
yield f, self.__gen_actions(f, entry["actions"],
excludes)
except KeyError:
...
the reason is that if any code in __gen_actions() generates a KeyError
(which is a pretty common exception) we'll unexpectedly catch that
here. i'd much rather see something like:
try:
actions = entry["actions"]:
except KeyError:
...
else:
yield f, self.__gen_actions(f, actions, excludes)
----------
src/modules/client/imageplan.py
- in __get_directories(), why does it matter in which order you update
the "dirs" set?
- why bother with:
aname = act.name
aattrs = act.attrs
...
oattrs = oact.attrs
----------
src/modules/client/pkg_solver.py
- in __comb_newer_fmris(), you could reduce indentation and make the code
easier to read by doing:
if tp in self.__cache:
return self.__cache[tp]
- in __comb_common(), the elif below could probably just be an else:
if not mver:
...
elif mver:
...
----------
src/modules/fmri.py
- there are two separate:
if veridx != None:
...
else
...
comparisons in __init__(). why not just have one and set self.version
and self.pkg_name in the same block?
----------
src/modules/manifest.py
- instead of:
if aname == "set" and attrs["name"] == "authority":
...
elif aname == "hardlink":
why not do:
if aname == "set":
if attrs["name"] == "authority":
...
elif aname == "hardlink":
that way the "elif" won't get invoked for most set actions.
- could you add a comment explaining why we need to lstrip() hardlink
paths?
----------
src/modules/flavor/base.py
src/modules/publish/dependencies.py
- is there any reason not to always invoke consolidate_attrs implicitly
from the generic Action __init__() method (or perhaps just the
DependencyAction __init__() method)?
----------
src/tests/api/t_manifest.py
- why are you deleting the test_variants() test case?
- if your going to delete the test_variants() test case, it's the only
consumer of self.m6_contents so you should delete that as well.
----------
src/tests/cli/t_pkgdep.py
- i have no idea why you removed "%(pfx)s.path=isadir/pfoo2/baz" from
double_double_stdout.
----------
src/tests/perf/fmribench.py
- for "fmri create (parts)", why are you specifying build_release if
that information is already included in the specified version string?
(what happens if build_release doesn't match the build_release value
embedded in version?)
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss