On 02/21/12 17:46, Edward Pilatowicz wrote:
On Wed, Feb 15, 2012 at 09:04:03AM -0800, Shawn Walker wrote:
Greetings,
The following webrev contains general changes to improve the overall
performance of the package system:
7145683 explore general pkg performance improvements
webrev:
https://cr.opensolaris.org/action/browse/pkg/swalker/pkg-7145863/webrev/
i'm really impressed with all the performance you've squeezed out of
this code. here are my comments:
----------
general:
...
- in the .c files, since it takes me a while to remember how python c
bindings work, it'd be nice to have some comments that say:
in the case of an error we call "return (NULL)", which will
generate an exception. in the case of success we call
Py_RETURN_NONE.
That's not quite right though; the calling conventions differ depending
on the routine and situation involved.
Returning (NULL) by itself won't generate an exception. If an exception
is already set, or in other error scenarios, a caller is expected to
return (NULL). But again, even that isn't always true (for example, in
module init).
As for success, no, the value returned is dependent on the function.
It's just that functions are expected in non-error scenarios to return
*something* other than (NULL) (again this isn't always true, for things
like module init). In these particular cases, since something has to be
returned, and None would be the normal default.
- i have a bit of concern that some of your performance gains could be
slowly lost in the future due to innocuous looking code changes. i
wonder if perhaps the test suite should have an automated mechanism
for running benchmarks and comparing them against a known baseline?
(then perhaps these benchmarks could be run regularly via hudson for
every putback).
Last I heard, perfpit is already tracking our performance. (They've
certainly filed bugs on regressions in the past ...)
This something we should discuss more generally for the project. I
share the same concerns.
...
----------
src/modules/actions/__init__.py
- why do the new InvalidActionError derived error classes use "*args"
as a parameter instead of having usefully named arguments? (i see
that InvalidActionError and other InvalidActionError derived classes
do it, but i don't see any comments explaining why they do this.)
I suppose a bit of historical silliness. Specifically, subclasses
aren't supposed to have a different number of named arguments than their
parent. But in this case, it doesn't matter. I'll change them.
- why not import fromstr() at the top of the file where we normally
import stuff?
Because the exceptions that _actions needs aren't defined until that
point. I'll add a comment. (core dump otherwise...)
----------
src/modules/actions/_actions.c
- in fromstr(), rather than doing:
PyErr_SetString(PyExc_TypeError, "unknown action type");
shouldn't you raise an UnknownActionError?
No. If you'll see the error case here, the problem is that I was unable
to build an argument list for a proper exception, so I'm falling back as
a last ditch effort to raise a general type error. I'll add a comment.
- in general, i really prefer data driven code. so for example, in both
init_actions() and fromstr() you have hard coded all the action names.
i'd would find it cleaner if there was one sorted array of action
names, and that array was used in init_actions() to generate a table
of pointers that could subsequently be used in fromstr().
That sounds like it would be slower than what I'm doing now for little
gain. Our action types at this point change so infrequently that I'm
not inclined to optimise for that sort of maintenance. Especially since
the action string parsing code is now in a single source file.
Is there some other benefit I'm missing to doing this?
It certainly sounds more complex than what I'm doing now.
I'll stop by to ask about this later...
also, is this same sorted action list duplicated in depend.py (see
comment below for depend.py)
No; the types in depend.py are types of dependencies, not types of actions.
...
- in init_actions(), why Py_DECREF() the error classes right after
looking them up? (seems you'd want to keep a reference to them since
you still have globals pointing to them.
So Python 2.x unfortunately has no module shutdown/cleanup hooks [1].
Since I know that these objects won't be garbage collected until the
actions module is unloaded, and since I can't DECREF them when _actions
is unloaded, I simply DECREF them ahead of time so garbage collection on
exit is clean.
- in init_actions(), shouldn't cache_class() do:
Py_DECREF(action_types);
before calling return?
Yes.
----------
src/modules/actions/_common.c
- in set_invalid_action_error(), why not move the call to
PyImport_ImportModule() into init_common()?
Partly historical; when I originally wrote this code, it was in a
different module, and I had a cycle-dependency issue where I couldn't
import pkg.actions at module init. However, at the end, I ended up
moving the code for the same reason to a different module, so I could do
that.
However, I would point out that with Python, if a module's already been
imported before (and is still loaded), this is essentially a no-op.
I'm happy to move this particular import though as it really adds no
additional complexity if so desired.
- in set_invalid_action_error(), instead of calling:
PyObject_GetAttrString(pkg_actions,<class name>)
for each error class when you go to generate an error, i'd recommend
looking up all the error error class in init_common() ahead of time.
the reason is that these errors paths are infrequently used so moving
some of this code to init_common() will make it easier to catch any
errors in the code. (which makes maintenance in the future easier as
well.)
I could see that except that we have test cases for every one of these
errors (and should continue to), so I don't see any real benefit to
looking them up ahead of time. Certainly not concerned about the
performance aspects for this path. I'd still have to lookup the errors
in a data structure or do an if/else if comparison, so that doesn't
seems simpler either.
- in set_invalid_action_error(), does the call to PyErr_SetObject() end
up invoking a __init__() routine for the exc type object? if so it'd
be great to get some comments stating that.
It would have to call __init__ to be able to construct an exception object.
I'm uncertain of the utility of documenting the calling conventions of
the standard CPython library
Per the documentation, PyErr_SetObject is identical to PyErr_SetString;
it expects a *type* object, and then arguments for that exception to
construct it. The only difference is that SetObject permits arbitrary
exception types.
In short, yes, it will call __init__, but it would have to do that to
create the exception object.
With that said, I'll add a comment describing what this section is doing
like this (above the call to PyObject_GetAttrString):
/*
* Obtain a reference to the action exception type so that SetObject can
* build the appropriate exception object using the created list of
* arguments.
*/
- in _generic_init(), CLEANUP_GREFS() seems kinda pointless since it
only expands to one line of code.
I did it for consistency and simplicity of future maintenance, but I've
expanded this and removed the macro.
- in _generic_init(), all your strncmp() calls should have their 'n'
value incremented by one. ex:
strncmp(ns, "set", 3)
should be:
strncmp(ns, "set", 4)
otherwise you're testing for a prefix substring match and not a full
string match.
Actually, these should just be strcmp() instead (and a few other
places). That solves the issue just as well I think. (No real reason
to use strncmp here that I can think of.)
However, thanks for catching that, as _actions.c:fromstr actually has
this problem. (We had no unit test either, so I've added one.)
Currently, it would have raised MalformedActionError, but it should be
raising UnknownActionError instead. Example:
setbar=baz
I'd note the string being compared to isn't null-terminated (it's a
substring of a much larger string). The easy way out seems to be to
check the string length first (typestrl == 3, in the case of a set
action) and then if strncmp(ns, "set", 3) == 0, it's valid.
...
- _file_init() ends up calling PyArg_UnpackTuple() twice. why not avoid
this by creating _generic_init_common() that kaes a "action" and
"data" pointer and gets called from both _file_init() and
_generic_init() after calling PyArg_UnpackTuple().
So initially, this actually slowed things down a bit (~2%), but
declaring _generic_init_common as static inline solved that, and it's
now a little bit faster (~2%).
I also changed _actions:add_to_attrs() to be inline which boosted things
another ~3%.
----------
src/modules/actions/attribute.py
- i think you broke the "convenience" name=value code by removing the
check for "name" and "value". (it looks like we always raise an
exception if we enter this code path.) if i'm reading the code right
it might be nice to have a test case for this code path.
No, but we didn't have a unit test for this, so I added one to verify
that I had not.
I think the line you're missing is line 60 where it returns if
len(attrs) == 1.
We only allow the convenience form if the *only* attribute/value pair
you specify is the name and value.
----------
src/modules/actions/depend.py
- please add a comment above known_types explaining that it is sorted
for performance reasons and mention how you determined the sort order
(so that no one else comes along and resorts it in alphabetic order,
and so that the next person creating a new dependency type can
somehow determine where their new entry should be placed.)
Oh, I didn't actually sort that for performance reasons, though I
suppose it would have some minor effect. I did that purely to match a
previous experiment. I'll revert that change.
does this sorted list overlap with the sorted actions hard coded into
generic.c`fromstr() (and generic.c`init_common())? if so is there any
way that this could all be reduced into one list?
No; this is a list of action dependency types, not a list of action types.
----------
src/modules/actions/generic.py
- if "hardlinks must follow all filesystem-modifying actions", why are
they followed by "link" actions.
- user/group actions seem like filesystem modifying actions (since they
update files) so why are they before "dir" actions? (or perhaps i
just don't understand the comment?)
For the record; I didn't change the order, but I agree the comment or
the order should match the other.
The user/group actions must be before dir/file actions so we can chown
the file/dir actions to the user/group that we might be delivering at
the same time.
Since I don't have a deathwish, I'm going to leave the order alone, but
here's the new comment:
# Directories must precede all filesystem object actions;
# hardlinks must follow all filesystem object actions (except links).
# Note that usr/group actions precede file actions; this implies that
# /etc/group and /etc/passwd file ownership needs to be part of initial
# contents of those files.
- in set_data(), the first thing it does is:
if data is None:
but _generic_init() won't invoke set_data() if data == None.
Correct, _generic_init() doesn't bother calling set_data() if it doesn't
need to as that would be much slower. I've added a comment to
_common.c:_generic_init.
- in __cmp__(), it'd be nice if you could scope the "except
AttributeError" handling more narrowly (not to include the call to
compare). there's another similar comment below for catalog.py.
I suppose; it's consistently 0.5% slower to put it outside the
try/except block though, and 2.5% slower if I stuck it in an else:
clause. (Yes, Python's interpreter is bizarre...)
However, since I don't see the point of raising NotImplemented here
(since this is the base class for all Action objects) I'm just going to
drop the try/except block and let the AttribtueError be raised (since,
realistically, every action object must have an ordinality for other
reasons anyway). That's 4.5% faster anyway 8)
- in get_varcet_keys(), walking through self.attrs twice speeds things
up?
Yes, welcome to my world of Python performance craziness:
action get_varcet_keys()
0.056018 357027 actions/sec
0.056124 356353 actions/sec
0.056033 356931 actions/sec
action get_new_varcet_keys()
0.052513 380857 actions/sec
0.052486 381054 actions/sec
0.052561 380509 actions/sec
Now, why is that you would ask?
def get_varcet_keys(self):
"""Return the names of any facet or variant tags in this
action."""
variants = []
facets = []
for k in self.attrs.iterkeys():
if k[:8] == "variant.":
variants.append(k)
if k[:6] == "facet.":
facets.append(k)
return variants, facets
>>> dis.dis(Action.get_varcet_keys)
600 0 BUILD_LIST 0
3 STORE_FAST 1 (variants)
601 6 BUILD_LIST 0
9 STORE_FAST 2 (facets)
603 12 SETUP_LOOP 91 (to 106)
15 LOAD_FAST 0 (self)
18 LOAD_ATTR 0 (attrs)
21 LOAD_ATTR 1 (iterkeys)
24 CALL_FUNCTION 0
27 GET_ITER
>> 28 FOR_ITER 74 (to 105)
31 STORE_FAST 3 (k)
604 34 LOAD_FAST 3 (k)
37 LOAD_CONST 1 (8)
40 SLICE+2
41 LOAD_CONST 2 ('variant.')
44 COMPARE_OP 2 (==)
47 JUMP_IF_FALSE 17 (to 67)
50 POP_TOP
605 51 LOAD_FAST 1 (variants)
54 LOAD_ATTR 2 (append)
57 LOAD_FAST 3 (k)
60 CALL_FUNCTION 1
63 POP_TOP
64 JUMP_FORWARD 1 (to 68)
>> 67 POP_TOP
606 >> 68 LOAD_FAST 3 (k)
71 LOAD_CONST 3 (6)
74 SLICE+2
75 LOAD_CONST 4 ('facet.')
78 COMPARE_OP 2 (==)
81 JUMP_IF_FALSE 17 (to 101)
84 POP_TOP
607 85 LOAD_FAST 2 (facets)
88 LOAD_ATTR 2 (append)
91 LOAD_FAST 3 (k)
94 CALL_FUNCTION 1
97 POP_TOP
98 JUMP_ABSOLUTE 28
>> 101 POP_TOP
102 JUMP_ABSOLUTE 28
>> 105 POP_BLOCK
608 >> 106 LOAD_FAST 1 (variants)
109 LOAD_FAST 2 (facets)
112 BUILD_TUPLE 2
115 RETURN_VALUE
def get_new_varcet_keys(self):
return [k for k in self.attrs if k[:8] == "variant."], \
[k for k in self.attrs if k[:6] == "facet."]
>>> dis.dis(Action.get_new_varcet_keys)
611 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_FAST 1 (_[1])
7 LOAD_FAST 0 (self)
10 LOAD_ATTR 0 (attrs)
13 GET_ITER
>> 14 FOR_ITER 34 (to 51)
17 STORE_FAST 2 (k)
20 LOAD_FAST 2 (k)
23 LOAD_CONST 1 (8)
26 SLICE+2
27 LOAD_CONST 2 ('variant.')
30 COMPARE_OP 2 (==)
33 JUMP_IF_FALSE 11 (to 47)
36 POP_TOP
37 LOAD_FAST 1 (_[1])
40 LOAD_FAST 2 (k)
43 LIST_APPEND
44 JUMP_ABSOLUTE 14
>> 47 POP_TOP
48 JUMP_ABSOLUTE 14
>> 51 DELETE_FAST 1 (_[1])
612 54 BUILD_LIST 0
57 DUP_TOP
58 STORE_FAST 3 (_[2])
61 LOAD_FAST 0 (self)
64 LOAD_ATTR 0 (attrs)
67 GET_ITER
>> 68 FOR_ITER 34 (to 105)
71 STORE_FAST 2 (k)
74 LOAD_FAST 2 (k)
77 LOAD_CONST 3 (6)
80 SLICE+2
81 LOAD_CONST 4 ('facet.')
84 COMPARE_OP 2 (==)
87 JUMP_IF_FALSE 11 (to 101)
90 POP_TOP
91 LOAD_FAST 3 (_[2])
94 LOAD_FAST 2 (k)
97 LIST_APPEND
98 JUMP_ABSOLUTE 68
>> 101 POP_TOP
102 JUMP_ABSOLUTE 68
>> 105 DELETE_FAST 3 (_[2])
108 BUILD_TUPLE 2
111 RETURN_VALUE
In fact, I can make a new version that's even slightly faster than that:
def get_new_new_varcet_keys(self):
attrs = self.attrs
return [k for k in attrs if k[:8] == "variant."], \
[k for k in attrs if k[:6] == "facet."]
action get_varcet_keys()
0.055945 357494 actions/sec
0.055962 357384 actions/sec
0.055869 357981 actions/sec
action get_new_varcet_keys()
0.052244 382817 actions/sec
0.052330 382189 actions/sec
0.052224 382966 actions/sec
action get_new_new_varcet_keys()
0.051829 385885 actions/sec
0.051871 385573 actions/sec
0.051824 385922 actions/sec
>>> dis.dis(Action.get_new_new_varcet_keys)
615 0 LOAD_FAST 0 (self)
3 LOAD_ATTR 0 (attrs)
6 STORE_FAST 1 (attrs)
616 9 BUILD_LIST 0
12 DUP_TOP
13 STORE_FAST 2 (_[1])
16 LOAD_FAST 1 (attrs)
19 GET_ITER
>> 20 FOR_ITER 34 (to 57)
23 STORE_FAST 3 (k)
26 LOAD_FAST 3 (k)
29 LOAD_CONST 1 (8)
32 SLICE+2
33 LOAD_CONST 2 ('variant.')
36 COMPARE_OP 2 (==)
39 JUMP_IF_FALSE 11 (to 53)
42 POP_TOP
43 LOAD_FAST 2 (_[1])
46 LOAD_FAST 3 (k)
49 LIST_APPEND
50 JUMP_ABSOLUTE 20
>> 53 POP_TOP
54 JUMP_ABSOLUTE 20
>> 57 DELETE_FAST 2 (_[1])
617 60 BUILD_LIST 0
63 DUP_TOP
64 STORE_FAST 4 (_[2])
67 LOAD_FAST 1 (attrs)
70 GET_ITER
>> 71 FOR_ITER 34 (to 108)
74 STORE_FAST 3 (k)
77 LOAD_FAST 3 (k)
80 LOAD_CONST 3 (6)
83 SLICE+2
84 LOAD_CONST 4 ('facet.')
87 COMPARE_OP 2 (==)
90 JUMP_IF_FALSE 11 (to 104)
93 POP_TOP
94 LOAD_FAST 4 (_[2])
97 LOAD_FAST 3 (k)
100 LIST_APPEND
101 JUMP_ABSOLUTE 71
>> 104 POP_TOP
105 JUMP_ABSOLUTE 71
>> 108 DELETE_FAST 4 (_[2])
111 BUILD_TUPLE 2
114 RETURN_VALUE
Are you utterly terrified yet? I've add a comment explaining why it's
written that way.
- in different(), it seems odd that you raise an assertion if a file
action is comparted to another action, but you don't raise an
assertion if another action is compared to a file action. perhaps you
could make this more consistent by doing something like:
shash = ohash = None
try:
shash = self.hash
ohash = other.hash
...
except AttributeError:
if shash or ohash:
raise AssertionError(...)
I didn't change the logic; this is replicating what we already did.
However, I'm not adverse to changing this, so I have.
- in consolidate_attrs(), i thought you removed the ability for
attribute values to be sets? (it seems you did in __str__().)
Per the comments in other places, this ensures that the attribute values
for the variants are converted to lists from sets. (See comment in
modules/flavor/base.py as an example).
If you'd prefer that we never provide set objects for the attribute
values so there's not even the slightest potential for a window where we
could fail, I'm willing to change modules/flavor/base.py and
modules/publish/dependencies.py to instead simply convert the attributes
themselves ahead of time.
----------
src/modules/catalog.py
- in general, feel free to call me paranoid, but putting function calls
in try blocks like this worries me:
try:
yield f, self.__gen_actions(f, entry["actions"],
excludes)
except KeyError:
...
the reason is that if any code in __gen_actions() generates a KeyError
(which is a pretty common exception) we'll unexpectedly catch that
here. i'd much rather see something like:
try:
actions = entry["actions"]:
except KeyError:
...
else:
yield f, self.__gen_actions(f, actions, excludes)
I'm not actually worried about this; this is why we have extensive test
cases for catalog.py after all...
Not only that, Python places a fair performance penalty on you if you
use the else: clause. (See earlier comments.) At that point, I'd be
better off reverting these changes completely.
----------
src/modules/client/imageplan.py
- in __get_directories(), why does it matter in which order you update
the "dirs" set?
It doesn't matter in which order I update, but it does matter how many
items I pass to update(). Constructing a set() object using a generator
with potentially many items is faster than calling update with a set of
many items that are needed to expand the set().
So I call update() with the smaller, constant set of data. Again, we're
talking about milliseconds here 8)
- why bother with:
aname = act.name
aattrs = act.attrs
...
oattrs = oact.attrs
See earlier example in get_varcet_keys() above; we avoid a few extra
attribute retrievals since Python doesn't optimise this for us. So
often, assigning a reference to an attribute and then re-using that is
faster.
...
----------
src/modules/fmri.py
- there are two separate:
if veridx != None:
...
else
...
comparisons in __init__(). why not just have one and set self.version
and self.pkg_name in the same block?
I guess the intent was to fail early when this was originally written,
but I see no reason to not combine these. So changed.
----------
src/modules/manifest.py
...
- could you add a comment explaining why we need to lstrip() hardlink
paths?
Yes, although I'm uncertain why we don't just strip the '/' from actions
to begin with. It must be some crazy historical compatibility thing :-(
----------
src/modules/flavor/base.py
src/modules/publish/dependencies.py
- is there any reason not to always invoke consolidate_attrs implicitly
from the generic Action __init__() method (or perhaps just the
DependencyAction __init__() method)?
Other than the substantially unnecessary overhead? See my earlier
explanation above.
----------
src/tests/api/t_manifest.py
- why are you deleting the test_variants() test case?
Because it was actually wrong. We do allow multiple variant values and
in fact, pkgdepend and pkglint both construct such things during their
normal operations.
- if your going to delete the test_variants() test case, it's the only
consumer of self.m6_contents so you should delete that as well.
Thanks, I'll remove that too.
----------
src/tests/cli/t_pkgdep.py
- i have no idea why you removed "%(pfx)s.path=isadir/pfoo2/baz" from
double_double_stdout.
Because the test was verifying that if a user specified the same search
path twice to pkgdepend that would be seen twice in the pkgdepend debug
output. I decided it was silly for pkgdepend to actually care that they
specified it twice, and to simply ignore redundant specifications. As a
result, you'll only see one unique entry in the output. Doing that also
changed the sort order ...
----------
src/tests/perf/fmribench.py
- for "fmri create (parts)", why are you specifying build_release if
that information is already included in the specified version string?
(what happens if build_release doesn't match the build_release value
embedded in version?)
Because it may not be in the specified version string. The default
build_release is only used if it isn't present in the version string
already.
For the performance test case, this isn't actually required, but I
wanted to pass it in simulate its actual usage in our codebase.
-Shawn
[1] http://www.python.org/dev/peps/pep-3121/
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss