On 02/21/12 17:46, Edward Pilatowicz wrote:
On Wed, Feb 15, 2012 at 09:04:03AM -0800, Shawn Walker wrote:
Greetings,

The following webrev contains general changes to improve the overall
performance of the package system:

   7145683 explore general pkg performance improvements

webrev:
   https://cr.opensolaris.org/action/browse/pkg/swalker/pkg-7145863/webrev/



i'm really impressed with all the performance you've squeezed out of
this code.  here are my comments:

----------
general:
...
- in the .c files, since it takes me a while to remember how python c
   bindings work, it'd be nice to have some comments that say:
        in the case of an error we call "return (NULL)", which will
        generate an exception.  in the case of success we call
        Py_RETURN_NONE.

That's not quite right though; the calling conventions differ depending on the routine and situation involved.

Returning (NULL) by itself won't generate an exception. If an exception is already set, or in other error scenarios, a caller is expected to return (NULL). But again, even that isn't always true (for example, in module init).

As for success, no, the value returned is dependent on the function. It's just that functions are expected in non-error scenarios to return *something* other than (NULL) (again this isn't always true, for things like module init). In these particular cases, since something has to be returned, and None would be the normal default.

- i have a bit of concern that some of your performance gains could be
   slowly lost in the future due to innocuous looking code changes.  i
   wonder if perhaps the test suite should have an automated mechanism
   for running benchmarks and comparing them against a known baseline?
   (then perhaps these benchmarks could be run regularly via hudson for
   every putback).

Last I heard, perfpit is already tracking our performance. (They've certainly filed bugs on regressions in the past ...)

This something we should discuss more generally for the project. I share the same concerns.

...
----------
src/modules/actions/__init__.py

- why do the new InvalidActionError derived error classes use "*args"
   as a parameter instead of having usefully named arguments?  (i see
   that InvalidActionError and other InvalidActionError derived classes
   do it, but i don't see any comments explaining why they do this.)

I suppose a bit of historical silliness. Specifically, subclasses aren't supposed to have a different number of named arguments than their parent. But in this case, it doesn't matter. I'll change them.

- why not import fromstr() at the top of the file where we normally
   import stuff?

Because the exceptions that _actions needs aren't defined until that point. I'll add a comment. (core dump otherwise...)

----------
src/modules/actions/_actions.c

- in fromstr(), rather than doing:
        PyErr_SetString(PyExc_TypeError, "unknown action type");
   shouldn't you raise an UnknownActionError?

No. If you'll see the error case here, the problem is that I was unable to build an argument list for a proper exception, so I'm falling back as a last ditch effort to raise a general type error. I'll add a comment.

- in general, i really prefer data driven code.  so for example, in both
   init_actions() and fromstr() you have hard coded all the action names.
   i'd would find it cleaner if there was one sorted array of action
   names, and that array was used in init_actions() to generate a table
   of pointers that could subsequently be used in fromstr().

That sounds like it would be slower than what I'm doing now for little gain. Our action types at this point change so infrequently that I'm not inclined to optimise for that sort of maintenance. Especially since the action string parsing code is now in a single source file.

Is there some other benefit I'm missing to doing this?

It certainly sounds more complex than what I'm doing now.

I'll stop by to ask about this later...

   also, is this same sorted action list duplicated in depend.py (see
   comment below for depend.py)

No; the types in depend.py are types of dependencies, not types of actions.

...
- in init_actions(), why Py_DECREF() the error classes right after
   looking them up?  (seems you'd want to keep a reference to them since
   you still have globals pointing to them.

So Python 2.x unfortunately has no module shutdown/cleanup hooks [1].

Since I know that these objects won't be garbage collected until the actions module is unloaded, and since I can't DECREF them when _actions is unloaded, I simply DECREF them ahead of time so garbage collection on exit is clean.

- in init_actions(), shouldn't cache_class() do:
        Py_DECREF(action_types);
   before calling return?

Yes.

----------
src/modules/actions/_common.c

- in set_invalid_action_error(), why not move the call to
   PyImport_ImportModule() into init_common()?

Partly historical; when I originally wrote this code, it was in a different module, and I had a cycle-dependency issue where I couldn't import pkg.actions at module init. However, at the end, I ended up moving the code for the same reason to a different module, so I could do that.

However, I would point out that with Python, if a module's already been imported before (and is still loaded), this is essentially a no-op.

I'm happy to move this particular import though as it really adds no additional complexity if so desired.

- in set_invalid_action_error(), instead of calling:

        PyObject_GetAttrString(pkg_actions,<class name>)

   for each error class when you go to generate an error, i'd recommend
   looking up all the error error class in init_common() ahead of time.
   the reason is that these errors paths are infrequently used so moving
   some of this code to init_common() will make it easier to catch any
   errors in the code.  (which makes maintenance in the future easier as
   well.)

I could see that except that we have test cases for every one of these errors (and should continue to), so I don't see any real benefit to looking them up ahead of time. Certainly not concerned about the performance aspects for this path. I'd still have to lookup the errors in a data structure or do an if/else if comparison, so that doesn't seems simpler either.

- in set_invalid_action_error(), does the call to PyErr_SetObject() end
   up invoking a __init__() routine for the exc type object?  if so it'd
   be great to get some comments stating that.

It would have to call __init__ to be able to construct an exception object.

I'm uncertain of the utility of documenting the calling conventions of the standard CPython library

Per the documentation, PyErr_SetObject is identical to PyErr_SetString; it expects a *type* object, and then arguments for that exception to construct it. The only difference is that SetObject permits arbitrary exception types.

In short, yes, it will call __init__, but it would have to do that to create the exception object.

With that said, I'll add a comment describing what this section is doing like this (above the call to PyObject_GetAttrString):

/*
 * Obtain a reference to the action exception type so that SetObject can
 * build the appropriate exception object using the created list of
 * arguments.
 */

- in _generic_init(), CLEANUP_GREFS() seems kinda pointless since it
   only expands to one line of code.

I did it for consistency and simplicity of future maintenance, but I've expanded this and removed the macro.

- in _generic_init(), all your strncmp() calls should have their 'n'
   value incremented by one.  ex:
                strncmp(ns, "set", 3)
   should be:
                strncmp(ns, "set", 4)
   otherwise you're testing for a prefix substring match and not a full
   string match.

Actually, these should just be strcmp() instead (and a few other places). That solves the issue just as well I think. (No real reason to use strncmp here that I can think of.)

However, thanks for catching that, as _actions.c:fromstr actually has this problem. (We had no unit test either, so I've added one.) Currently, it would have raised MalformedActionError, but it should be raising UnknownActionError instead. Example:

   setbar=baz

I'd note the string being compared to isn't null-terminated (it's a substring of a much larger string). The easy way out seems to be to check the string length first (typestrl == 3, in the case of a set action) and then if strncmp(ns, "set", 3) == 0, it's valid.

...
- _file_init() ends up calling PyArg_UnpackTuple() twice.  why not avoid
   this by creating _generic_init_common() that kaes a "action" and
   "data" pointer and gets called from both _file_init() and
   _generic_init() after calling PyArg_UnpackTuple().

So initially, this actually slowed things down a bit (~2%), but declaring _generic_init_common as static inline solved that, and it's now a little bit faster (~2%).

I also changed _actions:add_to_attrs() to be inline which boosted things another ~3%.

----------
src/modules/actions/attribute.py

- i think you broke the "convenience" name=value code by removing the
   check for "name" and "value".  (it looks like we always raise an
   exception if we enter this code path.)  if i'm reading the code right
   it might be nice to have a test case for this code path.

No, but we didn't have a unit test for this, so I added one to verify that I had not.

I think the line you're missing is line 60 where it returns if len(attrs) == 1.

We only allow the convenience form if the *only* attribute/value pair you specify is the name and value.

----------
src/modules/actions/depend.py

- please add a comment above known_types explaining that it is sorted
   for performance reasons and mention how you determined the sort order
   (so that no one else comes along and resorts it in alphabetic order,
   and so that the next person creating a new dependency type can
   somehow determine where their new entry should be placed.)

Oh, I didn't actually sort that for performance reasons, though I suppose it would have some minor effect. I did that purely to match a previous experiment. I'll revert that change.

   does this sorted list overlap with the sorted actions hard coded into
   generic.c`fromstr() (and generic.c`init_common())?  if so is there any
   way that this could all be reduced into one list?

No; this is a list of action dependency types, not a list of action types.

----------
src/modules/actions/generic.py

- if "hardlinks must follow all filesystem-modifying actions", why are
   they followed by "link" actions.

- user/group actions seem like filesystem modifying actions (since they
   update files) so why are they before "dir" actions?  (or perhaps i
   just don't understand the comment?)

For the record; I didn't change the order, but I agree the comment or
the order should match the other.

The user/group actions must be before dir/file actions so we can chown the file/dir actions to the user/group that we might be delivering at the same time.

Since I don't have a deathwish, I'm going to leave the order alone, but here's the new comment:

# Directories must precede all filesystem object actions;
# hardlinks must follow all filesystem object actions (except links).
# Note that usr/group actions precede file actions; this implies that
# /etc/group and /etc/passwd file ownership needs to be part of initial
# contents of those files.

- in set_data(), the first thing it does is:
        if data is None:
   but _generic_init() won't invoke set_data() if data == None.

Correct, _generic_init() doesn't bother calling set_data() if it doesn't need to as that would be much slower. I've added a comment to _common.c:_generic_init.

- in __cmp__(), it'd be nice if you could scope the "except
   AttributeError" handling more narrowly (not to include the call to
   compare). there's another similar comment below for catalog.py.

I suppose; it's consistently 0.5% slower to put it outside the try/except block though, and 2.5% slower if I stuck it in an else: clause. (Yes, Python's interpreter is bizarre...)

However, since I don't see the point of raising NotImplemented here (since this is the base class for all Action objects) I'm just going to drop the try/except block and let the AttribtueError be raised (since, realistically, every action object must have an ordinality for other reasons anyway). That's 4.5% faster anyway 8)

- in get_varcet_keys(), walking through self.attrs twice speeds things
   up?

Yes, welcome to my world of Python performance craziness:

action get_varcet_keys()
            0.056018    357027 actions/sec
            0.056124    356353 actions/sec
            0.056033    356931 actions/sec
action get_new_varcet_keys()
            0.052513    380857 actions/sec
            0.052486    381054 actions/sec
            0.052561    380509 actions/sec

Now, why is that you would ask?


def get_varcet_keys(self):
        """Return the names of any facet or variant tags in this
        action."""

        variants = []
        facets = []

        for k in self.attrs.iterkeys():
                if k[:8] == "variant.":
                        variants.append(k)
                if k[:6] == "facet.":
                        facets.append(k)
        return variants, facets

>>> dis.dis(Action.get_varcet_keys)
600           0 BUILD_LIST               0
              3 STORE_FAST               1 (variants)

601           6 BUILD_LIST               0
              9 STORE_FAST               2 (facets)

603          12 SETUP_LOOP              91 (to 106)
             15 LOAD_FAST                0 (self)
             18 LOAD_ATTR                0 (attrs)
             21 LOAD_ATTR                1 (iterkeys)
             24 CALL_FUNCTION            0
             27 GET_ITER
        >>   28 FOR_ITER                74 (to 105)
             31 STORE_FAST               3 (k)

604          34 LOAD_FAST                3 (k)
             37 LOAD_CONST               1 (8)
             40 SLICE+2
             41 LOAD_CONST               2 ('variant.')
             44 COMPARE_OP               2 (==)
             47 JUMP_IF_FALSE           17 (to 67)
             50 POP_TOP

605          51 LOAD_FAST                1 (variants)
             54 LOAD_ATTR                2 (append)
             57 LOAD_FAST                3 (k)
             60 CALL_FUNCTION            1
             63 POP_TOP
             64 JUMP_FORWARD             1 (to 68)
        >>   67 POP_TOP

606     >>   68 LOAD_FAST                3 (k)
             71 LOAD_CONST               3 (6)
             74 SLICE+2
             75 LOAD_CONST               4 ('facet.')
             78 COMPARE_OP               2 (==)
             81 JUMP_IF_FALSE           17 (to 101)
             84 POP_TOP

607          85 LOAD_FAST                2 (facets)
             88 LOAD_ATTR                2 (append)
             91 LOAD_FAST                3 (k)
             94 CALL_FUNCTION            1
             97 POP_TOP
             98 JUMP_ABSOLUTE           28
        >>  101 POP_TOP
            102 JUMP_ABSOLUTE           28
        >>  105 POP_BLOCK

608     >>  106 LOAD_FAST                1 (variants)
            109 LOAD_FAST                2 (facets)
            112 BUILD_TUPLE              2
            115 RETURN_VALUE


def get_new_varcet_keys(self):
        return [k for k in self.attrs if k[:8] == "variant."], \
            [k for k in self.attrs if k[:6] == "facet."]

>>> dis.dis(Action.get_new_varcet_keys)
611           0 BUILD_LIST               0
              3 DUP_TOP
              4 STORE_FAST               1 (_[1])
              7 LOAD_FAST                0 (self)
             10 LOAD_ATTR                0 (attrs)
             13 GET_ITER
        >>   14 FOR_ITER                34 (to 51)
             17 STORE_FAST               2 (k)
             20 LOAD_FAST                2 (k)
             23 LOAD_CONST               1 (8)
             26 SLICE+2
             27 LOAD_CONST               2 ('variant.')
             30 COMPARE_OP               2 (==)
             33 JUMP_IF_FALSE           11 (to 47)
             36 POP_TOP
             37 LOAD_FAST                1 (_[1])
             40 LOAD_FAST                2 (k)
             43 LIST_APPEND
             44 JUMP_ABSOLUTE           14
        >>   47 POP_TOP
             48 JUMP_ABSOLUTE           14
        >>   51 DELETE_FAST              1 (_[1])

612          54 BUILD_LIST               0
             57 DUP_TOP
             58 STORE_FAST               3 (_[2])
             61 LOAD_FAST                0 (self)
             64 LOAD_ATTR                0 (attrs)
             67 GET_ITER
        >>   68 FOR_ITER                34 (to 105)
             71 STORE_FAST               2 (k)
             74 LOAD_FAST                2 (k)
             77 LOAD_CONST               3 (6)
             80 SLICE+2
             81 LOAD_CONST               4 ('facet.')
             84 COMPARE_OP               2 (==)
             87 JUMP_IF_FALSE           11 (to 101)
             90 POP_TOP
             91 LOAD_FAST                3 (_[2])
             94 LOAD_FAST                2 (k)
             97 LIST_APPEND
             98 JUMP_ABSOLUTE           68
        >>  101 POP_TOP
            102 JUMP_ABSOLUTE           68
        >>  105 DELETE_FAST              3 (_[2])
            108 BUILD_TUPLE              2
            111 RETURN_VALUE

In fact, I can make a new version that's even slightly faster than that:

def get_new_new_varcet_keys(self):
        attrs = self.attrs
        return [k for k in attrs if k[:8] == "variant."], \
            [k for k in attrs if k[:6] == "facet."]

action get_varcet_keys()
            0.055945    357494 actions/sec
            0.055962    357384 actions/sec
            0.055869    357981 actions/sec
action get_new_varcet_keys()
            0.052244    382817 actions/sec
            0.052330    382189 actions/sec
            0.052224    382966 actions/sec
action get_new_new_varcet_keys()
            0.051829    385885 actions/sec
            0.051871    385573 actions/sec
            0.051824    385922 actions/sec

>>> dis.dis(Action.get_new_new_varcet_keys)
615           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (attrs)
              6 STORE_FAST               1 (attrs)

616           9 BUILD_LIST               0
             12 DUP_TOP
             13 STORE_FAST               2 (_[1])
             16 LOAD_FAST                1 (attrs)
             19 GET_ITER
        >>   20 FOR_ITER                34 (to 57)
             23 STORE_FAST               3 (k)
             26 LOAD_FAST                3 (k)
             29 LOAD_CONST               1 (8)
             32 SLICE+2
             33 LOAD_CONST               2 ('variant.')
             36 COMPARE_OP               2 (==)
             39 JUMP_IF_FALSE           11 (to 53)
             42 POP_TOP
             43 LOAD_FAST                2 (_[1])
             46 LOAD_FAST                3 (k)
             49 LIST_APPEND
             50 JUMP_ABSOLUTE           20
        >>   53 POP_TOP
             54 JUMP_ABSOLUTE           20
        >>   57 DELETE_FAST              2 (_[1])

617          60 BUILD_LIST               0
             63 DUP_TOP
             64 STORE_FAST               4 (_[2])
             67 LOAD_FAST                1 (attrs)
             70 GET_ITER
        >>   71 FOR_ITER                34 (to 108)
             74 STORE_FAST               3 (k)
             77 LOAD_FAST                3 (k)
             80 LOAD_CONST               3 (6)
             83 SLICE+2
             84 LOAD_CONST               4 ('facet.')
             87 COMPARE_OP               2 (==)
             90 JUMP_IF_FALSE           11 (to 104)
             93 POP_TOP
             94 LOAD_FAST                4 (_[2])
             97 LOAD_FAST                3 (k)
            100 LIST_APPEND
            101 JUMP_ABSOLUTE           71
        >>  104 POP_TOP
            105 JUMP_ABSOLUTE           71
        >>  108 DELETE_FAST              4 (_[2])
            111 BUILD_TUPLE              2
            114 RETURN_VALUE

Are you utterly terrified yet? I've add a comment explaining why it's written that way.

- in different(), it seems odd that you raise an assertion if a file
   action is comparted to another action, but you don't raise an
   assertion if another action is compared to a file action.  perhaps you
   could make this more consistent by doing something like:

        shash = ohash = None
        try:
                shash = self.hash
                ohash = other.hash
                ...
        except AttributeError:
                if shash or ohash:
                        raise AssertionError(...)

I didn't change the logic; this is replicating what we already did. However, I'm not adverse to changing this, so I have.

- in consolidate_attrs(), i thought you removed the ability for
   attribute values to be sets?  (it seems you did in __str__().)

Per the comments in other places, this ensures that the attribute values for the variants are converted to lists from sets. (See comment in modules/flavor/base.py as an example).

If you'd prefer that we never provide set objects for the attribute values so there's not even the slightest potential for a window where we could fail, I'm willing to change modules/flavor/base.py and modules/publish/dependencies.py to instead simply convert the attributes themselves ahead of time.

----------
src/modules/catalog.py

- in general, feel free to call me paranoid, but putting function calls
   in try blocks like this worries me:

        try:
                yield f, self.__gen_actions(f, entry["actions"],
                    excludes)
        except KeyError:
                ...

   the reason is that if any code in __gen_actions() generates a KeyError
   (which is a pretty common exception) we'll unexpectedly catch that
   here.  i'd much rather see something like:

        try:
                actions = entry["actions"]:
        except KeyError:
                ...
        else:
                yield f, self.__gen_actions(f, actions, excludes)

I'm not actually worried about this; this is why we have extensive test cases for catalog.py after all...

Not only that, Python places a fair performance penalty on you if you use the else: clause. (See earlier comments.) At that point, I'd be better off reverting these changes completely.

----------
src/modules/client/imageplan.py

- in __get_directories(), why does it matter in which order you update
   the "dirs" set?

It doesn't matter in which order I update, but it does matter how many items I pass to update(). Constructing a set() object using a generator with potentially many items is faster than calling update with a set of many items that are needed to expand the set().

So I call update() with the smaller, constant set of data. Again, we're talking about milliseconds here 8)

- why bother with:
        aname = act.name
        aattrs = act.attrs
        ...
                oattrs = oact.attrs

See earlier example in get_varcet_keys() above; we avoid a few extra attribute retrievals since Python doesn't optimise this for us. So often, assigning a reference to an attribute and then re-using that is faster.

...
----------
src/modules/fmri.py

- there are two separate:
        if veridx != None:
                ...
        else
                ...
   comparisons in __init__().  why not just have one and set self.version
   and self.pkg_name in the same block?

I guess the intent was to fail early when this was originally written, but I see no reason to not combine these. So changed.

----------
src/modules/manifest.py
...
- could you add a comment explaining why we need to lstrip() hardlink
   paths?

Yes, although I'm uncertain why we don't just strip the '/' from actions to begin with. It must be some crazy historical compatibility thing :-(

----------
src/modules/flavor/base.py
src/modules/publish/dependencies.py

- is there any reason not to always invoke consolidate_attrs implicitly
   from the generic Action __init__() method (or perhaps just the
   DependencyAction __init__() method)?

Other than the substantially unnecessary overhead? See my earlier explanation above.

----------
src/tests/api/t_manifest.py

- why are you deleting the test_variants() test case?

Because it was actually wrong. We do allow multiple variant values and in fact, pkgdepend and pkglint both construct such things during their normal operations.

- if your going to delete the test_variants() test case, it's the only
   consumer of self.m6_contents so you should delete that as well.

Thanks, I'll remove that too.

----------
src/tests/cli/t_pkgdep.py

- i have no idea why you removed "%(pfx)s.path=isadir/pfoo2/baz" from
   double_double_stdout.

Because the test was verifying that if a user specified the same search path twice to pkgdepend that would be seen twice in the pkgdepend debug output. I decided it was silly for pkgdepend to actually care that they specified it twice, and to simply ignore redundant specifications. As a result, you'll only see one unique entry in the output. Doing that also changed the sort order ...

----------
src/tests/perf/fmribench.py

- for "fmri create (parts)", why are you specifying build_release if
   that information is already included in the specified version string?
   (what happens if build_release doesn't match the build_release value
   embedded in version?)

Because it may not be in the specified version string. The default build_release is only used if it isn't present in the version string already.

For the performance test case, this isn't actually required, but I wanted to pass it in simulate its actual usage in our codebase.

-Shawn

[1] http://www.python.org/dev/peps/pep-3121/
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Reply via email to