Re: [Rpm-maint] [PATCH] Improve macro table performance

2013-01-28 Thread Florian Festi
On 01/28/2013 04:05 AM, Alexey Tourbin wrote:
 In the existing implementation, when a new macro is added, the whole
 table has to be sorted again.  Hence the cost of adding n macros is
 worse than O(n^2), due to arithmetic progression.
 
 This change drops all qsort(3) stuff altogether, by carefully preserving
 table in sorted order.  In findEntry routine, bsearch(3) is replaced
 with customized binary search which tracks position for insertion.
 In the addMacro routine, if a matching entry is not found, this
 position is used for direct insertion, after the rest of the elements
 are shifted to the right with memmove(3).  Likewise, in delMacro
 routine, the elements are shifted back to the left when the last macro
 definition is popped.  Technically, shifting half of the array with
 memmove(3) is still O(n^2); however, modern CPUs process contiguous
 memory in a very efficient manner, and glibc provides a fine-tuned
 memmove(3) implementation.
 
 Also, macro table entries are now allocated in a single chunk.
 
 This change reduces rpm startup costs by factor of 6 (see callgrind
 annotations below).  Also, this change improves specfile parser
 performance by a factor of 2 (e.g. the parse time of texlive.spec
 is reduced from 67s to 35s).

Impressive numbers. I wonder if thing could even improve further by
switching to a hash table. RPM has it's own generic hash data type found
in lib/rpmhash.H and lib/rpmhash.C
These work with some kind of include magic.

You #define HASHTYPE as a prefix used for the datatype and all related
functions.
HTKEYTYPE for the key data type and
HTDATATYPE for value data type. This can also be a struct for keeping
the value inside the hash. There can be more than one value per key.
With HTDATATYPE undefined the hash acts as a set.

After setting these macros the rpmhash.[HC] fiels are include to build
the header or code for the given types.

See lib/depends.c for two examples of how it is used.

Florian

___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] FSM hooks for rpm plugin

2013-01-28 Thread Reshetova, Elena
Hi,

I am attaching the next version of the patch with modifications explained
inline below.

The patch looks deceptively simple :) In principle things look ok to me,
the issues I see have mostly to do with the symmetry part:
In fsmMkdirs(), the FileOpen() hook is not called, only
rpmpluginsCallFsmFileClose(). If we promise symmetry, we need to stick to it
here too. Looks like just an oversight and trivial to fix.
Yes, fixed now: I was looking to install/remove mostly and forgot about
symmetry on this one. 

rpmPackageFilesRemove() looks like symmetry is indeed guaranteed, but the
problem here is that the hooks will get called whether the file/directory is
actually removed or not, and the hook does has no way of knowing it: passing
FSM_PKGERASE (or FSM_PKGINSTALL on install) is not sufficient. Either we
need to pass fsm-action (those FA_CREATE, FA_ERASE, FA_SKIP etc things) to
the hooks, or alternatively only call the hooks when an actual action is
taken. Or even both: a plugin might want to know if eg a backup is taken.
Not sure which option is the best here: from plugin POV it'd certainly be
simpler to not have to worry about files that rpm has decided to skip
entirely (there shouldn't be need to label etc such things). But then there
might be use-cases where plugins would want to know the fate of each file,
including skipped. It's not exactly hard to do if (XFA_SKIPPED(action))
in plugins that dont care about skipped files either.

I think it would make sense to call a plugin anyway but supply the action.
It is indeed easy for plugin to filter actions it wants to do smth about and
it is hard to see all potential use cases. So, I am passing the action now
as a parameter and then it is up to plugin to react on this. 

And then there's the PITA also known as rpmPackageFilesInstall(). There are
at least two cases where FileClose() hook could be skipped despite
FileOpen() getting called (early return before expandRegular() and a
break in the post-processing to rmdir/unlink failed file. At least the
latter should be reasonably easy to refactor, the early return case .. 
guess I need to actually figure out what on earth its about in the first
place :)

Hm... I guess both can be fixed by manually calling fileclosed() hook before
return or break (like in this version of the patcj), but it doesn't look
that clean and nice. :( The verify() function before early return is hard to
read for me, it seems to do million things and even called a number of times
during installation. Another option would be to call fileOpen() hook later,
but then one can't stop cleanly in emergency cases. 

And like said, the same question whether to call the hooks for skipped
files is true here as well, but probably somewhat trickier as there are more
details and cases to handle.
Would passing the action to all hooks help to it? 

I think the plugins should be allowed to cause an abort here, just like 
rpm itself can, in case of fatal unexpected failure. Unexpected is the 
key really: what I complained about was the MSM plugin *planning* to 
fail here on unresolved file conflicts, which is something that should 
be handled before anything gets installed or removed. 

Yes, I agree and I think the only reason before was the signature check hook
wasn't called early enough, but this is one of the next todo items to do it
properly :)

Best Regards,
Elena.





0001-Adding-FSM-file-hooks.patch
Description: Binary data


smime.p7s
Description: S/MIME cryptographic signature
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint