The changes I checked in last night,
http://viewcvs.osafoundation.org/chandler/?rev=7540&view=rev and
http://viewcvs.osafoundation.org/chandler/?rev=7541&view=rev,
focused on some very often called low-level repository routines, such as
marking items dirty, invoking monitors, setting up bi-refs, etc...
Some of these are now vastly faster and this translates into some performance
improvements for many if not most repository operations.
I've been using two python tools, hotshot and timeit to measure the
performance of various operations. In particular, I've been running hotshot
against the 3000 event calendar import bug
http://bugzilla.osafoundation.org/show_bug.cgi?id=3988
by running the attached t79.py python file.
Recently, the event import code was changed to also include a commit, so the
performance characteristics of bug 3988 have shifted quite a bit. Still, the
two attached 'cumulative time' hotshot profiles, r7541.txt and r7538.txt run
against Chandler before (rev 7538) and after my changes (rev 7541) show
significant improvements.
Before going into the details I'd like to briefly define the meaning of time
units:
- all times were measured on a 1 Ghz Powerbook G4 with 1GB of RAM running Mac
OS X 10.4.2, without any other non mostly-idle process, such as iTunes,
iChat, Skype, FireFox, etc... running.
- the attached hotshot profiles list cumulative times, that is the total
time spent in a given routine including all its sub-routines.
- take hotshot's CPU seconds with a grain of salt although I've been told
that in python 2.4 they're much closer to reality.
- the % diffs listed below reflect the difference between the new time and
the old time. For example, if something got ten times faster it also
means that it got 90% faster because t0 - 90% = t1 ==> t1 = t0 / 10, hence
in the table below:
% diff = (1 - (after / before)) * 100
Before going into the details I'd like to also briefly describe how to run
these tests yourself on your own favorite platform:
- the 3000 event calendar import test is done with the help of the attached
t79.py script:
> cd $CHANDLERHOME
> RunPython
>>> from t79 import *
the raw hotshot profile is 32 Mb and takes a *long* time to load
- the individual timeit tests are run with the help of the attached t7.py and
t6.py scripts, where t7.py is used to create a sample 'data' repository and
t6.py is used to open and inspect it, for example:
> cd $CHANDLERHOME
> RunPython t7.py
> RunPython -m timeit -s "from t6 import k; d=k._values; f=k.VDIRTY" \
"k.setDirty(f, 'born', d)"
The hotshot details (3000 event calendar import):
| after (r7541) | before (r7538) | % diff
--------------------------------------------------------------------
total time | 69.07 s | 83.20 s | 17 %
commit() | 23.27 s | 25.91 s | 10.2 %
total time - commit() | 45.80 s | 57.29 s | 20 %
--------------------------------------------------------------------
new item creation via | | |
Kind.newItem() | 9.58 s | 15.65 s | 38.7 %
Item.__init__() | 6.86 s | 8.40 s | 18.3 %
Item.setAttributeValue | 6.13 s | 14.38 s | 57.4 %
establishing bi-refs in | | |
Values._setValue | 6.82 s | 13.08 s | 47.9 %
establishing one side of | | |
bi-refs in a ref coll | 3.29 s | 6.77 s | 51.4 %
--------------------------------------------------------------------
The timeit details (run with t6.py):
| after (r7541) | before (r7538) | % diff
--------------------------------------------------------------------
Item.setDirty() | 10.3 usec | 65.7 usec | 84.3 %
monitor invocation when | | |
no monitors (overhead) | 2.8 usec | 27.2 usec | 89.7 %
view.find(uuid) of item | | |
already loaded | 3.14 usec | 6.33 usec | 50.4 %
view[uuid] of item | | |
already loaded | 1.37 usec | 7.82 usec | 82.5 %
--------------------------------------------------------------------
returning a defaultValue | | |
for an attribute | 0.97 usec | 86.7 usec | 98.9 %
--------------------------------------------------------------------
Where that leaves us with the various performance acceptance tests, I
don't know. The tests are still shifting around, so it is kind of difficult
to assess, yet I'm pretty sure a 'bunch of amorphous things' still must have
gotten noticeably faster.
In any case, one is never done with performance work, watch this space for
more...
I hope this helps !
Andi..
import os
import hotshot, hotshot.stats
from osaf.sharing.Sharing import FileSystemConduit, Share
from osaf.sharing.ICalendar import ICalendarFormat
from application.Parcel import Manager as ParcelManager
from repository.persistence.DBRepository import DBRepository
r = DBRepository('data')
r.create(ramdb='True', refcounted='True')
view = r.view
manager = ParcelManager.get(view, path=['parcels'])
manager.loadParcels(["osaf.pim.calendar"])
path = os.path.join(os.getenv('CHANDLERHOME') or '.',
'parcels', 'osaf', 'sharing', 'tests')
conduit = FileSystemConduit(name="conduit", sharePath=path,
shareName="3kevents.ics", view=view)
format = ICalendarFormat(name="format", view=view)
share = Share(name="share", conduit=conduit, format=format, view=view)
#share.get()
prof = hotshot.Profile('t79.prof')
prof.runcall(share.get)
prof.close()
def cum():
stats.sort_stats('cum')
stats.print_stats(50)
def calls():
stats.sort_stats('time', 'calls')
stats.print_stats(50)
print 'loading profile'
stats = hotshot.stats.load('t79.prof')
stats.strip_dirs()
cum()
from repository.persistence.DBRepository import DBRepository
r = DBRepository('data')
r.create()
r = r.view
#r.loadPack('repository/packs/chandler.pack')
r.loadPack('repository/tests/data/packs/cineguide.pack')
r.findPath('//CineGuide/KHepburn').movies.addIndex('n', 'numeric')
r.findPath('//CineGuide/KHepburn').movies.addIndex('t', 'attribute',
attribute='title')
r.findPath('//CineGuide/KHepburn').movies.addIndex('f', 'string',
attribute='frenchTitle',
locale='fr_FR')
r.check()
r.commit()
r = r.repository
r.close()
#c = r['CineGuide']
#k = c['KHepburn']
#m = k.movies.first()
from repository.persistence.DBRepository import DBRepository
from repository.persistence.Repository import RepositoryThread
#from repository.remote.RepositoryServer import SOAPRepositoryServer
from repository.item.Query import TextQuery, KindQuery
from repository.item.Access import ACL, ACE, Permission
from repository.util.SAX import XMLPrettyGenerator, XMLGenerator
from sys import stdout
r = DBRepository('data')
r.open(recover=False, force=False, refcounted=True)
r = r.view
#r.check()
a = r.findPath('//Schema/Core/Attribute')
#e = r.findPath('//Schema/Core/Enumeration')
c = r.findPath('//CineGuide')
k = r.findPath('//CineGuide/KHepburn')
i = r.findPath('//Collections/foo')
strings = r.findPath('//Collections/Strings')
l = r.findPath('//Schema/Core/Clouds/Kind')
al = r.findPath('//Schema/CineGuide/Clouds/Actor')
ml = r.findPath('//Schema/CineGuide/Clouds/Movie')
if k is not None:
m = k.movies
m1 = m.first()
if m1 is not None:
m2 = m.next(m1)
m3 = m.next(m2)
m4 = m.next(m3)
m5 = m.next(m4)
def p(index=None, attr='title', a=k):
i = 0
if index is not None:
m = a.movies.firstInIndex(index)
while m is not None:
print i, ':', m.getAttributeValue(attr).encode('iso-8859-1')
m = a.movies.nextInIndex(m, index)
i += 1
else:
for m in a.movies:
print i, ':', m.getItemDisplayName()
i += 1
def q(index):
m = k.movies
for i in xrange(m._index(index)._count):
print i, ':', m.getByIndex(index, i).title
def g(i, indexName='n'):
return m.getByIndex(indexName, i).title
def s(a=k):
for m in a.movies:
print m.title, ':',
m.synopsis.getPlainTextReader().read().encode('iso-8859-1')
def es(a=k):
for m in a.movies:
synopsis = m.synopsis.getPlainTextReader().read().encode('iso-8859-1')
if not synopsis:
print m.title, ':', synopsis
def apply(item, callable):
for i in item.iterChildren():
callable(i)
apply(i, callable)
def load():
for root in r.iterRoots():
apply(root, lambda item: item.setDirty(True))
def tq(expression):
for item, attribute in TextQuery(expression).run(r):
print item.getItemDisplayName().encode('iso-8859-1')
print
item.getAttributeValue(attribute).getReader().read().encode('iso-8859-1')
print
#r.close()
def wi(cloud=al, item=k, file=stdout, pretty=True):
generator = XMLGenerator(file)
if pretty:
generator = XMLPrettyGenerator(generator)
cloud.writeItems(item.itsUUID, 1, generator)
if pretty:
file.write('\n')
def wa():
a1=ACE(k.itsUUID, Permission.READ)
a2=ACE(c.itsUUID, Permission.WRITE)
acl = ACL()
acl.append(a1)
acl.append(a2)
k.setACL(acl, 'born')
def serve():
def run(*args, **kwds):
server = SOAPRepositoryServer(r)
server.startup()
t = RepositoryThread(r, target=run)
t.setDaemon(True)
t.start()
def ch(item, version, bits, values, references):
print item, version, "0x%x" %(bits), values, references
def hist(item, version, bits, values, references):
print item, version, "0x%x" %(bits), values, references
class o(object):
pass
def nop(*args, **kwds):
pass
o.attr = 5
7983271 function calls (7782644 primitive calls) in 83.202 CPU seconds
Ordered by: cumulative time
List reduced from 533 to 50 due to restriction <50>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 83.202 83.202 Sharing.py:191(get)
1 0.000 0.000 83.202 83.202 Sharing.py:615(get)
1 0.000 0.000 57.290 57.290 Sharing.py:630(_get)
1 0.001 0.001 57.289 57.289
Sharing.py:582(_conditionalGetItem)
1 0.000 0.000 57.286 57.286 Sharing.py:1059(_getItem)
1 1.689 1.689 57.286 57.286 ICalendar.py:294(importProcess)
2 0.166 0.083 25.912 12.956 DBRepositoryView.py:242(commit)
5046 0.090 0.000 24.913 0.005 DBRepositoryView.py:348(_saveItem)
5046 0.463 0.000 24.786 0.005 DBItemIO.py:35(writeItem)
5046 0.323 0.000 21.930 0.004 ItemIO.py:14(writeItem)
1 0.531 0.531 17.632 17.632 vobject.py:710(readComponents)
3000 0.061 0.000 15.653 0.005 Kind.py:156(newItem)
3000 0.049 0.000 15.524 0.005 Calendar.py:1086(__init__)
3000 0.142 0.000 15.475 0.005 Calendar.py:227(__init__)
35991/35990 0.537 0.000 14.384 0.000 Item.py:188(setAttributeValue)
5046 0.048 0.000 13.702 0.003 DBItemIO.py:225(_values)
5046 1.295 0.000 13.654 0.003 Values.py:209(_writeValues)
15234 0.632 0.000 13.076 0.001 Values.py:475(_setValue)
45900 2.029 0.000 11.427 0.000 DBItemIO.py:233(_value)
3004 0.105 0.000 10.208 0.003 items.py:167(__init__)
30468 0.477 0.000 9.347 0.000 Values.py:532(_setRef)
10229 0.187 0.000 8.928 0.001 RefCollections.py:204(append)
135586/117587 1.054 0.000 8.889 0.000 Item.py:1154(setDirty)
120130/45763 1.222 0.000 8.573 0.000 DBItemIO.py:117(writeValue)
3978 0.087 0.000 8.522 0.002 schema.py:584(__init__)
3978 0.387 0.000 8.402 0.002 Item.py:28(__init__)
58746/52746 0.798 0.000 7.933 0.000 Item.py:1224(_fireChanges)
26902 1.077 0.000 7.402 0.000
vobject.py:623(textLineToContentLine)
7227 0.111 0.000 7.383 0.001 Values.py:527(_addValue)
5046 0.068 0.000 7.229 0.001 DBItemIO.py:229(_references)
5046 1.033 0.000 7.161 0.001 Values.py:784(_writeValues)
18957 0.411 0.000 7.116 0.000 DBRefs.py:264(_setRef)
19461 0.196 0.000 6.767 0.000 RefCollections.py:256(_setRef)
13015 1.642 0.000 6.756 0.001 Types.py:1002(writeValue)
58746/52746 2.410 0.000 6.035 0.000 Monitors.py:84(invoke)
26902 3.407 0.000 5.713 0.000 vobject.py:475(parseLine)
23439 0.669 0.000 5.634 0.000 LinkedMap.py:156(__setitem__)
35987 1.559 0.000 5.376 0.000 DBItemIO.py:344(_ref)
3974 0.026 0.000 5.337 0.001 RefCollections.py:195(add)
2002 0.645 0.000 5.118 0.003 Calendar.py:1114(getLocation)
3975/1 0.280 0.000 4.871 4.871
vobject.py:345(transformChildrenToNative)
3000 0.037 0.000 4.552 0.002 collections.py:313(add)
3973 0.079 0.000 4.370 0.001 Kind.py:900(iterItems)
25925/22925 0.189 0.000 4.341 0.000 vobject.py:95(transformToNative)
23001 0.166 0.000 4.229 0.000 Item.py:2203(_refList)
3973 0.073 0.000 4.116 0.001 Query.py:45(run)
23001 0.257 0.000 4.059 0.000
DBRepositoryView.py:108(_createRefList)
3973 4.041 0.001 4.042 0.001 Query.py:60(_run)
22962 0.510 0.000 3.799 0.000 DBRefs.py:222(__init__)
3978 0.435 0.000 3.657 0.001 Kind.py:605(getInitialValues)
6523867 function calls (6353239 primitive calls) in 69.068 CPU seconds
Ordered by: cumulative time
List reduced from 512 to 50 due to restriction <50>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 69.068 69.068 Sharing.py:191(get)
1 0.000 0.000 69.068 69.068 Sharing.py:615(get)
1 0.000 0.000 45.798 45.798 Sharing.py:630(_get)
1 0.001 0.001 45.797 45.797
Sharing.py:582(_conditionalGetItem)
1 0.000 0.000 45.795 45.795 Sharing.py:1059(_getItem)
1 1.652 1.652 45.795 45.795 ICalendar.py:294(importProcess)
2 0.146 0.073 23.269 11.635 DBRepositoryView.py:242(commit)
5046 0.088 0.000 22.391 0.004 DBRepositoryView.py:348(_saveItem)
5046 0.447 0.000 22.265 0.004 DBItemIO.py:34(writeItem)
5046 0.332 0.000 19.757 0.004 ItemIO.py:14(writeItem)
1 0.538 0.538 18.012 18.012 vobject.py:710(readComponents)
5046 0.047 0.000 12.727 0.003 DBItemIO.py:224(_values)
5046 0.963 0.000 12.680 0.003 Values.py:111(_writeValues)
45900 2.075 0.000 11.665 0.000 DBItemIO.py:232(_value)
3000 0.059 0.000 9.585 0.003 Kind.py:156(newItem)
3000 0.057 0.000 9.465 0.003 Calendar.py:1086(__init__)
3000 0.143 0.000 9.408 0.003 Calendar.py:227(__init__)
120130/45763 1.356 0.000 8.707 0.000 DBItemIO.py:116(writeValue)
26902 1.105 0.000 7.167 0.000
vobject.py:623(textLineToContentLine)
3978 0.098 0.000 6.994 0.002 schema.py:584(__init__)
3978 1.110 0.000 6.862 0.002 Item.py:28(__init__)
15234 0.605 0.000 6.817 0.000 Values.py:352(_setValue)
13015 1.612 0.000 6.757 0.001 Types.py:1002(writeValue)
3004 0.107 0.000 6.716 0.002 items.py:167(__init__)
35991/35990 0.547 0.000 6.131 0.000 Item.py:149(setAttributeValue)
5046 0.067 0.000 6.044 0.001 DBItemIO.py:228(_references)
5046 0.783 0.000 5.977 0.001 Values.py:654(_writeValues)
10229 0.180 0.000 5.598 0.001 RefCollections.py:199(append)
26902 3.558 0.000 5.442 0.000 vobject.py:475(parseLine)
35987 1.536 0.000 5.153 0.000 DBItemIO.py:342(_ref)
3975/1 0.281 0.000 4.803 4.803
vobject.py:345(transformChildrenToNative)
30468 0.185 0.000 4.798 0.000 Values.py:399(_setRef)
25925/22925 0.192 0.000 4.275 0.000 vobject.py:95(transformToNative)
2002 0.679 0.000 4.001 0.002 Calendar.py:1114(getLocation)
23001 0.171 0.000 3.891 0.000 Item.py:2085(_refList)
23001 0.238 0.000 3.716 0.000
DBRepositoryView.py:108(_createRefList)
23439 0.762 0.000 3.508 0.000 LinkedMap.py:152(__setitem__)
18957 0.273 0.000 3.507 0.000 DBRefs.py:256(_setRef)
3974 0.027 0.000 3.486 0.001 RefCollections.py:190(add)
22962 0.513 0.000 3.474 0.000 DBRefs.py:214(__init__)
19461 0.303 0.000 3.288 0.000 RefCollections.py:247(_setRef)
3949 0.066 0.000 3.225 0.001 Kind.py:906(iterItems)
24170 1.192 0.000 3.182 0.000 DBRefs.py:266(_saveValues)
3949 0.074 0.000 3.102 0.001 Query.py:45(run)
3000 0.039 0.000 3.101 0.001 collections.py:313(add)
3949 3.027 0.001 3.027 0.001 Query.py:60(_run)
3978 0.264 0.000 3.008 0.001 Kind.py:605(getInitialValues)
22925 0.239 0.000 2.953 0.000 vobject.py:693(modifyTop)
29900/9976 1.284 0.000 2.724 0.000 vobject.py:71(autoBehavior)
120979 1.003 0.000 2.699 0.000 DBItemIO.py:305(_type)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev