The changes I checked in last night,
  http://viewcvs.osafoundation.org/chandler/?rev=7540&view=rev and
  http://viewcvs.osafoundation.org/chandler/?rev=7541&view=rev,
focused on some very often called low-level repository routines, such as marking items dirty, invoking monitors, setting up bi-refs, etc... Some of these are now vastly faster and this translates into some performance improvements for many if not most repository operations.

I've been using two python tools, hotshot and timeit to measure the performance of various operations. In particular, I've been running hotshot against the 3000 event calendar import bug
  http://bugzilla.osafoundation.org/show_bug.cgi?id=3988
by running the attached t79.py python file.

Recently, the event import code was changed to also include a commit, so the performance characteristics of bug 3988 have shifted quite a bit. Still, the two attached 'cumulative time' hotshot profiles, r7541.txt and r7538.txt run against Chandler before (rev 7538) and after my changes (rev 7541) show significant improvements.

Before going into the details I'd like to briefly define the meaning of time units:
 - all times were measured on a 1 Ghz Powerbook G4 with 1GB of RAM running Mac
   OS X 10.4.2, without any other non mostly-idle process, such as iTunes,
   iChat, Skype, FireFox, etc... running.
 - the attached hotshot profiles list cumulative times, that is the total
   time spent in a given routine including all its sub-routines.
 - take hotshot's CPU seconds with a grain of salt although I've been told
   that in python 2.4 they're much closer to reality.
 - the % diffs listed below reflect the difference between the new time and
   the old time. For example, if something got ten times faster it also
   means that it got 90% faster because t0 - 90% = t1 ==> t1 = t0 / 10, hence
   in the table below:
       % diff  =  (1 - (after / before)) * 100

Before going into the details I'd like to also briefly describe how to run these tests yourself on your own favorite platform:
 - the 3000 event calendar import test is done with the help of the attached
   t79.py script:
   > cd $CHANDLERHOME
   > RunPython
   >>> from t79 import *
   the raw hotshot profile is 32 Mb and takes a *long* time to load
 - the individual timeit tests are run with the help of the attached t7.py and
   t6.py scripts, where t7.py is used to create a sample 'data' repository and
   t6.py is used to open and inspect it, for example:
   > cd $CHANDLERHOME
   > RunPython t7.py
   > RunPython -m timeit -s "from t6 import k; d=k._values; f=k.VDIRTY" \
                            "k.setDirty(f, 'born', d)"

The hotshot details (3000 event calendar import):
                         | after (r7541) | before (r7538) | % diff
--------------------------------------------------------------------
total time               |  69.07 s      |  83.20 s       |  17   %
commit()                 |  23.27 s      |  25.91 s       |  10.2 %
total time - commit()    |  45.80 s      |  57.29 s       |  20   %
--------------------------------------------------------------------
new item creation via    |               |                |
  Kind.newItem()         |   9.58 s      |  15.65 s       |  38.7 %
Item.__init__()          |   6.86 s      |   8.40 s       |  18.3 %
Item.setAttributeValue   |   6.13 s      |  14.38 s       |  57.4 %
establishing bi-refs in  |               |                |
  Values._setValue       |   6.82 s      |  13.08 s       |  47.9 %
establishing one side of |               |                |
   bi-refs in a ref coll |   3.29 s      |   6.77 s       |  51.4 %
--------------------------------------------------------------------

The timeit details (run with t6.py):
                         | after (r7541) | before (r7538) | % diff
--------------------------------------------------------------------
Item.setDirty()          |  10.3  usec   |  65.7  usec    |  84.3 %
monitor invocation when  |               |                |
no monitors (overhead)   |   2.8  usec   |  27.2  usec    |  89.7 %
view.find(uuid) of item  |               |                |
already loaded           |   3.14 usec   |   6.33 usec    |  50.4 %
view[uuid] of item       |               |                |
already loaded           |   1.37 usec   |   7.82 usec    |  82.5 %
--------------------------------------------------------------------
returning a defaultValue |               |                |
for an attribute         |   0.97 usec   |  86.7  usec    |  98.9 %
--------------------------------------------------------------------

Where that leaves us with the various performance acceptance tests, I don't know. The tests are still shifting around, so it is kind of difficult to assess, yet I'm pretty sure a 'bunch of amorphous things' still must have gotten noticeably faster.

In any case, one is never done with performance work, watch this space for more...

I hope this helps !

Andi..
import os
import hotshot, hotshot.stats

from osaf.sharing.Sharing import FileSystemConduit, Share
from osaf.sharing.ICalendar import ICalendarFormat
from application.Parcel import Manager as ParcelManager

from repository.persistence.DBRepository import DBRepository

r = DBRepository('data')
r.create(ramdb='True', refcounted='True')
view = r.view

manager = ParcelManager.get(view, path=['parcels'])
manager.loadParcels(["osaf.pim.calendar"])

path = os.path.join(os.getenv('CHANDLERHOME') or '.',
                    'parcels', 'osaf', 'sharing', 'tests')

conduit = FileSystemConduit(name="conduit", sharePath=path,
                            shareName="3kevents.ics", view=view)
format = ICalendarFormat(name="format", view=view)
share = Share(name="share", conduit=conduit, format=format, view=view)

#share.get()

prof = hotshot.Profile('t79.prof')
prof.runcall(share.get)
prof.close()

def cum():
    stats.sort_stats('cum')
    stats.print_stats(50)

def calls():
    stats.sort_stats('time', 'calls')
    stats.print_stats(50)

print 'loading profile'
stats = hotshot.stats.load('t79.prof')
stats.strip_dirs()
cum()
from repository.persistence.DBRepository import DBRepository

r = DBRepository('data')
r.create()
r = r.view

#r.loadPack('repository/packs/chandler.pack')
r.loadPack('repository/tests/data/packs/cineguide.pack')
r.findPath('//CineGuide/KHepburn').movies.addIndex('n', 'numeric')
r.findPath('//CineGuide/KHepburn').movies.addIndex('t', 'attribute',
                                                   attribute='title')
r.findPath('//CineGuide/KHepburn').movies.addIndex('f', 'string',
                                                   attribute='frenchTitle',
                                                   locale='fr_FR')
r.check()
r.commit()

r = r.repository
r.close()

#c = r['CineGuide']
#k = c['KHepburn']
#m = k.movies.first()
from repository.persistence.DBRepository import DBRepository
from repository.persistence.Repository import RepositoryThread
#from repository.remote.RepositoryServer import SOAPRepositoryServer
from repository.item.Query import TextQuery, KindQuery
from repository.item.Access import ACL, ACE, Permission
from repository.util.SAX import XMLPrettyGenerator, XMLGenerator
from sys import stdout

r = DBRepository('data')
r.open(recover=False, force=False, refcounted=True)
r = r.view

#r.check()

a = r.findPath('//Schema/Core/Attribute')
#e = r.findPath('//Schema/Core/Enumeration')
c = r.findPath('//CineGuide')
k = r.findPath('//CineGuide/KHepburn')
i = r.findPath('//Collections/foo')
strings = r.findPath('//Collections/Strings')
l = r.findPath('//Schema/Core/Clouds/Kind')
al = r.findPath('//Schema/CineGuide/Clouds/Actor')
ml = r.findPath('//Schema/CineGuide/Clouds/Movie')

if k is not None:
    m = k.movies
    m1 = m.first()
    if m1 is not None:
        m2 = m.next(m1)
        m3 = m.next(m2)
        m4 = m.next(m3)
        m5 = m.next(m4)

def p(index=None, attr='title', a=k):
    i = 0
    if index is not None:
        m = a.movies.firstInIndex(index)
        while m is not None:
            print i, ':', m.getAttributeValue(attr).encode('iso-8859-1')
            m = a.movies.nextInIndex(m, index)
            i += 1
    else:
        for m in a.movies:
            print i, ':', m.getItemDisplayName()
            i += 1

def q(index):
    m = k.movies
    for i in xrange(m._index(index)._count):
        print i, ':', m.getByIndex(index, i).title

def g(i, indexName='n'):
    return m.getByIndex(indexName, i).title

def s(a=k):
    for m in a.movies:
        print m.title, ':', 
m.synopsis.getPlainTextReader().read().encode('iso-8859-1')

def es(a=k):
    for m in a.movies:
        synopsis = m.synopsis.getPlainTextReader().read().encode('iso-8859-1')
        if not synopsis:
            print m.title, ':', synopsis

def apply(item, callable):
    for i in item.iterChildren():
        callable(i)
        apply(i, callable)

def load():
    for root in r.iterRoots():
        apply(root, lambda item: item.setDirty(True))

def tq(expression):
    for item, attribute in TextQuery(expression).run(r):
        print item.getItemDisplayName().encode('iso-8859-1')
        print 
item.getAttributeValue(attribute).getReader().read().encode('iso-8859-1')
        print

#r.close()

def wi(cloud=al, item=k, file=stdout, pretty=True):
    generator = XMLGenerator(file)
    if pretty:
        generator = XMLPrettyGenerator(generator)
    cloud.writeItems(item.itsUUID, 1, generator)
    if pretty:
        file.write('\n')

def wa():
    a1=ACE(k.itsUUID, Permission.READ)
    a2=ACE(c.itsUUID, Permission.WRITE)
    acl = ACL()
    acl.append(a1)
    acl.append(a2)
    k.setACL(acl, 'born')

def serve():
    def run(*args, **kwds):
        server = SOAPRepositoryServer(r)
        server.startup()

    t = RepositoryThread(r, target=run)
    t.setDaemon(True)
    t.start()

def ch(item, version, bits, values, references):
    print item, version, "0x%x" %(bits), values, references

def hist(item, version, bits, values, references):
    print item, version, "0x%x" %(bits), values, references

class o(object):
    pass

def nop(*args, **kwds):
    pass

o.attr = 5
         7983271 function calls (7782644 primitive calls) in 83.202 CPU seconds

   Ordered by: cumulative time
   List reduced from 533 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   83.202   83.202 Sharing.py:191(get)
        1    0.000    0.000   83.202   83.202 Sharing.py:615(get)
        1    0.000    0.000   57.290   57.290 Sharing.py:630(_get)
        1    0.001    0.001   57.289   57.289 
Sharing.py:582(_conditionalGetItem)
        1    0.000    0.000   57.286   57.286 Sharing.py:1059(_getItem)
        1    1.689    1.689   57.286   57.286 ICalendar.py:294(importProcess)
        2    0.166    0.083   25.912   12.956 DBRepositoryView.py:242(commit)
     5046    0.090    0.000   24.913    0.005 DBRepositoryView.py:348(_saveItem)
     5046    0.463    0.000   24.786    0.005 DBItemIO.py:35(writeItem)
     5046    0.323    0.000   21.930    0.004 ItemIO.py:14(writeItem)
        1    0.531    0.531   17.632   17.632 vobject.py:710(readComponents)
     3000    0.061    0.000   15.653    0.005 Kind.py:156(newItem)
     3000    0.049    0.000   15.524    0.005 Calendar.py:1086(__init__)
     3000    0.142    0.000   15.475    0.005 Calendar.py:227(__init__)
35991/35990    0.537    0.000   14.384    0.000 Item.py:188(setAttributeValue)
     5046    0.048    0.000   13.702    0.003 DBItemIO.py:225(_values)
     5046    1.295    0.000   13.654    0.003 Values.py:209(_writeValues)
    15234    0.632    0.000   13.076    0.001 Values.py:475(_setValue)
    45900    2.029    0.000   11.427    0.000 DBItemIO.py:233(_value)
     3004    0.105    0.000   10.208    0.003 items.py:167(__init__)
    30468    0.477    0.000    9.347    0.000 Values.py:532(_setRef)
    10229    0.187    0.000    8.928    0.001 RefCollections.py:204(append)
135586/117587    1.054    0.000    8.889    0.000 Item.py:1154(setDirty)
120130/45763    1.222    0.000    8.573    0.000 DBItemIO.py:117(writeValue)
     3978    0.087    0.000    8.522    0.002 schema.py:584(__init__)
     3978    0.387    0.000    8.402    0.002 Item.py:28(__init__)
58746/52746    0.798    0.000    7.933    0.000 Item.py:1224(_fireChanges)
    26902    1.077    0.000    7.402    0.000 
vobject.py:623(textLineToContentLine)
     7227    0.111    0.000    7.383    0.001 Values.py:527(_addValue)
     5046    0.068    0.000    7.229    0.001 DBItemIO.py:229(_references)
     5046    1.033    0.000    7.161    0.001 Values.py:784(_writeValues)
    18957    0.411    0.000    7.116    0.000 DBRefs.py:264(_setRef)
    19461    0.196    0.000    6.767    0.000 RefCollections.py:256(_setRef)
    13015    1.642    0.000    6.756    0.001 Types.py:1002(writeValue)
58746/52746    2.410    0.000    6.035    0.000 Monitors.py:84(invoke)
    26902    3.407    0.000    5.713    0.000 vobject.py:475(parseLine)
    23439    0.669    0.000    5.634    0.000 LinkedMap.py:156(__setitem__)
    35987    1.559    0.000    5.376    0.000 DBItemIO.py:344(_ref)
     3974    0.026    0.000    5.337    0.001 RefCollections.py:195(add)
     2002    0.645    0.000    5.118    0.003 Calendar.py:1114(getLocation)
   3975/1    0.280    0.000    4.871    4.871 
vobject.py:345(transformChildrenToNative)
     3000    0.037    0.000    4.552    0.002 collections.py:313(add)
     3973    0.079    0.000    4.370    0.001 Kind.py:900(iterItems)
25925/22925    0.189    0.000    4.341    0.000 vobject.py:95(transformToNative)
    23001    0.166    0.000    4.229    0.000 Item.py:2203(_refList)
     3973    0.073    0.000    4.116    0.001 Query.py:45(run)
    23001    0.257    0.000    4.059    0.000 
DBRepositoryView.py:108(_createRefList)
     3973    4.041    0.001    4.042    0.001 Query.py:60(_run)
    22962    0.510    0.000    3.799    0.000 DBRefs.py:222(__init__)
     3978    0.435    0.000    3.657    0.001 Kind.py:605(getInitialValues)


         6523867 function calls (6353239 primitive calls) in 69.068 CPU seconds

   Ordered by: cumulative time
   List reduced from 512 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   69.068   69.068 Sharing.py:191(get)
        1    0.000    0.000   69.068   69.068 Sharing.py:615(get)
        1    0.000    0.000   45.798   45.798 Sharing.py:630(_get)
        1    0.001    0.001   45.797   45.797 
Sharing.py:582(_conditionalGetItem)
        1    0.000    0.000   45.795   45.795 Sharing.py:1059(_getItem)
        1    1.652    1.652   45.795   45.795 ICalendar.py:294(importProcess)
        2    0.146    0.073   23.269   11.635 DBRepositoryView.py:242(commit)
     5046    0.088    0.000   22.391    0.004 DBRepositoryView.py:348(_saveItem)
     5046    0.447    0.000   22.265    0.004 DBItemIO.py:34(writeItem)
     5046    0.332    0.000   19.757    0.004 ItemIO.py:14(writeItem)
        1    0.538    0.538   18.012   18.012 vobject.py:710(readComponents)
     5046    0.047    0.000   12.727    0.003 DBItemIO.py:224(_values)
     5046    0.963    0.000   12.680    0.003 Values.py:111(_writeValues)
    45900    2.075    0.000   11.665    0.000 DBItemIO.py:232(_value)
     3000    0.059    0.000    9.585    0.003 Kind.py:156(newItem)
     3000    0.057    0.000    9.465    0.003 Calendar.py:1086(__init__)
     3000    0.143    0.000    9.408    0.003 Calendar.py:227(__init__)
120130/45763    1.356    0.000    8.707    0.000 DBItemIO.py:116(writeValue)
    26902    1.105    0.000    7.167    0.000 
vobject.py:623(textLineToContentLine)
     3978    0.098    0.000    6.994    0.002 schema.py:584(__init__)
     3978    1.110    0.000    6.862    0.002 Item.py:28(__init__)
    15234    0.605    0.000    6.817    0.000 Values.py:352(_setValue)
    13015    1.612    0.000    6.757    0.001 Types.py:1002(writeValue)
     3004    0.107    0.000    6.716    0.002 items.py:167(__init__)
35991/35990    0.547    0.000    6.131    0.000 Item.py:149(setAttributeValue)
     5046    0.067    0.000    6.044    0.001 DBItemIO.py:228(_references)
     5046    0.783    0.000    5.977    0.001 Values.py:654(_writeValues)
    10229    0.180    0.000    5.598    0.001 RefCollections.py:199(append)
    26902    3.558    0.000    5.442    0.000 vobject.py:475(parseLine)
    35987    1.536    0.000    5.153    0.000 DBItemIO.py:342(_ref)
   3975/1    0.281    0.000    4.803    4.803 
vobject.py:345(transformChildrenToNative)
    30468    0.185    0.000    4.798    0.000 Values.py:399(_setRef)
25925/22925    0.192    0.000    4.275    0.000 vobject.py:95(transformToNative)
     2002    0.679    0.000    4.001    0.002 Calendar.py:1114(getLocation)
    23001    0.171    0.000    3.891    0.000 Item.py:2085(_refList)
    23001    0.238    0.000    3.716    0.000 
DBRepositoryView.py:108(_createRefList)
    23439    0.762    0.000    3.508    0.000 LinkedMap.py:152(__setitem__)
    18957    0.273    0.000    3.507    0.000 DBRefs.py:256(_setRef)
     3974    0.027    0.000    3.486    0.001 RefCollections.py:190(add)
    22962    0.513    0.000    3.474    0.000 DBRefs.py:214(__init__)
    19461    0.303    0.000    3.288    0.000 RefCollections.py:247(_setRef)
     3949    0.066    0.000    3.225    0.001 Kind.py:906(iterItems)
    24170    1.192    0.000    3.182    0.000 DBRefs.py:266(_saveValues)
     3949    0.074    0.000    3.102    0.001 Query.py:45(run)
     3000    0.039    0.000    3.101    0.001 collections.py:313(add)
     3949    3.027    0.001    3.027    0.001 Query.py:60(_run)
     3978    0.264    0.000    3.008    0.001 Kind.py:605(getInitialValues)
    22925    0.239    0.000    2.953    0.000 vobject.py:693(modifyTop)
29900/9976    1.284    0.000    2.724    0.000 vobject.py:71(autoBehavior)
   120979    1.003    0.000    2.699    0.000 DBItemIO.py:305(_type)


_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev

Reply via email to