On 08/14/2013 10:42 PM, Jun Omae wrote:
> Hi,
> 
> On Sun, Aug 11, 2013 at 5:29 AM, Eli Carter <[email protected]> wrote:
>> part3: uncache previous_rev()
>>     Makes previous_rev() unconditionally go to the repository without
>> hitting the node_change table.  (Was "part3a".)
>>
>> part4: warn on slow next_rev()
>>     This logs a warning if next_rev() takes more than 10 seconds to
>> complete, suggesting enabling [trac] debug_sql.  (Was "part3b".)
>>     The purpose of this is to give the admin that first breadcrumb to
>> lead them to the underlying cause of their slowness.
> 
> I tried performance tuning of `_next_prev_rev`, tuning-_next_prev_rev.diff.
> 
>  a. If the node is a directory, don't check changes of node children.

I think you meant "only if the node is a directory, check changes of
node children"?

>  b. In condition for deletion of path ancestors, first check change_type.
>     The check for `path IN (...)` is expensive, if the path is deep.

Makes sense.

>  c. In condition for deletion of path ancestors, don't check path itself.
>     The checking of path itself is included in the following.
> 
>             # changes on path itself
>             sql += " AND (path=%s"
>             args.append(path)

Makes sense.

> Timing of `_next_prev_rev` with 
> source:trunk/tracopt/mimeview/tests/php.py@8802
> of a clone of http://svn.edgewall.com/repos/trac.
> 
> || Database          ||   before             ||     after            ||
> ||                   ||   <      ||   >      ||   <      ||   >      || 
> direction parameter
> || SQLite 3.3.6      || 0.209179s|| 0.045196s|| 0.103258s|| 0.024124s||
> || MySQL 5.0.95      || 0.230615s|| 0.236727s|| 0.203341s|| 0.214520s|| 
> innodb_buffer_pool_size = 32M
> || PostgreSQL 8.1.23 || 0.085839s|| 0.023007s|| 0.072130s|| 0.018516s||
> 
> 
> Thoughts?

Your optimizations look like worthwhile improvements... especially if
they hold the 50% improvement for other testcases.  I'd be interested in
seeing a different testcase than the php.py file though.  The key issue
I'm seeing is related to the ends of the file's history, not in the
middle.  In other words, when previous_rev or next_rev would return
None, the queries take a very long time.  When there is a previous or
next revision present, they perform reasonably.

The scenarios I'm concerned with:
1) previous_rev takes a long time
Here, you have a file which is created near the end of history.  That
is, in a repository with 100k revisions, the file was *first created* in
revision 90000 or so.  Thus, the previous_rev call will return None, but
will take a long time to run.  I've found that if there is a previous
revision (say 89000) it seems to complete much faster.  So the lack of
previous history is important to the performance test.

2) next_rev takes a long time
Here, you have a file which was created near the beginning of history,
and *has not been modified* since.  So in a repository with 100k
revisions, the file was created in revision 1k or something, and never
changed.  The next_rev call will return None, but will take a long time
to run.

The testcase you're timing has revisions 8734 < 8802 < 11493; I'm not
sure that's going to demonstrate the problem very well.  It does
demonstrate that you've improved its performance in the common case
though, which is good.

Here's an example of what I'm concerned with:
I have a file 'home/trunk/documents/todo-2010' in my repository.  It was
created in revision 243052, with the next couple of commits 243054 and
243056.
When viewing revision 243052, the query for the non-existent previous
revision took 446s (with the original SQL query, Trac 0.12.6), while the
query for the next revision took under a second.  (I lowered the warning
threshold to 1s, and it didn't trigger.)  But when viewing revision
243054, both previous_rev and next_rev took under a second.  Viewing the
most recent version at revision 935851 (of 935878 in the repository),
both previous_rev and next_rev (returning None) take under a second.

I also have a file 'home/trunk/documents/signature' which was added in
revision 17484 and never modified.  So when you view that file, both
previous_rev and next_rev return None.  previous_rev takes 231s, and
next_rev takes 754 seconds.

I suspect that the DB is doing a linear search so that the first entry
it finds is the only entry it needs to return.

Eli

-- 
You received this message because you are subscribed to the Google Groups "Trac 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/trac-dev.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to