Re: [Doc-SIG] Building Python Document 30% faster.

Naoki INADA Sat, 04 Apr 2009 09:03:32 -0700

Hi Georg.

>> The attached patches make building the documentation 30% faster.
>> (In my environment, roughly 330 sec -> 220 sec.)
>>
>> I posted sphinx.patch to Bitbucket, but I don't know where to post
>> docutils.patch.
>> Could anyone review these patches?
>
> I will, when I have a bit more time.


Thank you.

>> But the searchindex.js built with PyStemmer differs from the one built
>> with PorterStemmer.
>
> This could be a problem.  The client-side search implemented in JavaScript
> uses exactly the same stemmer (which is necessary to be able to find all
> words).  In short, if you can find a C implementation of the Porter stemmer
> we could include it in Sphinx as an optional extension.

I see.
The original Porter stemmer is here:
http://tartarus.org/~martin/PorterStemmer/

It includes an implementation in C. I'll try to write a Python wrapper for
it with SWIG and compare the resulting searchindex.js. Please wait a while.
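
A minimal sketch of why the index and the search must use the same stemmer.
Both stemmers below are toy functions made up for illustration (they are not
the real Porter or PyStemmer algorithms), and `build_index` and `search` are
hypothetical helpers, not Sphinx code.

```python
# Toy stemmers for illustration only; NOT the Porter or PyStemmer algorithms.
def stem_a(word):
    """Strip a single trailing 's'."""
    return word[:-1] if word.endswith("s") else word


def stem_b(word):
    """Strip a trailing 'es', otherwise a trailing 's'."""
    for suffix in ("es", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


def build_index(docs, stem):
    """Map each stemmed word to the set of document names containing it."""
    index = {}
    for name, text in docs.items():
        for word in text.lower().split():
            index.setdefault(stem(word), set()).add(name)
    return index


def search(index, query, stem):
    """Look up a single query word after stemming it."""
    return index.get(stem(query.lower()), set())


docs = {"library.txt": "standard classes and modules"}
index = build_index(docs, stem_a)          # index built with stemmer A

same = search(index, "classes", stem_a)    # query stemmed with stemmer A
mixed = search(index, "classes", stem_b)   # query stemmed with stemmer B
print(same, mixed)
```

With matching stemmers the word is found; with mismatched stemmers the query
stem is not a key in the index, so the document is missed.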


>> 2. Avoid building OptionParser many times.
>> Sphinx calls docutils.core.publish_parts() without the `settings`
>> argument many times.
>> Each call builds a new docutils.frontend.OptionParser, which consumes
>> 29 seconds in total.
>>
>> 3. Avoid building NestedStateMachine many times.
>> NestedStateMachine is built and destroyed many times.
>> Recycling the state machine gives a significant performance gain.
>
> I assume that both of these are in the second commit I see on bitbucket?
> Both look like worthy optimizations.

The former is on Bitbucket:
http://bitbucket.org/methane/sphinx-speedup/changeset/72fa0ceefcae/
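
A rough sketch of the idea behind item 2, not the actual patch. It uses the
standard library's `optparse` as a stand-in for the expensive docutils option
parser; `build_parser`, `publish_slow`, `publish_fast` and the `--tab-width`
option are hypothetical names chosen for illustration. It also assumes that
callers never modify the shared settings object.

```python
import optparse

build_count = 0  # counts how often the "expensive" parser is constructed


def build_parser():
    """Stand-in for an expensive option parser construction (hypothetical)."""
    global build_count
    build_count += 1
    parser = optparse.OptionParser()
    parser.add_option("--tab-width", type="int", default=8)
    return parser


def get_default_settings():
    """Build a parser and return its default option values."""
    values, _args = build_parser().parse_args([])
    return values


def publish_slow(text):
    """Rebuilds the parser on every call."""
    settings = get_default_settings()
    return text.expandtabs(settings.tab_width)


_cached_settings = None


def publish_fast(text, settings=None):
    """Builds the default settings once and reuses them afterwards."""
    global _cached_settings
    if settings is None:
        if _cached_settings is None:
            _cached_settings = get_default_settings()
        settings = _cached_settings
    return text.expandtabs(settings.tab_width)


for _ in range(3):
    publish_slow("a\tb")
slow_builds = build_count

build_count = 0
for _ in range(3):
    publish_fast("a\tb")
fast_builds = build_count
print(slow_builds, fast_builds)
```

The uncached version constructs the parser once per call, the cached version
only once overall.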

The latter is not on Bitbucket because NestedStateMachine is part of
docutils, not Sphinx.
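
A minimal sketch of the recycling idea in item 3, under the assumption that a
state machine can be reset and reused safely. `ToyStateMachine` and
`MachinePool` are hypothetical classes for illustration; they are not
docutils' NestedStateMachine and not the code in docutils.patch.

```python
class ToyStateMachine:
    """Hypothetical stand-in for a state machine that is costly to build."""

    created = 0  # number of instances constructed so far

    def __init__(self):
        ToyStateMachine.created += 1
        self.lines = []

    def run(self, lines):
        """Pretend to parse the lines; return how many were processed."""
        self.lines = list(lines)
        return len(self.lines)

    def reset(self):
        """Drop per-run state so the instance can be reused."""
        self.lines = []


class MachinePool:
    """Hands out idle machines and creates a new one only when none is free."""

    def __init__(self):
        self._free = []

    def acquire(self):
        return self._free.pop() if self._free else ToyStateMachine()

    def release(self, machine):
        machine.reset()
        self._free.append(machine)


pool = MachinePool()
results = []
for block in (["a", "b"], ["c"], ["d", "e", "f"]):
    machine = pool.acquire()
    try:
        results.append(machine.run(block))
    finally:
        pool.release(machine)  # return it for the next block

print(results, ToyStateMachine.created)
```

Three blocks are processed with a single instance. A nested parse that
acquires a machine while another is still in use would simply get a second
one from the pool.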

-- 
Naoki INADA  <inad...@klab.jp>
   KLab Inc.  <http://www.klab.jp>
_______________________________________________
Doc-SIG maillist  -  Doc-SIG@python.org
http://mail.python.org/mailman/listinfo/doc-sig
