Bah. My newsreader lost my reply when the WiFi connection dropped out... attempt #2.

On 2017-04-12 18:45:16 +0000, bart4...@gmail.com said:

On Wednesday, 12 April 2017 16:04:53 UTC+1, Brecht Machiels  wrote:
On 2017-04-12 14:46:45 +0000, Michael Torrie said:

It would be great if you could run the benchmark I mention in my first link and share the results. Highly appreciated!

Were you ever able to isolate what it was that's taking up most of the time? Either in general or in the bit that pypy has trouble with. Or is execution time spread too widely?

It's been a while since I last focused on performance, but the profile is still pretty flat. It's easy enough to verify (see also the URL referenced below):

   python -m cProfile -o demo.prof `which rinoh` -f restructuredtext demo.rst
   python -m pstats demo.prof

   demo.prof% strip
   demo.prof% sort tottime
   demo.prof% stats 15

Thu Apr 13 10:59:19 2017    demo.prof

        35193174 function calls (27868271 primitive calls) in 22.461 seconds

  Ordered by: internal time
  List reduced from 5499 to 15 due to restriction <15>

  ncalls       tottime  percall  cumtime  percall filename:lineno(function)
6020041/321084   2.812    0.000    2.884    0.000 layout.py:152(document_part)
  287201        1.211    0.000    6.156    0.000 style.py:645(match)
   98788        0.901    0.000    1.965    0.000 version.py:198(__init__)
419928/232734    0.751    0.000   17.332    0.000 util.py:109(function_wrapper)
  344783        0.588    0.000    1.198    0.000 style.py:319(match)
 1302467        0.534    0.000    0.840    0.000 style.py:438(__hash__)
128992/83504     0.459    0.000   15.477    0.000 style.py:556(get_style_recursive)
1472251/1472250  0.399    0.000    0.469    0.000 {built-in method builtins.isinstance}
  701320        0.395    0.000    0.679    0.000 parse.py:18(reader)
306381/10913     0.389    0.000    6.546    0.000 style.py:757(find_matches)
89622/86768      0.368    0.000    2.126    0.000 style.py:369(match)
     176        0.311    0.002    0.840    0.005 parse.py:157(check_sum)
339968/10360     0.308    0.000    0.417    0.000 dimension.py:239(__float__)
   95312        0.301    0.000    0.347    0.000 version.py:343(_cmpkey)
    2642        0.288    0.000    3.380    0.001 __init__.py:792(resolve)
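
For anyone who would rather script this than type at the pstats prompt, here is a minimal sketch of the same three steps (strip, sort tottime, stats 15) using the standard-library pstats module. The function name top_functions is made up for illustration, and it assumes demo.prof was already written by the cProfile command above.

```python
import io
import pstats


def top_functions(profile_path, count=15):
    """Return a text report of the `count` costliest functions by internal time.

    `top_functions` is a hypothetical helper name, not part of rinohtype.
    """
    buffer = io.StringIO()
    # Send the report to an in-memory buffer instead of stdout.
    stats = pstats.Stats(profile_path, stream=buffer)
    # Same as "strip", "sort tottime" and "stats 15" at the interactive prompt.
    stats.strip_dirs().sort_stats("tottime").print_stats(count)
    return buffer.getvalue()


# Usage, assuming demo.prof exists in the current directory:
# print(top_functions("demo.prof"))
```

Returning the report as a string rather than printing it makes it easy to save or diff the results of a CPython run against a PyPy run.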

(I looked at your project but it's too large, and didn't get much further with the github benchmark, which requires me to subscribe, but the .sh file extensions don't seem too promising to someone on Windows.)

GitHub benchmark? .sh file extensions?

You can easily run some benchmarks following the instructions here (pip install): https://bitbucket.org/pypy/pypy/issues/2365/rinohtype-much-slower-on-pypy3

As I commented on that issue, I have been able to run the benchmarks using PyPy3 5.7.1 beta, which is now significantly faster than CPython. That's very promising!

Your program seems to be to do with typesetting. Is it possible to at least quantify the work that is being done in terms of total bytes (and total files) of input, and bytes of output? That might enable comparisons with other systems executing similar tasks, to see if the Python version is taking unreasonably long.

The Sphinx benchmark's source reStructuredText files add up to 584 KB. The output PDF file is almost 3 MB (includes fonts and images). Note that the input document is parsed into a document tree where each paragraph is represented by an object of the Paragraph class, containing StyledText objects and so on. The total memory used is about 1 GB!

LaTeX is orders of magnitude faster, but requires multiple passes. Its memory usage is probably much lower since it works stream-based.

Best regards,
Brecht

--
https://mail.python.org/mailman/listinfo/python-list
