#32814: Improve performance of TextNode rendering by providing a special-case of
render_annotated
------------------------------------------------+------------------------
               Reporter:  Keryn Knight          |          Owner:  nobody
                   Type:  Cleanup/optimization  |         Status:  new
              Component:  Template system       |        Version:  dev
               Severity:  Normal                |       Keywords:
           Triage Stage:  Unreviewed            |      Has patch:  0
    Needs documentation:  0                     |    Needs tests:  0
Patch needs improvement:  0                     |  Easy pickings:  0
                  UI/UX:  0                     |
------------------------------------------------+------------------------
 A pull request will be forthcoming shortly, should this be accepted (all
 tests pass against current `main`)

 Currently, when rendering a `TextNode` within any `NodeList` or `ForNode`
 (at least), it is always rendered via `Node.render_annotated(context)`  -
 the generic version for handling any exceptions which occur during
 rendering for which it might be preferable to augment the exception with
 additional debug data.

 This additional layer of complexity isn't free, and for `TextNode` I don't
 think it's necessary. The only way `TextNode.render(context)` can fail
 that I can think of is ending up at a `MemoryError` which is presumably
 not usefully recoverable for rendering a debug page or whatever, anyway.
 It'd be interesting to know if my mental model is wrong :)

 For rendering any single `TextNode` it's pretty cheap, but the more of
 them there are, the more compound costs stacks up, even if the interleaved
 nodes are just comment nodes (block or line).

 By way of example, below are some benchmarks with both timeit and cprofile
 instrumented. Using yappi in lieu of cProfile also shows similar outcomes,
 but is even slower. For the most complex template I had to give up waiting
 on it.

 = Wildly simplistic contrived example

 Render a  single `TextNode` containing the HTML for the standard
 html5boilerplate.

 == Using timeit (single textnode)

 ==== before patch:
 {{{
 Running timeit for
 single_textnode
 100000 loops -> 0.7104 secs
 }}}

 ==== after patch:
 {{{
 Running timeit for
 single_textnode
 100000 loops -> 0.6781 secs
 }}}
 Note that this is small enough to fluctuate quite a bit both before and
 after, so YMMV.

 == Using cProfile (single textnode)

 For the purposes of being able to determine precisely (under cProfile)
 different render_annotated calls for more complex templates (further
 down), I copy-pasted the `Node.render_annotated` definition to the
 `TextNode`, hence the line number offset of `984` ...

 ==== before patch:
 {{{
 Running cProfile for
 single_textnode
 3500002 function calls in 1.398 seconds

 Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 ...
 100000    0.014    0.000    0.014    0.000 base.py:981(render)
 100000    0.030    0.000    0.044    0.000 base.py:984(render_annotated)
 ...
 }}}

 ==== after patch (`render` is never called):
 {{{
 Running cProfile for
 single_textnode
 3400002 function calls in 1.407 seconds

 Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 ...
 100000    0.014    0.000    0.014    0.000 base.py:984(render_annotated)
 ...
 }}}

 = Realistic template:

 For a more realistic use case, the same situation but with
 `contrib/admin/templates/admin/change_list.html` which should be a decent
 excerciser:

 == using timeit (many textnodes + other noisy nodes)

 ====  admin changelist + faked context, before patch:
 {{{
 Running timeit for
 admin_changelist
 100000 loops -> 158.4 secs
 }}}

 ==== same changelist template + context, after patch:
 {{{
 Running timeit for
 admin_changelist
 100000 loops -> 160.9 secs
 }}}

 As I mentioned earlier, it fluctuates enough to not really be that useful,
 see?

 == using cProfile

 ====  admin changelist + faked context, before patch:
 {{{
 Running cProfile for
 admin_changelist
 493564702 function calls (457763995 primitive calls) in 307.297 seconds

 Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 ...
 10200000    1.763    0.000    1.763    0.000 base.py:981(render)
 10200000    3.851    0.000    5.614    0.000 base.py:984(render_annotated)
 ...
 }}}

 ==== same changelist template + context, after patch:
 {{{
 Running cProfile for
 admin_changelist
 483364697 function calls (447563990 primitive calls) in 300.721 seconds

 Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 ...
 10200000    1.729    0.000    1.729    0.000 base.py:984(render_annotated)
 ...
 }}}

 = The patch itself

 It's dirt simple, which seems right for the amount of time it saves (ie:
 not much really) but not for the amount of energy I've burnt off my laptop
 profiling the above + running the test suite...

 {{{
 class TextNode(Node):
     ...

     def render_annotated(self, context):
         return self.s
 }}}

-- 
Ticket URL: <https://code.djangoproject.com/ticket/32814>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/052.8e35f42ffd6eb91d957994e7ecb989b5%40djangoproject.com.

Reply via email to