#35250: Stop URL system checks from compiling regular expressions
-------------------------------------+-------------------------------------
     Reporter:  Adam Johnson         |                    Owner:  Adam Johnson
         Type:  Cleanup/optimization |                   Status:  assigned
    Component:  Core (System checks) |                  Version:  dev
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:  Unreviewed
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Description changed by Adam Johnson:

New description:

 Continuing my project to optimize the system checks, I found some good
 optimizations under `django.core.checks.urls.check_url_config()`, which
 showed up as quite expensive in profiling.

 Looking down the call tree, it seems the most expensive part of this
 process is compiling each URL pattern’s regular expression. This is
 unnecessary work, though, as the checks only need the *uncompiled* regular
 expression patterns. Using the compiled versions “undoes” the lazy-compile
 optimization that `LocaleRegexDescriptor` was created for in #27453 /
 6e222dae5636f875c19ec66f730a4241abe33faa, at least for any process that
 runs checks.

 The checks were fetching the uncompiled pattern with `self.regex.pattern`,
 which makes `LocaleRegexDescriptor` compile the pattern only to then read
 the uncompiled pattern from
 [https://docs.python.org/3.12/library/re.html#re.Pattern.pattern its
 pattern attribute].
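
 To illustrate the round trip described above, here is a minimal,
 self-contained sketch. It is not Django's `LocaleRegexDescriptor`; the names
 `LazyRegex`, `Pattern`, `raw_pattern` and `compile_count` are hypothetical
 and only mimic the general lazy-compile idea.

```python
import re


class LazyRegex:
    """Hypothetical lazy-compile descriptor (not Django's implementation)."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        instance.compile_count += 1
        compiled = re.compile(instance.raw_pattern)
        # Cache on the instance; a non-data descriptor is shadowed by the
        # instance __dict__, so later lookups skip this method entirely.
        instance.__dict__["regex"] = compiled
        return compiled


class Pattern:
    regex = LazyRegex()

    def __init__(self, raw_pattern):
        self.raw_pattern = raw_pattern
        self.compile_count = 0


def check_via_compiled(p):
    # Forces compilation just to read the string back from .pattern.
    return p.regex.pattern.endswith("$")


def check_via_raw(p):
    # Reads the uncompiled string directly; nothing is compiled.
    return p.raw_pattern.endswith("$")
```

 In this toy model, `check_via_raw()` leaves `compile_count` at zero, while
 `check_via_compiled()` triggers one compilation per pattern object.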

 Additionally, `RoutePattern` was calling `_route_to_regex()` twice to
 fetch its two result variables in different places: once in `__init__()`
 and again in `_compile()` (in the non-translated case). This function has
 non-trivial cost so avoiding double execution is worth it.
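
 The "call once, keep both results" idea can be sketched as follows. This is
 an assumption-laden toy, not Django's `RoutePattern`: `route_to_regex()`,
 `Route`, `CALLS` and the simplified `<converter:name>` parsing are all
 hypothetical stand-ins.

```python
import re

CALLS = {"count": 0}  # counts invocations of the toy conversion function

_PARAM = re.compile(r"<(?:(?P<conv>\w+):)?(?P<name>\w+)>")


def route_to_regex(route):
    """Toy conversion: returns (regex_string, {param_name: converter_name})."""
    CALLS["count"] += 1
    parts, converters, pos = ["^"], {}, 0
    for m in _PARAM.finditer(route):
        parts.append(re.escape(route[pos:m.start()]))
        name = m["name"]
        converters[name] = m["conv"] or "str"
        parts.append(f"(?P<{name}>[^/]+)")
        pos = m.end()
    parts.append(re.escape(route[pos:]))
    return "".join(parts), converters


class Route:
    def __init__(self, route):
        # Single call: store both results instead of recomputing later.
        self.regex_string, self.converters = route_to_regex(route)
        self._compiled = None

    def compile(self):
        # Reuses the stored regex string; no second conversion is needed.
        if self._compiled is None:
            self._compiled = re.compile(self.regex_string)
        return self._compiled
```

 With this shape, constructing a `Route` and later compiling it invokes the
 conversion function once per route rather than twice.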

 Before optimization stats:

 * `check_url_config()` took 67ms, or ~10% of the time for checks.
 * `LocaleRegexDescriptor.__get__()` showed 965 calls taking ~60ms, ~9% of
 the total runtime of checks.
 * `re.compile()` showed 741 calls for 94ms.
 * `_route_to_regex()` had 1900 calls taking 18ms (~2.6% of the total
 runtime).

 After optimization:

 * `check_url_config()` took 5ms, ~0.9% of the new total runtime.
 * The calls to `LocaleRegexDescriptor.__get__` are gone.
 * `re.compile()` drops to 212 calls from other sites, for a total of 51ms.
 * `_route_to_regex()` drops to the expected 950 calls, taking half the
 time at 9ms.

 (I also tried benchmarking with django-asv but got inconclusive results,
 where the change was within the error margins.)

-- 
Ticket URL: <https://code.djangoproject.com/ticket/35250#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
