#35252: Optimize django.urls.resolvers._route_to_regex()
-------------------------------------+-------------------------------------
               Reporter:  Adam       |          Owner:  Adam Johnson
  Johnson                            |
                   Type:             |         Status:  assigned
  Cleanup/optimization               |
              Component:  Core       |        Version:  dev
  (URLs)                             |
               Severity:  Normal     |       Keywords:
           Triage Stage:             |      Has patch:  1
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 `_route_to_regex()` converts Django’s parameterized route syntax into a
 regular expression. Whilst working on #35250, I noticed several
 opportunities for optimizing this function:

 1. It has O(n^2) runtime from slicing the route string per parameter,
 repeatedly copying the remainder. A single search for all parameters would
 have O(n) runtime instead.
 2. Within my 950 URL project, there are many repeat calls with the same
 route, such as the empty string or ModelAdmin URL suffixes like
 '<path:object_id>/history/'. I think this would be typical of many
 projects, so it makes sense to add caching.
 3. `match.start()` and `match.end()` are called separately, when
 `match.span()` gives both values in one function call.
 4. `get_converter()` is unnecessary: the function can fetch all converters
 once with `get_converters()` and use the dictionary directly.
 5. Casting each parameter string to a set for the whitespace check is a
 bit costly. It’s faster to keep a single set of whitespace characters and
 check the string against it.
 6. An f-string can be used for concatenation, which is a little bit
 faster.
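
 The points above can be sketched roughly as follows. This is a minimal,
 self-contained illustration and not the patch itself: the name
 `route_to_regex_sketch`, the `CONVERTER_REGEXES` dictionary (a stand-in for
 the mapping returned by `get_converters()`), the parameter pattern, and the
 error messages are all assumptions. Unlike the real function, it returns
 only the regex string.

```python
import functools
import re
import string

# Hypothetical stand-in for the dictionary returned by get_converters().
# Plain regex fragments are used here to keep the sketch self-contained.
CONVERTER_REGEXES = {
    "int": "[0-9]+",
    "str": "[^/]+",
    "slug": "[-a-zA-Z0-9_]+",
    "path": ".+",
}

# Assumed pattern: matches "<name>" or "<converter:name>".
PARAMETER_RE = re.compile(r"<(?:(?P<converter>[^>:]+):)?(?P<parameter>[^>]+)>")

# Built once, instead of casting every parameter string to a set.
WHITESPACE = frozenset(string.whitespace)


@functools.lru_cache(maxsize=None)  # repeat calls with the same route hit the cache
def route_to_regex_sketch(route, is_endpoint):
    parts = ["^"]
    previous_end = 0
    # One pass over the route: finditer() never re-slices the remainder.
    for match in PARAMETER_RE.finditer(route):
        start, end = match.span()  # both offsets in one call
        if not WHITESPACE.isdisjoint(match[0]):
            raise ValueError(f"Route {route!r} has whitespace in a parameter.")
        parts.append(re.escape(route[previous_end:start]))
        previous_end = end
        parameter = match["parameter"]
        if not parameter.isidentifier():
            raise ValueError(f"Invalid parameter name {parameter!r}.")
        converter_name = match["converter"] or "str"
        try:
            converter_regex = CONVERTER_REGEXES[converter_name]  # direct lookup
        except KeyError:
            raise ValueError(f"Unknown converter {converter_name!r}.") from None
        parts.append(f"(?P<{parameter}>{converter_regex})")
    parts.append(re.escape(route[previous_end:]))
    if is_endpoint:
        parts.append(r"\Z")
    return "".join(parts)


print(route_to_regex_sketch("<int:pk>/history/", True))
# ^(?P<pk>[0-9]+)/history/\Z
```

 Collecting the pieces in a list and joining once keeps the work linear in
 the length of the route, since each character of the route is copied a
 bounded number of times.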

 Applying these optimizations makes the function significantly faster,
 especially for routes with more parameters. Below are some benchmarks.

 Before optimization (Python 3.12, macOS, M1 Mac, Django main branch):

 * Converting a seven parameter route:

   {{{
   In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
   12.3 µs ± 68 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
   }}}

 * Profiling a project’s system checks: 950 calls take 9ms, ~1.5% of the
 total runtime.

 After optimization:

 * Converting that seven parameter route is ~2x faster (with caching
 removed):

   {{{
   In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
   5.49 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
   }}}

 * Adding caching makes repeat calls ~100x faster, from 5µs to 50ns.

 * The profile shows those 950 calls now take 5ms, ~50% faster, ~0.8% of
 total runtime.
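
 To estimate how much a cache would help in a given project, one option is
 to count hits and misses over the routes it registers. The snippet below is
 an illustrative sketch only: `convert` is a hypothetical placeholder for the
 real conversion, the route list is made up, and `functools.lru_cache` is an
 assumed choice of caching mechanism.

```python
import functools


@functools.lru_cache(maxsize=None)
def convert(route):
    # Placeholder for the real route-to-regex work.
    return "^" + route


# Made-up sample of routes with repeats, such as the empty string and a
# ModelAdmin-style suffix.
routes = [
    "",
    "<path:object_id>/history/",
    "",
    "<path:object_id>/history/",
    "add/",
]
for route in routes:
    convert(route)

info = convert.cache_info()
print(info.hits, info.misses)  # 2 3
```

 Each miss corresponds to a distinct route that had to be converted, and
 each hit to a repeat call that skipped the conversion.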
-- 
Ticket URL: <https://code.djangoproject.com/ticket/35252>