#35252: Optimize django.urls.resolvers._route_to_regex()
-----------------------------------------------+-------------------------------------
               Reporter:  Adam Johnson         |          Owner:  Adam Johnson
                   Type:  Cleanup/optimization |         Status:  assigned
              Component:  Core (URLs)          |        Version:  dev
               Severity:  Normal               |       Keywords:
           Triage Stage:  Unreviewed           |      Has patch:  1
    Needs documentation:  0                    |    Needs tests:  0
Patch needs improvement:  0                    |  Easy pickings:  0
                  UI/UX:  0                    |
-----------------------------------------------+-------------------------------------

 `_route_to_regex()` converts Django’s parameterized route syntax into a
 regular expression. Whilst working on #35250, I noticed several
 opportunities for optimizing this function:

 1. It has O(n^2) runtime from slicing the route string per parameter,
 repeatedly copying the remainder. A single search for all parameters would
 have O(n) runtime instead.
 2. Within my 950-URL project, there are many repeat calls with the same
 route, such as the empty string or ModelAdmin URL suffixes like
 '<path:object_id>/history/'. I think this would be typical of many
 projects, so it makes sense to add caching.
 3. `match.start()` and `match.end()` are called separately, when
 `match.span()` gives both values in one function call.
 4. `get_converter()` is unnecessary: the function can fetch all converters
 once with `get_converters()` and use the dictionary directly.
 5. Casting each parameter string to a set for the whitespace check is a
 bit costly; it’s faster to scan the string against a whitespace set.
 6. An f-string can be used for concatenation, which is a little bit
 faster.

 Applying these optimizations makes the function significantly faster,
 especially for routes with more parameters. Below are some benchmarks.

 Before optimization (Python 3.12, macOS, M1 Mac, Django main branch):

 * Converting a seven-parameter route:
 {{{
 In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
 12.3 µs ± 68 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
 }}}
 * Profiling a project’s system checks: 950 calls take 9ms, ~1.5% of the
 total runtime.

 After optimization:

 * Converting that seven-parameter route is ~2x faster (with caching
 removed):
 {{{
 In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
 5.49 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
 }}}
 * Adding caching makes repeat calls ~100x faster, from 5µs to 50ns.
 * The profile shows those 950 calls now take 5ms, ~50% faster, ~0.8% of
 total runtime.

--
Ticket URL: <https://code.djangoproject.com/ticket/35252>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
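
The six points listed in the ticket can be illustrated together in one small, standalone sketch. This is not the patch attached to the ticket and does not import Django: `route_to_regex_sketch`, `CONVERTER_REGEXES`, and the parameter pattern are hypothetical stand-ins chosen for illustration, and the real function performs additional validation. The sketch only shows the general shape: one `finditer()` pass over the route, `match.span()`, a direct dictionary lookup, a precomputed whitespace set, an f-string, and caching.

```python
import functools
import re
import string

# Hypothetical stand-in for the dictionary returned by get_converters().
# It maps converter names straight to regex fragments to keep the sketch small.
CONVERTER_REGEXES = {
    "int": "[0-9]+",
    "str": "[^/]+",
    "path": ".+",
}

# Assumed parameter syntax: <name> or <converter:name>.
_PARAMETER_RE = re.compile(r"<(?:(?P<converter>[^>:]+):)?(?P<parameter>[^>]+)>")

# Built once, so each parameter is scanned against it without creating a new set.
_WHITESPACE = frozenset(string.whitespace)


@functools.lru_cache(maxsize=None)  # repeat calls with the same route hit the cache
def route_to_regex_sketch(route, is_endpoint=False):
    """Return (regex pattern, tuple of parameter names) for a route string."""
    parts = ["^"]
    names = []
    previous_end = 0
    # One pass over the whole route instead of re-slicing the remainder.
    for match in _PARAMETER_RE.finditer(route):
        start, end = match.span()  # both offsets from a single call
        if not _WHITESPACE.isdisjoint(match[0]):
            raise ValueError(f"Route {route!r} has whitespace in a parameter.")
        # Literal text between the previous parameter and this one.
        parts.append(re.escape(route[previous_end:start]))
        previous_end = end
        name = match["parameter"]
        if not name.isidentifier():
            raise ValueError(f"Invalid parameter name {name!r} in {route!r}.")
        converter = match["converter"] or "str"
        try:
            regex = CONVERTER_REGEXES[converter]  # direct dictionary lookup
        except KeyError:
            raise ValueError(f"Unknown converter {converter!r} in {route!r}.")
        names.append(name)
        parts.append(f"(?P<{name}>{regex})")  # f-string instead of concatenation
    # Literal text after the last parameter.
    parts.append(re.escape(route[previous_end:]))
    if is_endpoint:
        parts.append(r"\Z")
    return "".join(parts), tuple(names)
```

Calling `route_to_regex_sketch("<int:year>/<slug>/", True)` twice performs the conversion once; the second call is served from the cache, which can be confirmed with `route_to_regex_sketch.cache_info()`. Returning a tuple rather than a mutable container is a deliberate choice here, since cached results are shared between callers.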