#31202: Bulk update suffers from poor performance with large numbers of models and columns
---------------------------------------------+------------------------------
     Reporter:  Tom Forbes                   |                    Owner:  Tom Forbes
         Type:  Cleanup/optimization         |                   Status:  assigned
    Component:  Database layer (models, ORM) |                  Version:  dev
     Severity:  Normal                       |               Resolution:
     Keywords:                               |             Triage Stage:  Accepted
    Has patch:  0                            |      Needs documentation:  0
  Needs tests:  0                            |  Patch needs improvement:  0
Easy pickings:  0                            |                    UI/UX:  0
---------------------------------------------+------------------------------
Comment (by Adam Sołtysik):

 Even though the thread specifically mentions "large numbers of columns",
 performance issues are noticeable even with something as simple as a
 `ManyToManyField`. Let's say I have 1 million users to add to a group.
 Django's `group.users.add(*user_ids)` takes 38 seconds, while the same SQL
 query built directly in Python takes 10 seconds. Similarly, to remove all
 the users, `group.users.remove(*user_ids)` takes 7.5 seconds, while a raw
 SQL query takes 2 seconds.

 This is a ~4x performance difference (not even considering how much of
 that time is spent in the DB), and it's not the biggest I've seen after
 rewriting some other queries. I'm struggling to imagine what could be
 taking Django so long to build those queries.

 In our project we also tried using the `django-bulk-load` library, as
 suggested above. It's certainly faster, but there are still some issues.
 First, the library still requires creating Django object instances, which
 is another known bottleneck (discussed e.g. in
 https://forum.djangoproject.com/t/how-to-avoid-the-overhead-of-model-instances-in-bulk-create/25538).
 Second, the `COPY FROM` approach actually turns out to be slower than a
 direct `INSERT INTO` in our Postgres database.

 Overall, our bare SQL queries ended up being 2x-3x faster than the same
 operations performed with `django-bulk-load`, which seems to be worth the
 slight increase in code length.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/31202#comment:16>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.