On Wed, 2018-03-28 at 09:42 +1100, Daniel Axtens wrote: > Stephen Finucane <step...@that.guru> writes: > > > On Fri, 2018-03-09 at 02:54 +1100, Daniel Axtens wrote: > > > Stephen Rothwell noticed (way back in September - sorry Stephen!) that > > > the following query is really slow on OzLabs: > > > > > > SELECT COUNT(*) AS "__count" FROM "patchwork_patch" > > > INNER JOIN "patchwork_submission" ON > > > ("patchwork_patch"."submission_ptr_id" = > > > "patchwork_submission"."id") > > > WHERE ("patchwork_submission"."project_id" = 14 AND > > > "patchwork_patch"."state_id" IN > > > (SELECT U0."id" AS Col1 FROM "patchwork_state" U0 > > > WHERE U0."action_required" = true > > > ORDER BY U0."ordering" ASC)); > > > > > > I think this is really slow because we have to join the patch and > > > submission table to get the project id, which we need to filter the > > > patches. > > > > > > Duplicate the project id in the patch table itself, which allows us to > > > avoid the JOIN. > > > > > > The new query reads as: > > > SELECT COUNT(*) AS "__count" FROM "patchwork_patch" > > > WHERE ("patchwork_patch"."patch_project_id" = 1 AND > > > "patchwork_patch"."state_id" IN > > > (SELECT U0."id" AS Col1 FROM "patchwork_state" U0 > > > WHERE U0."action_required" = true > > > ORDER BY U0."ordering" ASC)); > > > > > > Very simple testing on a small, artifical Postgres instance (3 > > > projects, 102711 patches), shows speed gains of ~1.5-5x for this > > > query. Looking at Postgres' cost estimates (EXPLAIN) of the first > > > query vs the second query, we see a ~1.75x improvement there too. > > > > > > I suspect the gains will be bigger on OzLabs. > > > > > > (It turns out all of this is all for the "| NN patches" counter we > > > added to the filter bar!!) > > > > > > Reported-by: Stephen Rothwell <s...@canb.auug.org.au> > > > Signed-off-by: Daniel Axtens <d...@axtens.net> > > > > It's unfortunate that this has already merged. While it works, it > > defeats the whole point of the multi-table inheritance introduced in > > commit 86172ccc1 (normalization). To be honest, given the performance > > impacts that particular change introduced (which we're only seeing at > > scale), I'd rather denormalize the whole thing and fold the 'Patch' and > > 'CoverLetter' models back into 'Submission' and just use a 'type' field > > (or similar) to control behavior. Is there any reason not to do this? > > I agree that would be a better conceptual solution. The reason I didn't > do it is because I tried a couple of times and couldn't get it all to > work, and I haven't had the time to really sit down and re-engineer it. > > If you can get it to work I am happy to revert this and apply that; the > changes to the database schema aren't at all difficult to undo so any > brave souls running master won't be completely hosed.
OK, that's fair. Let me see how I fare with this. The biggest issues I see are the custom managers that we use for the Patch class. I'm hoping Veronika's tag infrastructure rework will help with this and I can build upon that. I'll review said RFC this week to see if that's the case. Stephen > Regards, > Daniel > > > If not, I'd like to wait on 2.1 until we've done both this and the > > event fixes. > > > > Stephen > > > > > --- > > > > > > This requires a migration, so I don't think we can feasibly do it as a > > > stable update. > > > > > > I think we drop the patch counter for stable and try to get this and > > > the event stuff merged to master promptly, and just tag 2.1. (To that > > > end, I will re-read and finish reviewing the event stuff soon.) > > > --- > > > patchwork/migrations/0024_patch_patch_project.py | 39 > > > ++++++++++++++++++++++++ > > > patchwork/models.py | 4 +++ > > > patchwork/parser.py | 1 + > > > patchwork/views/__init__.py | 2 +- > > > 4 files changed, 45 insertions(+), 1 deletion(-) > > > create mode 100644 patchwork/migrations/0024_patch_patch_project.py > > > > > > diff --git a/patchwork/migrations/0024_patch_patch_project.py > > > b/patchwork/migrations/0024_patch_patch_project.py > > > new file mode 100644 > > > index 000000000000..76d8f144c9dd > > > --- /dev/null > > > +++ b/patchwork/migrations/0024_patch_patch_project.py > > > @@ -0,0 +1,39 @@ > > > +# -*- coding: utf-8 -*- > > > +# Generated by Django 1.11.10 on 2018-03-08 01:51 > > > +from __future__ import unicode_literals > > > + > > > +from django.db import migrations, models > > > +import django.db.models.deletion > > > + > > > + > > > +class Migration(migrations.Migration): > > > + # per migration 16, but note this seems to be going away > > > + # in new PostgreSQLs > > > (https://stackoverflow.com/questions/12838111/south-cannot-alter-table-because-it-has-pending-trigger-events#comment44629663_12838113) > > > + atomic = False > > > + > > > + dependencies = [ > > > + ('patchwork', '0023_timezone_unify'), > > > + ] > > > + > > > + operations = [ > > > + migrations.AddField( > > > + model_name='patch', > > > + name='patch_project', > > > + field=models.ForeignKey(blank=True, null=True, > > > on_delete=django.db.models.deletion.CASCADE, to='patchwork.Project'), > > > + preserve_default=False, > > > + ), > > > + > > > + # as with 10, this will break if you use non-default table names > > > + migrations.RunSQL('''UPDATE patchwork_patch SET patch_project_id > > > = > > > + (SELECT project_id FROM > > > patchwork_submission > > > + WHERE patchwork_submission.id = > > > + > > > patchwork_patch.submission_ptr_id);''' > > > + ), > > > + > > > + migrations.AlterField( > > > + model_name='patch', > > > + name='patch_project', > > > + > > > field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, > > > to='patchwork.Project'), > > > + ), > > > + > > > + ] > > > diff --git a/patchwork/models.py b/patchwork/models.py > > > index b2491752f04a..3b905c4cd75b 100644 > > > --- a/patchwork/models.py > > > +++ b/patchwork/models.py > > > @@ -423,6 +423,10 @@ class Patch(SeriesMixin, Submission): > > > archived = models.BooleanField(default=False) > > > hash = HashField(null=True, blank=True) > > > > > > + # duplicate project from submission in subclass so we can count the > > > + # patches in a project without needing to do a JOIN. > > > + patch_project = models.ForeignKey(Project, on_delete=models.CASCADE) > > > + > > > objects = PatchManager() > > > > > > @staticmethod > > > diff --git a/patchwork/parser.py b/patchwork/parser.py > > > index 803e98592fa8..805037c72d73 100644 > > > --- a/patchwork/parser.py > > > +++ b/patchwork/parser.py > > > @@ -1004,6 +1004,7 @@ def parse_mail(mail, list_id=None): > > > patch = Patch.objects.create( > > > msgid=msgid, > > > project=project, > > > + patch_project=project, > > > name=name[:255], > > > date=date, > > > headers=headers, > > > diff --git a/patchwork/views/__init__.py b/patchwork/views/__init__.py > > > index 3baf2999a836..f8d23a388ac7 100644 > > > --- a/patchwork/views/__init__.py > > > +++ b/patchwork/views/__init__.py > > > @@ -270,7 +270,7 @@ def generic_list(request, project, view, > > > view_args=None, filter_settings=None, > > > context['filters'].set_status(filterclass, setting) > > > > > > if patches is None: > > > - patches = Patch.objects.filter(project=project) > > > + patches = Patch.objects.filter(patch_project=project) > > > > > > # annotate with tag counts > > > patches = patches.with_tag_counts(project) _______________________________________________ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork