Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. I ended up with the following query: @classmethod def _find_contacts_fetch_window(cls, contact_cls, win): """Special data-fetching query for contacts and all their related info including tags, partner, client,... NOTE: We build the FROM part entirely by hand, because SA generates bad SQL for postgres. It does a FULL SCAN of client and personal_client / corporate_client even though it reads at most window_size rows from them. All this because SA inheritance creates a subselect which leads to the full scan. """ # win .. list of ids # contact_cls .. PersonalContact / CorporateContact client_cls = contact_cls.client_cls() # PersonalClient / CorporateClient data_cls = client_cls.data_cls() # PersonalData / CorporateData # We need TABLEs to build the FROM part by hand. # We reference PersonalData/CorporateData and Address twice, hence we # need to alias them. # We also need their aliased ORM classes for contains_eager() to work. contact_table = Contact.__table__ contact_subtable = contact_cls.__table__ client_table = Client.__table__ personal_client_table = client_cls.__table__ partner_table = Partner.__table__ partner_data_table = PersonalData.__table__.alias(name='partner_data') partner_address_table = Address.__table__.alias(name='partner_address') client_data_table = data_cls.__table__.alias(name='client_data') client_address_table = Address.__table__.alias(name='client_address') partner_data = aliased(PersonalData, partner_data_table) partner_address = aliased(Address, partner_address_table) client_data = aliased(data_cls, client_data_table) client_address = aliased(Address, client_address_table) select_from = contact_table.join( contact_subtable, contact_table.c.id == contact_subtable.c.id ).outerjoin( client_table, contact_subtable.c.client_id == client_table.c.id ).outerjoin( personal_client_table, client_table.c.id == personal_client_table.c.id ).outerjoin( client_data_table, personal_client_table.c.data_id == client_data_table.c.id ).outerjoin( client_address_table, client_data_table.c.address_id == client_address_table.c.id ) options = [ subqueryload_all(contact_cls.contact_tags, ContactTag.tag), subqueryload(contact_cls.phones), subqueryload(contact_cls.emails), contains_eager(contact_cls.client), contains_eager(contact_cls.client, client_cls.data, alias=client_data), contains_eager(contact_cls.client, client_cls.data, client_data.address, alias=client_address), ] if contact_cls is PersonalContact: select_from = select_from.outerjoin( partner_table, contact_subtable.c.partner_id == partner_table.c.id ).outerjoin( partner_data_table, partner_table.c.personal_data_id == partner_data_table.c.id ).outerjoin( partner_address_table, partner_data_table.c.address_id == partner_address_table.c.id ) options.extend([ contains_eager(contact_cls.partner), contains_eager(contact_cls.partner, Partner.personal_data, alias=partner_data), contains_eager(contact_cls.partner, Partner.personal_data, partner_data.address, alias=partner_address), ]) q = session.query(contact_cls).select_from(select_from) q = q.filter(contact_cls.id.in_(win)) q = q.options(*options) return q It works great though it took me quite a while to get right, especially the aliasing between tables and classes. Also, I wouldn't figure it out without your help (I mean the part with select_from). Thank you again, Ladislav Lenart On 13.6.2013 18:44, Michael Bayer wrote: > > On Jun 13, 2013, at 11:03 AM, Ladislav Lenart wrote: > >> Unfortunately migrating to SA 0.9 is not an option for me at the moment due >> to >> severe time constraints. > > I was pretty sure you'd say that, though I'm really looking to verify that my > fixes are going to hold up under real world usage. The issues you're having > are real issues, and they've been fixed. > >> >> Could you please help me write SA query for 0.7.9 that uses index scan and >> also >> loads all the necessary relations? It must be possible with a proper use of >> from_statement(), contains_eager() and/or other SA features. It is just that >> toy >> examples in the documentation don't help me much with this complex beast. > > you use the SQL expression language in conjunction with .join()/outerjoin(), > pass to query.select_from(), then use
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. I ended up with the following query: @classmethod def _find_contacts_fetch_window(cls, contact_cls, win): """Special data-fetching query for contacts and all their related info including tags, partner, client,... NOTE: We build the FROM part entirely by hand, because SA generates bad SQL for postgres. It does a FULL SCAN of client and personal_client / corporate_client even though it reads at most window_size rows from them. All this because SA inheritance creates a subselect which leads to the full scan. """ # win .. list of ids # contact_cls .. PersonalContact / CorporateContact client_cls = contact_cls.client_cls() # PersonalClient / CorporateClient data_cls = client_cls.data_cls() # PersonalData / CorporateData # We need TABLEs to build the FROM part by hand. # We reference PersonalData/CorporateData and Address twice, hence we # need to alias them. # We also need their aliased ORM classes for contains_eager() to work. contact_table = Contact.__table__ contact_subtable = contact_cls.__table__ client_table = Client.__table__ personal_client_table = client_cls.__table__ partner_table = Partner.__table__ partner_data_table = PersonalData.__table__.alias(name='partner_data') partner_address_table = Address.__table__.alias(name='partner_address') client_data_table = data_cls.__table__.alias(name='client_data') client_address_table = Address.__table__.alias(name='client_address') partner_data = aliased(PersonalData, partner_data_table) partner_address = aliased(Address, partner_address_table) client_data = aliased(data_cls, client_data_table) client_address = aliased(Address, client_address_table) select_from = contact_table.join( contact_subtable, contact_table.c.id == contact_subtable.c.id ).outerjoin( client_table, contact_subtable.c.client_id == client_table.c.id ).outerjoin( personal_client_table, client_table.c.id == personal_client_table.c.id ).outerjoin( client_data_table, personal_client_table.c.data_id == client_data_table.c.id ).outerjoin( client_address_table, client_data_table.c.address_id == client_address_table.c.id ) options = [ subqueryload_all(contact_cls.contact_tags, ContactTag.tag), subqueryload(contact_cls.phones), subqueryload(contact_cls.emails), contains_eager(contact_cls.client), contains_eager(contact_cls.client, client_cls.data, alias=client_data), contains_eager(contact_cls.client, client_cls.data, client_data.address, alias=client_address), ] if contact_cls is PersonalContact: select_from = select_from.outerjoin( partner_table, contact_subtable.c.partner_id == partner_table.c.id ).outerjoin( partner_data_table, partner_table.c.personal_data_id == partner_data_table.c.id ).outerjoin( partner_address_table, partner_data_table.c.address_id == partner_address_table.c.id ) options.extend([ contains_eager(contact_cls.partner), contains_eager(contact_cls.partner, Partner.personal_data, alias=partner_data), contains_eager(contact_cls.partner, Partner.personal_data, partner_data.address, alias=partner_address), ]) q = session.query(contact_cls).select_from(select_from) q = q.filter(contact_cls.id.in_(win)) q = q.options(*options) return q It works great though it took me quite a while to get right, especially the aliasing between tables and classes. Also, I wouldn't figure it out without your help (I mean the part with select_from). Thank you again, Ladislav Lenart On 13.6.2013 18:44, Michael Bayer wrote: > > On Jun 13, 2013, at 11:03 AM, Ladislav Lenart wrote: > >> Unfortunately migrating to SA 0.9 is not an option for me at the moment due >> to >> severe time constraints. > > I was pretty sure you'd say that, though I'm really looking to verify that my > fixes are going to hold up under real world usage. The issues you're having > are real issues, and they've been fixed. > >> >> Could you please help me write SA query for 0.7.9 that uses index scan and >> also >> loads all the necessary relations? It must be possible with a proper use of >> from_statement(), contains_eager() and/or other SA features. It is just that >> toy >> examples in the documentation don't help me much with this complex beast. > > you use the SQL expression language in conjunction with .join()/outerjoin(), > pass to query.select_from(), then use
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. Here is the fully self-contained regression of the issue, including the workaround for SA 0.7.9. Thank you again, because I wouldn't figure it out without your help (the select_from part). I haven't tried it on SA 0.9. If you have any questions, please ask. HTH, Ladislav Lenart On 14.6.2013 11:18, Ladislav Lenart wrote: > Hello. > > > On 13.6.2013 18:44, Michael Bayer wrote: >> >> On Jun 13, 2013, at 11:03 AM, Ladislav Lenart wrote: >> >>> Unfortunately migrating to SA 0.9 is not an option for me at the moment due >>> to >>> severe time constraints. >> >> I was pretty sure you'd say that, though I'm really looking to verify that >> my fixes are going to hold up under real world usage. The issues you're >> having are real issues, and they've been fixed. > > I will make a 1:1 self-containted test case for this issue during the next > week, > I promise! > > >>> Could you please help me write SA query for 0.7.9 that uses index scan and >>> also >>> loads all the necessary relations? It must be possible with a proper use of >>> from_statement(), contains_eager() and/or other SA features. It is just >>> that toy >>> examples in the documentation don't help me much with this complex beast. >> >> you use the SQL expression language in conjunction with .join()/outerjoin(), >> pass to query.select_from(), then use contains_eager(): >> >> j = >> Foo.__table__.outerjoin(Bar.__table__.outerjoin(BarA.__table__).outerjoin(Data.__table__)) >> >> q = s.query(Foo).\ >> select_from(j).\ >> filter(Foo.id.in_([1, 2, 3])).\ >> options( >> contains_eager(Foo.bar), >> contains_eager(Foo.bar.of_type(BarA), BarA.data) >> ) > > Thank you very much! I am going to try it :-) > > > Ladislav Lenart -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out. # coding=utf-8 """ Standalone regression of the full table-scan problem in combination with joined table inheritance. Applies to SA 0.7.9 at least. See: Issue, Workaround, main_issue, main_workaround and main_fill. The rest is support code and/or ORM definitions. """ from sqlalchemy import Column, ForeignKey, UniqueConstraint, CheckConstraint,\ Integer, Unicode, String, Date from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.engine import create_engine from sqlalchemy.sql.expression import distinct from sqlalchemy.orm.session import sessionmaker from sqlalchemy.orm import relationship, subqueryload_all, subqueryload,\ joinedload_all, contains_eager from sqlalchemy.orm.util import aliased Base = declarative_base() session = None class Contact(Base): __tablename__ = 'Contact' id = Column(Integer(), primary_key=True) type = Column(Integer(), nullable=False) # 1 - personal, 2 - corporate parent_id = Column(Integer(), ForeignKey('Partner.id', ondelete='CASCADE'), nullable=False) parent = relationship('Partner', back_populates='contacts', primaryjoin='Contact.parent_id == Partner.id', remote_side='Partner.id', uselist=False) contact_tags = relationship('ContactTag', back_populates='contact', primaryjoin='Contact.id == ContactTag.contact_id', cascade='all, delete-orphan', passive_updates=True, passive_deletes=True) phones = relationship('Phone', back_populates='contact', primaryjoin='Contact.id == Phone.contact_id', cascade='all, delete-orphan', passive_updates=True, passive_deletes=True) emails = relationship('Email', back_populates='contact', primaryjoin='Contact.id == Email.contact_id', cascade='all, delete-orphan', passive_updates=True, passive_deletes=True) __mapper_args__ = {'polymorphic_on': type} __table_args__ = (CheckConstraint(u'type in (1, 2)'),) class ContactTag(Base): __tablename__ = 'ContactTag' id = Column(Integer(), primary_key=True) tag_id = Column(Integer(), ForeignKey('Tag.id', ondelete='CASCADE'), nullable=False) tag = relationship('Tag', primaryjoin='ContactTag.tag_id == Tag.id', remote_side='Tag.id', uselist=False) contact_id = Column(Integer(), ForeignKey('Contact.id', ondelete='CASCADE'), nullable=False) contact = relationship('Contact', back_populates='contact_tags', primaryjoin='ContactTag.contact_id == Contact.id', remote_side='Contact.id', uselist=False) __table_args__ = (UniqueConstraint('contact_id', 'tag_id'),) class Tag(Base): __tablename__ = 'Tag' id = Column(Integer(), primary_key=True) partner_id = Column(Integer(), ForeignKey('Partner.id', ondelete='CASCADE'), nullable=False, index=True) partner = relationship('Partner', back_populates='tags', primaryjoin='Tag.partner_i
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. On 13.6.2013 18:44, Michael Bayer wrote: > > On Jun 13, 2013, at 11:03 AM, Ladislav Lenart wrote: > >> Unfortunately migrating to SA 0.9 is not an option for me at the moment due >> to >> severe time constraints. > > I was pretty sure you'd say that, though I'm really looking to verify that my > fixes are going to hold up under real world usage. The issues you're having > are real issues, and they've been fixed. I will make a 1:1 self-containted test case for this issue during the next week, I promise! >> Could you please help me write SA query for 0.7.9 that uses index scan and >> also >> loads all the necessary relations? It must be possible with a proper use of >> from_statement(), contains_eager() and/or other SA features. It is just that >> toy >> examples in the documentation don't help me much with this complex beast. > > you use the SQL expression language in conjunction with .join()/outerjoin(), > pass to query.select_from(), then use contains_eager(): > > j = > Foo.__table__.outerjoin(Bar.__table__.outerjoin(BarA.__table__).outerjoin(Data.__table__)) > > q = s.query(Foo).\ > select_from(j).\ > filter(Foo.id.in_([1, 2, 3])).\ > options( > contains_eager(Foo.bar), > contains_eager(Foo.bar.of_type(BarA), BarA.data) > ) Thank you very much! I am going to try it :-) Ladislav Lenart -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
On Jun 13, 2013, at 11:03 AM, Ladislav Lenart wrote: > Unfortunately migrating to SA 0.9 is not an option for me at the moment due to > severe time constraints. I was pretty sure you'd say that, though I'm really looking to verify that my fixes are going to hold up under real world usage. The issues you're having are real issues, and they've been fixed. > > Could you please help me write SA query for 0.7.9 that uses index scan and > also > loads all the necessary relations? It must be possible with a proper use of > from_statement(), contains_eager() and/or other SA features. It is just that > toy > examples in the documentation don't help me much with this complex beast. you use the SQL expression language in conjunction with .join()/outerjoin(), pass to query.select_from(), then use contains_eager(): j = Foo.__table__.outerjoin(Bar.__table__.outerjoin(BarA.__table__).outerjoin(Data.__table__)) q = s.query(Foo).\ select_from(j).\ filter(Foo.id.in_([1, 2, 3])).\ options( contains_eager(Foo.bar), contains_eager(Foo.bar.of_type(BarA), BarA.data) ) -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. Just a minor correction: The new query can return whatever it pleases as long as it *also* returns Foo instances with properly populated relations. The rest of the e-mail is the same. Please help me, Ladislav Lenart On 13.6.2013 17:03, Ladislav Lenart wrote: > Unfortunately migrating to SA 0.9 is not an option for me at the moment due to > severe time constraints. > > Could you please help me write SA query for 0.7.9 that uses index scan and > also > loads all the necessary relations? It must be possible with a proper use of > from_statement(), contains_eager() and/or other SA features. It is just that > toy > examples in the documentation don't help me much with this complex beast. > > Here is a simplified version of my problem: > > Foo > bar (can be NULL) > > Bar > > BarA (subclass of Bar) > data -> Data (cannot be NULL) > > Data > value (string) > > I need to rewrite this query: > > q = session.query(Foo).filter(Foo.id.in_(...)) > q = q.options( > joinedload_all(Foo.bar, BarA.data) > ) > > in such a way that it does NOT perform any full scan and also populates > Foo.bar.data.value of each returned Foo. The new query can return whatever it > pleases as long as it returns Foo instances with properly populated relations. > > Please help me write it (or tell me that it is not possible in SA 0.7.9, > though > I highly doubt that). > > > Thank you, > > Ladislav Lenart > > > On 13.6.2013 15:51, Michael Bayer wrote: >> Please try out 0.9 from the git master which fixes the issue of the nested >> SELECT on the right side of a join. >> >> Sent from my iPhone >> >> On Jun 13, 2013, at 9:18 AM, Ladislav Lenart wrote: >> >>> Hello. >>> >>> I have a query that does a full scan of an inherited table with more than >>> million rows even though I need only 100 of them (on postgres 9.1). This is >>> a >>> real bummer! Please help me rewrite the SA query or instruct postgres to >>> not do >>> this stupidity. >>> >>> >>> I have the following setup (only the interesting relations): >>> >>>Contact >>>contact_tags -> ContactTag (collection) >>>phones -> Phone (collection) >>>emails -> Email (collection) >>> >>>ContactTag >>>tag -> Tag (cannot be NULL) >>> >>>PersonalContact (Contact subclass) >>>partner -> Partner (can be NULL) >>>client -> PersonalClient (can be NULL) >>> >>>CorporateContact (Contact subclass) >>>client -> CorporateClient (can be NULL) >>> >>>Client >>> >>>PersonalClient (Client subclass) >>>data -> PersonalData (cannot be NULL) >>> >>>CorporateClient (Client subclass) >>>data -> CorporateData (cannot be NULL) >>> >>> >>> I have the following query that loads data of one window: >>> >>># window (input argument) is a list of id values. >>>q = session.query(PersonalContact).filter(PersonalContact.id.in_(window) >>>q = q.options( >>>subqueryload_all(PersonalContact.contact_tags, ContactTag.tag), >>>subqueryload(PersonalContact.phones), >>>subqueryload(PersonalContact.emails), >>>joinedload_all( >>>PersonalContact.partner, >>>Partner.personal_data, >>>PersonalData.address, >>> ), >>>joinedload_all( >>>PersonalContact.client, >>>PersonalClient.data, >>>PersonalData.address >>>), >>>) >>> >>> >>> (Note that I have similar query for CorporateContact. Infact, I generate >>> them >>> both in the same method.) >>> >>> It produces SQL like this (the problematic part is emphasized): >>> >>>SELECT * >>>FROM >>>contact >>>JOIN personal_contact ON contact.id = personal_contact.id >>>-- *** >>>LEFT OUTER JOIN ( >>>SELECT * >>>FROM >>>client >>>JOIN personal_client ON client.id = personal_client.id >>>) AS anon_1 ON personal_contact.client_id = anon_1.personal_client_id >>>-- *** >>>LEFT OUTER JOIN partner AS partner_1 >>>ON personal_contact.partner_id = partner_1.id >>>LEFT OUTER JOIN personal_data AS personal_data_1 >>>ON partner_1.personal_data_id = personal_data_1.id >>>LEFT OUTER JOIN address AS address_1 >>>ON personal_data_1.address_id = address_1.id >>>LEFT OUTER JOIN personal_data AS personal_data_2 >>>ON anon_1.personal_client_data_id = personal_data_2.id >>>LEFT OUTER JOIN address AS address_2 >>>ON personal_data_2.address_id = address_2.id >>>WHERE personal_contact.id IN (...) >>> >>> >>> The inner select directly corresponds to joinedload of >>> PersonalContact.client, a >>> PersonalClient instance (and a Client subclass). >>> >>> The postgres does a full scan of tables Client and PersonalClient even >>> though I >>> will need at mo
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Unfortunately migrating to SA 0.9 is not an option for me at the moment due to severe time constraints. Could you please help me write SA query for 0.7.9 that uses index scan and also loads all the necessary relations? It must be possible with a proper use of from_statement(), contains_eager() and/or other SA features. It is just that toy examples in the documentation don't help me much with this complex beast. Here is a simplified version of my problem: Foo bar (can be NULL) Bar BarA (subclass of Bar) data -> Data (cannot be NULL) Data value (string) I need to rewrite this query: q = session.query(Foo).filter(Foo.id.in_(...)) q = q.options( joinedload_all(Foo.bar, BarA.data) ) in such a way that it does NOT perform any full scan and also populates Foo.bar.data.value of each returned Foo. The new query can return whatever it pleases as long as it returns Foo instances with properly populated relations. Please help me write it (or tell me that it is not possible in SA 0.7.9, though I highly doubt that). Thank you, Ladislav Lenart On 13.6.2013 15:51, Michael Bayer wrote: > Please try out 0.9 from the git master which fixes the issue of the nested > SELECT on the right side of a join. > > Sent from my iPhone > > On Jun 13, 2013, at 9:18 AM, Ladislav Lenart wrote: > >> Hello. >> >> I have a query that does a full scan of an inherited table with more than >> million rows even though I need only 100 of them (on postgres 9.1). This is a >> real bummer! Please help me rewrite the SA query or instruct postgres to not >> do >> this stupidity. >> >> >> I have the following setup (only the interesting relations): >> >>Contact >>contact_tags -> ContactTag (collection) >>phones -> Phone (collection) >>emails -> Email (collection) >> >>ContactTag >>tag -> Tag (cannot be NULL) >> >>PersonalContact (Contact subclass) >>partner -> Partner (can be NULL) >>client -> PersonalClient (can be NULL) >> >>CorporateContact (Contact subclass) >>client -> CorporateClient (can be NULL) >> >>Client >> >>PersonalClient (Client subclass) >>data -> PersonalData (cannot be NULL) >> >>CorporateClient (Client subclass) >>data -> CorporateData (cannot be NULL) >> >> >> I have the following query that loads data of one window: >> >># window (input argument) is a list of id values. >>q = session.query(PersonalContact).filter(PersonalContact.id.in_(window) >>q = q.options( >>subqueryload_all(PersonalContact.contact_tags, ContactTag.tag), >>subqueryload(PersonalContact.phones), >>subqueryload(PersonalContact.emails), >>joinedload_all( >>PersonalContact.partner, >>Partner.personal_data, >>PersonalData.address, >> ), >>joinedload_all( >>PersonalContact.client, >>PersonalClient.data, >>PersonalData.address >>), >>) >> >> >> (Note that I have similar query for CorporateContact. Infact, I generate them >> both in the same method.) >> >> It produces SQL like this (the problematic part is emphasized): >> >>SELECT * >>FROM >>contact >>JOIN personal_contact ON contact.id = personal_contact.id >>-- *** >>LEFT OUTER JOIN ( >>SELECT * >>FROM >>client >>JOIN personal_client ON client.id = personal_client.id >>) AS anon_1 ON personal_contact.client_id = anon_1.personal_client_id >>-- *** >>LEFT OUTER JOIN partner AS partner_1 >>ON personal_contact.partner_id = partner_1.id >>LEFT OUTER JOIN personal_data AS personal_data_1 >>ON partner_1.personal_data_id = personal_data_1.id >>LEFT OUTER JOIN address AS address_1 >>ON personal_data_1.address_id = address_1.id >>LEFT OUTER JOIN personal_data AS personal_data_2 >>ON anon_1.personal_client_data_id = personal_data_2.id >>LEFT OUTER JOIN address AS address_2 >>ON personal_data_2.address_id = address_2.id >>WHERE personal_contact.id IN (...) >> >> >> The inner select directly corresponds to joinedload of >> PersonalContact.client, a >> PersonalClient instance (and a Client subclass). >> >> The postgres does a full scan of tables Client and PersonalClient even >> though I >> will need at most 100 rows from each one. >> >> However, if I rewrite the problematic part by hand like this: >> >>LEFT OUTER JOIN client >>ON personal_contact.client_id = client.id >> LEFT OUTER JOIN personal_client >>ON client.id = personal_client.id >> >> it works like a charm. >> >> Unfortunately I don't know how to write such a query in SA. I am really >> stuck so >> any help is much appreciated. >> >> >> Thank you, >> >> Ladislav Lenart >> >> >>
Re: [sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Please try out 0.9 from the git master which fixes the issue of the nested SELECT on the right side of a join. Sent from my iPhone On Jun 13, 2013, at 9:18 AM, Ladislav Lenart wrote: > Hello. > > I have a query that does a full scan of an inherited table with more than > million rows even though I need only 100 of them (on postgres 9.1). This is a > real bummer! Please help me rewrite the SA query or instruct postgres to not > do > this stupidity. > > > I have the following setup (only the interesting relations): > >Contact >contact_tags -> ContactTag (collection) >phones -> Phone (collection) >emails -> Email (collection) > >ContactTag >tag -> Tag (cannot be NULL) > >PersonalContact (Contact subclass) >partner -> Partner (can be NULL) >client -> PersonalClient (can be NULL) > >CorporateContact (Contact subclass) >client -> CorporateClient (can be NULL) > >Client > >PersonalClient (Client subclass) >data -> PersonalData (cannot be NULL) > >CorporateClient (Client subclass) >data -> CorporateData (cannot be NULL) > > > I have the following query that loads data of one window: > ># window (input argument) is a list of id values. >q = session.query(PersonalContact).filter(PersonalContact.id.in_(window) >q = q.options( >subqueryload_all(PersonalContact.contact_tags, ContactTag.tag), >subqueryload(PersonalContact.phones), >subqueryload(PersonalContact.emails), >joinedload_all( >PersonalContact.partner, >Partner.personal_data, >PersonalData.address, > ), >joinedload_all( >PersonalContact.client, >PersonalClient.data, >PersonalData.address >), >) > > > (Note that I have similar query for CorporateContact. Infact, I generate them > both in the same method.) > > It produces SQL like this (the problematic part is emphasized): > >SELECT * >FROM >contact >JOIN personal_contact ON contact.id = personal_contact.id >-- *** >LEFT OUTER JOIN ( >SELECT * >FROM >client >JOIN personal_client ON client.id = personal_client.id >) AS anon_1 ON personal_contact.client_id = anon_1.personal_client_id >-- *** >LEFT OUTER JOIN partner AS partner_1 >ON personal_contact.partner_id = partner_1.id >LEFT OUTER JOIN personal_data AS personal_data_1 >ON partner_1.personal_data_id = personal_data_1.id >LEFT OUTER JOIN address AS address_1 >ON personal_data_1.address_id = address_1.id >LEFT OUTER JOIN personal_data AS personal_data_2 >ON anon_1.personal_client_data_id = personal_data_2.id >LEFT OUTER JOIN address AS address_2 >ON personal_data_2.address_id = address_2.id >WHERE personal_contact.id IN (...) > > > The inner select directly corresponds to joinedload of > PersonalContact.client, a > PersonalClient instance (and a Client subclass). > > The postgres does a full scan of tables Client and PersonalClient even though > I > will need at most 100 rows from each one. > > However, if I rewrite the problematic part by hand like this: > >LEFT OUTER JOIN client >ON personal_contact.client_id = client.id > LEFT OUTER JOIN personal_client >ON client.id = personal_client.id > > it works like a charm. > > Unfortunately I don't know how to write such a query in SA. I am really stuck > so > any help is much appreciated. > > > Thank you, > > Ladislav Lenart > > > -- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To post to this group, send email to sqlalchemy@googlegroups.com. > Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. > For more options, visit https://groups.google.com/groups/opt_out. > > -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
[sqlalchemy] SA0.7.9 - poor query performance (full scan instead of index scan because of joined table inheritance)
Hello. I have a query that does a full scan of an inherited table with more than million rows even though I need only 100 of them (on postgres 9.1). This is a real bummer! Please help me rewrite the SA query or instruct postgres to not do this stupidity. I have the following setup (only the interesting relations): Contact contact_tags -> ContactTag (collection) phones -> Phone (collection) emails -> Email (collection) ContactTag tag -> Tag (cannot be NULL) PersonalContact (Contact subclass) partner -> Partner (can be NULL) client -> PersonalClient (can be NULL) CorporateContact (Contact subclass) client -> CorporateClient (can be NULL) Client PersonalClient (Client subclass) data -> PersonalData (cannot be NULL) CorporateClient (Client subclass) data -> CorporateData (cannot be NULL) I have the following query that loads data of one window: # window (input argument) is a list of id values. q = session.query(PersonalContact).filter(PersonalContact.id.in_(window) q = q.options( subqueryload_all(PersonalContact.contact_tags, ContactTag.tag), subqueryload(PersonalContact.phones), subqueryload(PersonalContact.emails), joinedload_all( PersonalContact.partner, Partner.personal_data, PersonalData.address, ), joinedload_all( PersonalContact.client, PersonalClient.data, PersonalData.address ), ) (Note that I have similar query for CorporateContact. Infact, I generate them both in the same method.) It produces SQL like this (the problematic part is emphasized): SELECT * FROM contact JOIN personal_contact ON contact.id = personal_contact.id -- *** LEFT OUTER JOIN ( SELECT * FROM client JOIN personal_client ON client.id = personal_client.id ) AS anon_1 ON personal_contact.client_id = anon_1.personal_client_id -- *** LEFT OUTER JOIN partner AS partner_1 ON personal_contact.partner_id = partner_1.id LEFT OUTER JOIN personal_data AS personal_data_1 ON partner_1.personal_data_id = personal_data_1.id LEFT OUTER JOIN address AS address_1 ON personal_data_1.address_id = address_1.id LEFT OUTER JOIN personal_data AS personal_data_2 ON anon_1.personal_client_data_id = personal_data_2.id LEFT OUTER JOIN address AS address_2 ON personal_data_2.address_id = address_2.id WHERE personal_contact.id IN (...) The inner select directly corresponds to joinedload of PersonalContact.client, a PersonalClient instance (and a Client subclass). The postgres does a full scan of tables Client and PersonalClient even though I will need at most 100 rows from each one. However, if I rewrite the problematic part by hand like this: LEFT OUTER JOIN client ON personal_contact.client_id = client.id LEFT OUTER JOIN personal_client ON client.id = personal_client.id it works like a charm. Unfortunately I don't know how to write such a query in SA. I am really stuck so any help is much appreciated. Thank you, Ladislav Lenart -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy?hl=en. For more options, visit https://groups.google.com/groups/opt_out.