Re: [sqlalchemy] Large number of polymorphic subclasses

mike bayer Mon, 01 May 2017 18:22:22 -0700

The selectinload feature revealed the need to reorganize most of the"loading_relationships" documentation to make adequate context for thenew feature and overall improve a lot of crufty organization. Thisreorg of the docs somewhat awkwardly needed to be decoupled from thefeature itself so that it was in a separate commit, and also able to bemerged up to the 1.1 documentation with the new organization but withoutthe actual feature.

The situation is similar with joined eager "in" load - the inheritancedocs are disorganized from years of updates without a full pass over thewhole thing to organize in terms of how inheritance config/loading worksnow. So I am working on reorganizing the inheritance documentationbefore I continue with writing the tests for the joined eager "in" loadfeature so that the docs for the new feature can be easily integratedinto the existing docs.

As always, everything I do is fully visible just by subscribing to thegerrits:


https://gerrit.sqlalchemy.org/#/c/392/

https://gerrit.sqlalchemy.org/#/c/359/



On 05/01/2017 06:12 PM, da...@benchling.com wrote:

Hey Mike,

I just saw the new selectinload feature, and am super excited about it!

As far as I can tell, this is only for relationships, so I wanted tocheck what the plans are for an inheritance loader using selectinload aswell, given that you initially mentioned them together and I can't findan issue in the tracker for it.


    I've created an issue for both a new relationship loader and an
    inheritance loader at the same time, since they will use very similar
    paths, at https://bitbucket.org/zzzeek/sqlalchemy/issues/3944
    
<https://www.google.com/url?q=https%3A%2F%2Fbitbucket.org%2Fzzzeek%2Fsqlalchemy%2Fissues%2F3944&sa=D&sntz=1&usg=AFQjCNHP34Q9-nl0O_qmczm7uZFo8daCAA>.


Thanks!
Damon

On Tuesday, April 4, 2017 at 8:24:07 PM UTC-7, Mike Bayer wrote:

    Reusing the original query is hard, and the case you have is due to
    mismatched entities that would have to be handled, probably by
    wrapping the original query in a subquery just like subquery eager
    loading does.  The subquery eager loading feature took a few years
    to work out an enormous number of issues with this approach.

    The new feature which I'm now working on, several hours a day, in
    response to this thread solves the whole problem in a much better
    way, using a simple IN expression against the primary keys of all
    states loaded in the query.   Betas of 1.2 will hopefully be
    available in a month or two.    The feature can be replicated in
    older versions by just grabbing the list of states coming out of
    __iter__ and using their primary keys in an IN clause.   Parallel
    development of this other approach as an interim throwaway is
    probably not worth it if it continues to become more complicated.

    On Apr 4, 2017 7:02 PM, <da...@benchling.com <javascript:>> wrote:

        Hey Mike,

        Thanks for those -- seems to have helped those cases, though
        we're running into some pretty weird behavior with joins. Here's
        a simplified case that shows one of the issues we're running
        into (use _loader_from_cls from above):


        class W(Base):
             __tablename__ = 'w'
             id = Column(Integer, primary_key=True)
             type = Column(String)

             x_id = Column(Integer, ForeignKey('x.id <http://x.id>'))
             x = relationship('X')

             __mapper_args__ = {'polymorphic_on': type}


        class W2(W):
             __tablename__ = 'w2'
             id = Column(Integer, ForeignKey('w.id <http://w.id>'),
        primary_key=True)
             w2 = Column(String)
             __mapper_args__ = {'polymorphic_identity': 'w2'}


        class X(Base):
             __tablename__ = 'x'
             id = Column(Integer, primary_key=True)


        @event.listens_for(W, "load", propagate=True)
        def load_extra(target, context):
              key = ('loader_by_cls', type(target))

              if key not in context.attributes:
                  context.attributes[key] = _loader_for_cls(target, context)


        e = create_engine("sqlite://", echo=True)
        Base.metadata.create_all(e)
        s = Session(e)

        s.add(W2(x=X()))
        s.commit()

        s.query(W).join(W.x).first()


        One of the statements emitted by this is

        SELECT w2.id <http://w2.id> AS w2_id, w.id <http://w.id> AS
        w_id, w.type AS w_type, w.x_id AS w_x_id, w2.w2 AS w2_w2
        FROM w2, w JOIN x ON x.id <http://x.id> = w.x_id
          LIMIT ? OFFSET ?

        Specifically, the "FROM w2, w JOIN x" is not what we want here
        -- we'd just want "FROM w2"

        Replacing

             q = orig_query.with_entities(target_cls)

        with

             q =
        context.session.query(target_cls).join(orig_query.subquery())

        fixes the issue, though this feels less than ideal. Subclassing
        Query also seems less than ideal. Do you have any ideas here?


        Thanks,
        Damon

        On Tuesday, April 4, 2017 at 6:42:32 AM UTC-7, Mike Bayer wrote:

            This repaste of the example contains two refinements to
            address each of
            these issues distinctly.  We want to avoid re-entrant
            invocation of
            _loader_for_cls, so putting a flag into attributes handles
            that.   Then
            there's the issue of same class coming in that we are
            already querying,
            we can look at the type being loaded in column_descriptions,
            and this
            also solves the re-entrant problem though I think the flag
            is more
            direct.  The column_descriptions check likely needs more
            refinements,
            but the general idea is that if the entity being loaded is
            already a B2,
            then you wouldn't do any extra B2 queries (unless you have
            multiple
            levels of inheritance / polymorphic loading in which case
            that probably
            needs adjustment).

            The "strong reference" comment, you can try commenting that
            out in the
            test and see what happens.  Because this loader trick loads
            all the
            subclasses up front upon seeing the first member of a
            particular
            subtype, it is loading for subsequent instances that aren't
            seen yet as
            well.  If we don't make a strong reference to them, they get
            lost and
            the extra attribute loading fails.

            The feature I have in development works a lot better because
            all the
            additional loads are done *after* all the entities are
            loaded, and it
            does it using an IN query that only includes those entities
            that
            definitely need the load.  I'm not sure if there are other
            negative side
            effects from our loading of the "subclass" entity in some
            cases ahead of
            where the primary query gets at the entity.    The recipe
            here could be
            made to do all the extra loads after the primary query but
            theres no
            "after query" hook, you'd have to subclass Query and
            override __iter__
            to add this step.

            from sqlalchemy import *
            from sqlalchemy.orm import *
            from sqlalchemy.ext.declarative import declarative_base
            from sqlalchemy import event

            Base = declarative_base()


            class A(Base):
                  __tablename__ = 'a'
                  id = Column(Integer, primary_key=True)
                  a1 = Column(String)
                  type = Column(String)

                  __mapper_args__ = {'polymorphic_on': type}


            class B1(A):
                  __tablename__ = 'b1'
                  id = Column(Integer, ForeignKey('a.id <http://a.id>'),
            primary_key=True)
                  b1 = Column(String)
                  b_data = Column(String)
                  __mapper_args__ = {'polymorphic_identity': 'b1'}


            class B2(A):
                  __tablename__ = 'b2'
                  id = Column(Integer, ForeignKey('a.id <http://a.id>'),
            primary_key=True)
                  b2 = Column(String)
                  b_data = Column(String)
                  cs = relationship("C", lazy='subquery')

                  __mapper_args__ = {'polymorphic_identity': 'b2'}


            class C(Base):
                  __tablename__ = 'c'
                  id = Column(Integer, primary_key=True)
                  b2_id = Column(ForeignKey('b2.id <http://b2.id>'))


            class B3(A):
                  __tablename__ = 'b3'
                  id = Column(Integer, ForeignKey('a.id <http://a.id>'),
            primary_key=True)
                  b3 = Column(String)
                  b_data = Column(String)
                  __mapper_args__ = {'polymorphic_identity': 'b3'}


            def _loader_for_cls(target, context):
                  orig_query = context.query

                  target_cls = type(target)

                  if orig_query.column_descriptions[0]['type'] is
            target_cls:
                      return None

                  # take the original query and chance the entity to the
            subclass
                  q = orig_query.with_entities(target_cls)

                  # defer everything that's not PK / polymorphic_on from
            A.  this whole
                  # bit is just to avoid all those extra columns
                  to_defer = []
                  mapper = inspect(target).mapper
                  inherited = mapper.inherits
                  while inherited is not None:
                      for attr in inherited.column_attrs:
                          if not attr.expression.primary_key and \
                                  attr.expression is not
            inherited.polymorphic_on:
                              to_defer.append(attr.key)
                      for attr in inherited.relationships:
                          to_defer.append(attr.key)
                      inherited = inherited.inherits
                  q = q.options(*[defer(k) for k in to_defer])

                  q._attributes["load_extra_called"] = True

                  # store this strong reference so recs don't get lost
            while
                  # iterating
                  return q.all()


            @event.listens_for(A, "load", propagate=True)
            def load_extra(target, context):
                  if "load_extra_called" in context.attributes:
                      return

                  key = ('loader_by_cls', type(target))

                  if key not in context.attributes:
                      context.attributes[key] = _loader_for_cls(target,
            context)


            e = create_engine("sqlite://", echo=True)
            Base.metadata.create_all(e)


            s = Session(e)
            s.add_all([
                  B1(b_data='b11', b1='b1', a1='b11'),
                  B2(b_data='b21', b2='b2', a1='b21', cs=[C(), C(), C()]),
                  B3(b_data='b31', b3='b3', a1='b31'),
                  B1(b_data='b12', b1='b1', a1='b12'),
                  B1(b_data='b13', b1='b1', a1='b13'),
                  B2(b_data='b22', b2='b2', a1='b22', cs=[C(), C()]),
                  B3(b_data='b32', b3='b3', a1='b12'),
                  B3(b_data='b33', b3='b3', a1='b33')
            ])
            s.commit()


            q = s.query(A).filter(A.a1.like('%2%'))

            result = q.all()

            print "----- no more SQL ----"
            for b in result:
                  if isinstance(b, B1):
                      print b.b1
                  elif isinstance(b, B2):
                      print b.cs
                      print b.b2
                  elif isinstance(b, B3):
                      print b.b3




            On 04/04/2017 12:16 AM, da...@benchling.com wrote:
             > Hey Mike,
             >
             > Looks like I spoke too soon -- a few more questions:
             >
             > Using the example code you posted, we're actually seeing
            4 additional
             > queries (one per result model), rather than the expected
            3 (one per
             > result model type). If you print context.query inside
            load_extra, I
             > think it's clear why:
             >
             > - the loader sequentially processes loaded models
             >
            
(https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L75
            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L75>)

             > - on processing the first B2, it executes a query to load
            all B2's data
             > - when processing the results of this new query, it now
            processes the
             > second B2 record (under a new context)
             > - load_extra() gets called for the new B2 with a
            different context than
             > the first B2
             > - this now executes _loader_for_cls
             >
             > Note that if you don't call _loader_for_cls in
            load_extra, load_extra is
             > called for each record using the correct context.
             >
             > A related issue is that if you query directly for B2,
            it'll redundantly
             > execute _loader_for_cls.
             >
             > I think we could solve both of these issues by doing this
            instead:
             >
             >     q = orig_query.with_entities(target_cls)
             >     if q == orig_query:
             >         return
             >
             > Unfortunately, the == operator doesn't work for comparing
            queries. Do
             > you have a way to compare query equality, or
            alternatively have a
             > solution to both of these issues?
             >
             > Could you also elaborate on what you meant by this comment?
             >
             >     # store this strong reference so recs don't get lost
            while
             >     # iterating
             >
             >
             > Thanks again for all your help,
             > Damon
             >
             > On Monday, April 3, 2017 at 6:42:17 PM UTC-7,
            da...@benchling.com wrote:
             >
             >     Thanks a ton for your help, Mike!
             >
             >     We played around with it and are pretty happy with
            your solution
             >     using the load() event, so we'll be using that moving
            forward.
             >
             >
             >     Damon
             >
             >     On Wednesday, March 29, 2017 at 2:40:39 PM UTC-7,
            Mike Bayer wrote:
             >
             >         I have a working version of both loading
            relationships via IN as
             >         well as
             >         loading joined inheritance subclasses via IN,
            including your
             >         great idea
             >         that extra eager loaders should continue to work
            for the subclass
             >         loaders.
             >
             >         I've only tested it with one scenario so far and
            both patches
             >         have a
             >         long way to go re: tests and documentation.
             >
             >         the scratch test case is currently in the commit
            message for the
             >         second
             >         patch, which illustrates a base class + subclass
            load where both
             >         classes
             >         have an additional relationship.     This is at
             > https://gerrit.sqlalchemy.org/#/c/359/
            <https://gerrit.sqlalchemy.org/#/c/359/>
             >         <https://gerrit.sqlalchemy.org/#/c/359/
            <https://gerrit.sqlalchemy.org/#/c/359/>>, note this builds
            upon the
             >         previous gerrit at
            https://gerrit.sqlalchemy.org/#/c/352/
            <https://gerrit.sqlalchemy.org/#/c/352/>
             >         <https://gerrit.sqlalchemy.org/#/c/352/
            <https://gerrit.sqlalchemy.org/#/c/352/>>.    The work
             >         at the moment will likely fall down with any kind
            of surprises
             >         but if
             >         you wanted to start running it and finding those
            cases, that is
             >         always
             >         helpful.
             >
             >         So far this looks promising as something that can
            be in 1.2 with
             >         perhaps
             >         some "EXPERIMENTAL" warnings surrounding it, but
            overall 1.2 was
             >         lacking
             >         a "killer ORM feature" so these two would be it.
             >
             >
             >
             >         On 03/23/2017 06:02 PM, mike bayer wrote:
             >         >
             >         >
             >         > On 03/23/2017 02:40 PM, mike bayer wrote:
             >         >>
             >         >>
             >         >> On 03/23/2017 12:53 PM, da...@benchling.com
            wrote:
             >         >>> Hey Mike,
             >         >>>
             >         >>> Thanks for the quick response!
             >         >>>
             >         >>> For developers that are pretty familiar with
            the SQLAlchemy
             >         API, but not
             >         >>> so much the internals, would implementing the
            subqueryloads to
             >         >>> contribute to SA be a reasonable endeavor?
            Could you
             >         ballpark how much
             >         >>> time how long it might take for us to do it?
             >         >>
             >         >> I haven't looked at what this would take, but
            it would be
             >         intricate and
             >         >> also need a lot of tests that are not easy to
            write.    Like,
             >         if you
             >         >> worked on it, you could probably get something
            working, but
             >         then that
             >         >> probably wouldn't be how it really needs to be
            because all
             >         kinds of
             >         >> things that are simple for simple cases don't
            work with the
             >         vast amount
             >         >> of edge cases which we have.
             >         >
             >         > I've created an issue for both a new
            relationship loader and an
             >         > inheritance loader at the same time, since they
            will use very
             >         similar
             >         > paths, at
            https://bitbucket.org/zzzeek/sqlalchemy/issues/3944
            <https://bitbucket.org/zzzeek/sqlalchemy/issues/3944>

><https://bitbucket.org/zzzeek/sqlalchemy/issues/3944

            <https://bitbucket.org/zzzeek/sqlalchemy/issues/3944>>.    A
            POC
             >         > for the relationship loader so far looks to be
            very simple
             >         (but then
             >         > again all the loaders start out very simple...)
            and is at
             >         > https://gerrit.sqlalchemy.org/352
            <https://gerrit.sqlalchemy.org/352>
             >         <https://gerrit.sqlalchemy.org/352
            <https://gerrit.sqlalchemy.org/352>>.   The same
            infrastructure and
             >         > approach would also be used for the mapper
            inheritance loader,
             >         which
             >         > would be enabled via a mapper()-level flag, as
            well as
             >         per-query using a
             >         > new query option.
             >         >
             >         > I can't yet guarantee this will be a 1.2 thing,
            I'd have to
             >         get some
             >         > more momentum going w/ test cases and all
            that.  In a 1.2
             >         release it
             >         > would also be a little bit experimental as new
            loading styles
             >         usually
             >         > have small issues coming up for months or
            years, as people try
             >         more use
             >         > cases.
             >         >
             >         >
             >         >
             >         >
             >         >
             >         >
             >         >>
             >         >> The routine that's loading the additional
            columns just for
             >         one object at
             >         >> a time is here:
             >         >>
             >         >>
             >
            
https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L635
            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L635>

><https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L635

            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L635>>

             >
             >         >>
             >         >> and then here for the bulk of it:
             >         >>
             >
            
https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/mapper.py#L2588
            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/mapper.py#L2588>

><https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/mapper.py#L2588

            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/mapper.py#L2588>>

             >
             >         >>
             >         >>
             >         >>
             >         >> But the much harder part would be how to work
            this step into
             >         the loading
             >         >> infrastructure, which would be somewhere in
             >         >>
             >
            
https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L273
            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L273>

><https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L273

            
<https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/loading.py#L273>>,

             >
             >         >>
             >         >> which is a very intricate function with over a
            decade of
             >         constant
             >         >> refactorings behind it, and I'd have to think
            pretty deeply
             >         for awhile
             >         >> how best to do this.
             >         >>
             >         >> Not to mention that there's more than one way
            to do this
             >         query, there's
             >         >> either re-using the criteria from the original
            query, or there's
             >         >> injecting the primary key ids of the whole
            list of objects
             >         into an IN
             >         >> clause after the fact.  The latter approach is
            probably more
             >         efficient
             >         >> but wouldn't work for composite primary keys
            outside of
             >         Postgresql.   As
             >         >> a built in feature I'd want "IN" loading to be
            an option at
             >         least.
             >         >>
             >         >>
             >         >>
             >         >>>
             >         >>> Regarding STI and relationships, is there any
            way to do that
             >         but still
             >         >>> get the benefits of JTI? e.g. is there an
            easy way to resolve
             >         >>> my_base_class_inst.subclass_prop as a proxy
            to the subclass?
             >         We could
             >         >>> roll our own using __getitem__ but it seems a
            little hairy.
             >         >>
             >         >> So proxying to a related item wouldn't be
            terrible, sure you
             >         could use
             >         >> __getitem__ or you could also add descriptors
            to the primary
             >         class,
             >         >> adding the descriptors to match the "info"
            could be automated
             >         as well
             >         >> (or even do it in the other direction, add
            special descriptor
             >         to main
             >         >> class == a column gets added to related class).
             >         >>
             >         >>
             >         >> Doing the thing where you query() for all the
            related classes
             >         after the

> >> fact yourself might not be that terrible.you can use the

             >         load() event
             >         >> which receives the query context that has the
            query you need
             >         already.  I

> >> guess it's time for proof of concept.Here's that. You

             >         can see at
             >         >> the end we load all the "bs" without any
            per-object load.
             >         >>
             >         >> from sqlalchemy import *
             >         >> from sqlalchemy.orm import *
             >         >> from sqlalchemy.ext.declarative import
            declarative_base
             >         >> from sqlalchemy import event
             >         >>
             >         >> Base = declarative_base()
             >         >>
             >         >>
             >         >> class A(Base):
             >         >>     __tablename__ = 'a'
             >         >>     id = Column(Integer, primary_key=True)
             >         >>     a1 = Column(String)
             >         >>     type = Column(String)
             >         >>
             >         >>     __mapper_args__ = {'polymorphic_on': type}
             >         >>
             >         >>
             >         >> class B1(A):
             >         >>     __tablename__ = 'b1'
             >         >>     id = Column(Integer, ForeignKey('a.id
            <http://a.id> <http://a.id>'),
             >         primary_key=True)
             >         >>     b1 = Column(String)
             >         >>     b_data = Column(String)
             >         >>     __mapper_args__ = {'polymorphic_identity':
            'b1'}
             >         >>
             >         >>
             >         >> class B2(A):
             >         >>     __tablename__ = 'b2'
             >         >>     id = Column(Integer, ForeignKey('a.id
            <http://a.id> <http://a.id>'),
             >         primary_key=True)
             >         >>     b2 = Column(String)
             >         >>     b_data = Column(String)
             >         >>     cs = relationship("C", lazy='subquery')
             >         >>
             >         >>     __mapper_args__ = {'polymorphic_identity':
            'b2'}
             >         >>
             >         >>
             >         >> class C(Base):
             >         >>     __tablename__ = 'c'
             >         >>     id = Column(Integer, primary_key=True)
             >         >>     b2_id = Column(ForeignKey('b2.id
            <http://b2.id> <http://b2.id>'))
             >         >>
             >         >>
             >         >> class B3(A):
             >         >>     __tablename__ = 'b3'
             >         >>     id = Column(Integer, ForeignKey('a.id
            <http://a.id> <http://a.id>'),
             >         primary_key=True)
             >         >>     b3 = Column(String)
             >         >>     b_data = Column(String)
             >         >>     __mapper_args__ = {'polymorphic_identity':
            'b3'}
             >         >>
             >         >>
             >         >> def _loader_for_cls(target, context):
             >         >>     orig_query = context.query
             >         >>
             >         >>     target_cls = type(target)
             >         >>
             >         >>     # take the original query and chance the
            entity to the
             >         subclass
             >         >>     q = orig_query.with_entities(target_cls)
             >         >>
             >         >>     # defer everything that's not PK /
            polymorphic_on from A.
             >          this whole
             >         >>     # bit is just to avoid all those extra
            columns
             >         >>     to_defer = []
             >         >>     mapper = inspect(target).mapper
             >         >>     inherited = mapper.inherits
             >         >>     while inherited is not None:
             >         >>         for attr in inherited.column_attrs:
             >         >>             if not attr.expression.primary_key
            and \
             >         >>                     attr.expression is not
             >         inherited.polymorphic_on:
             >         >>                 to_defer.append(attr.key)
             >         >>         for attr in inherited.relationships:
             >         >>             to_defer.append(attr.key)
             >         >>         inherited = inherited.inherits
             >         >>     q = q.options(*[defer(k) for k in to_defer])
             >         >>
             >         >>     # store this strong reference so recs
            don't get lost while
             >         >>     # iterating
             >         >>     return q.all()
             >         >>
             >         >>
             >         >> @event.listens_for(A, "load", propagate=True)
             >         >> def load_extra(target, context):
             >         >>     key = ('loader_by_cls', type(target))
             >         >>
             >         >>     if key not in context.attributes:
             >         >>         context.attributes[key] =
            _loader_for_cls(target,
             >         context)
             >         >>
             >         >>
             >         >> e = create_engine("sqlite://", echo=True)
             >         >> Base.metadata.create_all(e)
             >         >>
             >         >>
             >         >> s = Session(e)
             >         >> s.add_all([
             >         >>     B1(b_data='b11', b1='b1', a1='b11'),
             >         >>     B2(b_data='b21', b2='b2', a1='b21',
            cs=[C(), C(), C()]),
             >         >>     B3(b_data='b31', b3='b3', a1='b31'),
             >         >>     B1(b_data='b12', b1='b1', a1='b12'),
             >         >>     B1(b_data='b13', b1='b1', a1='b13'),
             >         >>     B2(b_data='b22', b2='b2', a1='b22',
            cs=[C(), C()]),
             >         >>     B3(b_data='b32', b3='b3', a1='b12'),
             >         >>     B3(b_data='b33', b3='b3', a1='b33')
             >         >> ])
             >         >> s.commit()
             >         >>
             >         >>
             >         >> q = s.query(A).filter(A.a1.like('%2%'))
             >         >>
             >         >> result = q.all()
             >         >>
             >         >> print "----- no more SQL ----"
             >         >> for b in result:
             >         >>     if isinstance(b, B1):
             >         >>         print b.b1
             >         >>     elif isinstance(b, B2):
             >         >>         print b.cs
             >         >>         print b.b2
             >         >>     elif isinstance(b, B3):
             >         >>         print b.b3
             >         >>
             >         >>
             >         >>
             >         >>
             >         >>
             >         >>>
             >         >>>
             >         >>> Thanks again,
             >         >>> Damon
             >         >>>
             >         >>> On Wednesday, March 22, 2017 at 3:59:45 PM
            UTC-7, Mike Bayer
             >         wrote:
             >         >>>
             >         >>>
             >         >>>
             >         >>>     On 03/22/2017 02:17 PM, da...@benchling.com
             >         <javascript:> wrote:
             >         >>>     > Hey all,
             >         >>>     >
             >         >>>     > We were wondering if you had any advice
            on having a
             >         large (~10)
             >         >>>     number
             >         >>>     > of polymorphic subclasses for a single
            base class. Using
             >         >>>     > with_polymorphic: '*' causes SQLAlchemy
            to joinedload all
             >         >>> subclasses
             >         >>>     > like this:
             >         >>>     >
             >         >>>     > SELECT ...
             >         >>>     > FROM base_table
             >         >>>     > LEFT OUTER JOIN sub_table_1 ON
            base_table.id <http://base_table.id>
             >         <http://base_table.id>
             >         >>>     <http://base_table.id> = sub_table_1.id
            <http://sub_table_1.id>
             >         <http://sub_table_1.id> <http://sub_table_1.id>
             >         >>>     > LEFT OUTER JOIN sub_table_2 ON
            base_table.id <http://base_table.id>
             >         <http://base_table.id>
             >         >>>     <http://base_table.id> = sub_table_2.id
            <http://sub_table_2.id>
             >         <http://sub_table_2.id> <http://sub_table_2.id>
             >         >>>     > ...
             >         >>>     >
             >         >>>     > Postgres buckles under too many joins,
            and these
             >         queries start
             >         >>>     taking a
             >         >>>     > really long time.
             >         >>>     >
             >         >>>     > One other note is that for most of our
            queries, only a
             >         few of
             >         >>> these
             >         >>>     > sub-tables are actually needed, so most
            of the joins
             >         are wasted.
             >         >>>     > Unfortunately, ahead of time, we don't
            know which
             >         tables will be
             >         >>>     needed
             >         >>>     > -- we're relying on the discriminator.
             >         >>>     >
             >         >>>     > Ideally, we'd be able to specify that
            the ORM should
             >         >>> subqueryload the
             >         >>>     > subclasses (and only execute subqueries
            on the types
             >         that are
             >         >>>     present).
             >         >>>     > This would have to happen both when
            querying the base
             >         table, but
             >         >>> also
             >         >>>     > when accessing relationships. We'd want
            it to execute
             >         a query on
             >         >>> the
             >         >>>     > base table, then execute one query for
            each present
             >         subclass.
             >         >>>     >
             >         >>>     > Another solution might be to use some
            kind of hook that
             >         >>>     >
             >         >>>     > - is executed after a query returns
            with results (or
             >         after a
             >         >>> list of
             >         >>>     > models are added to the session?)
             >         >>>     > - groups the models by type and runs
            its own
             >         subqueries to load
             >         >>>     the data
             >         >>>     >
             >         >>>     > Any help here is greatly appreciated!
             >         >>>
             >         >>>
             >         >>>     The purpose of with_polymorphic is more
            about being able to
             >         >>> filter on
             >         >>>     multiple classes at the same time, which
            is why it uses
             >         joins, but
             >         >>>     these
             >         >>>     don't scale to many subclasses.    Adding
            a subquery
             >         load for the
             >         >>>     related tables would be something that
            the ORM can
             >         someday have as a
             >         >>>     feature, but it would need a lot of tests
            to ensure it's
             >         working as
             >         >>>     advertised.
             >         >>>
             >         >>>     There's a lot of ways to get those other
            tables loaded
             >         but none of
             >         >>> them
             >         >>>     look that great.   Turning off
            with_polymorphic(), one
             >         approach
             >         >>> is to
             >         >>>     collect all the distinct types and
            identifiers from your
             >         query
             >         >>> result;
             >         >>>     then do a separate query for each subtype:
             >         >>>
             >         >>>             result =
            session.query(BaseClass).filter(...).all()
             >         >>>
             >         >>>              types = sorted([(type(obj),
            obj.id <http://obj.id>
             >         <http://obj.id> <http://obj.id>) for obj
             >         >>>     in result],
             >         >>>     key=lambda t: t[0])
             >         >>>
             >         >>>              for type, ids in
            itertools.groupby(types,
             >         key=lambda t:
             >         >>> t[0]):
             >         >>>
             >          session.query(type).filter(type.id.in_(ids)).all()
             >         >>>
             >         >>>     That will emit a query with an INNER JOIN
            for each class
             >         and will
             >         >>>     populate the remaining records in the
            identity map.  The
             >         columns
             >         >>> that
             >         >>>     are already loaded are not re-accessed,
            though the DBAPI
             >         will still
             >         >>>     send

> >>> them over the network to the cursor.You can try

             >         limiting the
             >         >>> columns
             >         >>>     you query for in each statement as well
            by using the
             >         defer() option.
             >         >>>
             >         >>>     Another way is to use with_polymorphic()
            but to provide
             >         a different
             >         >>>     kind
             >         >>>     of SQL statement, like a
            polymorphic_union().   This
             >         would be a
             >         >>>     UNION of
             >         >>>     statements that each have an inner join.
               the resulting
             >         SQL is a
             >         >>> beast
             >         >>>     but it at least isn't using those left
            outer joins.   I
             >         think you
             >         >>> can
             >         >>>     probably use
            sqlalchemy.orm.util.polymorphic_union()
             >         directly to get
             >         >>>     this UNION statement built up automatically.
             >         >>>
             >         >>>     Still another way is to reorganize the
            mappings to use
             >         single-table
             >         >>>     inheritance and relationship() to link
            out to the
             >         related table,
             >         >>> then
             >         >>>     the normal "subqueryload" feature to load
            them as
             >         relationships.
             >         >>> Even
             >         >>>     though this way is ugly, I might use this
            (short of
             >         implementing the
             >         >>>     related table subqueryload feature) just
            to make things
             >         simple.
             >         >>>
             >         >>>
             >         >>>     Definitely a feature that should be added
            but that's not an
             >         >>> immediate
             >         >>>     solution.
             >         >>>
             >         >>>     >
             >         >>>     >
             >         >>>     > Thanks,
             >         >>>     > Damon
             >         >>>     >
             >         >>>     > --
             >         >>>     > SQLAlchemy -
             >         >>>     > The Python SQL Toolkit and Object
            Relational Mapper
             >         >>>     >
             >         >>>     > http://www.sqlalchemy.org/
             >         >>>     >
             >         >>>     > To post example code, please provide an
            MCVE: Minimal,
             >         Complete,
             >         >>> and
             >         >>>     > Verifiable Example. See
             > http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>
             >         <http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>>
             >         >>>     <http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>
             >         <http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>>> for a full
             >         >>>     > description.
             >         >>>     > ---
             >         >>>     > You received this message because you
            are subscribed
             >         to the Google
             >         >>>     > Groups "sqlalchemy" group.
             >         >>>     > To unsubscribe from this group and stop
            receiving
             >         emails from it,
             >         >>>     send
             >         >>>     > an email to
            sqlalchemy+...@googlegroups.com <javascript:>
             >         >>>     >
            <mailto:sqlalchemy+unsubscr...@googlegroups.com
             >         <javascript:>>.
             >         >>>     > To post to this group, send email to
             > sqlal...@googlegroups.com
             >         >>>     <javascript:>
             >         >>>     > <mailto:sqlal...@googlegroups.com
            <javascript:>>.
             >         >>>     > Visit this group at
             > https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>
             >         <https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>>

> >>><https://groups.google.com/group/sqlalchemy

            <https://groups.google.com/group/sqlalchemy>
             >         <https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>>>.
             >         >>>     > For more options, visit
             > https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>
             >         <https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>>
             >         >>>     <https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>
             >         <https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>>>.
             >         >>>
             >         >>> --
             >         >>> SQLAlchemy -
             >         >>> The Python SQL Toolkit and Object Relational
            Mapper
             >         >>>
             >         >>> http://www.sqlalchemy.org/
             >         >>>
             >         >>> To post example code, please provide an MCVE:
            Minimal,
             >         Complete, and
             >         >>> Verifiable Example. See
            http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>
             >         <http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve>> for a full
             >         >>> description.
             >         >>> ---
             >         >>> You received this message because you are
            subscribed to the
             >         Google
             >         >>> Groups "sqlalchemy" group.
             >         >>> To unsubscribe from this group and stop
            receiving emails
             >         from it, send
             >         >>> an email to sqlalchemy+...@googlegroups.com
             >         >>>
            <mailto:sqlalchemy+unsubscr...@googlegroups.com>.
             >         >>> To post to this group, send email to
            sqlal...@googlegroups.com
             >         >>> <mailto:sqlal...@googlegroups.com>.
             >         >>> Visit this group at
             > https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>
             >         <https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>>.
             >         >>> For more options, visit
            https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>
             >         <https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>>.
             >
             > --
             > SQLAlchemy -
             > The Python SQL Toolkit and Object Relational Mapper
             >
             > http://www.sqlalchemy.org/
             >
             > To post example code, please provide an MCVE: Minimal,
            Complete, and
             > Verifiable Example. See
            http://stackoverflow.com/help/mcve
            <http://stackoverflow.com/help/mcve> for a full
             > description.
             > ---
             > You received this message because you are subscribed to
            the Google
             > Groups "sqlalchemy" group.
             > To unsubscribe from this group and stop receiving emails
            from it, send
             > an email to sqlalchemy+...@googlegroups.com
             > <mailto:sqlalchemy+unsubscr...@googlegroups.com>.
             > To post to this group, send email to
            sqlal...@googlegroups.com
             > <mailto:sqlal...@googlegroups.com>.
             > Visit this group at
            https://groups.google.com/group/sqlalchemy
            <https://groups.google.com/group/sqlalchemy>.
             > For more options, visit
            https://groups.google.com/d/optout
            <https://groups.google.com/d/optout>.

--SQLAlchemy -

        The Python SQL Toolkit and Object Relational Mapper

        http://www.sqlalchemy.org/

        To post example code, please provide an MCVE: Minimal, Complete,
        and Verifiable Example. See http://stackoverflow.com/help/mcve
        <http://stackoverflow.com/help/mcve> for a full description.
        ---
        You received this message because you are subscribed to the
        Google Groups "sqlalchemy" group.
        To unsubscribe from this group and stop receiving emails from
        it, send an email to sqlalchemy+...@googlegroups.com <javascript:>.
        To post to this group, send email to sqlal...@googlegroups.com
        <javascript:>.

        Visit this group at https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>.
        For more options, visit https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>.


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, andVerifiable Example. See http://stackoverflow.com/help/mcve for a fulldescription.

---

You received this message because you are subscribed to the GoogleGroups "sqlalchemy" group.To unsubscribe from this group and stop receiving emails from it, sendan email to sqlalchemy+unsubscr...@googlegroups.com<mailto:sqlalchemy+unsubscr...@googlegroups.com>.To post to this group, send email to sqlalchemy@googlegroups.com<mailto:sqlalchemy@googlegroups.com>.

Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--

SQLAlchemy -The Python SQL Toolkit and Object Relational Mapper


http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.

---You received this message because you are subscribed to the Google Groups "sqlalchemy" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Large number of polymorphic subclasses

Reply via email to