On 6/10/15 10:32 PM, Eric Ogren wrote:
Hello,

I have a very simple data model with a one to many relationship - a thread could have many identifiers that link to it:

class Thread(Base):
        __tablename__ = 'threads'

        id = Column(Integer, primary_key=True)
        link = Column(String(length=512, collation=UTF8), nullable=False)
        identifiers = relationship('ThreadIdentifier', backref='thread')

        def __repr__(self):
return u"<Thread: link {0} id {1}>".format(self.link, self.id)


class ThreadIdentifier(Base):
        __tablename__ = 'threads_identifiers'

id = Column(Integer, primary_key=True) # lame, (identifier, thread_id could be PK)
        identifier = Column(String(length=255), index=True)
thread_id = Column(Integer, ForeignKey('threads.id'), nullable=False, primary_key=True)

        def __repr__(self):
return u"<ThreadIdentifier: {0} (tid {1})>".format(self.identifier, self.thread_id)

If I try to query for threads with a certain identifier, the following SQL gets generated:

> s.query(Thread).filter(Thread.identifiers.any(ThreadIdentifier.identifier == 'foo')).one() 015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine SELECT threads.id AS threads_id, threads.link AS threads_link
FROM threads
WHERE EXISTS (SELECT 1
FROM threads_identifiers
WHERE threads.id = threads_identifiers.thread_id AND threads_identifiers.identifier = %s)
2015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine ('foo',)

The query returns the correct results but is incredibly slow because MySQL's query engine apparently doesn't optimize this EXISTS clause. It does a full table scan of the threads table and a subquery for each row.

yup


If I use an inner join instead (this was just raw SQL I typed in):

select threads.id, threads.link FROM threads INNER JOIN threads_identifiers ON threads_identifiers.thread_id = threads.id WHERE threads_identifiers.identifier = 'foo';

Then MySQL is smart enough to use indices and the query returns incredibly quickly.

yup

Is there a good way to leverage the .any() filter in MySQL? I can sort of work around this by doing something like:

s.query(ThreadIdentifier).filter(ThreadIdentifier.identifier == 'foo').one().thread

but the code is much trickier to read.

To leverage any() in MySQL, not that I know of, unless the number of rows being scanned is very small and/or MySQL/MariaDB eventually improves upon this. So much for SQL being "declarative".

To get your JOIN, do that with the query.  Use a join():

s.query(Thread).join(Thread.identifiers).filter(ThreadIdentifier.identifier == 'foo')




thanks
Eric

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com <mailto:sqlalchemy+unsubscr...@googlegroups.com>. To post to this group, send email to sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to