Re: [sqlalchemy] .any() queries and MySQL performance

Mike Bayer Wed, 10 Jun 2015 21:01:16 -0700


On 6/10/15 10:32 PM, Eric Ogren wrote:

Hello,
I have a very simple data model with a one to many relationship - athread could have many identifiers that link to it:
class Thread(Base):
        __tablename__ = 'threads'

        id = Column(Integer, primary_key=True)
        link = Column(String(length=512, collation=UTF8), nullable=False)
        identifiers = relationship('ThreadIdentifier', backref='thread')

        def __repr__(self):
return u"<Thread: link {0} id {1}>".format(self.link,self.id)
class ThreadIdentifier(Base):
        __tablename__ = 'threads_identifiers'
id = Column(Integer, primary_key=True) # lame, (identifier,thread_id could be PK)
        identifier = Column(String(length=255), index=True)
thread_id = Column(Integer, ForeignKey('threads.id'),nullable=False, primary_key=True)
        def __repr__(self):
return u"<ThreadIdentifier: {0} (tid{1})>".format(self.identifier, self.thread_id)
If I try to query for threads with a certain identifier, the followingSQL gets generated:
>s.query(Thread).filter(Thread.identifiers.any(ThreadIdentifier.identifier== 'foo')).one()015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine SELECTthreads.id AS threads_id, threads.link AS threads_link
FROM threads
WHERE EXISTS (SELECT 1
FROM threads_identifiers
WHERE threads.id = threads_identifiers.thread_id ANDthreads_identifiers.identifier = %s)
2015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine ('foo',)
The query returns the correct results but is incredibly slow becauseMySQL's query engine apparently doesn't optimize this EXISTS clause.It does a full table scan of the threads table and a subquery for eachrow.

yup

If I use an inner join instead (this was just raw SQL I typed in):
select threads.id, threads.link FROM threads INNER JOINthreads_identifiers ON threads_identifiers.thread_id = threads.idWHERE threads_identifiers.identifier = 'foo';
Then MySQL is smart enough to use indices and the query returnsincredibly quickly.

yup

Is there a good way to leverage the .any() filter in MySQL? I can sortof work around this by doing something like:
s.query(ThreadIdentifier).filter(ThreadIdentifier.identifier =='foo').one().thread
but the code is much trickier to read.

To leverage any() in MySQL, not that I know of, unless the number ofrows being scanned is very small and/or MySQL/MariaDB eventuallyimproves upon this. So much for SQL being "declarative".


To get your JOIN, do that with the query.  Use a join():

s.query(Thread).join(Thread.identifiers).filter(ThreadIdentifier.identifier== 'foo')

thanks
Eric

--
You received this message because you are subscribed to the GoogleGroups "sqlalchemy" group.To unsubscribe from this group and stop receiving emails from it, sendan email to sqlalchemy+unsubscr...@googlegroups.com<mailto:sqlalchemy+unsubscr...@googlegroups.com>.To post to this group, send email to sqlalchemy@googlegroups.com<mailto:sqlalchemy@googlegroups.com>.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] .any() queries and MySQL performance

Reply via email to