On 6/10/15 10:32 PM, Eric Ogren wrote:
Hello,
I have a very simple data model with a one to many relationship - a
thread could have many identifiers that link to it:
class Thread(Base):
__tablename__ = 'threads'
id = Column(Integer, primary_key=True)
link = Column(String(length=512, collation=UTF8), nullable=False)
identifiers = relationship('ThreadIdentifier', backref='thread')
def __repr__(self):
return u"<Thread: link {0} id {1}>".format(self.link,
self.id)
class ThreadIdentifier(Base):
__tablename__ = 'threads_identifiers'
id = Column(Integer, primary_key=True) # lame, (identifier,
thread_id could be PK)
identifier = Column(String(length=255), index=True)
thread_id = Column(Integer, ForeignKey('threads.id'),
nullable=False, primary_key=True)
def __repr__(self):
return u"<ThreadIdentifier: {0} (tid
{1})>".format(self.identifier, self.thread_id)
If I try to query for threads with a certain identifier, the following
SQL gets generated:
>
s.query(Thread).filter(Thread.identifiers.any(ThreadIdentifier.identifier
== 'foo')).one()
015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine SELECT
threads.id AS threads_id, threads.link AS threads_link
FROM threads
WHERE EXISTS (SELECT 1
FROM threads_identifiers
WHERE threads.id = threads_identifiers.thread_id AND
threads_identifiers.identifier = %s)
2015-06-10 19:02:26,181 INFO sqlalchemy.engine.base.Engine ('foo',)
The query returns the correct results but is incredibly slow because
MySQL's query engine apparently doesn't optimize this EXISTS clause.
It does a full table scan of the threads table and a subquery for each
row.
yup
If I use an inner join instead (this was just raw SQL I typed in):
select threads.id, threads.link FROM threads INNER JOIN
threads_identifiers ON threads_identifiers.thread_id = threads.id
WHERE threads_identifiers.identifier = 'foo';
Then MySQL is smart enough to use indices and the query returns
incredibly quickly.
yup
Is there a good way to leverage the .any() filter in MySQL? I can sort
of work around this by doing something like:
s.query(ThreadIdentifier).filter(ThreadIdentifier.identifier ==
'foo').one().thread
but the code is much trickier to read.
To leverage any() in MySQL, not that I know of, unless the number of
rows being scanned is very small and/or MySQL/MariaDB eventually
improves upon this. So much for SQL being "declarative".
To get your JOIN, do that with the query. Use a join():
s.query(Thread).join(Thread.identifiers).filter(ThreadIdentifier.identifier
== 'foo')
thanks
Eric
--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.