On Mar 21, 2011, at 7:21 AM, Wouter Overmeire wrote: > New to sql and sqlalchemy. > Using an in memory sqlite database. > > Trying to do some queries based on the examples in the sqlalchemy > documentation. > > I have classes: Tag, Article, Author, Award > article <--> tag: many to many > article <--> author: many to many (so one article can have multiple > authors) > award <--> author: many to many > > Find all articles written by 'author1' > articles = > session.query(Article).filter(Article.authors.any(Author.name == > 'author1')).all() > This works fine, until i start increasing the number of articles and > authors. > For 50 authors with each 1000 articles and each article has 10 tags > (out of 100 tags in total), the above query takes forever, i never let > it complete and interrupt it after 15 minutes. The above query is just > as in the documentation (http://www.sqlalchemy.org/docs/orm/ > tutorial.html?highlight=blogpost#building-a-many-to-many- > relationship), is it suppose to work on larger tables? Very likely i`m > doing something wrong.
Not really, the EXISTS format is known to be generally slower than a straight INNER JOIN. The operator is convenient, especially when doing a NOT EXISTS, but if looking for positive inclusion of sets the more verbose join() approach is typically more efficient. > > If i do: > articles = > session.query(Article).join(Article.authors).filter(Author.name == > 'author1').all() this is done in a fraction of a second. > > Now trying to find articles written by 'author1' and having tag > 'tag1'. > qArticleByTag = > session.query(Article).join(Article.tags).filter(Tag.tag == 'tag1') > qArticleByAuthor = > session.query(Article).join(Article.authors).filter(Author.name == > 'author1') > articles = qArticleByTag.intersect(qArticleByAuthor).all() > > This works fine, but somehow i have the feeling there are more > efficient ways. the SQL set operators like UNION, INTERSECT, etc. are not the most efficient and are cumbersome to use. Using two joins() at once would be more straightforward: session.query(Article).join(Article.tags).filter(Tag.tag=='tag1').join(Article.authors).filter(Author.name=='author1') three sets of rows are intersected above, returning only those rows which are correlated and meet all criterion. > Since i think the above query goes twice through all > the articles. with INTERSECT, probably, though its possible some query optimizers could figure out a more efficient pattern here (I'm not a deep expert in query optimizers to know for sure, however). Sticking with chains of joins is the way to go in any case. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.