Stodge wrote:
Thanks. That doesn't quite work. Based on my data, the following
should (and does) work because it only returns document id=1, which
only has these two tags:
tag_list = ['my document', 'source code']
session.query(Document).\
filter(Document.tags.any(Tag.tag.in_([t for t in tag_list]))).\
filter(~Document.tags.any(~Tag.tag.in_([t for t in tag_list])))
The following should return no records, as there is no document that
has only these tags. Instead it returns document id=2, which only has
the tag 'random stuff':
tag_list = ['my document', 'source code', 'random stuff']
session.query(Document).\
filter(Document.tags.any(Tag.tag.in_([t for t in tag_list]))).\
filter(~Document.tags.any(~Tag.tag.in_([t for t in tag_list])))
Try this query instead:
tag_list = ['my document', 'source code', 'random stuff']
q = session.query(Document)
for t in tag_list:
q = q.filter(Document.tags.any(Tag.tag == t))
q = q.filter(~Document.tags.any(~Tag.tag.in_(tag_list)))
Alternatively, if your database supports aggregating row sets into
arrays/strings (e.g. PostgreSQL supports ARRAY(SELECT ...) to collect
row sets into an array, MySQL and others have GROUP_CONCAT() that I
believe you can use for this purpose), you may prefer a different
technique. I will show an example with PostgreSQL and ARRAY().
subq = session.query(Tag.tag)
subq = subq.join(document_tags)
subq = subq.filter(document_tags.c.document_id == Document.id)
subq = subq.order_by(Tag.tag.asc())
subq = subq.correlate(Document)
subq = subq.subquery()
q = session.query(Document)
q = q.filter(func.array(subq) == sorted(tag_list))
q = q.correlate(s) # May not be needed
This should generate SQL like:
SELECT Document columns
FROM document
WHERE ARRAY(
SELECT tag.tag
FROM tag
JOIN document_tags ON document_tags.tag_id = tag.id
WHERE document_tags.document_id = document.id
ORDER BY tag.tag ASC
) = %(array_1)s
where array_1 would be ['my document', 'random stuff', 'source code'].
Note how both the subquery and array_1 have to be sorted. I tend to
prefer this type of query since its complexity doesn't grow as you add
more tags in your search criteria.
-Conor
On Mar 19, 10:15 am, Michael Bayer mike...@zzzcomputing.com wrote:
select document.* from document join tags on document.id=tags.document_id
where tags.tag='foo' and tags.tag='bar' and tags.tag=
am I missing something ? that would return no rows in most cases.
if you want to find documents that have an exact list of tags, you'd have to
do something like the IN query we started with, and additionally ensure no
extra tags remain.
like:
sess.query(Document).\
filter(Document.tags.any(Tag.id.in_([t.id for t in
tag_list])).\
filter(~Document.tags.any(~Tag.id.in_([t.id for t in
tag_list]))
On Mar 19, 2010, at 8:31 AM, Stodge wrote:
Now we're getting somewhere:
expressions = []
for tag in tag_list:
expressions += [Tag.tag==tag]
documents =
session.query(Document).join(Document.tags).filter(and_(*expressions))
Thanks to a Storm example I found. :)
On Mar 19, 8:12 am, Stodge sto...@gmail.com wrote:
Ok so far I have this:
expressions = []
for tag in tag_list:
expressions += session.query(Document).filter(Tag.tag==tag)
documents =
session.query(Document).join(Document.tags).filter(and_(*expressions))
Doesn't work but it's progress! :)
On Mar 18, 2:37 pm, Stodge sto...@gmail.com wrote:
Thanks that worked beautifully.
On a similar note, how would I match documents with only the tags that
I specify in the list? My naive attempt is:
for tag in tag_list:
session.query(Document).join(Document.tags).filter_by(tag=tag)
But that doesn't work.
On Mar 15, 10:54 am, Michael Bayer mike...@zzzcomputing.com wrote:
Stodgewrote:
I have two classes with a third table:
document_tags = Table('document_tags', metadata,
Column('document_id', Integer, ForeignKey('documents.id')),
Column('tag_id', Integer, ForeignKey('tags.id'))
)
class Document(Base):
__tablename__ = 'documents'
id = Column(Integer, primary_key=True)
title = Column(String)
filename = Column(String)
tags = relation('Tag', secondary=document_tags, backref='tags')
def __init__(self, title, filename):
self.title = title
self.filename = filename
class Tag(Base):
__tablename__ = 'tags'
id = Column(Integer, primary_key=True)
tag = Column(String)
def __init__(self, tag):
self.tag = tag
I want to find all documents with tags in a given list of tags:
documents =
session.query(Document).filter(Document.tags.in_(tag_list))