[sqlalchemy] Re: Constructing where-clauses dynamically
BTW, this post started out as a reply to James but grew into a collective summary of my thoughts on the entire thread so far. You've been warned :) James Taylor wrote: But wouldn't the join from orders to items be OUTER in this case since you can have orders with a total 50 but no items. (Your application logic probably forbids this, but it is allowed at the relational level). Gets complicated quick! Yes, that's one of the things I started to dislike as I got more into the details. The 'obj1.obj2.obj3' join syntax does not imply the type of join (i.e. inner, outer, cross, etc.). My initial thought is to say that this type of join syntax would always use inner joins, which is easy enough to remember. One thing I'm not quite clear on is how the join type is specified with the other join syntaxes in SA. There's join_to/join_via and Mike talked about from_obj=[users.join(ordertable).join(...)], but both of those are just as ambiguous as obj1.obj2.obj3 when it comes to the join type (and they're a lot more verbose at that). I guess SelectResults has outerjoin_to()...more on that later. Here is a more complete join syntax proposal: INNER JOIN (join path syntax sugar): User.c.orders User.c.orders.items INNER JOIN (non-sugar): join(User.c.orders) join(User.c.orders.items) join(User, Order, User.c.user_id Order.c.user_id) OUTER JOIN: outerjoin(User.c.orders) outerjoin(User.c.orders.items) outerjoin(User, Order, User.c.user_id Order.c.user_id) RANDOM JOIN: random_join(User.c.orders) Just kidding on that last one :P Each join function requires one or three positional arguments. The single-arg version takes a join path as it's argument, and performs the join type represented by function on all joins along the path. A stand-alone join path is syntax sugar for join(join path). The three-arg version takes two join-able objects and a condition on which to join them. I realize that a two-arg version is also available where SA tries to find a relationship between the first two args, and IMHO that should be deprecated because it is error-prone. To do a join based on a mapper relationship, use a join path. To do an ad-hoc join use the three-argument version. In addition to join paths and classes, the 3-arg version should also accept tables, strings (entity names), select clauses, and other join objects as arguments. When using anything other than a join path as the first argument, three arguments are required. So this: outerjoin( join(User.c.orders), Items, Order.c.order_id == Item.c.order_id ) and would generate this: FROM users INNER JOIN orders ON users.user_id = orders.user_id LEFT OUTER JOIN items ON orders.order_id = items.order_id An optional 'alias' keyword argument may also be specified in a join function. If the single-arg version is used then the alias applies to the last table in the join path. Otherwise the alias applies to the second argument. join(User.c.orders, alias='ox') join( User, Order, User.c.user_id == Order.c.user_id, alias='ox' ) both of these expressions generate the following (pseudo) SQL: FROM users LEFT OUTER JOIN orders ox ON users.user_id = ox.user_id Possible alternative syntax: User.c.join(Order) The biggest problem with this alternative is that it clutters the User.c namespace. Like we were saying earlier: One thing we might want to change is the 'alias' function. Since the 'orders' table may have a column named 'alias', it might be better to make it a stand-alone function like this: alias(c.orders, 'ox') Mike wrote: i think this is the problem when you start munging namespaces together, as opposed to grouping them under explicit prefixes (i.e. 'c'). But 'c' is actually a conglomeration of namespaces including columns, functions, operations, etc. 'c' should only contain columns and relationships and everything else should be moved to a separate package (maybe sqlalchemy.query or even the root sqlalchemy package). I also think something similar to the SelectResults transformative style of creating joins would be useful if it was built-in to Query (with a few name changes such as 'join_to' should be 'join' and 'outerjoin_to' should be 'outer_join'). One thing I don't like about SelectResults is that it gives special attention to a few functions (avg, sum, min, max, etc.). These functions can incorporated into a query with the more versatile 'func.xyz' syntax, and would just add to the confusion about how to use functions. Anyway, there's another brain dump for now. ~ Daniel --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
[sqlalchemy] Re: Constructing where-clauses dynamically
On Nov 9, 2006, at 10:20 PM, Daniel Miller wrote: Can you give an example of the SQL (including the joins) that would be generated by your statement above? (user.user_id=orders.user_id AND orders.order_id=items.order_id AND items.item_name='item #4') OR (user.user_id=orders.user_id AND orders.order_id=items.order_id AND items.item_name='item #5') Right. It should be possible (although I'm not sure how simple) to combine conditions using the rules for combining logical expressions (i.e. commutative, transitive, etc.). For example: (A == B AND C == D) OR (A == B AND C == E) can be reduced to (A == B) AND (C == D OR C == E) So what we need is a way to take a group of expressions and reduce them to their simplest form before generating the SQL. However, don't most main-stream databases do this kind of optimization anyway? MySQL does (http://dev.mysql.com/doc/refman/5.0/en/where- optimizations.html). this is what i was getting at; an innocent feature is already spawning complicated expression parsing before we've even written any code. Its possible the databases optimize a clause like the above...then again i dont know if this approach is always going to come up with a valid clause. by adding our own syntaxes, as opposed to using syntaxes that always map directly to SQL, we just dont know (since im not a language syntax expert). Having said all that, it may be simpler to use join_to/join_via in some cases and maybe that's a good reason to keep those functions around. However, I think this new syntax would still be very valuable in many cases. Being able to say user.c.orders==someorder is not a big deal. user.c.orders.items compiling into a Join (or into the join clause ? im not sure) im still not sure how that would be used (i.e. do you stick it in the WHERE criterion or in the from_obj)? The syntax I like best is the one provided by the selectresults extension, where you can just say (well, it has just join_to right now, but this is the idea): query.select().join_via('orders', 'items').select (Item.c.item_name=='item #4') this is also very clear if you want to outerjoin: query.select().outerjoin_via('orders', 'items').select() its quick and its explicit. users.c.orders.items seems more ambiguous. ive been concerned that SelectResults is not easy enough to use (i.e. you need a mapper extension, or a separate SelectResults construction).im thinking it should be built-in to Query somehow, i.e. just say query.selectresults() and start building (but even that's too verbose). Im still wondering if Query should have just been built that way from the start - using a strictly transformative interface instead of keyword-args-based. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
On Nov 10, 2006, at 12:13 AM, Daniel Miller wrote: This query: q.select(or_( c.orders.items.item_name == 'item#4', c.orders.items.item_name == 'item #5' )) Should generate something similar to this SQL: SELECT ... FROM users u INNER JOIN orders o ON u.user_id = o.user_id INNER JOIN items i ON o.order_id = i.order_id WHERE i.item_name = 'item#4' OR i.item_name = 'item #5' For the given clause, each occurrence of User.c.orders.items is combined into a single join path from users to items, and every occurrence of 'item_name' uses the same table alias 'i'. Is that too simplistic? Here are a few more examples: its kind of what SQL construction does now, i.e. extracts unique FROM objects from the where criterion and adds them to a list, so the approach would probably work (might possibly have other issues we arent thinking of just yet) Notice how the join path for Users.c.orders is superimposed on the join path for Users.c.orders.items. It should be possible to use the alias function to force separate joins like this: q.select(and_( c.orders.alias('ox').order_total 50, or_( c.orders.alias('o').items.item_name == 'item #4', c.orders.alias('o').items.item_name == 'item #5', ) )) ASIDE: One thing we might want to change is the 'alias' function. Since the 'orders' table may have a column named 'alias', it might be better to make it a stand-alone function like this: alias(c.orders, 'ox') i think this is the problem when you start munging namespaces together, as opposed to grouping them under explicit prefixes (i.e. 'c'). so far I think this new syntax is going to be confusing to people since its not explicitly clear what it means underneath (which leaves people to guess, get it wrong, and then report it as bugs/ confusion), and also presents a completely different way to create joins when we already have more explicit ways to do it (inlcuding the selectresults transformative style which i think is very fast and clear). when you start throwing aliases into the mix i think it becomes more overwhelming. compare to the current method: ox = ordertable.alias('ox') q.select(and_(ox.c.order_total 50, or_(itemtable.c.item_name=='item #4', itemtable.c.item_name=='item #5')), from_obj=[users.join (ordertable).join(itemstable).join(ox, users.c.user_id==ox.c.user_id)]) more explicit...but I think clearer. the user definitely knows what they are doing when they do this (note that from_obj hasnt really worked in query() up until version 0.3.0 which is why it looks less familiar, but i think also aids in creating more complex queries). i think people should be expressing exactly the joins they want if they want to be joining at all. I think your original proposal, i.e. User.c.orders == someorder is reasonable since its hard to misunderstand what that does. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
but what happens if i say: q.select(or_(User.c.orders.items.item_name == 'item#4', User.c.orders.items.item_name == 'item #5')) if we naively convert c.orders.items.item_name=='item #4' into user.user_id=orders.user_id and orders.order_id=items.order_id and items.item_name='item #4, then the other clause for item #5 generates into the same thing and you get an inefficient query. i wonder also if some expressions above dont work correctly if the join conditions are repeated like that. its still better to say: q.select(or_(Item.c.item_name == 'item#4', Item.c.item_name == 'item #5'), from_obj=[c.orders.items]) isnt it ? (User.c.orders.items would be a synonym for query.join_via ('orders', 'items')) On Nov 9, 2006, at 12:38 AM, Michael Bayer wrote: On Nov 8, 2006, at 10:00 PM, Daniel Miller wrote: q = session.query(User) c = getcols(User) q.select( (c.addresses.street == 'some address') (c.orders.items.item_name == 'item #4') ) ohh, wow. heh. i had this discomfort with adding relationships to c, but then you just wrapped up the whole mess of join_to/ join_via into one consistent syntax there, didnt you. youll notice in the docs for join_to/join_via theyre marked with alpha API (meaning I have been antsy with them anyway)so this syntax is compelling. you have to pipe in more often Dan ! --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
Michael Bayer wrote: but what happens if i say: q.select(or_(User.c.orders.items.item_name == 'item#4', User.c.orders.items.item_name == 'item #5')) if we naively convert c.orders.items.item_name=='item #4' into user.user_id=orders.user_id and orders.order_id=items.order_id and items.item_name='item #4, then the other clause for item #5 generates into the same thing and you get an inefficient query. i wonder also if some expressions above dont work correctly if the join conditions are repeated like that. Can you give an example of the SQL (including the joins) that would be generated by your statement above? its still better to say: q.select(or_(Item.c.item_name == 'item#4', Item.c.item_name == 'item #5'), from_obj=[c.orders.items]) isnt it ? (User.c.orders.items would be a synonym for query.join_via ('orders', 'items')) Right. It should be possible (although I'm not sure how simple) to combine conditions using the rules for combining logical expressions (i.e. commutative, transitive, etc.). For example: (A == B AND C == D) OR (A == B AND C == E) can be reduced to (A == B) AND (C == D OR C == E) So what we need is a way to take a group of expressions and reduce them to their simplest form before generating the SQL. However, don't most main-stream databases do this kind of optimization anyway? MySQL does (http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html). Having said all that, it may be simpler to use join_to/join_via in some cases and maybe that's a good reason to keep those functions around. However, I think this new syntax would still be very valuable in many cases. FWIW, an equivalent but slightly more concise version of your query above would be this: q.select(user.c.orders.items.item_name.in_(item#4, item #5)) ~ Daniel --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
Michael Bayer wrote: ok let me rephrase that... i have concerns. i think the concerns could be addressed, and we might be able to add this kind of feature...but i dont want to rush into it singlehandedly. I won't blame you for that. I'll help out as much as possible given my busy schedule. heres the things that should be resolved: - we add non-column properties to SomeClass.c. people will get confused that SomeClass.c has a bunch of stuff on it that mytable.c does not. do we change the name of 'c' ? Part of me say adding extra columns to SomeClass.c wouldn't be so confusing, and would actually be intuitive. For example in the original question that started this thread it seemed logical to me that 'order' should have been in there since it was a mapper property. Sure maybe it's a bit more than meets the eye at first, but I think it's very logical to have mapper properties in SomeClass.c but not have them in table.c. As long as it's well documented I don't think it would be a problem. The only other thing I can think of is to create a function that would return a cols object. In the spirit of python's getattr() function: from sqlalchemy.orm import getcols c = getcols(SomeClass) session.query(SomeClass).select(c.order == order) I often find myself assigning SomeClass.c to a local variable when I'm building a query anyway. It prevents excess verbosity. - people will say things like, SomeClass.c.order order. how do we interpret that ? esp. for composite foreign keys. I assume you're implicitly asking about the choice to use AND or OR to combine multiple comparisons of a composite key. For example (assume Order and SomeClass are related by user_id and quote_id): # given this order = Order(user_id=4, quote_id=5) session.query(SomeClass).select(SomeClass.c.order order) -- Do we generate this? WHERE some_class.user_id 4 AND some_class.quote_id 5 -- or this? WHERE some_class.user_id 4 OR some_class.quote_id 5 I don't think the , =, , and = operators would be very useful on most relationship comparisons; especially those with composite keys. However, the == and != operators are obviously useful in many cases. It's like comparing user-defined Python objects without explicit __gt__ and __lt__ implementations. Python 2.x compares the id() of the objects, while Python 3000 will raise a TypeError (I think that's the right error). The reasoning behind this decision[0] is that unless there is an explicit ordering for a given pair of objects, it doesn't make sense to compare them using the inequality operators. At any rate, I would recommend not implementing those comparison operations for relationships by default (at least not for multi-column relationships). If someone really needed them maybe they could create their own custom relationship subclass? [0] http://mail.python.org/pipermail/python-dev/2005-November/057925.html - is it really as simple as just producing the primary/foreign key columns of both sides ? i fear that a whole class of patterns are going to become apparent after adding that feature, and then people are going to complain about SA being broken until an entire HQL-like layer is added to implement all those features. maybe theyre going to stick SomeClass.c.order into the order_by clause, theyre going to stick it in func() calls, CAST calls, etc. and expecting all this magical behavior. pretty much every other feature of SA proved to be a gargantuan undertaking compared to how easy i thought it would be when i first wrote it :)how would this feature be presented ? Could we implement it as simply as possible for now and see how it's used? As new use cases come up we can either extend the implementation or explain why they are not supported (i.e. too complicated, inconsistent with other parts of SA, etc.). - is this feature also going to figure out all the joins from class A to B, if my query is based on class A and the criterion object is all the way on B? the way select_by, join_to works ? that seems trickier and i can see people expecting that behaivor as well. This is something I thought about immediately when I first imagined the idea. Personally, I don't really like the automagical join behavior because it's not explicit and can lead to obscure bugs that are hard to track down (why is this join suddenly returning the wrong results? ...hours later we discover that a new relationship was added that caused SA to pick a different relationship path for the join). The problem is that a seemingly unrelated (no pun intended) change can cause these magic joins to be interpreted differently. I tried to use that behavior once myself but quickly discarded it when SA used some other relationship path than the one I intended to use. In general I think it's best to require explicit relationship paths at all times and give useful errors when the given path can't be determined using those strict rules. if we
[sqlalchemy] Re: Constructing where-clauses dynamically
On Nov 8, 2006, at 10:00 PM, Daniel Miller wrote: q = session.query(User) c = getcols(User) q.select( (c.addresses.street == 'some address') (c.orders.items.item_name == 'item #4') ) ohh, wow. heh. i had this discomfort with adding relationships to c, but then you just wrapped up the whole mess of join_to/ join_via into one consistent syntax there, didnt you. youll notice in the docs for join_to/join_via theyre marked with alpha API (meaning I have been antsy with them anyway)so this syntax is compelling. you have to pipe in more often Dan ! --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
dmiller wrote: Is there a reason why this doesn't work: orders = Table('orders', meta, Column('id', Integer, Sequence('order_id_seq'), primary_key=True), ... ) items = Table('items', meta, Column('id', Integer, Sequence('item_id_seq'), primary_key=True), Column('order_id', Integer, ForeignKey(orders.c.id), nullable=False), ... ) class Order(object): pass class Item(object): pass itemMapper = mapper(Item, items) orderMapper = mapper(Order, orders, properties=dict( items=relation(itemMapper, backref=order) )) session = create_session() order = session.query(Order).get(1) # assume order exists itemsNotInOrder = session.query(Item).select(Item.c.order != order) # ERROR! This should work. itemsNotInOrder = session.query(Item).select(Item.c.order_id != order.id) The Item.c object does not have an 'order' attribute. Is there a reason why it can't have one? I would guess that attributes of Item.c are Column instances, which order is not. Your approach seems intuitive (I did the same thing once), but the above example I think is easy enough. I could be wrong or missing something. Just trying to be helpful. Randall ~ Daniel --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
On Nov 6, 2006, at 3:31 PM, Randall Smith wrote: dmiller wrote: Is there a reason why this doesn't work: ... order = session.query(Order).get(1) # assume order exists itemsNotInOrder = session.query(Item).select(Item.c.order != order) # ERROR! This should work. itemsNotInOrder = session.query(Item).select(Item.c.order_id != order.id) Thanks for the response Randall. I know that works...read on below. The Item.c object does not have an 'order' attribute. Is there a reason why it can't have one? I would guess that attributes of Item.c are Column instances, which order is not. Your approach seems intuitive (I did the same thing once), but the above example I think is easy enough. I could be wrong or missing something. Just trying to be helpful. What I'm getting at is that SA has all the necessary details to create the WHERE clause automatically (it already does something very similar when constructing joins). My case is a simple case with a single-column foreign key (it's even a pretty stupid case at that :). However, it demonstrates a powerful query construction concept that seems to be lacking from SA. Ideally it should also work with a multi- column foreign key, which is much more tedious to do manually (as you suggested) because it requires a separate condition for each column in the key. It seams like this would be fairly simple to add, maybe I'll give it a try if I can find the time. ~ Daniel --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---
[sqlalchemy] Re: Constructing where-clauses dynamically
the c namespace is only for table columns. if we wanted a namespace that had represented mapper properties, which is not just column-mapped attributes but relationships as well, that would be something else altogether. we'd have to define a whole package of expression objects that represent class attributes instead of columns. its not necessarily a bad idea but its a big can of worms (this notion came up many months ago as well)...it would sort of be like porting HQL, more or less. --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups sqlalchemy group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~--~~~~--~~--~--~---