[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-11 Thread Daniel Miller


BTW, this post started out as a reply to James but grew into a collective 
summary of my thoughts on the entire thread so far. You've been warned :)

James Taylor wrote:
 But wouldn't the join from orders to items be OUTER in this case  
 since you can have orders with a total  50 but no items. (Your  
 application logic probably forbids this, but it is allowed at the  
 relational level).
 
 Gets complicated quick!

Yes, that's one of the things I started to dislike as I got more into the 
details. The 'obj1.obj2.obj3' join syntax does not imply the type of join (i.e. 
inner, outer, cross, etc.).  My initial thought is to say that this type of 
join syntax would always use inner joins, which is easy enough to remember.

One thing I'm not quite clear on is how the join type is specified with the 
other join syntaxes in SA. There's join_to/join_via and Mike talked about 
from_obj=[users.join(ordertable).join(...)], but both of those are just as 
ambiguous as obj1.obj2.obj3 when it comes to the join type (and they're a lot 
more verbose at that). I guess SelectResults has outerjoin_to()...more on that 
later.

Here is a more complete join syntax proposal:

INNER JOIN (join path syntax sugar):
User.c.orders
User.c.orders.items

INNER JOIN (non-sugar):
join(User.c.orders)
join(User.c.orders.items)
join(User, Order, User.c.user_id  Order.c.user_id)

OUTER JOIN:
outerjoin(User.c.orders)
outerjoin(User.c.orders.items)
outerjoin(User, Order, User.c.user_id  Order.c.user_id)

RANDOM JOIN:
random_join(User.c.orders)

Just kidding on that last one :P

Each join function requires one or three positional arguments. The single-arg 
version takes a join path as it's argument, and performs the join type 
represented by function on all joins along the path. A stand-alone join path is 
syntax sugar for join(join path). The three-arg version takes two join-able 
objects and a condition on which to join them. I realize that a two-arg version 
is also available where SA tries to find a relationship between the first two 
args, and IMHO that should be deprecated because it is error-prone.

To do a join based on a mapper relationship, use a join path. To do an ad-hoc 
join use the three-argument version.

In addition to join paths and classes, the 3-arg version should also accept 
tables, strings (entity names), select clauses, and other join objects as 
arguments. When using anything other than a join path as the first argument, 
three arguments are required. So this:

outerjoin(
join(User.c.orders),
Items,
Order.c.order_id == Item.c.order_id
)

and would generate this:

FROM users
INNER JOIN orders
ON users.user_id = orders.user_id
LEFT OUTER JOIN items
ON orders.order_id = items.order_id

An optional 'alias' keyword argument may also be specified in a join function. 
If the single-arg version is used then the alias applies to the last table in 
the join path. Otherwise the alias applies to the second argument.

join(User.c.orders, alias='ox')

join(
User, Order,
User.c.user_id == Order.c.user_id,
alias='ox'
)

both of these expressions generate the following (pseudo) SQL:

FROM users
LEFT OUTER JOIN orders ox
ON users.user_id = ox.user_id



Possible alternative syntax:

User.c.join(Order)

The biggest problem with this alternative is that it clutters the User.c 
namespace. Like we were saying earlier:

 One thing we might want to change is the 'alias' function.  
 Since the 'orders' table may have a column named 'alias', it might  
 be better to make it a stand-alone function like this:
 alias(c.orders, 'ox')

 Mike wrote:
 i think this is the problem when you start munging namespaces  
 together, as opposed to grouping them under explicit prefixes (i.e.  
 'c').

But 'c' is actually a conglomeration of namespaces including columns, 
functions, operations, etc. 'c' should only contain columns and relationships 
and everything else should be moved to a separate package (maybe 
sqlalchemy.query or even the root sqlalchemy package).


I also think something similar to the SelectResults transformative style of 
creating joins would be useful if it was built-in to Query (with a few name 
changes such as 'join_to' should be 'join' and 'outerjoin_to' should be 
'outer_join'). One thing I don't like about SelectResults is that it gives 
special attention to a few functions (avg, sum, min, max, etc.). These 
functions can incorporated into a query with the more versatile 'func.xyz' 
syntax, and would just add to the confusion about how to use functions.


Anyway, there's another brain dump for now.

~ Daniel

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en

[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-10 Thread Michael Bayer


On Nov 9, 2006, at 10:20 PM, Daniel Miller wrote:


 Can you give an example of the SQL (including the joins) that would  
 be generated by your statement above?


(user.user_id=orders.user_id AND orders.order_id=items.order_id AND
items.item_name='item #4') OR
(user.user_id=orders.user_id AND orders.order_id=items.order_id AND
items.item_name='item #5')


 Right. It should be possible (although I'm not sure how simple) to  
 combine conditions using the rules for combining logical  
 expressions (i.e. commutative, transitive, etc.). For example:

 (A == B AND C == D) OR (A == B AND C == E)

 can be reduced to

 (A == B) AND (C == D OR C == E)

 So what we need is a way to take a group of expressions and reduce  
 them to their simplest form before generating the SQL. However,  
 don't most main-stream databases do this kind of optimization  
 anyway? MySQL does (http://dev.mysql.com/doc/refman/5.0/en/where- 
 optimizations.html).


this is what i was getting at; an innocent feature is already  
spawning complicated expression parsing before we've even written any  
code.  Its possible the databases optimize a clause like the  
above...then again i dont know if this approach is always going to  
come up with a valid clause.  by adding our own syntaxes, as opposed  
to using syntaxes that always map directly to SQL, we just dont know  
(since im not a language syntax expert).

 Having said all that, it may be simpler to use join_to/join_via in  
 some cases and maybe that's a good reason to keep those functions  
 around. However, I think this new syntax would still be very  
 valuable in many cases.

Being able to say user.c.orders==someorder is not a big deal.   
user.c.orders.items compiling into a Join (or into the join clause ?   
im not sure) im still not sure how that would be used (i.e. do you  
stick it in the WHERE criterion or in the from_obj)?  The syntax I  
like best is the one provided by the selectresults extension, where  
you can just say (well, it has just join_to right now, but this is  
the idea):

query.select().join_via('orders', 'items').select 
(Item.c.item_name=='item #4')

this is also very clear if you want to outerjoin:

query.select().outerjoin_via('orders', 'items').select()

its quick and its explicit.  users.c.orders.items seems more ambiguous.

ive been concerned that SelectResults is not easy enough to use (i.e.  
you need a mapper extension, or a separate SelectResults  
construction).im thinking it should be built-in to Query somehow,  
i.e. just say query.selectresults() and start building (but even  
that's too verbose).   Im still wondering if Query should have just  
been built that way from the start - using a strictly transformative  
interface instead of keyword-args-based.





--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-10 Thread Michael Bayer


On Nov 10, 2006, at 12:13 AM, Daniel Miller wrote:

 This query:

 q.select(or_(
 c.orders.items.item_name == 'item#4',
 c.orders.items.item_name == 'item #5'
 ))


 Should generate something similar to this SQL:

 SELECT ...
 FROM users u
 INNER JOIN orders o
 ON u.user_id = o.user_id
 INNER JOIN items i
 ON o.order_id = i.order_id
 WHERE i.item_name = 'item#4'
OR i.item_name = 'item #5'


 For the given clause, each occurrence of User.c.orders.items is  
 combined into a single join path from users to items, and every  
 occurrence of 'item_name' uses the same table alias 'i'. Is that  
 too simplistic? Here are a few more examples:


its kind of what SQL construction does now, i.e. extracts unique  
FROM objects from the where criterion and adds them to a list, so  
the approach would probably work (might possibly have other issues we  
arent thinking of just yet)


 Notice how the join path for Users.c.orders is superimposed on the  
 join path for Users.c.orders.items. It should be possible to use  
 the alias function to force separate joins like this:

 q.select(and_(
 c.orders.alias('ox').order_total  50,
 or_(
 c.orders.alias('o').items.item_name == 'item #4',
 c.orders.alias('o').items.item_name == 'item #5',
 )
 ))
 

 ASIDE: One thing we might want to change is the 'alias' function.  
 Since the 'orders' table may have a column named 'alias', it might  
 be better to make it a stand-alone function like this:

 alias(c.orders, 'ox')


i think this is the problem when you start munging namespaces  
together, as opposed to grouping them under explicit prefixes (i.e.  
'c').   so far I think this new syntax is going to be confusing to  
people since its not explicitly clear what it means underneath (which  
leaves people to guess, get it wrong, and then report it as bugs/ 
confusion), and also presents a completely different way to create  
joins when we already have more explicit ways to do it (inlcuding the  
selectresults transformative style which i think is very fast and  
clear).   when you start throwing aliases into the mix i think it  
becomes more overwhelming.

compare to the current method:

ox = ordertable.alias('ox')
q.select(and_(ox.c.order_total  50, or_(itemtable.c.item_name=='item  
#4', itemtable.c.item_name=='item #5')), from_obj=[users.join 
(ordertable).join(itemstable).join(ox, users.c.user_id==ox.c.user_id)])

more explicit...but I think clearer.  the user definitely knows what  
they are doing when they do this (note that from_obj hasnt really  
worked in query() up until version 0.3.0 which is why it looks less  
familiar, but i think also aids in creating more complex queries).  i  
think people should be expressing exactly the joins they want if they  
want to be joining at all.

I think your original proposal, i.e. User.c.orders == someorder is  
reasonable since its hard to misunderstand what that does.





--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-09 Thread Michael Bayer

but what happens if i say:

q.select(or_(User.c.orders.items.item_name == 'item#4',  
User.c.orders.items.item_name == 'item #5'))

if we naively convert c.orders.items.item_name=='item #4' into  
user.user_id=orders.user_id and orders.order_id=items.order_id and  
items.item_name='item #4, then the other clause for item #5  
generates into the same thing and you get an inefficient query.  i  
wonder also if some expressions above dont work correctly if the join  
conditions are repeated like that.

its still better to say:

q.select(or_(Item.c.item_name == 'item#4', Item.c.item_name == 'item  
#5'), from_obj=[c.orders.items])

isnt it ?  (User.c.orders.items would be a synonym for query.join_via 
('orders', 'items'))


On Nov 9, 2006, at 12:38 AM, Michael Bayer wrote:



 On Nov 8, 2006, at 10:00 PM, Daniel Miller wrote:
 q = session.query(User)
 c = getcols(User)

 q.select(
 (c.addresses.street == 'some address') 
 (c.orders.items.item_name == 'item #4')
 )


 ohh, wow.  heh.   i had this discomfort with adding relationships
 to c, but then you just wrapped up the whole mess of join_to/
 join_via into one consistent syntax there, didnt you.  youll notice
 in the docs for join_to/join_via theyre marked with alpha
 API (meaning I have been antsy with them anyway)so this syntax
 is compelling.

 you have to pipe in more often Dan !







 


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-09 Thread Daniel Miller


Michael Bayer wrote:
 but what happens if i say:
 
 q.select(or_(User.c.orders.items.item_name == 'item#4',  
 User.c.orders.items.item_name == 'item #5'))
 
 if we naively convert c.orders.items.item_name=='item #4' into  
 user.user_id=orders.user_id and orders.order_id=items.order_id and  
 items.item_name='item #4, then the other clause for item #5  
 generates into the same thing and you get an inefficient query.  i  
 wonder also if some expressions above dont work correctly if the join  
 conditions are repeated like that.

Can you give an example of the SQL (including the joins) that would be 
generated by your statement above?

 
 its still better to say:
 
   q.select(or_(Item.c.item_name == 'item#4', Item.c.item_name == 'item  
 #5'), from_obj=[c.orders.items])
 
 isnt it ?  (User.c.orders.items would be a synonym for query.join_via 
 ('orders', 'items'))
 

Right. It should be possible (although I'm not sure how simple) to combine 
conditions using the rules for combining logical expressions (i.e. commutative, 
transitive, etc.). For example:

(A == B AND C == D) OR (A == B AND C == E)

can be reduced to

(A == B) AND (C == D OR C == E)

So what we need is a way to take a group of expressions and reduce them to 
their simplest form before generating the SQL. However, don't most main-stream 
databases do this kind of optimization anyway? MySQL does 
(http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html).

Having said all that, it may be simpler to use join_to/join_via in some cases 
and maybe that's a good reason to keep those functions around. However, I think 
this new syntax would still be very valuable in many cases.

FWIW, an equivalent but slightly more concise version of your query above would 
be this:

q.select(user.c.orders.items.item_name.in_(item#4, item #5))


~ Daniel

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-08 Thread Daniel Miller


Michael Bayer wrote:
 ok let me rephrase that... i have concerns.  i think the concerns could
 be addressed, and we might be able to add this kind of feature...but i
 dont want to rush into it singlehandedly. 

I won't blame you for that. I'll help out as much as possible given my busy 
schedule.

 heres the things that should
 be resolved:
 
 - we add non-column properties to SomeClass.c.  people will get
 confused that SomeClass.c has a bunch of stuff on it that mytable.c
 does not.  do we change the name of 'c' ?

Part of me say adding extra columns to SomeClass.c wouldn't be so confusing, 
and would actually be intuitive. For example in the original question that 
started this thread it seemed logical to me that 'order' should have been in 
there since it was a mapper property. Sure maybe it's a bit more than meets the 
eye at first, but I think it's very logical to have mapper properties in 
SomeClass.c but not have them in table.c. As long as it's well documented I 
don't think it would be a problem.

The only other thing I can think of is to create a function that would return a 
cols object. In the spirit of python's getattr() function:

from sqlalchemy.orm import getcols

c = getcols(SomeClass)
session.query(SomeClass).select(c.order == order)

I often find myself assigning SomeClass.c to a local variable when I'm building 
a query anyway. It prevents excess verbosity.

 
 - people will say things like, SomeClass.c.order  order.  how do we
 interpret that ?  esp. for composite foreign keys.

I assume you're implicitly asking about the choice to use AND or OR to combine 
multiple comparisons of a composite key. For example (assume Order and 
SomeClass are related by user_id and quote_id):

# given this
order = Order(user_id=4, quote_id=5)
session.query(SomeClass).select(SomeClass.c.order  order)

-- Do we generate this?
WHERE some_class.user_id  4 AND some_class.quote_id  5

-- or this?
WHERE some_class.user_id  4 OR some_class.quote_id  5


I don't think the , =, , and = operators would be very useful on most 
relationship comparisons; especially those with composite keys. However, the == 
and != operators are obviously useful in many cases. It's like comparing 
user-defined Python objects without explicit __gt__ and __lt__ implementations. 
Python 2.x compares the id() of the objects, while Python 3000 will raise a 
TypeError (I think that's the right error). The reasoning behind this 
decision[0] is that unless there is an explicit ordering for a given pair of 
objects, it doesn't make sense to compare them using the inequality operators.

At any rate, I would recommend not implementing those comparison operations for 
relationships by default (at least not for multi-column relationships). If 
someone really needed them maybe they could create their own custom 
relationship subclass?

[0] http://mail.python.org/pipermail/python-dev/2005-November/057925.html

 
 - is it really as simple as just producing the primary/foreign key
 columns of both sides ?  i fear that a whole class of patterns are
 going to become apparent after adding that feature, and then people are
 going to complain about SA being broken until an entire HQL-like
 layer is added to implement all those features.  maybe theyre going to
 stick SomeClass.c.order into the order_by clause, theyre going to stick
 it in func() calls, CAST calls, etc. and expecting all this magical
 behavior.  pretty much every other feature of SA proved to be a
 gargantuan undertaking compared to how easy i thought it would be when
 i first wrote it :)how would this feature be presented ?

Could we implement it as simply as possible for now and see how it's used? As 
new use cases come up we can either extend the implementation or explain why 
they are not supported (i.e. too complicated, inconsistent with other parts of 
SA, etc.).

 
 - is this feature also going to figure out all the joins from class A
 to B, if my query is based on class A and the criterion object is all
 the way on B?  the way select_by, join_to works ?  that seems trickier
 and i can see people expecting that behaivor as well. 

This is something I thought about immediately when I first imagined the idea. 
Personally, I don't really like the automagical join behavior because it's not 
explicit and can lead to obscure bugs that are hard to track down (why is this 
join suddenly returning the wrong results? ...hours later we discover that a 
new relationship was added that caused SA to pick a different relationship path 
for the join). The problem is that a seemingly unrelated (no pun intended) 
change can cause these magic joins to be interpreted differently. I tried to 
use that behavior once myself but quickly discarded it when SA used some other 
relationship path than the one I intended to use. In general I think it's best 
to require explicit relationship paths at all times and give useful errors when 
the given path can't be determined using those strict rules.

 if we 

[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-08 Thread Michael Bayer


On Nov 8, 2006, at 10:00 PM, Daniel Miller wrote:
 q = session.query(User)
 c = getcols(User)

 q.select(
 (c.addresses.street == 'some address') 
 (c.orders.items.item_name == 'item #4')
 )


ohh, wow.  heh.   i had this discomfort with adding relationships  
to c, but then you just wrapped up the whole mess of join_to/ 
join_via into one consistent syntax there, didnt you.  youll notice  
in the docs for join_to/join_via theyre marked with alpha  
API (meaning I have been antsy with them anyway)so this syntax  
is compelling.

you have to pipe in more often Dan !







--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-06 Thread Randall Smith

dmiller wrote:
 Is there a reason why this doesn't work:
 
 orders = Table('orders', meta,
  Column('id', Integer, Sequence('order_id_seq'), primary_key=True),
  ...
 
 )
 items = Table('items', meta,
  Column('id', Integer, Sequence('item_id_seq'), primary_key=True),
  Column('order_id', Integer, ForeignKey(orders.c.id),  
 nullable=False),
  ...
 )
 
 class Order(object): pass
 class Item(object): pass
 
 itemMapper = mapper(Item, items)
 orderMapper = mapper(Order, orders, properties=dict(
  items=relation(itemMapper, backref=order)
 ))
 
 session = create_session()
 order = session.query(Order).get(1) # assume order exists
 itemsNotInOrder = session.query(Item).select(Item.c.order != order) #  
 ERROR!

This should work.
itemsNotInOrder = session.query(Item).select(Item.c.order_id != order.id)

 
 
 The Item.c object does not have an 'order' attribute. Is there a  
 reason why it can't have one?

I would guess that attributes of Item.c are Column instances, which 
order is not.  Your approach seems intuitive (I did the same thing 
once), but the above example I think is easy enough.

I could be wrong or missing something.  Just trying to be helpful.

Randall

 
 ~ Daniel
 
 


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-06 Thread dmiller


On Nov 6, 2006, at 3:31 PM, Randall Smith wrote:


 dmiller wrote:
 Is there a reason why this doesn't work:

...

 order = session.query(Order).get(1) # assume order exists
 itemsNotInOrder = session.query(Item).select(Item.c.order != order) #
 ERROR!

 This should work.
 itemsNotInOrder = session.query(Item).select(Item.c.order_id !=  
 order.id)

Thanks for the response Randall. I know that works...read on below.




 The Item.c object does not have an 'order' attribute. Is there a
 reason why it can't have one?

 I would guess that attributes of Item.c are Column instances, which
 order is not.  Your approach seems intuitive (I did the same thing
 once), but the above example I think is easy enough.

 I could be wrong or missing something.  Just trying to be helpful.

What I'm getting at is that SA has all the necessary details to  
create the WHERE clause automatically (it already does something very  
similar when constructing joins). My case is a simple case with a  
single-column foreign key (it's even a pretty stupid case at that :).  
However, it demonstrates a powerful query construction concept that  
seems to be lacking from SA. Ideally it should also work with a multi- 
column foreign key, which is much more tedious to do manually (as you  
suggested) because it requires a separate condition for each column  
in the key. It seams like this would be fairly simple to add, maybe  
I'll give it a try if I can find the time.

~ Daniel

--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Constructing where-clauses dynamically

2006-11-06 Thread Michael Bayer

the c namespace is only for table columns.  if we wanted a namespace
that had represented mapper properties, which is not just column-mapped
attributes but relationships as well, that would be something else
altogether.  we'd have to define a whole package of expression objects
that represent class attributes instead of columns.  its not
necessarily a bad idea but its a big can of worms (this notion came up
many months ago as well)...it would sort of be like porting HQL, more
or less.


--~--~-~--~~~---~--~~
 You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---