[sqlalchemy] Re: Returning an attribute from an object when using query_property()?

2010-08-02 Thread SQLAlchemy User
  I have the following code:

  class EventQuery(Query):
    def histogram(self):
      ''' return a histogram of source / count(source) given an eventquery''
      ...
  ...

  class Event(Base):
    __tablename__ = 'events'
    query = Session.query_property(query_cls = EventQuery)
    name = Column(Unicode(128))
    source = Column(BigInteger)
  ...

  Now, Event has 54 additional attributes, some of which are quite large. In 
  EventQuery.histogram(), all I really need is the Event.source attribute. 
  'events' has over 150 million rows, and it's likely that 
  EventQuery.histogram() will be called for all events.

  So - is there any way to 'modify' the query (self) in histogram() so that 
  it only returns (Event.source, func.count(Event.source)), or, for the 
  general case, to modify the query to return an attribute of the object? 
  I've tried from_self, but that seems highly inefficient (at least 20x 
  slower than Session.query(...) ). I'd also really like to keep histogram() 
  as a method of EventQuery.

  I can see reasons why this would be impossible, but perhaps there's a way 
  to do it.

 you can use values(Event.source, func.count(...)), or just put a method on 
 Event called histogram that uses the standard 
 object_session(self).query(...) approach, I'm not really sure what the using 
 the old query_property accessor gets you here


Thanks, Michael. Two followup questions:

1) The histogram() method is designed to provide statistics on a set
of Events (defined by a query), so it made sense (to me) to make it a
method of EventQuery. I don't know how I'd do this as a method of
Event. Could you clarify?

2) When you say 'old query property' accessor, are you implying that
query_property is deprecated? Is there an alternative? I really like
the idea of subclassing Query and adding custom methods. Is this a
misguided approach?

BTW: the values() generator takes at least as long as the from_self()
does, and uses a huge amount of memory since all().values() still
passes the entire object list to python. from_self() took 1920 seconds
to run against the entire 'events' table, where
Session.query(func.distinct(Event.source)).all() took 0.02 seconds.
I'll try experimenting with joins next.

Thanks,

S.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



Re: [sqlalchemy] Re: Returning an attribute from an object when using query_property()?

2010-08-02 Thread Michael Bayer

On Aug 2, 2010, at 1:58 PM, SQLAlchemy User wrote:

 I ha
 Thanks, Michael. Two followup questions:
 
 1) The histogram() method is designed to provide statistics on a set
 of Events (defined by a query), so it made sense (to me) to make it a
 method of EventQuery. I don't know how I'd do this as a method of
 Event. Could you clarify?

Well if you need to say Event.generate query.generate query.histogram(), 
then yeah I see what you are going for, though you might say 
Event.histogram.generate query.generate query, that's probably not as 
appealing.

 
 2) When you say 'old query property' accessor, are you implying that
 query_property is deprecated? Is there an alternative? I really like
 the idea of subclassing Query and adding custom methods. Is this a
 misguided approach?

subclassing Query and using query_property aren't necessarily connected things. 
  The idea of MyClass.query(foo) isn't deprecated, though it is kind of an old 
pattern back when we were trying to hide the Session as much as possible (or 
when there wasn't really even a Session).   I tend towards a more coarse 
grained interface, I guess, i.e. I'd just have a coarse grained 
Event.histogram(x, y, z) type of method that works only for the necessary use 
cases.  I don't have a strong case for that approach (I have a weak one which 
is, hide complexity, sort of).


 BTW: the values() generator takes at least as long as the from_self()
 does, and uses a huge amount of memory since all().values() still
 passes the entire object list to python. from_self() took 1920 seconds
 to run against the entire 'events' table, where
 Session.query(func.distinct(Event.source)).all() took 0.02 seconds.
 I'll try experimenting with joins next.

values() should be using the least amount of memory of any approach - it only 
buffers 10 rows in memory at a time, and doesn't deal with session or object 
accounting at all so is also much more performant than all().  I dont know what 
all().values() is or if that's a typo.
Session.query(Event).values(func.distinct(Event.source)) should emit a brief 
SELECT DISTINCT query and whip through a single column for the result.


-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.



[sqlalchemy] Re: Returning an attribute from an object when using query_property()?

2010-08-02 Thread SQLAlchemy User


On Aug 2, 11:34 am, Michael Bayer mike...@zzzcomputing.com wrote:

 values() should be using the least amount of memory of any approach - it only 
 buffers 10 rows in memory at a time, and doesn't deal with session or object 
 accounting at all so is also much more performant than all().  I dont know 
 what all().values() is or if that's a typo.    
 Session.query(Event).values(func.distinct(Event.source)) should emit a brief 
 SELECT DISTINCT query and whip through a single column for the result.

It wasn't a typo; it was silliness on my part. The iterator approach
is blazingly fast. Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.