Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
On Thu, May 2, 2013 at 3:34 PM, Claudio Freire klaussfre...@gmail.com wrote:
 Without the C extension:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 20811734   27.829    0.000   27.855    0.000 attributes.py:171(__get__)
  7631984   13.532    0.000   31.851    0.000 ruby.py:86(get_param)

 With the C extension:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  7631984   19.514    0.000   21.051    0.000 ruby.py:86(get_param)

 Notice how the C extension saves a total of 10s (cumtime, sum of internal
 and external time).

 There's no DB access when hitting those arguments, as everything has been
 eagerly loaded. It's all function call overhead.

 Assuming an application makes heavy use of attributes, as get_param does
 (expectable of straightforward code I'd think), that's a 30% speedup of
 CPU-bound code.

 As soon as I get GC right I'll post the patch.


So... I got GC right (I think). I had to remove a few lines from
profiles.txt because, obviously, there are far fewer function calls
now.

There's a second patch that adds __slots__ to instance state. I found
it speeds things up marginally but consistently (State.__init__ was
another function weighing a lot because of millions of calls, and this
is the only way I found to speed it up).

I'll get around to the Py3 things now.

PS: Sorry I based it on 0.7.10... my app runs on that... I imagine I
could upgrade to 0.8 with little effort, but never got around to
actually doing it.

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
Stupid me... forgot to attach them.





SQLAlchemy-0.7.10-cinstrumented.patch
Description: Binary data


SQLAlchemy-0.7.10-slotstate.patch
Description: Binary data


Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Michael Bayer
That's a lot of effort there. How confident are you that memory and references 
are handled correctly in the .c code? That's a lot of C code, and it took 
years for us to iron out all the memory leaks in the existing C extensions that 
we had - the original author eventually stopped maintaining them, and I had to 
take it all on myself and spend weeks learning the code and ironing out 
remaining, subtle issues (like http://hg.sqlalchemy.org/sqlalchemy/rev/8326 and 
http://hg.sqlalchemy.org/sqlalchemy/rev/8140). These are very insidious 
issues as they can't be diagnosed by the usual gc reference counting.








Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
On Mon, May 6, 2013 at 1:50 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 that's a lot of effort there.  How confident are you that memory and 
 references are handled correctly in the .c code?

Quite. It's not my first C extension. But, truly, C is complex.

 That's a lot of C code, and it took years for us to iron out all the memory 
 leaks in the existing C extensions that we had - the original author 
 eventually stopped maintaining them, and I had to take it all on myself and 
 spend weeks learning the code and ironing out remaining, subtle issues (like 
 http://hg.sqlalchemy.org/sqlalchemy/rev/8326 and 
 http://hg.sqlalchemy.org/sqlalchemy/rev/8140).   These are very insidious 
 issues as they can't be diagnosed by usual gc reference counting.


There's an answer to those problems that I hesitated to propose, but
you might want to consider: Pyrex. Or Cython. Take your pick. They
*generate* C code, so it'd be rather simple to replace the C
extensions with them, and they look a lot more like python, and are a
lot more fool-proof. Really, Pyrex is made for this kind of work. It's
begging you.

It's only the cost of an extra dependency (and the learning curve,
which is there, but far flatter than C's).





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Michael Bayer
did you generate your code here with pyrex? If you want to jump in and 
rework our C extensions to be pyrex based and everything works out just as well 
or better than before, it'll be a great 0.9/1.0 feature. I've got a bit of 
experience with cython already as I've worked on lxml a bit. cython vs. pyrex, 
any thoughts? based on 
http://docs.cython.org/src/userguide/pyrex_differences.html they seem pretty 
similar (though cython seems more commonplace...)











Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
On Mon, May 6, 2013 at 2:31 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 did you generate your code here with pyrex? If you want to jump in and 
 rework our C extensions to be pyrex based and everything works out just as 
 well or better than before, it'll be a great 0.9/1.0 feature. I've got a 
 bit of experience with cython already as I've worked on lxml a bit. cython 
 vs. pyrex, any thoughts? based on 
 http://docs.cython.org/src/userguide/pyrex_differences.html they seem pretty 
 similar (though cython seems more commonplace...)

Cython makes a lot more progress, but that's also its drawback at times.
I've stuck with Pyrex when I don't need Cython's benefits, because
Pyrex is far more stable and easier to depend on.

For this kind of work, I'd suggest pyrex. But really both work.

I might try that, after checking Pyrex's compatibility with Py3...
I've never done that.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
On Mon, May 6, 2013 at 2:31 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 did you generate your code here with pyrex?


Oh, sorry, I didn't answer this.

No. I wrote it by hand.

Pyrex-generated code is inscrutable, not that there's any need to
inscrute. But really, when using pyrex, the C file ought to be
considered merely as an intermediate file. The sources are the .pyx
files.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Michael Bayer
Here's a ticket where we can keep talking about this:

http://www.sqlalchemy.org/trac/ticket/2720

(here's the py3k ticket also: http://www.sqlalchemy.org/trac/ticket/2161)

note that SQLAlchemy 0.9 will no longer use 2to3, and will be Python 2.6-3.3 
in place. The enhancement here is targeted at 0.9, which is currently in the 
rel_0_9 branch.

as for the __slots__ thing, that's a separate issue. if your patch doesn't 
break tests we can set that for 0.9 as well, I doubt anyone is subclassing 
InstanceState, though I'd want to see what the speedup is with that.



On May 6, 2013, at 1:37 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Mon, May 6, 2013 at 2:20 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Mon, May 6, 2013 at 1:50 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 that's a lot of effort there.  How confident are you that memory and 
 references are handled correctly in the .c code?
 
 Quite. It's not my first C extension. But, truly, C is complex.
 
 
 And as I write this... I find an... issue (not a leak, but not good
 reference management either). Sorry.
 
 Feel free to gauge cost-benefit here. I'll think about pyrex too.
 
 Keep in mind that I'll be using SQLAlchemy for years to come (or
 expect to). I don't think I shall withdraw support in the short term,
 but an oracle I am not.
 
 
 SQLAlchemy-0.7.10-cinstrumented.patch





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-06 Thread Claudio Freire
On Mon, May 6, 2013 at 4:27 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 as for the __slots__ thing, that's a separate issue. if your patch doesn't 
 break tests we can set that for 0.9 as well, I doubt anyone is subclassing 
 InstanceState, though I'd want to see what the speedup is with that.

About 15% on state creation (which can easily be a big chunk of any
bulk ORM operations):

>>> a = b = c = None  # assumed setup; the session as posted does not show it
>>> class InstanceState(object):
...     __slots__ = ('a', 'b', 'c', '__dict__', '__weakref__')
...     def __init__(self):
...         self.a = a
...         self.b = b
...         self.c = c
...
>>> class InstanceStateSlow(object):
...     def __init__(self):
...         self.a = a
...         self.b = b
...         self.c = c
...
>>> def test(which):
...     for i in xrange(10):
...         x = which()
...
>>> import timeit
>>> timeit.timeit(lambda: test(InstanceStateSlow))
8.486893892288208
>>> timeit.timeit(lambda: test(InstanceState))
7.35853814697
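
For anyone who wants to repeat the comparison on a current interpreter, here is a minimal Python 3 sketch of the same measurement. The class names SlottedState and PlainState, the attribute values, and the iteration counts are made up for illustration; these are not SQLAlchemy's InstanceState, and the timings will vary by machine and Python version.

```python
import timeit


class SlottedState(object):
    # Hypothetical stand-in for a state object that declares __slots__.
    # '__dict__' and '__weakref__' are kept so arbitrary attributes and
    # weak references still work, as in the session above.
    __slots__ = ('a', 'b', 'c', '__dict__', '__weakref__')

    def __init__(self):
        self.a = None
        self.b = None
        self.c = None


class PlainState(object):
    # Same attributes, but stored in the per-instance __dict__.
    def __init__(self):
        self.a = None
        self.b = None
        self.c = None


def create_many(cls, count=10):
    # Create `count` instances and throw them away.
    for _ in range(count):
        cls()


def compare(number=100000):
    # Returns (plain_seconds, slotted_seconds) for `number` runs each.
    plain = timeit.timeit(lambda: create_many(PlainState), number=number)
    slotted = timeit.timeit(lambda: create_many(SlottedState), number=number)
    return plain, slotted


if __name__ == "__main__":
    plain, slotted = compare()
    print("plain:   %.3fs" % plain)
    print("slotted: %.3fs" % slotted)
```

Because '__dict__' stays in the slot list, only the three named attributes skip the dictionary; anything else set on the instance still lands in __dict__.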





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-05-02 Thread Claudio Freire
On Fri, Apr 26, 2013 at 8:59 PM, Michael Bayer mike...@zzzcomputing.com
wrote:
 All attributes have to be expire-able and act as proxies for a
database connection so I'm not really sure where to go with that. I'm
not too thrilled about proposals to build in various alternate
performance behaviors as the library starts to try to act in many
different ways that the vast majority of users aren't even aware of, it
increases complexity internally, produces vast amounts of new use cases to
test and maintain, etc. I'm always willing to look at patches that are
all winning, of course, so if you have some way to speed things up without
breaking usage contracts and without major new complexity/brittleness I'd
love to look at a pull request.

 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...


 In fact... I'm gonna try that...

 feel free!  though you might be surprised, a C function that just calls
out to all the same Python operations anyway is often only negligibly
faster, not enough to make the extra complexity worth it.


Ok, I got around to profiling this. The C extension saves 20s out of 800s.
As noted before, most of those 800s are SA-unrelated application logic, and
the app has been heavily optimized to avoid attribute access, so the speedup
for a typical application would most likely be far greater.

As expected, the getter disappears from profiles (its time is now attributed
to the calling function). I've got a function that makes some unavoidable
accesses to instrumented attributes. It's get_param, and it's called around
7.6M times, so while the per-call overhead is small, it does add up. I made
sure expiration still works, btw.

Without the C extension:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 20811734   27.829    0.000   27.855    0.000 attributes.py:171(__get__)
  7631984   13.532    0.000   31.851    0.000 ruby.py:86(get_param)

With the C extension:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  7631984   19.514    0.000   21.051    0.000 ruby.py:86(get_param)

Notice how the C extension saves a total of 10s (cumtime, sum of internal
and external time).

There's no DB access when hitting those attributes, as everything has been
eagerly loaded. It's all function call overhead.

Assuming an application makes heavy use of attributes, as get_param does
(expectable of straightforward code I'd think), that's a 30% speedup of
CPU-bound code.
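
For reference, the arithmetic behind those figures can be checked with a few lines of Python. The numbers are copied from the get_param rows of the two profiles above; the variable names are only for illustration. This covers get_param's cumulative time alone, not the whole application.

```python
# Cumulative time (seconds) of get_param in each profile, and its call count.
cumtime_without_ext = 31.851
cumtime_with_ext = 21.051
calls = 7631984

# Absolute saving, relative saving, and saving per call in microseconds.
saved = cumtime_without_ext - cumtime_with_ext
fraction_saved = saved / cumtime_without_ext
per_call_us = saved / calls * 1e6

print("saved:     %.1f s" % saved)
print("reduction: %.0f%%" % (fraction_saved * 100))
print("per call:  %.2f us" % per_call_us)
```

That comes to roughly 10.8 s, or about a third of get_param's cumulative time, which is in line with the "10s" and "30%" quoted above.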

As soon as I get GC right I'll post the patch.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-30 Thread Claudio Freire
On Fri, Apr 26, 2013 at 9:09 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Fri, Apr 26, 2013 at 9:01 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that. I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways 
 that the vast majority of users aren't even aware of, it increases 
 complexity internally, produces vast amounts of new use cases to test 
 and maintain, etc. I'm always willing to look at patches that are all 
 winning, of course, so if you have some way to speed things up without 
 breaking usage contracts and without major new complexity/brittleness 
 I'd love to look at a pull request.

 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...


 In fact... I'm gonna try that...

 feel free!  though you might be surprised, a C function that just calls out 
 to all the same Python operations anyway is often only negligibly faster, 
 not enough to make the extra complexity worth it.

 also if you're looking to help with C, I'd love to get the C extensions out 
 in the Py3K version, we have a patch that's fallen out of date at 
 http://www.sqlalchemy.org/trac/ticket/2161 that needs freshening up and 
 testing.

 Will look into that. The point of the C function is to be able to
 quickly bypass all that _supports_population and function call
 overheads. The getter is dead-simple, so its cost is dominated by
 CPython function call overheads, that are readily removable by
 re-implementing in C. It can reliably and quickly detect when
 instance_dict returns __dict__, too.

Alright, I've got a POC C extension working (gotta profile it yet),
although SQLAlchemy's weird injection of instance_dict forced me into
some ugly hacks:


class InstrumentedAttribute(QueryableAttribute):
    """Class bound instrumented attribute which adds descriptor methods."""

    def __set__(self, instance, value):
        self.impl.set(instance_state(instance),
                      instance_dict(instance), value, None)

    def __delete__(self, instance):
        self.impl.delete(instance_state(instance), instance_dict(instance))

    try:
        from sqlalchemy.cinstrumented import InstrumentedGetter
        __get__ = InstrumentedGetter(globals())
        __get__.__name__ = '__get__'
        del InstrumentedGetter
    except ImportError:
        def __get__(self, instance, owner):
            if instance is None:
                return self

            dict_ = instance_dict(instance)
            if self._supports_population and self.key in dict_:
                return dict_[self.key]
            else:
                return self.impl.get(instance_state(instance), dict_)

Thing is, doing the whole class in C makes no sense for set and
delete, but it also complicates linking its instance_dict and
instance_state to SA.attribute's.

This way looks ugly, but it reacts immediately to changing those
globals, so it does seem like the better option.

Opinions (while I profile)?





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-30 Thread Michael Bayer


I'd want to see the whole thing, like what's up with that globals() call, etc. 
The instance_dict is shuttled around everywhere so that we aren't constantly 
pulling it from the given object; we have a system in place whereby the fact 
that instance_dict is object.__dict__ is not necessarily a given, and you can 
actually use some other system of getting at __dict__. It saved us on a huge 
number of function calls at some point to expand it out like that, as 
inconvenient as it is.






Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-30 Thread Claudio Freire
On Tue, Apr 30, 2013 at 8:06 PM, Michael Bayer mike...@zzzcomputing.com wrote:
    try:
        from sqlalchemy.cinstrumented import InstrumentedGetter
        __get__ = InstrumentedGetter(globals())
        __get__.__name__ = '__get__'
        del InstrumentedGetter
    except ImportError:
        def __get__(self, instance, owner):
            if instance is None:
                return self

            dict_ = instance_dict(instance)
            if self._supports_population and self.key in dict_:
                return dict_[self.key]
            else:
                return self.impl.get(instance_state(instance), dict_)

 Thing is, doing the whole class in C makes no sense for set and
 delete, but it also complicates linking its instance_dict and
 instance_state to SA.attribute's.

 This way looks ugly, but it reacts immediately to changing those
 globals, so it does seem like the better option.

 Opinions (while I profile)?

 I'd want to see the whole thing, like what's up with that globals() call, 
 etc.   The instance_dict is shuttled around everywhere so that we aren't 
 constantly pulling it from the given object; we have a system in place 
 whereby the fact that instance_dict is object.__dict__ is not necessarily a 
 given, and you can actually use some other system of getting at __dict__.   
 It saved us on a huge number of function calls at some point to expand it out 
 like that, as inconvenient as it is.

I realize that. That globals call is exactly for that.

Well, without going into the C code, the python code does:

dict_ = instance_dict(instance)

That under the hood means:

dict_ = globals()['instance_dict'](instance)

Well, globals there is bound syntactically in python code, but C code
has no globals, and injecting instance_dict into cinstrumented's
module dict sounded like extra complexity for no reason.

Importing attributes from cinstrumented is also no good, since at the
point the InstrumentedGetter is constructed, attributes isn't in
sys.modules yet, and doing the import on invocation would have meant
extra overhead.

So, that globals call is to mimic python's syntactic binding to the
module's global dict, at import time, and be able to query the dict
and find instance_dict no matter how it's modified later.
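
A rough pure-Python picture of that late binding, in case it helps. LateBoundCaller, default_instance_dict, custom_instance_dict and Thing are made-up names for this sketch, and a plain dict stands in for the module's globals() so the snippet runs anywhere; the real getter is C and is not shown here.

```python
class LateBoundCaller(object):
    """Holds a namespace dict and resolves a name in it on every call."""

    def __init__(self, namespace, name):
        self._namespace = namespace  # the dict itself, not a copy
        self._name = name

    def __call__(self, *args):
        # Looked up at call time, so a later rebinding is picked up.
        return self._namespace[self._name](*args)


def default_instance_dict(obj):
    return obj.__dict__


def custom_instance_dict(obj):
    # Pretend an extension swapped in its own way of finding the dict.
    return {'replaced': True}


# Stand-in for the module globals that would be passed as globals().
module_globals = {'instance_dict': default_instance_dict}
getter = LateBoundCaller(module_globals, 'instance_dict')


class Thing(object):
    pass


thing = Thing()
thing.x = 1

before = getter(thing)
module_globals['instance_dict'] = custom_instance_dict
after = getter(thing)

print(before)
print(after)
```

Holding the dict rather than the function is what lets the second call see the replacement without rebuilding the getter.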

Afterward, the getter works more or less like this (in C):

def __get__(self, instance, owner):
    if instance is None:
        return self
    if self.cached_instance_dict is not None \
       and self.cached_instance_dict is instance_dict \
       and self.cached_supports_population \
       and hasattr(instance, '__dict__'):
        return instance.__dict__[self.key]
    else:
        self.cached_supports_population = self._supports_population
        self.cached_instance_dict = None
        dict_ = instance_dict(instance)
        if dict_ is instance.__dict__:
            self.cached_instance_dict = instance_dict
        return self.impl.get(instance_state(instance), dict_)

Well, in spite of being more complicated, those self.cache_blah things
are really fast since they just compare pointers in C, and, more
importantly, entity.column will invoke this code from CPython's eval
loop (C) directly to the descriptor's getter (C), in no way incurring
python's frame allocation overhead.
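
To get a feel for the overhead being discussed, here is a small sketch that times attribute reads through a Python-level descriptor against plain attribute reads. PyGetter, Instrumented and Plain are hypothetical, heavily simplified classes, not SQLAlchemy's; the absolute numbers depend on the interpreter, and the lambda adds the same call cost to both sides.

```python
import timeit


class PyGetter(object):
    """Data descriptor whose __get__ is written in Python."""

    def __init__(self, key):
        self.key = key

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        instance.__dict__[self.key] = value


class Instrumented(object):
    # Every read of .value goes through PyGetter.__get__.
    value = PyGetter('value')


class Plain(object):
    # Reads of .value come straight from the instance __dict__.
    pass


def measure(number=200000):
    a = Instrumented()
    a.value = 1
    b = Plain()
    b.value = 1
    with_descriptor = timeit.timeit(lambda: a.value, number=number)
    plain = timeit.timeit(lambda: b.value, number=number)
    return with_descriptor, plain


if __name__ == "__main__":
    with_descriptor, plain = measure()
    print("descriptor: %.3fs" % with_descriptor)
    print("plain:      %.3fs" % plain)
```

The gap between the two timings is roughly the cost of one Python function call per read, which is the part a C-level getter is meant to avoid.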

I'm attaching the C module in case it clarifies.

I'm not entirely sure about the garbage collection part yet... so it's
not final.



/*
instrumented.c
Copyright (C) 2013 Claudio Freire klaussfre...@gmail.com

This module is part of SQLAlchemy and is released under
the MIT License: http://www.opensource.org/licenses/mit-license.php
*/

#include <Python.h>

#if PY_VERSION_HEX < 0x02050000 && !defined(PY_SSIZE_T_MIN)
typedef int Py_ssize_t;
#define PY_SSIZE_T_MAX INT_MAX
#define PY_SSIZE_T_MIN INT_MIN
typedef Py_ssize_t (*lenfunc)(PyObject *);
#define PyInt_FromSsize_t(x) PyInt_FromLong(x)
typedef intargfunc ssizeargfunc;
#endif

#if PY_VERSION_HEX >= 0x03000000
#define PyString_InternFromString PyUnicode_InternFromString
#endif


PyObject *get_string = NULL;
PyObject *uget_string = NULL;

/***
 * Structs *
 ***/

typedef struct {
    PyObject_HEAD

    /* Where to get instance_dict from */
    PyObject* globals;

    /* Name to which it was bound */
    PyObject* name;

    /* non-reference, just a pointer for identity comparison */
    void *cached_instance_dict;

    /* Only valid if cached_instance_dict != NULL and equal to global instance_dict */
    int cached_supports_population;
} InstrumentedGetter;

/**
 * InstrumentedGetter *
 **/


Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Mauricio de Abreu Antunes
Read the source:

def all(self):
    """Return the results represented by this ``Query`` as a list.

    This results in an execution of the underlying query.

    """
    return list(self)

This means that all() simply collects everything the query's iterator
yields into a list.
If you assign the query to a variable, you can iterate over it directly, or
call next(iter(variable)) to pull a single row.
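
To illustrate the difference between an iterable and an iterator here, a small sketch with a stand-in class. FakeQuery is hypothetical and only mimics the __iter__/all() shape quoted above; it does not touch a database:

```python
class FakeQuery(object):
    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        # Each call starts a fresh iteration over the rows.
        for row in self._rows:
            yield row

    def all(self):
        # Same shape as the quoted source: materialize the iterator.
        return list(self)


query = FakeQuery([1, 2, 3])
as_list = query.all()
first = next(iter(query))        # next(query) would raise TypeError
looped = [row for row in query]  # a for loop calls iter() implicitly
```

An object that defines only __iter__ is iterable but is not itself an iterator, which is why the explicit iter() call is needed before next().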


2013/4/26 alonn alonis...@gmail.com

 so not to load too much into memory I should do something like:

 for i in session.query(someobject).filter(idsomething)
 print i

 I'm guessing the answer is no, because of the nature of sql, but I'm not
 an expert so I'm asking.

 Thanks for the help!


-- 
*Mauricio de Abreu Antunes*
Mobile: (51)930-74-525
Skype: mauricio.abreua





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Mauricio de Abreu Antunes
The Query object has an __iter__ method.



-- 
*Mauricio de Abreu Antunes*
Mobile: (51)930-74-525
Skype: mauricio.abreua





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Werner

Hi,

On 26/04/2013 16:41, alonn wrote:

so not to load too much into memory I should do something like:

for i in session.query(someobject).filter(idsomething)
print i

I'm guessing the answer is no, because of the nature of sql, but I'm 
not an expert so I'm asking.

Yes you can; check out the docs on querying, e.g. the following if you 
use the ORM:


http://sqlalchemy.readthedocs.org/en/rel_0_8/orm/tutorial.html#querying

Werner





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 12:06 PM, Werner werner.bru...@sfr.fr wrote:
 On 26/04/2013 16:41, alonn wrote:

 so not to load too much into memory I should do something like:

 for i in session.query(someobject).filter(idsomething)
 print i

 I'm guessing the answer is no, because of the nature of sql, but I'm not
 an expert so I'm asking.

 yes you can, check out the doc for querying, e.g. the following if you use
 the ORM.

 http://sqlalchemy.readthedocs.org/en/rel_0_8/orm/tutorial.html#querying


Not entirely, if you don't use yield_per (as shown in the docs in
fact, but worth mentioning).

Seeing query:

if self._yield_per:
    fetch = cursor.fetchmany(self._yield_per)
    if not fetch:
        break
else:
    fetch = cursor.fetchall()

Not only that, but also all rows are processed and saved to a local
list, so all instances are built and populated way before you get the
first row. That is, unless you specify yield_per.
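
The difference between the two branches can be seen with the standard-library sqlite3 driver alone. This is a sketch under assumptions: the table t, the helper iter_rows and the batch size are made up for illustration, and it mirrors only the fetchmany/fetchall choice, not the ORM's instance construction:

```python
import sqlite3


def iter_rows(cursor, yield_per=None):
    # With yield_per, pull rows in batches; otherwise fetch everything first.
    while True:
        if yield_per:
            fetch = cursor.fetchmany(yield_per)
            if not fetch:
                break
        else:
            fetch = cursor.fetchall()
        for row in fetch:
            yield row
        if not yield_per:
            break


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(i,) for i in range(10)])

query = "SELECT id FROM t ORDER BY id"
batched = [row[0] for row in iter_rows(conn.execute(query), yield_per=3)]
unbatched = [row[0] for row in iter_rows(conn.execute(query))]
```

Both calls return the same rows; only the batched one keeps at most yield_per rows in the intermediate list at any time.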





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Werner

On 26/04/2013 17:07, Claudio Freire wrote:

On Fri, Apr 26, 2013 at 12:06 PM, Werner werner.bru...@sfr.fr wrote:

On 26/04/2013 16:41, alonn wrote:

so not to load too much into memory I should do something like:

for i in session.query(someobject).filter(idsomething)
 print i

I'm guessing the answer is no, because of the nature of sql, but I'm not
an expert so I'm asking.

yes you can, check out the doc for querying, e.g. the following if you use
the ORM.

http://sqlalchemy.readthedocs.org/en/rel_0_8/orm/tutorial.html#querying


Not entirely, if you don't use yield_per (as shown in the docs in
fact, but worth mentioning).

Seeing query:

if self._yield_per:
    fetch = cursor.fetchmany(self._yield_per)
    if not fetch:
        break
else:
    fetch = cursor.fetchall()

Not only that, but also all rows are processed and saved to a local
list, so all instances are built and populated way before you get the
first row. That is, unless you specify yield_per.

Oops, thanks for correcting me.
Werner





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 12:24 PM, Werner werner.bru...@sfr.fr wrote:
 On 26/04/2013 17:07, Claudio Freire wrote:

 On Fri, Apr 26, 2013 at 12:06 PM, Werner werner.bru...@sfr.fr wrote:

 http://sqlalchemy.readthedocs.org/en/rel_0_8/orm/tutorial.html#querying


 Not entirely, if you don't use yield_per (as shown in the docs in
 fact, but worth mentioning).

 Seeing query:

 if self._yield_per:
     fetch = cursor.fetchmany(self._yield_per)
     if not fetch:
         break
 else:
     fetch = cursor.fetchall()

 Not only that, but also all rows are processed and saved to a local
 list, so all instances are built and populated way before you get the
 first row. That is, unless you specify yield_per.

 Oops, thanks for correcting me.

Um... a tad OT, but looking at that code, there are lots of
opportunities for optimization.

I'll have to profile a bit and let you know.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Michael Bayer

On Apr 26, 2013, at 12:24 PM, Claudio Freire klaussfre...@gmail.com wrote:

 
 Um... a tad OT, but looking at that code, there's lots of
 opportunities for optimization.
 
 I'll have to profile a bit and let you know.

are you referring to sqlalchemy/orm/loading.py ?   I'd be pretty impressed if 
you can find significant optimizations there which don't break usage contracts. 
   I've spent years poring over profiles and squeezing every function call 
possible out of that system, sometimes producing entirely new approaches that I 
just had to throw out since they didn't work.   It has been rewritten many 
times.   Some background on the approach is at 
http://www.aosabook.org/en/sqlalchemy.html, 20.7. Query and Loading Behavior. 
 






Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 1:35 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 Um... a tad OT, but looking at that code, there's lots of
 opportunities for optimization.

 I'll have to profile a bit and let you know.

 are you referring to sqlalchemy/orm/loading.py ?   I'd be pretty impressed if 
 you can find significant optimizations there which don't break usage 
 contracts.I've spent years poring over profiles and squeezing every 
 function call possible out of that system, sometimes producing entirely new 
 approaches that I just had to throw out since they didn't work.   It has been 
 rewritten many times.   Some background on the approach is at 
 http://www.aosabook.org/en/sqlalchemy.html, 20.7. Query and Loading 
 Behavior.


I know... I'm talking micro-optimization. Pre-binding globals in tight
loops, for instance, like:

def filter_fn(x, tuple=tuple, zip=zip):
    return tuple(...)

This is of course only worth it for really really hot loops. That's
why I'm profiling. Maybe it's been done already for all the hottest
loops.
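
A slightly fuller sketch of the pre-binding trick, with made-up function names and data (pairs_plain, pairs_prebound, keys and row are all hypothetical, not SQLAlchemy code). Both functions do the same work; the second one resolves tuple and zip once, at definition time, so inside the body they are local variable reads instead of global-then-builtin lookups:

```python
def pairs_plain(keys, row):
    # tuple and zip are found via global, then builtin, lookup on every call.
    return tuple(zip(keys, row))


def pairs_prebound(keys, row, tuple=tuple, zip=zip):
    # tuple and zip are bound once as defaults and read as fast locals.
    return tuple(zip(keys, row))


keys = ("id", "name")
row = (1, "alice")
plain = pairs_plain(keys, row)
prebound = pairs_prebound(keys, row)
```

The cost is that the extra keyword arguments become part of the signature, so a caller could override them by accident.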

Then there's the possibility to replace some list comprehensions with
itertools, which besides not building a temp list, would also run
entirely in C. This also makes a difference only on very tight,
builtin-laden loops.
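
For illustration, a toy example of that kind of replacement (the data and names are hypothetical). The list comprehension builds a temporary list and runs its loop body as bytecode; the itertools version feeds the consumer lazily and does the per-item work inside C-implemented callables:

```python
from itertools import starmap
from operator import add

pairs = [(1, 2), (3, 4), (5, 6)]

# List comprehension: a temporary list is built before sum() sees anything.
total_listcomp = sum([a + b for a, b in pairs])

# starmap + operator.add: no temporary list, items are produced on demand.
total_starmap = sum(starmap(add, pairs))
```

Whether this wins in practice depends on the loop; it only tends to pay off when the per-item work is itself a builtin or C function.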

I have an app here that really stresses that part of the ORM, so I can
profile rather easily. In previous profiles, I remember seeing
Query.instances near the top, and all the optimizations I mentioned
above could be applied there. If they make any difference, I'll let you know.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Michael Bayer

On Apr 26, 2013, at 12:41 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Fri, Apr 26, 2013 at 1:35 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 Um... a tad OT, but looking at that code, there's lots of
 opportunities for optimization.
 
 I'll have to profile a bit and let you know.
 
 are you referring to sqlalchemy/orm/loading.py ?   I'd be pretty impressed 
 if you can find significant optimizations there which don't break usage 
 contracts.I've spent years poring over profiles and squeezing every 
 function call possible out of that system, sometimes producing entirely new 
 approaches that I just had to throw out since they didn't work.   It has 
 been rewritten many times.   Some background on the approach is at 
 http://www.aosabook.org/en/sqlalchemy.html, 20.7. Query and Loading 
 Behavior.
 
 
 I know... I'm talking micro-optimization. Pre-binding globals in tight
 loops, for instance, like:
 
 def filter_fn(x, tuple=tuple, zip=zip):
     return tuple(...)
 
 This is of course only worth it for really really hot loops. That's
 why I'm profiling. Maybe it's been done already for all the hottest
 loops.
 
 Then there's the possibility to replace some list comprehensions with
 itertools, which besides not building a temp list, would also run
 entirely in C. This also only makes a difference only on very tight,
 builtin-laden loops.
 
 I have an app here that really stresses that part of the ORM, so I can
 profile rather easily. In previous profiles, I remember seeing
 Query.instances near the top, and all the optimizations I mentioned
 above could be applied there, if they make any difference I'll tell.

The real bottleneck in loading is the loading.instances() function.  I have 
tried for years to reduce overhead in it.  Writing it in C would be best, but 
then again PyPy aims to solve the problem of function call overhead, 
pre-binding, and such.   I don't want to work against PyPy too much.






Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 2:04 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 are you referring to sqlalchemy/orm/loading.py ?   I'd be pretty impressed 
 if you can find significant optimizations there which don't break usage 
 contracts.I've spent years poring over profiles and squeezing every 
 function call possible out of that system, sometimes producing entirely new 
 approaches that I just had to throw out since they didn't work.   It has 
 been rewritten many times.   Some background on the approach is at 
 http://www.aosabook.org/en/sqlalchemy.html, 20.7. Query and Loading 
 Behavior.


 I know... I'm talking micro-optimization. Pre-binding globals in tight
 loops, for instance, like:

 def filter_fn(x, tuple=tuple, zip=zip):
     return tuple(...)

 This is of course only worth it for really really hot loops. That's
 why I'm profiling. Maybe it's been done already for all the hottest
 loops.

 Then there's the possibility to replace some list comprehensions with
 itertools, which besides not building a temp list, would also run
 entirely in C. This also only makes a difference only on very tight,
 builtin-laden loops.

 I have an app here that really stresses that part of the ORM, so I can
 profile rather easily. In previous profiles, I remember seeing
 Query.instances near the top, and all the optimizations I mentioned
 above could be applied there, if they make any difference I'll tell.

 the real bottleneck in loading is the loading.instances() function.  I have 
 tried for years to reduce overhead in it.  Writing it in C would be best, but 
 then again Pypy aims to solve the problem of FN overhead, pre-binding, and 
 such.   I don't want to work against Pypy too much.

That makes the proposition tricky. I don't know PyPy's performance
characteristics that well. I assume pre-binding wouldn't hurt PyPy
much, since loop traces would be nearly the same, but I've never
tested.

Pre-binding in filter_fn improves its runtime ten-fold. Actually,
that's pre-binding plus replacing tuple(genexpr) with tuple([listcomp]),
since genexprs are rather slow compared to list comprehensions. The
improvement accounts for 1% of my test's runtime, so if it hurts PyPy,
it might not be so great an optimization (if it doesn't, though, it's
a very cheap one, and it could be applicable in other places). This
particular one helps in the case of query(Column, Column, Column),
which I use a lot.

Note, however, that my test is 40% waiting on the DB, so CPU usage
impact would be proportionally bigger, especially with parallel
workers (I'm using just one thread when profiling though).

Doing those small optimizations to WeakIdentityMap (another one whose
methods are called an obscenely large number of times), I get about a
10% speedup on those. I imagine that could count in some situations.

Ultimately, though, it's InstrumentedAttribute.__get__ that is sucking
up 30% of alchemy-bound CPU time. I guess there's little that can be
done, since it's necessary to track state changes. But there's a neat
property of descriptors: if they don't implement __get__, then
they don't take precedence over the instance's dict on reads.

This is very interesting, and handy, since when instance_dict is
attrgetter('__dict__'), then, for regular ColumnProperty attributes,
instead of using InstrumentedAttribute, I can use an
InstrumentedWriteAttribute that has no __get__. This means, all of a
sudden, no overhead for simple attribute access.
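
A self-contained sketch of that descriptor property. WriteOnly and Row are hypothetical names; this is not the InstrumentedWriteAttribute from my patch, just the bare mechanism. The descriptor intercepts every assignment, but since it defines no __get__, reads fall through to the instance's __dict__ without calling any Python-level code:

```python
class WriteOnly(object):
    """Descriptor that defines __set__ but deliberately no __get__."""

    def __init__(self, key):
        self.key = key
        self.writes = []  # record of intercepted assignments

    def __set__(self, instance, value):
        # Writes still go through the descriptor, so changes can be tracked.
        self.writes.append(value)
        instance.__dict__[self.key] = value


class Row(object):
    name = WriteOnly("name")


row = Row()
row.name = "alice"           # handled by WriteOnly.__set__
value = row.name             # plain instance __dict__ lookup, no __get__ call
tracked = Row.__dict__["name"].writes
```

The flip side is visible in the same sketch: nothing runs on read, so there is no hook left for loading a value that is not in the dict yet.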

I've tested it and it mostly works. There's the "instance_dict is
attrgetter('__dict__')" requirement hanging over my head, and the more
serious issue of lazy attributes being mostly broken, but it's an
interesting POC IMHO.

Anyway, with that (fragile) change, I get a speedup of 10% overall
runtime, and about 50% alchemy-specific runtime. Considering I knew
about attribute access' slowness and avoided it in my test, that has
to account for something worth looking into? (before optimizing for
attribute access slowness, the test was about 3 times slower IIRC -
*times* - and it does a hefty amount of regex processing beyond
handling attributes)





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Michael Bayer

On Apr 26, 2013, at 6:13 PM, Claudio Freire klaussfre...@gmail.com wrote:

 Ultimately, though, it's InstrumentedAttribute.__get__ the one sucking
 up 30% of alchemy-bound CPU time. I guess there's little that can be
 done, since it's necessary to track state changes. But there's a neat
 property of descriptors, where if they don't implement __get__, then
 they don't take precedence over the instance's dict.
 
 This is very interesting, and handy, since when instance_dict is
 attrgetter('__dict__'), then, for regular ColumnPropertys, instead of
 using InstrumentedAttribute, I can replace that with an
 InstrumentedWriteAttribute that has no get. This means, all of a
 sudden, no overhead for simple attribute access.
 
 I've tested it and it mostly works. There's the instance_dict is
 attrgetter('__dict__') thing hanging over my head, and the more
 serious issue of lazy attributes being mostly broken, but it's an
 interesting POC IMHO.


just to be clear, you're breaking the capability of column-based attributes to 
lazy load at all, right?  Yeah, that can't really fly :).  The whole object is 
a live proxy for a database row, we have deferred, all kinds of stuff.

We have had users work on alternative static object loading routines, but of 
course the Query can return cheap NamedTuples to you if you just want fast 
immutable columns.

 Anyway, with that (fragile) change, I get a speedup of 10% overall
 runtime, and about 50% alchemy-specific runtime. Considering I knew
 about attribute access' slowness and avoided it in my test, that has
 to account for something worth looking into?

All attributes have to be expire-able and act as proxies for a database 
connection so I'm not really sure where to go with that.   I'm not too 
thrilled about proposals to build in various alternate performance behaviors 
as the library starts to try to act in many different ways that the vast 
majority of users aren't even aware of, it increases complexity internally, 
produces vast amounts of new use cases to test and maintain, etc.   I'm always 
willing to look at patches that are all winning, of course, so if you have some 
way to speed things up without breaking usage contracts and without major new 
complexity/brittleness I'd love to look at a pull request.

 (before optimizing for
 attribute access slowness, the test was about 3 times slower IIRC -
 *times* - and it does a hefty amount of regex processing beyond
 handling attributes)

I'm not sure what regexes you're referring to here.






Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 8:15 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 Anyway, with that (fragile) change, I get a speedup of 10% overall
 runtime, and about 50% alchemy-specific runtime. Considering I knew
 about attribute access' slowness and avoided it in my test, that has
 to account for something worth looking into?

 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that.I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways that the 
 vast majority of users aren't even aware of, it increases complexity 
 internally, produces vast amounts of new use cases to test and maintain, etc. 
I'm always willing to look at patches that are all winning, of course, so 
 if you have some way to speed things up without breaking usage contracts and 
 without major new complexity/brittleness I'd love to look at a pull request.

I know, it's just a probe to see what kind of a speedup could be
obtained by not having that getter's interference. You know... simply
implementing InstrumentedAttribute in C could do the trick...

 (before optimizing for
 attribute access slowness, the test was about 3 times slower IIRC -
 *times* - and it does a hefty amount of regex processing beyond
 handling attributes)

 Im not sure what regexes you're referring to here.

Oh, it's just application-specific regexes. The point was that there's
a lot of application-specific processing, so the speedup must be big
to be observable through the interference.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 8:47 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Fri, Apr 26, 2013 at 8:15 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:

 Anyway, with that (fragile) change, I get a speedup of 10% overall
 runtime, and about 50% alchemy-specific runtime. Considering I knew
 about attribute access' slowness and avoided it in my test, that has
 to account for something worth looking into?

 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that.I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways that 
 the vast majority of users aren't even aware of, it increases complexity 
 internally, produces vast amounts of new use cases to test and maintain, 
 etc.I'm always willing to look at patches that are all winning, of 
 course, so if you have some way to speed things up without breaking usage 
 contracts and without major new complexity/brittleness I'd love to look at a 
 pull request.

 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...


In fact... I'm gonna try that...





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Michael Bayer

On Apr 26, 2013, at 7:56 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Fri, Apr 26, 2013 at 8:47 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 On Fri, Apr 26, 2013 at 8:15 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 Anyway, with that (fragile) change, I get a speedup of 10% overall
 runtime, and about 50% alchemy-specific runtime. Considering I knew
 about attribute access' slowness and avoided it in my test, that has
 to account for something worth looking into?
 
 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that.I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways that 
 the vast majority of users aren't even aware of, it increases complexity 
 internally, produces vast amounts of new use cases to test and maintain, 
 etc.I'm always willing to look at patches that are all winning, of 
 course, so if you have some way to speed things up without breaking usage 
 contracts and without major new complexity/brittleness I'd love to look at 
 a pull request.
 
 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...
 
 
 In fact... I'm gonna try that...

feel free!  though you might be surprised, a C function that just calls out to 
all the same Python operations anyway is often only negligibly faster, not 
enough to make the extra complexity worth it.








Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Michael Bayer

On Apr 26, 2013, at 7:59 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 
 On Apr 26, 2013, at 7:56 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 On Fri, Apr 26, 2013 at 8:47 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 On Fri, Apr 26, 2013 at 8:15 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 Anyway, with that (fragile) change, I get a speedup of 10% overall
 runtime, and about 50% alchemy-specific runtime. Considering I knew
 about attribute access' slowness and avoided it in my test, that has
 to account for something worth looking into?
 
 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that.I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways that 
 the vast majority of users aren't even aware of, it increases complexity 
 internally, produces vast amounts of new use cases to test and maintain, 
 etc.I'm always willing to look at patches that are all winning, of 
 course, so if you have some way to speed things up without breaking usage 
 contracts and without major new complexity/brittleness I'd love to look at 
 a pull request.
 
 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...
 
 
 In fact... I'm gonna try that...
 
 feel free!  though you might be surprised, a C function that just calls out 
 to all the same Python operations anyway is often only negligibly faster, not 
 enough to make the extra complexity worth it.

also if you're looking to help with C, I'd love to get the C extensions out in 
the Py3K version, we have a patch that's fallen out of date at 
http://www.sqlalchemy.org/trac/ticket/2161 that needs freshening up and testing.





Re: [sqlalchemy] Are sqlalchemy queries a generator?

2013-04-26 Thread Claudio Freire
On Fri, Apr 26, 2013 at 9:01 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 All attributes have to be expire-able and act as proxies for a database 
 connection so I'm not really sure where to go with that. I'm not too 
 thrilled about proposals to build in various alternate performance 
 behaviors as the library starts to try to act in many different ways that 
 the vast majority of users aren't even aware of, it increases complexity 
 internally, produces vast amounts of new use cases to test and maintain, 
 etc. I'm always willing to look at patches that are all winning, of 
 course, so if you have some way to speed things up without breaking usage 
 contracts and without major new complexity/brittleness I'd love to look 
 at a pull request.

 I know, it's just a probe to see what kind of a speedup could be
 obtained by not having that getter's interference. You know... simply
 implementing InstrumentedAttribute in C could do the trick...


 In fact... I'm gonna try that...

 feel free!  though you might be surprised, a C function that just calls out 
 to all the same Python operations anyway is often only negligibly faster, 
 not enough to make the extra complexity worth it.

 also if you're looking to help with C, I'd love to get the C extensions out 
 in the Py3K version, we have a patch that's fallen out of date at 
 http://www.sqlalchemy.org/trac/ticket/2161 that needs freshening up and 
 testing.

Will look into that. The point of the C function is to quickly bypass
all that _supports_population handling and the function call overhead.
The getter is dead simple, so its cost is dominated by CPython function
call overhead, which is readily removable by reimplementing it in C.
The C version can also reliably and quickly detect when instance_dict
returns __dict__.
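
A minimal pure-Python sketch of the kind of overhead being discussed,
assuming a getter whose fast path is just a dictionary lookup. The names
SketchAttribute, Instrumented, Plain and time_reads are hypothetical and
are not SQLAlchemy's code; the sketch only contrasts a read that goes
through a Python-level __get__ with a plain instance attribute read.

```python
import timeit


class SketchAttribute(object):
    """Hypothetical stand-in for an instrumented attribute (not SQLAlchemy code)."""

    def __init__(self, key):
        self.key = key

    def __get__(self, instance, owner):
        if instance is None:
            # Class-level access returns the descriptor itself.
            return self
        try:
            # Fast path: the value is already present in the instance __dict__.
            return instance.__dict__[self.key]
        except KeyError:
            raise AttributeError(self.key)

    def __set__(self, instance, value):
        # Defining __set__ makes this a data descriptor, so __get__ is
        # invoked on every read even when the key is in __dict__.
        instance.__dict__[self.key] = value


class Instrumented(object):
    value = SketchAttribute("value")

    def __init__(self, value):
        self.value = value


class Plain(object):
    def __init__(self, value):
        self.value = value


def time_reads(obj, number=200000):
    """Return the seconds taken to read obj.value `number` times."""
    return timeit.timeit(lambda: obj.value, number=number)


if __name__ == "__main__":
    print("descriptor read: %.4fs" % time_reads(Instrumented(42)))
    print("plain read:      %.4fs" % time_reads(Plain(42)))
```

Both reads end in the same dictionary lookup; the descriptor version
additionally pays for a Python-level function call on every access, which
is the part a C implementation would be expected to avoid. Actual timings
depend on the interpreter and machine.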
