[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-29 Thread Michael Bayer


the method that I improved in my checkin is related to the mapper's
construction of a newly loaded instance, and the test case that
improves 20% focuses most of its time loading a list of 2500 items.
the test you have here spends a lot of time doing lots of other things,
such as saving items (about 60% of its time is spent in save operations
rather than loads), creating new sessions, firing off lazy load
callables, etc.  so its work is too diffuse to illustrate the one
bottleneck i sped up a bit.  if you make your two relations lazy=False,
it knocks a few seconds off the total time.

as far as lifecycle, its dependent on your application's workings.  for
a typical web application its easy - a session for each web request.
for other kinds of apps like GUI apps, you have to identify what
constitutes a "session".  it may or may not be efficient to have a new
session for each transaction.  it depends on how much freshness you
need, what kind of concurrency is present, etc.

there is no option to disable the attribute-copy-on-load operation; you
can make it happen by removing the body of the "commit()" method in
sqlalchemy/orm/attributes.py line 642 and replacing it with just
"pass".  but then your session flushes wont work, and neither will your
test program.  if you created a test program that just did loads, it
still doesnt speed things up too much, particulary if you have few
attributes on your objects.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-28 Thread Sébastien LELONG



[...]  simpliest thing
would be to use a Session that has flush() overidden. or an engine that
overrides execute() to check for INSERT/UPDATE/DELETE statements and
throws an error [...]


I tried the ReadOnlySession class which overrides the flush() func. Works like 
a charm, this adds a "security net" in my apps as I cannot actually even add 
any grants, so this is very useful for me. Thanks.


Moreover, in the 2nd scenario, if app #1 can insert rows and app #2 can't, 
this means there must be different db users (per app) with specific grants 
for each, which adds complexity and may results in an "evil maintainance"... 



Cheers,

--
Sébastien LELONG
sebastien.lelong[at]sirloon.net

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-28 Thread Michael Bayer


if its truly an issue of security then grants would be more
appropriate.  since anything the ORM does to "prevent" a write
operation can be easily overridden, since its Python.  simpliest thing
would be to use a Session that has flush() overidden. or an engine that
overrides execute() to check for INSERT/UPDATE/DELETE statements and
throws an error.  not a feature id think is appropriate for the core of
SA though.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-28 Thread Sébastien LELONG



one thing that could make ORM loads much faster would be if you knew
the objects would not need to be flushed() at a later point, and you
disabled history tracking on those instances.  this would prevent the
need to create a copy of the object's attributes at load time.


This reminds me a functionality I was looking for a few weeks ago. I'd liked 
to be able to load read-only objects from a database. Read-only objects were 
(are still) useful for me when retrieving data from a "precious" database 
(production db), ensuring no update/insert/delete operations would occur. 
Another scenario is an app which can actually do these operations, and 
another app which musn't. While this can be done using grants, it could be 
useful (and easier) if things could be done at the ORM level.


Is there any other way to get those read-only objects, without implementing 
this functionality ?


Cheers,

--
Sébastien LELONG
sebastien.lelong[at]sirloon.net

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-28 Thread david mugnai



Michael Bayer wrote:
[snip]

actually a lot better than they've been in the past.  if your tests are
useful, I might add them as well (but note that your attachments didnt
come through, so try again).


I forgot the attachments, sorry. Please find them here:

http://d2.spacespa.it:8000/temp/test-userhandling.py
http://d2.spacespa.it:8000/temp/userhandling1.py
http://d2.spacespa.it:8000/temp/userhandling2.py


one thing that could make ORM loads much faster would be if you knew
the objects would not need to be flushed() at a later point, and you
disabled history tracking on those instances.  this would prevent the
need to create a copy of the object's attributes at load time.


How can I disable the history tracking?

[snip]

Strangely, your last commit doesn't seem to bring any speed
improvements. Last measurements:

user handling with ORM
Starting test
Total time: 76.4469230175

real1m17.088s
user0m22.097s
sys 0m0.348s


user handling without ORM
Starting test
Total time: 44.6131839752

real0m45.185s
user0m8.561s
sys 0m0.228s


I'd also like to know how to improve the userhandling2.py file. I have
some doubts about the life-cycle of session objects. Is it better to
create and destroy them at each transaction, or to make them long
lived?

thanks in advance
david


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-27 Thread Michael Bayer


ive committed in r2174 some speed enhancements, not including the
abovementioned change to deferring the on-load "copy" operation (which
is a more involved change), that affords a 20% speed improvement in
straight instance loads and a 25% speed improvement in instances loaded
via eager loaders...mostly due to the removal of an expensive
__getattribute__ call that was involved as well as an AttributeError
being thrown/caught for each instance created.

deferring the on-load "copy" operation which i mentioned earlier it
turns out would probably knock another 12% off the loading time, so its
not as dramatic as I thought originally.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-27 Thread Michael Bayer


dbrow seems to be exactly what ResultProxy already does (which has no
speed problems).

the answer is in optimizing the "copy" made of each object's state upon
load.  step 1 is to make that an optional step, do some speed tests to
insure that it speeds things up when disabled, step 2 is to possibly
make that "copy" only take place when you actually modify the loaded
object, so that the load operation is fast.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-27 Thread Lele Gaifax


Michael Bayer wrote:


the ORM is going to be slower in all cases since there is the overhead
of creating new object instances and populating them, as well as
initializing their attribute instrumentation and also a copy of their
attributes for the purposes of tracking changes when you issue a
flush() statement.


Couldn't an approach like dbrow (http://opensource.theopalgroup.com/) 
bring some benefit in this area? It surely overlaps here and there with 
SA, but expecially thanks to the C impl (there's also a pure Python one) 
it's pretty fast in delivering lots of instances of the same class.


ciao, lele.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Performance of data mapping and plain access

2006-12-27 Thread Michael Bayer


the ORM is going to be slower in all cases since there is the overhead
of creating new object instances and populating them, as well as
initializing their attribute instrumentation and also a copy of their
attributes for the purposes of tracking changes when you issue a
flush() statement.  this strategy was copied from that of hibernate's,
and provides the greatest stability and predictability with regards to
tracking history changes.

we do have some performance tests in the test/ directory which ive used
in the past for profiling and improving these operations, and they are
actually a lot better than they've been in the past.  if your tests are
useful, I might add them as well (but note that your attachments didnt
come through, so try again).

one thing that could make ORM loads much faster would be if you knew
the objects would not need to be flushed() at a later point, and you
disabled history tracking on those instances.  this would prevent the
need to create a copy of the object's attributes at load time.

while theres no option available for that right now its something that
could be added.  that would probably give you a 20-40% speed boost at
least if not more.


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---