[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-22 Thread Mike Orr

Regarding the slowness I was experiencing mapping a select (which
caused a nested SELECT in the SQL), I converted the inner select to a
MySQL VIEW and the speed improved dramatically.  The MySQL manual says
it manages a select against a view by merging them into a single
select if it can, but it doesn't seem to do that with nested selects.

EXPLAIN SELECT ... FROM the_view WHERE ...
shows it's actually selecting from the underlying table (i.e., it
lists the real table name), while eanwhile,
EXPLAIN SELECT ... FROM (SELECT ... FROM the_table) AS my_alias WHERE ...
does two selects rather than merging the SQL into one.

The one problem I discovered is that MySQL does not propagate the
primary key to the view, which causes SQLAlchemy to raise:
class 'sqlalchemy.exceptions.ArgumentError': Could not assemble any
primary key columns for mapped table 'IN_Incident'.
I had to explicitly tell SQLAlchemy which columns are primary keys:

  incidents = Table(IN_Incident, meta,
Column(orr_id, Integer, primary_key=True),
autoload=True)
entries = Table(IN_Entry, meta,
Column(entry_id, Integer, primary_key=True),
Column(orr_id, Integer, ForeignKey(incidents.c.orr_id)),
autoload=True)

-- 
Mike Orr [EMAIL PROTECTED]

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-22 Thread Michael Bayer


On May 22, 2007, at 6:17 PM, Mike Orr wrote:


 Regarding the slowness I was experiencing mapping a select (which
 caused a nested SELECT in the SQL), I converted the inner select to a
 MySQL VIEW and the speed improved dramatically.  The MySQL manual says
 it manages a select against a view by merging them into a single
 select if it can, but it doesn't seem to do that with nested selects.

 EXPLAIN SELECT ... FROM the_view WHERE ...
 shows it's actually selecting from the underlying table (i.e., it
 lists the real table name), while eanwhile,
 EXPLAIN SELECT ... FROM (SELECT ... FROM the_table) AS my_alias  
 WHERE ...
 does two selects rather than merging the SQL into one.


well thats pretty awful.  an argument to map against the table  
directly instead of a select.  we do have some new ways to get  
additional columns into a mapping now without it needing to  
encapsulate it (column_property(), add_column(), query.select 
(someselect())), so mapping to selects is not quite as necessary.




--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-09 Thread Mike Orr

On 5/8/07, Michael Bayer [EMAIL PROTECTED] wrote:


 On May 8, 2007, at 8:47 PM, Mike Orr wrote:

 
  I noticed the ORM was taking a few seconds longer in my batch job.
 
  Today I converted my web application from Select to ORM so I could add
  some methods to the mapped classes, and even tried a relationship with
  a backreference (!).  That worked fine, but I notice it's doing nested
  selects again.

...

 I guess im giong to have to find a way to back this up, but its my
 belief that the database should be pretty good at optimizing nested
 selects such that it doesnt make a whole lot of difference.
 otherwise people would never use views.

I tested it today and the ORM version sometimes takes six seconds to
display a page while the non-ORM version always takes less than a
second.The longer times correspond to pages with the most result
records that haven't been viewed since the application (Pylons) was
restarted.  But sometimes the delays don't happen even under the same
conditions, so I'll have to monitor it for a while.  If it takes a few
extra seconds after the app is started, that's fine, as long as it
isn't happening all the time.

The non-ORM version has functions that create a base select with a
limited number of fields, then the caller adds clauses to it.  It also
doesn't have a relation, but the ORM SQL isn't actually using a
relation either.

-- 
Mike Orr [EMAIL PROTECTED]

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-09 Thread Mike Orr

The actual SQL time is 1.04 second for this ORM-inspired query:

SELECT entries.category AS entries_category, entries.title AS
entries_title, entries.thumb200 AS entries_thumb200, entries.creator
AS entries_creator, entries.doctype AS entries_doctype,
entries.filename AS entries_filename, entries.content AS
entries_content, entries.entry_id AS entries_entry_id,
entries.entry_date AS entries_entry_date, entries.is_public AS
entries_is_public, entries.size AS entries_size, entries.orr_id AS
entries_orr_id FROM (SELECT `Entry`.entry_id AS entry_id,
`Entry`.orr_id AS orr_id, `Entry`.entry_date AS entry_date,
`Entry`.creator AS creator, `Entry`.title AS title, `Entry`.category
AS category, `Entry`.content AS content, `Entry`.filename AS filename,
`Entry`.thumb200 AS thumb200, `Entry`.doctype AS doctype, `Entry`.size
AS size, `Entry`.is_public AS is_public FROM `Entry` WHERE
`Entry`.is_public) AS entries WHERE entries.orr_id = 6153 and
entries.category in (1, 2, 3, 4, 6, 7, 8, 9, 10, 11);

vs 0.14 seconds for this non-ORM one:

SELECT `Entry`.entry_id, `Entry`.orr_id, `Entry`.category,
`Entry`.title, `Entry`.entry_date, `Entry`.filename, `Entry`.thumb200
FROM `Entry` WHERE (`Entry`.is_public AND `Entry`.orr_id = 6153)  AND
`Entry`.category IN (1,2,3,4,5,6,7,8,9,10,11) order by
`Entry`.entry_date DESC LIMIT 11;

vs 0.13 seconds for my manual equivalent:

select entry_id, orr_id, category, title, entry_date, filename,
thumb200 from Entry where is_public and orr_id = 6153 and category in
(1,2,3,4,5,6,7,8,9,10,11) order by entry_date desc limit 11;


vs 0.07 seconds for the same with *:

select * from Entry where is_public and orr_id=6153 and category in
(1,2,3,4,6,7,8,9,10,11);

I'm tempted to say ORM is good but not when based on a select, at
least not with this particular dataset.

-- 
Mike Orr [EMAIL PROTECTED]

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-09 Thread Michael Bayer


On May 9, 2007, at 2:46 PM, Mike Orr wrote:


 The actual SQL time is 1.04 second for this ORM-inspired query:

 SELECT entries.category AS entries_category, entries.title AS
 entries_title, entries.thumb200 AS entries_thumb200, entries.creator
 AS entries_creator, entries.doctype AS entries_doctype,
 entries.filename AS entries_filename, entries.content AS
 entries_content, entries.entry_id AS entries_entry_id,
 entries.entry_date AS entries_entry_date, entries.is_public AS
 entries_is_public, entries.size AS entries_size, entries.orr_id AS
 entries_orr_id FROM (SELECT `Entry`.entry_id AS entry_id,
 `Entry`.orr_id AS orr_id, `Entry`.entry_date AS entry_date,
 `Entry`.creator AS creator, `Entry`.title AS title, `Entry`.category
 AS category, `Entry`.content AS content, `Entry`.filename AS filename,
 `Entry`.thumb200 AS thumb200, `Entry`.doctype AS doctype, `Entry`.size
 AS size, `Entry`.is_public AS is_public FROM `Entry` WHERE
 `Entry`.is_public) AS entries WHERE entries.orr_id = 6153 and
 entries.category in (1, 2, 3, 4, 6, 7, 8, 9, 10, 11);

 vs 0.14 seconds for this non-ORM one:

 SELECT `Entry`.entry_id, `Entry`.orr_id, `Entry`.category,
 `Entry`.title, `Entry`.entry_date, `Entry`.filename, `Entry`.thumb200
 FROM `Entry` WHERE (`Entry`.is_public AND `Entry`.orr_id = 6153)  AND
 `Entry`.category IN (1,2,3,4,5,6,7,8,9,10,11) order by
 `Entry`.entry_date DESC LIMIT 11;

 vs 0.13 seconds for my manual equivalent:

 select entry_id, orr_id, category, title, entry_date, filename,
 thumb200 from Entry where is_public and orr_id = 6153 and category in
 (1,2,3,4,5,6,7,8,9,10,11) order by entry_date desc limit 11;


 vs 0.07 seconds for the same with *:

 select * from Entry where is_public and orr_id=6153 and category in
 (1,2,3,4,6,7,8,9,10,11);

 I'm tempted to say ORM is good but not when based on a select, at
 least not with this particular dataset.

this test has many issues that prevents any comparison of the nested  
select - your non-ORM query has LIMIT 11 inside of it whereas your  
ORM query does not (use limit or limit() with query() for this).   
also I am assuming you are doing a full fetch - the ORM query has a  
lot more columns in it which will take time to be fetched (these  
columns can be set as deferred to not have them part of a default  
query).

to time this accurately, you need to put the same external LIMIT on  
both queries, and run the queries in the database console directly -  
that will test just the time for the DB to do the initial execution  
without any fetching.  then you need to run both queries in a simple  
DBAPI application and do a full fetch to see how much additional  
overhead comes from the fetch.  beyond that, the ORM has some  
additional overhead in creating objects but its more on the order of  
50% slower, not 1000% slower.

the strategy i would take with this is to use the databases query  
plan as a guide, to see that the optimizer does in fact flatten the  
nested selects into something negligible...ive been reading up on  
MySQL's execution plan view commands and was planning to work with  
them a little bit.



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-05-09 Thread Mike Orr

On 5/9/07, Michael Bayer [EMAIL PROTECTED] wrote:


 On May 9, 2007, at 2:46 PM, Mike Orr wrote:

 
  The actual SQL time is 1.04 second for this ORM-inspired query:
 
  SELECT entries.category AS entries_category, entries.title AS
  entries_title, entries.thumb200 AS entries_thumb200, entries.creator
  AS entries_creator, entries.doctype AS entries_doctype,
  entries.filename AS entries_filename, entries.content AS
  entries_content, entries.entry_id AS entries_entry_id,
  entries.entry_date AS entries_entry_date, entries.is_public AS
  entries_is_public, entries.size AS entries_size, entries.orr_id AS
  entries_orr_id FROM (SELECT `Entry`.entry_id AS entry_id,
  `Entry`.orr_id AS orr_id, `Entry`.entry_date AS entry_date,
  `Entry`.creator AS creator, `Entry`.title AS title, `Entry`.category
  AS category, `Entry`.content AS content, `Entry`.filename AS filename,
  `Entry`.thumb200 AS thumb200, `Entry`.doctype AS doctype, `Entry`.size
  AS size, `Entry`.is_public AS is_public FROM `Entry` WHERE
  `Entry`.is_public) AS entries WHERE entries.orr_id = 6153 and
  entries.category in (1, 2, 3, 4, 6, 7, 8, 9, 10, 11);
 
  vs 0.14 seconds for this non-ORM one:
 
  SELECT `Entry`.entry_id, `Entry`.orr_id, `Entry`.category,
  `Entry`.title, `Entry`.entry_date, `Entry`.filename, `Entry`.thumb200
  FROM `Entry` WHERE (`Entry`.is_public AND `Entry`.orr_id = 6153)  AND
  `Entry`.category IN (1,2,3,4,5,6,7,8,9,10,11) order by
  `Entry`.entry_date DESC LIMIT 11;
 
  vs 0.13 seconds for my manual equivalent:
 
  select entry_id, orr_id, category, title, entry_date, filename,
  thumb200 from Entry where is_public and orr_id = 6153 and category in
  (1,2,3,4,5,6,7,8,9,10,11) order by entry_date desc limit 11;
 
 
  vs 0.07 seconds for the same with *:
 
  select * from Entry where is_public and orr_id=6153 and category in
  (1,2,3,4,6,7,8,9,10,11);
 
  I'm tempted to say ORM is good but not when based on a select, at
  least not with this particular dataset.

 this test has many issues that prevents any comparison of the nested
 select - your non-ORM query has LIMIT 11 inside of it whereas your
 ORM query does not (use limit or limit() with query() for this).

The original query did have .query(...).select(limit=11), I just
forgot to paste that part into the console because I was so intent on
inlining the parameters.  Nevertheless, the non-limited queries
returned 21 records, which is pretty close.

 also I am assuming you are doing a full fetch - the ORM query has a
 lot more columns in it which will take time to be fetched

Of course, that's why I requested the 'autoload_columns' feature. :)

 (these
 columns can be set as deferred to not have them part of a default
 query).

True, but then I have to specify the columns I'm not interested in
rather than the ones I am. :) :)

Anyway, this is all just to provide feedback on how
ORM-based-on-select behaves in a real-world application.  Because the
non-ORM code is consistently fast while the ORM code is inconsistently
slow, I'd better just stick to non-ORM code for this application.  The
application is a bit unusual because of the need to exclude certain
rows and columns from all queries, which is the only reason I'm basing
it on a select -- so I can set the rule in one easy-to-audit place
rather than in every query.  I haven't tried a database view -- maybe
that will have better performance since it's built in into the
database engine.

-- 
Mike Orr [EMAIL PROTECTED]

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: [Sqlalchemy-users] Objects with just some fields

2007-04-27 Thread Mike Orr

On 4/27/07, Michael Bayer [EMAIL PROTECTED] wrote:

 On Apr 27, 2007, at 2:19 PM, Mike Orr wrote:

  Is it possible to make a mapper class that loads  modifies only some
  fields in a table rather than all the fields, yet still autoloads the
  fields rather than having hardcoded column types?

 you want to use a deferred column :

 http://www.sqlalchemy.org/docs/
 adv_datamapping.html#advdatamapping_properties_deferred

That puts the burden in the wrong place: (1) I have to list all the
undesired columns by name, (2) SQLAlchemy has to go to the trouble of
making it a deferred column when I'd prefer it just ignore it
entirely.

 yeah when you map to a select, it treats that like a table, and
 selects from it so that it can be manipulated similarly.

Is it safe to do updates on an object mapped to a select?  E.g.,

_sel = select([... fields ...])
class Foo(object):  pass
mapper(Foo, _sel)
for i in session.query(Foo).select():
i.attr = Value
session.flush()

  What I'd like to do is pass a list of column names to the Table
  constructor and have it autoload those and ignore the others.  I
  couldn't find an argument for this.

 oh.  well that we haven't done yet.  the usual use case is that if
 you know the names of the columns already, why reflect them ?  but
 yes i understand the advantage of getting the data types and foreign
 key constraints reflected.  this wouldnt be a difficult feature to add.

Created ticket #561 suggesting:

Table(Foo, meta, autoload_columns=[foo_id, col2, col3])

  - Is there a supported way to add/remove fields from a query after
  it's been constructed?

 from a select(), it supports adding things.  there is append_column
 (), append_whereclause(), order_by(), etc.  this API needs docstrings
 and probably the names order_by/group_by should be named
 append_order_by() etc.but you can see the methods listed out in
 the HTML docstrings on the site (html docs also included with the dist).

 removing things is not stressed so much since the typical use case is
 building up a query from a core criterion/selectable, and we have
 the usual issue about decisions being made based on things being
 appended, which to support removal would mean a lot of new code
 (which id welcome, of course !) to support un-making those decisions.

.append_column just needs to be documented in the manual.
.remove_column would be useful, but it's not that important if it's
complicated to implement.  (I thought the column list was just a
simple list until the query was compiled.)  I wasn't suggesting
.remove_whereclause or .remove_order_by -- I don't see any point for
those, and how would one identify the element to remove anyway?

-- 
Mike Orr [EMAIL PROTECTED]

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---