Re: [sqlalchemy] lockmode

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 12:35 AM, Michael Mileusnich wrote:

 It's just a basic SELECT statement; I do not see any other params.

SQL Server doesn't support FOR UPDATE except in conjunction with DECLARE 
CURSOR, which is not typical DBAPI usage, so this feature is not supported by 
SQLAlchemy with MS-SQL Server.

Stack Overflow confirms that people can't really use FOR UPDATE with SQL Server in 
any practical way:

http://stackoverflow.com/questions/1483725/select-for-update-with-sql-server
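
For reference, on a backend that does support it (PostgreSQL, MySQL/InnoDB, Oracle), 
the usage being attempted would look roughly like the sketch below; the Job class and 
the already-configured Session factory are assumptions for illustration, not the 
original poster's code:

    # minimal sketch, assuming a mapped class Job and a configured Session factory
    session = Session()

    # the lock mode is passed as a string; on supporting backends this renders
    # SELECT ... FOR UPDATE, while on MS-SQL no FOR UPDATE is emitted (see above)
    job = session.query(Job).filter(Job.id == 1).with_lockmode('update').first()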









 
 Thanks
 Michael Mileusnich
 
 On Sun, Jul 18, 2010 at 5:05 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 What does your SQL output say? Do you see FOR UPDATE in the log?
 
 Sent from my iPhone
 
 
 On Jul 17, 2010, at 8:47 PM, Michael Mileusnich justmike2...@gmail.com 
 wrote:
 
 Hello,
 
 I am running Python 2.6 and SQLAlchemy 0.5.8 against MS SQL Server Express 2008 
 with pyODBC.  When I issue something like:
 
 sess.query(job).with_lockmode('update')
 
 It does not seem to be locking according to the query I am getting back from 
 my profiler.  Is this the correct usage?
 
 Thanks
 Mike
 
 
 
 




[sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Adrian Price-Whelan
Hi,

I was just wondering at first whether there was a known issue with
ARRAY types, but here is the query:
spectra =
session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
== spPlateFilename).all()

It should return ~1000 objects equaling about 120MB of data. In
Python, this query takes 10 minutes to complete,  but as a SQL query
(copying and pasting the echo'd command) it takes a few seconds:
SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'

autoflush and autocommit are both set to False.

It seems like a straightforward query so I'm confused as to what could
be getting hung up.

Thanks for any insight,

Adrian

On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 You absolutely need to turn on echoing and locate the specific SQL query 
 which causes the issue.   Queries can take excessive time for a very wide 
 variety of reasons.

 On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:



  Hello --

  I'm working with a database populated with data originally from a file 
  structure of files that are ~150MB each. We are dealing with a lot of data 
  that is being stored in the database using the 'ARRAY' datatype, 
  specifically numeric[]. After loading some of the data into the database I 
  tried performing a query to get back some data, and comparing it with code 
  that reads directly from the file system - but the database query took ~50 
  times longer to complete. For instance, to retrieve 100 records that 
  contain a few 4000 element arrays each using the code that reads the 
  filesystem it took less than a second, but the query on the database took 
  around 25 seconds to complete. Has anyone else had issues with array types 
  slowing down queries or does this sound more like another issue?

  Thanks!
  Adrian





Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:

 Hi,
 
 I was just wondering at first whether there was a known issue with
 ARRAY types, but here is the query:
 spectra =
 session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
 == spPlateFilename).all()
 
 It should return ~1000 objects equaling about 120MB of data. In
 Python, this query takes 10 minutes to complete,  but as a SQL query
 (copying and pasting the echo'd command) it takes a few seconds:
 SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
 spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
 spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
 WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'
 
 autoflush and autocommit are both set to False.
 
 It seems like a straightforward query so I'm confused as to what could
 be getting hung up.

does it take a few seconds to fully fetch all the results and it only gets 1000 
rows ?  or is that just to get the initial result? these are different things.  
 also if any of the individual columns are very large BLOBs or perhaps very 
large PG arrays that would add to the overhead of a fetch.   You can also try 
writing a DBAPI-only script that runs the query, as well as running 
engine.execute(myquery.statement) and fetching rows that way to see if some 
in-object process is the factor (which is unlikely).
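
A rough way to make that comparison, sketched under the assumption that engine, 
session, Spectrum, SpectrumHeader and spPlateFilename already exist as in the 
original post:

    import time

    q = session.query(Spectrum).join(SpectrumHeader).filter(
            SpectrumHeader.filename == spPlateFilename)

    start = time.time()
    rows = engine.execute(q.statement).fetchall()   # Core-level fetch, no ORM objects built
    print 'core fetchall:', time.time() - start, len(rows)

    start = time.time()
    objs = q.all()                                  # the same query through the ORM
    print 'orm .all():   ', time.time() - start, len(objs)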

 
 Thanks for any insight,
 
 Adrian
 
 On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:
 You absolutely need to turn on echoing and locate the specific SQL query 
 which causes the issue.   Queries can take excessive time for a very wide 
 variety of reasons.
 
 On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:
 
 
 
 Hello --
 
 I'm working with a database populated with data originally from a file 
 structure of files that are ~150MB each. We are dealing with a lot of data 
 that is being stored in the database using the 'ARRAY' datatype, 
 specifically numeric[]. After loading some of the data into the database I 
 tried performing a query to get back some data, and comparing it with code 
 that reads directly from the file system - but the database query took ~50 
 times longer to complete. For instance, to retrieve 100 records that 
 contain a few 4000 element arrays each using the code that reads the 
 filesystem it took less than a second, but the query on the database took 
 around 25 seconds to complete. Has anyone else had issues with array types 
 slowing down queries or does this sound more like another issue?
 
 Thanks!
 Adrian
 
 
 




Re: [sqlalchemy] problem with multi-table mapping - how to configure merge/cascade behaviour on mappers based on join/select?

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 10:24 AM, Harry Percival wrote:

 OK, so I will treat any classes mapped to a join of multiple tables as
 being a read-only API, and manually manage the write-API using
 relationship().
 
 It doesn't look like I can define a relationship from the composite
 mapped class to individual tables though?
 
 
 #single table mappers:
 mapper(Movies,movies_table)
 mapper(Directors,directors_table)
 mapper(Genres,genres_table)
 
 
 j = join(movies_table,md_table).join(directors_table).join(genres_table)
 r0 = relationship(Movies,
  primaryjoin=(movies_table.c.movie_id==md_table.c.movie_id),
  foreign_keys=([md_table.c.movie_id]))
 r1 = 
 relationship(Directors)#,primaryjoin=(and_(movies_table.c.movie_id==md_table.c.movie_id,md_table.c.director_id==directors_table.c.director_id)))#,secondaryjoin=(md_table.c.director_id==directors_table.c.director_id),secondary=md_table)
 r2 = relationship(Genres)
 
 mapper(MoviesAndDirectorsAndGenres,
   j.select(use_labels=True).alias('mdg'),

j.select(...).alias() is the selectable that you have to define all your 
primaryjoins against.   So you need to assign that on the outside to a variable 
first.





   properties={'movie':r0,
   'director':r1,
   'genre':r2
   },
passive_updates=False)
 
 Tried lots of permutations of this, but it doesn't look like I can get
 sqla to understand a 1-to-1 relationship between the
 multiple-table-mapped MoviesAndDirectorsAndGenres objects and the
 Movies, Directors, or Genres objects they're built out of...
 
 I think perhaps I'm just trying to do something that SQLA really
 wasn't designed for...
 
 hp
 
 On Mon, Jul 19, 2010 at 2:29 AM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 On Jul 15, 2010, at 5:11 PM, Harry Percival wrote:
 
 thanks Michael. I really appreciate your help.
 
 How should use .merge()?  I've tried both:
 
 another_new = MoviesAndDirectorsAndGenres() #init another_new as blank row
 another_new = session.merge(new) #attempt to merge with my 'new'
 object that has desired attributes
 
 and
 
 new = session.merge(another_new) #attempt to merge blank object to my
 new object with desired attributes
 
 the former fails trying to add a genre row that already exists, the
 latter fails trying to add an object with all null values.
 
 the thing is, I'm really not trying to merge anything.  I'm trying to
 create a new object, which happens to be made up partially of new
 entries in some tables, and existing entries in other tables.
 genres_genre_id refers to the primary key of the genres table, and a
 genre with genre_id=6 already exists.
 
 
 am i going to have to map an ORM class to each of the composite
 tables, and use references to them via relationship() instead of
 hoping sql can figure it out by itself from the tables that make up
 the join construct?
 
 can supply a sqlite database and some source code if it helps?
 
 It's typically appropriate to map tables individually and connect them via 
 relationship(), if you want to be writing rows to them independently.   
  That's what the phrase "partially of new entries in some tables, and 
  existing entries in other tables" implies.
 
 
 
 
 rgds,
 Harry
 
 On Thu, Jul 15, 2010 at 9:48 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 On Jul 15, 2010, at 2:52 PM, Harry Percival wrote:
 
 A new problem, which seems to occur in both IronPython and normal Python:
 
 I have a database with tables for movies, directors, genres (and a
 bridging table movie_directors)
 I have a class mapped to a join of all three of the above
 
 
 j = join(movies_table,md_table).join(directors_table).join(genres_table)
 mapper(MoviesAndDirectorsAndGenres,j.select(use_labels=True).alias('moviesdirectorsgenres'))
 
 this works fine. i can query the database ok using this mapper.
 
 The tables already have some values in.
 Now, I want to create a new composite object, that references some of
 the existing values, but not all - say it's a new movie by an existing
 director in an existing genre:
 
 
 new = MoviesAndDirectorsAndGenres()
 new.movies_movie_id=8 #does not already exist in the table
 new.directors_director_id=2 #already exists in the table
 new.genres_genre_id=6 #already exists too
 session.add(new)
 
 this fails with a constraint error / integrity error, because SQLA
 tries to re-insert genre #6 even though it already exists. it also
 happens if i use session.merge(new).  it also happens if i fully
 specify all of the attributes of each movie/director/genre correctly.
 
 Is this expected behaviour?  is there any way to get sqla to
 intelligently only do inserts when necessary?  is it something to do
 with cascade configuration, and if so, how do i configure cascades on
 a mapper that's not based on relationship() but based on join()?
 
 you should be using merge(), and you should ensure that the objects being 
 merged have the correct primary key values.   If genre #6 

Re: [sqlalchemy] Graceful locking with sqlite

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 10:16 AM, Samuel wrote:

 Hi,
 
 I am trying to access (write to) an SQLite database from multiple
 threads (no ORM), resulting in the following error:
 
 OperationalError: (OperationalError) database is locked
 
 The engine is created using the default SingletonThreadPool. Is there
 a way to do graceful locking to avoid this error?
 
 (I am using sqlalchemy 0.4.8.)

you'll probably have better luck using the NullPool with a file-based SQLite 
database, but sqlite still allows only one writer at a time to access the 
database.
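
For what it's worth, switching a file-based SQLite engine over to NullPool looks 
roughly like this (a minimal sketch; the database path is made up):

    # minimal sketch, assuming a file-based SQLite database at ./data.db
    from sqlalchemy import create_engine
    from sqlalchemy.pool import NullPool

    # NullPool opens and closes a real connection per checkout instead of caching
    # one per thread; SQLite itself still permits only one writer at a time
    engine = create_engine('sqlite:///data.db', poolclass=NullPool)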





 
 -Samuel
 
 




[sqlalchemy] Re: how to run a stored procedure?

2010-07-19 Thread Lukasz Szybalski


On Jul 15, 3:35 pm, Lukasz Szybalski szybal...@gmail.com wrote:
 On Jul 15, 2:39 pm, David Gardner dgard...@creatureshop.com wrote:

   take a look at: 
   http://www.sqlalchemy.org/docs/reference/sqlalchemy/expressions.html?...

 Now, could you give me an example of it? I understand how to run func
 to get values like the current timestamp, or count... but how to run a
 stored proc?

 print
 func.assp_Report_DailyTransactions(start_date='20100701',end_date='20100715') 
      ??
 Is this the format? or?


Any idea what the proper format should be?
I was able to connect to the MSSQL database via instance name on Linux.
I've updated the docs on how to setup your odbc dsn connection string:
http://lucasmanual.com/mywiki/unixODBC

Now that I'm connected, how do I execute the procedure and pass in variables like
the start and end date?

Thanks,
Lucas
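
Not an answer from the thread, but one common way to do this against MS-SQL over 
pyodbc is to EXEC the procedure through text() with bound parameters; a sketch 
only, with the connection URL/DSN as a placeholder:

    # hedged sketch: invoke the stored procedure with named bind parameters
    from sqlalchemy import create_engine, text

    engine = create_engine('mssql://user:password@mydsn')  # placeholder URL/DSN

    result = engine.execute(
        text("EXEC assp_Report_DailyTransactions :start_date, :end_date"),
        start_date='20100701', end_date='20100715')
    for row in result:
        print row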




[sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Adrian Price-Whelan
 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result?

I'm not sure what you mean by this - the query does return 1000 rows.

 also if any of the individual columns are very large BLOBs or perhaps very 
 large PG arrays that would add to the overhead of a fetch.

There definitely are columns of PG arrays ~4000 elements each, so back
to my first email it seems like the culprit here could be the ARRAYs.

Thanks for your help,

Adrian

On Jul 19, 10:10 am, Michael Bayer mike...@zzzcomputing.com wrote:
 On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:





  Hi,

  I was just wondering at first whether there was a known issue with
  ARRAY types, but here is the query:
  spectra =
  session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
  == spPlateFilename).all()

  It should return ~1000 objects equaling about 120MB of data. In
  Python, this query takes 10 minutes to complete,  but as a SQL query
  (copying and pasting the echo'd command) it takes a few seconds:
  SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
  spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
  spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
  spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
  spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
  spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
  spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
  WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'

  autoflush and autocommit are both set to False.

  It seems like a straightforward query so I'm confused as to what could
  be getting hung up.

 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result? these are different 
 things.   also if any of the individual columns are very large BLOBs or 
 perhaps very large PG arrays that would add to the overhead of a fetch.   You 
 can also try writing a DBAPI-only script that runs the query, as well as 
 running engine.execute(myquery.statement) and fetching rows that way to see 
 if some in-object process is the factor (which is unlikely).





  Thanks for any insight,

  Adrian

  On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:
  You absolutely need to turn on echoing and locate the specific SQL query 
  which causes the issue.   Queries can take excessive time for a very wide 
  variety of reasons.

  On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:

  Hello --

  I'm working with a database populated with data originally from a file 
  structure of files that are ~150MB each. We are dealing with a lot of 
  data that is being stored in the database using the 'ARRAY' datatype, 
  specifically numeric[]. After loading some of the data into the database 
  I tried performing a query to get back some data, and comparing it with 
  code that reads directly from the file system - but the database query 
  took ~50 times longer to complete. For instance, to retrieve 100 records 
  that contain a few 4000 element arrays each using the code that reads the 
  filesystem it took less than a second, but the query on the database took 
  around 25 seconds to complete. Has anyone else had issues with array 
  types slowing down queries or does this sound more like another issue?

  Thanks!
  Adrian






Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 11:46 AM, Adrian Price-Whelan wrote:

 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result?
 
 I'm not sure what you mean by this - the query does return 1000 rows.


if you run a query from a SQL console tool, and the query is to return hundreds 
of thousands or millions of results, you'll begin receiving rows immediately, 
and the SQL tool will display these rows within a fraction of a second. 
However, it could then take 20 minutes to scroll through the entire set of 
results.    When using psycopg2, result rows are buffered fully.   You won't 
see any result until all result rows, as well as all data in each column, are 
fully fetched into memory.   This causes the appearance of a vast difference in 
speed between a SQL console tool and a Python script.




 
 also if any of the individual columns are very large BLOBs or perhaps very 
 large PG arrays that would add to the overhead of a fetch.
 
 There definitely are columns of PG arrays ~4000 elements each, so back
 to my first email it seems like the culprit here could be the ARRAYs.
 
 Thanks for your help,
 
 Adrian
 
 On Jul 19, 10:10 am, Michael Bayer mike...@zzzcomputing.com wrote:
 On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:
 
 
 
 
 
 Hi,
 
 I was just wondering at first whether there was a known issue with
 ARRAY types, but here is the query:
 spectra =
 session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
 == spPlateFilename).all()
 
 It should return ~1000 objects equaling about 120MB of data. In
 Python, this query takes 10 minutes to complete,  but as a SQL query
 (copying and pasting the echo'd command) it takes a few seconds:
 SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
 spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
 spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
 WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'
 
 autoflush and autocommit are both set to False.
 
 It seems like a straightforward query so I'm confused as to what could
 be getting hung up.
 
 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result? these are different 
 things.   also if any of the individual columns are very large BLOBs or 
 perhaps very large PG arrays that would add to the overhead of a fetch.   
 You can also try writing a DBAPI-only script that runs the query, as well as 
 running engine.execute(myquery.statement) and fetching rows that way to see 
 if some in-object process is the factor (which is unlikely).
 
 
 
 
 
 Thanks for any insight,
 
 Adrian
 
 On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:
  You absolutely need to turn on echoing and locate the specific SQL query 
 which causes the issue.   Queries can take excessive time for a very wide 
 variety of reasons.
 
 On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:
 
 Hello --
 
 I'm working with a database populated with data originally from a file 
 structure of files that are ~150MB each. We are dealing with a lot of 
 data that is being stored in the database using the 'ARRAY' datatype, 
 specifically numeric[]. After loading some of the data into the database 
 I tried performing a query to get back some data, and comparing it with 
 code that reads directly from the file system - but the database query 
 took ~50 times longer to complete. For instance, to retrieve 100 records 
 that contain a few 4000 element arrays each using the code that reads the 
 filesystem it took less than a second, but the query on the database took 
 around 25 seconds to complete. Has anyone else had issues with array 
 types slowing down queries or does this sound more like another issue?
 
 Thanks!
 Adrian
 
 
 

Re: [sqlalchemy] problem with multi-table mapping - how to configure merge/cascade behaviour on mappers based on join/select?

2010-07-19 Thread Harry Percival
Michael, thanks, as ever, for your help.

So, I think I've managed to specify the relationships:


j = join(movies_table,md_table).join(directors_table).join(genres_table)
js = j.select(use_labels=True).alias('mdg')

r0 = relationship(Movies,
  primaryjoin=(js.c.movies_movie_id==movies_table.c.movie_id),
  foreign_keys=([js.c.movies_movie_id])
  )
r1 = relationship(Directors,

primaryjoin=(js.c.directors_director_id==directors_table.c.director_id),
  foreign_keys=([js.c.directors_director_id])
  )
r2 = relationship(Genres,
  primaryjoin=(js.c.genres_genre_id==genres_table.c.genre_id),
  foreign_keys=([js.c.genres_genre_id])
  )

mapper(MoviesAndDirectorsAndGenres,
   js,
   properties={'movie':r0,
   'director':r1,
   'genre':r2
   },
passive_updates=False)


To test it, I'm trying to create a new composite object, based on a
new movie but existing director  genre:



session=Session()

m_new = Movies()
m_new.id=8
m_new.title = 'new movie'
session.add(m_new)

d2 = session.query(Directors).get(2)
print '***director=',d2
g6 = session.query(Genres).get(6)
print '***genre=',g6

oo_new = MoviesAndDirectorsAndGenres()
oo_new.movie = m_new
oo_new.director = d2
oo_new.genre = g6

try:
#another_new = session.merge(new)
session.merge(oo_new)

but sqla is doing some really weird stuff:

2010-07-19 16:56:01,877 INFO sqlalchemy.engine.base.Engine.0x...37f0 BEGIN
2010-07-19 16:56:01,878 INFO sqlalchemy.engine.base.Engine.0x...37f0 INSERT INTO
 movies (movie_id, title, description, genre_id, release_date) VALUES
(?, ?, ?,?, ?)
2010-07-19 16:56:01,880 INFO sqlalchemy.engine.base.Engine.0x...37f0
(8, 'new movie', None, None, None)
2010-07-19 16:56:01,882 INFO sqlalchemy.engine.base.Engine.0x...37f0
SELECT directors.director_id AS directors_director_id, directors.name
AS directors_name
FROM directors
WHERE directors.director_id = ?
2010-07-19 16:56:01,885 INFO sqlalchemy.engine.base.Engine.0x...37f0 (2,)
***director= test_multitable.Directors object at 0x021E2030
2010-07-19 16:56:01,888 INFO sqlalchemy.engine.base.Engine.0x...37f0
SELECT genres.genre_id AS genres_genre_id, genres.name AS genres_name,
genres.description AS genres_description
FROM genres
WHERE genres.genre_id = ?
2010-07-19 16:56:01,891 INFO sqlalchemy.engine.base.Engine.0x...37f0 (6,)
***genre= test_multitable.Genres object at 0x021E21B0
2010-07-19 16:56:01,894 INFO sqlalchemy.engine.base.Engine.0x...37f0
INSERT INTO genres (genre_id, name, description) VALUES (?, ?, ?)
2010-07-19 16:56:01,895 INFO sqlalchemy.engine.base.Engine.0x...37f0
(6, None, None)
2010-07-19 16:56:01,898 INFO sqlalchemy.engine.base.Engine.0x...37f0 ROLLBACK

attempted to add new composite object, failed with  (IntegrityError)
PRIMARY KEY must be unique u'INSERT INTO genres (genre_id, name,
description) VALUES (?, ?, ?)' (6, None, None)

Traceback (most recent call last):
  File test_multitable.py, line 105, in module
session.commit()
  File sqlalchemy\orm\session.py, line 653, in commit
etc etc
IntegrityError: (IntegrityError) PRIMARY KEY must be unique u'INSERT INTO genres
 (genre_id, name, description) VALUES (?, ?, ?)' (6, None, None)


it seems to be trying to insert a new genre, even though I've told it to
use an existing one?

I suppose, once I've done the insert into the movies table, I could
retrieve the new composite object via a session.query, instead of
trying to do a session.merge.  But this wouldn't work for creating a
new Director object, since I also need to create an entry into the
bridging table, something I'm hoping sqla could manage for me...

rgds,
hp


On Mon, Jul 19, 2010 at 3:31 PM, Michael Bayer mike...@zzzcomputing.com wrote:

 On Jul 19, 2010, at 10:24 AM, Harry Percival wrote:

 OK, so I will treat any classes mapped to a join of multiple tables as
 being a read-only API, and manually manage the write-API using
 relationship().

 It doesn't look like I can define a relationship from the composite
 mapped class to individual tables though?


 #single table mappers:
 mapper(Movies,movies_table)
 mapper(Directors,directors_table)
 mapper(Genres,genres_table)


 j = join(movies_table,md_table).join(directors_table).join(genres_table)
 r0 = relationship(Movies,
                  primaryjoin=(movies_table.c.movie_id==md_table.c.movie_id),
                  foreign_keys=([md_table.c.movie_id]))
 r1 = 
 relationship(Directors)#,primaryjoin=(and_(movies_table.c.movie_id==md_table.c.movie_id,md_table.c.director_id==directors_table.c.director_id)))#,secondaryjoin=(md_table.c.director_id==directors_table.c.director_id),secondary=md_table)
 r2 = relationship(Genres)

 mapper(MoviesAndDirectorsAndGenres,
       j.select(use_labels=True).alias('mdg'),

 j.select(...).alias() is the selectable that you have to define all your 

Re: [sqlalchemy] problem with multi-table mapping - how to configure merge/cascade behaviour on mappers based on join/select?

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 12:04 PM, Harry Percival wrote:

 Michael, thanks, as ever, for your help.
 
 So, I think I've managed to specify the relationships:
 
 
 j = join(movies_table,md_table).join(directors_table).join(genres_table)
 js = j.select(use_labels=True).alias('mdg')
 
 r0 = relationship(Movies,
  primaryjoin=(js.c.movies_movie_id==movies_table.c.movie_id),
  foreign_keys=([js.c.movies_movie_id])
  )
 r1 = relationship(Directors,
 
 primaryjoin=(js.c.directors_director_id==directors_table.c.director_id),
  foreign_keys=([js.c.directors_director_id])
  )
 r2 = relationship(Genres,
  primaryjoin=(js.c.genres_genre_id==genres_table.c.genre_id),
  foreign_keys=([js.c.genres_genre_id])
  )
 
 mapper(MoviesAndDirectorsAndGenres,
   js,
   properties={'movie':r0,
   'director':r1,
   'genre':r2
   },
passive_updates=False)
 
 
 To test it, I'm trying to create a new composite object, based on a
 new movie but existing director  genre:
 
 
 
 session=Session()
 
 m_new = Movies()
 m_new.id=8
 m_new.title = 'new movie'
 session.add(m_new)
 
 d2 = session.query(Directors).get(2)
 print '***director=',d2
 g6 = session.query(Genres).get(6)
 print '***genre=',g6
 
 oo_new = MoviesAndDirectorsAndGenres()
 oo_new.movie = m_new
 oo_new.director = d2
 oo_new.genre = g6
 
 try:
 #another_new = session.merge(new)
session.merge(oo_new)
 
 but sqla is doing some really weird stuff:
 
 2010-07-19 16:56:01,877 INFO sqlalchemy.engine.base.Engine.0x...37f0 BEGIN
 2010-07-19 16:56:01,878 INFO sqlalchemy.engine.base.Engine.0x...37f0 INSERT 
 INTO
 movies (movie_id, title, description, genre_id, release_date) VALUES
 (?, ?, ?,?, ?)
 2010-07-19 16:56:01,880 INFO sqlalchemy.engine.base.Engine.0x...37f0
 (8, 'new movie', None, None, None)
 2010-07-19 16:56:01,882 INFO sqlalchemy.engine.base.Engine.0x...37f0
 SELECT directors.director_id AS directors_director_id, directors.name
 AS directors_name
 FROM directors
 WHERE directors.director_id = ?
 2010-07-19 16:56:01,885 INFO sqlalchemy.engine.base.Engine.0x...37f0 (2,)
 ***director= test_multitable.Directors object at 0x021E2030
 2010-07-19 16:56:01,888 INFO sqlalchemy.engine.base.Engine.0x...37f0
 SELECT genres.genre_id AS genres_genre_id, genres.name AS genres_name,
 genres.description AS genres_description
 FROM genres
 WHERE genres.genre_id = ?
 2010-07-19 16:56:01,891 INFO sqlalchemy.engine.base.Engine.0x...37f0 (6,)
 ***genre= test_multitable.Genres object at 0x021E21B0
 2010-07-19 16:56:01,894 INFO sqlalchemy.engine.base.Engine.0x...37f0
 INSERT INTO genres (genre_id, name, description) VALUES (?, ?, ?)
 2010-07-19 16:56:01,895 INFO sqlalchemy.engine.base.Engine.0x...37f0
 (6, None, None)
 2010-07-19 16:56:01,898 INFO sqlalchemy.engine.base.Engine.0x...37f0 ROLLBACK
 
 attempted to add new composite object, failed with  (IntegrityError)
 PRIMARY KEY must be unique u'INSERT INTO genres (genre_id, name,
 description) VALUES (?, ?, ?)' (6, None, None)
 
 Traceback (most recent call last):
  File test_multitable.py, line 105, in module
session.commit()
  File sqlalchemy\orm\session.py, line 653, in commit
etc etc
 IntegrityError: (IntegrityError) PRIMARY KEY must be unique u'INSERT INTO 
 genres
 (genre_id, name, description) VALUES (?, ?, ?)' (6, None, None)
 
 
 it seems to be trying to insert a new genre, even though I've told it to
 use an existing one?
 
 I suppose, once I've done the insert into the movies table, i could
 retrieve the new composite object via a session.query, instead of
 trying to do a session.merge.  but this wouldn't work for creating a
 new Director object, since I also need to create an entry into the
 bridging table, something i'm hoping sqla could manage for me...

if you say x = Foo(); x.id = 7; session.add(x), that is an INSERT, no matter 
what the ID is, and it will fail as above if that primary key already exists.   If 
you OTOH say x = session.merge(x), it will be an INSERT or an UPDATE depending 
on whether or not primary key id #7 already exists in the database.   

You can get an overview of what the various methods do at:

http://www.sqlalchemy.org/docs/session.html#id1
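
A compact illustration of that distinction, sketched against the Genres mapping 
from this thread (attribute names taken from the log output above):

    # sketch: why add() fails here while merge() does not
    g = Genres()
    g.genre_id = 6           # primary key that already exists in the genres table

    session.add(g)           # a flush would emit INSERT INTO genres ... -> IntegrityError
    session.expunge(g)       # discard the pending INSERT for the sake of the example

    g = session.merge(g)     # SELECTs genre #6 first; the existing row is reused, no INSERT
    session.commit()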




 
 rgds,
 hp
 
 
 On Mon, Jul 19, 2010 at 3:31 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 On Jul 19, 2010, at 10:24 AM, Harry Percival wrote:
 
 OK, so I will treat any classes mapped to a join of multiple tables as
 being a read-only API, and manually manage the write-API using
 relationship().
 
 It doesn't look like I can define a relationship from the composite
 mapped class to individual tables though?
 
 
 #single table mappers:
 mapper(Movies,movies_table)
 mapper(Directors,directors_table)
 mapper(Genres,genres_table)
 
 
 j = join(movies_table,md_table).join(directors_table).join(genres_table)

Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread David Gardner
Try running that query directly against the database see how long that 
takes.
Also try running explain on that query make sure it is using your 
indexes properly.
Since you are only using a single filter make sure that the 
spectrum_header.filename
has an index, and make sure your foreign key column 
spectrum.spectrum_header_pk

is indexed as well.
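
In SQLAlchemy terms those two indexes could be created roughly like this; a sketch 
only, assuming the Table objects are available as spectrum_header_table and 
spectrum_table (the index names are made up):

    # sketch: index the filtered column and the foreign key used in the join
    from sqlalchemy import Index

    Index('ix_spectrum_header_filename',
          spectrum_header_table.c.filename).create(engine)
    Index('ix_spectrum_spectrum_header_pk',
          spectrum_table.c.spectrum_header_pk).create(engine)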

On 07/19/2010 08:46 AM, Adrian Price-Whelan wrote:

does it take a few seconds to fully fetch all the results and it only gets 1000 
rows ?  or is that just to get the initial result?

I'm not sure what you mean by this - the query does return 1000 rows.

also if any of the individual columns are very large BLOBs or perhaps very 
large PG arrays that would add to the overhead of a fetch.

There definitely are columns of PG arrays ~4000 elements each, so back
to my first email it seems like the culprit here could be the ARRAYs.

Thanks for your help,

Adrian

On Jul 19, 10:10 am, Michael Bayer mike...@zzzcomputing.com wrote:

On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:

Hi,

I was just wondering at first whether there was a known issue with
ARRAY types, but here is the query:
spectra =
session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
== spPlateFilename).all()

It should return ~1000 objects equaling about 120MB of data. In
Python, this query takes 10 minutes to complete, but as a SQL query
(copying and pasting the echo'd command) it takes a few seconds:
SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'

autoflush and autocommit are both set to False.

It seems like a straightforward query so I'm confused as to what could
be getting hung up.

does it take a few seconds to fully fetch all the results and it only gets 1000 
rows ?  or is that just to get the initial result? these are different things.  
also if any of the individual columns are very large BLOBs or perhaps very 
large PG arrays that would add to the overhead of a fetch.   You can also try 
writing a DBAPI-only script that runs the query, as well as running 
engine.execute(myquery.statement) and fetching rows that way to see if some 
in-object process is the factor (which is unlikely).

Thanks for any insight,

Adrian

On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:

You absolutely need to turn on echoing and locate the specific SQL query which 
causes the issue.   Queries can take excessive time for a very wide variety of 
reasons.

On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:

Hello --

I'm working with a database populated with data originally from a file 
structure of files that are ~150MB each. We are dealing with a lot of data that 
is being stored in the database using the 'ARRAY' datatype, specifically 
numeric[]. After loading some of the data into the database I tried performing 
a query to get back some data, and comparing it with code that reads directly 
from the file system - but the database query took ~50 times longer to 
complete. For instance, to retrieve 100 records that contain a few 4000 element 
arrays each using the code that reads the filesystem it took less than a 
second, but the query on the database took around 25 seconds to complete. Has 
anyone else had issues with array types slowing down queries or does this sound 
more like another issue?

Thanks!
Adrian



--
David Gardner
Pipeline Tools Programmer
Jim Henson Creature Shop
dgard...@creatureshop.com



Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Adrian Price-Whelan
Here is some more detailed information trying the query multiple ways:

Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
file is 241MB):

time echo "SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
'spPlate-3586-55181.fits';" | psql spectradb -U postgres > tmp

real0m12.052s
user0m2.501s
sys 0m0.274s

engine.execute on the same query took ~6 seconds:
spectra = engine.execute("SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
'spPlate-3586-55181.fits';")

spectra = 
session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename == 
"spPlate-3665-55247.fits").all() clocked in at 489 seconds

Thanks, 
Adrian

On Jul 19, 2010, at 12:24 PM, David Gardner wrote:

 Try running that query directly against the database see how long that takes.
 Also try running explain on that query make sure it is using your indexes 
 properly.
 Since you are only using a single filter make sure that the 
 spectrum_header.filename
 has an index, and make sure your foreign key column 
 spectrum.spectrum_header_pk
 is indexed as well.
 
 On 07/19/2010 08:46 AM, Adrian Price-Whelan wrote:
 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result?
 
 I'm not sure what you mean by this - the query does return 1000 rows.
 
   
 also if any of the individual columns are very large BLOBs or perhaps very 
 large PG arrays that would add to the overhead of a fetch.
 
 There definitely are columns of PG arrays ~4000 elements each, so back
  to my first email it seems like the culprit here could be the ARRAYs.
 
 Thanks for your help,
 
 Adrian
 
  On Jul 19, 10:10 am, Michael Bayer mike...@zzzcomputing.com wrote:
   
 On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:
 
 
 
 
 
 
 Hi,
   
 
 I was just wondering at first whether there was a known issue with
 ARRAY types, but here is the query:
 spectra =
 session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
 == spPlateFilename).all()
   
 
 It should return ~1000 objects equaling about 120MB of data. In
 Python, this query takes10 minutes to complete,  but as a SQL query
 (copying and pasting the echo'd command) it takes a few seconds:
 SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
 spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
 spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
 WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'
   
 
 autoflush and autocommit are both set to False.
   
 
 It seems like a straightforward query so I'm confused as to what could
 be getting hung up.
   
 does it take a few seconds to fully fetch all the results and it only gets 
 1000 rows ?  or is that just to get the initial result? these are different 
 things.   also if any of the individual columns are very large BLOBs or 
 perhaps very large PG arrays that would add to the overhead of a fetch.   
 You can also try writing a DBAPI-only script that runs the query, as well 
 as running engine.execute(myquery.statement) and fetching rows that way to 
 see if some in-object process is the factor (which is unlikely).
 
 
 
 
 
 
 Thanks for any insight,
   
 
 Adrian
   
 
  On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:
   
  You absolutely need to turn on echoing and locate the specific SQL query 
 which causes the issue.   Queries can take excessive time for a very wide 
 variety of reasons.
 
 
 On Jul 16, 2010, at 12:56 PM, Adrian Price-Whelan wrote:
 
 
 Hello --
   
 
 I'm working with a database populated with data originally from a file 
 structure of files that are ~150MB each. We are dealing with a lot of 
 data that is being stored in the database using the 'ARRAY' datatype, 
 

[sqlalchemy] Change __tablename__ at runtime?

2010-07-19 Thread Wade Leftwich
I've been using Sqlalchemy to help migrate a bunch of websites into
and out of Drupal. Since a Drupal 'node' can involve fields from 10 or
12 tables, the Declarative approach has been a real timesaver.

But now they've thrown me a serious curveball. It turns out that
Drupal has a 'multisite' mode, where instead of one `node` table you
have `site1_node`, `site2_node`, etc.

I'm not going to try to do a union of `site1_node` and `site2_node` or
anything like that, but -- given that they have exactly the same
structure, is there any way I can define a Node class and specify the
__tablename__ during runtime?

Any advice appreciated; I'm prepared for the advice to be "Don't go
there."


Wade Leftwich
Ithaca, NY




[sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Adrian Price-Whelan
Also, as a follow-up - inserting these ARRAYs into the database is
very slow as well, slower than expected that is. Granted, it is
looping over 1000 objects and inserting a few 4000 element arrays for
each object, but doing one big SQL query takes considerably less time
than session.add(object), session.commit().

I thought that might be another clue, thanks!
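
Not something raised in the thread, but bulk loads like this usually go much faster 
through a single executemany-style Core insert than through per-object add()/commit(); 
a sketch, where spectrum_table, parsed_spectra and header_pk are assumptions:

    # sketch: one executemany INSERT for all rows instead of one ORM flush per object
    rows = [
        {'ra': s.ra, 'dec': s.dec, 'values': s.values, 'inv_var': s.inv_var,
         'and_mask': s.and_mask, 'or_mask': s.or_mask,
         'fiber_number': s.fiber_number, 'spectrum_header_pk': header_pk}
        for s in parsed_spectra          # parsed_spectra / header_pk assumed to exist
    ]
    engine.execute(spectrum_table.insert(), rows)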

On Jul 19, 1:53 pm, Adrian Price-Whelan adrian@gmail.com wrote:
 Here is some more detailed information trying the query multiple ways:

 Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
 file is 241MB):

 time echo SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
 spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
 spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits'; | psql spectradb -U postgres  tmp

 real    0m12.052s
 user    0m2.501s
 sys     0m0.274s

 engine.execute on the same query took ~6 seconds:
 spectra = engine.execute(SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
 spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
 spectrum_values, spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
 spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
 spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits';)

 spectra = 
 session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename 
 == spPlate-3665-55247.fits).all() clocked in at 489 seconds

 Thanks,
 Adrian

 On Jul 19, 2010, at 12:24 PM, David Gardner wrote:



  Try running that query directly against the database see how long that 
  takes.
  Also try running explain on that query make sure it is using your indexes 
  properly.
  Since you are only using a single filter make sure that the 
  spectrum_header.filename
  has an index, and make sure your foreign key column 
  spectrum.spectrum_header_pk
  is indexed as well.

  On 07/19/2010 08:46 AM, Adrian Price-Whelan wrote:
  does it take a few seconds to fully fetch all the results and it only 
  gets 1000 rows ?  or is that just to get the initial result?

  I'm not sure what you mean by this - the query does return 1000 rows.

  also if any of the individual columns are very large BLOBs or perhaps 
  very large PG arrays that would add to the overhead of a fetch.

  There definitely are columns of PG arrays ~4000 elements each, so back
   to my first email it seems like the culprit here could be the ARRAYs.

  Thanks for your help,

  Adrian

   On Jul 19, 10:10 am, Michael Bayer mike...@zzzcomputing.com wrote:

  On Jul 19, 2010, at 9:52 AM, Adrian Price-Whelan wrote:

  Hi,

  I was just wondering at first whether there was a known issue with
  ARRAY types, but here is the query:
  spectra =
  session.query(Spectrum).join(SpectrumHeader).filter(SpectrumHeader.filename
  == spPlateFilename).all()

  It should return ~1000 objects equaling about 120MB of data. In
  Python, this query takes10 minutes to complete,  but as a SQL query
  (copying and pasting the echo'd command) it takes a few seconds:
  SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra,
  spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values,
  spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk,
  spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS
  spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask,
  spectrum.or_mask AS spectrum_or_mask FROM spectrum JOIN
  spectrum_header ON spectrum_header.pk = spectrum.spectrum_header_pk
  WHERE spectrum_header.filename = 'spPlate-3586-55181.fits'

  autoflush and autocommit are both set to False.

  It seems like a straightforward query so I'm confused as to what could
  be getting hung up.

  does it take a few seconds to fully fetch all the results and it only 
  gets 1000 rows ?  or is that just to get the initial result? these are 
  different things.   also if any of the individual columns are very large 
  BLOBs or perhaps very large PG arrays that would add to the overhead of a 
  fetch.   You can also try writing a DBAPI-only script that runs the 
  query, as well as running engine.execute(myquery.statement) and fetching 
  rows that way to see if some in-object process is the factor (which is 
  unlikely).

  Thanks for any insight,

  Adrian

   On Jul 16, 10:24 pm, Michael Bayer mike...@zzzcomputing.com wrote:

   You absolutely need to turn on echoing and locate the specific SQL 
  query which causes the issue.   Queries can take excessive time for a 

[sqlalchemy] Slides from the Advanced SQLAlchemy Customization tutorial at EuroPython

2010-07-19 Thread jason kirtland
Fellow Alchemers,

I've posted the slides and code from the Advanced SQLAlchemy
Customization tutorial I presented yesterday at EuroPython 2010 in
Birmingham.  Enjoy!

http://discorporate.us/jek/talks/#d2010-07-18

Talk description: http://www.europython.eu/talks/talk_abstracts/#talk67

Cheers,
Jason




Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 1:53 PM, Adrian Price-Whelan wrote:

 Here is some more detailed information trying the query multiple ways:
 
 Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
 file is 241MB):
 
 time echo SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
 spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
 spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits'; | psql spectradb -U postgres  tmp
 
 real  0m12.052s
 user  0m2.501s
 sys   0m0.274s
 
 engine.execute on the same query took ~6 seconds:
 spectra = engine.execute(SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
 spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
 spectrum_values, spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
 spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
 spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits';)

Call fetchall() on the result to get a better idea what's going on.Here's 
the source of the ARRAY type:

def result_processor(self, dialect, coltype):
    item_proc = self.item_type.result_processor(dialect, coltype)
    if item_proc:
        def convert_item(item):
            if isinstance(item, list):
                return [convert_item(child) for child in item]
            else:
                return item_proc(item)
    else:
        def convert_item(item):
            if isinstance(item, list):
                return [convert_item(child) for child in item]
            else:
                return item
    def process(value):
        if value is None:
            return value
        return [convert_item(item) for item in value]
    return process

as you can see, if your ARRAY is of a Unicode type or similar, convert() must 
be called on each item (only happens during fetch).  This is a likely cause of 
the slowdown and you should consider what kind of converters you're using on 
your ARRAY members.
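
For context, the converter in question is driven by the item type the ARRAY column 
is declared with; a declaration along these lines is assumed (import path as of 
SQLAlchemy 0.6, column name taken from the query above):

    # sketch: a PG array column; the item type given to ARRAY decides whether
    # convert_item() above has per-element work to do on every fetched row
    from sqlalchemy import Column, Numeric
    from sqlalchemy.dialects.postgresql import ARRAY

    values = Column('values', ARRAY(Numeric))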





Re: [sqlalchemy] Change __tablename__ at runtime?

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 1:56 PM, Wade Leftwich wrote:

 I've been using Sqlalchemy to help migrate a bunch of websites into
 and out of Drupal. Since a Drupal 'node' can involve fields from 10 or
 12 tables, the Declarative approach has been a real timesaver.
 
 But now they've thrown me a serious curveball. It turns out that
 Drupal has a 'multisite' mode, where instead of one `node` table you
 have `site1_node`, `site2_node`, etc.
 
 I'm not going to try to do a union of `site1_node` and `site2_node` or
 anything like that, but -- given that they have exactly the same
 structure, is there any way I can define a Node class and specify the
 __tablename__ during runtime?
 
 Any advice appreciated; I'm prepared for the advice to be Don't go
 there.

you'd probably want to use a variant of the entity name recipe; I've added a 
declarative example at the end:

http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName

when you create an instance of a class, you need to know up front which table that 
instance is headed for.
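
A rough declarative sketch of the idea (the table and column names here are 
illustrative only, not taken from your schema):

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

def node_class_for_site(prefix):
    # build a mapped class whose __tablename__ is decided at runtime,
    # e.g. prefix="site1" maps to the site1_node table
    return type(
        "Node_%s" % prefix,   # give each generated class a distinct name
        (Base,),
        {"__tablename__": "%s_node" % prefix,
         "nid": Column(Integer, primary_key=True),
         "title": Column(String(255))})

Site1Node = node_class_for_site("site1")
Site2Node = node_class_for_site("site2")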





Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Adrian Price-Whelan
First off, thanks for your quick replies!

I will look into this, but I can tell you that the arrays are strictly numbers 
and the array columns are type numeric[]

Thanks again,
Adrian

On Jul 19, 2010, at 3:47 PM, Michael Bayer wrote:

 
 On Jul 19, 2010, at 1:53 PM, Adrian Price-Whelan wrote:
 
 Here is some more detailed information trying the query multiple ways:
 
 Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
 file is 241MB):
 
 time echo SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask 
 AS spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk 
 = spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits'; | psql spectradb -U postgres  tmp
 
 real 0m12.052s
 user 0m2.501s
 sys  0m0.274s
 
 engine.execute on the same query took ~6 seconds:
 spectra = engine.execute(SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
 spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
 spectrum_values, spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask 
 AS spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk 
 = spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits';)
 
 Call fetchall() on the result to get a better idea what's going on.Here's 
 the source of the ARRAY type:
 
def result_processor(self, dialect, coltype):
item_proc = self.item_type.result_processor(dialect, coltype)
if item_proc:
def convert_item(item):
if isinstance(item, list):
return [convert_item(child) for child in item]
else:
return item_proc(item)
else:
def convert_item(item):
if isinstance(item, list):
return [convert_item(child) for child in item]
else:
return item
def process(value):
if value is None:
return value
return [convert_item(item) for item in value]
return process
 
 as you can see, if your ARRAY is of a Unicode type or similar, convert() must 
 be called on each item (only happens during fetch).  This is a likely cause 
 of the slowdown and you should consider what kind of converters you're using 
 on your ARRAY members.
 
 



Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 3:55 PM, Adrian Price-Whelan wrote:

 First off, thanks for your quick replies!
 
 I will look into this, but I can tell you that the arrays are strictly 
 numbers and the array columns are type numeric[]

so it's going to be doing that somewhat inefficient isinstance(list) check you 
see below; this appears to be how it handles arrays with an arbitrary number of 
dimensions. This could be optimized if the ARRAY type accepted some clues as 
to how many dimensions are present. Seems a little suspicious that the 
slowdown would be 8000%, though.
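
For example, a stripped-down single-dimension variant of that processor (just a 
sketch, not what ships in 0.6) could drop the per-item isinstance() check entirely:

from sqlalchemy.dialects import postgresql

class FlatARRAY(postgresql.ARRAY):
    # hypothetical: assumes every stored value is a flat, one-dimensional list
    def result_processor(self, dialect, coltype):
        item_proc = self.item_type.result_processor(dialect, coltype)
        if item_proc is None:
            return None
        def process(value):
            if value is None:
                return value
            return [item_proc(item) for item in value]
        return process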



 
 Thanks again,
 Adrian
 
 On Jul 19, 2010, at 3:47 PM, Michael Bayer wrote:
 
 
 On Jul 19, 2010, at 1:53 PM, Adrian Price-Whelan wrote:
 
 Here is some more detailed information trying the query multiple ways:
 
 Piping the command into psql and writing to a tmp file takes 12 seconds 
 (tmp file is 241MB):
 
 time echo SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask 
 AS spectrum_or_mask FROM spectrum JOIN spectrum_header ON 
 spectrum_header.pk = spectrum.spectrum_header_pk WHERE 
 spectrum_header.filename = 'spPlate-3586-55181.fits'; | psql spectradb -U 
 postgres  tmp
 
 real0m12.052s
 user0m2.501s
 sys 0m0.274s
 
 engine.execute on the same query took ~6 seconds:
 spectra = engine.execute(SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
 spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
 spectrum_values, spectrum.spectrum_header_pk AS 
 spectrum_spectrum_header_pk, spectrum.fiber_number AS 
 spectrum_fiber_number, spectrum.inv_var AS spectrum_inv_var, 
 spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask AS 
 spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk = 
 spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits';)
 
 Call fetchall() on the result to get a better idea what's going on.
 Here's the source of the ARRAY type:
 
   def result_processor(self, dialect, coltype):
   item_proc = self.item_type.result_processor(dialect, coltype)
   if item_proc:
   def convert_item(item):
   if isinstance(item, list):
   return [convert_item(child) for child in item]
   else:
   return item_proc(item)
   else:
   def convert_item(item):
   if isinstance(item, list):
   return [convert_item(child) for child in item]
   else:
   return item
   def process(value):
   if value is None:
   return value
   return [convert_item(item) for item in value]
   return process
 
 as you can see, if your ARRAY is of a Unicode type or similar, convert() 
 must be called on each item (only happens during fetch).  This is a likely 
 cause of the slowdown and you should consider what kind of converters you're 
 using on your ARRAY members.
 
 



Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread thatsanicehatyouhave

On Jul 19, 2010, at 4:08 PM, Michael Bayer wrote:

 so its going to be doing that somewhat inefficient isinstance(list) thing 
 you see below, this appears to be how it handles arrays of arbitrary numbers 
 of dimensions. This could be optimized if the ARRAY type accepted some clues 
 as to how many dimensions are present.

There is a function in postgresql called array_dims that returns the array 
dimensions - could this be of some use? It could query the database dynamically 
to get the number of values.

  Seems a little suspicious that the slowdown would be 8000% though.

We have four columns in that table of type numeric[]. So that query returns 
1000 rows x 4 numeric[] columns of 4634 values in each array. (I work with 
Adrian.)

Cheers,
Demitri




Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 4:38 PM, thatsanicehatyouh...@mac.com wrote:

 
 On Jul 19, 2010, at 4:08 PM, Michael Bayer wrote:
 
 so its going to be doing that somewhat inefficient isinstance(list) thing 
 you see below, this appears to be how it handles arrays of arbitrary numbers 
 of dimensions. This could be optimized if the ARRAY type accepted some clues 
 as to how many dimensions are present.
 
 There is a function in postgresql called array_dims that returns the array 
 dimensions - could this be of some use? It could query the database 
 dynamically to get the number of values.

I'd just have you say ARRAY(dims=4)




 
 Seems a little suspicious that the slowdown would be 8000% though.
 
 We have four columns in that table of type numeric[]. So that query returns 
 1000 rows x 4 numeric[] columns of 4634 values in each array. (I work with 
 Adrian.)

so try this type:

class ARRAY(postgresql.ARRAY):
    def bind_processor(self, dialect):
        return None

    def result_processor(self, dialect, coltype):
        return None

that will just give you the raw psycopg2 data.



 
 Cheers,
 Demitri
 



[sqlalchemy] Re: how to run a stored procedure?

2010-07-19 Thread Lukasz Szybalski


On Jul 19, 10:29 am, Lukasz Szybalski szybal...@gmail.com wrote:
 On Jul 15, 3:35 pm, Lukasz Szybalski szybal...@gmail.com wrote:

  On Jul 15, 2:39 pm, David Gardner dgard...@creatureshop.com wrote:

   take a look 
   at:http://www.sqlalchemy.org/docs/reference/sqlalchemy/expressions.html?...

  Now, could you give me an example of it? I understand how to run func
  to get values like current timestamp or count... but how do I run a
  stored proc?

  print
  func.assp_Report_DailyTransactions(start_date='20100701',end_date='20100715')
        ??
  Is this the format? or?

 Any idea what the proper format should be?
 I was able to connect to the MS SQL database via instance name on Linux.
 I've updated the docs on how to set up your ODBC DSN connection 
 string: http://lucasmanual.com/mywiki/unixODBC

 Now that I'm connected, how do I execute it and pass in variables like start
 and end date?

Hello,

I can do this:

 l = session.execute("assp_ReportDailyTransactions")

but this returns all of the data.

In both pyODBC and ceODBC I can run the following which will return
proper data range based on start and end date parameters.

a = cursor.execute("assp_ReportDailyTransactions @start_date=?, @end_date=?",
"20100701", "20100719")

but how do I convert that to the SQLAlchemy format?

This gives the following error:
 session.execute("assp_ReportDailyTransactions @start_date=?, @end_date=?",
 "20100701", "20100719")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/session.py", line 732, in execute
    engine = self.get_bind(mapper, clause=clause, **kw)
  File "/home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/session.py", line 837, in get_bind
    c_mapper = mapper is not None and _class_to_mapper(mapper) or None
  File "/home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/util.py", line 636, in _class_to_mapper
    raise exc.UnmappedClassError(class_or_mapper)
sqlalchemy.orm.exc.UnmappedClassError: Class ''20100719'' is not mapped

Let me know,
Thanks,
Lucas




Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread thatsanicehatyouhave
Hi Michael,

Assuming I understood you correctly, I tried the code below. The result was the 
same (the query took 486 seconds). Since I autoload everything, I first adjust 
the column types to the class you defined. Did I misunderstand something?

Thanks again for your help.

Cheers,
Demitri

---

from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import Column
from sqlalchemy.types import Numeric

...

class ARRAY(postgresql.ARRAY):
    def bind_processor(self, dialect):
        return None

    def result_processor(self, dialect, coltype):
        return None

for c in Spectrum.__table__.get_children():
    if type(c) is Column:
        if c.name in ["values", "inv_var", "and_mask", "or_mask"]:  # numeric[] columns
            c.type = ARRAY(Numeric)
            print c.name

spPlateFilename = "spPlate-3586-55181.fits"
spectra = session.query(Spectrum).join(SpectrumHeader).\
    filter(SpectrumHeader.filename == spPlateFilename).all()





Re: [sqlalchemy] Re: how to run a stored procedure?

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 5:38 PM, Lukasz Szybalski wrote:

 
 
 On Jul 19, 10:29 am, Lukasz Szybalski szybal...@gmail.com wrote:
 On Jul 15, 3:35 pm, Lukasz Szybalski szybal...@gmail.com wrote:
 
 On Jul 15, 2:39 pm, David Gardner dgard...@creatureshop.com wrote:
 
 take a look 
 at:http://www.sqlalchemy.org/docs/reference/sqlalchemy/expressions.html?...
 
 Now, Could you give me an example of it ? I understand how to run func
 to get values like current timestamp, or countbut how to run a
 stored proc?
 
 print
 func.assp_Report_DailyTransactions(start_date='20100701',end_date='20100715')
   ??
 Is this the format? or?
 
 Any idea what the proper format should be?
 I was able to connect to the mssql database vi instance name on linux.
 I've updated the docs on how to setup your odbc dsn connection 
 string:http://lucasmanual.com/mywiki/unixODBC
 
 Now that I'm connected how do I execute, pass in variables like start
 and end date?
 
 Hello,
 
 I can do this:
 
 l=session.execute(assp_ReportDailyTransactions)
 
 but this returns all the data,
 
 In both pyODBC and ceODBC I can run the following which will return
 proper data range based on start and end date parameters.
 
 a=cursor.execute(assp_ReportDailyTransactions @start_date=?,
 @end_date=?,20100701,20100719)
 
 but how do I convert that to sqlalchemy like format:

Session.execute() accepts strings that are converted to text():


session.execute("assp_ReportDailyTransactions @start_date=:start, @end_date=:end",
                params={'start': 20100701, 'end': 20100719})
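
or, spelled with an explicit text() construct (the same thing, just more verbose):

from sqlalchemy import text

stmt = text("assp_ReportDailyTransactions @start_date=:start, @end_date=:end")
result = session.execute(stmt, params={'start': 20100701, 'end': 20100719})
rows = result.fetchall()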





 
 This gives the following error:
 session.execute(assp_ReportDailyTransactions @start_date=?,
 @end_date=?,20100701,20100719)
 Traceback (most recent call last):
  File stdin, line 1, in module
  File /home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/
 SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/session.py, line 732, in
 execute
engine = self.get_bind(mapper, clause=clause, **kw)
  File /home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/
 SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/session.py, line 837, in
 get_bind
c_mapper = mapper is not None and _class_to_mapper(mapper) or None
  File /home/lucas/tmp/sql2008/env/lib/python2.5/site-packages/
 SQLAlchemy-0.6.3-py2.5.egg/sqlalchemy/orm/util.py, line 636, in
 _class_to_mapper
raise exc.UnmappedClassError(class_or_mapper)
 sqlalchemy.orm.exc.UnmappedClassError: Class ''20100719'' is not
 mapped
 
 Let me know,
 Thanks,
 Lucas
 



Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 5:46 PM, thatsanicehatyouh...@mac.com wrote:

 Hi Michael,
 
 Assuming I understood you correctly, I tried the code below. The result was 
 the same (the query took 486 seconds). Since I autoload everything, I first 
 adjust the column types to the class you defined. Did I misunderstand 
 something?


nope.   you'd have to produce some profiling results for us or a script that 
illustrates the issue.
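
something along these lines is usually enough (a rough sketch, assuming the 
session and mapped classes from earlier in the thread):

import cProfile
import pstats

def run_query():
    session.query(Spectrum).join(SpectrumHeader).\
        filter(SpectrumHeader.filename == "spPlate-3586-55181.fits").all()

# dump stats to a file, then show the 20 most expensive calls by cumulative time
cProfile.runctx("run_query()", globals(), locals(), "prof_stats")
pstats.Stats("prof_stats").sort_stats("cumulative").print_stats(20)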


 
 Thanks again for your help.
 
 Cheers,
 Demitri
 
 ---
 
 from sqlalchemy.dialects import postgresql
 from sqlalchemy.schema import Column
 from sqlalchemy.types import Numeric
 
 ...
 
 class ARRAY(postgresql.ARRAY):
  def bind_processor(self, dialect):
  return None
 
  def result_processor(self, dialect, coltype):
  return None
 
 for c in Spectrum.__table__.get_children():
   if type(c) is Column:
   if c.name in [values, inv_var, and_mask, or_mask]: # 
 numeric[] columns
   c.type = ARRAY(Numeric)
   print c.name
 
 spPlateFilename = spPlate-3586-55181.fits
 spectra = session.query(Spectrum).join(SpectrumHeader).\
   filter(SpectrumHeader.filename == spPlateFilename).all()
 
 



Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 5:46 PM, thatsanicehatyouh...@mac.com wrote:

 Hi Michael,
 
 Assuming I understood you correctly, I tried the code below. The result was 
 the same (the query took 486 seconds). Since I autoload everything, I first 
 adjust the column types to the class you defined. Did I misunderstand 
 something?

here try this type:



class ARRAY(postgresql.ARRAY):
    def bind_processor(self, dialect):
        return None

    def result_processor(self, dialect, coltype):
        return None

    def adapt(self, impltype):
        return self






Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Conor
On 07/19/2010 02:47 PM, Michael Bayer wrote:
 On Jul 19, 2010, at 1:53 PM, Adrian Price-Whelan wrote
 Here is some more detailed information trying the query multiple ways:

 Piping the command into psql and writing to a tmp file takes 12 seconds (tmp 
 file is 241MB):

 time echo SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS spectrum_ra, 
 spectrum.dec AS spectrum_dec, spectrum.values AS spectrum_values, 
 spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask 
 AS spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk 
 = spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits'; | psql spectradb -U postgres  tmp

 real 0m12.052s
 user 0m2.501s
 sys  0m0.274s

 engine.execute on the same query took ~6 seconds:
 spectra = engine.execute(SELECT spectrum.pk AS spectrum_pk, spectrum.ra AS 
 spectrum_ra, spectrum.dec AS spectrum_dec, spectrum.values AS 
 spectrum_values, spectrum.spectrum_header_pk AS spectrum_spectrum_header_pk, 
 spectrum.fiber_number AS spectrum_fiber_number, spectrum.inv_var AS 
 spectrum_inv_var, spectrum.and_mask AS spectrum_and_mask, spectrum.or_mask 
 AS spectrum_or_mask FROM spectrum JOIN spectrum_header ON spectrum_header.pk 
 = spectrum.spectrum_header_pk WHERE spectrum_header.filename = 
 'spPlate-3586-55181.fits';)
 
 Call fetchall() on the result to get a better idea what's going on.
 [...]
   

Assuming fetchall() brings the execution time back to ~486 seconds, it
is likely your bottleneck is psycopg2's str-to-Decimal conversion. A big
part of this is that Python's Decimal class is written in pure Python.
You can override psycopg2's type conversion[1], but the only easy speed
increase I see is to leave the values as strings. Parsing them as floats
may help as well.
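
The recipe in [1] amounts to registering a global typecaster, roughly like this
(assuming float precision is acceptable for these columns; the numeric[] array
columns may need an equivalent caster registered for the array type as well):

import psycopg2.extensions

# tell psycopg2 to hand back floats instead of Decimal for NUMERIC columns
DEC2FLOAT = psycopg2.extensions.new_type(
    psycopg2.extensions.DECIMAL.values,
    'DEC2FLOAT',
    lambda value, curs: float(value) if value is not None else None)
psycopg2.extensions.register_type(DEC2FLOAT)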

-Conor

[1] http://initd.org/psycopg/docs/faq.html#problems-with-type-conversions




Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread thatsanicehatyouhave
Hi,

Pasted below is a profile of the earlier code. I did update it with your new 
definition of ARRAY, Michael, but that only shaved off 18 seconds (down to 
468s total) when run without the profiler.

The large number of __new__ calls roughly tracks with the number of numeric 
values coming back from the database - 1000 rows x 4 columns x 4634 per array, 
or 18,536,000 new objects. It looks like Conor's suggestion that the bottleneck 
is the str-to-Decimal conversion is correct. (This would explain why our 
initial imports were taking an extremely long time as well...).

I intend to use lazy loading on these columns, but before that I'm certainly 
open to any suggestions to cut the time down significantly! We'll also look 
into psycopg2's Decimal class.

If the type in the database is numeric[], how can I retrieve it as a string?

Thanks again for the help!

Cheers,
Demitri



Mon Jul 19 20:29:16 2010    prof_stats

         194856576 function calls (194856471 primitive calls) in 538.608 CPU seconds

   Ordered by: cumulative time
   List reduced from 460 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.968    1.968  538.608  538.608 <string>:1(<module>)
        1    0.000    0.000  536.430  536.430 ./examples/db_test.py:56(main)
        1    0.036    0.036  536.414  536.414 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/orm/query.py:1447(all)
     1001    0.001    0.000  530.894    0.530 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/orm/query.py:1619(instances)
        1    0.000    0.000  529.213  529.213 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:2379(fetchall)
        1    0.000    0.000  529.211  529.211 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:2360(_fetchall_impl)
        1   31.366   31.366  529.211  529.211 {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
 18538000  124.784    0.000  497.845    0.000 /System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/decimal.py:511(__new__)
 18551033  290.353    0.000  290.353    0.000 {built-in method __new__ of type object at 0xf6360}
 18538021   45.568    0.000   45.568    0.000 {built-in method match}
 74152000   17.219    0.000   17.219    0.000 {built-in method group}
 18538630    9.908    0.000    9.908    0.000 {isinstance}
 18538000    6.356    0.000    6.356    0.000 {method 'lstrip' of 'str' objects}
        1    0.000    0.000    5.484    5.484 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/orm/query.py:1560(__iter__)
        1    0.000    0.000    5.484    5.484 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/orm/query.py:1567(_execute_and_instances)
        1    0.000    0.000    5.484    5.484 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/orm/session.py:703(execute)
        1    0.000    0.000    5.483    5.483 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:1148(execute)
        1    0.000    0.000    5.483    5.483 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:1212(_execute_clauseelement)
        1    0.000    0.000    5.482    5.482 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:1255(__execute_context)
        1    0.000    0.000    5.482    5.482 /Library/Python/2.6/site-packages/SQLAlchemy-0.6.3-py2.6.egg/sqlalchemy/engine/base.py:1351(_cursor_execute)


Time spent within each function
Mon Jul 19 20:29:16 2010    prof_stats

         194856576 function calls (194856471 primitive calls) in 538.608 CPU seconds

   Ordered by: internal time
   List reduced from 460 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 18551033  290.353    0.000  290.353    0.000 {built-in method __new__ of type object at 0xf6360}
 18538000  124.784    0.000  497.845    0.000 /System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/decimal.py:511(__new__)
 18538021   45.568    0.000   45.568    0.000 {built-in method match}
        1   31.366   31.366  529.211  529.211 {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
 74152000   17.219    0.000   17.219    0.000 {built-in method group}
 18538630    9.908    0.000    9.908    0.000 {isinstance}
 18538000    6.356    0.000    6.356    0.000 {method 'lstrip' of 'str' objects}
        1    5.482    5.482    5.482    5.482 {method 'execute' of 'psycopg2._psycopg.cursor' objects}
 18538016    2.852    0.000    2.852    0.000 {method 'strip' of 'str' objects}
        1    1.968    1.968  538.608  538.608 <string>:1(<module>)
  9270145    1.487    0.000    1.487    0.000 {len}
     4000    0.634    0.000    0.634    0.000 

Re: [sqlalchemy] Re: Unknown Issue Causing Extremely Slow Query

2010-07-19 Thread Michael Bayer

On Jul 19, 2010, at 8:33 PM, thatsanicehatyouh...@mac.com wrote:

 Hi,
 
 Pasted below is a profile of the earlier code posted. I did update it with 
 your new definition of ARRAY Michael, but that only shaved off 18 seconds 
 (down to 468s total) when run without the profiler.
 
 The large number of __new__ calls roughly tracks with the number of numeric 
 values coming back from the database - 1000 rows x 4 columns x 4634 per 
 array, or 18,536,000 new objects. It looks like Conor's suggestion that the 
 bottleneck is the str-to-Decimal conversion is correct. (This would explain 
 why our initial imports were taking an extremely long time as well...).
 
 I intend to use lazy loading on these columns, but before that I'm certainly 
 open to any suggestions to cut the time down significantly! We'll also look 
 into psycopg2's Decimal class.
 
 If the type in the database is numeric[], how can I retrieve it as a string?

psycopg2 returns the Decimal (it's a Python built-in) natively. You register a 
new type adapter as they mention here: 
http://initd.org/psycopg/docs/faq.html#problems-with-type-conversions. However, 
I don't know whether their float() recipe calls float() on a string or on a 
Decimal. I'm pretty sure that if you used the FLOAT type instead of NUMERIC (i.e. 
change your table in the DB, or use CAST), it would return floats directly without 
the Decimal conversion.
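
If changing the column type isn't convenient, a query-time CAST is one way to 
sketch the same idea (names assume the mapped classes from earlier in the thread; 
treat this as an untested sketch):

from sqlalchemy import Float, cast
from sqlalchemy.dialects import postgresql

# select the array column CAST to float[] so psycopg2 hands back plain floats
q = session.query(
        Spectrum.pk,
        cast(Spectrum.values, postgresql.ARRAY(Float)).label("spectrum_values")).\
    join(SpectrumHeader).\
    filter(SpectrumHeader.filename == "spPlate-3586-55181.fits")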




 
 Thanks again for the help!
 
 Cheers,
 Demitri
 
 
 

[sqlalchemy] Creating a custom Visitable

2010-07-19 Thread Mike Lewis
I'm interested in creating a visitable (more specifically, a
SchemaItem) to support EXTENDS when creating tables.  How would one
go about this?

Thanks,
Mike
