[sqlalchemy] Referencing DELETE table from nested statement

2014-12-03 Thread Stefan Urbanek
Hi,

How can I reference a table that I am deleting from in an inner/nested 
statement?

Here is my simplified original version, with no inner condition:

rownum = sql.expression.over(sql.func.row_number(),
                             partition_by=table.c.some_id,
                             order_by=table.c.dw_created_date.desc())
rownum = rownum.label("rownum")

condition = table.c.dw_valid_to_date == None

primary = sql.expression.select([table.c.dw_some_key, rownum])
primary = primary.where(condition).alias("primary")

condition = primary.c.rownum == 1
unique = sql.expression.select([primary.c.dw_some_key])
unique = unique.where(condition).alias("unique")
condition = table.c.dw_some_key.notin_(unique)
delete = sql.expression.delete(table,
                               whereclause=condition)

Which produces something like the following (manually formatted, with comments 
added about my intention):

DELETE FROM foo a
WHERE a.dw_key NOT IN
(
SELECT primary.dw_key 
FROM (
SELECT foo.dw_key AS dw_key,
row_number() OVER (PARTITION BY foo.some_id
 ORDER BY foo.dw_created_date DESC) AS rownum 
FROM foo 
WHERE foo.dw_valid_to_date IS NULL

-- Need to add this:
-- AND a.some_id = foo.some_id

) AS primary 
WHERE primary.rownum = 1
)

I want to add the condition commented-out in the above statement. My attempt 
was to alias the table:

Aliased version with inner condition:

inner = table.alias("inner")
rownum = sql.expression.over(sql.func.row_number(),
                             partition_by=inner.c.some_id,
                             order_by=inner.c.dw_created_date.desc())
rownum = rownum.label("rownum")

condition = inner.c.dw_valid_to_date == None
condition = and_(inner.c.dw_valid_to_date == None,
                 inner.c.some_id == table.c.some_id)

primary = sql.expression.select([inner.c.dw_some_key, rownum])
primary = primary.where(condition).alias("primary")

# Select unique and filter non-unique

condition = primary.c.rownum == 1
unique = sql.expression.select([primary.c.dw_some_key])
unique = unique.where(condition).alias("unique")
condition = table.c.dw_some_key.notin_(unique)
delete = sql.expression.delete(table,
                               whereclause=condition)


However, this produces a Cartesian product of the aliased table and the original table inside the inner statement:

DELETE FROM footable
WHERE footable.dw_some_key NOT IN (
SELECT primary.dw_some_key
FROM (
SELECT inner.dw_some_key AS dw_some_key, row_number()
OVER (PARTITION BY inner.some_id
  ORDER BY inner.dw_created_date DESC) AS rownum 
#
# Here is the problem introduced by the alias:
#
FROM footable AS inner, footable 
WHERE inner.dw_valid_to_date IS NULL
AND inner.some_id = footable.some_id) AS primary 
WHERE primary.rownum = 1

You can see that the inner condition is not referencing the table used in the 
DELETE statement, but a second, independent reference to the same table.

How can I construct a statement where I can reference the tables as I need to 
in my case?

Thanks for any hints,

Stefan



Re: [sqlalchemy] Referencing DELETE table from nested statement

2014-12-03 Thread Michael Bayer
why don’t you use a NOT EXISTS correlated subquery?  that way the subquery can 
refer to the outer table completely, rather than having just one column where 
you can call IN.
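
roughly, the shape would be something like the following (untested sketch; the table/column names are taken from the original post, and the exact dedup condition may need adjusting to match the row_number() version -- here it is phrased as EXISTS for a newer row, i.e. the reverse of NOT EXISTS):

from sqlalchemy import sql

inner = table.alias("inner")

# EXISTS a newer valid row with the same some_id, correlated against the
# row being deleted, so the outer table is not re-added to the inner FROM.
newer_exists = sql.exists().where(sql.and_(
    inner.c.some_id == table.c.some_id,
    inner.c.dw_valid_to_date == None,
    inner.c.dw_created_date > table.c.dw_created_date,
)).correlate(table)

delete = table.delete().where(newer_exists)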





[sqlalchemy] Handling big Python objects

2014-12-03 Thread Andrea Gavana
Hello list,

sorry for the possibly noob question, I've googled around without much 
success looking for an answer. Basically, I am given a series of instances of 
a huge Python class (a Simulation object), each of which contains an enormous 
amount of information - when I cPickle it (with the highest protocol), it can 
result in files 200-250 MB in size, although rarely it can get up to 2 GB. I 
am looking for intelligent ways to store these objects in a database. I have 
to say that I don't have that much control over this Simulation class, so I 
can't change its internal structure - I'm just looking for a better 
alternative to what I am doing.

So, what I am doing now is basically storing this huge object as a string. 
I have these two methods:

import cPickle
import zlib
import base64

def serialize(my_simulation):
    my_db_object = base64.b64encode(zlib.compress(
        cPickle.dumps(my_simulation, cPickle.HIGHEST_PROTOCOL)))
    return my_db_object

def deserialize(my_db_object):
    my_simulation = cPickle.loads(zlib.decompress(base64.b64decode(my_db_object)))
    return my_simulation


I can use them to store/retrieve these big Python objects to/from the 
database, but I feel it's not a particularly effective way to handle this 
problem. I've tried to get my head around BLOBs and LargeBinary stuff, but 
I'm not sure I'm on the right path here. I appreciate any suggestion on how 
to approach the problem, to make the storing/retrieving of these objects a 
bit less time/memory consuming (especially time).
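
For reference, my rough understanding of the LargeBinary route looks like the sketch below (untested; the SimulationRecord model name and the connection URL are just placeholders). Storing the compressed pickle directly as binary would at least remove the base64 step:

import cPickle
import zlib

from sqlalchemy import Column, Integer, LargeBinary, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class SimulationRecord(Base):
    __tablename__ = 'simulations'
    id = Column(Integer, primary_key=True)
    payload = Column(LargeBinary)   # raw compressed pickle, no base64

engine = create_engine('mysql://user:password@host/dbname')   # placeholder URL
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

def save_simulation(session, simulation):
    # Compress the pickle and store the bytes directly in the binary column.
    data = zlib.compress(cPickle.dumps(simulation, cPickle.HIGHEST_PROTOCOL))
    record = SimulationRecord(payload=data)
    session.add(record)
    session.commit()
    return record

def load_simulation(session, record_id):
    record = session.query(SimulationRecord).get(record_id)
    return cPickle.loads(zlib.decompress(record.payload))

(On MySQL the plain BLOB type tops out at 64 KB, so a LONGBLOB column and a sufficiently large max_allowed_packet would presumably also be needed for objects of this size.)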

On a related note, I am now using MySQL as a backend - but I am open to 
suggestions about possible alternatives that may make this problem more 
tractable: I have some difficulties in installing other backends (like 
PostgreSQL or psycopg2), but I know I can use Access and Oracle backends. I 
know that Access is not particularly famous in the SQLAlchemy world. Of 
course, if any other backend has advantages over MySQL in dealing with this 
issue, I may try to convince IT to get it installed on our machines.

All suggestions and comments are most welcome.

Thank you in advance for your help.

Andrea.



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread AM
I would recommend just storing them on disk and letting the OS VMM deal with 
caching for speed. If you are not constrained for space, I would recommend 
not zlib-ing them either.


AM



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Ed Rahn
What do you want to be able to do with the objects? I'd recommend storing 
them as files and then referencing the filename in the database.
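
Something along these lines, as a rough, untested sketch (the model, columns and storage directory are just placeholders):

import os
import uuid
import cPickle

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
STORAGE_DIR = '/srv/simulations'   # placeholder storage location

class SimulationFile(Base):
    __tablename__ = 'simulation_files'
    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    path = Column(String(500))     # where the pickle lives on disk

def save_simulation(session, name, simulation):
    # Write the pickle to disk and keep only its path in the database.
    filename = os.path.join(STORAGE_DIR, uuid.uuid4().hex + '.pkl')
    with open(filename, 'wb') as handle:
        cPickle.dump(simulation, handle, cPickle.HIGHEST_PROTOCOL)
    record = SimulationFile(name=name, path=filename)
    session.add(record)
    session.commit()
    return record

def load_simulation(session, record_id):
    record = session.query(SimulationFile).get(record_id)
    with open(record.path, 'rb') as handle:
        return cPickle.load(handle)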


Ed




Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Jonathan Vanasco


On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote:

 I would recommend just storing them on disk and let the OS VMM deal with 
 caching for speed. If you are not constrained for space I would 
 recommend not zlib-ing it either. 


I'll second storing them to disk.  Large object support in all the 
databases is a pain and not very optimal.  Just pickle/unpickle a file and 
use the db to manage that file.



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Andrea Gavana


On Wednesday, December 3, 2014 10:42:27 PM UTC+1, Jonathan Vanasco wrote:



 On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote:

 I would recommend just storing them on disk and let the OS VMM deal with 
 caching for speed. If you are not constrained for space I would 
 recommend not zlib-ing it either. 


 I'll second storing them to disk.  Large object support in all the 
 databases is a pain and not very optimal.  Just pickle/unpickle a file and 
 use the db to manage that file.



Thanks to all of you who replied. A couple of issues that I'm sure I will 
encounter by leaving the files on disk:

1. Other users can easily delete/overwrite/rename the files on disk, which 
is something we really, really do not want;
2. The whole point of a database was to have everything centralized in one 
place, not leaving the simulation files scattered around like a mess across 
the whole network drive;
3. As an aside, not zlib-ing the files saves about 5 seconds/simulation 
(out of a 20-second save) but increases the database size by a factor of 4. 
I'll have to check if this is OK.

Thank you again for your interest.

Andrea.

 



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Claudio Freire
On Wed, Dec 3, 2014 at 7:23 PM, Andrea Gavana andrea.gav...@gmail.com wrote:
 3. As an aside, not zlib-ing the files saves about 5 seconds/simulation
 (over a 20 seconds save) but increases the database size by 4 times. I'll
 have to check if this is OK.

You can usually specify a compression level of 1 to get a considerable
speedup if you care about speed more than disk efficiency.
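
Something like this (quick sketch; 'simulation' stands in for the object being saved):

import zlib
import cPickle

data = cPickle.dumps(simulation, cPickle.HIGHEST_PROTOCOL)

fast = zlib.compress(data, 1)    # level 1: much faster, somewhat larger output
small = zlib.compress(data, 9)   # level 9: slowest, smallest output

zlib's default level (6) sits in between; level 1 is usually a big speedup for a fairly modest increase in compressed size.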



[sqlalchemy] How to handle dynamically-changing tables (gaining and losing columns)

2014-12-03 Thread Tanner Semerad
I'm creating a web application where the underlying tables will be changing 
during the operation of the application. 

Each user is registered to an agency. Each agency has its own schema of 
data tables. Users log in to edit their data within their agency's schema.

Example:
my_agency_schema:
  - table_one
- col_a
- col_b
  - table_two
  - table_three

The initial version of the application was straightforward - all tables 
and columns were static, hard-coded. Any given table always had the same 
columns. But now we want to give the user the ability to add or remove 
certain columns. Let's say a user adds a new column to table_one - how does 
the SQLAlchemy model now know to look for that? *How do I keep my 
SQLAlchemy models in line with the changes happening to the tables* (or is 
it even possible)?

I tested using database reflection, and got it working in sort of a hacky 
workaround that I am not proud of. With my hack, each CRUD request 
generates a new SQLAlchemy model that uses introspection to reflect the 
columns of the table, then uses an instance of that model to perform the 
update. It is fairly resource-intensive to reflect the table each time, so 
I really don't think I should go this route.
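
The hack looks roughly like this (simplified, Core-level sketch; the connection URL and the names are placeholders, and the real code builds an ORM model on top of the reflected table):

from sqlalchemy import MetaData, Table, create_engine

engine = create_engine('postgresql://user:password@host/dbname')   # placeholder

def get_table(schema, table_name):
    # Reflect the table's current columns from the database on every call.
    metadata = MetaData()
    return Table(table_name, metadata, autoload=True,
                 autoload_with=engine, schema=schema)

# e.g. build an UPDATE against whatever columns exist right now
table_one = get_table('my_agency_schema', 'table_one')
stmt = table_one.update().where(table_one.c.col_a == 'old').values(col_b='new')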

I considered using a table (or cache like Redis) to keep track of what 
columns each table of each agency has so that the table doesn't need to be 
reflected each time, but that sounds like a large (and somewhat messy) 
undertaking, and I want to be sure what I implement actually makes sense to 
do first.
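
In-process, that could look something like the sketch below (again just a sketch, reusing the placeholder engine from above): cache the reflected Table objects and drop an entry whenever the application itself issues the ALTER TABLE:

from sqlalchemy import MetaData, Table

_table_cache = {}

def get_table_cached(schema, table_name):
    key = (schema, table_name)
    if key not in _table_cache:
        # 'engine' is the same placeholder Engine as in the previous sketch.
        _table_cache[key] = Table(table_name, MetaData(), autoload=True,
                                  autoload_with=engine, schema=schema)
    return _table_cache[key]

def invalidate_table(schema, table_name):
    # Call this right after the application runs ALTER TABLE ADD/DROP COLUMN.
    _table_cache.pop((schema, table_name), None)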

To put it simply, I want to know what makes sense to do in a situation like 
this, where table columns are being added and removed during program 
execution. I'm not sure if SQLAlchemy is meant to be used for a problem 
like this. How would you handle such a problem? 



Re: [sqlalchemy] Referencing DELETE table from nested statement

2014-12-03 Thread Stefan Urbanek
Thanks,

I used exists (reversed condition):

condition = primary.c.rownum != 1
unique = sql.expression.exists([1],
   whereclause=condition)
delete = sql.expression.delete(table,
   whereclause=unique)

I'm still getting "FROM footable AS inner, footable" for the inner-most 
statement. Note that I have deeper nesting – the "primary" select is already 
nested, because I can't use window functions in the WHERE clause.

Note that I need to have the inner.key = to_be_deleted.key condition in the 
inner-most SELECT, not in the EXISTS portion. See my 'condition' for 
'primary' in my example above (there is one stray line that should be 
ignored).

Or how did you mean to use the exists() construct?

Stefan

On Wednesday, December 3, 2014 2:05:32 PM UTC-5, Michael Bayer wrote:

 why don’t you use a NOT EXISTS correlated subquery?  that way the subquery 
 can refer to the outer table completely, rather than having just one column 
 where you can call IN. 




Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread AM

On 12/3/14 2:23 PM, Andrea Gavana wrote:



On Wednesday, December 3, 2014 10:42:27 PM UTC+1, Jonathan Vanasco wrote:



On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote:

I would recommend just storing them on disk and let the OS VMM
deal with
caching for speed. If you are not constrained for space I would
recommend not zlib-ing it either. 



I'll second storing them to disk.  Large object support in all the
databases is a pain and not very optimal.  Just pickle/unpickle a
file and use the db to manage that file.



Thanks to all of you who replied. A couple of issues that I'm sure I 
will encounter by letting the files on disk:


1. Other users can easily delete/overwrite/rename the files on disk, 
which is something we really, really do not want;


If this is Windows, group policies are your friends :). If this is Linux, 
permissions plus a secondary service to access the files are a decent 
choice.


2. The whole point of a database was to have everything centralized in 
one place, not leaving the simulation files scattered around like a 
mess in the whole network drive;


The last time I did this, a post-processing step in my data pipeline 
organized the files into a multi-level folder structure based on the 
first x characters of their SHA-1.
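
Roughly along these lines (illustrative sketch only; the storage root and fan-out are arbitrary):

import os
import shutil
import hashlib

STORAGE_ROOT = '/srv/simulation_store'   # arbitrary example root

def shard_path(filename):
    # Derive a two-level directory from the file's SHA-1, e.g. ab/cd/abcd....pkl
    sha1 = hashlib.sha1()
    with open(filename, 'rb') as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), ''):
            sha1.update(chunk)
    digest = sha1.hexdigest()
    return os.path.join(STORAGE_ROOT, digest[:2], digest[2:4], digest + '.pkl')

def store_file(filename):
    target = shard_path(filename)
    directory = os.path.dirname(target)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    shutil.move(filename, target)
    return target   # persist this path in the database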


3. As an aside, not zlib-ing the files saves about 5 
seconds/simulation (over a 20 seconds save) but increases the database 
size by 4 times. I'll have to check if this is OK.


Whether to use compression or not depends on your needs. If the difference 
is that stark, I would highly recommend compression.


HTH
AM


Thank you again for your interest.

Andrea.



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Andrea Gavana
Hi,

On Thursday, December 4, 2014 12:02:42 AM UTC+1, Ams Fwd wrote:

 On 12/3/14 2:23 PM, Andrea Gavana wrote: 
  
  
  On Wednesday, December 3, 2014 10:42:27 PM UTC+1, Jonathan Vanasco 
 wrote: 
  
  
  
  On Wednesday, December 3, 2014 4:23:31 PM UTC-5, Ams Fwd wrote: 
  
  I would recommend just storing them on disk and let the OS VMM 
  deal with 
  caching for speed. If you are not constrained for space I would 
  recommend not zlib-ing it either. 
  
  
  I'll second storing them to disk.  Large object support in all the 
  databases is a pain and not very optimal.  Just pickle/unpickle a 
  file and use the db to manage that file. 
  
  
  
  Thanks to all of you who replied. A couple of issues that I'm sure I 
  will encounter by letting the files on disk: 
  
  1. Other users can easily delete/overwrite/rename the files on disk, 
  which is something we really, really do not want; 

 If this is windows group policies are your friends :). If this is linux, 
 permissions with a secondary service to access the files are a decent 
 choice. 



Yes, this is Windows, but no, I can't go around and tell the users that the 
simulation they just saved is not accessible anymore. The database is part 
of a much larger user interface application, where users can generate 
simulations and then decide whether or not they are relevant enough to be 
stored in the database. At a rate of 300 MB per simulation (or more), it 
quickly becomes a size issue.



  2. The whole point of a database was to have everything centralized in 
  one place, not leaving the simulation files scattered around like a 
  mess in the whole network drive; 

 The last time I did it a post processing step in my data pipeline 
 organized the files based on a multi-level folder structure based on the 
 first x-characters of their sha1. 



Again, I am dealing with non-Python people - and in general with people who 
are extremely good at what they do but who don't care about the overall IT 
architecture - as long as it works and it is recognizable from a Windows 
Explorer point of view. A file-based approach is unfortunately not a good 
option in the current setup.

 


  3. As an aside, not zlib-ing the files saves about 5 
  seconds/simulation (over a 20 seconds save) but increases the database 
  size by 4 times. I'll have to check if this is OK. 
  
 To use compression or not depends on your needs. If the difference in 
 time consumed is so stark, I would highly recommend compression. 



I will probably go that way; 5 seconds more or less does not make that much 
of a difference overall. I just wish the backends didn't complain when I 
pass them cPickled objects (bytes instead of strings...).


Andrea.
 



Re: [sqlalchemy] Handling big Python objects

2014-12-03 Thread Jonathan Vanasco
If you need to handle large objects, you should look into DBAPI drivers 
that can stream results.  The only drivers I know of that can handle a stream 
are psycopg2 (postgres) and oursql (mysql). 

There have been a handful of recipes/threads of people using streams for 
blobs in the archives.

I would still do a filesystem-based approach.  You wouldn't have to b64 
your files, and could use external tools to compress (i.e., bzip2, which would 
take longer but compresses more).   Your users should never know where 
this data is; it should be treated as a hidden/secure storage area that 
only the app uses -- just like your MySQL.  
