[sqlalchemy] Infer (and create) Schema from Example Object/Dictionary
My search skills are failing me, and I hope you all can help. (Apologies that there is some heresy here.) Assumptions: 1) Suppose I have objects made from JSON (dicts of strings, lists of dicts, etc.). 2) For simplicity, assume these nestings don't go very deep. 3) Getting this right 90% of the time is fine, and it can assume there are only 1:1 and 1:many relationships. Question: Is there a tool, or what is the easiest way, to create / autogenerate a sensible schema based on the object? I am looking for a function with this sort of signature: obj = {'id': 1, 'name': 'Gregg', 'events': [{'ts': 129292939392, 'what': 'keypress'}, {'ts': 129292939394, 'what': 'click'}]} def gen_schemas_and_create_table_statements(obj=obj, primary_key='id') that would give something like: * 2 (or maybe 3) tables: table1: pk id, string name; table2 (events): foreign id, ts, what, along with the SQL to create them. (This is inspired by all the grossness I deal with in Hive.) Thanks! GL -- You received this message because you are subscribed to the Google Groups sqlalchemy group. To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/tuAozc5iqfEJ. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
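I don't know of a ready-made tool; as a strawman, here is a tiny pure-Python sketch of the kind of function described above (the `gen_create_tables` name and the naive INTEGER/TEXT type mapping are invented for illustration): scalars become columns, and lists of dicts become child tables with a foreign key back to the parent.

```python
def gen_create_tables(obj, name='table1', primary_key='id'):
    """Very naive: scalars -> columns, lists of dicts -> child tables."""
    def sqltype(value):
        # crude two-type mapping; real data would need more cases
        return 'INTEGER' if isinstance(value, int) else 'TEXT'

    statements, columns, children = [], [], []
    for key, value in obj.items():
        if isinstance(value, list) and value and isinstance(value[0], dict):
            children.append((key, value[0]))  # infer child schema from first row
        else:
            pk = ' PRIMARY KEY' if key == primary_key else ''
            columns.append('%s %s%s' % (key, sqltype(value), pk))
    statements.append('CREATE TABLE %s (%s)' % (name, ', '.join(columns)))
    for child, sample in children:
        # 1:many child table carries a reference back to the parent
        child_cols = ['%s_%s INTEGER REFERENCES %s' % (name, primary_key, name)]
        child_cols += ['%s %s' % (k, sqltype(v)) for k, v in sample.items()]
        statements.append('CREATE TABLE %s (%s)' % (child, ', '.join(child_cols)))
    return statements

obj = {'id': 1, 'name': 'Gregg',
       'events': [{'ts': 129292939392, 'what': 'keypress'},
                  {'ts': 129292939394, 'what': 'click'}]}
for stmt in gen_create_tables(obj):
    print(stmt)
```

This ignores type conflicts across rows and deeper nesting, but it matches the "right 90% of the time" bar set above.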
[sqlalchemy] Postgresql Partition / INHERIT and table names....
Suppose this is my table: a_table = Table( 'a_table', metadata, Column('ts', Integer, index=True, nullable=False), Column('country', String, index=True, nullable=False), Column('somestat', Integer, nullable=False), PrimaryKeyConstraint('ts', 'country', name='summary_pk'), ) then: print select([a_table]).where(a_table.c.country=='de') SELECT a_table.ts, a_table.country, a_table.somestat FROM a_table WHERE a_table.country = %(country_1)s Suppose further that in Postgres there are several partitions that INHERIT from this main table, and I want to query them directly... what is a sensible way to do this? I can figure out *which* partition to use based on outside code, so it doesn't need to be very smart. As a final answer I want something like: print SOMETHING SELECT a_table_PARTITION1.ts, a_table_PARTITION1.country, a_table_PARTITION1.somestat FROM a_table_PARTITION1 WHERE a_table_PARTITION1.country = %(country_1)s In my head, it is like: use a_table, except sub in a new value for a_table.name TEMPORARILY.
[sqlalchemy] Re: Postgresql Partition / INHERIT and table names....
Thank you! I figured a compile visitor might be the right way in, but had no idea of how to do it! Some tutorials just on the visitors would probably explain a lot about how PG works! Cheers! GL On Sep 13, 2:14 pm, Michael Bayer mike...@zzzcomputing.com wrote: this is not very hard so I created a recipe for this case: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/PartitionTable
[sqlalchemy] SQL Expression lang; change table in from clause after query construction; copying queries; printing queries with filled params
suppose: summary_table = Table( 'summary', metadata, Column('ts', Integer, index=True, nullable=False), Column('url', String, index=True, nullable=False), Column('hits', Integer, nullable=False), PrimaryKeyConstraint('ts', 'url', 'hits', name='summary_pk'), ) And a query like: s = summary_table.c q = select([ s.url, s.ts, ]).\ where(s.site == bindparam('url')).\ where(s.ts == bindparam('ts')) * how to 'copy' a query. copy.copy(q) seems to be inadequate. * how to print it, with params filled in. str(q) isn't quite enough. (I know this has been covered before, but I can't seem to find it, and if it's not in the docs, it should be!) The query is bound to an engine already. My desired goal is to see the actual SQL (with filled, quoted params) that would get sent to the engine. If I had this, I could always just use a regex to change the table. * how to change the table being queried. My specific subcase is for an inherited table. I tried this, but it seems very dirty: q._froms[0].name = 'summary_1279234800' Is there a general method I should use? (Also, this will be fine if I can get the 'copy' business to work.) As usual, SA is great, and I'm doing rude, mean things to it, so if there are simpler ways out of this mess (wanting simple query construction, but one that I can alter to use on specific Postgres INHERIT tables), please inform me.
[sqlalchemy] Re: SQL Expression lang; change table in from clause after query construction; copying queries; printing queries with filled params
Thanks for the advice! One minor nit. At least in my experience, str(bound query) doesn't fill the params, or do quoting properly. Here is a demonstration: fake_table = Table( 'faketable', metadata, Column('ts', Integer, index=True, nullable=False), Column('url', String, index=True, nullable=False), Column('hits', Integer, nullable=False), PrimaryKeyConstraint('ts', 'url', 'hits', name='summary_pk'), ) s = fake_table.c q = select([ s.url, s.ts, ]).\ where(s.url == bindparam('url')).\ where(s.ts == bindparam('ts')).\ where(s.hits > 100) assert fake_table.metadata.bind.name == 'postgresql' # it's bound assert str(q) == \ "SELECT faketable.url, faketable.ts FROM faketable WHERE faketable.url = %(url)s AND faketable.ts = %(ts)s AND faketable.hits > %(hits_1)s" As you can see, the 'url' isn't quoted, which is a problem! Also, the (hits_1) parameter isn't filled in, even though it's already determined. What I would ideally like to see is this: what_would_happen(q, **some_dict): SELECT faketable.url, faketable.ts FROM faketable WHERE faketable.url = 'http://mypage.com/index.html' AND faketable.ts = 1829292929 AND faketable.hits > 100 If I had this string repr with filled params, I could just do a string sub / regex, and go all the way into hackery! On Aug 4, 10:20 am, Michael Bayer mike...@zzzcomputing.com wrote: On Aug 4, 2010, at 10:34 AM, Gregg Lind wrote: * how to 'copy' a query. copy.copy(q) seems to be inadequate select() has a _generate() method that is used internally for generative operations.
But select() supports the generative interface specifically so that you can treat it as an immutable structure, and copying should not be necessary (I use it occasionally when I want to attach some additional state to a select and not affect the original, though that is already a hacky situation). * how to print it, with params filled in. str(q) isn't quite enough. The query is bound to an engine already. My desired goal is to see the actual SQL (with filled, quoted params) that would get sent to the engine. If the query is bound to an engine, meaning it's against a Table whose MetaData is bound to the engine, then str(q) will invoke the compiler for that engine's dialect and you will get the exact SQL that would be emitted. If not, you call q.compile(bind=myengine) or q.compile(dialect=somedialect). * how to change the table being queried. My specific subcase is for an inherited table. I tried this, but it seems very dirty: q._froms[0].name = 'summary_1279234800' Is there a general method I should use? This is not the right way to go. select()s are really not designed for mutation, even though they have some mutative capabilities (which is mostly for performance reasons). If you want a select that is against some other table, you need a new select() object. Additionally, above you're even modifying your Table construct, which, if you have a typical case with Table objects declared at the module level, definitely isn't going to work very well (you could do it with ad-hoc table() constructs, perhaps). The canonical way to create new selects out of old ones that are different is to use clause transformation. In this case it would be: t1 = Table('summary_table', ...) t2 = Table('summary_table_xyz', ...)
def replace(obj): if obj is t1: return t2 elif obj in t1.c: return t2.c[obj.key] else: return None from sqlalchemy.sql.visitors import replacement_traverse new_select = replacement_traverse(old_select, None, replace) The name of the table you have above there seems to suggest you have some kind of "I have a ton of tables with the same columns" thing going on, so here is a recipe for that: from sqlalchemy.sql import Alias from sqlalchemy.ext.compiler import compiles class InhTable(Alias): def __init__(self, table, name): Alias.__init__(self, table, table.name + "_" + name) @compiles(InhTable) def compile(element, compiler, **kw): table_name = compiler.process(element.original, **kw
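The replacement_traverse recipe above, fleshed out into a self-contained sketch against a current SQLAlchemy (contains_column() is used in place of the bare `in` test, since the modern ColumnCollection membership test expects string keys):

```python
from sqlalchemy import Table, Column, Integer, String, MetaData, select
from sqlalchemy.sql.visitors import replacement_traverse

metadata = MetaData()
t1 = Table('summary', metadata,
           Column('ts', Integer), Column('url', String), Column('hits', Integer))
t2 = Table('summary_1279234800', metadata,
           Column('ts', Integer), Column('url', String), Column('hits', Integer))

old_select = select(t1.c.url, t1.c.ts).where(t1.c.hits > 100)

def replace(obj):
    # swap the parent table, and each of its columns, for the partition's
    if obj is t1:
        return t2
    elif isinstance(obj, Column) and t1.c.contains_column(obj):
        return t2.c[obj.key]
    return None

new_select = replacement_traverse(old_select, {}, replace)
print(new_select)
```

The original select is left untouched; replacement_traverse returns a rewritten copy, which is exactly the "copy plus change the table" combination asked about.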
[sqlalchemy] Re: SQL Expression lang; change table in from clause after query construction; copying queries; printing queries with filled params
Thank you for the more detailed explanation! I will do some experiments with it! Gregg On Aug 4, 12:39 pm, Michael Bayer mike...@zzzcomputing.com wrote: On Aug 4, 2010, at 1:26 PM, Gregg Lind wrote: As you can see, the 'url' isn't quoted, which is a problem! But, that is exactly what is sent to the DBAPI. The quoting happens at the earliest in the DBAPI layer. Some DBAPIs don't ever quote anything; the binds are sent separately for some backends, and the database server itself handles interpolation internally. If you turn on your PG logs to log SQL, you'd see the quoting affair is pretty unpleasant, so it's critical that DBAPIs handle this. The parameters are available from the compiled object as the params collection.
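Later SQLAlchemy versions added a supported way to do exactly what was asked here: the literal_binds compiler flag, which inlines bound values into the rendered SQL for debugging (never for executing untrusted input). A sketch:

```python
from sqlalchemy import table, column, select

t = table('faketable', column('url'), column('ts'), column('hits'))
stmt = (select(t.c.url, t.c.ts)
        .where(t.c.url == 'http://mypage.com/')
        .where(t.c.hits > 100))

# render the statement with parameter values inlined (debugging only)
compiled = stmt.compile(compile_kwargs={"literal_binds": True})
print(compiled)
```

Note that bindparam() placeholders without a value attached cannot be inlined this way; give them values first (e.g. via stmt.params(...)).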
[sqlalchemy] Postgresql OVER clauses
I'd like to use the postgresql OVER constructions, as seen at http://www.postgresql.org/docs/8.4/static/tutorial-window.html. Some of the things I tried: # gives an extra comma, which is fail. print select([s.ts, "PARTITION OVER( ts )"]) SELECT summary.ts, PARTITION OVER( ts ) FROM summary # more sophisticated examples: http://www.sqlalchemy.org/trac/ticket/1844 None of the examples at http://www.sqlalchemy.org/docs/reference/ext/compiler.html match the situation very well. In particular, it seems hard to make new clauses like an OVER clause such that they can be easily chained onto 'Executable' things like select(), etc. Ideally, I'd like: select([some_table]).over(some_table.c.some_field).group_by(...) To do this now, it seems to me like you have to subclass 'select' (adding an @_generative over() method) and rewire the compiler some. Maybe I'm talking out my more foolish hole as well :) Please correct me if I'm overthinking this. Gregg
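Window function support did land later (the ticket above tracked it): function elements grew an .over() method, so this now works without any compiler surgery. A sketch of the eventual usage, against a current SQLAlchemy:

```python
from sqlalchemy import table, column, select, func

t = table('summary', column('ts'), column('url'), column('hits'))

# row_number() OVER (PARTITION BY url ORDER BY ts), labeled 'rn'
stmt = select(
    t.c.ts,
    func.row_number().over(partition_by=t.c.url, order_by=t.c.ts).label('rn'),
)
print(stmt)
```

The OVER clause attaches to the function rather than to the select(), which matches how SQL itself scopes window functions.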
[sqlalchemy] PG copy_expert, and sessions
I've been trying to use pg's copy_expert semantics with sessions, but something is eluding me. Observations: 1. It seems like I have to wrap the copy_expert cursor in a transaction and commit() in order to make it actually fire (see pgsql_copy below). 2. When I try wrapping it in session.begin() / commit() (as commented out below), nothing happens. The 'real problem' I'm tackling is, in a rollbackable way: 1. drop and recreate a table 2. fill it with data using 'copy' If this is possible, how do I get it all to play nice? Should I use an engine instead of a session? If it's not possible, that's sad news. Ref: http://groups.google.com/group/sqlalchemy/browse_thread/thread/6219e5e040dbbe2f/fbbdee1d70d7a00b?lnk=gstq=copy_expert#fbbdee1d70d7a00b === def data_to_tmp(data, dlm="\t"): '''given some tuples of data, return a filehandle to a file filled with data Args: data: iterable of stringable fields. dlm: how fields will be joined in the output. Return: ofh: opened handle to NamedTemporaryFile, at pos 0. Raises: ValueError, TypeError: fields aren't map(str,x)-able (various) >>> data = [(1,2,3)] >>> fh = data_to_tmp(data, dlm=',') >>> print fh.read() 1,2,3 <BLANKLINE> >>> fh = data_to_tmp(data, dlm=' | ') >>> print fh.read() 1 | 2 | 3 <BLANKLINE> ''' ofh = NamedTemporaryFile('w+') for d in data: print >>ofh, dlm.join(map(str, d)) ofh.flush() ofh.seek(0) return ofh def pgsql_copy(session, tablename, string_buffer): SQL_COPY_FROM = '''COPY %(tablename)s FROM STDIN USING DELIMITERS E'\t' ''' % dict(tablename=tablename) connection = session.bind.connect() # NOTE: without being wrapped in a transaction / commit, cursor.copy_expert WILL NOT WORK transaction = connection.begin() # session.begin(subtransactions=True) print "(in pgsql_copy)" try: # We need the raw psycopg cursor, hidden deep within SA's abstractions.
cursor = connection.connection.cursor().cursor print string_buffer.read() string_buffer.seek(0) cursor.copy_expert(SQL_COPY_FROM, string_buffer) # transaction.commit() # set_trace() session.commit() print 'yay' except Exception, exc: session.rollback() # transaction.rollback() print exc raise return True ===
[sqlalchemy] Re: Troubleshooting 'too many clients already'
For completeness: the issue was that I had a function that took a dburi and then created another engine, and it was called repeatedly. Once I changed it to take an *engine*, the problem cleared up. (Side tip: psql has a "show all" command that is useful for seeing the config info on machines where one can't see the config file directly.) Thanks again for the help! Gregg On Jun 17, 8:04 pm, Michael Bayer mike...@zzzcomputing.com wrote: "del engine" suggests you're not doing things as intended, unless you're in some very specific kind of testing scenario. An application should have just one create_engine() call for the life of the whole process, per database backend. Since it's a connection pool, it is not intended to be thrown away before the application is completed. By setting the max size on the pool used by your single engine, the total connections that can be open will be throttled and you won't get a "too many clients" error. Though if you have connections that are not being released in enough time for the next request that needs one, you'd instead get pool timeouts.
[sqlalchemy] Troubleshooting 'too many clients already'
What built-in tools / techniques should I use when trying to troubleshoot the 'too many clients already' fatal error? 1. Connecting to PG locally, fwiw. 2. I have full privileges on the db. My connections can come from at least: - create_engine I try to be careful to del engines after I'm done with them, but maybe they're not being freed? Clearly, I don't understand the situation, and would appreciate insights. Thanks in advance! GL
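The one-engine-per-process fix described in the follow-up can be sketched like this (the pool numbers and the fetch_one helper are illustrative, and SQLite stands in for PG so the snippet is self-contained):

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# one engine (and therefore one bounded connection pool) per process;
# total open connections are capped at pool_size + max_overflow
engine = create_engine('sqlite://', poolclass=QueuePool,
                       pool_size=5, max_overflow=10)

def fetch_one(engine):
    # functions take the engine, never a dburi: connections are checked
    # out from, and returned to, the single shared pool
    with engine.connect() as conn:
        return conn.exec_driver_sql("SELECT 1").scalar()

print(fetch_one(engine))
```

Creating a fresh engine inside a frequently-called function does the opposite: each call builds a new pool whose connections linger, which is exactly the "too many clients" failure mode above.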
[sqlalchemy] server_version_info
Is there a nice way to get server_version_info from an existing connection or engine? Right now it looks quite buried in (for pg): sqlalchemy.databases.postgres.PGDialect().server_version_info(myconnection).
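In later SQLAlchemy versions this got friendlier: after the first connection, the dialect caches the value as engine.dialect.server_version_info. A sketch (SQLite is used here only so it runs anywhere; the attribute is the same for the Postgres dialect):

```python
from sqlalchemy import create_engine

engine = create_engine('sqlite://')
with engine.connect():
    pass  # the first connect lets the dialect probe the server version

# a tuple such as (3, 39, 2) for SQLite, or (14, 5) for Postgres
print(engine.dialect.server_version_info)
```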
[sqlalchemy] Declarative issues, with compound foreign key
What I think I'm seeing is that an object can be created even without its ForeignKeyConstraint being filled. To run the test code below: $ dropdb test18; createdb test18; python testcode.py This builds on http://groups.google.com/group/sqlalchemy/browse_thread/thread/eb240f3f2555a5e7/ . I understand that the foreign table can't create the referent (and finding the best idiom for use one if it exists or create one is yet to be determined). What I truly don't understand is how any instances of Product can be created, since there is a FK constraint that is not fulfilled. 1. Is the foreign key constraint fulfilled? 2. Is there a good create the referent if it doesn't exist, else use it idiom? 3. Is the polymorphic table business complicating it? It seems like the compound primary key for A1String is. from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import CheckConstraint, ForeignKey, MetaData, PrimaryKeyConstraint from sqlalchemy import ForeignKeyConstraint from sqlalchemy import Table, Column, Integer, Boolean, Unicode, String from sqlalchemy.orm import relation, backref from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from sqlalchemy.schema import DDL import sys ECHO = bool((sys.argv + [False])[1]) ## utilities for connecting the db, printing it, etc.
def print_schema(T="postgres", Base=None): '''print_schema will print the schema in use''' from StringIO import StringIO buf = StringIO() engine = create_engine('%s://' % T, strategy='mock', executor=lambda s, p='': buf.write(str(s) + p)) Base.metadata.create_all(engine) return buf.getvalue() def db_setup(connstring='sqlite:///:memory:', echo=False, Base=None): engine = create_engine(connstring, echo=echo) Session = sessionmaker(bind=engine, autoflush=False, autocommit=False) session = Session() Base.metadata.bind = engine Base.metadata.create_all() return session, engine def _class_repr(self): '''print our SA class instances in a nicer way''' # ugly, use sparingly, may have performance hit d = [(k, v) for k, v in self.__dict__.iteritems() if k[0] != "_"] d = sorted(d, key=lambda x: x[0].lower()) return "%s, %s" % (self.__class__, d) Base = declarative_base() class Polystring(Base): __tablename__ = 'strings' id = Column(Integer, nullable=False, primary_key=True) string = Column(String, nullable=False, primary_key=True) origin = Column(String, nullable=False, primary_key=True) __mapper_args__ = {'polymorphic_on': origin} # subtype of string class A1String(Polystring): __mapper_args__ = {'polymorphic_identity': 'a1'} products = relation('Product', order_by="Product.id") class Product(Base): __tablename__ = 'product' __table_args__ = ( ForeignKeyConstraint(['regstring_id', 'regstring', 'regstring_type'], [A1String.id, A1String.string, A1String.origin], on), {} ) id = Column(Integer, primary_key=True) regstring_id = Column(Integer) regstring = Column(String) regstring_type = Column(String, default="asn") ## test code session, eng = db_setup("postgres:///test18", Base=Base, echo=ECHO) add = session.add q = session.query c = session.commit r = _class_repr A = Product(id=192832, regstring="some part id") print r(A) add(A) c() # commit print map(r, q(Product).all()) print "somehow this managed to get in, without making a polystring, which it should be referencing."
assert len(q(Polystring).all()) > 0, "So, where is the polystring?"
[sqlalchemy] Re: Declarative issues, with compound foreign key
Thank you both for the advice. Dern NULLs causing trouble again. GL On Tue, Sep 15, 2009 at 4:34 PM, Conor conor.edward.da...@gmail.com wrote: On Sep 15, 4:08 pm, Michael Bayer mike...@zzzcomputing.com wrote: on is not defined: ForeignKeyConstraint(['regstring_id', 'regstring', 'regstring_type'], [A1String.id, A1String.string, A1String.origin], on), When removing on, the row inserts with regstring_id as NULL. PG appears to accept this, so I would assume PG considers a three-column foreign key with one NULL to be NULL. If I try it with all three columns not null, then you get the constraint error. SQLAlchemy itself relies upon the database to enforce constraints. In this case you should have the NOT NULL constraint on the Product columns. To expand on this: Most (all?) databases default to a MATCH SIMPLE policy for foreign key constraints: if any FK column is NULL then the FK constraint is satisfied (regardless of the actual values of the non-null columns). It looks like you want MATCH FULL behavior: if some but not all FK columns are NULL, then the FK constraint fails. Assuming you really do need the FK columns to be nullable, you have to either add MATCH FULL to your DDL (probably have to use DDL() + ALTER TABLE; also requires that your database actually supports MATCH FULL) or add a check constraint that mimics the MATCH FULL behavior, e.g.: (regstring_id IS NULL) = (regstring IS NULL) = (regstring_type IS NULL)
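The CHECK-constraint workaround Conor describes can be attached in table metadata like this sketch (the constraint body mirrors his expression, split into two pairwise comparisons; SQLite is used only so the snippet is self-contained):

```python
from sqlalchemy import (Table, Column, Integer, String, MetaData,
                        CheckConstraint, create_engine, insert)

metadata = MetaData()
product = Table(
    'product', metadata,
    Column('id', Integer, primary_key=True),
    Column('regstring_id', Integer),
    Column('regstring', String),
    Column('regstring_type', String),
    # the three FK columns must be NULL together or non-NULL together,
    # approximating MATCH FULL semantics
    CheckConstraint(
        "(regstring_id IS NULL) = (regstring IS NULL) "
        "AND (regstring IS NULL) = (regstring_type IS NULL)",
        name='regstring_match_full'),
)

engine = create_engine('sqlite://')
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(product).values(id=1))  # all three NULL: accepted
```

A row with only some of the three columns set now fails the check, which is the silent-insert case from the original post.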
[sqlalchemy] Re: making an instance read-only
Based on Michael Bayer's hint, I built this instance-level changeable object. It uses Declarative, which actually makes it a bit tougher, but this code should work for pypo's also. It's hackish in that it just monkeys with __setattr__, but it's clear(ish) what's happening. - from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import Integer, String, Column Base = declarative_base() class NotEditableError(AttributeError): pass class Thingy(Base): __tablename__ = 'thingy' field = Column(String, nullable=False, primary_key=True) def __init__(self, field='', editable=True): self.field = field self.editable = editable # must be set last, or __setattr__ will have problems def __setattr__(self, attr, value): # catch before editable is defined try: editable = self.editable except AttributeError: editable = True # the only 'always editable' attribute is editable if attr == "editable" or editable: # it's all good Base.__setattr__(self, attr, value) else: raise NotEditableError, "for this object, fields are not editable" def Test(): T = Thingy('abc', False) assert T.field == 'abc' try: T.field = "pdq" except NotEditableError, exc: pass T.editable = True T.field = "pdq" assert T.field == 'pdq' return T
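The same trick with the ORM stripped away, for the pypo case mentioned above (a Python 3 sketch):

```python
class NotEditableError(AttributeError):
    pass

class Thingy:
    def __init__(self, field='', editable=True):
        self.field = field
        self.editable = editable  # must be set last

    def __setattr__(self, attr, value):
        # 'editable' itself can always be flipped; everything else is
        # writable only while editable is (or is not yet) True
        if attr == 'editable' or getattr(self, 'editable', True):
            object.__setattr__(self, attr, value)
        else:
            raise NotEditableError("for this object, fields are not editable")
```

getattr with a default replaces the try/except dance from the Declarative version, since during __init__ the editable attribute does not exist yet.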
[sqlalchemy] Re: index in SA
If you need some SQL that is not supported by UniqueConstraint, PrimaryKeyConstraint and friends, you can use the DDL statement:

# DDL to only run on postgres, incorporating declarative style
from sqlalchemy.schema import DDL
DDL('''ALTER TABLE orgdata ADD CONSTRAINT lowername_check
       CHECK (lowername !~ '[[:upper:]]')''',
    on='postgres').execute_at('after-create', Orgdata.__table__)

cf: http://groups.google.com/group/sqlalchemy/browse_thread/thread/82827f877ff853a4?hl=en

On Thu, Aug 20, 2009 at 11:35 AM, rajasekhar911 rajasekhar...@gmail.com wrote:

Hi, is it possible to add an index to my table using sqlalchemy? Or should I do it directly on the database? Thanks.
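For the original question, plain indexes (as opposed to the DDL escape hatch above) can be declared straight in SQLAlchemy. A sketch against a current SQLAlchemy release; the table and index names are invented for illustration:

```python
from sqlalchemy import (Column, Index, Integer, MetaData, String, Table,
                        create_engine, inspect)

metadata = MetaData()
orgdata = Table(
    "orgdata", metadata,
    Column("id", Integer, primary_key=True),
    # index=True is shorthand for a single-column index named ix_orgdata_lowername
    Column("lowername", String, index=True),
)
# a composite index can also be declared explicitly against the columns
Index("ix_orgdata_id_lowername", orgdata.c.id, orgdata.c.lowername)

engine = create_engine("sqlite://")
metadata.create_all(engine)   # emits CREATE TABLE plus the CREATE INDEX statements

index_names = {ix["name"] for ix in inspect(engine).get_indexes("orgdata")}
print(sorted(index_names))
```

So for ordinary indexes there is no need to fall back to raw DDL at all; `create_all()` handles them alongside the tables.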
[sqlalchemy] Can child-side of a one-to-many create the parent side?
I have a many-to-one relation, let's say between Address and User, both created using Declarative. I would like it that if an Address references a User, it will create that User if it doesn't exist. Otherwise, it should use the existing one. There should be no addresses in the table that don't have users. *It could be that my whole sense of modeling is fouled up here, and I appreciate correction.*

1. Is this possible using the current _cascade_ tools? If so, how is it set up?
2. If not, then what is the best way to handle this?
3. From the docs for relation, it seems like the cascading is all viewed from the parent (one) side, is that correct?

Thanks! Gregg

-

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relation, backref

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    name = Column(String, primary_key=True)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "User('%s')" % (self.name)

class Address(Base):
    __tablename__ = 'addresses'
    email_address = Column(String, nullable=False, primary_key=True)
    user_name = Column(ForeignKey('users.name'), nullable=False, primary_key=True)
    user = relation(User,
                    backref=backref('addresses', order_by=email_address),
                    cascade='all')

    def __init__(self, email_address, uname=None):
        self.email_address = email_address
        self.user_name = uname

    def __repr__(self):
        return "Address('%s'), for User('%s')" % (self.email_address, self.user_name)

def db_setup(connstring='sqlite:///:memory:', echo=False):
    engine = create_engine(connstring, echo=echo)
    session = sessionmaker(bind=engine, autoflush=False, autocommit=False)()
    Base.metadata.bind = engine
    Base.metadata.create_all()
    return session, engine

session, engine = db_setup('postgres:///test_test', True)
A = Address('m...@place', 'Daniel Waterhouse')
session.add(A)
session.commit()
# fails because there is no User named 'Daniel Waterhouse'? I want one created.
# The user_name is enough info to do so. Otherwise, use the existing one.
[sqlalchemy] Re: Can child-side of a one-to-many create the parent side?
One quick follow-up, re: question 2: if I create a User there, and that user already exists in the DB, won't that cause problems? I ask because this situation seems analogous to a create-if-not-existing situation. Thanks for the (as ever) fast response. GL

On Fri, Aug 14, 2009 at 5:20 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:
I have a many-to-one relation, let's say between Address and User, both created using Declarative. I would like it that if an Address references a User, it will create that User if it doesn't exist. Otherwise, it should use the existing one. There should be no addresses in the table that don't have users. *It could be that my whole sense of modeling is fouled up here, and I appreciate correction.*

1. Is this possible using the current _cascade_ tools? If so, how is it set up?

no, since you're talking about instantiating an object that doesn't exist otherwise.

2. If not, then what is the best way to handle this?

the constructor of your Address says, self.user = User(args you want). If that's too early, you can write a SessionExtension.before_flush() that scans through and ensures all Address objects have a default User() as desired.

3. From the docs for relation it seems like the cascading is all viewed from the parent (one) side, is that correct?

all relation()s feature cascading. A two-way relationship consists of two separate relation()s; backref is just a shortcut to creating them and linking them together. parent and child are relative to whichever relation() you're referring to.
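The "user might already exist in the DB" worry is usually handled with a get-or-create helper that consults the session (and its identity map) before instantiating. A sketch against modern SQLAlchemy — `Session.get()` needs 1.4 or later, the helper name is invented, and concurrent writers racing on the same key are not handled here:

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    name = Column(String, primary_key=True)

def get_or_create_user(session, name):
    # check the identity map / database before instantiating a duplicate
    user = session.get(User, name)
    if user is None:
        user = User(name=name)
        session.add(user)
    return user

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)

a = get_or_create_user(session, "Daniel Waterhouse")
session.commit()
b = get_or_create_user(session, "Daniel Waterhouse")
assert a is b   # second call finds the persisted row instead of duplicating it
```

An Address constructor (or a before_flush hook, per the reply above) can call such a helper instead of blindly constructing a new User.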
[sqlalchemy] Re: Declarative, correlated subqueries
Mike, I totally appreciate the help, but it's just not working for me. I feel like you've given tons of time to trying to fix this, so if anyone else wants to step in and hit me with the clue stick, that would be delightful.

More details: db's tried: postgres, sqlite. sqlalchemy version: 0.5.5 on 64-bit Centos.

--

connstring = 'postgres:///test_a'

def demo():
    session.query(Route).delete()
    for t, h, ts, s in samples:
        session.add(Route(target=t, hop_id=h, ts=ts, startpoint=s))
    session.flush()
    session.commit()
    sq = session.query(Route.ts, Route.startpoint, Route.target,
                       func.max(Route.hop_id).label('max_hop'))\
                .group_by(Route.ts, Route.startpoint, Route.target).subquery()
    q = session.query(Route, sq.c.max_hop).join(sq, sq.c.max_hop == Route.hop_id)
    q.all()

---

Which gives:

ArgumentError: Can't find any foreign key relationships between 'route' and '%(47624668442128 anon)s'

All the other code is as in the first email. I'm not sure what's supposed to be happening, but something clearly isn't right, and I'm clearly having some core model grokking fail. In addition:

print join(Route, sq, Route.hop_id == sq.c.max_hop)

<class '__main__.Route'> JOIN (SELECT route.ts AS ts, route.startpoint AS startpoint, route.target AS target, max(route.hop_id) AS max_hop FROM route GROUP BY route.ts, route.startpoint, route.target) AS anon_1 ON route.hop_id = anon_1.max_hop

This doesn't seem to connect any of the other parts, other than the hop id.

On Thu, Jul 23, 2009 at 8:13 PM, Michael Bayer mike...@zzzcomputing.com wrote:

On Jul 23, 2009, at 8:43 PM, Gregg Lind wrote:

Hm. I appreciate the help, but something is clearly still failing here.

session.query(Route, *sq.c).join(sq.c.max_hop)
ArgumentError: Can't find any foreign key relationships between 'route' and 'max_hop'

Maybe the filter-based solution is just fine here :)

that's not the call sig for query.join(). if you're dealing with SQL expression components, it's join((selectable, onclause), ...), so here join((sq, sq.c.max_hop==Route.hop_id)).
[sqlalchemy] Re: Declarative, correlated subqueries
Thank you! That tuple thing was a fail on my part, clearly. Doing it exactly as you describe still doesn't get things to be, for lack of a better term, correlated. This, however, achieves what I want:

session.query(Route, sq.c.max_hop).join(
    (sq, and_(Route.hop_id == sq.c.max_hop, Route.ts == sq.c.ts,
              Route.startpoint == sq.c.startpoint,
              Route.target == sq.c.target))).all()

This seems no different than the filter-based approach outlined in the initial code. Is this not the point of correlated subqueries, or am I missing something? If so, how do I achieve it? Thanks again for all the help, and for making SqlA such a great (and powerful!) product. Thanks! GL

On Fri, Jul 24, 2009 at 11:59 AM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:

sq = session.query(Route.ts, Route.startpoint, Route.target,
                   func.max(Route.hop_id).label('max_hop'))\
            .group_by(Route.ts, Route.startpoint, Route.target).subquery()
q = session.query(Route, sq.c.max_hop).join(sq, sq.c.max_hop == Route.hop_id)
q.all()

join takes tuples in this form: join((sq, sq.c.max_hop == Route.hop_id))
[sqlalchemy] Declarative, correlated subqueries
I have read over
http://www.sqlalchemy.org/docs/05/ormtutorial.html#using-subqueries and
http://www.mail-archive.com/sqlalchemy@googlegroups.com/msg11439.html, but I'm having trouble putting the pieces together. In the demo() below, I want to find the row in the database with the max hop_id for every unique combination of Route(target, startpoint, ts). The code I have there *works*, but doesn't seem to use any subquery magic at all, nothing from 'correlated' subqueries. What might I be missing? Thanks! Gregg L.

--

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import MetaData
from sqlalchemy import Table, Column, Integer, String
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy import func

Base = declarative_base()

class Route(Base):
    __tablename__ = 'route'
    target = Column(String, nullable=False, primary_key=True)
    hop_id = Column(Integer, nullable=False, primary_key=True)
    ts = Column(Integer, nullable=False, primary_key=True)
    startpoint = Column(String, nullable=False, primary_key=True)
    # a bunch of other fields
    #data = Column(String, nullable=True, primary_key=False)
    #...

    def __repr__(self):
        return "%s %s %s %s" % (self.target, self.hop_id, self.ts, self.startpoint)

connstring = 'sqlite:///:memory:'
engine = create_engine(connstring, echo=False)
session = sessionmaker(bind=engine, autoflush=False, autocommit=False)()
Base.metadata.bind = engine
Base.metadata.create_all()

samples = [
    ('T1', 1, 1000, 'S1'), ('T1', 2, 1000, 'S1'), ('T1', 3, 1000, 'S1'),
    ('T1', 1, 1000, 'S2'), ('T1', 2, 1000, 'S2'),
    ('T2', 1, 1000, 'S1'), ('T2', 2, 1000, 'S1'), ('T2', 3, 1000, 'S1'),
    ('T2', 4, 1000, 'S1'), ('T2', 1, 1500, 'S1')]

def demo():
    for t, h, ts, s in samples:
        session.add(Route(target=t, hop_id=h, ts=ts, startpoint=s))
    session.flush()
    session.commit()
    # row in the database with the max_hop for every unique combination
    # of Route(target, startpoint, ts)
    sq = session.query(Route, func.min(Route.hop_id).label('max_hop'))
    sq = sq.group_by(Route.ts, Route.startpoint, Route.target).subquery()
    q = session.query(Route, sq.c.max_hop)
    q = q.filter(Route.target == sq.c.target)
    q = q.filter(Route.startpoint == sq.c.startpoint)
    q = q.filter(Route.hop_id == sq.c.hop_id)
    q.all()
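Stripped of the SQL, what demo() is chasing is a greatest-per-group computation: for each (target, startpoint, ts) key, keep the row with the largest hop_id. In plain Python (stdlib only) it amounts to this:

```python
samples = [
    ('T1', 1, 1000, 'S1'), ('T1', 2, 1000, 'S1'), ('T1', 3, 1000, 'S1'),
    ('T1', 1, 1000, 'S2'), ('T1', 2, 1000, 'S2'),
    ('T2', 1, 1000, 'S1'), ('T2', 2, 1000, 'S1'), ('T2', 3, 1000, 'S1'),
    ('T2', 4, 1000, 'S1'), ('T2', 1, 1500, 'S1')]

# keep the row with the max hop_id for each (target, startpoint, ts) key
best = {}
for target, hop_id, ts, startpoint in samples:
    key = (target, startpoint, ts)
    if key not in best or hop_id > best[key][1]:
        best[key] = (target, hop_id, ts, startpoint)

print(sorted(best.values()))
```

The subquery-plus-join in SQL plays the role of the `best` dict here: the subquery computes the max per key, and the join (or the filters) matches each full row back to its group's max.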
[sqlalchemy] Re: Declarative, correlated subqueries
On Thu, Jul 23, 2009 at 3:24 PM, Michael Bayer mike...@zzzcomputing.com wrote:

im assuming you're using MySQL since the GROUP BY below doesn't accommodate every column in the subquery (would be rejected by most DBs).

Corrected. It was Sqlite, but good catch.

youll want to query each column individually that is part of what you are grouping by. i think you also need to use func.max() here and not func.min().

Yes, dumbpants on me there!

the join of the subquery to parent table is then probably just on hop_id. no correlation of subquery is needed either since you are intersecting two complete sets together (all routes intersected with all max hop id routes grouped by x, y, z).

How do I implement this join? If I do this:

sq = session.query(Route.ts, Route.startpoint, Route.target,
                   func.max(Route.hop_id).label('max_hop'))
sq = sq.group_by(Route.ts, Route.startpoint, Route.target).subquery()

then:

q = session.query(Route, *sq.c).join(???)

What would that join be on? hop_id isn't in the subquery. I don't mean to be dense, but I'm not quite getting your response. Perhaps I don't understand what correlated subqueries in SqlA are. Is there a reference that explains where they're used?
[sqlalchemy] Re: Declarative, correlated subqueries
Hm. I appreciate the help, but something is clearly still failing here.

session.query(Route, *sq.c).join(sq.c.max_hop)
ArgumentError: Can't find any foreign key relationships between 'route' and 'max_hop'

Maybe the filter-based solution is just fine here :)

On Thu, Jul 23, 2009 at 7:29 PM, Michael Bayer mike...@zzzcomputing.com wrote:

On Jul 23, 2009, at 5:20 PM, Gregg Lind wrote:

How do I implement this join? If I do this:

sq = session.query(Route.ts, Route.startpoint, Route.target,
                   func.max(Route.hop_id).label('max_hop'))
sq = sq.group_by(Route.ts, Route.startpoint, Route.target).subquery()

then:

q = session.query(Route, *sq.c).join(???)

What would that join be on? hop_id isn't in the subquery.

sq.c.max_hop
[sqlalchemy] Queries in historical db's, where the join depends on having an overlapping timestamp (or other filter condition)
'''
Based on the thread titled "filter relation based on column value" at
http://groups.google.com/group/sqlalchemy/browse_thread/thread/0db7423d986ba543
and some others, I'm curious about how to better get my SqlA code working with
historical (better term?) databases, where the relational state of the system
depends on when the question is being asked. Group membership, where both the
members of the group and the information about a group can change over time,
is shown in the example below.
'''
## for this simple example:
## every employee can be in at most one group during any time period
## groups can also change name over time, which doesn't change the group id
## employees don't change name during their tenure in a group
## Analysis Questions:
## 1. Show the composition of the group (with id=ii) at a given time,
##    including name and all members
## 2. Show the history of a group over time, including name changes,
##    membership changes
## 3. History for an employee, including when they change groups
'''
Questions:
1. Is there a SQLAlchemical way to write group_snapshot_ts into a declarative
   class, such that the joins and loader respect the time constraints?
   (Read-only is fine as well on the loaded attributes)
   a. eager_loader?
   b. subclassing query?
2. (Secondary, off-topic) Is there a constraint to ensure that an employee's
   time period in a group isn't *overlapping* their time period in another
   group? That is, that they are in only one group at once?
'''
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import CheckConstraint, ForeignKey, MetaData, PrimaryKeyConstraint
from sqlalchemy import ForeignKeyConstraint, UniqueConstraint
from sqlalchemy import Table, Column, Integer, Boolean, Unicode, String
from sqlalchemy.orm import relation, backref
from sqlalchemy.orm import mapper
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.exc import NoResultFound

forever = 2147483647  # maxint in sql system for int type

Base = declarative_base()

def db_setup(connstring='sqlite:///:memory:', echo=False):
    engine = create_engine(connstring, echo=echo)
    Session = sessionmaker(bind=engine, autoflush=False, autocommit=False)
    session = Session()
    Base.metadata.bind = engine
    Base.metadata.create_all()
    return session, engine

class GroupInfo(Base):
    __tablename__ = 'group_info'
    group_id = Column(Integer, primary_key=True, nullable=False, autoincrement=False)
    group_name = Column(String, primary_key=True, nullable=False)
    start = Column(Integer, nullable=False, primary_key=True, autoincrement=False)
    stop = Column(Integer, nullable=False, primary_key=True, autoincrement=False)
    __table_args__ = (
        CheckConstraint('stop > start', name='ss_check'),
        UniqueConstraint('group_id', 'start', name='u_start'),
        UniqueConstraint('group_id', 'stop', name='u_stop'),
        {}
    )

class Membership(Base):
    __tablename__ = 'membership'
    group_id = Column(ForeignKey(GroupInfo.group_id, onupdate='cascade', ondelete='cascade'))
    employee_id = Column(Integer, primary_key=True, nullable=False)
    start = Column(Integer, nullable=False, primary_key=True, autoincrement=False)
    stop = Column(Integer, nullable=False, primary_key=True, autoincrement=False)
    __table_args__ = (
        CheckConstraint('stop > start', name='ss_check'),
        UniqueConstraint('group_id', 'employee_id', 'start', name='u_start'),
        UniqueConstraint('group_id', 'employee_id', 'stop', name='u_stop'),
        {}
    )

def group_snapshot_ts(session, groupid, ts):
    GI = GroupInfo
    G = session.query(GI).filter(GI.group_id == groupid)
    G = G.filter(GI.start <= ts).filter(GI.stop >= ts)
    try:
        G = G.one()
    except NoResultFound:
        return None
    M = Membership
    members = session.query(M.employee_id).filter(G.group_id == M.group_id)
    members = members.filter(M.start <= ts).filter(M.stop >= ts)
    G.members = [x[0] for x in members.all()]
    return dict(id=G.group_id, name=G.group_name, members=G.members)

def demo(session):
    ## populate
    gnames = ('group_id', 'group_name', 'start', 'stop')
    for g in [(1, 'group 1', 0, 10), (1, 'new group 1', 10, 20),
              (2, 'group2', 3, forever)]:
        session.add(GroupInfo(**dict(zip(gnames, g))))
    session.flush()
    mnames = ('group_id', 'employee_id', 'start', 'stop')
    for e in [(1, 42, 1, 5), (2, 42, 5, 100), (1, 18, 2, 50)]:
        session.add(Membership(**dict(zip(mnames, e))))
    session.flush()
    session.commit()
    for (gid, ts) in (10, 10), (1, 3), (1, 6), (1, 11), (1, 100), (2, 1), (2, 10):
        print "Group %i, ts %i" % (gid, ts)
        print group_snapshot_ts(session, gid, ts)
    return None

## run it!
session, engine = db_setup()
demo(session)
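The time-window test buried in group_snapshot_ts reduces to a simple containment check. A stdlib-only sketch of that logic; note that with closed [start, stop] intervals, a row ending at ts and a row starting at ts both match (visible in the group-1 sample data at ts=10):

```python
forever = 2147483647  # sentinel 'maxint' stop value, as in the schema above

def active_at(start, stop, ts):
    """A row whose closed [start, stop] window contains ts is in the snapshot."""
    return start <= ts <= stop

def snapshot(rows, ts):
    """Filter (group_id, group_name, start, stop) rows to those active at ts."""
    return [r for r in rows if active_at(r[2], r[3], ts)]

groups = [(1, 'group 1', 0, 10), (1, 'new group 1', 10, 20),
          (2, 'group2', 3, forever)]

print(snapshot(groups, 6))   # only 'group 1' and 'group2' are active at ts=6
```

Whether to treat windows as closed or half-open (start <= ts < stop) is worth deciding once and enforcing everywhere; half-open windows avoid the double match at the boundary.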
[sqlalchemy] Re: Queries in historical db's, where the join depends on having an overlapping timestamp (or other filter condition)
That's a totally fair answer! Mostly, I wish some sense of relational change over time was built into SQL, the way it is in BigTable-style systems. Maybe you could shed a little light on how to use the overlap operator? I'm having trouble getting the multiple fields into the clause statement.

In [43]: session.query(Membership).filter(Membership.start.op('OVERLAPS', [1,100]))
TypeError: op() takes exactly 2 arguments (3 given)

It would also be nice to have / emulate a .when(ts) method on queries (perhaps via a query subclass) that would take the timings into account, but this does smack of magic, as you suggest! Thanks for the advice! Gregg

On Thu, Jul 16, 2009 at 10:17 AM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:
Questions: 1. Is there a SQLAlchemical way to write group_snapshot_ts into a declarative class, such that the joins and loader respect the time constraints? (Read-only is fine as well on the loaded attributes) a. eager_loader? b. subclassing query?

im really not in favor of trying to build magic awareness of things like this into mappers and query subclasses. If you have complex conditions to be queried upon, I am a fan of abstracting the common kinds of queries you need into Python functions and/or class methods. no magic, just regular old application architecture.
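One way to get both columns into the clause is to build row values with tuple_() and apply the operator via .op() on the tuple, not on a single column. A sketch only — the expression compiles, but I have not run it against a live PostgreSQL:

```python
from sqlalchemy import Column, Integer, MetaData, Table, literal, tuple_

membership = Table(
    "membership", MetaData(),
    Column("start", Integer),
    Column("stop", Integer),
)

# PostgreSQL's OVERLAPS compares two (start, end) row values, so wrap
# each side in tuple_() and attach the operator with .op()
expr = tuple_(membership.c.start, membership.c.stop).op("OVERLAPS")(
    tuple_(literal(1), literal(100))
)
print(expr)
```

The TypeError in the transcript comes from passing the operand as a second argument to op(); op() takes only the operator name and returns a callable that is then applied to the right-hand side.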
[sqlalchemy] Re: Queries in historical db's, where the join depends on having an overlapping timestamp (or other filter condition)
Jess, thanks for posting the actual class :) Just reading the description "use contains_eager" didn't tell me enough about how to make it happen. Cheers, Gregg

On Thu, Jul 16, 2009 at 12:54 PM, jess jesslp...@gmail.com wrote:

I believe that I asked Michael a similar question, in a different way, a few days ago. The answer was to use contains_eager. I used something like the following, and it worked great to query what the membership of a group was at a specific time. The two tables remain simple, related by a group_id, and the query is dynamic according to the time that you are interested in.

map = mapper(Group, group_table,
             properties=dict(members=relation(Member, backref='group')))
statement = names.outerjoin(members).select(
    and_(Member.start <= reftime, Member.stop >= reftime)).apply_labels()
query = session.query(Group).options(contains_eager('members'))
query = query.from_statement(statement)
[sqlalchemy] Re: Declarative base -- only add a constraint if particular db engine
As always, thank you for the complete, exhaustive answer. This particular thing is definitely an edge case, and rather non-obvious, so thank you for walking me through it. Either of those is clean enough for me! Is there a more proper / general way to describe this problem, to help google and make this answer easier to find? Gregg

On Fri, May 29, 2009 at 12:10 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:
I use declarative base for defining classes. I have a constraint that only works in Postgres. How do I declare the constraint lowername_check only if the session is going to postgres (and not to sqlite, for example)?

pg_only_constraint = CheckConstraint("lowername !~ '[[:upper:]]'", name='lowername_check')

class Data(Base):
    __tablename__ = 'Data'
    lowername = Column(Unicode, nullable=False)
    __table_args__ = (pg_only_constraint, {})

The cleanest way is to use the schema.DDL() construct, which can filter against various backends, but requires that you spell out the constraint explicitly:

DDL("CREATE CONSTRAINT ...", on='postgres').execute_at('after-create', Data.__table__)

Alternatively, if you want to stick with the CheckConstraint object, you can create a function create_pg_constraints() which is called at the point your app calls create_engine(), that contains all PG specific constructs - the function would be called based on engine.dialect.name == 'postgres'. We have a more flexible architecture in 0.6 for this sort of thing, and I think if we add an AddConstraint() construct there and also move most of DDL()'s execute-at and on functionality into the base DDLElement class, that would enable both constructs to be combined together as in AddConstraint(CheckConstraint(...args...), on='postgres').execute_at('after-create', Data.__table__).

Thanks! Gregg Lind
[sqlalchemy] Re: Declarative base -- only add a constraint if particular db engine
Alas, that doesn't seem to matter or help. Even this statement causes the same issue. Odd. Must not be related to the colons, alas.

DDL(r''' ''', on='postgres').execute_at('after-create', Data.__table__)

On Fri, May 29, 2009 at 3:08 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:
I used the DDL style

DDL('''ALTER TABLE data ADD CONSTRAINT lowername_check
       CHECK (lowername !~ '[[\:upper\:]]')''',
    on='postgres').execute_at('after-create', Data.__table__)

and now my print_schema method (based on
http://www.sqlalchemy.org/trac/wiki/FAQ#HowcanIgettheCREATETABLEDROPTABLEoutputasastring)
breaks (on PG only, because of the DDL), with this error:

TypeError: unsupported operand type(s) for +: '_TextClause' and 'str'

I escaped the colons in the DDL. Workarounds?

escape them with \\: or use r'\:'

Gregg

Code:

def print_schema(T='postgres'):
    '''print_schema will print the schema in use'''
    global Base
    from StringIO import StringIO
    buf = StringIO()
    print '%s://' % T
    engine = create_engine('%s://' % T, strategy='mock',
                           executor=lambda s, p='': buf.write(s + p))
    Base.metadata.create_all(engine)
    return buf.getvalue()

On Fri, May 29, 2009 at 12:27 PM, Gregg Lind gregg.l...@gmail.com wrote:

As always, thank you for the complete, exhaustive answer. This particular thing is definitely an edge case, and rather non-obvious, so thank you for walking me through it. Either of those is clean enough for me! Is there a more proper / general way to describe this problem, to help google and make this answer easier to find? Gregg

On Fri, May 29, 2009 at 12:10 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote:
I use declarative base for defining classes. I have a constraint that only works in Postgres. How do I declare the constraint lowername_check only if the session is going to postgres (and not to sqlite, for example)?

pg_only_constraint = CheckConstraint("lowername !~ '[[:upper:]]'", name='lowername_check')

class Data(Base):
    __tablename__ = 'Data'
    lowername = Column(Unicode, nullable=False)
    __table_args__ = (pg_only_constraint, {})

The cleanest way is to use the schema.DDL() construct, which can filter against various backends, but requires that you spell out the constraint explicitly:

DDL("CREATE CONSTRAINT ...", on='postgres').execute_at('after-create', Data.__table__)

Alternatively, if you want to stick with the CheckConstraint object, you can create a function create_pg_constraints() which is called at the point your app calls create_engine(), that contains all PG specific constructs - the function would be called based on engine.dialect.name == 'postgres'. We have a more flexible architecture in 0.6 for this sort of thing, and I think if we add an AddConstraint() construct there and also move most of DDL()'s execute-at and on functionality into the base DDLElement class, that would enable both constructs to be combined together as in AddConstraint(CheckConstraint(...args...), on='postgres').execute_at('after-create', Data.__table__).

Thanks! Gregg Lind
[sqlalchemy] Re: Questions on SQLA Queries
I believe that by "map function", Timothy may be implying that you should use any of the python idioms for converting an iterable of 1-tuples to a flat list. The one I like best:

from itertools import chain
q = session.query(User.name)   # (User is a class)
names = list(chain(*q.all()))

But you could use generator comprehensions (names = (x[0] for x in q.all())), operator.itemgetter, or map instead. Correct me, Timothy, if necessary. Gregg

On Wed, May 27, 2009 at 6:25 AM, Harish Vishwanath harish.shas...@gmail.com wrote:

Thanks! Could you elaborate on how you use the map function? I couldn't find it myself in the docs. Regards, Harish

On Wed, May 27, 2009 at 3:07 PM, Timothy N. Tsvetkov timothy.tsvet...@gmail.com wrote:

Q1. Good question %) I didn't find anything about it in the docs (but I didn't search a lot), so I use the map function to convert it to the list you want. And I think it is the right solution, because if you query for more than one column (session.query(User.id, User.name).all()), a list of tuples is what you want to get as a result. So I think it is good that it works the same way for one or more than one query param.

On May 26, 9:10 pm, Harish Vishwanath harish.shas...@gmail.com wrote:

Hello,

Question 1: When there is a query like below:

q = session.query(User.name)   # (User is a class)

and when I do q.all(), a list of tuples (User.name,) is returned even though a single column is asked for. Is there a way to get a list directly from q.all() when a single column is required?

Question 2: I need to delete a bulky table, and I want to print diagnostics after every n deletes. Is there a way to use the Query object so that a SQL statement like the one below can be generated?

delete from movie where year in (select top 30 year from movie where year > 50);

so that a message can be logged after every 30 deletes. I am using Sqlite DB.

Regards, Harish
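The idioms mentioned above, side by side in runnable form (stdlib only; `rows` stands in for the q.all() result):

```python
from itertools import chain
from operator import itemgetter

rows = [('alice',), ('bob',), ('carol',)]   # shape of q.all() for one column

names_chain = list(chain(*rows))            # flatten via argument unpacking
names_comp = [x[0] for x in rows]           # comprehension, index each tuple
names_map = list(map(itemgetter(0), rows))  # map with an item getter

assert names_chain == names_comp == names_map == ['alice', 'bob', 'carol']
```

All three are equivalent for 1-tuples; the comprehension is the most explicit about which column is being pulled, while chain generalizes if each row holds several values to be flattened together.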
[sqlalchemy] Relative speed of ORM vs. direct insertions (on simple tables)
Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using the SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation. My questions:

1. If so, why? I assume it's because session.flush() issues separate insert statements (as verified when echo=True is on).
2. In test 3, is this a reasonable way to convert from session to direct table insert? Is there a simpler way than the Thing.to_dict method I hacked together?
3. Are these valid tests? I don't want to have all the embarrassment of some others who have 'slammed' SqlA without a proper grounding. I'm no expert, and I want to make sure what I have is something approximating idiomatic SqlA. I tried to be generous about what to include in the timed section of each test. I do have autoflush off, and I'm using Sqlite (in memory), which might affect things.
4. If there is a faster way to flush out a session, I'm all ears! I understand the Big Win (tm) of the ORM is programmer simplicity and power, but if I can get that without major hits to performance, I'd like to be able to Be Greedy (tm) and have it all.
-
#!/usr/bin/env python2
import sys
import itertools
import time

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table, Column, Integer, String, MetaData, ForeignKey, Boolean
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Thing(Base):
    __tablename__ = 'asn_change'
    id = Column(Integer, primary_key=True, nullable=False)
    datum = Column(String, primary_key=True, nullable=False)
    start = Column(Integer, primary_key=True, nullable=False)
    stop = Column(Integer, primary_key=True, nullable=False)

    def to_dict(self):
        return dict(id=self.id, datum=self.datum, start=self.start, stop=self.stop)

def dummy_setup(connstring='sqlite:///:memory:'):
    engine = create_engine(connstring, echo=False)
    Session = sessionmaker(bind=engine, autoflush=False, autocommit=False)
    session = Session()
    Base.metadata.bind = engine
    Base.metadata.create_all()
    return session, engine

def clean_up(session):
    session.expunge_all()
    session.query(Thing).delete()
    session.commit()

## Tests
class Tests(object):
    def setUp(self):
        self.session, self.engine = dummy_setup()
        self.R = 1

    def test_01_orm(self):
        session = self.session
        clean_up(session)
        for ii in xrange(self.R):
            session.add(Thing(id=ii, datum="some data %i" % ii, start=0, stop=9))
        now = time.time()
        session.flush()
        session.commit()
        t = time.time() - now
        print "timing: %f2.2" % t
        assert True

    def test_02_direct_insert(self):
        session = self.session
        clean_up(session)
        stm = Thing.__table__.insert().compile()
        ipfx = [dict(id=ii, datum="some data %i" % ii, start=0, stop=9)
                for ii in xrange(self.R)]
        now = time.time()
        stm.execute(ipfx)
        t = time.time() - now
        print "timing: %f2.2" % t
        assert True

    def test_03_convert_from_session(self):
        session = self.session
        clean_up(session)
        stm = Thing.__table__.insert().compile()
        for ii in xrange(self.R):
            session.add(Thing(id=ii, datum="some data %i" % ii, start=0, stop=9))
        # count the conversion time as part of the test
        now = time.time()
        ipfx = [x.to_dict() for x in session.new]
        stm.execute(ipfx)
        session.new.clear()
        t = time.time() - now
        print "timing: %f2.2" % t
        assert True
-

$ nosetests -v -s insertion_test.py
insertion_test.Tests.test_01_orm ... timing: 8.2513552.2
ok
insertion_test.Tests.test_02_direct_insert ... timing: 0.5210562.2
ok
insertion_test.Tests.test_03_convert_from_session ... timing: 0.8730292.2
ok
--
Ran 3 tests in 14.247s
OK
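The core of the speedup in test 2 -- one statement prepared once and run with many parameter sets -- can be seen at the DBAPI level without SQLAlchemy at all. A minimal sketch using the stdlib sqlite3 module (table and column names mirror the test above, row count is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asn_change (id INTEGER, datum TEXT, start INTEGER, stop INTEGER)"
)

# One parameter dict per row, all of identical structure --
# exactly the shape an executemany() call wants.
rows = [
    {"id": ii, "datum": "some data %i" % ii, "start": 0, "stop": 9}
    for ii in range(1000)
]

# The statement is prepared once and executed for every parameter set,
# mostly inside C code, instead of a separate execute() per row.
conn.executemany(
    "INSERT INTO asn_change VALUES (:id, :datum, :start, :stop)", rows
)
conn.commit()

count = conn.execute("SELECT count(*) FROM asn_change").fetchone()[0]
print(count)  # 1000
```

The ORM path in test 1 has to issue one INSERT per object (plus bookkeeping), which is why the gap widens as R grows.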
[sqlalchemy] Re: Relative speed of ORM vs. direct insertions (on simple tables)
Some followups: Python 2.4.3 on 64-bit linux. Timings are near identical in SA 0.5.2 and 0.5.3. On Apr 8, 9:57 am, Gregg Lind gregg.l...@gmail.com wrote: Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation.
[sqlalchemy] Re: Relative speed of ORM vs. direct insertions (on simple tables)
Thank you for the excellent and comprehensive answer! I didn't realize exactly how much work the session object does. As per your advice, I have taken steps to reduce the size of my session object, and things run much more quickly now.

On Wed, Apr 8, 2009 at 12:50 PM, Michael Bayer mike...@zzzcomputing.com wrote:

Gregg Lind wrote: Good morning! I hope I'm not tipping any sacred cows here, but for simple SqlA objects, it seems to be a heckuva lot faster to just insert into the table directly (using SqlA expression language) than to insert the objects via session.flush(). In the attached tests, I'm observing a 10x-20x speedup. I'm still new to SqlA, so I was hoping the list here would be kind enough to verify my observation.

verified

My questions: 1. If so, why? I assume it's because session.flush() does separate insert statements (as verified when echo=True is on).

session.flush() performs a topological sort of all dirty/pending/deleted objects based on foreign key dependencies between tables as well as between rows, checks all modified attributes and collections (the collections part sometimes requires a load of the collection, unless certain options are set) for a net change in value, issues INSERT/UPDATE/DELETE statements in an exact order based on dependencies, and tailors individual INSERT and UPDATE statements based on the values which are present in memory vs. server-side defaults (for inserts) or the values which have a net change (for updates). It then issues all of these statements individually (by necessity, since they all have different argument lists and sometimes inter-row dependencies; you also cannot fetch the last inserted id from an executemany()), which from a DBAPI point of view is slower in any case, since you are calling execute() many times.

Newly inserted rows often require extra statements to fetch newly generated primary keys, which are then distributed to all the foreign-key-holding attributes which require them (and which are then potentially inserted or updated in subsequent statements). After all SQL is emitted, it then refreshes the bookkeeping status on all entities which were changed, and expires attributes whose values were generated within the DB but don't need to be fetched until needed.

OTOH an executemany() call receives a pre-made list of parameters for any number of bind parameter sets; the DBAPI then prepares a single statement and runs it N times, usually within C code, and you can modify or insert tens of thousands of rows in a few seconds (the trick is that you've generated this huge dict of data beforehand, and that your parameters are all of identical structure).

2. In test 3, is this a reasonable way to convert from session to direct table insert? Is there a simpler way than the Thing.to_dict method I hacked together?

for simple table mappings, it's easy enough to deal with your rows as dicts and use execute() to change things. For more complexity with relations to other tables in various ways, it becomes less trivial. There are always tradeoffs to be navigated according to your specific needs.

3. Are these valid tests? I don't want to have all the embarrassment of some others who have 'slammed' SqlA without a proper grounding. I'm no expert, and I want to make sure what I have is something approximating idiomatic SqlA. I tried to be generous about what to include in the timed section of each test. I do have autoflush off, and I'm using Sqlite (in memory), which might affect things.

I didn't look closely, but the general observation that "expressions are faster than ORM" is valid. Your orders of magnitude might be off.

4. If there is a faster way to flush out a session, I'm all ears! I understand the Big Win (tm) of the ORM is programmer simplicity and power, but if I can get that without major hits to performance, I'd like to be able to Be Greedy (tm) and have it all.

try keeping the size of the session small, and look into options like passive_deletes and passive_updates, which prevent rows from being loaded in order to accommodate cascades that can be established in the database directly. In any case, flushing tens of thousands of objects is unlikely to be performant.
[sqlalchemy] Re: My sqlalchemy.orm.identity.IdentityManagedState is leaking
I believe you want session.expunge() or session.expunge_all().

GL

On Wed, Apr 8, 2009 at 6:05 PM, Chris Lewis cfle...@gmail.com wrote:

Hi everyone! I have a threaded application which deals with a lot of records (it generates hundreds of thousands in an hour). I have a Database module that my program imports, and I use scoped_session to make the session thread-safe. When a record is to be inserted, my program passes the object to be saved to the Database insert() function. Here's the code:

==
engine = sa.create_engine(engine_url, echo=echo)
Session = sa.orm.scoped_session(sa.orm.sessionmaker(bind=engine, autocommit=False))

def insert(obj):
    try:
        session().merge(obj)
    except Exception, e:
        log.warning("Database problem: " + str(e))
        session().rollback()
        raise
    else:
        log.debug("Saved to database")
        session().commit()
        Session.remove()

def session():
    return Session()
==

Even though I call Session.remove(), it seems that I can't stop sqlalchemy.orm.identity.IdentityManagedState growing. Unit testing which inserts a couple thousand records shows the growth with Heapy. The dict of sqlalchemy.orm.identity.IdentityManagedState starts at 334 objects, ending with 11210 objects. I thought Session.remove() would cause SQLAlchemy to release those resources, but this doesn't seem to be the case. As the process is going to need to be long-running (weeks, hopefully), I'm far happier performing expensive CPU operations than exhausting my memory. I am certain this is my own error, but I am not sure what it is. Any help would be appreciated!

Thanks in advance,
Chris Lewis
[sqlalchemy] export and import JSON from database (JSON type)
How does one create a TypeDecorator to export and import JSON to a database using SA?
[sqlalchemy] Re: export and import JSON from database (JSON type)
Thank you, that's simpler than my attempts, for sure!

On Wed, Feb 25, 2009 at 10:27 AM, Roger Demetrescu roger.demetre...@gmail.com wrote:
On Wed, Feb 25, 2009 at 11:39, Gregg Lind gregg.l...@gmail.com wrote:
How does one create a TypeDecorator to export and import JSON to a database using SA?

I did something like that recently:

-
from sqlalchemy import types
import simplejson

class JsonString(types.TypeDecorator):
    impl = types.String

    def process_result_value(self, value, dialect):
        if value is None:
            return None
        else:
            return simplejson.loads(value)

    def process_bind_param(self, value, dialect):
        if value is None:
            return None
        else:
            return simplejson.dumps(value)
-

[]s
Roger
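The two hooks in Roger's decorator just (de)serialize at the bind/result boundary. That round trip can be illustrated with nothing but the stdlib -- here json stands in for simplejson, raw sqlite3 stands in for the dialect, and the table/helpers are hypothetical, not part of SQLAlchemy:

```python
import json
import sqlite3

def process_bind_param(value):
    # What JsonString does on the way in: Python object -> JSON string.
    return None if value is None else json.dumps(value)

def process_result_value(value):
    # What JsonString does on the way out: JSON string -> Python object.
    return None if value is None else json.loads(value)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER, body TEXT)")
conn.execute(
    "INSERT INTO doc VALUES (?, ?)",
    (1, process_bind_param({"name": "Gregg", "tags": ["a", "b"]})),
)

raw = conn.execute("SELECT body FROM doc").fetchone()[0]
obj = process_result_value(raw)
print(obj["name"])  # Gregg
```

With the real TypeDecorator, SQLAlchemy calls these hooks for you on every bind parameter and result row for columns of that type, so application code only ever sees dicts and lists.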
[sqlalchemy] Re: Events Undo/Redo Functionality?
If you choose to go down the journaling route, some things would be helpful:

1. All of your functions (journalable actions) should have inverse functions to reverse them.

If this is true, then from a particular state you can simply record the actions, checkpointing all user objects periodically. A sequence might be something like:

save binary state of file
cut(1:30, 1:40)
loop(a, 13, sec=10)

Since you described a music editing app, doing this in a clean way (without losing precision and information) might require storing actual binary states. This sounds, to me, a little nightmarish. I don't have any insights on the SA front, alas.

Gregg Lind

On Tue, Feb 24, 2009 at 2:41 PM, paniq303 paniq...@googlemail.com wrote:
On Feb 24, 5:16 pm, a...@svilendobrev.com wrote:
these are two general patterns, observer/listener and undo/redo (command pattern), which have nothing to do with the DB. I don't know about 2) DB observers which would react on somebody else changing the DB; AFAIK that is possible on certain servers only, but the rest should be done independently, on a higher level than the DB stuff. Read about Model-View-Controller.

Your reply sounds a bit offended. I'm sorry if my request came over as kind of rude. I don't mean to upset anybody, and I don't want to take up your time, so please send me elsewhere if my inquiry does not fit the project's scope. I'm using a Model-View-Controller scheme at the moment. I also know of the command pattern and how it works. But I'm trying to write as little code on my own as I can. SQLAlchemy provides a high-level model part for the application, and borders on the issues I presented above. I can see that the issues I presented are not necessarily of importance in the scope of SQL. So let me rephrase my question: how could I solve the above two issues in the most efficient manner, leveraging whatever SQL and SQLAlchemy have to offer? Is there any shortcut I can take on the way to undo/redo functionality?
If journaling DB changes would be a good way to track changes, how can I best hook into them? Where would my plan fit in? I'm sorry if these questions swamp you. Maybe I am thinking too much.
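The inverse-function idea above is just the command pattern with undo. A minimal sketch, independent of SQLAlchemy or any database (the Document class and the clip actions are hypothetical stand-ins for the editor's real operations):

```python
class Document:
    """Hypothetical stand-in for the editor's state."""
    def __init__(self):
        self.clips = []

class Command:
    """A journalable action paired with its inverse."""
    def __init__(self, do, undo):
        self.do = do
        self.undo = undo

class Journal:
    """Records applied commands so they can be reversed in order."""
    def __init__(self):
        self._done = []

    def apply(self, cmd):
        cmd.do()
        self._done.append(cmd)

    def undo(self):
        self._done.pop().undo()

doc = Document()
journal = Journal()

# Each edit is recorded together with the action that reverses it.
journal.apply(Command(do=lambda: doc.clips.append("intro"),
                      undo=lambda: doc.clips.pop()))
journal.apply(Command(do=lambda: doc.clips.append("chorus"),
                      undo=lambda: doc.clips.pop()))
print(doc.clips)  # ['intro', 'chorus']

journal.undo()
print(doc.clips)  # ['intro']
```

For actions that destroy information (the lossy edits Gregg mentions), the undo closure would have to capture a checkpoint of the affected state before the edit runs, which is where the "storing actual binary states" pain comes in.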