Re: [sqlalchemy] A question regarding hybrid properties, relationships and schema changes

2014-06-25 Thread Ken Lareau
On Tue, Jun 24, 2014 at 3:35 PM, Mike Bayer mike...@zzzcomputing.com
wrote:


 On 6/23/14, 8:09 PM, Ken Lareau wrote:
 
  if apptier:
  subq = (
  Session.query(
  Package.pkg_name,
  Package.version,
  Package.revision,
  AppDefinition.app_type,
  AppDeployment.environment
  ).join(Deployment)
   .join(AppDeployment)
   .join(AppDefinition)
   .filter(Package.pkg_name == package_name)
   .filter(AppDeployment.environment == env)
   .filter(AppDeployment.status != 'invalidated'))
 
  [...]
 
  # The actual column name must be used in the subquery
  # usage below; DB itself should be corrected
  versions = (Session.query(subq.c.appType,
  subq.c.version,
  subq.c.revision)
  .group_by(subq.c.appType, subq.c.environment)
  .all())
 
  The parts in the subquery work fine, but the access of the 'column'
  in the final query leads to this:
 
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/klareau/repos/git/tagopsdb/tagopsdb/deploy/deploy.py",
  line 234, in find_deployed_version
      .group_by(subq.c.appType, subq.c.environment)
    File
  "/home/klareau/.virtualenvs/tds/lib/python2.6/site-packages/sqlalchemy/util/_collections.py",
  line 174, in __getattr__
      raise AttributeError(key)
  AttributeError: environment
 
  This is not completely surprising, but I'm uncertain as to the best
  way to fix this... help? :)
 nothing is obviously wrong, assuming you are calling subquery() on
 subq at the end.  It will be a select() construct which will have a
 column called .environment on the .c. collection because
 AppDeployment.environment is present.   Can't say why this attribute is
 not here without full details.


Oops there were some additional filters I left out, but this is added
before the final query:

subq = (subq.order_by(AppDeployment.realized.desc())
.subquery(name='t_ordered'))

Not sure why it isn't working, but today my coworker and I massively
rewrote one of the other methods to avoid this same issue, so maybe
we should try the same for this one... though it would be nice to know
where I goofed here.  I'm not sure offhand what additional information
I can give that would help...
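
For reference, a minimal sketch of the failure mode, using the models
above - not the actual application code.  The keys available on subq.c
are exactly the names produced by the inner SELECT, so anything else
raises AttributeError:

subq = (Session.query(
            Package.pkg_name,
            Package.version,
            Package.revision,
            AppDefinition.app_type,
            AppDeployment.environment)
        .join(Deployment)
        .join(AppDeployment)
        .join(AppDefinition)
        .subquery(name='t_ordered'))

print list(subq.c)        # the actual keys on the .c collection
print subq.c.environment  # works only if 'environment' is among them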

-- 
- Ken Lareau



[sqlalchemy] postgresql large objects

2014-06-25 Thread Jason Newton
Hi,

I'm wondering how I might get at postgresql's large object type (lob).
It's not to be confused with the TOASTed bytea, which is currently limited
to 1 GiB but in practice much lower (around 400 MiB for me) - it's a special
table + API designed to handle very large binary objects, from a few
hundred MiBs to, more recently, TiBs.  I don't see appropriate definitions
anywhere and can't find any real mention of it in sqlalchemy.  psycopg2
has support for it and calls it lobject; it provides a file-like interface
to the lob, which is a good mapping, since with libpq you use lo_creat,
lo_seek, lo_write and lo_read to work with these beasts.

I took a look at UserDefinedType, but its bind_processor doesn't
distinguish between inserts and updates.  With inserts, you'd use an oid
allocated from lo_creat in the transaction.  On updates, you'd use
lo_trunc/lo_write.  As one more constraint, you must be in a transaction
before any of these functions are usable.  To reference large objects, as
they are explicitly out-of-table storage, the postgresql-specific oid is
used (which allows garbage collection, referential integrity checks, etc.).
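
For reference, the hook in question looks roughly like this - a sketch
only, with a made-up PGLargeObject name.  As described, bind_processor()
receives just the dialect and value, with nothing indicating INSERT
versus UPDATE:

from sqlalchemy.types import UserDefinedType

class PGLargeObject(UserDefinedType):
    def get_col_spec(self):
        return 'OID'  # large objects are referenced by oid

    def bind_processor(self, dialect):
        def process(value):
            # only the value arrives here - no compiler context, so
            # there is no way to tell an INSERT from an UPDATE
            return value
        return process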

I'll also mention that other tables reference these large objects via oids, 
something like smart pointers in postgres. 

It'd be great to plug large objects into sqlalchemy properly - but can it 
be done?

-Jason



[sqlalchemy] Treating many-to-many relationships as many-to-one

2014-06-25 Thread Tara Andrews
Hello,

I am trying to use SQLAlchemy to map a database that was made available to 
me, whose contents and schema cannot be altered. The database has a few 
cases where a many-to-one relationship is treated as a many-to-many - that 
is, instead of the child key being set directly onto the parent record, the 
relationship is mapped through an association table. So in the mapping it 
looks like this:

vname_association = Table('VNameFactoid', Base.metadata,
    Column('vnameKey', Integer, ForeignKey('VariantName.vnameKey')),
    Column('factoidKey', Integer, ForeignKey('Factoid.factoidKey')))

class Factoid(Base):
    __tablename__ = 'Factoid'
    tstamp = Column(Time)
    engDesc = Column(Text)
    origLDesc = Column(Text)
    factoidKey = Column(Integer, primary_key=True)
    vnameInfo = relationship('VariantName', secondary=vname_association,
                             backref='factoidData')

class VariantName(Base):
    __tablename__ = 'VariantName'
    tstamp = Column(Time)
    vnameKey = Column(Integer, primary_key=True)
    name = Column(String)

This means that I can get at the variant name information from the factoid 
by saying
my_variant_name = some_factoid.vnameInfo[0].name

But it would be much nicer if I could dispense with that [0], since I know 
that there will never be more than one record returned in the list. Is 
there something I can do to be able to say this?
my_variant_name = some_factoid.vnameInfo.name

Best,
-tara



Re: [sqlalchemy] Oracle with sequence for primary key and that sequence out of sync

2014-06-25 Thread Mike Bayer

On 6/25/14, 5:50 AM, Ofir Herzas wrote:
 Sorry to barge in, but I'm having the exact same issue and I'm pretty
 sure no one altered the sequence manually.
 I'm using sqlalchemy 0.8, python 2.7.6, rhel 6.5, oracle 10g,
 cx_oracle with the same connection string as above

 This issue started just recently after running ok for more than a
 thousand times.
 I should also say that my application is installed at several
 customers, and I have this issue only at the one using Oracle.

I'm open to possibilities for how this could happen, short of the ALTER
SEQUENCE command being emitted, in which case feel free to grep
SQLAlchemy's codebase for this clause (it's not there).   From my
vantage point, if there is actually an issue that is implicitly making
this happen outside of application code, it would have to be on the
driver or server side somehow.





 On Tuesday, June 3, 2014 1:18:42 AM UTC+3, Michael Bayer wrote:


 On Jun 2, 2014, at 1:17 PM, Scott Koranda skor...@gmail.com wrote:

 
  I investigated and found that the sequence seq_admin_groups_id
  was now at the value 68 after having been used previously to
  insert rows with IDs in the 500s.
 
  I stopped the code and used sqlplus to change the sequence
  back to a value in the high 500s. I then restarted the code
  and the exception no longer occurred.
 
  I am unable to explain how the sequence seq_admin_groups_id
  went from in the 500s and working fine to suddenly being 68.
 
  The only place in the Python code where the sequence is used
  explicitly is in the definition of the AdminGroup() class.
 
  I would be grateful for any insights on how the sequence might
  have become out of sync or anything I can change in the code
  to prevent it from happening again.

 there’s nothing on the Python library side that could do that,
 someone had to have run an ALTER SEQUENCE on the database side in
 order for that to happen.




Re: [sqlalchemy] Treating many-to-many relationships as many-to-one

2014-06-25 Thread Mike Bayer

On 6/25/14, 7:11 AM, Tara Andrews wrote:
 Hello,

 I am trying to use SQLAlchemy to map a database that was made
 available to me, whose contents and schema cannot be altered. The
 database has a few cases where a many-to-one relationship is treated
 as a many-to-many - that is, instead of the child key being set
 directly onto the parent record, the relationship is mapped through an
 association table. So in the mapping it looks like this:

 vname_association = Table('VNameFactoid', Base.metadata,
     Column('vnameKey', Integer, ForeignKey('VariantName.vnameKey')),
     Column('factoidKey', Integer, ForeignKey('Factoid.factoidKey')))

 class Factoid(Base):
     __tablename__ = 'Factoid'
     tstamp = Column(Time)
     engDesc = Column(Text)
     origLDesc = Column(Text)
     factoidKey = Column(Integer, primary_key=True)
     vnameInfo = relationship('VariantName', secondary=vname_association,
                              backref='factoidData')

 class VariantName(Base):
     __tablename__ = 'VariantName'
     tstamp = Column(Time)
     vnameKey = Column(Integer, primary_key=True)
     name = Column(String)


 This means that I can get at the variant name information from the
 factoid by saying
 my_variant_name = some_factoid.vnameInfo[0].name

 But it would be much nicer if I could dispense with that [0], since I
 know that there will never be more than one record returned in the
 list. Is there something I can do to be able to say this?
 my_variant_name = some_factoid.vnameInfo.name

Yes, you can apply the uselist=False option to the relationship.  If
the relationship loads more than one row you'll get a warning.
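
That is, a sketch against the mapping from the original post:

vnameInfo = relationship('VariantName', secondary=vname_association,
                         backref='factoidData', uselist=False)

# the attribute is now scalar rather than a list:
# my_variant_name = some_factoid.vnameInfo.name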



 Best,
 -tara



Re: [sqlalchemy] A question regarding hybrid properties, relationships and schema changes

2014-06-25 Thread Mike Bayer

On 6/25/14, 2:26 AM, Ken Lareau wrote:
  On Tue, Jun 24, 2014 at 3:35 PM, Mike Bayer mike...@zzzcomputing.com wrote:


 On 6/23/14, 8:09 PM, Ken Lareau wrote:
 
  if apptier:
  subq = (
  Session.query(
  Package.pkg_name,
  Package.version,
  Package.revision,
  AppDefinition.app_type,
  AppDeployment.environment
  ).join(Deployment)
   .join(AppDeployment)
   .join(AppDefinition)
   .filter(Package.pkg_name == package_name)
   .filter(AppDeployment.environment == env)
   .filter(AppDeployment.status != 'invalidated'))
 
  [...]
 
  # The actual column name must be used in the subquery
  # usage below; DB itself should be corrected
  versions = (Session.query(subq.c.appType,
  subq.c.version,
  subq.c.revision)
  .group_by(subq.c.appType, subq.c.environment)
  .all())
 
  The parts in the subquery work fine, but the access of the 'column'
  in the final query leads to this:
 
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/klareau/repos/git/tagopsdb/tagopsdb/deploy/deploy.py",
   line 234, in find_deployed_version
       .group_by(subq.c.appType, subq.c.environment)
     File
   "/home/klareau/.virtualenvs/tds/lib/python2.6/site-packages/sqlalchemy/util/_collections.py",
   line 174, in __getattr__
       raise AttributeError(key)
   AttributeError: environment
 
   This is not completely surprising, but I'm uncertain as to the best
   way to fix this... help? :)
 nothing is obviously wrong, assuming you are calling subquery() on
 subq at the end.  It will be a select() construct which will have a
 column called .environment on the .c. collection because
 AppDeployment.environment is present.   Can't say why this
 attribute is
 not here without full details.


 Oops there were some additional filters I left out, but this is added
 before the final query:

 subq = (subq.order_by(AppDeployment.realized.desc())
 .subquery(name='t_ordered'))

  Not sure why it isn't working, but today my coworker and I massively
  rewrote one of the other methods to avoid this same issue, so maybe
  we should try the same for this one... though it would be nice to know
  where I goofed here.  I'm not sure offhand what additional information
  I can give that would help...

What are the names that are actually present on subq.c?   print
list(subq.c) should give an indication.



 -- 
 - Ken Lareau




Re: [sqlalchemy] Treating many-to-many relationships as many-to-one

2014-06-25 Thread Tara Andrews

On Wednesday, June 25, 2014 3:27:48 PM UTC+2, Michael Bayer wrote:

  

 yes, you can apply the uselist=False option to the relationship.  If
 the relationship loads more than one row you'll get a warning.


That did it - thanks!

Best,
-tara 



Re: [sqlalchemy] postgresql large objects

2014-06-25 Thread Mike Bayer

On 6/25/14, 2:35 AM, Jason Newton wrote:
 Hi,

 I'm wondering how I might get at postgresql's large object type
 (lob).  It's not to be confused with the TOASTed bytea, which is
 currently limited to 1 GiB but in practice much lower (around 400 MiB
 for me) - it's a special table + API designed to handle very large
 binary objects, from a few hundred MiBs to, more recently, TiBs.  I
 don't see appropriate definitions anywhere and can't find any real
 mention of it in sqlalchemy.  psycopg2 has support for it and calls it
 lobject; it provides a file-like interface to the lob, which is a good
 mapping, since with libpq you use lo_creat, lo_seek, lo_write and
 lo_read to work with these beasts.

 I took a look at UserDefinedType, but its bind_processor doesn't
 distinguish between inserts and updates.  With inserts, you'd use an
 oid allocated from lo_creat in the transaction.  On updates, you'd use
 lo_trunc/lo_write.  As one more constraint, you must be in a
 transaction before any of these functions are usable.  To reference
 large objects, as they are explicitly out-of-table storage, the
 postgresql-specific oid is used (which allows garbage collection,
 referential integrity checks, etc.).

 I'll also mention that other tables reference these large objects via
 oids, something like smart pointers in postgres.

 It'd be great to plug large objects into sqlalchemy properly - but can
 it be done?
well we just added the OID type in 0.9.5, so you at least have that.
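
For reference, a minimal sketch of using it - the table and column names
here are made up:

from sqlalchemy import Column, Integer
from sqlalchemy.dialects.postgresql import OID
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Result(Base):
    __tablename__ = 'result'
    id = Column(Integer, primary_key=True)
    data_oid = Column(OID)  # holds the oid referencing a large object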

The PG LOB feature is very sparsely documented - on PG's docs, they only
seem to document the C API
(www.postgresql.org/docs/9.2/static/lo-interfaces.html), feel free to
point me to better documentation on this.   

As far as transparently embedding this into INSERT/UPDATE, my first
thought is that this might be unwise considering how far removed these
functions are from any kind of mainstream usage in such statements -
particularly if separate statements have to be called per value to get
at OIDs or similar.That PG's docs barely mention this whole feature
much less any kind of regular SQL integration is a red flag.   PG's
BYTEA type is already arbitrarily large so there is probably not much
interest in a type like this.If it's the streaming feature you're
looking for, SQLA's usual approach such as that of Oracle's LOB is to
pre-stream it on the result set side (necessary, because fetching a
batch of rows requires it), and cx_Oracle doesn't provide too much
option to stream on the write side.   I've dealt a lot with streaming
datatypes back in the day but sometime in the early 2000's everyone just
stopped using them.

As the feature involves SQL functions I don't think you'd be dealing
only with bind_processor(), the SQL functions themselves would probably
be via SQL-level processing, see
http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.
   
It's true these functions aren't given access to the compiler context
where you'd be able to discern INSERT from UPDATE, so I'd probably use
two variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd
apply these to an INSERT or UPDATE statement at runtime probably using a
@compiles decorator -
http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take
the existing LOB type and just swap it out for InsertLOB/UpdateLOB.   
Adding compiler context to TypeEngine is something we could look into as
a feature as well so that bind_expression() has this available somehow
and switching around types wouldn't be needed.
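
To make that concrete, a very rough sketch of the INSERT-side variant
only - InsertLOB is a made-up name, and this deliberately ignores the
incoming value, which is exactly the awkwardness described above:

from sqlalchemy import func
from sqlalchemy.types import UserDefinedType

class InsertLOB(UserDefinedType):
    def get_col_spec(self):
        return 'OID'

    def bind_expression(self, bindvalue):
        # render lo_creat(-1) in place of the bound parameter so the
        # server allocates a new large object; the actual bytes would
        # still have to be written separately via the lobject API
        return func.lo_creat(-1)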

building this would involve stringing together hooks that are not quite
set up for this, however for me to really assist here I'd need to see
exact examples of what INSERT, UPDATE and anything else looks like in
conjunction with these functions.

I would note that psycopg2 itself also provides for extension types,
including custom Connection and Cursor subclasses.   If a lot of things
have to happen when these types are in play it might be something that
can occur at that level, PG's type API is obviously a lot more PG specific.




 -Jason


RE: [sqlalchemy] Oracle with sequence for primary key and that sequence out of sync

2014-06-25 Thread Ofir Herzas
Didn't say it was your responsibility, just thought you should know there might 
be an issue there.

I'll modify the sequence manually and get back to it should the problem persist.

 

 

From: sqlalchemy@googlegroups.com [mailto:sqlalchemy@googlegroups.com] On 
Behalf Of Mike Bayer
Sent: Wednesday, June 25, 2014 4:27 PM
To: sqlalchemy@googlegroups.com
Subject: Re: [sqlalchemy] Oracle with sequence for primary key and that 
sequence out of sync

 

 

On 6/25/14, 5:50 AM, Ofir Herzas wrote:

Sorry to barge in, but I'm having the exact same issue and I'm pretty sure no 
one altered the sequence manually. 

I'm using sqlalchemy 0.8, python 2.7.6, rhel 6.5, oracle 10g, cx_oracle with 
the same connection string as above

 

This issue started just recently after running ok for more than a thousand 
times.

I should also say that my application is installed at several customers, and I 
have this issue only at the one using Oracle.


I'm open to possibilities for how this could happen, short of the ALTER 
SEQUENCE command being emitted, in which case feel free to grep SQLAlchemy's 
codebase for this clause (it's not there).   From my vantage point, if there is 
actually an issue that is implicitly making this happen outside of application 
code, it would have to be on the driver or server side somehow.








On Tuesday, June 3, 2014 1:18:42 AM UTC+3, Michael Bayer wrote: 


On Jun 2, 2014, at 1:17 PM, Scott Koranda skor...@gmail.com wrote:

 
 I investigated and found that the sequence seq_admin_groups_id 
 was now at the value 68 after having been used previously to 
 insert rows with IDs in the 500s. 
 
 I stopped the code and used sqlplus to change the sequence 
 back to a value in the high 500s. I then restarted the code 
 and the exception no longer occurred. 
 
 I am unable to explain how the sequence seq_admin_groups_id 
 went from in the 500s and working fine to suddenly being 68. 
 
 The only place in the Python code where the sequence is used 
 explicitly is in the definition of the AdminGroup() class. 
 
 I would be grateful for any insights on how the sequence might 
 have become out of sync or anything I can change in the code 
 to prevent it from happening again. 

there’s nothing on the Python library side that could do that, someone had to 
have run an ALTER SEQUENCE on the database side in order for that to happen.


 




Re: [sqlalchemy] inheriting from mapped classes

2014-06-25 Thread Jonathan Vanasco

On Tuesday, June 24, 2014 9:40:02 PM UTC-4, Victor Olex wrote:

 What I aiming for is to provide users a library of base class(es), which 
 are mapped using SQLAlchemy. The classes are then meant to be extended by 
 users with business logic. I am not hell bent on using inheritance for 
 this, but for now I went with your __abstract__ = True solution only in a 
 somewhat inverted way. 


There's another thread from within the past 2 weeks from someone else 
trying to tackle this problem. 

I mentioned in that post, and I'll mention again here -- the best method I
found was to use a form of a registry pattern -- where child classes
inherit from a base class and declare any overrides plus the types of
relationships they require or provide.  As the classes are initialized, this
data is recorded in a registry.  After initialization, relationships are
mapped onto the classes using the data in the registry, roughly as in the
sketch below.
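
Roughly, the shape of it - a sketch with made-up names, not the actual
code from that thread:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()
registry = []

class LibraryBase(Base):
    __abstract__ = True

    @classmethod
    def register(cls, **rels):
        # child classes declare the relationships they require or
        # provide; record them rather than mapping immediately
        registry.append((cls, rels))

def configure():
    # run once all user subclasses are defined: map the recorded
    # relationships onto the classes using the registry data
    for cls, rels in registry:
        for name, target in rels.items():
            setattr(cls, name, relationship(target))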




Re: [sqlalchemy] Oracle with sequence for primary key and that sequence out of sync

2014-06-25 Thread Jonathan Vanasco


On Monday, June 2, 2014 6:18:42 PM UTC-4, Michael Bayer wrote:

there’s nothing on the Python library side that could do that, someone had 
 to have run an ALTER SEQUENCE on the database side in order for that to 
 happen.


I don't even think you could use ALTER SEQUENCE in this situation.  I 
fondly remember Oracle sequences needing to be dropped and re-added with 
a lot of NEXTVALs... or using chunks of PL/SQL scripts to otherwise modify 
the sequence.

Doing a quick look online, it seems like things still work the same way.  

If I were in this situation, I would enable a lot of query and connection 
logging, and then comb through the logs to see what was happening.For a 
variety of reasons, this doesn't seem like it should be possible.

Is it possible that some records were being inserted with a numeric id that
was originally generated by .nextval() in your app, but has since been
incremented only in Python?



Re: [sqlalchemy] postgresql large objects

2014-06-25 Thread Jason Newton
Hi,

I've replied inline below.


On Wed, Jun 25, 2014 at 6:46 AM, Mike Bayer mike...@zzzcomputing.com
wrote:


  well we just added the OID type in 0.9.5, so you at least have that.


I came across the entry on the issue tracker a little bit after
submitting.  As is usual for me, support for it was only added recently.


 The PG LOB feature is very sparsely documented - on PG's docs, they only
 seem to document the C API (
 www.postgresql.org/docs/9.2/static/lo-interfaces.html), feel free to
 point me to better documentation on this.


There are also server-side functions; they operate almost exactly the same
as the client-side API: http://www.postgresql.org/docs/9.3/static/lo-funcs.html .
There is no better documentation than those two pages that I know of, but
they were sufficient for me.


 As far as transparently embedding this into INSERT/UPDATE, my first
 thought is that this might be unwise considering how far removed these
 functions are from any kind of mainstream usage in such statements -
 particularly if separate statements have to be called per value to get at
 OIDs or similar.That PG's docs barely mention this whole feature much
 less any kind of regular SQL integration is a red flag.   PG's BYTEA type
 is already arbitrarily large so there is probably not much interest in a
 type like this.If it's the streaming feature you're looking for,
 SQLA's usual approach such as that of Oracle's LOB is to pre-stream it on
 the result set side (necessary, because fetching a batch of rows requires
 it), and cx_Oracle doesn't provide too much option to stream on the write
 side.   I've dealt a lot with streaming datatypes back in the day but
 sometime in the early 2000's everyone just stopped using them.


Seems to fly in the face of the point of SQLA, although the integration
difficulties are appreciated.  Most advanced postgresql drivers, in any
language's bindings, have added support for this type, although none of them
can hide that it's file-like.  PG's BYTEA is NOT arbitrarily large; it has
quite real practical limits, and I've hit them regularly in storing
compressed HDF5 documents in the database as part of a bench-testing
framework.  The theoretical limit is 1 GB but the practical limit is far
lower (
http://www.postgresql.org/message-id/cafj8pracfkoinp2uxeizod5krx29n2ofsoldh0w6ej7rxko...@mail.gmail.com
) . http://www.postgresql.org/docs/9.3/static/storage-toast.html  I'm not
looking for streaming, just retrieve/set of binary buffers.  It'd be nice to
translate them transparently to in-memory HDF5 Python objects (i.e. h5py).



 As the feature involves SQL functions I don't think you'd be dealing only
 with bind_processor(), the SQL functions themselves would probably be via
 SQL-level processing, see
 http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.
 It's true these functions aren't given access to the compiler context where
 you'd be able to discern INSERT from UPDATE, so I'd probably use two
 variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply
 these to an INSERT or UPDATE statement at runtime probably using a
 @compiles decorator -
 http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take
 the existing LOB type and just swap it out for InsertLOB/UpdateLOB.
 Adding compiler context to TypeEngine is something we could look into as a
 feature as well so that bind_expression() has this available somehow and
 switching around types wouldn't be needed.


Separate datatypes are clearly not a good approach to this.


 building this would involve stringing together hooks that are not quite
 set up for this, however for me to really assist here I'd need to see exact
 examples of what INSERT, UPDATE and anything else looks like in conjunction
 with these functions.


Everything starts with a transaction block.  Hand-crafted inserts use the
server-side lo_create(-1) (in SQL), which allocates a new large object, with
a RETURNING clause to get the oid back to the client side in one go.  Then I
start using the lobject API on the returned oid.  In patches I've made to
cppdb, in the bind of std::istream I call the client-side lo_create, lo_open,
a sequence of lo_writes, and finally lo_close.  In retrospect this doesn't
handle updates particularly well, and maybe this could be used in sqlalchemy
as is too - an update just allocates a new large object and unreferences
the old one.  This would leave it up to vacuumlo to GC those updated
lobs.  In my workload, however, my lobs so far are immutable - they are
results of simulations, and this is the way I've worked to date.  It
probably won't stay that way forever.
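
Concretely, the insert flow looks something like this with psycopg2 - a
sketch, with made-up table and column names:

import psycopg2

conn = psycopg2.connect('dbname=test')  # placeholder DSN
with conn:  # everything happens inside a transaction
    cur = conn.cursor()
    # allocate the large object server-side and fetch its oid in one go
    cur.execute(
        "INSERT INTO result (name, data_oid) "
        "VALUES (%s, lo_create(-1)) RETURNING data_oid",
        ('run-1',))
    oid = cur.fetchone()[0]
    # then write the payload through the file-like lobject API
    lob = conn.lobject(oid, mode='wb')
    lob.write(b'... payload bytes ...')
    lob.close()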



 I would note that psycopg2 itself also provides for extension types,
 including custom Connection and Cursor subclasses.   If a lot of things
 have to happen when these types are in play it might be something that can
 occur at that level, PG's type API is obviously a lot more PG specific.


-Jason


Re: [sqlalchemy] postgresql large objects

2014-06-25 Thread Mike Bayer

On 6/25/14, 2:53 PM, Jason Newton wrote:
 Seems to fly in the face at the point of SQLA although integration
 difficulties are appreciated.  Most advanced postgresql drivers in any
 language bindings have added support for this type although none of
 them can hide that it's file like. 
SQLA doesn't go out of its way for unusual, very database-specific
features that up til now have demonstrated little to no real-world use
(nobody's ever asked for this feature before, and googling about
Postgresql LOBs turns up very little).  There are tons of examples of
features like this across many different database backends.  If they are
easy to add, we add them, or if they are easy to address via a recipe,
we add the recipe to the wiki.

But if the behavior requires substantial changes to the core and
dialect, and the ratio of complexity of impact to sparseness of need is
really high, it's not worth it and actually kind of damaging to most
users to complicate the library for use cases that are extremely rare
and can just as well be addressed by dropping down to raw DBAPI code. 
Complications/destabilizations/performance degradations that are
hoisted onto the whole userbase for the benefit of a single feature that
is virtually never needed is the wrong choice to make; I'm presented
with this choice all the time and there's nearly always work to be done
in extricating ill-conceived features and behaviors that went in too
quickly.  I'm pretty confident that this feature won't require any of
that, but that remains to be seen.

Just in case this wasn't apparent, you certainly *can* use psycopg2's
bindings when you're in an otherwise SQLAlchemy app.  Worst case you can
retrieve a raw psycopg2 connection using connection.raw_connection and
do whatever you need.  If you truly have some use for LOBs, SQLAlchemy
isn't preventing you from using it, it's just not providing any niceties
around it.   The fact that these unusual use cases are not ever
prohibited by SQLA further raises the bar to adding first class support
for them.
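
That is, something along these lines - a sketch, with a placeholder URL
and oid:

from sqlalchemy import create_engine

engine = create_engine('postgresql+psycopg2://scott:tiger@localhost/test')
some_oid = 12345  # placeholder

raw = engine.raw_connection()  # the underlying psycopg2 connection (proxied)
try:
    lob = raw.lobject(some_oid, mode='rb')  # psycopg2's LOB API, directly
    data = lob.read()
    lob.close()
    raw.commit()
finally:
    raw.close()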


 PG's BYTEA is NOT arbitrarily large; it has quite real practical
 limits, and I've hit them regularly in storing compressed HDF5
 documents in the database as part of a bench-testing framework.  The
 theoretical limit is 1 GB but the practical limit is far lower
 (http://www.postgresql.org/message-id/cafj8pracfkoinp2uxeizod5krx29n2ofsoldh0w6ej7rxko...@mail.gmail.com
 ) . http://www.postgresql.org/docs/9.3/static/storage-toast.html  I'm
 not looking for streaming, just retrieve/set of binary buffers.  It'd
 be nice to translate them transparently to in-memory HDF5 Python
 objects (i.e. h5py).
Python translation is very easy in SQLA; it's just that if you have special
needs for SQL syntaxes, that's where special behaviors may be needed.
So far it sounds like the only blocking factor is that bind_sql needs to
distinguish between INSERT and UPDATE.  That's not a terribly tall order,
though it is inconvenient in that the API would need a
backwards-compatibility layer.


  


 As the feature involves SQL functions I don't think you'd be
 dealing only with bind_processor(), the SQL functions themselves
 would probably be via SQL-level processing, see
 
 http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.

 It's true these functions aren't given access to the compiler
 context where you'd be able to discern INSERT from UPDATE, so I'd
 probably use two variants of datatype, an InsertLOB and UpdateLOB
 datatype, then I'd apply these to an INSERT or UPDATE statement at
 runtime probably using a @compiles decorator -
 http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is,
 take the existing LOB type and just swap it out for
 InsertLOB/UpdateLOB.Adding compiler context to TypeEngine is
 something we could look into as a feature as well so that
 bind_expression() has this available somehow and switching around
 types wouldn't be needed.


 Separate datatypes are clearly not a good approach to this.

Let me clarify that these separate datatypes would be totally invisible
to the user.   The user would work with a single LOB type.  Translation
to Insert/Update versions would not be explicit and would occur at the
point at which the insert/update construct is compiled.   At the moment
this may be the best approach short of modifying the library (but then
again I can't say much about the approach because I have little idea
what the SQL we're talking about looks like).
 
 Everything starts with a transaction block. 
the psycopg2 DBAPI is implicitly in a transaction block at all times
unless autocommit mode is set up, so there shouldn't be any problem here.

 Hand-crafted inserts use the server-side lo_create(-1) (in SQL), which
  allocates a new large object, with a RETURNING clause to get the oid
  back to the client side in one go.  Then I start using the lobject API
  on the returned oid.
I'd like to see explicit SQL, preferably in the form of a psycopg2 script
that illustrates all the operations you wish to support and specifically
how they must interact with the database.

Re: [sqlalchemy] postgresql large objects

2014-06-25 Thread Jason Newton
 Just in case this wasn't apparent, you certainly *can* use psycopg2's
 bindings when you're in an otherwise SQLAlchemy app.  Worst case you can
 retrieve a raw psycopg2 connection using connection.raw_connection and do
 whatever you need.  If you truly have some use for LOBs, SQLAlchemy isn't
 preventing you from using it, it's just not providing any niceties around
 it.   The fact that these unusual use cases are not ever prohibited by SQLA
 further raises the bar to adding first class support for them.

 Yes, this much I knew would work, it's just clunkier than it could be.





  PG's BYTEA is NOT arbitrarily large; it has quite real practical limits,
  and I've hit them regularly in storing compressed HDF5 documents in the
  database as part of a bench-testing framework.  The theoretical limit is
  1 GB but the practical limit is far lower (
  http://www.postgresql.org/message-id/cafj8pracfkoinp2uxeizod5krx29n2ofsoldh0w6ej7rxko...@mail.gmail.com
  ) . http://www.postgresql.org/docs/9.3/static/storage-toast.html  I'm not
  looking for streaming, just retrieve/set of binary buffers.  It'd be nice
  to translate them transparently to in-memory HDF5 Python objects (i.e. h5py).

  Python translation is very easy in SQLA; it's just that if you have special
  needs for SQL syntaxes, that's where special behaviors may be needed.  So
  far it sounds like the only blocking factor is that bind_sql needs to
  distinguish between INSERT and UPDATE.  That's not a terribly tall order,
  though it is inconvenient in that the API would need a
  backwards-compatibility layer.


Perhaps you can keep backwards compatibility by introspecting the callbacks
to determine the arity of their arguments.
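
e.g., a sketch of that idea and nothing more:

import inspect

def call_processor(proc, value, context=None):
    # dispatch on arity: legacy one-argument processors keep working,
    # while a hypothetical two-argument form also receives the context
    if len(inspect.getargspec(proc).args) >= 2:
        return proc(value, context)
    return proc(value)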








 As the feature involves SQL functions I don't think you'd be dealing only
 with bind_processor(), the SQL functions themselves would probably be via
 SQL-level processing, see
 http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.
 It's true these functions aren't given access to the compiler context where
 you'd be able to discern INSERT from UPDATE, so I'd probably use two
 variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply
 these to an INSERT or UPDATE statement at runtime probably using a
 @compiles decorator -
 http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take
 the existing LOB type and just swap it out for InsertLOB/UpdateLOB.
 Adding compiler context to TypeEngine is something we could look into as a
 feature as well so that bind_expression() has this available somehow and
 switching around types wouldn't be needed.


  Separate datatypes are clearly not a good approach to this.


 Let me clarify that these separate datatypes would be totally invisible to
 the user.   The user would work with a single LOB type.  Translation to
 Insert/Update versions would not be explicit and would occur at the point
 at which the insert/update construct is compiled.   At the moment this may
 be the best approach short of modifying the library (but then again I can't
 say much about the approach because i have little idea what the SQL we're
 talking about looks like).


Ah.  I thought it would be done in the declarative table descriptions.
Being relatively new to SQLA, I'm not sure what that would look like.




   Everything starts with a transaction block.

 the psycopg2 DBAPI is implicitly in a transaction block at all times
 unless autocommit mode is set up, so there shouldn't be any problem here.

 AFAIK it won't work with autocommit, but I don't have documentation to
prove that; I just remember getting errors pertaining to it, I think in JDBC.



Hand-crafted inserts use the server-side lo_create(-1) (in SQL), which
 allocates a new large object, with a RETURNING clause to get the oid back
 to the client side in one go.  Then I start using the lobject API on the
 returned oid.

 I'd like to see explicit SQL, preferably in the form of a psycopg2 script
 that illustrates all the operations you wish to support and specifically
 how they must interact with the database.


I've attached example usages.  As I indicated in my prior email, right now
I only do inserts/selects.



In patches I've made to cppdb, in the bind of std::istream I call
 client side lo_create, lo_open, a sequence of lo_writes, and finally
 lo_close.  In retrospect this doesn't handle updates particularly well and
 maybe this could be used in sqlalchemy as is too - an update just allocates
 a new large object and unreferences the old one.

 I wonder why the mechanics of these functions can't be more cleanly
 wrapped into server-side stored procedures?  If they can be distilled
 into simple insert_lob() update_lob() functions that would potentially be a
 better separation of duties.


I think it's because only small lobs could rely on that - huge lobs that
exceed working memory fit better with a file-like API.  It would be a nice
convenience to have, perhaps.  These same small lobs (i.e. lobs that fit in
Re: [sqlalchemy] postgresql large objects

2014-06-25 Thread Mike Bayer

On 6/25/14, 5:35 PM, Jason Newton wrote:

  


  Hand-crafted inserts use the server-side lo_create(-1) (in SQL), which
  allocates a new large object, with a RETURNING clause to get the oid
  back to the client side in one go.  Then I start using the
  lobject API on the returned oid.
 I'd like to see explicit SQL, preferably in the form of a psycopg2
 script that illustrates all the operations you wish to support and
 specifically how they must interact with the database.


 I've attached example usages.  As I indicated in my prior email, right
 now I only do inserts/selects.
Here's my immediate thought about the INSERT (and the UPDATE) - the
first way is using events, the second would attempt to move this system
into something more native to the psycopg2 dialect (a skeleton of the
event approach follows the steps below):

Use the before_cursor_execute() and after_cursor_execute() events to get
at this:

http://docs.sqlalchemy.org/en/latest/core/events.html#sqlalchemy.events.ConnectionEvents.before_cursor_execute
http://docs.sqlalchemy.org/en/latest/core/events.html#sqlalchemy.events.ConnectionEvents.after_cursor_execute

1. Create a new type called PGLOB or whatever you want.  For starters,
this can just be a blank UserDefinedType subclass.

in before_cursor_execute:

2. determine if INSERT or UPDATE using context.isinsert, context.isupdate

3. examine the datatypes that SQLAlchemy is using here, by looking at
context.compiled.binds.  The values here are BindParameter objects; you
want to look in those for the ones that are of type PGLOB.

4. From context.compiled.binds, you have the names of the bound params
with the type.  Search and replace the statement for occurrences of
that bound parameter, replace with lo_creat(-1) or whatever you need
there.

5. Also append to the RETURNING clause those cols you need.

6. the statement as a return value will be used, if you set up the event
with retval=True (see the docs).

in after_cursor_execute:

7. in after_cursor_execute - call fetchone() to get the RETURNING
values.  Get that OID you care about, then do the work with conn.lobject
and all that.  Hopefully this doesn't mess up the existing cursor state.

8.  now the tricky part.  SQLAlchemy needs that row if you're doing
implicit returning to get at primary key values.   psycopg2's cursor
seems to have a scroll() method that works for client side cursors.  
I'd scroll it back one so that SQLA gets the state it expects.
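
In skeleton form, steps 1-6 might look like this - an untested sketch in
which 'engine' is assumed to exist and the parameter rewriting is only
indicated:

from sqlalchemy import event
from sqlalchemy.types import UserDefinedType

class PGLOB(UserDefinedType):  # step 1: blank placeholder type
    def get_col_spec(self):
        return 'OID'

@event.listens_for(engine, 'before_cursor_execute', retval=True)
def _rewrite_lobs(conn, cursor, statement, parameters, context, executemany):
    if context.isinsert or context.isupdate:  # step 2
        for name, bind in context.compiled.binds.items():  # step 3
            if isinstance(bind.type, PGLOB):
                # step 4: swap the pyformat parameter for lo_creat(-1)
                statement = statement.replace(
                    '%%(%s)s' % name, 'lo_creat(-1)')
                # step 5: ask for the new oid back (simplified; assumes
                # no RETURNING clause is present yet)
                statement += ' RETURNING %s' % name
    return statement, parameters  # step 6: used because retval=True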

Alternative system, more plugged in:

1. We would be creating new features on
sqlalchemy/dialects/postgresql/psycopg2.py -
PGExecutionContext_psycopg2.  Similar hooks are available here which you
can use to accomplish similar tasks;  you'd want to look at the
pre_exec(), post_exec() and possibly post_insert() methods, and maybe
even _fetch_implicit_returning().If SQLA were to support this more
natively, things would be happening at this level.   But again, I really
wouldn't want all kinds of hardcoded expectations of LOB objects in this
object taking up processing time for the vast majority of use cases that
don't use LOBs, so the extra logic here should be contained within
something that can easily be placed aside based on options or similar.






Re: [sqlalchemy] Joins across differently-sharded tables

2014-06-25 Thread Mike Bayer

On 6/25/14, 6:36 PM, Neil Hooey wrote:
 Is it possible to do a join query on two tables that have different
 shard keys in SQLAlchemy?

 For example, if you have two tables:

 users: [userId, name] (sharded on userId)
 messages: [messageId, userId, message] (sharded on messageId)

 Could you select all of the messages for a particular user with:

 select u.name, m.userId, m.messageId, m.message
 from messages m
 join users u on m.userId = u.userId

 (Even though messages are sharded on messageId instead of userId?)

 I haven't seen an example of this in the code.
there's no way to do that because two shards are accessed via two
totally different database connections.
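
The usual workaround is to do the join client-side: query each shard
separately and stitch the rows together in Python.  A sketch, assuming
user_session and message_session point at the two shards, and User and
Message are the mapped classes for the tables above:

some_user_id = 42  # placeholder
user = user_session.query(User).filter_by(userId=some_user_id).one()
messages = (message_session.query(Message)
            .filter_by(userId=user.userId)
            .all())
rows = [(user.name, m.userId, m.messageId, m.message) for m in messages]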



[sqlalchemy] Aggregate functions across shards

2014-06-25 Thread Neil Hooey
Do aggregate functions work across shards in SQLAlchemy?

Such as count, sum, etc?



Re: [sqlalchemy] Aggregate functions across shards

2014-06-25 Thread Mike Bayer

On 6/25/14, 6:41 PM, Neil Hooey wrote:
 Do aggregate functions work across shards in SQLAlchemy?

 Such as count, sum, etc?

not across them, nope.  you need to merge those yourself.
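
e.g., with the horizontal shard extension you can run the aggregate per
shard and combine the results in Python - a sketch, with made-up shard
ids and model; 'session' is assumed to be a ShardedSession:

from sqlalchemy import func

# ShardedQuery.set_shard() pins a query to a single shard
total = sum(
    session.query(func.count(Message.messageId)).set_shard(sid).scalar()
    for sid in ('shard1', 'shard2'))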

