Re: [sqlalchemy] Query manipulation when using joinedload

2013-09-24 Thread Michael Bayer

On Sep 24, 2013, at 12:52 AM, Mick Heywood m...@booodl.com wrote:

 Hi,
 
 I'm attempting to do some universal filtering using a custom Query class.  In 
 this case, I'm trying to filter out all items marked as archived in two 
 related classes.  I'm having some trouble adding the required filters to the 
 query at all the right levels.
 
 I'm using Flask 0.9, SQLAlchemy 0.8 and PostgreSQL 9.1.5
 
 Let's call the two classes Parent and Child, which are inheriting from 
 Archivable:
 
 class Archivable(object):
   @declared_attr
   def is_archived(cls):
 return Column('is_archived', types.Boolean, nullable=False, 
 default=False, index=True)
 
 class Parent(base, Archivable):
   __tablename__ = 'parent'
   id = Column('id', types.BigInteger, primary_key=True, nullable=False)
   is_archived = 
 
 class Child(base, Archivable):
   __tablename__ = 'child'
   id = Column('id', types.BigInteger, primary_key=True, nullable=False)
   parent_id = Column('id', types.BigInteger, ForeignKey('parent.id'), 
 nullable=False)
   parent = relationship('Parent', primaryjoin='Child.parent_id==Parent.id',
 backref='children')
 
 Somewhere in my code I am calling:
   parent = db.session.query(Parent).filter(Parent.id == 
 1234).options(joinedload('children')).first()
 
 This is resulting in a query of the form:
 
 SELECT anon_1.*, child.*
 FROM (
   SELECT parent.*
   FROM parent
   WHERE parent.id = 1234
   LIMIT 1) AS anon_1
 LEFT OUTER JOIN child ON child.parent_id = parent.id
 
 which is fine.
 
 When I try and use a custom query class to access the query and filter it 
 however, I only seem to be able to access elements of the inner subquery.  
 self._entities for instance only shows a single _MapperEntity 
 Mapper|Parent|parent, self.whereclause is a BooleanClauseList of parent.id = 
 1234.  If I try and inject my filters at this stage using the following:

this is by design.  joinedload() is only used for the purposes of loading 
related items and collections against entities which the Query is loading, so 
it is applied such that it is always external to the SELECT statement that the 
Query emits.joinedload() can only load exactly what the original 
relationship() refers to.

If you'd like to join to another table with special criteria and then establish 
that as a collection load, for simple cases you use contains_eager() for that:

http://docs.sqlalchemy.org/en/rel_0_8/orm/loading.html?highlight=contains_eager#contains-eager

You can also consider actually using a custom relationship() for loading 
collections that require special filtering.Basic example is at:

http://docs.sqlalchemy.org/en/rel_0_8/orm/relationships.html#specifying-alternate-join-conditions

and there is also an example of injecting bound parameters into the primaryjoin 
(uses lazy load and/or joinedload too), this might be related to what you're 
trying to do:

http://www.sqlalchemy.org/trac/wiki/UsageRecipes/GlobalFilter






signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Michael Bayer

On Sep 23, 2013, at 10:04 PM, Ken Lareau klar...@tagged.com wrote:

 Hopefully this will make sense...
 
 I have a database which is in need of some normalization of the column
 naming in various tables.  In an effort to minimize disruption (since this
 is a live database used by many applications), I'm trying to use a two-step
 approach:
 
 1) Add a new column that will 'mirror' the current primary key column,
then update code to utilize that column once in place
 2) Once all code is converted, remove the original primary key column
and make the new column the primary key instead.
 
 In an effort to minimize change to the code I have currently using my
 SQLAlchemy model, I'm trying to find a way to deal with this from within
 the declarative classes themselves.  So the questions are:
 
  - Is this even possible, and if so, how can it be done?
  - If not possible, is there a good yet minimally intrusive external
change that can be done?
 
 My searching through the docs so far hasn't turned up anything useful,
 I'm afraid...



it depends a lot on how you are generating primary key values.   how are the 
new values being generated?   (things like database platform affect this 
dramatically).   if using something like MySQL's autoincrement, you probably 
want to write a trigger that will populate this new column after an INSERT.
overall a trigger is probably the best way to get this extra value in the table 
efficiently.

OTOH if your applications know the new primary key value beforehand, then 
SQLAlchemy events can be used to copy the value into the new column.

you also want to be mindful of indexing on this column, as applications move to 
use the new column, you probably want queries to still use indexes.


signature.asc
Description: Message signed with OpenPGP using GPGMail


[sqlalchemy] Dynamic model creation for temporary table

2013-09-24 Thread Aleksandr Kuznetsov
Hi!

I have ModelA model (declarative style) for persistent table. I have to 
periodically update that table with some external data (replace existing 
data with new). It must be performed with minimal locking and there must 
not be mixed old data and new data. I see the follwing way to implement it: 
create some temporary table like ModelA table, say it will be ModelATmp; 
fill ModelATmp with new external data; Replace ModelA table with new 
created table (SQL: RENAME TABLE a_table TO a_table_prev; RENAME TABLE 
a_tmp_table TO a_table; DROP TABLE a_table_prev;).

I'm wondering what is a right way to implement such scenario? I want to 
create some table rotator class, pass to it ModelA, get from it ModelATmp, 
fill it and rotate tables with that rotator. Is it possible? How should I 
do it?

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Ken Lareau
On Tue, Sep 24, 2013 at 7:25 AM, Michael Bayer mike...@zzzcomputing.comwrote:


 On Sep 23, 2013, at 10:04 PM, Ken Lareau klar...@tagged.com wrote:

  Hopefully this will make sense...
 
  I have a database which is in need of some normalization of the column
  naming in various tables.  In an effort to minimize disruption (since
 this
  is a live database used by many applications), I'm trying to use a
 two-step
  approach:
 
  1) Add a new column that will 'mirror' the current primary key column,
 then update code to utilize that column once in place
  2) Once all code is converted, remove the original primary key column
 and make the new column the primary key instead.
 
  In an effort to minimize change to the code I have currently using my
  SQLAlchemy model, I'm trying to find a way to deal with this from within
  the declarative classes themselves.  So the questions are:
 
   - Is this even possible, and if so, how can it be done?
   - If not possible, is there a good yet minimally intrusive external
 change that can be done?
 
  My searching through the docs so far hasn't turned up anything useful,
  I'm afraid...



 it depends a lot on how you are generating primary key values.   how are
 the new values being generated?   (things like database platform affect
 this dramatically).   if using something like MySQL's autoincrement, you
 probably want to write a trigger that will populate this new column after
 an INSERT.overall a trigger is probably the best way to get this extra
 value in the table efficiently.

 OTOH if your applications know the new primary key value beforehand, then
 SQLAlchemy events can be used to copy the value into the new column.

 you also want to be mindful of indexing on this column, as applications
 move to use the new column, you probably want queries to still use indexes.



The primary key values are indeed using autoincrement in MySQL.

The reason for doing this is the database is an existing one over which
I've been building a library to be used to programmatically access it,
but it's not the only library accessing the database, so to prevent
potential
breakage with the other libraries, we taking the safe approach.

The normalization is due to lovely decisions like having all columns
be in lowercase except for the primary key (though sometimes other
keys also have the same problem) where we have something like
'PackageID'.  The new column is meant to replace the original column,
but there will be a small period of time that both columns will exist,
so they must remain in sync until all of the current libraries and appli-
cations are using the new column.  At that point, the new column
will take over and hopefully no one will be the wiser that things were
wrong before. *smile*

As far as I know, none of the applications know the primary key
value beforehand, so that isn't a concern.  Indexing might be, though
the hope is the changeover will be relatively short (matter of a day
or so), but I can still index the new file if the impact is great enough
(the database isn't _too_ large at this point).

Of course, this leads back to the original question... how exactly
does one accomplish this? :)  Hopefully the clarification above helps
some in determining this.


-- 
- Ken Lareau

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Michael Bayer

On Sep 24, 2013, at 12:00 PM, Ken Lareau klar...@tagged.com wrote:

 
 
 
 Of course, this leads back to the original question... how exactly
 does one accomplish this? :)  Hopefully the clarification above helps
 some in determining this.

as I said, build a trigger:

http://dev.mysql.com/doc/refman/5.0/en/triggers.html




signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Ken Lareau
On Tue, Sep 24, 2013 at 10:32 AM, Michael Bayer mike...@zzzcomputing.comwrote:


 On Sep 24, 2013, at 12:00 PM, Ken Lareau klar...@tagged.com wrote:

 
 
 
  Of course, this leads back to the original question... how exactly
  does one accomplish this? :)  Hopefully the clarification above helps
  some in determining this.

 as I said, build a trigger:

 http://dev.mysql.com/doc/refman/5.0/en/triggers.html



Oh, you meant on the database side... d'oh.  *slaps hand to forehead*
Thanks, will look at this.

-- 
- Ken Lareau

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Dynamic model creation for temporary table

2013-09-24 Thread Michael Bayer

On Sep 24, 2013, at 10:42 AM, Aleksandr Kuznetsov aku.ru...@gmail.com wrote:

 Hi!
 
 I have ModelA model (declarative style) for persistent table. I have to 
 periodically update that table with some external data (replace existing data 
 with new).

silly question, is there a reason standard replication techniques provided by 
the database can't be used here?   they might be a little more heavy-handed to 
set up than some in-app trick but it'll work more solidly.


 It must be performed with minimal locking and there must not be mixed old 
 data and new data. I see the follwing way to implement it: create some 
 temporary table like ModelA table, say it will be ModelATmp; fill ModelATmp 
 with new external data; Replace ModelA table with new created table (SQL: 
 RENAME TABLE a_table TO a_table_prev; RENAME TABLE a_tmp_table TO a_table; 
 DROP TABLE a_table_prev;).
 
 I'm wondering what is a right way to implement such scenario? I want to 
 create some table rotator class, pass to it ModelA, get from it ModelATmp, 
 fill it and rotate tables with that rotator. Is it possible? How should I do 
 it?

as always, the main issue is concurrency.   can you guarantee that exactly one 
connection at a time will be performing this operation and that during so, no 
other connections will have any read or write locks of any kind on this table?  
otherwise it's not going to work.

to implement the scenario, assuming you've figured out how to make sure a 
connection has exclusive access, you've already spelled it out - just execute 
the SQL you've described on the connection.

as far as build a class or whatnot that sort of depends on app architecture, 
I'd not bother creating a class if there's just one function and no overarching 
pattern of class construction (see Stop Writing Classes! for the general 
idea: http://pyvideo.org/video/880/ )




signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Ken Lareau
On Tue, Sep 24, 2013 at 11:10 AM, Ken Lareau klar...@tagged.com wrote:


 On Tue, Sep 24, 2013 at 10:32 AM, Michael Bayer 
 mike...@zzzcomputing.comwrote:


 On Sep 24, 2013, at 12:00 PM, Ken Lareau klar...@tagged.com wrote:

 
 
 
  Of course, this leads back to the original question... how exactly
  does one accomplish this? :)  Hopefully the clarification above helps
  some in determining this.

 as I said, build a trigger:

 http://dev.mysql.com/doc/refman/5.0/en/triggers.html



 Oh, you meant on the database side... d'oh.  *slaps hand to forehead*
 Thanks, will look at this.


Sadly after over an hour of fighting with MySQL triggers and being
unsuccessful at getting any variation to work, I've decided that I'm
just going to need to shut down key infrastructure pieces during
the column renaming and avoid the temporary new column as it
would just complicate my current codebase more than it already
is... once again, I do appreciate the help, however.

-- 
- Ken Lareau

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Michael Bayer

On Sep 24, 2013, at 5:42 PM, Ken Lareau klar...@tagged.com wrote:

 On Tue, Sep 24, 2013 at 11:10 AM, Ken Lareau klar...@tagged.com wrote:
 
 On Tue, Sep 24, 2013 at 10:32 AM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 
 On Sep 24, 2013, at 12:00 PM, Ken Lareau klar...@tagged.com wrote:
 
 
 
 
  Of course, this leads back to the original question... how exactly
  does one accomplish this? :)  Hopefully the clarification above helps
  some in determining this.
 
 as I said, build a trigger:
 
 http://dev.mysql.com/doc/refman/5.0/en/triggers.html
 
 
 
 Oh, you meant on the database side... d'oh.  *slaps hand to forehead*
 Thanks, will look at this.
 
 Sadly after over an hour of fighting with MySQL triggers and being
 unsuccessful at getting any variation to work, I've decided that I'm
 just going to need to shut down key infrastructure pieces during
 the column renaming and avoid the temporary new column as it
 would just complicate my current codebase more than it already
 is... once again, I do appreciate the help, however.

I googled for mysql trigger copy autoincrement and got this:

Copy auto-increment value to another column on insert:

http://forums.mysql.com/read.php?99,186171,186241#msg-186241

the trigger here makes use of LAST_INSERT_ID() and is worth a shot.




signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [sqlalchemy] using WITH query AS, cte() with postgresql

2013-09-24 Thread Nathan Mailg
On Sep 20, 2013, at 11:59 AM, Michael Bayer mike...@zzzcomputing.com wrote:

 
 Ok now I'm still not following - in this case, row is a NamedTuple, or a 
 mapped instance?   if its a NamedTuple then you don't have the service of 
 traversing along object relationships available since the NamedTuple isn't a 
 mapped object.  You'd need to query for a full object with an identity ( 
 query(MyClass) as opposed to query(MyClass.col1, MyClass.col2, ...)
 
 if you provide me with *very minimal, but working* mappings and the query 
 we're working on, I can show you how to make it load entities rather than 
 rows.
 


In the debugger the returned row was of type KeyedTuple, which I think is 
derived from NamedTuple, so yes, it's a NamedTuple and not a mapped instance.

I really appreciate your help with this. Below is as stripped down as I can 
make it while still showing the moving parts:

class Appl(Base):
__tablename__ = 'appl'
id  = Column(Integer, primary_key=True)
refid   = Column(Integer, Sequence('appl_refid_seq'))
appldate= Column(Date)
lastname= Column(Unicode(50))
firstname   = Column(Unicode(50))
cityid  = Column(Integer, ForeignKey('city.id'))
cityid2 = Column(Integer, ForeignKey('city.id'))
#
city = relationship('City', lazy='joined', 
primaryjoin='City.id==Appl.cityid')
city2= relationship('City', lazy='joined', 
primaryjoin='City.id==Appl.cityid2')

class City(Base):
__tablename__ = 'city'
id  = Column(Integer, primary_key=True)
name= Column(Unicode(30), nullable=False)
state   = Column(Unicode(2), nullable=False)
zipcode = Column(Unicode(10))

qlast, qfirst = params['query'].split(' ', 1)
d = DBSession.query(Appl).\
distinct(Appl.refid).\
filter(Appl.lastname.ilike(qlast)).\
filter(Appl.firstname.ilike(qfirst+'%')).\
group_by(Appl).\
order_by(Appl.refid, Appl.appldate.desc())
d = d.cte('distinct_query')
q = DBSession.query(d).\
join(City, City.id==d.c.cityid).\
order_by(d.c.lastname, d.c.firstname)

What I'm trying to get at are the attrs on City after running q, like:

# row is KeyedTuple instance
for row in q.all():
# this works
print row.lastname
# below does not work,
# stuck here trying to get at joined City attrs, e.g. City has a name attr
print row.city.name

Thank you!

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Duplicating primary key value into another column upon insert?

2013-09-24 Thread Ken Lareau
On Tue, Sep 24, 2013 at 2:53 PM, Michael Bayer mike...@zzzcomputing.comwrote:


 On Sep 24, 2013, at 5:42 PM, Ken Lareau klar...@tagged.com wrote:

 On Tue, Sep 24, 2013 at 11:10 AM, Ken Lareau klar...@tagged.com wrote:


 On Tue, Sep 24, 2013 at 10:32 AM, Michael Bayer mike...@zzzcomputing.com
  wrote:


 On Sep 24, 2013, at 12:00 PM, Ken Lareau klar...@tagged.com wrote:

 
 
 
  Of course, this leads back to the original question... how exactly
  does one accomplish this? :)  Hopefully the clarification above helps
  some in determining this.

 as I said, build a trigger:

 http://dev.mysql.com/doc/refman/5.0/en/triggers.html



 Oh, you meant on the database side... d'oh.  *slaps hand to forehead*
 Thanks, will look at this.


 Sadly after over an hour of fighting with MySQL triggers and being
 unsuccessful at getting any variation to work, I've decided that I'm
 just going to need to shut down key infrastructure pieces during
 the column renaming and avoid the temporary new column as it
 would just complicate my current codebase more than it already
 is... once again, I do appreciate the help, however.


 I googled for mysql trigger copy autoincrement and got this:

 Copy auto-increment value to another column on insert:

 http://forums.mysql.com/read.php?99,186171,186241#msg-186241

 the trigger here makes use of LAST_INSERT_ID() and is worth a shot.


Closer, but it requires the main tables to not already have any
values in them (if I'm understanding the SQL correctly), and I'm
working with existing tables already populated with plenty of
data.  Honestly, at this point simply renaming the columns is
far less painful. :)  (And thankfully only needs to be done once.)
But thanks for the info, much appreciated.

-- 
- Ken Lareau

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


[sqlalchemy] Re: Duplicating primary key value into another column upon insert?

2013-09-24 Thread Nicholas Long
My mind keeps going back to events but of course there's the limitation to 
modifying Session state while handling various events. (
http://docs.sqlalchemy.org/en/rel_0_7/orm/events.html#sqlalchemy.orm.events.MapperEvents.after_update
)

But what about using the SQL Expression API?

*Relevant code snippet:*

class MyModel(Base):
__tablename__ = MyTable
priid = Column(Integer(), primary_key=True)
secid = Column(Integer())
 
 
@listens_for(MyModel, 'after_insert')
def mymodel_after_insert(mapper, connection, target):
mytable = MyModel.__table__
priid = target.priid
statement = (mytable.update()
.where(mytable.c.priid == priid)
.values(secid=priid))
connection.execute(statement)


Full code for reference:  
http://paste.pound-python.org/show/WVciGm4jCxgvz84jKrZy/
SQLA echo: http://paste.pound-python.org/show/hSyCGisr0X5eupVdyk3f/

The echo looks sane, though I can't be sure how the SQLAchemy internals are 
affected by this.

Thank you for your input, Michael.

Nick

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


[sqlalchemy] Bulk Update using ShardedSession

2013-09-24 Thread Piotr Deszyński


I'm tryign to do a bulk update using sqlalchemy ShardedSession. I've 
stumbled on a problem when i have to do a bulk update without selecting 
rows from a database. Here's how I've tried to do it:

class Row(Base, object): #pylint: disable=I0011, R0903
'''
Row
'''

__tablename__ = 'my_table'

some_primary = Column(Integer(11), primary_key=True)
sharded_attribute = Column(String())
updated_attribute = Column(Integer(2))
Session = scoped_session(sessionmaker(class_=ShardedSession))Session.configure(
query_chooser=lambda x: SHARDS_LIST,
id_chooser=lambda x, y: SHARDS_LIST,
shard_chooser=my_shard_chooser,
autocommit=False)

Then I'm making a query as follows:

session.query(Row) \
.filter_by(Row.sharded_attribute.in_(attributes_list)) \
.update({Row.updated_attribute: 1}, synchronize_session=False)

So the problem is, that sharded_attribute can be a list of rows which are 
in different shards. Using such a query SQLAlchemy calls my_shard_chooser. 
In this function I'm determining a shard or a shard list depending on the 
params. There's a problem during the bulk update thought, becauseinstance param 
is None, so I don't have value of sharded_attribute. In that case there's 
passed clause attribute, so using it I'm able to determine what values of 
sharded_attribute were provided. Sadly these items might be in different 
shards and I cannot return a list of shards ids (only single id is 
supported).

Does anybody know how I can change my approach to make it working? Do I 
have to resign from using a session? Is there at all a way to do it other 
than doing firstly select for all updated rows and then updating objects 
and calling commit or making direct calls on engine's execute?

Best regards

-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [sqlalchemy] Bulk Update using ShardedSession

2013-09-24 Thread Michael Bayer

On Sep 23, 2013, at 8:13 AM, Piotr Deszyński deszyn...@red-sky.pl wrote:

 I'm tryign to do a bulk update using sqlalchemy ShardedSession. I've stumbled 
 on a problem when i have to do a bulk update without selecting rows from a 
 database. Here's how I've tried to do it:
 
 class Row(Base, object): #pylint: disable=I0011, R0903
 '''
 Row
 '''
 
 __tablename__ = 'my_table'
 
 some_primary = Column(Integer(11), primary_key=True)
 sharded_attribute = Column(String())
 updated_attribute = Column(Integer(2))
 
 Session = scoped_session(sessionmaker(class_=ShardedSession))
 Session.configure(
 query_chooser=lambda x: SHARDS_LIST,
 id_chooser=lambda x, y: SHARDS_LIST,
 shard_chooser=my_shard_chooser,
 autocommit=False
 )
 Then I'm making a query as follows:
 
 session.query(Row) \
 .filter_by(Row.sharded_attribute.in_(attributes_list)) \
 .update({Row.updated_attribute: 1}, synchronize_session=False)

the current implementation for ShardedQuery doesn't have support for the 
update() or delete() methods.  Those methods were added to Query long after 
ShardedQuery was developed, and implementing them transparently is slightly 
non-trivial (essentially has to emit the UPDATE or DELETE across all shards 
that apply to the query_chooser, in the same way that _execute_and_instances 
does).

 So the problem is, that sharded_attribute can be a list of rows which are in 
 different shards. Using such a query SQLAlchemy calls my_shard_chooser. In 
 this function I'm determining a shard or a shard list depending on the 
 params. There's a problem during the bulk update thought, becauseinstance 
 param is None, so I don't have value of sharded_attribute. In that case 
 there's passed clause attribute, so using it I'm able to determine what 
 values of sharded_attribute were provided. Sadly these items might be in 
 different shards and I cannot return a list of shards ids (only single id is 
 supported).
 
 Does anybody know how I can change my approach to make it working? Do I have 
 to resign from using a session? Is there at all a way to do it other than 
 doing firstly select for all updated rows and then updating objects and 
 calling commit or making direct calls on engine's execute?
 

If I'm understanding correctly this is all due to update()/delete() not being 
implemented.Looking at how these work right now, there's not a 
straightforward path to override how they execute queries the way 
_execute_and_instances() does, so you might be better off implementing your own 
update()/delete() that's similar to how ShardedQuery._execute_and_instances 
works, but uses core table.update() and table.delete() constructs.



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [sqlalchemy] Bulk Update using ShardedSession

2013-09-24 Thread Piotr Deszyński
Hello,

Ok, thank you for the answer. I will try to look into that deeper then.

Best regards

W dniu środa, 25 września 2013 02:51:06 UTC+2 użytkownik Michael Bayer 
napisał:


 On Sep 23, 2013, at 8:13 AM, Piotr Deszyński 
 desz...@red-sky.pljavascript: 
 wrote:

 I'm tryign to do a bulk update using sqlalchemy ShardedSession. I've 
 stumbled on a problem when i have to do a bulk update without selecting 
 rows from a database. Here's how I've tried to do it:

 class Row(Base, object): #pylint: disable=I0011, R0903
 '''
 Row
 '''

 __tablename__ = 'my_table'

 some_primary = Column(Integer(11), primary_key=True)
 sharded_attribute = Column(String())
 updated_attribute = Column(Integer(2))
 Session = 
 scoped_session(sessionmaker(class_=ShardedSession))Session.configure(
 query_chooser=lambda x: SHARDS_LIST,
 id_chooser=lambda x, y: SHARDS_LIST,
 shard_chooser=my_shard_chooser,
 autocommit=False)

 Then I'm making a query as follows:

 session.query(Row) \
 .filter_by(Row.sharded_attribute.in_(attributes_list)) \
 .update({Row.updated_attribute: 1}, synchronize_session=False)


 the current implementation for ShardedQuery doesn't have support for the 
 update() or delete() methods.  Those methods were added to Query long after 
 ShardedQuery was developed, and implementing them transparently is slightly 
 non-trivial (essentially has to emit the UPDATE or DELETE across all shards 
 that apply to the query_chooser, in the same way that 
 _execute_and_instances does).

 So the problem is, that sharded_attribute can be a list of rows which are 
 in different shards. Using such a query SQLAlchemy calls my_shard_chooser. 
 In this function I'm determining a shard or a shard list depending on the 
 params. There's a problem during the bulk update thought, becauseinstance 
 param 
 is None, so I don't have value of sharded_attribute. In that case there's 
 passed clause attribute, so using it I'm able to determine what values of 
 sharded_attribute were provided. Sadly these items might be in different 
 shards and I cannot return a list of shards ids (only single id is 
 supported).

 Does anybody know how I can change my approach to make it working? Do I 
 have to resign from using a session? Is there at all a way to do it other 
 than doing firstly select for all updated rows and then updating objects 
 and calling commit or making direct calls on engine's execute?


 If I'm understanding correctly this is all due to update()/delete() not 
 being implemented.Looking at how these work right now, there's not a 
 straightforward path to override how they execute queries the way 
 _execute_and_instances() does, so you might be better off implementing your 
 own update()/delete() that's similar to how 
 ShardedQuery._execute_and_instances works, but uses core table.update() and 
 table.delete() constructs.



-- 
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.