Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
Thanks for the detailed explanation! On 28 February 2017 at 16:22, mike bayerwrote: > sorry, lost track on this one. > > On 02/15/2017 10:07 PM, Zsolt Ero wrote: >> >> I'm starting to understand it! Just a few very quick questions: >> >> 1. Is it a good strategy to always use onupdate='CASCADE', >> ondelete='CASCADE' for all foreign keys if the db supports it (in my >> case Postgres 9.5 + psycopg2)? > > > if you want those child objects to actually delete or modify on PK like > that, as opposed to prevent it from happening, then yes. > > >> >> 2. If I'd like to use it on an already populated db, can I "alter >> table" to use these CASCADE options (for example in an alembic >> migration)? > > > yes you need to DROP the foreign key constraints and re-create them. alembic > autogenerate *should* be able to generate this script for you these days > without screwing up. > > >> >> 3. For delete, is this simply an optimisation step and the default >> behaviour is also perfectly fine? > > > it depends. For one level of "parent -> child", the cascade is just an > optimization. However, if you really have "parent -> child -> grandchild -> > great grandchild", and you need operations to cascade all the way through, > the ORM may not be able to support a long chain of "onupdate" if they all > share the same "foreign key" value that's changing, and as far as "ondelete" > I actually don't remember if it loads everything in :) I'd have to try it - > but it would be inefficient in any case.I'd be using CASCADE for a long > chain like that regardless. > > >> >> 4. Simply adding the above parameters is not automatically changing >> SQLAlchemy's delete strategy, I also need to tell it to use passive >> deletes. Do I also need the cascade="all, delete-orphan" option like >> in the docs? > > > To have the ORM fully handle cascading delete, you need to have the "all, > delete-orphan" so the ORM knows what's going on, then the FKs have > ondelete="CASCADE", and then the passive_deletes=True so that the ORM > doesn't needlessly load in collections to delete them. > > > > >> >> Zsolt >> >> >> On 15 February 2017 at 17:25, mike bayer wrote: >>> >>> >>> >>> On 02/15/2017 09:45 AM, Zsolt Ero wrote: 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. >>> >>> >>> >>> the ORM only deletes rows one at a time based on primary key match. So >>> if >>> you have a relationship() that is configured to cascade deletes, and you >>> have not instructed the system that "ON DELETE CASCADE" will take care of >>> those collections, it will need to ensure these collections are present >>> in >>> memory (e.g. the SELECT) and target each row for deletion individually. >>> You >>> see only one DELETE statement but you'll note it has multiple parameter >>> sets >>> to indicate every row being deleted. Background on how to optimize this >>> is >>> at >>> >>> http://docs.sqlalchemy.org/en/latest/orm/collections.html#using-passive-deletes >>> . >>> >>> In this specific case there seem to be two SELECT statements but that >>> appears to be because of the awkward production of a new object that has >>> the >>> same primary key as another object. In the logs you'll see an UPDATE >>> but >>> this is actually a special case "hack"; normally, we'd see a DELETE of >>> the >>> old row and an INSERT of the new one, however the unit of work does not >>> support this process. There is an option to allow it to work this way >>> in >>> specific cases, although this feature is not present in SQLAlchemy at >>> this >>> time. In the absence of that feature, the behavior is that if the same >>> primary key is present on one object being deleted and another one being >>> added in the same flush, they are rolled into an UPDATE. Then the >>> collection is being deleted and re-added again too, so this is a bit of a >>> crazy example; using a straight UPDATE with correct cascade rules is >>> obviously much more efficient. >>> >>> >>> Zsolt On 15 February 2017 at 04:17, mike bayer wrote: > > > > > On 02/14/2017 08:15 PM, Zsolt Ero wrote: >> >> >> >> I would like to change a primary key's value, to be deterministic, >> based >> on a multi-to-multi relation. Thus I'm populating the tables with a >> temporary ids (just random strings), then calculating the right, >> unique >> id, and changing it afterwards. > > > > > the examples seem to move "transaction.manager" around, which we assume > is > the Zope transaction manager and that by using the context manager the > Session.commit() method is ultimately called, which raises this error. > One > guess is that in the second two examples, the Session is not actually > getting committed,
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
sorry, lost track on this one. On 02/15/2017 10:07 PM, Zsolt Ero wrote: I'm starting to understand it! Just a few very quick questions: 1. Is it a good strategy to always use onupdate='CASCADE', ondelete='CASCADE' for all foreign keys if the db supports it (in my case Postgres 9.5 + psycopg2)? if you want those child objects to actually delete or modify on PK like that, as opposed to prevent it from happening, then yes. 2. If I'd like to use it on an already populated db, can I "alter table" to use these CASCADE options (for example in an alembic migration)? yes you need to DROP the foreign key constraints and re-create them. alembic autogenerate *should* be able to generate this script for you these days without screwing up. 3. For delete, is this simply an optimisation step and the default behaviour is also perfectly fine? it depends. For one level of "parent -> child", the cascade is just an optimization. However, if you really have "parent -> child -> grandchild -> great grandchild", and you need operations to cascade all the way through, the ORM may not be able to support a long chain of "onupdate" if they all share the same "foreign key" value that's changing, and as far as "ondelete" I actually don't remember if it loads everything in :) I'd have to try it - but it would be inefficient in any case.I'd be using CASCADE for a long chain like that regardless. 4. Simply adding the above parameters is not automatically changing SQLAlchemy's delete strategy, I also need to tell it to use passive deletes. Do I also need the cascade="all, delete-orphan" option like in the docs? To have the ORM fully handle cascading delete, you need to have the "all, delete-orphan" so the ORM knows what's going on, then the FKs have ondelete="CASCADE", and then the passive_deletes=True so that the ORM doesn't needlessly load in collections to delete them. Zsolt On 15 February 2017 at 17:25, mike bayerwrote: On 02/15/2017 09:45 AM, Zsolt Ero wrote: 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. the ORM only deletes rows one at a time based on primary key match. So if you have a relationship() that is configured to cascade deletes, and you have not instructed the system that "ON DELETE CASCADE" will take care of those collections, it will need to ensure these collections are present in memory (e.g. the SELECT) and target each row for deletion individually. You see only one DELETE statement but you'll note it has multiple parameter sets to indicate every row being deleted. Background on how to optimize this is at http://docs.sqlalchemy.org/en/latest/orm/collections.html#using-passive-deletes . In this specific case there seem to be two SELECT statements but that appears to be because of the awkward production of a new object that has the same primary key as another object. In the logs you'll see an UPDATE but this is actually a special case "hack"; normally, we'd see a DELETE of the old row and an INSERT of the new one, however the unit of work does not support this process. There is an option to allow it to work this way in specific cases, although this feature is not present in SQLAlchemy at this time. In the absence of that feature, the behavior is that if the same primary key is present on one object being deleted and another one being added in the same flush, they are rolled into an UPDATE. Then the collection is being deleted and re-added again too, so this is a bit of a crazy example; using a straight UPDATE with correct cascade rules is obviously much more efficient. Zsolt On 15 February 2017 at 04:17, mike bayer wrote: On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
Can anyone help with these questions? Nothing urgent, I'm just interested in understanding more about SQLAlchemy. On 16 February 2017 at 04:07, Zsolt Erowrote: > I'm starting to understand it! Just a few very quick questions: > > 1. Is it a good strategy to always use onupdate='CASCADE', > ondelete='CASCADE' for all foreign keys if the db supports it (in my > case Postgres 9.5 + psycopg2)? > > 2. If I'd like to use it on an already populated db, can I "alter > table" to use these CASCADE options (for example in an alembic > migration)? > > 3. For delete, is this simply an optimisation step and the default > behaviour is also perfectly fine? > > 4. Simply adding the above parameters is not automatically changing > SQLAlchemy's delete strategy, I also need to tell it to use passive > deletes. Do I also need the cascade="all, delete-orphan" option like > in the docs? > > Zsolt > > > On 15 February 2017 at 17:25, mike bayer wrote: >> >> >> On 02/15/2017 09:45 AM, Zsolt Ero wrote: >>> >>> 4. An interesting thing is that SQLAlchemy does 3 select calls in the >>> delete case, even if 1 would be enough. Can be seen in the logs. >> >> >> the ORM only deletes rows one at a time based on primary key match. So if >> you have a relationship() that is configured to cascade deletes, and you >> have not instructed the system that "ON DELETE CASCADE" will take care of >> those collections, it will need to ensure these collections are present in >> memory (e.g. the SELECT) and target each row for deletion individually. You >> see only one DELETE statement but you'll note it has multiple parameter sets >> to indicate every row being deleted. Background on how to optimize this is >> at >> http://docs.sqlalchemy.org/en/latest/orm/collections.html#using-passive-deletes >> . >> >> In this specific case there seem to be two SELECT statements but that >> appears to be because of the awkward production of a new object that has the >> same primary key as another object. In the logs you'll see an UPDATE but >> this is actually a special case "hack"; normally, we'd see a DELETE of the >> old row and an INSERT of the new one, however the unit of work does not >> support this process. There is an option to allow it to work this way in >> specific cases, although this feature is not present in SQLAlchemy at this >> time. In the absence of that feature, the behavior is that if the same >> primary key is present on one object being deleted and another one being >> added in the same flush, they are rolled into an UPDATE. Then the >> collection is being deleted and re-added again too, so this is a bit of a >> crazy example; using a straight UPDATE with correct cascade rules is >> obviously much more efficient. >> >> >> >>> >>> Zsolt >>> >>> >>> >>> >>> >>> On 15 February 2017 at 04:17, mike bayer wrote: On 02/14/2017 08:15 PM, Zsolt Ero wrote: > > > I would like to change a primary key's value, to be deterministic, based > on a multi-to-multi relation. Thus I'm populating the tables with a > temporary ids (just random strings), then calculating the right, unique > id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed it in your model and didn't recreate the tables, or at least recreate the foreign key constraints using ALTER to drop and create them again with the CASCADE rule set up; this is a server side rule. Here's the MCVE to demonstrate: from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base import md5 as _md5 import random import string def md5(text): return str(_md5.md5(text)) def random_string(num): return ''.join(random.choice( string.ascii_uppercase + string.digits) for _ in range(num)) Base = declarative_base() class
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
I'm starting to understand it! Just a few very quick questions: 1. Is it a good strategy to always use onupdate='CASCADE', ondelete='CASCADE' for all foreign keys if the db supports it (in my case Postgres 9.5 + psycopg2)? 2. If I'd like to use it on an already populated db, can I "alter table" to use these CASCADE options (for example in an alembic migration)? 3. For delete, is this simply an optimisation step and the default behaviour is also perfectly fine? 4. Simply adding the above parameters is not automatically changing SQLAlchemy's delete strategy, I also need to tell it to use passive deletes. Do I also need the cascade="all, delete-orphan" option like in the docs? Zsolt On 15 February 2017 at 17:25, mike bayerwrote: > > > On 02/15/2017 09:45 AM, Zsolt Ero wrote: >> >> 4. An interesting thing is that SQLAlchemy does 3 select calls in the >> delete case, even if 1 would be enough. Can be seen in the logs. > > > the ORM only deletes rows one at a time based on primary key match. So if > you have a relationship() that is configured to cascade deletes, and you > have not instructed the system that "ON DELETE CASCADE" will take care of > those collections, it will need to ensure these collections are present in > memory (e.g. the SELECT) and target each row for deletion individually. You > see only one DELETE statement but you'll note it has multiple parameter sets > to indicate every row being deleted. Background on how to optimize this is > at > http://docs.sqlalchemy.org/en/latest/orm/collections.html#using-passive-deletes > . > > In this specific case there seem to be two SELECT statements but that > appears to be because of the awkward production of a new object that has the > same primary key as another object. In the logs you'll see an UPDATE but > this is actually a special case "hack"; normally, we'd see a DELETE of the > old row and an INSERT of the new one, however the unit of work does not > support this process. There is an option to allow it to work this way in > specific cases, although this feature is not present in SQLAlchemy at this > time. In the absence of that feature, the behavior is that if the same > primary key is present on one object being deleted and another one being > added in the same flush, they are rolled into an UPDATE. Then the > collection is being deleted and re-added again too, so this is a bit of a > crazy example; using a straight UPDATE with correct cascade rules is > obviously much more efficient. > > > >> >> Zsolt >> >> >> >> >> >> On 15 February 2017 at 04:17, mike bayer wrote: >>> >>> >>> >>> On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. >>> >>> >>> >>> the examples seem to move "transaction.manager" around, which we assume >>> is >>> the Zope transaction manager and that by using the context manager the >>> Session.commit() method is ultimately called, which raises this error. >>> One >>> guess is that in the second two examples, the Session is not actually >>> getting committed, because no invocation of "dbsession" is present within >>> the "with transaction.manager" block and I have a vague recollection that >>> zope.transaction might work this way. Another guess is that in the >>> second >>> two examples, maybe you already changed the data in the DB and the >>> operation >>> you're doing has no net change to the rows. >>> >>> In any case, all three examples you should echo the SQL emitted so you >>> can >>> see what it's doing. Setting up the onupdate="CASCADE" should fix this >>> problem. As to why that didn't work from you, keep in mind that is a >>> CREATE >>> TABLE directive so if you just changed it in your model and didn't >>> recreate >>> the tables, or at least recreate the foreign key constraints using ALTER >>> to >>> drop and create them again with the CASCADE rule set up; this is a server >>> side rule. >>> >>> Here's the MCVE to demonstrate: >>> >>> from sqlalchemy import * >>> from sqlalchemy.orm import * >>> from sqlalchemy.ext.declarative import declarative_base >>> import md5 as _md5 >>> import random >>> import string >>> >>> >>> def md5(text): >>> return str(_md5.md5(text)) >>> >>> >>> def random_string(num): >>> return ''.join(random.choice( >>> string.ascii_uppercase + string.digits) for _ in range(num)) >>> >>> Base = declarative_base() >>> >>> >>> class Image(Base): >>> __tablename__ = 'images' >>> id = Column(String, primary_key=True, default=lambda: >>> random_string(16)) >>> collections = relationship( >>> 'Collection', secondary='collections_images', >>> back_populates='images') >>> date_created = Column(DateTime, default=func.now()) >>> >>> >>>
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
On 02/15/2017 09:45 AM, Zsolt Ero wrote: 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. the ORM only deletes rows one at a time based on primary key match. So if you have a relationship() that is configured to cascade deletes, and you have not instructed the system that "ON DELETE CASCADE" will take care of those collections, it will need to ensure these collections are present in memory (e.g. the SELECT) and target each row for deletion individually. You see only one DELETE statement but you'll note it has multiple parameter sets to indicate every row being deleted. Background on how to optimize this is at http://docs.sqlalchemy.org/en/latest/orm/collections.html#using-passive-deletes . In this specific case there seem to be two SELECT statements but that appears to be because of the awkward production of a new object that has the same primary key as another object. In the logs you'll see an UPDATE but this is actually a special case "hack"; normally, we'd see a DELETE of the old row and an INSERT of the new one, however the unit of work does not support this process. There is an option to allow it to work this way in specific cases, although this feature is not present in SQLAlchemy at this time. In the absence of that feature, the behavior is that if the same primary key is present on one object being deleted and another one being added in the same flush, they are rolled into an UPDATE. Then the collection is being deleted and re-added again too, so this is a bit of a crazy example; using a straight UPDATE with correct cascade rules is obviously much more efficient. Zsolt On 15 February 2017 at 04:17, mike bayerwrote: On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed it in your model and didn't recreate the tables, or at least recreate the foreign key constraints using ALTER to drop and create them again with the CASCADE rule set up; this is a server side rule. Here's the MCVE to demonstrate: from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base import md5 as _md5 import random import string def md5(text): return str(_md5.md5(text)) def random_string(num): return ''.join(random.choice( string.ascii_uppercase + string.digits) for _ in range(num)) Base = declarative_base() class Image(Base): __tablename__ = 'images' id = Column(String, primary_key=True, default=lambda: random_string(16)) collections = relationship( 'Collection', secondary='collections_images', back_populates='images') date_created = Column(DateTime, default=func.now()) class Collection(Base): __tablename__ = 'collections' id = Column(String, primary_key=True, default=lambda: random_string(16)) name = Column(String) images = relationship( 'Image', secondary='collections_images', back_populates='collections', order_by='desc(Image.date_created)', lazy='dynamic') collections_images = Table( 'collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id', onupdate="CASCADE"), primary_key=True), Column('image_id', ForeignKey('images.id'), primary_key=True) ) e = create_engine("postgresql://scott:tiger@localhost/test", echo=True) Base.metadata.drop_all(e) Base.metadata.create_all(e) s = Session(e) with s.transaction: s.add(Collection(name='c1', images=[Image(), Image(), Image()])) with s.transaction: collections = s.query(Collection).all() for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name collection.id =
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
Thank you for the explanation! It is indeed deeper than I first thought, but I understand it now. Finally, consider question 4. (multiple select on delete) to be a bug report, even if harmless. On 15 February 2017 at 17:07, mike bayerwrote: > a "delete" cascade can be done on the client side because it involves a > simple ordering of steps: > > 1. DELETE the rows that depend on the target row > > 2. DELETE the target row. > > > an "update" cascade, OTOH, can't work this way. Because the dependent row > remains existing and needs to have a primary key value at all times, and > that primary key needs to correspond to a row in the target table. The only > way to do this client side (without disabling or dropping the constraints > themselves) would be: > > 1. INSERT a new row into the target table with the new primary key value > > 2. UPDATE the rows that depend on the target row > > 3. DELETE the old row from the target table > > For a simple UPDATE of the target table, this is not feasible; INSERT/DELETE > is a very different operation than an UPDATE at many levels; data wise, > isolation/concurrency-wise, etc. Only "ON UPDATE CASCADE" configured on the > server level can accommodate the flow as an UPDATE on the target table, > where Postgresql internally handles the cascading of the primary key change > to all dependent tables without violating referential integrity. > > > > > On 02/15/2017 11:02 AM, Zsolt Ero wrote: >> >> Thanks a lot! I would be still interested in your answer for 3. and 4. >> >> Especially, what is the difference between update and delete from's >> behaviour here? Why >> does SQLAlchemy know how to "cascade" a delete just on the client >> side, while for update it needs server side CASCADE support? >> >> >> >> >> On 15 February 2017 at 16:51, mike bayer wrote: >>> >>> onupdate=CASCADE is an option of ForeignKey, not Column: >>> Table( >>> 'collections_images', Base.metadata, >>> Column('collection_id', >>>ForeignKey('collections.id', onupdate='CASCADE'), >>> primary_key=True, ), >>> Column('image_id', ForeignKey('images.id'), primary_key=True)) >>> >>> >>> >>> >>> On 02/15/2017 09:45 AM, Zsolt Ero wrote: Thanks Mike for looking into this. I've created a minimal program which reproduces my error. As a context, this is a init script for a Pyramid app, but is run from command line, not in a request loop. The tables are dropped and recreated at start. Inklesspen helped me figure out the transaction manager over IRC and I've simplified it into one single block which is both simpler and more reliable. So about SQLAlchemy's behaviour: 1. I do not see anything related to CASCADE in echo when I use onupdate='CASCADE'. 2. Update does not work in my case, logs attached. 3. A manual hack of creating a new collection and deleting the old one does work. It means that delete does not need CASCADE, but SQLAlchemy can calculate the order of calls, if I understand right? 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. Zsolt On 15 February 2017 at 04:17, mike bayer wrote: > > > > > On 02/14/2017 08:15 PM, Zsolt Ero wrote: >> >> >> >> I would like to change a primary key's value, to be deterministic, >> based >> on a multi-to-multi relation. Thus I'm populating the tables with a >> temporary ids (just random strings), then calculating the right, >> unique >> id, and changing it afterwards. > > > > > the examples seem to move "transaction.manager" around, which we assume > is > the Zope transaction manager and that by using the context manager the > Session.commit() method is ultimately called, which raises this error. > One > guess is that in the second two examples, the Session is not actually > getting committed, because no invocation of "dbsession" is present > within > the "with transaction.manager" block and I have a vague recollection > that > zope.transaction might work this way. Another guess is that in the > second > two examples, maybe you already changed the data in the DB and the > operation > you're doing has no net change to the rows. > > In any case, all three examples you should echo the SQL emitted so you > can > see what it's doing. Setting up the onupdate="CASCADE" should fix > this > problem. As to why that didn't work from you, keep in mind that is a > CREATE > TABLE directive so if you just changed it in your model and didn't > recreate > the tables, or at least recreate the foreign key constraints using > ALTER > to > drop and create them again with the
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
a "delete" cascade can be done on the client side because it involves a simple ordering of steps: 1. DELETE the rows that depend on the target row 2. DELETE the target row. an "update" cascade, OTOH, can't work this way. Because the dependent row remains existing and needs to have a primary key value at all times, and that primary key needs to correspond to a row in the target table. The only way to do this client side (without disabling or dropping the constraints themselves) would be: 1. INSERT a new row into the target table with the new primary key value 2. UPDATE the rows that depend on the target row 3. DELETE the old row from the target table For a simple UPDATE of the target table, this is not feasible; INSERT/DELETE is a very different operation than an UPDATE at many levels; data wise, isolation/concurrency-wise, etc. Only "ON UPDATE CASCADE" configured on the server level can accommodate the flow as an UPDATE on the target table, where Postgresql internally handles the cascading of the primary key change to all dependent tables without violating referential integrity. On 02/15/2017 11:02 AM, Zsolt Ero wrote: Thanks a lot! I would be still interested in your answer for 3. and 4. Especially, what is the difference between update and delete from's behaviour here? Why does SQLAlchemy know how to "cascade" a delete just on the client side, while for update it needs server side CASCADE support? On 15 February 2017 at 16:51, mike bayerwrote: onupdate=CASCADE is an option of ForeignKey, not Column: Table( 'collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id', onupdate='CASCADE'), primary_key=True, ), Column('image_id', ForeignKey('images.id'), primary_key=True)) On 02/15/2017 09:45 AM, Zsolt Ero wrote: Thanks Mike for looking into this. I've created a minimal program which reproduces my error. As a context, this is a init script for a Pyramid app, but is run from command line, not in a request loop. The tables are dropped and recreated at start. Inklesspen helped me figure out the transaction manager over IRC and I've simplified it into one single block which is both simpler and more reliable. So about SQLAlchemy's behaviour: 1. I do not see anything related to CASCADE in echo when I use onupdate='CASCADE'. 2. Update does not work in my case, logs attached. 3. A manual hack of creating a new collection and deleting the old one does work. It means that delete does not need CASCADE, but SQLAlchemy can calculate the order of calls, if I understand right? 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. Zsolt On 15 February 2017 at 04:17, mike bayer wrote: On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed it in your model and didn't recreate the tables, or at least recreate the foreign key constraints using ALTER to drop and create them again with the CASCADE rule set up; this is a server side rule. Here's the MCVE to demonstrate: from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base import md5 as _md5 import random import string def md5(text): return str(_md5.md5(text)) def random_string(num): return ''.join(random.choice( string.ascii_uppercase + string.digits) for _ in range(num)) Base = declarative_base() class Image(Base): __tablename__ = 'images' id = Column(String, primary_key=True, default=lambda: random_string(16)) collections = relationship( 'Collection', secondary='collections_images', back_populates='images') date_created = Column(DateTime, default=func.now()) class Collection(Base): __tablename__ = 'collections'
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
Thanks a lot! I would be still interested in your answer for 3. and 4. Especially, what is the difference between update and delete from's behaviour here? Why does SQLAlchemy know how to "cascade" a delete just on the client side, while for update it needs server side CASCADE support? On 15 February 2017 at 16:51, mike bayerwrote: > onupdate=CASCADE is an option of ForeignKey, not Column: > Table( > 'collections_images', Base.metadata, > Column('collection_id', >ForeignKey('collections.id', onupdate='CASCADE'), > primary_key=True, ), > Column('image_id', ForeignKey('images.id'), primary_key=True)) > > > > > On 02/15/2017 09:45 AM, Zsolt Ero wrote: >> >> Thanks Mike for looking into this. >> >> I've created a minimal program which reproduces my error. As a >> context, this is a init script for a Pyramid app, but is run from >> command line, not in a request loop. The tables are dropped and >> recreated at start. Inklesspen helped me figure out the transaction >> manager over IRC and I've simplified it into one single block which is >> both simpler and more reliable. >> >> So about SQLAlchemy's behaviour: >> >> 1. I do not see anything related to CASCADE in echo when I use >> onupdate='CASCADE'. >> 2. Update does not work in my case, logs attached. >> 3. A manual hack of creating a new collection and deleting the old one >> does work. It means that delete does not need CASCADE, but SQLAlchemy >> can calculate the order of calls, if I understand right? >> 4. An interesting thing is that SQLAlchemy does 3 select calls in the >> delete case, even if 1 would be enough. Can be seen in the logs. >> >> Zsolt >> >> >> >> >> >> On 15 February 2017 at 04:17, mike bayer wrote: >>> >>> >>> >>> On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. >>> >>> >>> >>> the examples seem to move "transaction.manager" around, which we assume >>> is >>> the Zope transaction manager and that by using the context manager the >>> Session.commit() method is ultimately called, which raises this error. >>> One >>> guess is that in the second two examples, the Session is not actually >>> getting committed, because no invocation of "dbsession" is present within >>> the "with transaction.manager" block and I have a vague recollection that >>> zope.transaction might work this way. Another guess is that in the >>> second >>> two examples, maybe you already changed the data in the DB and the >>> operation >>> you're doing has no net change to the rows. >>> >>> In any case, all three examples you should echo the SQL emitted so you >>> can >>> see what it's doing. Setting up the onupdate="CASCADE" should fix this >>> problem. As to why that didn't work from you, keep in mind that is a >>> CREATE >>> TABLE directive so if you just changed it in your model and didn't >>> recreate >>> the tables, or at least recreate the foreign key constraints using ALTER >>> to >>> drop and create them again with the CASCADE rule set up; this is a server >>> side rule. >>> >>> Here's the MCVE to demonstrate: >>> >>> from sqlalchemy import * >>> from sqlalchemy.orm import * >>> from sqlalchemy.ext.declarative import declarative_base >>> import md5 as _md5 >>> import random >>> import string >>> >>> >>> def md5(text): >>> return str(_md5.md5(text)) >>> >>> >>> def random_string(num): >>> return ''.join(random.choice( >>> string.ascii_uppercase + string.digits) for _ in range(num)) >>> >>> Base = declarative_base() >>> >>> >>> class Image(Base): >>> __tablename__ = 'images' >>> id = Column(String, primary_key=True, default=lambda: >>> random_string(16)) >>> collections = relationship( >>> 'Collection', secondary='collections_images', >>> back_populates='images') >>> date_created = Column(DateTime, default=func.now()) >>> >>> >>> class Collection(Base): >>> __tablename__ = 'collections' >>> id = Column(String, primary_key=True, default=lambda: >>> random_string(16)) >>> name = Column(String) >>> images = relationship( >>> 'Image', secondary='collections_images', >>> back_populates='collections', >>> order_by='desc(Image.date_created)', >>> lazy='dynamic') >>> >>> >>> collections_images = Table( >>> 'collections_images', Base.metadata, >>> Column('collection_id', >>>ForeignKey('collections.id', onupdate="CASCADE"), >>>primary_key=True), >>> Column('image_id', ForeignKey('images.id'), primary_key=True) >>> ) >>> >>> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True) >>> >>> Base.metadata.drop_all(e) >>> Base.metadata.create_all(e) >>> >>> s = Session(e) >>> >>> with s.transaction: >>>
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
onupdate=CASCADE is an option of ForeignKey, not Column: Table( 'collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id', onupdate='CASCADE'), primary_key=True, ), Column('image_id', ForeignKey('images.id'), primary_key=True)) On 02/15/2017 09:45 AM, Zsolt Ero wrote: Thanks Mike for looking into this. I've created a minimal program which reproduces my error. As a context, this is a init script for a Pyramid app, but is run from command line, not in a request loop. The tables are dropped and recreated at start. Inklesspen helped me figure out the transaction manager over IRC and I've simplified it into one single block which is both simpler and more reliable. So about SQLAlchemy's behaviour: 1. I do not see anything related to CASCADE in echo when I use onupdate='CASCADE'. 2. Update does not work in my case, logs attached. 3. A manual hack of creating a new collection and deleting the old one does work. It means that delete does not need CASCADE, but SQLAlchemy can calculate the order of calls, if I understand right? 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. Zsolt On 15 February 2017 at 04:17, mike bayerwrote: On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed it in your model and didn't recreate the tables, or at least recreate the foreign key constraints using ALTER to drop and create them again with the CASCADE rule set up; this is a server side rule. Here's the MCVE to demonstrate: from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base import md5 as _md5 import random import string def md5(text): return str(_md5.md5(text)) def random_string(num): return ''.join(random.choice( string.ascii_uppercase + string.digits) for _ in range(num)) Base = declarative_base() class Image(Base): __tablename__ = 'images' id = Column(String, primary_key=True, default=lambda: random_string(16)) collections = relationship( 'Collection', secondary='collections_images', back_populates='images') date_created = Column(DateTime, default=func.now()) class Collection(Base): __tablename__ = 'collections' id = Column(String, primary_key=True, default=lambda: random_string(16)) name = Column(String) images = relationship( 'Image', secondary='collections_images', back_populates='collections', order_by='desc(Image.date_created)', lazy='dynamic') collections_images = Table( 'collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id', onupdate="CASCADE"), primary_key=True), Column('image_id', ForeignKey('images.id'), primary_key=True) ) e = create_engine("postgresql://scott:tiger@localhost/test", echo=True) Base.metadata.drop_all(e) Base.metadata.create_all(e) s = Session(e) with s.transaction: s.add(Collection(name='c1', images=[Image(), Image(), Image()])) with s.transaction: collections = s.query(Collection).all() for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name collection.id = md5(image_ids_string)[:16] I have the following models: |classImage(Base):id =Column(String,primary_key=True,default=lambda:random_string(16))collections =relationship('Collection',secondary='collections_images',back_populates='images')classCollection(Base):id =Column(String,primary_key=True,default=lambda:random_string(16))images
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
Thanks Mike for looking into this. I've created a minimal program which reproduces my error. As a context, this is a init script for a Pyramid app, but is run from command line, not in a request loop. The tables are dropped and recreated at start. Inklesspen helped me figure out the transaction manager over IRC and I've simplified it into one single block which is both simpler and more reliable. So about SQLAlchemy's behaviour: 1. I do not see anything related to CASCADE in echo when I use onupdate='CASCADE'. 2. Update does not work in my case, logs attached. 3. A manual hack of creating a new collection and deleting the old one does work. It means that delete does not need CASCADE, but SQLAlchemy can calculate the order of calls, if I understand right? 4. An interesting thing is that SQLAlchemy does 3 select calls in the delete case, even if 1 would be enough. Can be seen in the logs. Zsolt On 15 February 2017 at 04:17, mike bayerwrote: > > > On 02/14/2017 08:15 PM, Zsolt Ero wrote: >> >> I would like to change a primary key's value, to be deterministic, based >> on a multi-to-multi relation. Thus I'm populating the tables with a >> temporary ids (just random strings), then calculating the right, unique >> id, and changing it afterwards. > > > the examples seem to move "transaction.manager" around, which we assume is > the Zope transaction manager and that by using the context manager the > Session.commit() method is ultimately called, which raises this error. One > guess is that in the second two examples, the Session is not actually > getting committed, because no invocation of "dbsession" is present within > the "with transaction.manager" block and I have a vague recollection that > zope.transaction might work this way. Another guess is that in the second > two examples, maybe you already changed the data in the DB and the operation > you're doing has no net change to the rows. > > In any case, all three examples you should echo the SQL emitted so you can > see what it's doing. Setting up the onupdate="CASCADE" should fix this > problem. As to why that didn't work from you, keep in mind that is a CREATE > TABLE directive so if you just changed it in your model and didn't recreate > the tables, or at least recreate the foreign key constraints using ALTER to > drop and create them again with the CASCADE rule set up; this is a server > side rule. > > Here's the MCVE to demonstrate: > > from sqlalchemy import * > from sqlalchemy.orm import * > from sqlalchemy.ext.declarative import declarative_base > import md5 as _md5 > import random > import string > > > def md5(text): > return str(_md5.md5(text)) > > > def random_string(num): > return ''.join(random.choice( > string.ascii_uppercase + string.digits) for _ in range(num)) > > Base = declarative_base() > > > class Image(Base): > __tablename__ = 'images' > id = Column(String, primary_key=True, default=lambda: random_string(16)) > collections = relationship( > 'Collection', secondary='collections_images', > back_populates='images') > date_created = Column(DateTime, default=func.now()) > > > class Collection(Base): > __tablename__ = 'collections' > id = Column(String, primary_key=True, default=lambda: random_string(16)) > name = Column(String) > images = relationship( > 'Image', secondary='collections_images', > back_populates='collections', order_by='desc(Image.date_created)', > lazy='dynamic') > > > collections_images = Table( > 'collections_images', Base.metadata, > Column('collection_id', >ForeignKey('collections.id', onupdate="CASCADE"), >primary_key=True), > Column('image_id', ForeignKey('images.id'), primary_key=True) > ) > > e = create_engine("postgresql://scott:tiger@localhost/test", echo=True) > > Base.metadata.drop_all(e) > Base.metadata.create_all(e) > > s = Session(e) > > with s.transaction: > s.add(Collection(name='c1', images=[Image(), Image(), Image()])) > > with s.transaction: > collections = s.query(Collection).all() > > for collection in collections: > image_ids = [i.id for i in collection.images.all()] > image_ids_string = ','.join(sorted(image_ids)) + collection.name > collection.id = md5(image_ids_string)[:16] > > > > >> >> I have the following models: >> >> |classImage(Base):id >> >> =Column(String,primary_key=True,default=lambda:random_string(16))collections >> >> =relationship('Collection',secondary='collections_images',back_populates='images')classCollection(Base):id >> =Column(String,primary_key=True,default=lambda:random_string(16))images >> >>
Re: [sqlalchemy] Changing primary key's value in a multi-to-multi relation
On 02/14/2017 08:15 PM, Zsolt Ero wrote: I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. the examples seem to move "transaction.manager" around, which we assume is the Zope transaction manager and that by using the context manager the Session.commit() method is ultimately called, which raises this error. One guess is that in the second two examples, the Session is not actually getting committed, because no invocation of "dbsession" is present within the "with transaction.manager" block and I have a vague recollection that zope.transaction might work this way. Another guess is that in the second two examples, maybe you already changed the data in the DB and the operation you're doing has no net change to the rows. In any case, all three examples you should echo the SQL emitted so you can see what it's doing. Setting up the onupdate="CASCADE" should fix this problem. As to why that didn't work from you, keep in mind that is a CREATE TABLE directive so if you just changed it in your model and didn't recreate the tables, or at least recreate the foreign key constraints using ALTER to drop and create them again with the CASCADE rule set up; this is a server side rule. Here's the MCVE to demonstrate: from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base import md5 as _md5 import random import string def md5(text): return str(_md5.md5(text)) def random_string(num): return ''.join(random.choice( string.ascii_uppercase + string.digits) for _ in range(num)) Base = declarative_base() class Image(Base): __tablename__ = 'images' id = Column(String, primary_key=True, default=lambda: random_string(16)) collections = relationship( 'Collection', secondary='collections_images', back_populates='images') date_created = Column(DateTime, default=func.now()) class Collection(Base): __tablename__ = 'collections' id = Column(String, primary_key=True, default=lambda: random_string(16)) name = Column(String) images = relationship( 'Image', secondary='collections_images', back_populates='collections', order_by='desc(Image.date_created)', lazy='dynamic') collections_images = Table( 'collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id', onupdate="CASCADE"), primary_key=True), Column('image_id', ForeignKey('images.id'), primary_key=True) ) e = create_engine("postgresql://scott:tiger@localhost/test", echo=True) Base.metadata.drop_all(e) Base.metadata.create_all(e) s = Session(e) with s.transaction: s.add(Collection(name='c1', images=[Image(), Image(), Image()])) with s.transaction: collections = s.query(Collection).all() for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name collection.id = md5(image_ids_string)[:16] I have the following models: |classImage(Base):id =Column(String,primary_key=True,default=lambda:random_string(16))collections =relationship('Collection',secondary='collections_images',back_populates='images')classCollection(Base):id =Column(String,primary_key=True,default=lambda:random_string(16))images =relationship('Image',secondary='collections_images',back_populates='collections',order_by='desc(Image.date_created)',lazy='dynamic')Table('collections_images',Base.metadata,Column('collection_id',ForeignKey('collections.id'),primary_key=True),Column('image_id',ForeignKey('images.id'),primary_key=True))| My problem is the following: 1. Out of the 3 examples below, only one triggers an integrity exception, the other two does not. Why? In all three I'm trying to write to this primary key, which is referenced, thus should produce an exception. Yet, in 2. and 3. it seem nothing is happening when |collection.id =| is set. When I debug via SQL queries it shows absolutely nothing called for the |collection.id =| line. 2. How can I solve this problem? I mean how can I change a primary key's value which is also used in a multi-to-multi relation? The DB is PostgreSQL 9.5 with psycopg2. The examples are: A. triggers exception: |withtransaction.manager:collections =dbsession.query(Collection).all()forcollection incollections:image_ids =[i.id fori incollection.images.all()]image_ids_string =','.join(sorted(image_ids))+collection.name collection.id =md5(image_ids_string)[:16]| B. does not trigger exception |collections =dbsession.query(Collection).all()# ^ and v only these two lines are swapped withtransaction.manager:forcollection incollections:image_ids =[i.id fori
[sqlalchemy] Changing primary key's value in a multi-to-multi relation
I would like to change a primary key's value, to be deterministic, based on a multi-to-multi relation. Thus I'm populating the tables with a temporary ids (just random strings), then calculating the right, unique id, and changing it afterwards. I have the following models: class Image(Base): id = Column(String, primary_key=True, default=lambda: random_string(16)) collections = relationship('Collection', secondary='collections_images', back_populates='images') class Collection(Base): id = Column(String, primary_key=True, default=lambda: random_string(16)) images = relationship('Image', secondary='collections_images', back_populates='collections', order_by='desc(Image.date_created)', lazy='dynamic') Table('collections_images', Base.metadata, Column('collection_id', ForeignKey('collections.id'), primary_key=True), Column('image_id', ForeignKey('images.id'), primary_key=True)) My problem is the following: 1. Out of the 3 examples below, only one triggers an integrity exception, the other two does not. Why? In all three I'm trying to write to this primary key, which is referenced, thus should produce an exception. Yet, in 2. and 3. it seem nothing is happening when collection.id = is set. When I debug via SQL queries it shows absolutely nothing called for the collection.id = line. 2. How can I solve this problem? I mean how can I change a primary key's value which is also used in a multi-to-multi relation? The DB is PostgreSQL 9.5 with psycopg2. The examples are: A. triggers exception: with transaction.manager: collections = dbsession.query(Collection).all() for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name collection.id = md5(image_ids_string)[:16] B. does not trigger exception collections = dbsession.query(Collection).all()# ^ and v only these two lines are swapped with transaction.manager: for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name collection.id = md5(image_ids_string)[:16] C. also does not trigger exception collections = dbsession.query(Collection).all() for collection in collections: image_ids = [i.id for i in collection.images.all()] image_ids_string = ','.join(sorted(image_ids)) + collection.name with transaction.manager: collection.id = md5(image_ids_string)[:16] The exception for the first one is: sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) update or delete on table "collections" violates foreign key constraint "fk_collections_images_collection_id_collections" on table "collections_images" DETAIL: Key (id)=(jC3sN8952urTGrqz) is still referenced from table "collections_images". I've also tried onupdate='CASCADE' for both columns in collections_images but didn't change anything. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.