Hi Mike, Sorry about the indentations.
I'm not sure I understand the changes you made to the script after delete as it removes all test_chains, test_var_regions, and test_const regions that are still referenced by the other test_molecules. The only way I've been able to get the delete to work properly is to manually delete test_var_regions and test_const_regions first and then delete the test_molecules, but the ideal outcome I'm trying to achieve is that when a test_molecule is deleted: 1. that the system checks if the chains connected to it are removed if they are not linked to other test_molecules. 2. if a test_chain is going to be deleted then 1. the test_mol_sequence associated with it is deleted if it is no longer associated with any other test_chains 2. any test_var_regions and test_const_regions are deleted if they are not associated with any other test_chains 3. and finally if a test_mol_sequence is deleted that any test_mol_sequence_features are deleted if they are not associated with any other test_mol_sequences. To make things a little easier to explain, if we just are dealing with molecule1 and molecule4 there are only 3 unique test_chains: heavy_chain_1, light_chain_1, and heavy_chain_2 if I deleted molecule1, I would want to retain heavy_chain_2 and light_chain_1 and delete heavy_chain_1 as it was no longer associated with any test_molecule. Ideally, then I would remove any test_const_regions and test_var_regions that are no longer associated with any test_chains. Because heavy_chain_1 shares the same test_var_region as heavy_chain_2, that test_var_region would remain in the system but the test_const_region unique to heavy_chain_1 would be deleted along with the test_mol_sequence and test_mol_sequence_features associated with it. molecule1.chains.add(heavy_chain_1) molecule1.chains.add(light_chain_1) #molecule2.chains.add(heavy_chain_2) #molecule2.chains.add(light_chain_2) #molecule3.chains.add(heavy_chain_1) #molecule3.chains.add(light_chain_2) molecule4.chains.add(heavy_chain_2) molecule4.chains.add(light_chain_1) light_chain_1_sequence = TestMolSequence(content="taglconst1VAR1") heavy_chain_1_sequence = TestMolSequence(content="tagheavyconstant1VAR2") heavy_chain_2_sequence = TestMolSequence(content="tagheavyconstant2VAR2") light_chain_2_sequence = TestMolSequence(content="taglconst1VAR3") Does that make sense? On Tue, Apr 13, 2021 at 4:50 PM Mike Bayer <mike...@zzzcomputing.com> wrote: > Hi there - > > I would ask that you try to make sure your formatting is maintained when > posting examples especially such long ones as I had to re-indent it in > order to run this. > > The delete at the end is failing because of incomplete cascade rules. > The DELETE against "test_mol_sequence" seeks to CASCADE as configured to > the other three tables,, which then fail because there are non-cascading > FKs in the association tables. We can see this via the message: > > update or delete on table "test_var_region" violates foreign key > constraint "test_chain_var_region_var_region_id_fkey" on table > "test_chain_var_region" > DETAIL: Key (id)=(1) is still referenced from table > "test_chain_var_region". > > [SQL: DELETE FROM test_mol_sequence WHERE test_mol_sequence.id = %(id)s] > > > it's DELETEing from test_mol_sequence but the error is against a totally > different table. That's postgresql's cascade. > > if you want these CASCADEs to remain in place then you'd need to ensure > that those linked rows can be deleted without any rows being present in the > association tables. > > I can modify your test at the end to fully DELETE these rows without > qualifying for those that have empty collections only and the script then > passes, because now it's deleting those rows that would otherwise be > dependent on by "test_var_region" and therefore "test_mol_sequence", so > there's nothing unexpected going on. Easiest solution here would be to > add CASCADE rules to the association tables also. If you want that to be > prevented as you mention, and instead expect the script to explicitly > delete those depending rows, then your script is already achieving that. > the "business logic" so to speak in this case would be as below: > > orphan_chains = ( > session.query(TestChain). > # filter(~TestChain.molecules.any()). > all() > ) > for chain in orphan_chains: > session.delete(chain) > > orphan_vrs = ( > session.query(TestVarRegion) > # .filter(~TestVarRegion.chains.any()) > .all() > ) > for orphan_vr in orphan_vrs: > session.delete(orphan_vr) > orphan_crs = ( > session.query(TestConstRegion) > # .filter(~TestConstRegion.chains.any()) > .all() > ) > for orphan_cr in orphan_crs: > session.delete(orphan_cr) > orphan_sequences = ( > session.query(TestMolSequence) > # .filter(~TestMolSequence.chains.any()) > .all() > ) > > > > > > > > > On Tue, Apr 13, 2021, at 10:03 AM, maqui...@gmail.com wrote: > > I need to delete the association table rows for many to many relationships > when I delete one, but the default behavior (to remove those rows) does not > seem to work in my case. > > I have multiple levels of many to many relationships, as you can see in > the example I'll provide below and when I delete a "parent" afterwards I > try to clean up any children left behind that have no other parents. > However, these children are in many to many relationships with other > children and that's when the ORM fails to attempt to remove those children > from their related association tables (at least, in a way that I expect). > > The issue is error is: > def do_executemany(self, cursor, statement, parameters, context=None): > if self.executemany_mode is EXECUTEMANY_DEFAULT: > > cursor.executemany(statement, parameters) > E sqlalchemy.exc.IntegrityError: > (psycopg2.errors.ForeignKeyViolation) update or delete on table > "test_var_region" violates foreign key constraint > "test_chain_var_region_var_region_id_fkey" on table "test_chain_var_region" > E DETAIL: Key (id)=(1) is still referenced from table > "test_chain_var_region". > E > E [SQL: DELETE FROM test_mol_sequence WHERE test_mol_sequence.id > = %(id)s] > E [parameters: ({'id': 5}, {'id': 6}, {'id': 7}, {'id': 8}, > {'id': 9}, {'id': 10})] > E (Background on this error at: http://sqlalche.me/e/gkpj) > > The desired effect, of course, is that the rows in test_chain_var_region > that reference the deleted chains removed. I've tried several strategies > to do this but with no change in this behavior. > > Cascades could be an issue, and I would rather handle removal of any > "orphan" rows in the model tables via business logic than have the database > cascade deletes and potentially remove rows that are associated with other > objects. > > import pytest > from sqlalchemy import ( > Table, > Column, > Integer, > String, > ForeignKey, > create_engine, > ) > from sqlalchemy.ext.declarative import declarative_base, DeclarativeMeta > from sqlalchemy.orm import relationship, Session > from sqlalchemy.util import OrderedSet > > > Base: DeclarativeMeta = declarative_base() > engine = create_engine( > "postgresql://postgres:postgres@localhost:5432/espresso", echo=True > ) > test_chain_const_region = Table( > "test_chain_const_region", > Base.metadata, > Column("chain_id", Integer, ForeignKey("test_chain.id")), > Column("const_region_id", Integer, ForeignKey("test_const_region.id")), > ) > test_chain_var_region = Table( > "test_chain_var_region", > Base.metadata, > Column("chain_id", Integer, ForeignKey("test_chain.id")), > Column("var_region_id", Integer, ForeignKey("test_var_region.id")), > ) > test_molecule_chain = Table( > "test_molecule_chain", > Base.metadata, > Column("molecule_id", Integer, ForeignKey("test_molecule.id")), > Column("chain_id", Integer, ForeignKey("test_chain.id")), > ) > > test_mol_sequence_feat_mol_sequence = Table( > "test_mol_sequence_feat_mol_sequence", > Base.metadata, > Column("mol_sequence_feat_id", Integer, ForeignKey(" > test_mol_sequence_feat.id")), > Column("mol_sequence_id", Integer, ForeignKey("test_mol_sequence.id")), > ) > > > class TestMolecule(Base): > __tablename__ = "test_molecule" > id = Column(Integer, primary_key=True) > label = Column(String) > chains = relationship( > "TestChain", > secondary=test_molecule_chain, > collection_class=OrderedSet, > back_populates="molecules", > ) > > > class TestMolSequence(Base): > __tablename__ = "test_mol_sequence" > > id = Column(Integer, primary_key=True) > content = Column(String, nullable=False, unique=True) > parent_features = relationship( > "TestMolSequenceFeat", > secondary=test_mol_sequence_feat_mol_sequence, > collection_class=OrderedSet, > back_populates="feature_sequences", > single_parent=True, > ) > chains = relationship( > "TestChain", back_populates="mol_sequence", collection_class=OrderedSet > ) > > > class TestMolSequenceFeat(Base): > __tablename__ = "test_mol_sequence_feat" > > id = Column(Integer, primary_key=True) > molecule_sequence_id = Column( > Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"), > ) > molecule_sequence = relationship("TestMolSequence",) > start = Column(Integer) > stop = Column(Integer) > feature_sequences = relationship( > "TestMolSequence", > secondary=test_mol_sequence_feat_mol_sequence, > collection_class=OrderedSet, > back_populates="parent_features", > # single_parent=True, > ) > > > class TestChain(Base): > __tablename__ = "test_chain" > > id = Column(Integer, primary_key=True) > label = Column(String) > chain_type = Column(String) > mol_sequence_id = Column(Integer, ForeignKey("test_mol_sequence.id")) > mol_sequence = relationship("TestMolSequence", back_populates="chains") > molecules = relationship( > "TestMolecule", > secondary=test_molecule_chain, > collection_class=OrderedSet, > back_populates="chains", > ) > var_regions = relationship( > "TestVarRegion", > secondary=test_chain_var_region, > collection_class=OrderedSet, > back_populates="chains", > ) > const_regions = relationship( > "TestConstRegion", > secondary=test_chain_const_region, > collection_class=OrderedSet, > back_populates="chains", > ) > > > class TestVarRegion(Base): > __tablename__ = "test_var_region" > > id = Column(Integer, primary_key=True) > molecule_sequence_id = Column( > Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"), > ) > description = Column(String) > additional_information = Column(String) > label = Column("label", String, nullable=True, unique=False) > molecule_sequence = relationship("TestMolSequence") > chains = relationship( > "TestChain", > secondary=test_chain_var_region, > collection_class=OrderedSet, > back_populates="var_regions", > passive_deletes=True, > ) > > > class TestConstRegion(Base): > __tablename__ = "test_const_region" > > id = Column(Integer, primary_key=True) > molecule_sequence_id = Column( > Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"), > ) > description = Column(String) > additional_information = Column(String) > label = Column("label", String, nullable=True, unique=False) > molecule_sequence = relationship("TestMolSequence") > chains = relationship( > "TestChain", > secondary=test_chain_const_region, > collection_class=OrderedSet, > back_populates="const_regions", > passive_deletes=True, > ) > > > class TestManyToMany: > @pytest.fixture > def engine(self): > return create_engine( > "postgresql://postgres:postgres@localhost:5432/espresso", echo=True > ) > > @pytest.fixture > def session(self): > > *"""Returns an sqlalchemy session, and after the test tears down > everything properly."""* connection = engine.connect() > # begin the nested transaction > transaction = connection.begin() > # use the connection with the already started transaction > session = Session(bind=connection) > > yield session > > session.close() > # roll back the broader transaction > transaction.commit() > # put back the connection to the connection pool > connection.close() > > @pytest.mark.create_m2m_models > def test_create_m2m_models(self, engine): > > Base.metadata.drop_all(bind=engine) > Base.metadata.create_all(bind=engine) > > @pytest.mark.seed_m2m_data > def test_m2m_seeding_data(self, engine, session): > > molecule1 = TestMolecule(label="molecule1") > molecule2 = TestMolecule(label="molecule2") > molecule3 = TestMolecule(label="molecule3") > molecule4 = TestMolecule(label="molecule4") > light_chain_1_sequence = TestMolSequence(content="taglconst1VAR1") > heavy_chain_1_sequence = TestMolSequence(content="tagheavyconstant1VAR2") > heavy_chain_2_sequence = TestMolSequence(content="tagheavyconstant2VAR2") > light_chain_2_sequence = TestMolSequence(content="taglconst1VAR3") > heavy_chain_1 = TestChain( > chain_type="heavy", mol_sequence=light_chain_1_sequence > ) > light_chain_1 = TestChain( > chain_type="light", mol_sequence=heavy_chain_1_sequence > ) > light_chain_2 = TestChain( > chain_type="light", mol_sequence=light_chain_2_sequence > ) > heavy_chain_2 = TestChain( > chain_type="heavy", mol_sequence=heavy_chain_2_sequence > ) > > molecule1.chains.add(heavy_chain_1) > molecule1.chains.add(light_chain_1) > molecule2.chains.add(heavy_chain_2) > molecule2.chains.add(light_chain_2) > molecule3.chains.add(heavy_chain_1) > molecule3.chains.add(light_chain_2) > molecule4.chains.add(heavy_chain_2) > molecule4.chains.add(light_chain_1) > > tag_sequence = TestMolSequence(content="tag") > light_constant_region_seq = TestMolSequence(content="lconst1") > heavy_constant_region_1_seq = TestMolSequence(content="heavyconstant1") > heavy_constant_region_2_seq = TestMolSequence(content="heavyconstant2") > vr1_seq = TestMolSequence(content="VAR1") > vr2_seq = TestMolSequence(content="VAR2") > vr3_seq = TestMolSequence(content="VAR3") > # lc2_const_region_seq = TestMolSequence(content="lconst") > lc1_tag_feature = TestMolSequenceFeat( > start=0, stop=3, molecule_sequence=light_chain_1_sequence > ) > lc1_const_region_feature = TestMolSequenceFeat( > start=3, stop=10, molecule_sequence=light_chain_1_sequence > ) > lc1_var_region_feature = TestMolSequenceFeat( > start=10, stop=14, molecule_sequence=light_chain_1_sequence > ) > hc1_tag_feature = TestMolSequenceFeat( > start=0, stop=3, molecule_sequence=heavy_chain_1_sequence > ) > hc1_const_region_feature = TestMolSequenceFeat( > start=3, stop=17, molecule_sequence=heavy_chain_1_sequence > ) > hc1_var_region_feature = TestMolSequenceFeat( > start=17, stop=21, molecule_sequence=heavy_chain_1_sequence > ) > > hc2_tag_feature = TestMolSequenceFeat( > start=0, stop=3, molecule_sequence=heavy_chain_2_sequence > ) > hc2_const_region_feature = TestMolSequenceFeat( > start=3, stop=17, molecule_sequence=heavy_chain_2_sequence > ) > hc2_var_region_feature = TestMolSequenceFeat( > start=17, stop=21, molecule_sequence=heavy_chain_2_sequence > ) > > lc2_tag_feature = TestMolSequenceFeat( > start=0, stop=3, molecule_sequence=light_chain_2_sequence > ) > lc2_const_region_feature = TestMolSequenceFeat( > start=3, stop=10, molecule_sequence=light_chain_2_sequence > ) > lc2_var_region_feature = TestMolSequenceFeat( > start=10, stop=14, molecule_sequence=light_chain_2_sequence > ) > var_region1 = TestVarRegion(molecule_sequence=vr1_seq) > var_region2 = TestVarRegion(molecule_sequence=vr2_seq) > var_region3 = TestVarRegion(molecule_sequence=vr3_seq) > const_region1 = > TestConstRegion(molecule_sequence=light_constant_region_seq) > const_region2 = > TestConstRegion(molecule_sequence=heavy_constant_region_1_seq) > const_region3 = > TestConstRegion(molecule_sequence=heavy_constant_region_2_seq) > > light_chain_1.var_regions.add(var_region1) > heavy_chain_1.var_regions.add(var_region2) > heavy_chain_2.var_regions.add(var_region2) > light_chain_2.var_regions.add(var_region3) > > light_chain_1.const_regions.add(const_region1) > light_chain_2.const_regions.add(const_region1) > heavy_chain_1.const_regions.add(const_region2) > heavy_chain_2.const_regions.add(const_region3) > > lc1_tag_feature.feature_sequences.add(tag_sequence) > lc1_var_region_feature.feature_sequences.add(vr1_seq) > lc1_const_region_feature.feature_sequences.add(light_constant_region_seq) > > hc1_tag_feature.feature_sequences.add(tag_sequence) > hc1_var_region_feature.feature_sequences.add(vr2_seq) > hc1_const_region_feature.feature_sequences.add(heavy_constant_region_1_seq) > > lc2_tag_feature.feature_sequences.add(tag_sequence) > lc2_var_region_feature.feature_sequences.add(vr3_seq) > lc2_const_region_feature.feature_sequences.add(light_constant_region_seq) > > hc2_tag_feature.feature_sequences.add(tag_sequence) > hc2_var_region_feature.feature_sequences.add(vr2_seq) > hc2_const_region_feature.feature_sequences.add(heavy_constant_region_2_seq) > > session.add_all( > [ > heavy_chain_1, > light_chain_1, > light_chain_2, > heavy_chain_2, > molecule1, > molecule2, > molecule3, > molecule4, > # tag_sequence, > # lc1_tag_feature, > # lc1_const_region_feature, > # hc1_tag_feature, > # hc1_var_region_feature, > # lc1_tag_feature, > # lc2_const_region_feature, > ] > ) > session.commit() > > @pytest.mark.delete_test_m2m_models > def test_create_m2m(self, session): > molecule = session.query(TestMolecule).filter_by(label="molecule1").one() > session.delete(molecule) > session.query(TestChain).filter(~TestChain.molecules.any()).delete( > synchronize_session="fetch" > ) > session.expire_all() > orphan_chains = ( > session.query(TestChain).filter(~TestChain.molecules.any()).all() > ) > for chain in orphan_chains: > session.delete(chain) > orphan_vrs = ( > session.query(TestVarRegion).filter(~TestVarRegion.chains.any()).all() > ) > for orphan_vr in orphan_vrs: > session.delete(orphan_vr) > orphan_crs = ( > session.query(TestConstRegion) > .filter(~TestConstRegion.chains.any()) > .all() > ) > for orphan_cr in orphan_crs: > session.delete(orphan_cr) > orphan_sequences = ( > session.query(TestMolSequence).filter(~TestMolSequence.chains.any()).all() > ) > orphan_sequence: TestMolSequence > for orphan_sequence in orphan_sequences: > session.delete(orphan_sequence) > session.commit() > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/7963c7d7-9053-436e-ae5b-f92519ddeb0en%40googlegroups.com > <https://groups.google.com/d/msgid/sqlalchemy/7963c7d7-9053-436e-ae5b-f92519ddeb0en%40googlegroups.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/56bfe88d-8b81-4541-9192-86d42ebcc27f%40www.fastmail.com > <https://groups.google.com/d/msgid/sqlalchemy/56bfe88d-8b81-4541-9192-86d42ebcc27f%40www.fastmail.com?utm_medium=email&utm_source=footer> > . > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CAL6EnB4LLyBJiqLFaG9xO2wZAE08yeipo_Ap7M2h-Csk0JTamg%40mail.gmail.com.