Re: [sqlalchemy] Issue with "complex" many 2 many delete, FK violation

Mark Aquino Tue, 13 Apr 2021 14:30:11 -0700

Hi Mike,

Sorry about the indentations.


I'm not sure I understand the changes you made to the script after delete
as it removes all test_chains, test_var_regions, and test_const regions
that are still referenced by the other test_molecules.   The only way I've
been able to get the delete to work properly is to manually delete
test_var_regions and test_const_regions first and then delete the
test_molecules, but the ideal outcome I'm trying to achieve is that when a
test_molecule is deleted:

   1.  that the system checks if the chains connected to it are removed if
   they are not linked to other test_molecules.
   2. if a test_chain is going to be deleted then
      1. the test_mol_sequence associated with it is deleted if it is no
      longer associated with any other test_chains
      2. any test_var_regions and test_const_regions are deleted if they
      are not associated with any other test_chains
   3. and finally if a test_mol_sequence is deleted that any
   test_mol_sequence_features are deleted if they are not associated with any
   other test_mol_sequences.


To make things a little easier to explain, if we just are dealing with
molecule1 and molecule4 there are only 3 unique test_chains: heavy_chain_1,
light_chain_1, and heavy_chain_2
if I deleted molecule1, I would want to retain heavy_chain_2 and
light_chain_1 and delete heavy_chain_1 as it was no longer associated with
any test_molecule. Ideally, then I would remove any test_const_regions and
test_var_regions that are no longer associated with any test_chains.
Because heavy_chain_1 shares the same test_var_region as heavy_chain_2,
that test_var_region would remain in the system but the test_const_region
unique to heavy_chain_1 would be deleted along with the test_mol_sequence
and test_mol_sequence_features associated with it.

    molecule1.chains.add(heavy_chain_1)
    molecule1.chains.add(light_chain_1)
    #molecule2.chains.add(heavy_chain_2)
    #molecule2.chains.add(light_chain_2)
    #molecule3.chains.add(heavy_chain_1)
    #molecule3.chains.add(light_chain_2)
    molecule4.chains.add(heavy_chain_2)
    molecule4.chains.add(light_chain_1)


light_chain_1_sequence = TestMolSequence(content="taglconst1VAR1")
heavy_chain_1_sequence = TestMolSequence(content="tagheavyconstant1VAR2")
heavy_chain_2_sequence = TestMolSequence(content="tagheavyconstant2VAR2")
light_chain_2_sequence = TestMolSequence(content="taglconst1VAR3")


Does that make sense?



On Tue, Apr 13, 2021 at 4:50 PM Mike Bayer <[email protected]> wrote:

> Hi there -
>
> I would ask that you try to make sure your formatting is maintained when
> posting examples especially such long ones as I had to re-indent it in
> order to run this.
>
> The delete at the end is failing because of incomplete cascade rules.
> The DELETE against "test_mol_sequence" seeks to CASCADE as configured to
> the other three tables,, which then fail because there are non-cascading
> FKs in the association tables.   We can see this via the message:
>
> update or delete on table "test_var_region" violates foreign key
> constraint "test_chain_var_region_var_region_id_fkey" on table
> "test_chain_var_region"
> DETAIL:  Key (id)=(1) is still referenced from table
> "test_chain_var_region".
>
> [SQL: DELETE FROM test_mol_sequence WHERE test_mol_sequence.id = %(id)s]
>
>
> it's DELETEing from test_mol_sequence but the error is against a totally
> different table.  That's postgresql's cascade.
>
> if you want these CASCADEs to remain in place then you'd need to ensure
> that those linked rows can be deleted without any rows being present in the
> association tables.
>
> I can modify your test at the end to fully DELETE these rows without
> qualifying for those that have empty collections only and the script then
> passes, because now it's deleting those rows that would otherwise be
> dependent on by "test_var_region" and therefore "test_mol_sequence",  so
> there's nothing unexpected going on.    Easiest solution here would be to
> add CASCADE rules to the association tables also.  If you want that to be
> prevented as you mention, and instead expect the script to explicitly
> delete those depending rows, then your script is already achieving that.
> the "business logic" so to speak in this case would be as below:
>
>         orphan_chains = (
>             session.query(TestChain).
> #            filter(~TestChain.molecules.any()).
>             all()
>         )
>         for chain in orphan_chains:
>             session.delete(chain)
>
>         orphan_vrs = (
>             session.query(TestVarRegion)
> #            .filter(~TestVarRegion.chains.any())
>             .all()
>         )
>         for orphan_vr in orphan_vrs:
>             session.delete(orphan_vr)
>         orphan_crs = (
>             session.query(TestConstRegion)
> #           .filter(~TestConstRegion.chains.any())
>             .all()
>         )
>         for orphan_cr in orphan_crs:
>             session.delete(orphan_cr)
>         orphan_sequences = (
>             session.query(TestMolSequence)
> #           .filter(~TestMolSequence.chains.any())
>             .all()
>         )
>
>
>
>
>
>
>
>
> On Tue, Apr 13, 2021, at 10:03 AM, [email protected] wrote:
>
> I need to delete the association table rows for many to many relationships
> when I delete one, but the default behavior (to remove those rows) does not
> seem to work in my case.
>
> I have multiple levels of many to many relationships, as you can see in
> the example I'll provide below and when I delete a "parent" afterwards I
> try to clean up any children left behind that have no other parents.
> However, these children are in many to many relationships with other
> children and that's when the ORM fails to attempt to remove those children
> from their related association tables (at least, in a way that I expect).
>
> The issue is error is:
>     def do_executemany(self, cursor, statement, parameters, context=None):
>         if self.executemany_mode is EXECUTEMANY_DEFAULT:
> >           cursor.executemany(statement, parameters)
> E           sqlalchemy.exc.IntegrityError:
> (psycopg2.errors.ForeignKeyViolation) update or delete on table
> "test_var_region" violates foreign key constraint
> "test_chain_var_region_var_region_id_fkey" on table "test_chain_var_region"
> E           DETAIL:  Key (id)=(1) is still referenced from table
> "test_chain_var_region".
> E
> E           [SQL: DELETE FROM test_mol_sequence WHERE test_mol_sequence.id
> = %(id)s]
> E           [parameters: ({'id': 5}, {'id': 6}, {'id': 7}, {'id': 8},
> {'id': 9}, {'id': 10})]
> E           (Background on this error at: http://sqlalche.me/e/gkpj)
>
> The desired effect, of course, is that the rows in test_chain_var_region
> that reference the deleted chains removed.  I've tried several strategies
> to do this but with no change in this behavior.
>
> Cascades could be an issue, and I would rather handle removal of any
> "orphan" rows in the model tables via business logic than have the database
> cascade deletes and potentially remove rows that are associated with other
> objects.
>
> import pytest
> from sqlalchemy import (
> Table,
> Column,
> Integer,
> String,
> ForeignKey,
> create_engine,
> )
> from sqlalchemy.ext.declarative import declarative_base, DeclarativeMeta
> from sqlalchemy.orm import relationship, Session
> from sqlalchemy.util import OrderedSet
>
>
> Base: DeclarativeMeta = declarative_base()
> engine = create_engine(
> "postgresql://postgres:postgres@localhost:5432/espresso", echo=True
> )
> test_chain_const_region = Table(
> "test_chain_const_region",
> Base.metadata,
> Column("chain_id", Integer, ForeignKey("test_chain.id")),
> Column("const_region_id", Integer, ForeignKey("test_const_region.id")),
> )
> test_chain_var_region = Table(
> "test_chain_var_region",
> Base.metadata,
> Column("chain_id", Integer, ForeignKey("test_chain.id")),
> Column("var_region_id", Integer, ForeignKey("test_var_region.id")),
> )
> test_molecule_chain = Table(
> "test_molecule_chain",
> Base.metadata,
> Column("molecule_id", Integer, ForeignKey("test_molecule.id")),
> Column("chain_id", Integer, ForeignKey("test_chain.id")),
> )
>
> test_mol_sequence_feat_mol_sequence = Table(
> "test_mol_sequence_feat_mol_sequence",
> Base.metadata,
> Column("mol_sequence_feat_id", Integer, ForeignKey("
> test_mol_sequence_feat.id")),
> Column("mol_sequence_id", Integer, ForeignKey("test_mol_sequence.id")),
> )
>
>
> class TestMolecule(Base):
> __tablename__ = "test_molecule"
> id = Column(Integer, primary_key=True)
> label = Column(String)
> chains = relationship(
> "TestChain",
> secondary=test_molecule_chain,
> collection_class=OrderedSet,
> back_populates="molecules",
> )
>
>
> class TestMolSequence(Base):
> __tablename__ = "test_mol_sequence"
>
> id = Column(Integer, primary_key=True)
> content = Column(String, nullable=False, unique=True)
> parent_features = relationship(
> "TestMolSequenceFeat",
> secondary=test_mol_sequence_feat_mol_sequence,
> collection_class=OrderedSet,
> back_populates="feature_sequences",
> single_parent=True,
> )
> chains = relationship(
> "TestChain", back_populates="mol_sequence", collection_class=OrderedSet
> )
>
>
> class TestMolSequenceFeat(Base):
> __tablename__ = "test_mol_sequence_feat"
>
> id = Column(Integer, primary_key=True)
> molecule_sequence_id = Column(
> Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"),
> )
> molecule_sequence = relationship("TestMolSequence",)
> start = Column(Integer)
> stop = Column(Integer)
> feature_sequences = relationship(
> "TestMolSequence",
> secondary=test_mol_sequence_feat_mol_sequence,
> collection_class=OrderedSet,
> back_populates="parent_features",
> # single_parent=True,
> )
>
>
> class TestChain(Base):
> __tablename__ = "test_chain"
>
> id = Column(Integer, primary_key=True)
> label = Column(String)
> chain_type = Column(String)
> mol_sequence_id = Column(Integer, ForeignKey("test_mol_sequence.id"))
> mol_sequence = relationship("TestMolSequence", back_populates="chains")
> molecules = relationship(
> "TestMolecule",
> secondary=test_molecule_chain,
> collection_class=OrderedSet,
> back_populates="chains",
> )
> var_regions = relationship(
> "TestVarRegion",
> secondary=test_chain_var_region,
> collection_class=OrderedSet,
> back_populates="chains",
> )
> const_regions = relationship(
> "TestConstRegion",
> secondary=test_chain_const_region,
> collection_class=OrderedSet,
> back_populates="chains",
> )
>
>
> class TestVarRegion(Base):
> __tablename__ = "test_var_region"
>
> id = Column(Integer, primary_key=True)
> molecule_sequence_id = Column(
> Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"),
> )
> description = Column(String)
> additional_information = Column(String)
> label = Column("label", String, nullable=True, unique=False)
> molecule_sequence = relationship("TestMolSequence")
> chains = relationship(
> "TestChain",
> secondary=test_chain_var_region,
> collection_class=OrderedSet,
> back_populates="var_regions",
> passive_deletes=True,
> )
>
>
> class TestConstRegion(Base):
> __tablename__ = "test_const_region"
>
> id = Column(Integer, primary_key=True)
> molecule_sequence_id = Column(
> Integer, ForeignKey("test_mol_sequence.id", ondelete="CASCADE"),
> )
> description = Column(String)
> additional_information = Column(String)
> label = Column("label", String, nullable=True, unique=False)
> molecule_sequence = relationship("TestMolSequence")
> chains = relationship(
> "TestChain",
> secondary=test_chain_const_region,
> collection_class=OrderedSet,
> back_populates="const_regions",
> passive_deletes=True,
> )
>
>
> class TestManyToMany:
> @pytest.fixture
> def engine(self):
> return create_engine(
> "postgresql://postgres:postgres@localhost:5432/espresso", echo=True
> )
>
> @pytest.fixture
> def session(self):
>
> *"""Returns an sqlalchemy session, and after the test tears down
> everything properly."""* connection = engine.connect()
> # begin the nested transaction
> transaction = connection.begin()
> # use the connection with the already started transaction
> session = Session(bind=connection)
>
> yield session
>
> session.close()
> # roll back the broader transaction
> transaction.commit()
> # put back the connection to the connection pool
> connection.close()
>
> @pytest.mark.create_m2m_models
> def test_create_m2m_models(self, engine):
>
> Base.metadata.drop_all(bind=engine)
> Base.metadata.create_all(bind=engine)
>
> @pytest.mark.seed_m2m_data
> def test_m2m_seeding_data(self, engine, session):
>
> molecule1 = TestMolecule(label="molecule1")
> molecule2 = TestMolecule(label="molecule2")
> molecule3 = TestMolecule(label="molecule3")
> molecule4 = TestMolecule(label="molecule4")
> light_chain_1_sequence = TestMolSequence(content="taglconst1VAR1")
> heavy_chain_1_sequence = TestMolSequence(content="tagheavyconstant1VAR2")
> heavy_chain_2_sequence = TestMolSequence(content="tagheavyconstant2VAR2")
> light_chain_2_sequence = TestMolSequence(content="taglconst1VAR3")
> heavy_chain_1 = TestChain(
> chain_type="heavy", mol_sequence=light_chain_1_sequence
> )
> light_chain_1 = TestChain(
> chain_type="light", mol_sequence=heavy_chain_1_sequence
> )
> light_chain_2 = TestChain(
> chain_type="light", mol_sequence=light_chain_2_sequence
> )
> heavy_chain_2 = TestChain(
> chain_type="heavy", mol_sequence=heavy_chain_2_sequence
> )
>
> molecule1.chains.add(heavy_chain_1)
> molecule1.chains.add(light_chain_1)
> molecule2.chains.add(heavy_chain_2)
> molecule2.chains.add(light_chain_2)
> molecule3.chains.add(heavy_chain_1)
> molecule3.chains.add(light_chain_2)
> molecule4.chains.add(heavy_chain_2)
> molecule4.chains.add(light_chain_1)
>
> tag_sequence = TestMolSequence(content="tag")
> light_constant_region_seq = TestMolSequence(content="lconst1")
> heavy_constant_region_1_seq = TestMolSequence(content="heavyconstant1")
> heavy_constant_region_2_seq = TestMolSequence(content="heavyconstant2")
> vr1_seq = TestMolSequence(content="VAR1")
> vr2_seq = TestMolSequence(content="VAR2")
> vr3_seq = TestMolSequence(content="VAR3")
> # lc2_const_region_seq = TestMolSequence(content="lconst")
> lc1_tag_feature = TestMolSequenceFeat(
> start=0, stop=3, molecule_sequence=light_chain_1_sequence
> )
> lc1_const_region_feature = TestMolSequenceFeat(
> start=3, stop=10, molecule_sequence=light_chain_1_sequence
> )
> lc1_var_region_feature = TestMolSequenceFeat(
> start=10, stop=14, molecule_sequence=light_chain_1_sequence
> )
> hc1_tag_feature = TestMolSequenceFeat(
> start=0, stop=3, molecule_sequence=heavy_chain_1_sequence
> )
> hc1_const_region_feature = TestMolSequenceFeat(
> start=3, stop=17, molecule_sequence=heavy_chain_1_sequence
> )
> hc1_var_region_feature = TestMolSequenceFeat(
> start=17, stop=21, molecule_sequence=heavy_chain_1_sequence
> )
>
> hc2_tag_feature = TestMolSequenceFeat(
> start=0, stop=3, molecule_sequence=heavy_chain_2_sequence
> )
> hc2_const_region_feature = TestMolSequenceFeat(
> start=3, stop=17, molecule_sequence=heavy_chain_2_sequence
> )
> hc2_var_region_feature = TestMolSequenceFeat(
> start=17, stop=21, molecule_sequence=heavy_chain_2_sequence
> )
>
> lc2_tag_feature = TestMolSequenceFeat(
> start=0, stop=3, molecule_sequence=light_chain_2_sequence
> )
> lc2_const_region_feature = TestMolSequenceFeat(
> start=3, stop=10, molecule_sequence=light_chain_2_sequence
> )
> lc2_var_region_feature = TestMolSequenceFeat(
> start=10, stop=14, molecule_sequence=light_chain_2_sequence
> )
> var_region1 = TestVarRegion(molecule_sequence=vr1_seq)
> var_region2 = TestVarRegion(molecule_sequence=vr2_seq)
> var_region3 = TestVarRegion(molecule_sequence=vr3_seq)
> const_region1 =
> TestConstRegion(molecule_sequence=light_constant_region_seq)
> const_region2 =
> TestConstRegion(molecule_sequence=heavy_constant_region_1_seq)
> const_region3 =
> TestConstRegion(molecule_sequence=heavy_constant_region_2_seq)
>
> light_chain_1.var_regions.add(var_region1)
> heavy_chain_1.var_regions.add(var_region2)
> heavy_chain_2.var_regions.add(var_region2)
> light_chain_2.var_regions.add(var_region3)
>
> light_chain_1.const_regions.add(const_region1)
> light_chain_2.const_regions.add(const_region1)
> heavy_chain_1.const_regions.add(const_region2)
> heavy_chain_2.const_regions.add(const_region3)
>
> lc1_tag_feature.feature_sequences.add(tag_sequence)
> lc1_var_region_feature.feature_sequences.add(vr1_seq)
> lc1_const_region_feature.feature_sequences.add(light_constant_region_seq)
>
> hc1_tag_feature.feature_sequences.add(tag_sequence)
> hc1_var_region_feature.feature_sequences.add(vr2_seq)
> hc1_const_region_feature.feature_sequences.add(heavy_constant_region_1_seq)
>
> lc2_tag_feature.feature_sequences.add(tag_sequence)
> lc2_var_region_feature.feature_sequences.add(vr3_seq)
> lc2_const_region_feature.feature_sequences.add(light_constant_region_seq)
>
> hc2_tag_feature.feature_sequences.add(tag_sequence)
> hc2_var_region_feature.feature_sequences.add(vr2_seq)
> hc2_const_region_feature.feature_sequences.add(heavy_constant_region_2_seq)
>
> session.add_all(
> [
> heavy_chain_1,
> light_chain_1,
> light_chain_2,
> heavy_chain_2,
> molecule1,
> molecule2,
> molecule3,
> molecule4,
> # tag_sequence,
> # lc1_tag_feature,
> # lc1_const_region_feature,
> # hc1_tag_feature,
> # hc1_var_region_feature,
> # lc1_tag_feature,
> # lc2_const_region_feature,
> ]
> )
> session.commit()
>
> @pytest.mark.delete_test_m2m_models
> def test_create_m2m(self, session):
> molecule = session.query(TestMolecule).filter_by(label="molecule1").one()
> session.delete(molecule)
> session.query(TestChain).filter(~TestChain.molecules.any()).delete(
> synchronize_session="fetch"
> )
> session.expire_all()
> orphan_chains = (
> session.query(TestChain).filter(~TestChain.molecules.any()).all()
> )
> for chain in orphan_chains:
> session.delete(chain)
> orphan_vrs = (
> session.query(TestVarRegion).filter(~TestVarRegion.chains.any()).all()
> )
> for orphan_vr in orphan_vrs:
> session.delete(orphan_vr)
> orphan_crs = (
> session.query(TestConstRegion)
> .filter(~TestConstRegion.chains.any())
> .all()
> )
> for orphan_cr in orphan_crs:
> session.delete(orphan_cr)
> orphan_sequences = (
> session.query(TestMolSequence).filter(~TestMolSequence.chains.any()).all()
> )
> orphan_sequence: TestMolSequence
> for orphan_sequence in orphan_sequences:
> session.delete(orphan_sequence)
> session.commit()
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/7963c7d7-9053-436e-ae5b-f92519ddeb0en%40googlegroups.com
> <https://groups.google.com/d/msgid/sqlalchemy/7963c7d7-9053-436e-ae5b-f92519ddeb0en%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sqlalchemy/56bfe88d-8b81-4541-9192-86d42ebcc27f%40www.fastmail.com
> <https://groups.google.com/d/msgid/sqlalchemy/56bfe88d-8b81-4541-9192-86d42ebcc27f%40www.fastmail.com?utm_medium=email&utm_source=footer>
> .
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CAL6EnB4LLyBJiqLFaG9xO2wZAE08yeipo_Ap7M2h-Csk0JTamg%40mail.gmail.com.

Re: [sqlalchemy] Issue with "complex" many 2 many delete, FK violation

Reply via email to