I've spent time unsuccessfully trying to fix some problems
with a many-to-many relationship and lazy joins.

Here's a simplified repro:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    import random
    from sqlalchemy import create_engine, ForeignKey
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import relationship, sessionmaker, joinedload

    engine = create_engine('sqlite:///appts.db', echo=True)
    Base = declarative_base()

    #LAZYJOIN = 'dynamic'
    LAZYJOIN = 'joined'

    class Appointment(Base):
        __tablename__ = 'appointments'

        id = Column(Integer, primary_key=True, autoincrement=True, 
nullable=False)
        subject = Column(String)
        persons = relationship(
            'AppointmentPerson',
            cascade='delete, save-update, merge, expunge',
            lazy=LAZYJOIN)
        # rel1 = relationship('Foo')
        # rel2 = relationship('Bar')

    class Person(Base):
        __tablename__ = 'persons'

        id = Column(Integer, primary_key=True, autoincrement=True, 
nullable=False)
        name = Column(String)

    class AppointmentPerson(Base):
        """augmented association table between Appointment and Person"""
        __tablename__ = 'appointment_persons'

        appointment_id = Column(Integer, ForeignKey(Appointment.id, 
ondelete='CASCADE'), nullable=False, primary_key=True)
        person_id = Column(Integer, ForeignKey(Person.id), nullable=False, 
primary_key=True)
        person = relationship(Person)
        # Other columns omitted that are necessary for the real use case

    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)

    def create_data(num_appts=20, num_people=20):
        random.seed(12345)

        session = Session()
        persons = []

        for i in range(num_people):
            person = Person(name="Person_%03d" % (i+1))
            session.add(person)
            persons.append(person)

        session.flush()

        for i in range(num_appts):
            appt = Appointment(subject="Appt_%03d" % (i+1))
            session.add(appt)
            session.flush()

            random.shuffle(persons)

            attendee_count = random.randrange(4) + 1
            for k in range(1, attendee_count):
                p = persons[k]
                print i, k, p.name
                ap = AppointmentPerson(appointment_id=appt.id, 
person_id=p.id)
                appt.persons.append(ap)

                session.add(ap)

        session.commit()

    def query_all_appointments(session):
        query = session.query(Appointment)
        if LAZYJOIN == 'joined':
            query = query.options(joinedload(Appointment.persons)
                                  .joinedload(AppointmentPerson.person))
        return query.all()

    def show_data():
        print "show_data"
        session = Session()
        appts = query_all_appointments(session)
        print len(appts)
        for a in appts:
            people = ", ".join([ap.person.name for ap in a.persons])
            x = "{0}: {1}".format(a.subject, people)

    def update_appointment_people(appt_id):
        session = Session()
        appt = session.query(Appointment).filter(Appointment.id == 
appt_id).one()
        appt.persons.delete()
        session.commit()

    if __name__ == '__main__':
        create_data()
        show_data()
        update_appointment_people(7)

The code originally used

    Appointment.persons = relationship(
        'AppointmentPerson',
        cascade='delete, save-update, merge, expunge',
        lazy='dynamic')

where everything worked, but accessing the persons on every appointment
triggered a separate query for AppointmentPersons, followed by
queries for each Person.

Changing the relationship to `lazy='joined'` and using joinedload

    query = (session.query(Appointment). 
                 options(joinedload(Appointment.persons).
                         joinedload(AppointmentPerson.person)))
    return query.all()

reduces this to one SQL query:

    SELECT appointments.id AS appointments_id,
        appointments.subject AS appointments_subject,
        persons_1.id AS persons_1_id,
        persons_1.name AS persons_1_name,
        appointment_persons_1.appointment_id AS 
appointment_persons_1_appointment_id,
        appointment_persons_1.person_id AS appointment_persons_1_person_id
    FROM appointments
    LEFT OUTER JOIN
        appointment_persons AS appointment_persons_1
        ON appointments.id = appointment_persons_1.appointment_id
    LEFT OUTER JOIN persons AS persons_1
        ON persons_1.id = appointment_persons_1.person_id

which is great.

Aside: In the original code, we're using MySQL, not SQLite.
On my MacBook, I have been unable to reproduce the pathological select 
behavior
with MySQL 5.6, whereas it repros consistently on Amazon's Elastic Beanstalk
with MySQL 5.5 at RDS. (SQLAlchemy 0.9.2, Python 2.7 in both cases.)
I found this baffling.

However, with `lazy='joined'` and using joinedload,
I no longer know how to delete the AppointmentPersons
associated with an Appointment. `appt.persons.delete()` used to work;
now I get `AttributeError: 'InstrumentedList' object has no attribute 
'delete'`

I've tried a variety of different approaches, all unsuccessfully.
Other approaches end up with IntegrityErrors and other things I can't 
remember now.

The real reason why I want to delete AppointmentPersons is
because I have the following pattern in my REST APIs.

    create:
        appt = Appointment()
        session.add(appt)
        session.flush()
        # appt.id is now valid
        update_model_from_request(appt, request.json_body)

    get_by_id:
        return session.query(Appointment).filter(Appointment.id == 
request_id).one()

    update_by_id:
        appt = flush_model(request_id)
        update_model_from_request(appt, request.json_body)

    delete_by_id
        appt = flush_model(request_id)
        session.delete(appt)

    flush_model:
        appt = get_by_id(request_id)
        # Purge various relationships
        appt.rel1.delete()
        appt.rel2.delete()
        appt.people.delete()  # <--
        return appt

    update_model_from_request(appt, json_body):
        # set the fields in an empty appt from json_body
        for p in json_body['people']:
            person = session.query(Person).filter(Person.id == 
p['id']).first() or Person()
            person.name = p['name']
            session.add(person)
            session.flush()
            ap = AppointmentPerson(appointment_id=appt.id, 
person_id=person.id)
            session.add(ap)
        # update rel1 and rel2

This isn't especially efficent and leads to some churn in Persons
and AppointmentPersons, but it's easy to write.

Is there a better way to accomplish this? I could, for the
update_by_id case, partition into (added, deleted, unchanged)
but that's somewhat tedious to compute.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to