Hello Simon,

It really looks like you put the finger on the problem. 
I'm going to try this now and see if it solves the problem, because we're 
currently building Comm objects and not Email or Phones. 

Thank you for your reply !  

On Friday, May 18, 2018 at 2:22:14 PM UTC+2, Simon King wrote:
>
> On Fri, May 18, 2018 at 12:19 PM, Benjamin B. <wix...@gmail.com 
> <javascript:>> wrote: 
> > Hello everybody, 
> > 
> > TL;DR We are experiencing a bug where, with perfectly valid data, the 
> > polymorphic_identity is randomly (0,05%) not respected, and the object 
> > ends-up being the parent class instead of the "requested" child class. 
> > 
> > First, some code ! We have here a three table schema : 
> > 
> >     Comm, the table that contains means of communication (emails, 
> phones, 
> > ...) 
> >     Link, which is a table responsible for linking two or more entities 
> (of 
> > any kind) (legacy architecture... we're slowly getting rid of it) 
> >     Contact, which represents a user. 
> > 
> > 
> > class CommTypes(enum.Enum): 
> >     email = 1 
> >     phone = 2 
> > 
> > 
> > class Comm(db.Model): 
> >     __tablename__ = 'comm' 
> > 
> >     id = db.Column('id', db.Integer, primary_key=True) 
> >     type_id = db.Column('type', db.Integer) 
> > 
> >     __mapper_args__ = { 
> >         'polymorphic_on': type_id 
> >     } 
> > 
> > 
> > class Email(Comm): 
> >     __mapper_args__ = { 
> >         'polymorphic_identity': CommTypes.email.value 
> >     } 
> > 
> > 
> > class Phone(Comm): 
> >     __mapper_args__ = { 
> >         'polymorphic_identity': CommTypes.phone.value 
> >     } 
> > 
> > 
> > class Contact(db.Model): 
> >     id = db.Column('id', db.Integer, primary_key=True) 
> > 
> >     _comms = db.relationship( 
> >         'Comm', 
> >         secondary='link', 
> >         primaryjoin="and_(Contact.id == link.c.id1, link.c.type == 1)", 
> >         secondaryjoin="link.c.id2 == Comm.id", 
> >         foreign_keys='[link.c.id2, link.c.id1]' 
> >     ) 
> > 
> >     @hybrid_property 
> >     def emails(self): 
> >         return [comm for comm in self._comms if isinstance(comm, Email)] 
> > 
> >     def email(): 
> >         if not self.emails: 
> >             return None 
> > 
> >         return self.emails[0] 
> > 
> > The problem we're having is that: in production, with a perfectly valid 
> > contact, its email comm and the appropriate 'link' committed ; right 
> after 
> > being committed, it happens that the email comm which should be an 
> instance 
> > of Email, is actually an instance of Comm. Having a PDB open at the time 
> the 
> > bug happens show us that we have indeed a Comm instead of an Email, but 
> > doing a db.session.refresh(contact) in the repl seems to solve the 
> problem. 
> > 
> > Inspecting the content of the object (meant to be an Email) show us that 
> the 
> > type_id is indeed the right one (=CommTypes.email.value), but for a 
> reason 
> > we do not fully understand, it is a Comm instead. We tried to put a 
> debug 
> > helper in production and we found out that immediately after, trying to 
> get 
> > a new contact with Contact.query.get(contact.id) and getting the email 
> of 
> > this new contact, does lead to the exact same problem. 
> > 
> > Having a random bug, we tried to reproduce by 'brute-forcing' with the 
> code 
> > below, which worked. It takes a random amount of time until the bug is 
> > triggered. 
> > 
> > import requests 
> > import random 
> > import string 
> > 
> > def rand_str(length=10): 
> >     return ''.join(random.choice(string.ascii_lowercase) for _ in 
> > range(length)) 
> > 
> > def rand_email(): 
> >     return 'debug_' + rand_str() + '@' + rand_str() + '.' + 
> > rand_str(length=3) 
> > 
> > def gen_user_payload(): 
> >     return { 
> >         "action": "signup", 
> >         "email": rand_email(), 
> >     } 
> > 
> > def register_all_users(): 
> >     session = requests.Session() 
> >     for _ in range(10000): 
> >         user = gen_user_payload() 
> >         session.post('https://xxx:5000/auth/register', json=user, 
> > verify=False) 
> > 
> > register_all_users() 
> > 
> > Notes : 
> > 
> >     We are using flask / flask_sqlalchemy 
> >     Our sqlalchemy version is 1.1.16. We looked at the changelogs to see 
> if 
> > there was any update linked to our issue, but haven't seen anything in 
> > particular. 
> > 
> > 
> > Do you have any idea about the source of this problem ? What could cause 
> > this sort of bug ? 
> > If you need more data, more code .. do not hesitate to ask ! 
> > 
> > Anyway, thanks for building such an awesome tool ! 
> > 
>
> In your server code, do you ever construct Comm() instances directly, 
> rather than explicitly constructing Email() or Phone() instances? If 
> you do that, the Comm instance will be stored in the session's 
> identity map. As long as that instance exists in the session, 
> SQLAlchemy will not construct an instance of the appropriate subclass 
> (otherwise the session would contain 2 objects representing the same 
> row in the database, which SQLAlchemy tries to avoid). 
>
> For example, the script below creates a Comm object and tries to load 
> it back from the database without expunging the original. The result 
> is always a Comm object. Only after we expunged the session and reload 
> it do we get an Email object back. 
>
> Hope that helps, 
>
> Simon 
>
> ################################################## 
> import enum 
>
> import sqlalchemy as sa 
> import sqlalchemy.orm as saorm 
>
> from sqlalchemy.ext.declarative import declarative_base 
>
> Base = declarative_base() 
>
>
> class CommTypes(enum.Enum): 
>     email = 1 
>     phone = 2 
>
>
> class Comm(Base): 
>     __tablename__ = 'comm' 
>     id = sa.Column('id', sa.Integer, primary_key=True) 
>     type_id = sa.Column('type', sa.Integer) 
>
>     __mapper_args__ = { 
>         'polymorphic_on': type_id 
>     } 
>
>
> class Email(Comm): 
>     __mapper_args__ = { 
>         'polymorphic_identity': CommTypes.email.value 
>     } 
>
>
> class Phone(Comm): 
>     __mapper_args__ = { 
>         'polymorphic_identity': CommTypes.phone.value 
>     } 
>
> def msg(s): 
>     print 
>     print '#' * 70 
>     print s 
>
> if __name__ == '__main__': 
>     msg('creating database') 
>     engine = sa.create_engine('sqlite:///:memory:') 
>     Session = saorm.sessionmaker(bind=engine) 
>     Base.metadata.create_all(engine) 
>
>     session = Session() 
>     msg('constructing Comm instance') 
>     comm1 = Comm(type_id=CommTypes.email.value) 
>     session.add(comm1) 
>     session.flush() 
>     print comm1 
>     assert type(comm1) is Comm 
>     commid = comm1.id 
>
>     # Will probably not issue SQL 
>     msg('using query.get()') 
>     comm2 = session.query(Comm).get(commid) 
>     print comm2 
>     assert comm2 is comm1 
>
>     # Ought to issue SQL 
>     msg('using query.filter_by()') 
>     comm3 = session.query(Comm).filter_by(id=commid).first() 
>     print comm3 
>     assert comm3 is comm1 
>
>     msg('committing and expunging') 
>     session.commit() 
>     session.expunge_all() 
>
>     comm4 = session.query(Comm).get(commid) 
>     print comm4 
>     assert type(comm4) == Email 
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to