[sqlalchemy] Re: Slow relation based assignment.
hey mike,

Just to confirm - trunk fixes the problem with deletion. Additionally, I have removed the lazy loading condition and it maintains the speed of the query.

Thanks again to the team,
Martin

On Dec 7, 4:14 pm, Michael Bayer [EMAIL PROTECTED] wrote:
> hey martin - this bug is fixed in trunk r3868, so if you use the svn trunk you can either keep using the dynamic or go back to the regular relation; you should be good in both cases.
> - mike

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
[sqlalchemy] Re: Slow relation based assignment.
Hello again Michael,

I have read the documentation you referenced, but am still unsure how to delete a Tag without generating the following error (note - using Postgres in production):

(IntegrityError) update or delete on "tags" violates foreign key constraint "employeesTags_tag_id_fkey" on "employeesTags"
DETAIL: Key (id)=(3) is still referenced from table "employeesTags".
'DELETE FROM tags WHERE tags.id = %(id)s' {'id': 3}

Without lazy='dynamic' it works fine (correctly deletes entries from employeesTags first). The delete operation I am performing is:

session.begin()
entry = session.query(Tag).filter_by(id=3).first()
try:
    session.delete(entry)
    session.commit()
except Exception, error:
    print error
    session.rollback()
else:
    print 'Deleted successfully'

Thanks again for all your help so far,
Martin

On Dec 5, 5:27 pm, Michael Bayer [EMAIL PROTECTED] wrote:
> hi martin - the issue is that each Tag object contains a collection of 1000 employees on it, and when you make an assignment in the forwards direction (i.e. employee.tags.append(sometag)), the corresponding reverse relation needs to be fully loaded and then updated according to backref semantics. since you're using eager loading by default between employees and tags, there is a load of 20,000 rows each time an uninitialized tags.employees collection is touched. To prevent the backref from being unnecessarily loaded, and since it is a large collection, you should use a dynamic collection for the reverse:
>
>     mapper(Employee, employees, properties={
>         'tags': relation(Tag, secondary=employeesTags,
>                          backref=backref('employees', lazy='dynamic'),
>                          lazy=False)
>     })
>     mapper(Tag, tags)
>
> the employees collection on Tag is now a filterable Query object which only queries when read from, and you'll see that the time goes down to nothing. you can also append to and delete from a dynamic collection like a regular list. large collection techniques are discussed at: http://www.sqlalchemy.org/docs/04/mappers.html#advdatamapping_relatio...
> we do have a ticket in trac to try improving upon backrefs to not load unloaded collections in any case; this is ticket #871.
>
> [quoted original message and test case snipped]
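For reference, the constraint violation above can be reproduced outside SQLAlchemy with plain sqlite3 - a minimal sketch using a simplified schema modelled on the thread's test case (SQLite reports the error as "FOREIGN KEY constraint failed" rather than with Postgres's wording):

```python
import sqlite3

# Simplified schema modelled on the thread's test case.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
conn.execute("""CREATE TABLE employeesTags (
    employee_id INTEGER,
    tag_id INTEGER REFERENCES tags(id))""")
conn.execute("INSERT INTO tags VALUES (3, 'example')")
conn.execute("INSERT INTO employeesTags VALUES (1, 3)")

# Deleting the tag while a join row still references it violates the
# foreign key, just like the Postgres error quoted above.
try:
    conn.execute("DELETE FROM tags WHERE id = 3")
    print('Deleted successfully')
except sqlite3.IntegrityError as error:
    print(error)
```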
[sqlalchemy] Re: Slow relation based assignment.
On Dec 7, 2007, at 5:21 AM, Martin Pengelly-Phillips wrote:
> Hello again Michael,
> I have read the documentation you referenced, but am still unsure how to delete a Tag without generating the following error ...
> [rest of quoted message snipped]

hey Martin -

I think this is actually a bug in sqlalchemy regarding the dynamic relation; I've added ticket #895. I've recently enhanced regular relations to also not unnecessarily load backrefs; that code is in trunk if you'd like to try it, and I don't think it has this particular issue.

Otherwise, for now, when you load the Tag, you have two (well, three) choices:

1. you can load the collection of Employees attached to the Tag, and explicitly remove the tag from each employee;
2. you can implement ON DELETE CASCADE in your database on the foreign key in question so that it automatically updates itself;
3. you can, within the transaction, issue a DELETE FROM employeesTags WHERE tag_id=3 before you call session.commit(), although this might conflict with existing Employee records.

implementing ON DELETE CASCADE is definitely the most legit way to go here since it lets the database do most of the work.
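For illustration, the ON DELETE CASCADE option can be sketched with plain sqlite3 (a minimal sketch, not the thread's Postgres setup; table and column names follow the test case):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
# ON DELETE CASCADE on the join table's tag_id foreign key: deleting a tag
# automatically removes its join rows, so the DELETE no longer violates the FK.
conn.execute("""CREATE TABLE employeesTags (
    employee_id INTEGER,
    tag_id INTEGER REFERENCES tags(id) ON DELETE CASCADE)""")
conn.execute("INSERT INTO tags VALUES (3, 'example')")
conn.executemany("INSERT INTO employeesTags VALUES (?, ?)", [(1, 3), (2, 3)])

conn.execute("DELETE FROM tags WHERE id = 3")  # succeeds; cascades to join rows
print(conn.execute("SELECT COUNT(*) FROM employeesTags").fetchone()[0])  # 0
```

In Postgres the equivalent is declaring the constraint with ON DELETE CASCADE, after which session.delete(tag) plus commit succeeds without the ORM having to delete the join rows itself.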
[sqlalchemy] Re: Slow relation based assignment.
hey Mike,

Thanks for the update - I'll try it out tomorrow.

Martin

p.s. Have I mentioned you guys provide the best support I have encountered in a long time (including commercial products)?

On Dec 7, 4:14 pm, Michael Bayer [EMAIL PROTECTED] wrote:
> hey martin - this bug is fixed in trunk r3868, so if you use the svn trunk you can either keep using the dynamic or go back to the regular relation; you should be good in both cases.
> - mike
[sqlalchemy] Re: Slow relation based assignment.
Thank you Michael - I had completely missed the backref full load.

On Dec 5, 5:27 pm, Michael Bayer [EMAIL PROTECTED] wrote:
> hi martin - the issue is that each Tag object contains a collection of 1000 employees on it, and when you make an assignment in the forwards direction, the corresponding reverse relation needs to be fully loaded and then updated according to backref semantics ...
> [rest of quoted reply and test case snipped]
[sqlalchemy] Re: Slow relation based assignment.
hi martin -

the issue is that each Tag object contains a collection of 1000 employees on it, and when you make an assignment in the forwards direction (i.e. employee.tags.append(sometag)), the corresponding reverse relation needs to be fully loaded and then updated according to backref semantics. since you're using eager loading by default between employees and tags, there is a load of 20,000 rows each time an uninitialized tags.employees collection is touched.

To prevent the backref from being unnecessarily loaded, and since it is a large collection, you should use a dynamic collection for the reverse:

mapper(Employee, employees, properties={
    'tags': relation(Tag, secondary=employeesTags,
                     backref=backref('employees', lazy='dynamic'),
                     lazy=False)
})
mapper(Tag, tags)

the employees collection on Tag is now a filterable Query object which only queries when read from, and you'll see that the time goes down to nothing. you can also append to and delete from a dynamic collection like a regular list. large collection techniques are discussed at:
http://www.sqlalchemy.org/docs/04/mappers.html#advdatamapping_relation_largecollections

we do have a ticket in trac to try improving upon backrefs to not load unloaded collections in any case; this is ticket #871.

On Dec 5, 12:07 pm, Martin Pengelly-Phillips [EMAIL PROTECTED] wrote:
> Hello again,
>
> I have recently noticed that a particular assignment seems to be taking a relatively long time. Not being a database expert, I am confused as to whether the last assignment 'person.tags = tags' should be so slow when referencing existing tags that are used by other entities - it seems to try and get a list of all employees that use the given tag and then spends the time doing something with the resulting set, but why? Test case below. If the slow assignment is expected, do you have any advice on how to speed up such a statement?
>
> Thank you in advance,
> Martin
>
> --
> import os, datetime, time
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
>
> file = '/tmp/test.db'
> if os.path.isfile(file):
>     os.remove(file)
>
> engine = create_engine('sqlite:///%s' % file, echo=True)
> metadata = MetaData()
> Session = scoped_session(sessionmaker(autoflush=True, transactional=False, bind=engine))
> mapper = Session.mapper
>
> # Classes
> #--
> class Employee(object):
>     def __init__(self, name=None):
>         self.name = name
>     def __repr__(self):
>         return '%s:%s' % (self.id, self.name)
>
> class Tag(object):
>     def __init__(self, label):
>         self.label = label
>
> # Setup tables
> #--
> employees = Table('employees', metadata,
>     Column('id', Integer, primary_key=True),
>     Column('name', String, nullable=False, default='bob'),
>     Column('dob', DateTime, nullable=False, default=datetime.datetime.now),
> )
> tags = Table('tags', metadata,
>     Column('id', Integer, primary_key=True),
>     Column('label', String, nullable=False),
> )
> employeesTags = Table('employeesTags', metadata,
>     Column('employee_id', Integer, ForeignKey('employees.id')),
>     Column('tag_id', Integer, ForeignKey('tags.id')),
> )
>
> # Mappers
> #--
> mapper(Employee, employees, properties={
>     'tags': relation(Tag, secondary=employeesTags, backref='employees', lazy=False)})
> mapper(Tag, tags)
>
> # Test
> #--
> metadata.create_all(engine)
> session = Session()
> session.begin()
>
> tags = []
> for i in xrange(20):
>     tag = Tag(str(datetime.datetime.now()))
>     tags.append(tag)
>
> for i in xrange(1000):
>     p = Employee('john%d' % i)
>     p.tags = tags
>
> session.commit()
> session.clear()
> session.begin()
>
> tags = session.query(Tag).all()[:2]
> person = Employee('bob')
> started = time.time()
> person.tags = tags
> print 'Took:', time.time()-started
>
> session.commit()
> session.clear()
> Session.remove()
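To put rough numbers on the eager-loading cost described above, here is a sketch with plain sqlite3 (not SQLAlchemy; data sizes follow the test case) contrasting the per-tag backref load that eager loading triggers with the narrow query a dynamic collection would issue on demand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE employeesTags (employee_id INTEGER, tag_id INTEGER);
""")
# Test-case data sizes: 1000 employees, 20 tags, every employee has every tag.
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(i, 'john%d' % i) for i in range(1000)])
conn.executemany("INSERT INTO tags VALUES (?, ?)",
                 [(i, 'tag%d' % i) for i in range(20)])
conn.executemany("INSERT INTO employeesTags VALUES (?, ?)",
                 [(e, t) for e in range(1000) for t in range(20)])

# Populating one tag's eager 'employees' backref means fetching all of that
# tag's employee rows; doing it for 20 tags touches 20 x 1000 = 20,000 rows.
rows = conn.execute(
    "SELECT employees.* FROM employees "
    "JOIN employeesTags ON employees.id = employeesTags.employee_id "
    "WHERE employeesTags.tag_id = ?", (3,)).fetchall()
print(len(rows))  # 1000 rows per tag

# A dynamic collection instead issues only the query you ask for, when you
# ask for it - e.g. a count, without materializing any Employee objects.
count = conn.execute("SELECT COUNT(*) FROM employeesTags WHERE tag_id = ?",
                     (3,)).fetchone()[0]
print(count)  # 1000
```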