[sqlalchemy] Re: Slow relation based assignment.

2007-12-10 Thread Martin Pengelly-Phillips

hey mike,

Just to confirm - the trunk fixes the problem with deletion.
Additionally, I have removed the lazy loading condition and the
query is still fast.

Thanks again to the team,


Martin


On Dec 7, 4:14 pm, Michael Bayer [EMAIL PROTECTED] wrote:
 hey martin -

 this bug is fixed in trunk r3868, so if you use the svn trunk you can
 either keep using the dynamic or go back to the regular relation, you
 should be good in both cases.

 - mike

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Slow relation based assignment.

2007-12-07 Thread Martin Pengelly-Phillips

Hello again Michael,

I have read the documentation you referenced, but am still unsure how
to delete a Tag without generating the following error (note: using
Postgres in production):

(IntegrityError) update or delete on "tags" violates foreign key
constraint "employeesTags_tag_id_fkey" on "employeesTags"
DETAIL:  Key (id)=(3) is still referenced from table "employeesTags".
 'DELETE FROM tags WHERE tags.id = %(id)s' {'id': 3}

Without the lazy='dynamic' it works fine (correctly deletes entries
from employeesTags first).

The delete operation I am performing is:

session.begin()
entry = session.query(Tag).filter_by(id=3).first()
try:
    session.delete(entry)
    session.commit()
except Exception, error:
    print error
    session.rollback()
else:
    print 'Deleted successfully'


Thanks again for all your help so far,


Martin


On Dec 5, 5:27 pm, Michael Bayer [EMAIL PROTECTED] wrote:
 hi martin -

 the issue is that each Tag object contains a collection of 1000
 employees on it, and when you make an assignment in the forwards
 direction (i.e. employee.tags.append(sometag)), the corresponding
 reverse relation needs to be fully loaded and then updated according
 to backref semantics.  since you're using eager loading by default
 between employees and tags, there is a load of 20,000 rows each time
 an uninitialized tags.employees collection is touched.

 To prevent the backref from being unnecessarily loaded, and since it
 is a large collection, you should use a dynamic collection for the
 reverse:

 mapper(Employee, employees, properties={
     'tags': relation(Tag, secondary=employeesTags,
                      backref=backref('employees', lazy='dynamic'),
                      lazy=False)
 })

 mapper(Tag, tags)

 the employees collection on Tag is now a filterable Query object
 which only queries when read from, and you'll see that the time goes
 down to nothing.  you can also append and delete from a dynamic
 collection like a regular list.

 large collection techniques are discussed at:
 http://www.sqlalchemy.org/docs/04/mappers.html#advdatamapping_relatio...

 we do have a ticket in trac to try improving upon backrefs to not load
 unloaded collections in any case, this is ticket #871.


[sqlalchemy] Re: Slow relation based assignment.

2007-12-07 Thread Michael Bayer


On Dec 7, 2007, at 5:21 AM, Martin Pengelly-Phillips wrote:


 Hello again Michael,

 I have read the documentation you referenced, but am still unsure how
 to delete a Tag without generating the following error (note: using
 Postgres in production):

 (IntegrityError) update or delete on "tags" violates foreign key
 constraint "employeesTags_tag_id_fkey" on "employeesTags"
 DETAIL:  Key (id)=(3) is still referenced from table "employeesTags".
 'DELETE FROM tags WHERE tags.id = %(id)s' {'id': 3}

 Without the lazy='dynamic' it works fine (correctly deletes entries
 from employeesTags first).

 The delete operation I am performing is:

 session.begin()
 entry = session.query(Tag).filter_by(id=3).first()
 try:
     session.delete(entry)
     session.commit()
 except Exception, error:
     print error
     session.rollback()
 else:
     print 'Deleted successfully'


hey Martin -

I think this is actually a bug in SQLAlchemy regarding the dynamic
relation; I've added ticket #895.

I've recently enhanced regular relations to also not unnecessarily load
backrefs; that code is in trunk if you'd like to try it, and I don't
think it has this particular issue.

Otherwise, for now, when you load the Tag, you have two (well, three)
choices: you can load the collection of Employees attached to the
Tag and explicitly remove the tag from each employee; or, you can
implement ON DELETE CASCADE in your database on the foreign key in
question so that the association table cleans itself up; finally, you
can, within the transaction, issue a DELETE FROM "employeesTags" WHERE
tag_id = 3 before you call session.commit(), although this might
conflict with existing Employee records.

implementing ON DELETE CASCADE is definitely the most legit way to
go here since it lets the database do most of the work.
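
For reference, a minimal sketch of the ON DELETE CASCADE approach. This
uses Python's stdlib sqlite3 as a stand-in for Postgres (the DDL is the
same idea; sqlite just needs a pragma to enforce foreign keys), with
table and column names taken from the test case:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite must opt in to FK enforcement

conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT NOT NULL)")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# ON DELETE CASCADE on tag_id: deleting a tag also removes its association rows
conn.execute(
    "CREATE TABLE employeesTags ("
    " employee_id INTEGER REFERENCES employees(id),"
    " tag_id INTEGER REFERENCES tags(id) ON DELETE CASCADE)"
)

conn.execute("INSERT INTO tags VALUES (3, 'python')")
conn.execute("INSERT INTO employees VALUES (1, 'bob')")
conn.execute("INSERT INTO employeesTags VALUES (1, 3)")

# without the cascade, this DELETE would fail with an integrity error
conn.execute("DELETE FROM tags WHERE id = 3")
remaining = conn.execute("SELECT COUNT(*) FROM employeesTags").fetchone()[0]
print(remaining)  # 0 -- the association rows went away with the tag
```

The employees row is untouched; only the rows in the association table
that referenced the deleted tag are removed.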











[sqlalchemy] Re: Slow relation based assignment.

2007-12-07 Thread Martin Pengelly-Phillips

hey Mike,

Thanks for the update - I'll try it out tomorrow.

Martin

p.s. Have I mentioned that you guys provide the best support I have
encountered in a long time (commercial products included)?

On Dec 7, 4:14 pm, Michael Bayer [EMAIL PROTECTED] wrote:
 hey martin -

 this bug is fixed in trunk r3868, so if you use the svn trunk you can
 either keep using the dynamic or go back to the regular relation, you
 should be good in both cases.

 - mike




[sqlalchemy] Re: Slow relation based assignment.

2007-12-05 Thread Martin Pengelly-Phillips

Thank you Michael - I had completely missed the backref full load.



On Dec 5, 5:27 pm, Michael Bayer [EMAIL PROTECTED] wrote:
 hi martin -

 the issue is that each Tag object contains a collection of 1000
 employees on it, and when you make an assignment in the forwards
 direction (i.e. employee.tags.append(sometag)), the corresponding
 reverse relation needs to be fully loaded and then updated according
 to backref semantics.  since you're using eager loading by default
 between employees and tags, there is a load of 20,000 rows each time
 an uninitialized tags.employees collection is touched.

 To prevent the backref from being unnecessarily loaded, and since it
 is a large collection, you should use a dynamic collection for the
 reverse:

 mapper(Employee, employees, properties={
     'tags': relation(Tag, secondary=employeesTags,
                      backref=backref('employees', lazy='dynamic'),
                      lazy=False)
 })

 mapper(Tag, tags)

 the employees collection on Tag is now a filterable Query object
 which only queries when read from, and you'll see that the time goes
 down to nothing.  you can also append and delete from a dynamic
 collection like a regular list.

 large collection techniques are discussed at:
 http://www.sqlalchemy.org/docs/04/mappers.html#advdatamapping_relatio...

 we do have a ticket in trac to try improving upon backrefs to not load
 unloaded collections in any case, this is ticket #871.




[sqlalchemy] Re: Slow relation based assignment.

2007-12-05 Thread Michael Bayer

hi martin -

the issue is that each Tag object contains a collection of 1000
employees on it, and when you make an assignment in the forwards
direction (i.e. employee.tags.append(sometag)), the corresponding
reverse relation needs to be fully loaded and then updated according
to backref semantics.  since you're using eager loading by default
between employees and tags, there is a load of 20,000 rows each time
an uninitialized tags.employees collection is touched.

To prevent the backref from being unnecessarily loaded, and since it
is a large collection, you should use a dynamic collection for the
reverse:

mapper(Employee, employees, properties={
    'tags': relation(Tag, secondary=employeesTags,
                     backref=backref('employees', lazy='dynamic'),
                     lazy=False)
})

mapper(Tag, tags)


the employees collection on Tag is now a filterable Query object
which only queries when read from, and you'll see that the time goes
down to nothing.  you can also append and delete from a dynamic
collection like a regular list.

large collection techniques are discussed at:
http://www.sqlalchemy.org/docs/04/mappers.html#advdatamapping_relation_largecollections

we do have a ticket in trac to try improving upon backrefs to not load
unloaded collections in any case, this is ticket #871.
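
To see what the dynamic collection buys you without any ORM in the way,
here is a rough stdlib sketch of the same pattern - this is not
SQLAlchemy API, and the DynamicEmployees class is a made-up stand-in
for the 'employees' backref: the collection object holds only the
filter criteria, and each read turns into a narrow query instead of
loading 1000 Employee rows up front (table names follow the test case,
using sqlite3 in place of the real database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE employeesTags (employee_id INTEGER, tag_id INTEGER)")
# 1000 employees, all associated with tag 3, as in the test case
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(i, 'john%d' % i) for i in range(1000)])
conn.executemany("INSERT INTO employeesTags VALUES (?, ?)",
                 [(i, 3) for i in range(1000)])

class DynamicEmployees(object):
    """Stand-in for a dynamic backref: constructing it fetches nothing;
    every read issues a query scoped to this tag."""
    def __init__(self, conn, tag_id):
        self.conn, self.tag_id = conn, tag_id   # cheap: criteria only, no rows

    def count(self):
        return self.conn.execute(
            "SELECT COUNT(*) FROM employeesTags WHERE tag_id = ?",
            (self.tag_id,)).fetchone()[0]

    def filter_by_name(self, name):
        return self.conn.execute(
            "SELECT e.id, e.name FROM employees e"
            " JOIN employeesTags et ON et.employee_id = e.id"
            " WHERE et.tag_id = ? AND e.name = ?",
            (self.tag_id, name)).fetchall()

employees = DynamicEmployees(conn, tag_id=3)  # instant: nothing loaded yet
print(employees.count())                      # 1000, computed in the database
print(employees.filter_by_name('john7'))      # [(7, 'john7')]
```

The eager backref, by contrast, would materialize all 1000 rows the
moment the collection is touched - which is exactly the 20,000-row load
described above.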


On Dec 5, 12:07 pm, Martin Pengelly-Phillips
[EMAIL PROTECTED] wrote:
 Hello again,

 I have recently noticed that a particular assignment seems to be
 taking a relatively long time.
 Not being a database expert I am confused as to whether the last
 assignment 'person.tags = tags' should be so slow when referencing
 existing tags that are used by other entities - it seems to try and
 get a list of all employees that use the given tag and then spends the
 time doing something with the resulting set, but why?

 Test case below.

 If the slow assignment is expected do you have any advice on how to
 speed up such a statement?

 Thank you in advance,

 Martin

 --

 import os, datetime, time
 from sqlalchemy import *
 from sqlalchemy.orm import *

 file = '/tmp/test.db'
 if os.path.isfile(file): os.remove(file)
 engine = create_engine('sqlite:///%s' % file, echo=True)
 metadata = MetaData()
 Session = scoped_session(sessionmaker(autoflush=True,
 transactional=False, bind=engine))
 mapper = Session.mapper

 # Classes
 #--
 class Employee(object):
     def __init__(self, name=None):
         self.name = name

     def __repr__(self):
         return '%s:%s' % (self.id, self.name)

 class Tag(object):
     def __init__(self, label):
         self.label = label

 # Setup tables
 #--
 employees = Table('employees', metadata,
     Column('id', Integer, primary_key=True),
     Column('name', String, nullable=False, default='bob'),
     Column('dob', DateTime, nullable=False, default=datetime.datetime.now),
 )

 tags = Table('tags', metadata,
     Column('id', Integer, primary_key=True),
     Column('label', String, nullable=False),
 )

 employeesTags = Table('employeesTags', metadata,
     Column('employee_id', Integer, ForeignKey('employees.id')),
     Column('tag_id', Integer, ForeignKey('tags.id')),
 )

 # Mappers
 #--
 mapper(Employee, employees, properties={
     'tags': relation(Tag, secondary=employeesTags,
                      backref='employees', lazy=False)})

 mapper(Tag, tags)

 # Test
 #--
 metadata.create_all(engine)
 session = Session()
 session.begin()

 tags = []
 for i in xrange(20):
     tag = Tag(str(datetime.datetime.now()))
     tags.append(tag)

 for i in xrange(1000):
     p = Employee('john%d' % i)
     p.tags = tags

 session.commit()
 session.clear()

 session.begin()
 tags = session.query(Tag).all()[:2]
 person = Employee('bob')

 started = time.time()
 person.tags = tags
 print 'Took:', time.time()-started

 session.commit()
 session.clear()
 Session.remove()