Re: [Sqlalchemy-users] Many to many problem?

Shuo Yang Sun, 04 Dec 2005 18:50:02 -0800

On 12/4/05, Michael Bayer <[EMAIL PROTECTED]> wrote:

get ready for a long one...(this has a lot of unedited thought
process in it)

Just to start off,  the example in my previous email was just  the
state of what it is *now*.   so at the moment, the issue is not dealt
with very well.  Theres probably going to be a lot of issues where Im
going to come out with the ugly unfinished truth of it,  but that
doesnt mean thats how its going to stay..its just until we figure out
how it should really work.  The criterion for a first release is, we
think we've nailed most of these issues and the API has stabilized,
and all the way at the bottom of this email is a potential API change
to deal with this, so thats good.

when you have an object A that has a collection of B, and you change
the contents of the collection on A, then you do a commit, the
database is updated to reflect the changes of the collection on A.
In that case, everything is fine, A is already up to date before the
commit even happened, since you modified its contents programmatically.

the problem arises when you have B sitting around which also contains
a relationship, in the other direction, to one or more objects of
type A.  Even though B was attached to A, B wasnt modified, but now
B's reference to A, which may be a list (many-to-many) or just a
scalar (one-to-many from A's direction), is potentially in an invalid
state.  At the moment, SQLAlchemy didnt even put "B" into the
transaction, since nothing was changed on it, so nothing happens to it.

I had figured that this issue would work itself out mostly in that
applications would put object modifications within an
objectstore.begin()/objectstore.commit() pair, placed at the end of a
session's lifecycle, which would be reset the next time a user
session started.  This is how the ZBlog application gets around the
whole issue.

ZBlog also gets around the issue in another way;  while it has two-
way relationships, it only modifies those relationships in one
direction.  So while a Blog has its owning User attached to it before
saving, it never takes a User and appends a Blog to its 'blogs'
attribute.  The 'blogs' attribute is set up with a flag "live=True",
which means that its a lazyload attribute that *always* loads,
everytime you access it.  So its not useful to write to it.  The
"live=True" idea represents my first attempt at managing this issue.

But now, both of you want to have two way relationships, *and* you
want to set them in both directions, *and* you want the objects to be
in a completely valid state after commit.  I tried to think of every
possible contingency before putting this thing out publically, but
there you go, only a week later and ive already been hit with like, a
dozen.

My first instinct on this, is for "B"s reference to A, or collection
of "A"'s, to be cleared out and reset with a lazy loader, so that it
refreshes from the database the next time it is loaded.  Which I
didnt rush into doing, since it means I would take some in-memory
lists, assume they are up-to-date with regards to the database, clear
them out and reset their loading.   Also I would need to figure out
some way to detect that a B being added to A means that theres an A
that needs to be attached to B, which is not terribly straightforward
(until the end of this email...).

Consider if class A has an attribute "listofB" and class B has an
attribute "listofA", if you do something like this:

Aobj = A.mapper.select(...)
Bobj = B.mapper.select(...)

Bobj.listofA.append(new A())

objectstore.begin()
Aobj.listofB.append(Bobj)
objectstore.commit()

the way thats supposed to work is A's list of B gets saved to the
database, but B's list of A, which has a totally different A inside
of it and was not in the scope of the transaction, should still
remain "pending".

So what should the application do with Bobj's listofA ?  Heres a
whole lot of options, and this is sort of just me thinking here:

1. Should it clear it out and reset it to re-lazy load ?  That would
remove the pending change of the new A() sitting in the list.
2. Should it do #1, but if the list has pending changes raise an
exception ?  hm, maybe.
3. Should it just append Aobj into the list, and not affect any
current changes to the list ?  That might work but then, is the
ordering of the list correct ?   Maybe thats the way to do it.
4. Sort of like #2, should there be a flag on the attribute that
says, "yes its OK to clear this out and re-load if the data changes
in a commit" ?  it seems like an application would *always* want this
to happen though.
5. how about, it will re-lazy-load Bobj's listofA, but *not* clear
out the existing contents, it will add the database results in,
skipping those that were deleted from the list and maintaining those
that were added.  this is actually not too different from #3.
6. should we use magic !  when you append Bobj to Aobj.listofB, it
*automatically* sets the A on Bobj, either blowing away a previous
scalar value, or appending to Bobj's listofA.   The whole
relationship would be maintained without even touching the database.
This is actually like #3 but has a more explicit contract, in that B
gets yanked into the transaction upon commit as well.

So, while this whole thing seems obvious, I havent decided how to
deal with it yet.

#6 is very interesting.   it basically means theres a "backreference"
handler function associated with a relationship, which knows how to
populate the "backreference" upon a change.  There is an example
"examples/adjacencytree/byroot_tree.py" which does something similar
to this, since it is representing a hierarchy of TreeNodes which all
contain a backreference to the root node.  This example suggests that
the "backreference" handler would have to be customizable.

I am going to ponder the "auto-backreference" attribute idea a little
further.

It might look something like:

Course.mapper = mapper(Course, courseTbl)
Student.mapper = mapper(Student, studentTbl, properties={
        'courses': relation(Course.mapper, enrolTbl, lazy=False,
backreference='students')
})

hey wow, guess what...the backreference right there can automatically
add the property 'students' to the Course.mapper as well.  the
property would be a lazy load and also have a backreference to
'courses'.

Yea, I like the "backreference idea". Many-to-many relationships are "two-way" relationships "by nature". That is in theory, neither of the two directions is more important than the other. So when you are specifying a backreference, you are only labelling what the reference the other way should be called, which is very relavant information within the scope of this two-way relationship definition IMO.

Question is if this backreference automatically adds a property to the other table, how do you specify different parameters to that relationship? Or is it not hard?

backreference can also be a function, like this one, which receives
events for " parent.children.append(child)" and "child.parent=parent"

def childappended(parent, child):
        child.root = parent.root
        child.parent = parent
def parentset(parent, child):
         parent.children.append(child)
        child.root = parent.root

TreeNode.mapper=mapper(TreeNode, treetable, properties=dict(
     id=treetable.c.node_id,
     parent=relation(TreeNode,
primaryjoin=treetable.c.parent_node_id==treetable.c.node_id ,
foreignkey=treetable.c.node_id, uselist=True, backreference=parentset),
     children=relation(TreeNode,
primaryjoin=treetable.c.parent_node_id==treetable.c.node_id,
uselist=True, private=True, backreference=childappended),
))

hey wow, this *might* work and not even require much code at
all....basically, if it works in the ZBlog app, it'll probably work
anywhere.

Man, you should be typing some more emails, sqlalchemy will write itself out. :)

On Dec 4, 2005, at 6:13 PM, Robert Leftwich wrote:

> Michael Bayer wrote:
>> nah, its working by design, but you might not like it.  try these
>> lines instead:
>
> I'm not sure I understand the relationship between this and the
> original Student/Courses issue/question. Are you saying that we
> have to *manually* manage *all* m:n mapped relationships in the
> objectstore by using delete+reload when the relationship is changed
> in any way?
>
> Robert
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through
> log files
> for problems?  Stop!  Download the new AJAX search engine that makes
> searching your log files as easy as surfing the  web.  DOWNLOAD
> SPLUNK!
> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
> _______________________________________________
> Sqlalchemy-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users

-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
Sqlalchemy-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users

--
---------------------------------------------------------------------------------------------------
John S. Yang

Re: [Sqlalchemy-users] Many to many problem?

Reply via email to