I'm developing an application using the ORM, and I am getting into all 
sorts of trouble with what I think should be a pretty simple data model. 
I've tried following the ORM tutorial from the docs, but it seems to get me 
confused every time I try. So I'm looking for something else that maybe 
takes a different approach. Or maybe someone can point out what's wrong for 
me in the following - but even then, pointers to other tutorial material 
would be useful, as I don't really want to end up pestering the list every 
time I hit an issue :-)

My data model is fairly straightforward. I have a Package class, with a 
name. I then have a Release class - each Release is associated with a 
single package, and has a unique version. Releases have a few children - 
Classifiers, Dependencies, and URLs. All of these are multi-valued with no 
natural key (at least, not one worth worrying about). There is some extra 
data associated with Releases and URLs, but that's not too important. 
(People may recognise this as the PyPI data model). This is a pretty 
trivial parent/child one-to-many hierarchy, and I didn't expect it to be 
hard to model.

The obvious (to me!) model is basically (trimmed down a bit):

class Package(Base):
    __tablename__ = 'packages'
    # Use a synthetic key, as package names can change
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True, nullable=False)
    releases = relationship("Release", backref="package",
            cascade="all, delete-orphan")

class Release(Base):
    __tablename__ = 'releases'
    id = Column(Integer, primary_key=True)
    package_id = Column(Integer, ForeignKey('packages.id'), nullable=False)
    version = Column(String, nullable=False) # Unique within package
    classifiers = relationship("Classifier", backref="release",
            cascade="all, delete-orphan")

class Classifier(Base):
    __tablename__ = 'classifiers'
    id = Column(Integer, primary_key=True)
    release_id = Column(Integer, ForeignKey('releases.id'), nullable=False)
    classifier = Column(String, nullable=False)

So far, so good. But if I want to create a new Release, things get messy. 
This is my basic function:

def new_release(package, version, data):
    r = Release(version)
    r.package = Package(package)
    # Populate the data in r, and create child items
    return r

It's that package reference that messes me up. If the release is for a new 
package, then when I merge the release into the session, the package is 
created. But if it's for an existing package, a new package is created 
(which gives a constraint error if the package name is unique, and 
duplicates if it's not) rather than the session recognising that it's an 
existing package and linking the release to it.

If I was working at the SQL core level, I'd expect to have to query for the 
package and create it if it wasn't there. But I'm not sure I can do that 
with a session, as things get cached in memory by the "unit of work" stuff, 
and I don't know if a query for the release could miss a package that's 
pending insertion, or if the query could cause auto-flushing which might 
cause other issues (either with performance or integrity). Because the 
session is managing the in-memory caching and the transaction management 
"by magic", I don't want to interfere with its mechanisms any more than I 
have to. If I have to keep track of what's gone to the database, and query 
for existing instances and manage the transactions, I probably should just 
use the SQL layer directly (I have a lot of experiences with databases, but 
very little with ORMs, so pure DB code isn't too scary for me, but on the 
other hand I don't know what benefits the ORM should be giving me that I'm 
not seeing).

Is this an application that doesn't actually benefit from the ORM? Or am I 
using it wrongly, and my problems come from misunderstanding the way it 
should be used? As I say, I've a lot of database experience but very little 
with ORMs, so maybe I have an unbalanced view of how much data management 
the ORM should be able to handle for me.

The particular problem here is what's affecting me right now - but I'd be 
even more interested in a good "ORM for experienced SQL developers" 
tutorial that tells me how the ORM differs from the core level (and where 
its benefits lie).

Thanks,
Paul

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to