Hi everyone,

I'm trying to get to grips with SQLAlchemy by parsing a file, extracting
certain fields and then storing them in a database. I am currently using the
object relational mapper and trying to get a sense of how it all works. I have
two objects, Paper and Author, which have a many-to-many relationship. So
currently I am parsing the file and then creating a Paper object:

names = [ list of author names ]
authors = [ Author(a) for a in names ]
p = Paper(title, authors)

However, when I get to a new paper with an already existing author, I get the
following error:
sqlalchemy.exceptions.IntegrityError: (IntegrityError) (1062, "Duplicate
entry 'Sole R' for key 'name'") u'INSERT INTO authors (name) VALUES (%s)'
['Sole R']

which makes sense. However, I'd prefer not to query the database over and
over to check whether "Sole R" has already been added. So I tried a
different approach, passing the name strings to Paper directly instead of
building Author objects, and got the error:
AttributeError: 'str' object has no attribute '_state'

So I'm out of ideas. I want to make this as fast as possible, without firing
off lots of queries to the database to look for identity. Is there a way
supported by the ORM that I've missed completely, or is it more normal to
create a cache whereby you add things to a dict and look for identity in the
dict, otherwise query the database? The problem is that some of the files I
am going to be parsing are quite large (GB), and I don't want to saturate my
database server with requests; likewise, I only have limited memory available.
Have I missed something simple in the documentation?
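
To make the dict idea concrete, this is roughly the kind of cache I have in
mind. It is only a sketch: AuthorCache and its lookup callable are names I made
up for illustration, not something SQLAlchemy provides, and the Author class is
repeated here only so the snippet stands alone.

```python
class Author(object):
    def __init__(self, name):
        self.name = name


class AuthorCache(object):
    """Hypothetical helper (not a SQLAlchemy API): one Author object per name."""

    def __init__(self, lookup=None):
        # lookup is an assumed callable: name -> existing Author or None,
        # for example a function wrapping a single database query.
        self._lookup = lookup
        self._by_name = {}

    def get(self, name):
        author = self._by_name.get(name)
        if author is None:
            # First time this name is seen in this run: ask the database
            # once (if a lookup was supplied), otherwise create a new object.
            if self._lookup is not None:
                author = self._lookup(name)
            if author is None:
                author = Author(name)
            self._by_name[name] = author
        return author


cache = AuthorCache()
names = ["Sole R", "Sole R"]
# Both entries resolve to the same Author object, so only one would be inserted.
authors = [cache.get(a) for a in names]
```

With something like this, the dict holds one entry per distinct author name, so
memory would grow with the number of distinct authors, not with the size of
the file, and each name would hit the database at most once per run.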

My code is below:

Many thanks in advance,


document_table = Table('documents', metadata,
                       Column('id', Integer, primary_key=True),
                       Column('title', String, nullable=False))

authors_table = Table('authors', metadata,
                      Column('id', Integer, primary_key=True),
                      Column('name', String(40), unique=True))


papers_to_authors_table = Table('p2a_assocation', metadata,
                                Column('document', Integer, ForeignKey('
                                Column('author', Integer, ForeignKey('

class Author(object):
    def __init__(self, name):
        self.name = name

class Paper(object):
    def __init__(self, title, authors):
        self.title = title
        self.authors = authors

mapper(Paper, document_table, properties={
    'authors': relation(Author, secondary=papers_to_authors_table,
                        backref='publications')})
mapper(Author, authors_table)
