Hi everyone, I m trying to get to grips with SQLAlchemy by parsing a file, extracting certain fields and then storing them in a database. I am currently using the object relational and trying to get a sense of how it all works. I have two objects Paper and Author, which is a many to many relationship. So currently I am parsing the file and then creating a Paper object
names = [ list of author names ] authors = [ Author(a) for a in names ] p = Paper(title, authors) session.add(p) session.commit(p) However, when I get to a new paper with an already existing author I get the following error, sqlalchemy.exceptions.IntegrityError: (IntegrityError) (1062, "Duplicate entry 'Sole R' for key 'name'") u'INSERT INTO authors (name) VALUES (%s)' ['Sole R'] which makes sense, however I d prefer not to query the database over and over to check if "Sole R" has already been added to the database. So I try a different approach without building Author objects and using the name string instead as an arg to Paper: and get the error: AttributeError: 'str' object has no attribute '_state' So I'm out of ideas, I want to make it fast as possible and without firing off lots of queries to the database to look for identity. Is there a way supported by the ORM that I ve missed completely, so is it more normal to create a cache whereby you add things to a dict and look for identity in the dict otherwise query the database? The problem is that some of the files I am going to be parsing are quite large (Gb) and I dont want to saturate my database server with requests and likewise I only have limited memory on my machines. Have I missed something simple in the documentation? My code is below: Many thanks in advance, Nathan document_table = Table('documents', metadata, Column('id', Integer, primary_key=True), Column('title', String, nullable=False), ) authors_table = Table('authors', metadata, Column('id', Integer, primary_key=True), Column('name', String(40), unique=True), ) papers_to_authors_table = Table('p2a_assocation', metadata, Column('document', Integer, ForeignKey(' documents.id')), Column('author', Integer, ForeignKey(' authors.id')) ) class Author(object): def __init__(self, name): self.name = name class Paper(object): def __init__(self, title, authors): self.title = title self.authors = author mapper(Paper, document_table, properties={'authors':relation(Author, secondary=papers_to_authors_table, backref='publications')}) mapper(Author, authors_table) --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---