[sqlalchemy] Re: Questions about "session"

Az Tue, 08 Jun 2010 22:44:23 -0700

Traceback (most recent call last):
  File "Main.py", line 39, in <module>
    MCS.monteCarloBasic(trials)
  File "/XXXX/MonteCarloSimulation.py", line 163, in monteCarloBasic
    session.merge(temp_alloc)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
session.py", line 1158, in merge
    self._autoflush()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
session.py", line 897, in _autoflush
    self.flush()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
session.py", line 1354, in flush
    self._flush(objects)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
session.py", line 1432, in _flush
    flush_context.execute()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
unitofwork.py", line 261, in execute
    UOWExecutor().execute(self, tasks)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
unitofwork.py", line 753, in execute
    self.execute_save_steps(trans, task)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
unitofwork.py", line 768, in execute_save_steps
    self.save_objects(trans, task)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
unitofwork.py", line 759, in save_objects
    task.mapper._save_obj(task.polymorphic_tosave_objects, trans)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
mapper.py", line 1428, in _save_obj
    c = connection.execute(statement.values(value_params), params)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/
engine/base.py", line 824, in execute
    return Connection.executors[c](self, object, multiparams, params)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/
engine/base.py", line 874, in _execute_clauseelement
    return self.__execute_context(context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/
engine/base.py", line 896, in __execute_context
    self._cursor_execute(context.cursor, context.statement,
context.parameters[0], context=context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/
engine/base.py", line 950, in _cursor_execute
    self._handle_dbapi_exception(e, statement, parameters, cursor,
context)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/
engine/base.py", line 931, in _handle_dbapi_exception
    raise exc.DBAPIError.instance(statement, parameters, e,
connection_invalidated=is_disconnect)
sqlalchemy.exc.IntegrityError: (IntegrityError) columns uid,
session_id, stud_id are not unique u'INSERT INTO sim_alloc (ident,
uid, session_id, stud_id, alloc_proj, alloc_proj_rank) VALUES
(?, ?, ?, ?, ?, ?)' [1, '1d295f48-7386-11df-8e87-00264a052efc', 1,
5796, 1100009, 1]


########

Good news: Got the UUID working in a snap.
Bad news: See error :(


Note: This happened when I started using 'session.merge(temp_alloc)'
instead of 'session.add'

########

On Jun 9, 4:54 am, Az <azfarul.is...@gmail.com> wrote:

On Jun 9, 4:54 am, Az <azfarul.is...@gmail.com> wrote:
> > memo = {}
> > copied_students = copy.deepcopy(students, memo)
> > copied_supervisors = copy.deepcopy(supervisors, memo)
> > copied_projects = copy.deepcopy(projects, memo)
>
> > After you do this, memo will contain a record of all copied objects. You
> > should examine memo.values() to see if it is copying more than you
> > expected. If it did copy just what you expected, then my worries were
> > unfounded.
>
> I'll let you know how that turns out soonish. While I know it's my
> data, is there anything you can suggest from your experience that you
> consider to be "unexpected"?
>
> > Yes, session_id/trial_id and stud_id can repeat, and you can still group
> > things together by run_id. Alternatively, you could add an
> > autoincrementing primary key to SimAllocation, but I believe it is
> > redundant since the combination (run_id, session_id/trial_id, stud_id)
> > should be unique anyway. run_id can definitely be a datetime, but I'm
> > not sure how well sqlite (it sounds like you're using sqlite) supports
> > datetimes in queries 
> > (seehttp://www.sqlalchemy.org/docs/reference/dialects/sqlite.html#date-an...).
> > A GUID (or UUID) is just a 128-bit value (usually random); the benefit
> > here is you can generate it on the client side and be confident that it
> > will be unique on the server (to avoid duplicate primary key errors).
> > Using datetimes or database sequences would also work. You can
> > definitely pass the run_id as an argument to monteCarloBasic, or to each
> > object's create_db_record method.
>
> Also I get why you mention three keys: run_id/guid/uuid and session_id/
> trial_id alone won't suffice... but since we know there are unique
> students (within each single allocation run, etc. So I can get rid of
> the "ident" then? It serves no other purpose really if I can get a key
> combo that's unique and works for.
>
> I am indeed using SQLite3. I take it take my physical database has to
> something like:
> engine = create_engine('sqlite:///Database/spalloc.sqlite3',
> echo=False)?
>
> Also I take it I should generate the UUID (http://docs.python.org/
> library/uuid.html) when I call the MonteCarloBasic function right?
> Since it should be the same for each "call", I take I'll have to
> generate it before the loop. Additionally, how would I actually query
> a 128-bit value? Say I have a bit in my GUI where the supervisor can
> put in a UUID to pull the data off the Database. How would he actually
> know which UUID to put in? Any ideas?
>
> Also once I've got my stuff in the physical database and after my
> program is done, I'd call session.close() right? How do I access the
> DB data then? Would I have to write some separate functions that allow
> me to access the data without using (for example)
> 'session.query(Student)...`? This way the user (i.e. my supervisor)
> won't have to keep running the readData, monteCarloBasic, etc
> functions just to access the DB (that would be poor indeed!).
>
> On Jun 8, 3:07 pm, Conor <conor.edward.da...@gmail.com> wrote:
>
> > On 06/07/2010 07:27 PM, Az wrote:
>
> > >> By default, deepcopy will make one copy of everything in the object
> > >> graph reachable by the object you feed it. The scary part is that,
> > >> unless you also pass in a /memo/ argument to each call to deepcopy, it
> > >> will copy the entire graph /every single call/. So if you deepcopy the
> > >> students dictionary and then deepcopy the projects dictionary, each
> > >> student's allocated_proj attribute will not match any instance in the
> > >> projects dictionary. This is why a use-case-specific copy function is
> > >> recommended: it is a lot easier to predict which objects will get copied
> > >> and which objects will be shared.
>
> > > Shouldn't it match? I mean the student can only get allocated a
> > > project if it exists in the projects dictionary... or is that not the
> > > point?
>
> > > By use-case-specific, you mean I'll have to redefine deepcopy inside
> > > each class like this: def __deepcopy__(self): something, something?
>
> > > The only two places where this is an issue is for Supervisor's
> > > "offered_proj" attribute (a set) where, naturally, each project is an
> > > object and in Project where "proj_sup" is naturally a supervisor
> > > object :D
>
> > > The usefulness of my data structures comes back to bite me now...
>
> > In theory, the following will work, ignoring ORM deepcopy issues
> > discussed at the beginning of this thread:
>
> > memo = {}
> > copied_students = copy.deepcopy(students, memo)
> > copied_supervisors = copy.deepcopy(supervisors, memo)
> > copied_projects = copy.deepcopy(projects, memo)
>
> > After you do this, memo will contain a record of all copied objects. You
> > should examine memo.values() to see if it is copying more than you
> > expected. If it did copy just what you expected, then my worries were
> > unfounded.
>
> > By use-case-specific, I meant define your own copy_objects function that
> > explicitly specifies what is copied:
>
> > def copy_objects(students, supervisors, projects):
> >     memo = {}
> >     copied_students = {}
> >     copied_supervisors = {}
> >     copied_projects = {}
>
> >     def copy_student(student):
> >         student_id = id(student)
> >         if student_id in memo:
> >             return memo[student_id]
>
> >         copied_student = Student()
> >         memo[student_id] = copied_student
> >         copied_student.attr1 = student.attr1
> >         [copy rest of student's attributes]
> >         if you_need_to_copy_students_project:
> >             copied_student.allocated_proj = 
> > copy_project(student.allocated_proj)
> >         return copied_student
>
> >     [define copy_supervisor]
> >     [define copy_project]
>
> >     copied_students = dict((key, copy_student(student)) for (key, student) 
> > in students.iteritems())
> >     copied_supervisors = dict((key, copy_supervisor(supervisor)) for (key, 
> > supervisor) in supervisors.iteritems())
> >     copied_projects = dict((key, copy_project(project)) for (key, project) 
> > in projects.iteritems())
> >     return (copied_students, copied_supervisors, copied_projects)
>
> > As you can see, this makes it clear which objects are copied and which
> > are shared. In retrospect, I think I assumed you didn't want to make
> > copies of your supervisors or projects when I recommended the
> > use-case-specific approach, which kind of violates the spirit of
> > deepcopy. Oh well, my bad.
>
> > >> class Student(object):
> > >>     [existing definitions]
>
> > >>     def create_db_record(self):
> > >>         result = StudentDBRecord()
> > >>         result.ee_id = self.ee_id
> > >>         [copy over other attributes]
> > >>         return result
>
> > >> class StudentDBRecord(object):
> > >>     pass
>
> > > The create_db_record function... does it have to called explicitly
> > > somewhere or does it automatically run?
>
> > You have to call it explicitly, e.g.:
>
> > for unmapped_student in unmapped_students:
> >     mapped_student = unmapped_student.create_db_record()
> >     # I assume you want "find or create" behavior,
> >     # so use session.merge instead of session.add.
> >     mapped_student = session.merge(mapped_student)
>
> > > [...]
>
> > >> I think a primary key of
> > >> (run_id, session_id/trial_id, stud_id) would be good
>
> > > If I make them all primary keys I get a composite key right? Within an
> > > entire M-C simulation the stud_id's would repeat in groups -- so if
> > > there are 100 simulations, each stud_id appears 100 times in that
> > > commit.
>
> > > Run_id is a fantastic idea! I'd probably have it be the date and time?
> > > Given that the simulation takes a while to run... the time will have
> > > changed sufficiently for uniqueness. However, then querying becomes a
> > > pain because of whatever format the date and time data will be in...
> > > so in that case, what is a GUID and is that something we could give to
> > > the Monte-Carlo ourselves before the run as some sort of argument? It
> > > would be the same for an entire run but different from run to run (so
> > > not unique from row to row, but unique from one run set to the other).
> > > Any thoughts on this?
>
> > Yes, session_id/trial_id and stud_id can repeat, and you can still group
> > things together by run_id. Alternatively, you could add an
> > autoincrementing primary key to SimAllocation, but I believe it is
> > redundant since the combination (run_id, session_id/trial_id, stud_id)
> > should be unique anyway. run_id can definitely be a datetime, but I'm
> > not sure how well sqlite (it sounds like you're using sqlite) supports
> > datetimes in queries 
> > (seehttp://www.sqlalchemy.org/docs/reference/dialects/sqlite.html#date-an...).
> > A GUID (or UUID) is just a 128-bit value (usually random); the benefit
> > here is you can generate it on the client side and be confident that it
> > will be unique on the server (to avoid duplicate primary key errors).
> > Using datetimes or database sequences would also work. You can
> > definitely pass the run_id as an argument to monteCarloBasic, or to each
> > object's create_db_record method.
>
> > -Conor

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

[sqlalchemy] Re: Questions about "session"

Reply via email to