Traceback (most recent call last): File "Main.py", line 39, in <module> MCS.monteCarloBasic(trials) File "/XXXX/MonteCarloSimulation.py", line 163, in monteCarloBasic session.merge(temp_alloc) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ session.py", line 1158, in merge self._autoflush() File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ session.py", line 897, in _autoflush self.flush() File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ session.py", line 1354, in flush self._flush(objects) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ session.py", line 1432, in _flush flush_context.execute() File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ unitofwork.py", line 261, in execute UOWExecutor().execute(self, tasks) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ unitofwork.py", line 753, in execute self.execute_save_steps(trans, task) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ unitofwork.py", line 768, in execute_save_steps self.save_objects(trans, task) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ unitofwork.py", line 759, in save_objects task.mapper._save_obj(task.polymorphic_tosave_objects, trans) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/ mapper.py", line 1428, in _save_obj c = connection.execute(statement.values(value_params), params) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/ engine/base.py", line 824, in execute return Connection.executors[c](self, object, multiparams, params) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/ engine/base.py", line 874, in _execute_clauseelement return self.__execute_context(context) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/ engine/base.py", line 896, in __execute_context self._cursor_execute(context.cursor, context.statement, context.parameters[0], context=context) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/ engine/base.py", line 950, in _cursor_execute self._handle_dbapi_exception(e, statement, parameters, cursor, context) File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/ lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/ engine/base.py", line 931, in _handle_dbapi_exception raise exc.DBAPIError.instance(statement, parameters, e, connection_invalidated=is_disconnect) sqlalchemy.exc.IntegrityError: (IntegrityError) columns uid, session_id, stud_id are not unique u'INSERT INTO sim_alloc (ident, uid, session_id, stud_id, alloc_proj, alloc_proj_rank) VALUES (?, ?, ?, ?, ?, ?)' [1, '1d295f48-7386-11df-8e87-00264a052efc', 1, 5796, 1100009, 1]
######## Good news: Got the UUID working in a snap. Bad news: See error :( Note: This happened when I started using 'session.merge(temp_alloc)' instead of 'session.add' ######## On Jun 9, 4:54 am, Az <azfarul.is...@gmail.com> wrote: On Jun 9, 4:54 am, Az <azfarul.is...@gmail.com> wrote: > > memo = {} > > copied_students = copy.deepcopy(students, memo) > > copied_supervisors = copy.deepcopy(supervisors, memo) > > copied_projects = copy.deepcopy(projects, memo) > > > After you do this, memo will contain a record of all copied objects. You > > should examine memo.values() to see if it is copying more than you > > expected. If it did copy just what you expected, then my worries were > > unfounded. > > I'll let you know how that turns out soonish. While I know it's my > data, is there anything you can suggest from your experience that you > consider to be "unexpected"? > > > Yes, session_id/trial_id and stud_id can repeat, and you can still group > > things together by run_id. Alternatively, you could add an > > autoincrementing primary key to SimAllocation, but I believe it is > > redundant since the combination (run_id, session_id/trial_id, stud_id) > > should be unique anyway. run_id can definitely be a datetime, but I'm > > not sure how well sqlite (it sounds like you're using sqlite) supports > > datetimes in queries > > (seehttp://www.sqlalchemy.org/docs/reference/dialects/sqlite.html#date-an...). > > A GUID (or UUID) is just a 128-bit value (usually random); the benefit > > here is you can generate it on the client side and be confident that it > > will be unique on the server (to avoid duplicate primary key errors). > > Using datetimes or database sequences would also work. You can > > definitely pass the run_id as an argument to monteCarloBasic, or to each > > object's create_db_record method. > > Also I get why you mention three keys: run_id/guid/uuid and session_id/ > trial_id alone won't suffice... but since we know there are unique > students (within each single allocation run, etc. So I can get rid of > the "ident" then? It serves no other purpose really if I can get a key > combo that's unique and works for. > > I am indeed using SQLite3. I take it take my physical database has to > something like: > engine = create_engine('sqlite:///Database/spalloc.sqlite3', > echo=False)? > > Also I take it I should generate the UUID (http://docs.python.org/ > library/uuid.html) when I call the MonteCarloBasic function right? > Since it should be the same for each "call", I take I'll have to > generate it before the loop. Additionally, how would I actually query > a 128-bit value? Say I have a bit in my GUI where the supervisor can > put in a UUID to pull the data off the Database. How would he actually > know which UUID to put in? Any ideas? > > Also once I've got my stuff in the physical database and after my > program is done, I'd call session.close() right? How do I access the > DB data then? Would I have to write some separate functions that allow > me to access the data without using (for example) > 'session.query(Student)...`? This way the user (i.e. my supervisor) > won't have to keep running the readData, monteCarloBasic, etc > functions just to access the DB (that would be poor indeed!). > > On Jun 8, 3:07 pm, Conor <conor.edward.da...@gmail.com> wrote: > > > On 06/07/2010 07:27 PM, Az wrote: > > > >> By default, deepcopy will make one copy of everything in the object > > >> graph reachable by the object you feed it. The scary part is that, > > >> unless you also pass in a /memo/ argument to each call to deepcopy, it > > >> will copy the entire graph /every single call/. So if you deepcopy the > > >> students dictionary and then deepcopy the projects dictionary, each > > >> student's allocated_proj attribute will not match any instance in the > > >> projects dictionary. This is why a use-case-specific copy function is > > >> recommended: it is a lot easier to predict which objects will get copied > > >> and which objects will be shared. > > > > Shouldn't it match? I mean the student can only get allocated a > > > project if it exists in the projects dictionary... or is that not the > > > point? > > > > By use-case-specific, you mean I'll have to redefine deepcopy inside > > > each class like this: def __deepcopy__(self): something, something? > > > > The only two places where this is an issue is for Supervisor's > > > "offered_proj" attribute (a set) where, naturally, each project is an > > > object and in Project where "proj_sup" is naturally a supervisor > > > object :D > > > > The usefulness of my data structures comes back to bite me now... > > > In theory, the following will work, ignoring ORM deepcopy issues > > discussed at the beginning of this thread: > > > memo = {} > > copied_students = copy.deepcopy(students, memo) > > copied_supervisors = copy.deepcopy(supervisors, memo) > > copied_projects = copy.deepcopy(projects, memo) > > > After you do this, memo will contain a record of all copied objects. You > > should examine memo.values() to see if it is copying more than you > > expected. If it did copy just what you expected, then my worries were > > unfounded. > > > By use-case-specific, I meant define your own copy_objects function that > > explicitly specifies what is copied: > > > def copy_objects(students, supervisors, projects): > > memo = {} > > copied_students = {} > > copied_supervisors = {} > > copied_projects = {} > > > def copy_student(student): > > student_id = id(student) > > if student_id in memo: > > return memo[student_id] > > > copied_student = Student() > > memo[student_id] = copied_student > > copied_student.attr1 = student.attr1 > > [copy rest of student's attributes] > > if you_need_to_copy_students_project: > > copied_student.allocated_proj = > > copy_project(student.allocated_proj) > > return copied_student > > > [define copy_supervisor] > > [define copy_project] > > > copied_students = dict((key, copy_student(student)) for (key, student) > > in students.iteritems()) > > copied_supervisors = dict((key, copy_supervisor(supervisor)) for (key, > > supervisor) in supervisors.iteritems()) > > copied_projects = dict((key, copy_project(project)) for (key, project) > > in projects.iteritems()) > > return (copied_students, copied_supervisors, copied_projects) > > > As you can see, this makes it clear which objects are copied and which > > are shared. In retrospect, I think I assumed you didn't want to make > > copies of your supervisors or projects when I recommended the > > use-case-specific approach, which kind of violates the spirit of > > deepcopy. Oh well, my bad. > > > >> class Student(object): > > >> [existing definitions] > > > >> def create_db_record(self): > > >> result = StudentDBRecord() > > >> result.ee_id = self.ee_id > > >> [copy over other attributes] > > >> return result > > > >> class StudentDBRecord(object): > > >> pass > > > > The create_db_record function... does it have to called explicitly > > > somewhere or does it automatically run? > > > You have to call it explicitly, e.g.: > > > for unmapped_student in unmapped_students: > > mapped_student = unmapped_student.create_db_record() > > # I assume you want "find or create" behavior, > > # so use session.merge instead of session.add. > > mapped_student = session.merge(mapped_student) > > > > [...] > > > >> I think a primary key of > > >> (run_id, session_id/trial_id, stud_id) would be good > > > > If I make them all primary keys I get a composite key right? Within an > > > entire M-C simulation the stud_id's would repeat in groups -- so if > > > there are 100 simulations, each stud_id appears 100 times in that > > > commit. > > > > Run_id is a fantastic idea! I'd probably have it be the date and time? > > > Given that the simulation takes a while to run... the time will have > > > changed sufficiently for uniqueness. However, then querying becomes a > > > pain because of whatever format the date and time data will be in... > > > so in that case, what is a GUID and is that something we could give to > > > the Monte-Carlo ourselves before the run as some sort of argument? It > > > would be the same for an entire run but different from run to run (so > > > not unique from row to row, but unique from one run set to the other). > > > Any thoughts on this? > > > Yes, session_id/trial_id and stud_id can repeat, and you can still group > > things together by run_id. Alternatively, you could add an > > autoincrementing primary key to SimAllocation, but I believe it is > > redundant since the combination (run_id, session_id/trial_id, stud_id) > > should be unique anyway. run_id can definitely be a datetime, but I'm > > not sure how well sqlite (it sounds like you're using sqlite) supports > > datetimes in queries > > (seehttp://www.sqlalchemy.org/docs/reference/dialects/sqlite.html#date-an...). > > A GUID (or UUID) is just a 128-bit value (usually random); the benefit > > here is you can generate it on the client side and be confident that it > > will be unique on the server (to avoid duplicate primary key errors). > > Using datetimes or database sequences would also work. You can > > definitely pass the run_id as an argument to monteCarloBasic, or to each > > object's create_db_record method. > > > -Conor -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.