[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex
Rob Miller wrote: hi, i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with over 29,000 indexed objects. it churns for a good long time, and eventually fails with a long set of tracebacks, of which i've included a sample at the end of this message. i think i understand the gist of the issue... it's trying to write an object (probably a CatalogBrain) to the database, but this object's __dict__ contains a value that is of type instancemethod, which isn't allowed for persistent objects. the problem is that i can't figure out which specific objects are causing the problem. i've used pdb.post_mortem to get a debug prompt way down in the traceback, but the code goes in and out of C modules, so i'm missing a lot of what's happening. and when i interactively peek at the objects that are being indexed when the error happens, there doesn't seem to be anything wrong, and i can index the objects w/ no problem. i've even tried dropping the subtransaction threshold down to 1, so it will try to commit a savepoint after every object, but none of the objects being indexed seemed to have any problems. okay, after leaving this for a few days, i came back to it and managed to work it out. for posterity's sake, what ended up working was that i added an explicit transaction.commit() after every object reindex, wrapped in a try: except, like so: --- src/Zope/lib/python/Products/ZCatalog/ZCatalog.py 2007-10-29 06:09:30.0 -0700 +++ lib/zope/lib/python/Products/ZCatalog/ZCatalog.py 2008-06-24 10:47:49.0 -0700 @@ -294,6 +294,11 @@ if obj is not None: try: self.catalog_object(obj, p, pghandler=pghandler) +try: +transaction.commit() +except: +import sys, pdb +pdb.post_mortem(sys.exc_info()[2]) except ConflictError: raise except: this made the reindex take a ridiculously long time, and it leaves the catalog in an inconsistent state (i.e. it should only be done on a copy of the database, one that you can throw away), but it did in the end reveal to me which record was causing the problem. the problematic record ended up being a ghosted catalog entry that happened to share the same name as a view. when the catalog tried to look up the object by its path (via unrestrictedTraverse), the view object was returned. thanks for the help! -r ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex
Rob Miller <[EMAIL PROTECTED]> writes: > Ross Patterson wrote: >> Rob Miller <[EMAIL PROTECTED]> writes: >> >>> hi, >>> >>> i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with >>> over 29,000 indexed objects. it churns for a good long time, and >>> eventually fails with a long set of tracebacks, of which i've included >>> a sample at the end of this message. >>> >>> i think i understand the gist of the issue... it's trying to write an >>> object (probably a CatalogBrain) to the database, but this object's >>> __dict__ contains a value that is of type instancemethod, which isn't >>> allowed for persistent objects. >>> >>> the problem is that i can't figure out which specific objects are >>> causing the problem. i've used pdb.post_mortem to get a debug prompt >>> way down in the traceback, but the code goes in and out of C modules, >>> so i'm missing a lot of what's happening. and when i interactively >>> peek at the objects that are being indexed when the error happens, >>> there doesn't seem to be anything wrong, and i can index the objects >>> w/ no problem. i've even tried dropping the subtransaction threshold >>> down to 1, so it will try to commit a savepoint after every object, >>> but none of the objects being indexed seemed to have any problems. >>> >>> i CAN verify that the instancemethod that is causing the problem >>> renders like this: >>> >>> >> at /session_data_manager>> >>> >>> even that hasn't proven enough for me to concretely identify the >>> source of the problem, though. >>> >>> i've been working on this for a full day already, and am not sure how >>> to proceed. does anyone have any debugging tips that might help me >>> figure out what, exactly, is causing the reindex attempts to blow up? >>> >>> thanks! >> >> Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416 >> error or only at transaction/_transaction.py:267 where some sort of >> previous error is handled? > > yes, i put the post_mortem around the "transaction.savepoint" call at > line 559 of Products/ZCatalog/ZCatalog.py. this is the place where > the error is happening, but it's where the subtransaction threshold > has been reached, so IIUC the error could be caused by any of the > objects in the subtransaction. Then it seems like you're not getting at the actual pickle error and thus aren't looking at the object whose __getstate__ returns the offending instance method. If you have enough memory, try running the refreshCatalog with subtransactions disabled to get the *actual* pickle error. >> If the former, I often find it informative to step up to >> ZODB/serialize.py:407 where obj.__getstate__() is called and find out >> what obj is. Is it the same object every time? If not, is it of the >> same type every time? > > right. as long as the subtxn threshold number is the same, then the > object is the same. but if i change the subtxn threshold, then the > object will be different. as i said, i even tried reducing the subtxn > threshold to '1', so, assuming my understanding is correct, this would > be triggered after every object. it took many hours to get to my > debug prompt; when i did, the obj was FileAttachment object that > didn't seem to have any problems. i was able to interactively index > the object from the pdb prompt, and even commit the transaction, w/ no > problems. The actual error of your TB is the error that savepoint raises when a previous error has occurred. So changing the subtransaction threshold just changes which object is being indexed when the savepoint *happens* to be run and notices that an error has been raised for a previous object in the savepoint. At least, that's my read of it. >> A next step can also be to put a pdb.set_trace() at >> transaction/_transaction.py:340 in the register() method such that >> it's is only called when the offending object is added to the >> transaction. > > i'll give this a shot, thx. > >> Hope some of that helps, > > even just having someone to discuss it with is helpful, for my sanity > at least... ;-) Well, then I'll have to enjoy your sanity vicariously. :) Ross ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex
Ross Patterson wrote: Rob Miller <[EMAIL PROTECTED]> writes: hi, i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with over 29,000 indexed objects. it churns for a good long time, and eventually fails with a long set of tracebacks, of which i've included a sample at the end of this message. i think i understand the gist of the issue... it's trying to write an object (probably a CatalogBrain) to the database, but this object's __dict__ contains a value that is of type instancemethod, which isn't allowed for persistent objects. the problem is that i can't figure out which specific objects are causing the problem. i've used pdb.post_mortem to get a debug prompt way down in the traceback, but the code goes in and out of C modules, so i'm missing a lot of what's happening. and when i interactively peek at the objects that are being indexed when the error happens, there doesn't seem to be anything wrong, and i can index the objects w/ no problem. i've even tried dropping the subtransaction threshold down to 1, so it will try to commit a savepoint after every object, but none of the objects being indexed seemed to have any problems. i CAN verify that the instancemethod that is causing the problem renders like this: > even that hasn't proven enough for me to concretely identify the source of the problem, though. i've been working on this for a full day already, and am not sure how to proceed. does anyone have any debugging tips that might help me figure out what, exactly, is causing the reindex attempts to blow up? thanks! Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416 error or only at transaction/_transaction.py:267 where some sort of previous error is handled? yes, i put the post_mortem around the "transaction.savepoint" call at line 559 of Products/ZCatalog/ZCatalog.py. this is the place where the error is happening, but it's where the subtransaction threshold has been reached, so IIUC the error could be caused by any of the objects in the subtransaction. If the former, I often find it informative to step up to ZODB/serialize.py:407 where obj.__getstate__() is called and find out what obj is. Is it the same object every time? If not, is it of the same type every time? right. as long as the subtxn threshold number is the same, then the object is the same. but if i change the subtxn threshold, then the object will be different. as i said, i even tried reducing the subtxn threshold to '1', so, assuming my understanding is correct, this would be triggered after every object. it took many hours to get to my debug prompt; when i did, the obj was FileAttachment object that didn't seem to have any problems. i was able to interactively index the object from the pdb prompt, and even commit the transaction, w/ no problems. A next step can also be to put a pdb.set_trace() at transaction/_transaction.py:340 in the register() method such that it's is only called when the offending object is added to the transaction. i'll give this a shot, thx. Hope some of that helps, even just having someone to discuss it with is helpful, for my sanity at least... ;-) thx. -r ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex
Rob Miller <[EMAIL PROTECTED]> writes: > hi, > > i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with > over 29,000 indexed objects. it churns for a good long time, and > eventually fails with a long set of tracebacks, of which i've included > a sample at the end of this message. > > i think i understand the gist of the issue... it's trying to write an > object (probably a CatalogBrain) to the database, but this object's > __dict__ contains a value that is of type instancemethod, which isn't > allowed for persistent objects. > > the problem is that i can't figure out which specific objects are > causing the problem. i've used pdb.post_mortem to get a debug prompt > way down in the traceback, but the code goes in and out of C modules, > so i'm missing a lot of what's happening. and when i interactively > peek at the objects that are being indexed when the error happens, > there doesn't seem to be anything wrong, and i can index the objects > w/ no problem. i've even tried dropping the subtransaction threshold > down to 1, so it will try to commit a savepoint after every object, > but none of the objects being indexed seemed to have any problems. > > i CAN verify that the instancemethod that is causing the problem > renders like this: > > at /session_data_manager>> > > even that hasn't proven enough for me to concretely identify the > source of the problem, though. > > i've been working on this for a full day already, and am not sure how > to proceed. does anyone have any debugging tips that might help me > figure out what, exactly, is causing the reindex attempts to blow up? > > thanks! Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416 error or only at transaction/_transaction.py:267 where some sort of previous error is handled? If the former, I often find it informative to step up to ZODB/serialize.py:407 where obj.__getstate__() is called and find out what obj is. Is it the same object every time? If not, is it of the same type every time? A next step can also be to put a pdb.set_trace() at transaction/_transaction.py:340 in the register() method such that it's is only called when the offending object is added to the transaction. Hope some of that helps, Ross > Traceback (most recent call last): > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/Products/ZCatalog/ZCatalog.py", > line 296, in refreshCatalog > self.catalog_object(obj, p, pghandler=pghandler) > File > "/home/rob/topp/14000/builds/20080611/opencore/zope/Products/CMFPlone/CatalogTool.py", > line 367, in catalog_object > self._increment_counter() > File > "/home/rob/topp/14000/builds/20080611/opencore/zope/Products/CMFPlone/CatalogTool.py", > line 395, in _increment_counter > self._counter.change(1) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/BTrees/Length.py", > line 55, in change > self.value += delta > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZODB/Connection.py", > line 890, in register > self._register(obj) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZODB/Connection.py", > line 900, in _register > self.transaction_manager.get().join(self) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/transaction/_transaction.py", > line 273, in join > self._prior_operation_failed() # doesn't return > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/transaction/_transaction.py", > line 267, in _prior_operation_failed > raise TransactionFailedError("An operation previously failed, " > TransactionFailedError: An operation previously failed, with traceback: > > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZServer/PubCore/ZServerPublisher.py", > line 25, in __init__ > response=b) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py", > line 401, in publish_module > environ, debug, request, response) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py", > line 202, in publish_module_standard > response = publish(request, module_name, after_list, debug=debug) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py", > line 119, in publish > request, bind=1) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/mapply.py", > line 88, in mapply > if debug is not None: return debug(object,args,context) > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py", > line 42, in call_object > result=apply(object,args) # Type s to step into published object. > File > "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/Products/ZCatalog/ZCatalog.py", > line 260, in manage_catalogReindex > self.refreshCatalog(clear=