[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex

2008-06-24 Thread Rob Miller

Rob Miller wrote:

hi,

i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with over 
29,000 indexed objects.  it churns for a good long time, and eventually 
fails with a long set of tracebacks, of which i've included a sample at 
the end of this message.


i think i understand the gist of the issue... it's trying to write an 
object (probably a CatalogBrain) to the database, but this object's 
__dict__ contains a value that is of type instancemethod, which isn't 
allowed for persistent objects.


the problem is that i can't figure out which specific objects are 
causing the problem.  i've used pdb.post_mortem to get a debug prompt 
way down in the traceback, but the code goes in and out of C modules, so 
i'm missing a lot of what's happening.  and when i interactively peek at 
the objects that are being indexed when the error happens, there doesn't 
seem to be anything wrong, and i can index the objects w/ no problem.  
i've even tried dropping the subtransaction threshold down to 1, so it 
will try to commit a savepoint after every object, but none of the 
objects being indexed seemed to have any problems.


okay, after leaving this for a few days, i came back to it and managed to work 
it out.  for posterity's sake, what ended up working was that i added an 
explicit transaction.commit() after every object reindex, wrapped in a try: 
except, like so:


--- src/Zope/lib/python/Products/ZCatalog/ZCatalog.py   2007-10-29 
06:09:30.0 -0700
+++ lib/zope/lib/python/Products/ZCatalog/ZCatalog.py   2008-06-24 
10:47:49.0 -0700

@@ -294,6 +294,11 @@
 if obj is not None:
 try:
 self.catalog_object(obj, p, pghandler=pghandler)
+try:
+transaction.commit()
+except:
+import sys, pdb
+pdb.post_mortem(sys.exc_info()[2])
 except ConflictError:
 raise
 except:

this made the reindex take a ridiculously long time, and it leaves the catalog 
in an inconsistent state (i.e. it should only be done on a copy of the 
database, one that you can throw away), but it did in the end reveal to me 
which record was causing the problem.


the problematic record ended up being a ghosted catalog entry that happened to 
share the same name as a view.  when the catalog tried to look up the object 
by its path (via unrestrictedTraverse), the view object was returned.


thanks for the help!

-r

___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce

http://mail.zope.org/mailman/listinfo/zope-dev )


[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex

2008-06-18 Thread Ross Patterson
Rob Miller <[EMAIL PROTECTED]> writes:

> Ross Patterson wrote:
>> Rob Miller <[EMAIL PROTECTED]> writes:
>>
>>> hi,
>>>
>>> i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with
>>> over 29,000 indexed objects.  it churns for a good long time, and
>>> eventually fails with a long set of tracebacks, of which i've included
>>> a sample at the end of this message.
>>>
>>> i think i understand the gist of the issue... it's trying to write an
>>> object (probably a CatalogBrain) to the database, but this object's
>>> __dict__ contains a value that is of type instancemethod, which isn't
>>> allowed for persistent objects.
>>>
>>> the problem is that i can't figure out which specific objects are
>>> causing the problem.  i've used pdb.post_mortem to get a debug prompt
>>> way down in the traceback, but the code goes in and out of C modules,
>>> so i'm missing a lot of what's happening.  and when i interactively
>>> peek at the objects that are being indexed when the error happens,
>>> there doesn't seem to be anything wrong, and i can index the objects
>>> w/ no problem.  i've even tried dropping the subtransaction threshold
>>> down to 1, so it will try to commit a savepoint after every object,
>>> but none of the objects being indexed seemed to have any problems.
>>>
>>> i CAN verify that the instancemethod that is causing the problem
>>> renders like this:
>>>
>>> >> at /session_data_manager>>
>>>
>>> even that hasn't proven enough for me to concretely identify the
>>> source of the problem, though.
>>>
>>> i've been working on this for a full day already, and am not sure how
>>> to proceed.  does anyone have any debugging tips that might help me
>>> figure out what, exactly, is causing the reindex attempts to blow up?
>>>
>>> thanks!
>>
>> Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416
>> error or only at transaction/_transaction.py:267 where some sort of
>> previous error is handled?
>
> yes, i put the post_mortem around the "transaction.savepoint" call at
> line 559 of Products/ZCatalog/ZCatalog.py.  this is the place where
> the error is happening, but it's where the subtransaction threshold
> has been reached, so IIUC the error could be caused by any of the
> objects in the subtransaction.

Then it seems like you're not getting at the actual pickle error and
thus aren't looking at the object whose __getstate__ returns the
offending instance method.  If you have enough memory, try running the
refreshCatalog with subtransactions disabled to get the *actual* pickle
error.

>> If the former, I often find it informative to step up to
>> ZODB/serialize.py:407 where obj.__getstate__() is called and find out
>> what obj is.  Is it the same object every time?  If not, is it of the
>> same type every time?
>
> right.  as long as the subtxn threshold number is the same, then the
> object is the same.  but if i change the subtxn threshold, then the
> object will be different.  as i said, i even tried reducing the subtxn
> threshold to '1', so, assuming my understanding is correct, this would
> be triggered after every object.  it took many hours to get to my
> debug prompt; when i did, the obj was FileAttachment object that
> didn't seem to have any problems.  i was able to interactively index
> the object from the pdb prompt, and even commit the transaction, w/ no
> problems.

The actual error of your TB is the error that savepoint raises when a
previous error has occurred.  So changing the subtransaction threshold
just changes which object is being indexed when the savepoint *happens*
to be run and notices that an error has been raised for a previous
object in the savepoint.  At least, that's my read of it.

>> A next step can also be to put a pdb.set_trace() at
>> transaction/_transaction.py:340 in the register() method such that
>> it's is only called when the offending object is added to the
>> transaction.
>
> i'll give this a shot, thx.
>
>> Hope some of that helps,
>
> even just having someone to discuss it with is helpful, for my sanity
> at least...  ;-)

Well, then I'll have to enjoy your sanity vicariously.  :)

Ross

___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex

2008-06-18 Thread Rob Miller

Ross Patterson wrote:

Rob Miller <[EMAIL PROTECTED]> writes:


hi,

i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with
over 29,000 indexed objects.  it churns for a good long time, and
eventually fails with a long set of tracebacks, of which i've included
a sample at the end of this message.

i think i understand the gist of the issue... it's trying to write an
object (probably a CatalogBrain) to the database, but this object's
__dict__ contains a value that is of type instancemethod, which isn't
allowed for persistent objects.

the problem is that i can't figure out which specific objects are
causing the problem.  i've used pdb.post_mortem to get a debug prompt
way down in the traceback, but the code goes in and out of C modules,
so i'm missing a lot of what's happening.  and when i interactively
peek at the objects that are being indexed when the error happens,
there doesn't seem to be anything wrong, and i can index the objects
w/ no problem.  i've even tried dropping the subtransaction threshold
down to 1, so it will try to commit a savepoint after every object,
but none of the objects being indexed seemed to have any problems.

i CAN verify that the instancemethod that is causing the problem
renders like this:

>

even that hasn't proven enough for me to concretely identify the
source of the problem, though.

i've been working on this for a full day already, and am not sure how
to proceed.  does anyone have any debugging tips that might help me
figure out what, exactly, is causing the reindex attempts to blow up?

thanks!


Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416
error or only at transaction/_transaction.py:267 where some sort of
previous error is handled?


yes, i put the post_mortem around the "transaction.savepoint" call at line 559 
of Products/ZCatalog/ZCatalog.py.  this is the place where the error is 
happening, but it's where the subtransaction threshold has been reached, so 
IIUC the error could be caused by any of the objects in the subtransaction.



If the former, I often find it informative to step up to
ZODB/serialize.py:407 where obj.__getstate__() is called and find out
what obj is.  Is it the same object every time?  If not, is it of the
same type every time?


right.  as long as the subtxn threshold number is the same, then the object is 
the same.  but if i change the subtxn threshold, then the object will be 
different.  as i said, i even tried reducing the subtxn threshold to '1', so, 
assuming my understanding is correct, this would be triggered after every 
object.  it took many hours to get to my debug prompt; when i did, the obj was 
FileAttachment object that didn't seem to have any problems.  i was able to 
interactively index the object from the pdb prompt, and even commit the 
transaction, w/ no problems.



A next step can also be to put a pdb.set_trace() at
transaction/_transaction.py:340 in the register() method such that it's
is only called when the offending object is added to the transaction.


i'll give this a shot, thx.


Hope some of that helps,


even just having someone to discuss it with is helpful, for my sanity at 
least...  ;-)


thx.

-r

___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce

http://mail.zope.org/mailman/listinfo/zope-dev )


[Zope] Re: help debugging a "can't pickle" error deep within a catalog reindex

2008-06-17 Thread Ross Patterson
Rob Miller <[EMAIL PROTECTED]> writes:

> hi,
>
> i'm trying to perform a ZCatalog.refreshCatalog() on a catalog with
> over 29,000 indexed objects.  it churns for a good long time, and
> eventually fails with a long set of tracebacks, of which i've included
> a sample at the end of this message.
>
> i think i understand the gist of the issue... it's trying to write an
> object (probably a CatalogBrain) to the database, but this object's
> __dict__ contains a value that is of type instancemethod, which isn't
> allowed for persistent objects.
>
> the problem is that i can't figure out which specific objects are
> causing the problem.  i've used pdb.post_mortem to get a debug prompt
> way down in the traceback, but the code goes in and out of C modules,
> so i'm missing a lot of what's happening.  and when i interactively
> peek at the objects that are being indexed when the error happens,
> there doesn't seem to be anything wrong, and i can index the objects
> w/ no problem.  i've even tried dropping the subtransaction threshold
> down to 1, so it will try to commit a savepoint after every object,
> but none of the objects being indexed seemed to have any problems.
>
> i CAN verify that the instancemethod that is causing the problem
> renders like this:
>
>  at /session_data_manager>>
>
> even that hasn't proven enough for me to concretely identify the
> source of the problem, though.
>
> i've been working on this for a full day already, and am not sure how
> to proceed.  does anyone have any debugging tips that might help me
> figure out what, exactly, is causing the reindex attempts to blow up?
>
> thanks!

Can you get a pdb.post_mortem prompt at the actual ZODB/serialize.py:416
error or only at transaction/_transaction.py:267 where some sort of
previous error is handled?

If the former, I often find it informative to step up to
ZODB/serialize.py:407 where obj.__getstate__() is called and find out
what obj is.  Is it the same object every time?  If not, is it of the
same type every time?

A next step can also be to put a pdb.set_trace() at
transaction/_transaction.py:340 in the register() method such that it's
is only called when the offending object is added to the transaction.

Hope some of that helps,
Ross

> Traceback (most recent call last):
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/Products/ZCatalog/ZCatalog.py",
> line 296, in refreshCatalog
> self.catalog_object(obj, p, pghandler=pghandler)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/zope/Products/CMFPlone/CatalogTool.py",
> line 367, in catalog_object
> self._increment_counter()
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/zope/Products/CMFPlone/CatalogTool.py",
> line 395, in _increment_counter
> self._counter.change(1)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/BTrees/Length.py",
> line 55, in change
> self.value += delta
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZODB/Connection.py",
> line 890, in register
> self._register(obj)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZODB/Connection.py",
> line 900, in _register
> self.transaction_manager.get().join(self)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/transaction/_transaction.py",
> line 273, in join
> self._prior_operation_failed() # doesn't return
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/transaction/_transaction.py",
> line 267, in _prior_operation_failed
> raise TransactionFailedError("An operation previously failed, "
> TransactionFailedError: An operation previously failed, with traceback:
>
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZServer/PubCore/ZServerPublisher.py",
> line 25, in __init__
> response=b)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py",
> line 401, in publish_module
> environ, debug, request, response)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py",
> line 202, in publish_module_standard
> response = publish(request, module_name, after_list, debug=debug)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py",
> line 119, in publish
> request, bind=1)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/mapply.py",
> line 88, in mapply
> if debug is not None: return debug(object,args,context)
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/ZPublisher/Publish.py",
> line 42, in call_object
> result=apply(object,args) # Type s to step into published object.
>   File
> "/home/rob/topp/14000/builds/20080611/opencore/lib/zope/lib/python/Products/ZCatalog/ZCatalog.py",
> line 260, in manage_catalogReindex
> self.refreshCatalog(clear=