Re: [Zope-dev] Coroner's toolkit for zope, or how to figure out what went wrong.

2002-08-12 Thread Romain Slootmaekers

Jim Fulton wrote:
> Romain Slootmaekers wrote:
>
>> Yo,
>>
>> we had a nasty crash of our zope server that we use for a b2b web
>> application. The Data.fs ZODB lost a significant amount of data.
>
>
> What sort of crash? Was this a hardware failure, or a software failure?

software.
basically, the server didn't crash, but our applications couldn't 
function anymore because some objects that really have to exist
were gone.

the Data.fs was NOT corrupted,
but (as far as I can tell) a bug in the conflict resolution code caused 
our object (the one on which we set self._p_changed=1) to be empty. This 
object is a container of other objects that are Persistent themselves, 
and at this point we don't believe the conflict resolution mechanism 
handles these cases correctly.


 
>> At this point, we restored the Data.fs from our last backup and the
>> server is back up and running. (breathing relieved)
>>
>> What worries me is that we have no clue whatsoever about what happened,
>> besides the observation that somehow, somewhere we lost a whole tree
>> of objects.
>
>
> Was this in the backup? Or in the damaged data file?

nope. the loss of data occurred in the 12 hours after our last backup.
so we only (well, it actually is quite a lot :( ) lost the transactions 
that happened between the backup and the restore/restart.


The stack trace in the follow-up mail gives some clue as to where the 
problem lies in the code (as well as the exact version of the 
Zope installation).

Anyway, Murphy's law is once again proven as this thing happened on the 
first day of my vacation. :|

Sloot.



___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Coroner's toolkit for zope, or how to figure out what went wrong.

2002-08-12 Thread Romain Slootmaekers

Jim Fulton wrote:
> Romain Slootmaekers wrote:
>
> I think you are pretty far off here. You said you saw a read conflict.
> No conflict resolution is done for a read conflict. Further, from the very
> brief description of your DB class, it doesn't appear to use any objects
> that actually try to resolve conflicts. I doubt seriously that this has
> anything to do with conflict resolution. It is very doubtful that a database
> error would cause your data to simply disappear without some sort of error,
> like a database corruption error or an error about invalid object ids
> (dangling references). Have you considered an application error?

yes, that's the first thing one does: doubt your own code.

the object in question is created once, and there is no code to delete 
it since in that application, it is of no use.
The only thing that happens is that we add/modify/delete other objects on
that root node.



 
> If you still have the data file with the lost data, it should be possible to
> analyze it to figure out what went wrong. In particular, it would be helpful
> to figure out just what transaction made the data go away to figure out what
> it might have been doing.

that was exactly the question I was asking in the first place :
tools to browse the ZODB, to see where it broke.


> It simply causes the transaction with the read conflict to be re-executed.
 


Ok, I figured that out by now as well. the read conflict error has 
indeed nothing to do with our problem. sorry 'bout that...
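
As an illustration of the "re-executed" behaviour described above, here is a minimal sketch of a retry loop. It is not Zope's actual publisher code: ConflictError, run_with_retry, flaky_transaction and the limit of three attempts are all hypothetical names and values chosen for this example, and the syntax is modern Python rather than the Python of the Zope install discussed in this thread.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the database's conflict exception."""


def run_with_retry(work, max_attempts=3):
    """Call work() again after each ConflictError, up to max_attempts times.

    Returns (result, number_of_attempts_used). The last ConflictError is
    re-raised if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return work(), attempt
        except ConflictError:
            if attempt == max_attempts:
                raise


calls = []


def flaky_transaction():
    """Hypothetical unit of work: conflicts once, then succeeds."""
    calls.append(1)
    if len(calls) < 2:
        raise ConflictError("read conflict on first attempt")
    return "committed"


result, attempts = run_with_retry(flaky_transaction)
print(result, attempts)
```

The point of the sketch is only that a read conflict discards the work of one attempt and runs it again from the start; nothing is merged.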

But we found something else:

I included a script below that produces a stripped down analogy
of our problem. (no zope needed, just ZODB, and you might wanna modify 
the first line to get it working)

The script produces the following output:


C:\zope\devel>bin\python.exe \temp\test.py
<Foo instance at 008DCAC8> 0
<Foo instance at 008E1280> 0
Traceback (most recent call last):
  File "\temp\test.py", line 68, in ?
    get_transaction().commit()
  File "C:\zope\devel\lib\python\ZODB\Transaction.py", line 234, in commit
    j.commit(o,self)
  File "C:\zope\devel\lib\python\ZODB\Connection.py", line 348, in commit
    s=dbstore(oid,serial,p,version,transaction)
  File "C:\zope\devel\lib\python\ZODB\FileStorage.py", line 665, in store
    data=self.tryToResolveConflict(oid, oserial, serial, data)
  File "C:\zope\devel\lib\python\ZODB\ConflictResolution.py", line 108, in tryToResolveConflict
    resolved=resolve(old, committed, newstate)
  File "\temp\test.py", line 30, in _p_resolveConflict
    print savedState['data'].getHello()
AttributeError: PersistentReference instance has no attribute 'getHello'


The question is: is this intended ZODB behaviour or not, and is there a 
workaround?

have fun,

Sloot.


swhome=r'C:\zope\devel'
import sys
sys.path.insert(0, '%s/lib/python' % swhome)
sys.path.insert(1, '%s/bin/lib' % swhome)

import ZODB
from Persistence import Persistent

class Dummy(Persistent):

    def __init__(self):
        self.hello = "Hello there..."

    def getHello(self):
        return self.hello

class Foo(Persistent):

    def __init__(self):
        self.data = Dummy()
        self.count = 0

    def incCounter(self):
        self.count += 1

    def getCount(self):
        return self.count

    def _p_resolveConflict(self, oldState, savedState, newState):
        print savedState['data'].getHello()
        print newState['data'].getHello()
        print oldState['data'].getHello()
        diffsaved = savedState['count'] - oldState['count']
        diffnew = newState['count'] - oldState['count']
        newState['count'] = oldState['count'] + diffsaved + diffnew
        return newState


from ZODB import FileStorage, DB

storage = FileStorage.FileStorage('/temp/test.fs')
db = DB( storage )

# Initialise the test object
conn = db.open()

root = conn.root()
root['foo'] = Foo()
get_transaction().commit()

conn.close()

conn1 = db.open()
root1 = conn1.root()
foo1 = root1['foo']

conn2 = db.open()
root2 = conn2.root()
foo2 = root2['foo']

print foo1, foo1.getCount()
print foo2, foo2.getCount()

foo1.incCounter()
get_transaction().commit()

foo2.incCounter()
get_transaction().commit()

print foo1, foo1.getCount()
print foo2, foo2.getCount()
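
The traceback earlier in this message suggests that the state dictionaries handed to _p_resolveConflict hold PersistentReference placeholders where the persistent sub-objects would be, so methods such as getHello() are not available there. A possible workaround, sketched below under that assumption, is to merge only the plain values ('count') and pass the 'data' entry through untouched. FakeReference and resolve_count are hypothetical names made up for this sketch; it runs without ZODB on plain dictionaries and uses modern Python syntax rather than that of the script above.

```python
class FakeReference(object):
    """Hypothetical stand-in for the placeholder that appears in the state
    dictionaries instead of a persistent sub-object. It deliberately has
    no getHello() method."""

    def __init__(self, oid):
        self.oid = oid


def resolve_count(old_state, saved_state, new_state):
    """Merge two concurrent increments of 'count' using plain values only.

    The 'data' entry is copied through as-is and never dereferenced.
    """
    saved_delta = saved_state['count'] - old_state['count']
    new_delta = new_state['count'] - old_state['count']
    resolved = dict(new_state)  # shallow copy; the 'data' entry is untouched
    resolved['count'] = old_state['count'] + saved_delta + new_delta
    return resolved


ref = FakeReference(oid=1)
old = {'data': ref, 'count': 0}    # state both connections started from
saved = {'data': ref, 'count': 1}  # already committed by the first connection
new = {'data': ref, 'count': 1}    # being committed by the second connection

merged = resolve_count(old, saved, new)
print(merged['count'])
```

In the real class, the same arithmetic would sit in _p_resolveConflict with the three print lines removed; whether that is enough for a container whose sub-objects also change is a separate question that this sketch does not answer.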