On Mon, Aug 30, 2010 at 12:05 PM, Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> wrote:
> Johan Corveleyn wrote:
>
>> On Sun, Aug 29, 2010 at 12:32 PM, <stefan2_at_apache.org> wrote:
>>> Author: stefan2
>>> Date: Sun Aug 29 10:32:08 2010
>>> New Revision: 990537
>>>
>>> URL: http://svn.apache.org/viewvc?rev=990537&view=rev
>>>
>>> Log:
>>> Looking for the cause of Johan Corveleyn's crash (see
>>> http://svn.haxx.se/dev/archive-2010-08/0652.shtml), it
>>> seems that wrong / corrupted data contains backward
>>> pointers, i.e. negative offsets. That cannot happen if
>>> everything works as intended.
>>
>> I've just retried my test after this change (actually with
>> performance-branch_at_990579, so updated just 10 minutes ago). Now I get
>> the assertion error, after running log or blame on that particular
>> file:
>>
>> [[[
>> $ svnserve -d -r c:/research/svn/experiment/repos
>> Assertion failed: *ptr > buffer, file
>> ..\..\..\subversion\libsvn_subr\svn_temp_serializer.c, line 282
>>
>> This application has requested the Runtime to terminate it in an unusual
>> way.
>> Please contact the application's support team for more information.
>> ]]]
>
> That is what I expected looking at the call stacks you posted.
> My preliminary analysis goes as follows:
>
> * The error seems to be limited to relatively rare occasions.
>   That sufficiently excludes alignment issues and plainly wrong
>   parameters / function calls.
>
> * It might be a (still rare) 32-bit-only issue.
>
> * There seems to be no miscast of types, i.e. the DAG node
>   being read and causing the PF is actually a DAG node. Even
>   if conflicting keys were used, the structure could still be read
>   from the cache and would lead to some logic failure elsewhere.
>
> What else could it be? Most of the following are rather
>
> * concurrency issue
> * data corruption within the cache itself
> * some strange serialization issue that needs very specific data
>   and / or 32 bit pointers to show up
>
>> Is there any way I can find more information about this failure, so I
>> can help you diagnose the problem?
>
> In fact there is. Just some questions:
>
> * You are the only one accessing the server and you use
>   a single client process?

Yes. All on the same machine actually (my laptop). Accessing the server
with svn://localhost.

> * Does it happen if you log / blame the file for the first time
>   and no other requests have been made to the server before?

Yes

> * Does a command line "svn log" produce some output
>   before the crash? If so, is there something unusual happening
>   around these revisions (branch replacement or so)?

Yes. Running "svn log svn://localhost/trunk/some/path/bigfile.xml" yields
969 of the 2279 log entries. From r95849 (last change to this file) down to
r42100. Then it suddenly stops.

I've checked r42100 with "log -v", and it only mentions text modification
of bigfile.xml. Same goes for the previous and next revisions in which
bigfile.xml was affected (r42104 and r42042).

> Also, please verify that the crash gets triggered if the server is started
> with the following extra parameters:
>
> * -c0 -M0 -F0

No crash

> * -c0 -M0

No crash

> * -c0 -M1500 -F0

Crash (actually I did it with -M1000, because M1500 would give me an "Out
of memory" immediately).

> * -c0 -M1500

Crash (with -M1000 that is)

>> Just to be clear: the very same repos does not have this problem when
>> accessed by a trunk svnserve.
>
> I thought so ;) To narrow down the nature of the problem,
> I added some checks that should be able to discern plain
> data corruption from (de-)serialization issues. Please apply
> either the patch or replace the original files with the versions
> in the .zip file.
>
> A debug build should then, hopefully, trigger a different
> and more specific assertion.

Ok, will try that now.

-- 
Johan
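
To illustrate the kind of check behind the "*ptr > buffer" assertion quoted above, here is a minimal sketch in C. It is not the code from svn_temp_serializer.c: node_t, serialize_node, resolve_name and demo are hypothetical names, and the layout (a signed byte offset from the buffer start, with 0 meaning NULL) is an assumption made for the example. It only shows why a backward (negative) offset can be recognised as corrupt data when a serialized pointer is resolved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical serialized node: the in-memory 'const char *name' has been
 * replaced by a signed byte offset from the start of the buffer.
 * Offset 0 encodes NULL; valid data always lies after the header. */
typedef struct {
    int32_t id;
    int32_t name_offset;
} node_t;

/* Write a node followed by its name into buf.
 * Returns the number of bytes used, or 0 if buf is too small. */
static size_t serialize_node(void *buf, size_t cap, int32_t id, const char *name)
{
    size_t len = strlen(name) + 1;
    node_t hdr;

    if (cap < sizeof hdr + len)
        return 0;
    hdr.id = id;
    hdr.name_offset = (int32_t)sizeof hdr;   /* forward offset past the header */
    memcpy(buf, &hdr, sizeof hdr);
    memcpy((char *)buf + sizeof hdr, name, len);
    return sizeof hdr + len;
}

/* Resolve the name. Returns 0 on success (*name may be NULL for offset 0),
 * or -1 if the offset is backward (negative), points into the header,
 * or lies at or beyond the end of the buffer. */
static int resolve_name(const void *buf, size_t size, const char **name)
{
    node_t hdr;

    *name = NULL;
    if (size < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);
    if (hdr.name_offset == 0)
        return 0;                             /* serialized NULL pointer */
    if (hdr.name_offset < (int32_t)sizeof hdr
        || (size_t)hdr.name_offset >= size)
        return -1;                            /* corrupted offset */
    *name = (const char *)buf + hdr.name_offset;
    return 0;
}

/* Round-trip a node, then corrupt its offset and confirm it is rejected.
 * Returns 0 if both steps behave as described. */
static int demo(void)
{
    unsigned char buf[64];
    const char *name;
    node_t hdr;
    size_t n = serialize_node(buf, sizeof buf, 7, "bigfile.xml");

    assert(n > 0);
    if (resolve_name(buf, n, &name) != 0 || strcmp(name, "bigfile.xml") != 0)
        return 1;
    memcpy(&hdr, buf, sizeof hdr);
    hdr.name_offset = -16;                    /* simulate a backward pointer */
    memcpy(buf, &hdr, sizeof hdr);
    return resolve_name(buf, n, &name) == -1 ? 0 : 2;
}
```

In this sketch the corrupted offset is reported through a return value instead of an assertion, so a release build would also notice it; whether that is appropriate for the real cache code is a separate design question.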