On Mon, Aug 30, 2010 at 12:05 PM, Stefan Fuhrmann <
stefanfuhrm...@alice-dsl.de> wrote:

> Johan Corveleyn wrote:
>
>> On Sun, Aug 29, 2010 at 12:32 PM, <stefan2_at_apache.org> wrote:
>> > Author: stefan2
>> > Date: Sun Aug 29 10:32:08 2010
>> > New Revision: 990537
>> >
>> > URL: http://svn.apache.org/viewvc?rev=990537&view=rev
>>
>> > Log:
>> > Looking for the cause of Johan Corveleyn's crash (see
>> > http://svn.haxx.se/dev/archive-2010-08/0652.shtml), it
>> > seems that wrong / corrupted data contains backward
>> > pointers, i.e. negative offsets. That cannot happen if
>> > everything works as intended.
>>
>> I've just retried my test after this change (actually with
>> performance-branch@990579, so updated just 10 minutes ago). Now I get
>> the assertion error, after running log or blame on that particular
>> file:
>>
>> [[[
>> $ svnserve -d -r c:/research/svn/experiment/repos
>> Assertion failed: *ptr > buffer, file
>> ..\..\..\subversion\libsvn_subr\svn_temp_serializer.c, line 282
>>
>> This application has requested the Runtime to terminate it in an unusual
>> way.
>> Please contact the application's support team for more information.
>> ]]]
>>
> That is what I expected looking at the call stacks you posted.
> My preliminary analysis goes as follows:
>
> * The error seems to be limited to relatively rare occasions.
>  That sufficiently excludes alignment issues and plainly wrong
>  parameters / function calls.
>
> * It might be a (still rare) 32-bit-only issue.
>
> * There seems to be no miscast of types, i.e. the DAG node
>  being read and causing the PF is actually a DAG node. Even
>  if conflicting keys were used, the structure could still be read
>  from the cache and would lead to some logic failure elsewhere.
>
> What else could it be? Most of the following are rather
>
> * concurrency issue
> * data corruption within the cache itself
> * some strange serialization issue that needs very specific data
>  and / or 32 bit pointers to show up
>
>
>> Is there any way I can find more information about this failure, so I
>> can help you diagnose the problem?
>>
> In fact there is. Just some questions:
>
> * You are the only one accessing the server and you use
>  a single client process?
>

Yes. All on the same machine actually (my laptop). Accessing the server with
svn://localhost.


> * Does it happen if you log / blame the file for the first time
>  and no other requests have been made to the server before?
>

Yes


> * Does a command line "svn log" produce some output
>  before the crash? If so, is there something unusual happening
>  around these revisions (branch replacement or so)?
>

Yes. Running "svn log svn://localhost/trunk/some/path/bigfile.xml" yields 969
of the 2279 log entries, from r95849 (last change to this file) down to
r42100. Then it suddenly stops.

I've checked r42100 with "log -v", and it only mentions text modification of
bigfile.xml. Same goes for the previous and next revisions in which
bigfile.xml was affected (r42104 and r42042).



>
> Also, please verify that the crash gets triggered if the server is started
> with the following extra parameters:
>
> * -c0 -M0 -F0
>

No crash


> * -c0 -M0
>

No crash


> * -c0 -M1500 -F0
>

Crash (actually I did it with -M1000, because -M1500 would give me an "Out of
memory" immediately).


> * -c0 -M1500


Crash (with -M1000 that is)


>
>
>
>> Just to be clear: the very same repos does not have this problem when
>> accessed by a trunk svnserve.
>>
> I thought so ;) To narrow down the nature of the problem,
> I added some checks that should be able to discern plain
> data corruption from (de-)serialization issues. Please apply
> either the patch or replace the original files with the versions
> in the .zip file.
>
> A debug build should then, hopefully, trigger a different
> and more specific assertion.
>
>
Ok, will try that now.
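
For reference, the failed assertion (*ptr > buffer) appears to guard the
step where a stored offset is turned back into a pointer. Below is a minimal
sketch of that idea in C. It is not the actual svn_temp_serializer.c code:
node_t, serialize_node and resolve_node are hypothetical names, and the
layout (struct first, string directly behind it) is an assumption made for
illustration. It shows why a zero, "negative" or out-of-range offset can
only come from corrupted data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node type, not the real Subversion DAG node. */
typedef struct node_t {
    int revision;
    const char *name;   /* holds an offset while serialized */
} node_t;

/* Copy NODE and its name into BUFFER. The name pointer is replaced by
 * the offset of the string from the start of BUFFER.
 * Returns the number of bytes used, or 0 if BUFFER is too small. */
size_t serialize_node(const node_t *node, char *buffer, size_t capacity)
{
    size_t name_len = strlen(node->name) + 1;
    size_t total = sizeof(node_t) + name_len;
    node_t copy;

    if (total > capacity)
        return 0;

    copy = *node;
    /* Assumption: the string is placed right behind the struct. */
    copy.name = (const char *)(uintptr_t)sizeof(node_t);
    memcpy(buffer, &copy, sizeof(copy));
    memcpy(buffer + sizeof(node_t), node->name, name_len);
    return total;
}

/* Read the node stored in BUFFER (SIZE bytes) into OUT and turn the
 * stored offset back into a pointer into BUFFER.
 * Returns 0 on success, -1 if the stored offset cannot be valid. */
int resolve_node(const char *buffer, size_t size, node_t *out)
{
    uintptr_t offset;

    if (size < sizeof(node_t))
        return -1;

    memcpy(out, buffer, sizeof(node_t));
    offset = (uintptr_t)out->name;

    /* The target must lie behind the struct and inside the buffer.
     * A "negative" offset wraps around to a huge unsigned value and
     * is rejected by the upper bound. */
    if (offset < sizeof(node_t) || offset >= size)
        return -1;

    /* The string must be terminated within the buffer. */
    if (memchr(buffer + offset, '\0', size - offset) == NULL)
        return -1;

    out->name = buffer + offset;
    return 0;
}
```

Serializing a node such as { 42100, "bigfile.xml" } and resolving it again
yields the same name. Overwriting the stored offset with a wrapped-around
value makes resolve_node return -1 instead of following a bad pointer.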

-- 
Johan
