Hi all,

I’ve just run into a weird bug that damaged my svn repository. I still don’t understand exactly what went wrong, so I can’t describe it in a clear and simple manner, sorry… I’ll just describe all the symptoms I’ve experienced. I’ll use real file names, since I wasn’t able to reproduce this bug on a synthetic test repository.

*SETUP*
The simplest possible single-user, single-PC setup with a local repository.
First svn version: “Subversion command-line client, version 1.8.5.”.
Windows 7 x64
Antivirus: Kaspersky Endpoint Security 10

*THE STORY*
The story began when I ran into an error message while trying to commit r3349. After a bit of struggling, I realized that my repository had been broken by the previous commit (r3348). The nasty thing is that the previous commit finished without any error message.

*SYMPTOMS*

**svnadmin verify**
Output ends like this:
<….>
* Verified revision 3346.
* Verified revision 3347.
svnadmin: E160004: Corrupt node-revision '4d-610.2-2392.r3348/35659066'
svnadmin: E160004: Found malformed header '' in revision file

**svn checkout**
When I try to check out a new working copy, I receive a similar message:
<…>
W:\testCO\Binar\Matlab\deploy
W:\testCO\Binar\Matlab\deploy\x64
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64.prj
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64
W:\testCO\Binar\Matlab\deploy\x64\Binar_x64\distrib
Corrupt node-revision '4d-610.2-2392.r3348/35659066'
Found malformed header '' in revision file

**svn Repository Browser**
When I navigate to file:///V:/R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64 in the TortoiseSVN repository browser, I see the same error message:
Corrupt node-revision '4d-610.2-2392.r3348/35659066'
Found malformed header '' in revision file
Here’s a screenshot: http://sdrv.ms/1fJVuwa

*ZEROS IN DATA FILE*
Luckily, I have a full backup (r3337). I manually repeated all my commits up to r3347 and verified that the repository is OK in that state. Next, I tried to reproduce the bug:
1. Firstly (“try1”), I repeated the same Matlab commit script (Matlab simply calls svn, just like from cmd). And… «success»: the same bug again!
2. Secondly (“try3”), I managed to reproduce the bug using only Windows cmd commands.
3. Thirdly (“try4” and “try5(0)”), I wrote a bat script to reproduce the same actions.

I’ve compared the R_Matlab\db\revs\3\3348 file across the different “tries” (the initial bug is designated “try0”) and discovered one interesting thing: each “3348” file contains a long sequence of zero bytes:
• try0: 0x2201B0A to 0x2201FFF
• try1: 0x2201000 to 0x2201FFF
  o try0_vs_try1_p1: http://sdrv.ms/Ju7nev
  o try0_vs_try1_p2: http://sdrv.ms/Ju7tmu
  o try0_vs_try1_p3: http://sdrv.ms/Ju7AOI
• try3: 0x2201B11 to 0x2201FFF
  o try0_vs_try3_p1: http://sdrv.ms/Ju7G9g
  o try0_vs_try3_p2: http://sdrv.ms/Ju7HKd
• try4: 0x2201000 to 0x2201FFF
  o try0_vs_try4_p1: http://sdrv.ms/Ju7OFE
  o try0_vs_try4_p2: http://sdrv.ms/Ju86MJ
  o try0_vs_try4_p3: http://sdrv.ms/Ju89ID
• try5(0): 0x2201000 to 0x2201FFF (just like try4)
  o try0_vs_try5(0)_p1: http://sdrv.ms/1daKwjG
  o try0_vs_try5(0)_p2: http://sdrv.ms/1daKxUx
  o try0_vs_try5(0)_p3: http://sdrv.ms/Ju8iM5

Moreover, try4 and try5(0) differ in only one place: two zero bytes starting at 0x21F9FFE (in “try5(0)”): http://sdrv.ms/19jmBdm

*BUG DISAPPEARED*
That’s all I have: 5 broken repositories. After that, the bug DISAPPEARED. Just like a UFO :) . I launched the SAME script with the SAME input data 10 more times (“try5(1)”, “try5(2)”, …) and nothing happened: svn correctly commits r3348 and the resulting repository is valid:
• svnadmin verify is OK
• I’m able to see the contents of “R_Matlab/Binar/trunk/Binar/Matlab/deploy/x64/Binar_x64” in the TortoiseSVN repository browser
• svn checkout is OK

When I compare “revs\3348” for “try4” vs “try5(1)”, the ONLY difference is the long sequence of zero bytes mentioned above:
• try4_vs_try5(1)_p1: http://sdrv.ms/1edmEdV
• try4_vs_try5(1)_p2: http://sdrv.ms/Ju8YkC

*REPRODUCTION SCRIPT*
The bat script that resulted in the error is quite straightforward. It simply copies several files.
It might not be a good idea to copy a modified file without committing it first, but still, it should not result in an error… The bat file (used in try4) is here: http://sdrv.ms/19ld4FN

Another thing to mention is that the total size of the files in the r3348 commit is about 250 MB. To my shame, my repository is both large (~30 GB) and contains confidential data, so I’m unable to share it :( .

All files mentioned above are in this folder: http://sdrv.ms/1jMN250

*LOOKING FOR SIMILAR CASES*
Mainly, I’ve just googled “svn: Corrupt node-revision”. It looks like this error message is quite common, but no one has tried to understand its source, although there is a “what was that?” question in [1] (see links below). Moreover, it looks like no one has experienced “repetitive” behavior…

In some cases, the issue was resolved by restoring revision files from backup [1], or by using svnadmin dump/load [3,4].
In one report [2], julian.foad <at> wandisco.com used John Szakmeister's 'fsfsverify.py' to analyze the corruption, though it looks like the corruption type in his case was quite different.
In one post [4], VinnyJames said: “we've seen this happen during heavy load”.

[1] http://www.wandisco.com/svnforum/threads/38519-Commit-errors-Revision-files-corrupted
[2] http://thread.gmane.org/gmane.comp.version-control.subversion.devel/123110
[3] http://stackoverflow.com/questions/5543285/how-do-i-fix-a-repository-with-one-broken-revision
[4] http://dev-notes-to-self.blogspot.com/2009/01/fixing-corrupt-subversion-repository.html?showComment=1280529811361#c6899551059356251422

*QUESTIONS*
So…
1. What was that? Any ideas? Could it happen again?
2. Is there any other interesting diagnostic info I can get from these repositories?
3. Should I also re-post this to the Subversion mailing list? Or is it, most probably, somehow dependent on TortoiseSVN? Say, due to some caching?
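
In case it helps with diagnostics: the zero-byte ranges listed under ZEROS IN DATA FILE can also be located without a hex editor. Below is a minimal Python sketch, not the tool used for the comparisons above. The function names, the script name and the 256-byte threshold are arbitrary choices for illustration. It also checks whether a given offset falls inside a zero run. Assuming the number after the slash in the node-revision id (35659066, i.e. 0x2201D3A) is a byte offset into the revision file, that offset lies inside every zero range reported above, which would be consistent with the “malformed header ''” message.

```python
import re
import sys
from pathlib import Path


def find_zero_runs(path, min_len=256):
    """Return (start, end) offsets, end inclusive, of runs of at least
    min_len consecutive zero bytes in the file at path."""
    # Reads the whole file into memory; fine for a ~250 MB revision file
    # on a 64-bit machine, but an assumption worth keeping in mind.
    data = Path(path).read_bytes()
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - 1) for m in pattern.finditer(data)]


def run_containing(runs, offset):
    """Return the (start, end) run that contains offset, or None."""
    for start, end in runs:
        if start <= offset <= end:
            return (start, end)
    return None


if __name__ == "__main__":
    # Hypothetical usage: python find_zero_runs.py <rev-file> [offset]
    args = sys.argv[1:]
    if args:
        runs = find_zero_runs(args[0])
        for start, end in runs:
            print(f"0x{start:X} to 0x{end:X} ({end - start + 1} bytes)")
        if len(args) > 1:
            offset = int(args[1], 0)  # accepts decimal or 0x-prefixed hex
            hit = run_containing(runs, offset)
            state = "inside a zero run" if hit else "not inside a zero run"
            print(f"offset 0x{offset:X}: {state}")
```

For example, it could be run against R_Matlab\db\revs\3\3348 with 35659066 as the second argument. Legitimate file content may also contain long zero runs, so any hit still needs to be compared against a known-good copy of the same revision file.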

*PS*
I’ve already posted the text above on the TortoiseSVN mailing list:
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070808
and received a suggestion to re-post it here:
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3070843

*PPS*
I’m not subscribed and would appreciate being explicitly Cc’ed on any responses.