Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message
On Wed, Mar 28, 2012 at 12:00 PM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Wed, Mar 28, 2012 at 11:49:20 -0700:

dump-noderev.pl /repo /
-
id: 0.0.r62104/28771
type: dir
pred: 0.0.r62103/28680
count: 62071
text: 62104 27520 1238 1238 ea635421e867454f9f7bc503c8160a2c
cpath: /
copyroot: 0 /
minfo-cnt: 25707
-

dump-noderev.pl /mirror2 /
---
id: 0.0.r62104/6122
type: dir
pred: 0.0.r62103/6039
count: 62104
text: 62104 4874 1235 1235 1f315ed2437ba5d70dba2587d9ef2d5a
cpath: /
copyroot: 0 /
minfo-cnt: 25707
---

Is this in line with what you expected?

It's in line with my expectations, insofar as on the mirror the 'count' is correct. It also indicates that you weren't bitten by the minfo-cnt part of this bug. As you know from the dev@ thread, Philip identified that part and fixed it too -- after my above email.

Thanks again for your help in chasing down this bug. It was backported today towards 1.7.5 too.

Cheers, Daniel

Hi Daniel.

No problem. I am glad the issues are fixed. Thank you for all your help and patience with my slow replies. It has been a busy couple of months for me in trying to find the time to do these tests.

So for correcting the count information in our live repository, I should run svnsync on it at some point? Is there anything I need to do after running that command in order to have it not link to the original?

Thanks.

Jason Wong
Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message
On Thu, Mar 22, 2012 at 11:32 AM, Jason Wong jwong1m...@gmail.com wrote:

Hello Daniel. I will give it a go and let you know what I find.

Jason

On Wed, Mar 21, 2012 at 1:39 AM, Daniel Shahaf danie...@elego.de wrote:

Jason,

I've learnt yesterday something new about the minfo-cnt corruption bug: it can manifest not only as absurdly high values (on the order of 2**70), but as far smaller wrong increments too (such as an increment of 172 instead of 0 on one occasion).

Could you determine whether said bug has occurred in your history? You can do that by duplicating your repository using svnsync or dump|load, running dump-noderev.pl on / of both copies at the same revisions, and comparing the minfo-cnt values. I would be interested in knowing whether they are equal between the two copies.

Thanks, Daniel

Jason Wong wrote on Thu, Feb 16, 2012 at 11:42:42 -0800:

./dump-noderev.pl /repository / 61851
--
id: 0.0.r61851/33470
type: dir
pred: 0.0.r61850/3844
count: 61818
text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
cpath: /
copyroot: 0 /
minfo-cnt: 25685
--

Hello Daniel.

The svnsync took a while to run once I got it going. I ran the command on the hotcopy I had made originally to keep the results consistent. I have run the following two commands:

dump-noderev.pl /repo /
dump-noderev.pl /mirror2 /

Here are the outputs from the commands:

dump-noderev.pl /repo /
-
id: 0.0.r62104/28771
type: dir
pred: 0.0.r62103/28680
count: 62071
text: 62104 27520 1238 1238 ea635421e867454f9f7bc503c8160a2c
cpath: /
copyroot: 0 /
minfo-cnt: 25707
-

dump-noderev.pl /mirror2 /
---
id: 0.0.r62104/6122
type: dir
pred: 0.0.r62103/6039
count: 62104
text: 62104 4874 1235 1235 1f315ed2437ba5d70dba2587d9ef2d5a
cpath: /
copyroot: 0 /
minfo-cnt: 25707
---

Is this in line with what you expected?

Jason Wong.
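For readers who want to automate the comparison described in this message, here is a minimal Python sketch. It assumes dump-noderev.pl prints one "key: value" header per line (the archive flattened the original layout, so that is an assumption), and the helper names parse_noderev, revision_of and compare_copies are hypothetical, not part of any Subversion tool.

```python
def parse_noderev(text):
    """Parse 'key: value' header lines into a dict; separator lines are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def revision_of(fields):
    """Extract the revision embedded in the 'id:' header, e.g. 0.0.r62104/28771 -> 62104."""
    return int(fields["id"].split(".r")[1].split("/")[0])


def compare_copies(original_text, mirror_text):
    """Compare the root node-revision headers of two copies at the same revision."""
    original = parse_noderev(original_text)
    mirror = parse_noderev(mirror_text)
    return {
        "minfo_cnt_equal": original["minfo-cnt"] == mirror["minfo-cnt"],
        # 'count' should equal the revision number; the difference is the
        # number of root node-revisions that were skipped in the past.
        "original_skipped": revision_of(original) - int(original["count"]),
        "mirror_skipped": revision_of(mirror) - int(mirror["count"]),
    }


# Headers quoted in this message (only the fields the sketch needs).
REPO = "id: 0.0.r62104/28771\ncount: 62071\nminfo-cnt: 25707\n"
MIRROR = "id: 0.0.r62104/6122\ncount: 62104\nminfo-cnt: 25707\n"

result = compare_copies(REPO, MIRROR)
print(result)
```

With the values posted here, the minfo-cnt headers match, the original repository is 33 root node-revisions short, and the mirror is not short at all.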
Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message
Hello Daniel. I will give it a go and let you know what I find.

Jason

On Wed, Mar 21, 2012 at 1:39 AM, Daniel Shahaf danie...@elego.de wrote:

Jason,

I've learnt yesterday something new about the minfo-cnt corruption bug: it can manifest not only as absurdly high values (on the order of 2**70), but as far smaller wrong increments too (such as an increment of 172 instead of 0 on one occasion).

Could you determine whether said bug has occurred in your history? You can do that by duplicating your repository using svnsync or dump|load, running dump-noderev.pl on / of both copies at the same revisions, and comparing the minfo-cnt values. I would be interested in knowing whether they are equal between the two copies.

Thanks, Daniel

Jason Wong wrote on Thu, Feb 16, 2012 at 11:42:42 -0800:

./dump-noderev.pl /repository / 61851
--
id: 0.0.r61851/33470
type: dir
pred: 0.0.r61850/3844
count: 61818
text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
cpath: /
copyroot: 0 /
minfo-cnt: 25685
--
Re: predecessor count for the root node-revision is wrong message
Hello Daniel, Philip. I have been following the thread: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message. It looks like you all have it figured out now. Good job. Do you need any more information from me at this point? Thanks. Jason Wong.
Re: predecessor count for the root node-revision is wrong message
On Mon, Mar 19, 2012 at 1:56 PM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Mon, Mar 19, 2012 at 13:41:19 -0700:

Hello Daniel, Philip.

I have been following the thread: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message. It looks like you all have it figured out now. Good job. Do you need any more information from me at this point? Thanks.

Thanks Jason. It would be useful if you could confirm that you do not run into the error after rebuilding the server with r1302399 and r1302613 applied. (If you run the test suite, apply r1302539 and r1302591 too.)

These revisions constitute the fix which is nominated for inclusion in 1.6.18 and 1.7.5; see ^/subversion/branches/1.7.x/STATUS.

Hi Daniel.

The developer who built the svn client is away and will probably not be back until later this week. What is your ETA for 1.7.5? Just wondering if that would be released before the developer I have is back.

Thanks.

Jason

Cheers, Daniel

Jason Wong.
Re: predecessor count for the root node-revision is wrong message
On Fri, Mar 2, 2012 at 8:12 AM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Fri, Mar 02, 2012 at 07:32:38 -0800:

On Fri, Mar 2, 2012 at 2:58 AM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:

I have had a developer here create a build of the latest SVN code with your changes you mentioned in r1294470 for the svnadmin verify

Okay, that's great news, for two reasons:

1. It means building svn on windows isn't as painful as it used to be :)

Actually, it did take some work to get it going as we did not have another system available to us and also did not have VC++ 6. We had to use VS 2010 in order to do this. Also, for the other components required (python, perl etc), the files after the install were copied to the workstation to see if it would work as we did not want to change the current workstation configuration by running the installers. All in all, it did seem to work.

Okay. The normal build requires just the *.exe and *.dll files to be placed appropriately (such that the *.exe's and httpd's find their libsvn_* DLL's at runtime) --- it doesn't require Administrator access, for example.

To clarify, Perl is only required to build OpenSSL; it is not required to build APR, Neon, or Subversion.

2. It means I can ask you to build a custom server with the 'inprocess' cache disabled, or (if all else fails) to bisect, per my previous email.

One of the things you could try is to disable caching: simply modify the function create_cache() in libsvn_fs_fs/caching.c to always return NULL in *CACHE_P. See below for another suggestion.

command. We have run 'svnadmin verify' against every revision of our hotcopy of our repository taken when we first brought this issue to the forums and are now tracking down each of the revisions to see what actions were being done at those times.

Thanks! I do hope this work enables us to pinpoint and fix the bug.

I will be going through the list to see what else was happening at the same time on the apache server since it was alluded to that there may be concurrency issues. I know the last two times that this error has popped up, we had two svn operations starting at around the same time according to the Apache logs. I will go through the previous apache history to see if this was always the case or not.

Thanks, looking forward to hearing what you come up with.

FWIW, Justin's reply suggests that the error was seen on three different platforms --- Windows, Solaris, and FreeBSD --- so that should narrow down the range of possible explanations. (I'll also note that at ASF's installation we are not running into new instances of the bug.)

Hi Daniel.

I haven't gone through all the cases yet, but I have made progress through quite a number of them and a pattern seems to be coming up. I have attached 2 txt files. One shows the modified svnadmin verify output from the binaries we built. The other shows the revisions and what appears to have been occurring at the time of the bug. I figure better to provide this now rather than delay any longer for the rest of the results.

I will continue to go through the rest of the events and see if there are other differences seen when the issue occurs. I hope this information helps.

Thanks.

Jason

SVN log history for predecessor node error: from svnadmin verify

svnadmin: E160004: predecessor count for the root node-revision is wrong: r45558 has 45557, but r45557 has 45557
svnadmin: E160004: predecessor count for the root node-revision is wrong: r46947 has 46945, but r46946 has 46945
svnadmin: E160004: predecessor count for the root node-revision is wrong: r46997 has 46994, but r46996 has 46994
svnadmin: E160004: predecessor count for the root node-revision is wrong: r47004 has 47000, but r47003 has 47000
svnadmin: E160004: predecessor count for the root node-revision is wrong: r47006 has 47001, but r47005 has 47001
svnadmin: E160004: predecessor count for the root node-revision is wrong: r47193 has 47187, but r47192 has 47187
svnadmin: E160004: predecessor count for the root node-revision is wrong: r47715 has 47708, but r47714 has 47708
svnadmin: E160004: predecessor count for the root node-revision is wrong: r47718 has 47710, but r47717 has 47710
svnadmin: E160004: predecessor count for the root node-revision is wrong: r50049 has 50040, but r50048 has 50040
svnadmin: E160004: predecessor count for the root node-revision is wrong: r50963 has 50953, but r50962 has 50953
svnadmin: E160004: predecessor count for the root node-revision is wrong: r51481 has 51470, but r51480 has 51470
svnadmin: E160004: predecessor count for the root node-revision is wrong: r51684 has 51672, but r51683 has 51672
svnadmin: E160004: predecessor count for the root node-revision is wrong: r52082 has 52069, but r52081 has 52069
svnadmin: E160004: predecessor count for the root node-revision is wrong: r53220 has 53205
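A list like the one above can be reduced to the affected revisions with a few lines of Python. This is only a sketch: it assumes the message format shown in this post (produced by the modified 'svnadmin verify' build), and affected_revisions is a hypothetical helper name.

```python
import re

# Matches the tail of each error line: "rN has X, but rM has Y".
VERIFY_ERROR = re.compile(r"r(\d+) has (\d+), but r(\d+) has (\d+)")


def affected_revisions(verify_output):
    """Return (revision, skipped) pairs, where skipped = revision - its count."""
    pairs = []
    for rev, count, _prev_rev, _prev_count in VERIFY_ERROR.findall(verify_output):
        pairs.append((int(rev), int(rev) - int(count)))
    return pairs


# First three lines of the list posted above.
SAMPLE = (
    "svnadmin: E160004: predecessor count for the root node-revision is wrong: "
    "r45558 has 45557, but r45557 has 45557\n"
    "svnadmin: E160004: predecessor count for the root node-revision is wrong: "
    "r46947 has 46945, but r46946 has 46945\n"
    "svnadmin: E160004: predecessor count for the root node-revision is wrong: "
    "r46997 has 46994, but r46996 has 46994\n"
)

pairs = affected_revisions(SAMPLE)
print(pairs)
```

Each reported revision is where the count failed to advance, so the skipped total grows by one per line (1, 2, 3 for the three sample lines). A line that is cut off, like the last one in the list, simply does not match and is ignored.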
Re: predecessor count for the root node-revision is wrong message
On Fri, Mar 2, 2012 at 2:58 AM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:

I have had a developer here create a build of the latest SVN code with your changes you mentioned in r1294470 for the svnadmin verify

Okay, that's great news, for two reasons:

1. It means building svn on windows isn't as painful as it used to be :)

Actually, it did take some work to get it going as we did not have another system available to us and also did not have VC++ 6. We had to use VS 2010 in order to do this. Also, for the other components required (python, perl etc), the files after the install were copied to the workstation to see if it would work as we did not want to change the current workstation configuration by running the installers. All in all, it did seem to work.

2. It means I can ask you to build a custom server with the 'inprocess' cache disabled, or (if all else fails) to bisect, per my previous email.

One of the things you could try is to disable caching: simply modify the function create_cache() in libsvn_fs_fs/caching.c to always return NULL in *CACHE_P. See below for another suggestion.

command. We have run 'svnadmin verify' against every revision of our hotcopy of our repository taken when we first brought this issue to the forums and are now tracking down each of the revisions to see what actions were being done at those times.

Thanks! I do hope this work enables us to pinpoint and fix the bug.

I will be going through the list to see what else was happening at the same time on the apache server since it was alluded to that there may be concurrency issues. I know the last two times that this error has popped up, we had two svn operations starting at around the same time according to the Apache logs. I will go through the previous apache history to see if this was always the case or not.

From the results, we see 25 error messages for predecessor count is wrong and the first one appeared on January 26, 2011. Near that time the following events occurred:

Jan. 14, 2011 - svn upgraded from 1.6.6 to 1.6.15
Jan. 14, 2011 - Apache HTTP server upgraded from 2.2.15 to 2.2.17
Jan. 21, 2011 - repository was pruned to delete some binary files.

Between January and our upgrade in Dec. to 1.7.1, we have had about 14,000 revisions and seen only 25 instances of this node revision issue. During the times we had these errors, we were using svn versions 1.6.15 and 1.6.16.

Thanks, very valuable information. I've reviewed the 1.6.6-1.6.15 diff, and I have the following suggestions:

- Change subversion/libsvn_fs_fs/fs.h such that SVN_FS_FS__USE_LOCK_MUTEX is set to 1. It was set to 1 in 1.6.6 but to 0 in 1.6.15. (This wouldn't explain why ASF saw it, but it might explain why you're seeing it.)

Fail2ban from what I could find does not look like it has a Windows port which I currently have my production environment hosted on.

Yeah, sorry. But you can write a cron job -- I mean, a Scheduled Task -- that greps your error logs for 160004 every night and mails you if it found anything, right? That's the error code to watch for for many FS error conditions:

% ./tools/dev/which-error.py E160004
00160004 SVN_ERR_FS_CORRUPT

I will look into it. We did ask developers to note any error messages that they see from tortoisesvn now as the last time we saw the error message pop up, we asked the developer what happened and he said that an error message popped up and he just tried to check in again and it worked. We will note the exact message next time.

Thanks.

Jason

For convenience I'm attaching a patch that implements both of my suggestions. Let us know please if it has any effect.

I will forward this to the developer to look at.

Cheers, Daniel

Hi. See replies above. I will post what we find.

Thanks.

Jason
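The nightly log scan suggested in this message could look roughly like the Python sketch below. It assumes Apache error-log lines shaped like the ones quoted elsewhere in this thread ("[timestamp] [error] [client IP] message [409, #160004]"); find_fs_corrupt is a hypothetical helper, and the mailing step is deliberately left out because it depends on the local setup.

```python
import re

# One Apache error-log line ending in the Subversion error code 160004
# (SVN_ERR_FS_CORRUPT), in the shape quoted in this thread.
FS_CORRUPT_LINE = re.compile(
    r"^\[([^\]]+)\] \[error\] \[client ([^\]]+)\] (.*) \[\d+, #160004\]$"
)


def find_fs_corrupt(log_lines):
    """Return (timestamp, client, message) for every line reporting error 160004."""
    hits = []
    for line in log_lines:
        match = FS_CORRUPT_LINE.search(line.strip())
        if match:
            hits.append(match.groups())
    return hits


# The first line is taken from this thread; the second is a made-up unrelated entry.
LOG = [
    "[Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor count for "
    "the root node-revision is wrong: found 59444, committing r59478 [409, #160004]",
    "[Wed Dec 07 15:17:02 2011] [error] [client 10.2.3.9] File does not exist: /favicon.ico",
]

hits = find_fs_corrupt(LOG)
for timestamp, client, message in hits:
    print(timestamp, client, message)
```

A Scheduled Task could run such a script against the current error log every night and send the collected hits to the administrator whenever the list is non-empty.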
Re: predecessor count for the root node-revision is wrong message
On Thu, Feb 16, 2012 at 12:14 PM, Daniel Shahaf danie...@elego.de wrote:

The output from these two tells me two things:

1. The minfo-cnt value is reasonable (within a typical ballpark). That's relevant since minfo-cnt abnormalities were seen in another instance of the bug.

2. Everything else looks correct: the 'id:'/'pred:' headers are accurate, and the 'count:' header was incremented correctly.

The 'count:' header does, however, indicate that your repository has _in the past_ triggered an instance of the bug.

This is true. We have seen the bug happen before. The first occurrence of this that we had seen was Dec. 7th, 2011, a few days after we went from 1.6.16 to 1.7.1. That was the first time we had seen that happen. At the time, we did not know about the cause and the developer who had encountered the error didn't report it and was able to work around it. From the Apache logs we have:

[Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor count for the root node-revision is wrong: found 59444, committing r59478 [409, #160004]
[Wed Dec 07 15:33:47 2011] [error] [client 10.2.3.2] predecessor count for the root node-revision is wrong: found 59482, committing r59516 [409, #160004]
[Wed Dec 07 15:35:19 2011] [error] [client 10.2.3.3] predecessor count for the root node-revision is wrong: found 59488, committing r59522 [409, #160004]
[Wed Dec 07 15:44:10 2011] [error] [client 10.2.3.4] predecessor count for the root node-revision is wrong: found 59505, committing r59539 [409, #160004]

Of the ips above, the last line is from the build machine. The others were from developer workstations.

I mentioned the most recent two times first as we were actually aware of the issue at that time and it was recent so we knew to start looking into it.

Between Dec. 7 and Jan. 31, the bug has occurred 12 times, 3 of those times from the build server. The rest are from workstations. This month, it has only occurred once and it was from the build server. Each of these times, the error has occurred in different parts of the repository.

In a bit more detail: the value of the 'count:' header should be equal to the revision number given as the third argument to dump-noderev.pl. (That revision number is also embedded in the 'id:' header, and is practically guaranteed to be embedded in the 'text:' header as well.)

So, there are two things you can do to help us identify the bug:

1. Hunt for past instances of the bug, identify what revisions triggered it, and try and identify a common pattern to those revisions. (This basically calls for running 'dump-noderev.pl $REPOS /' in a loop and looking for non-sequential 'count:' or 'pred:' headers in the output for a pair of successive revisions.)

I will try and see if I can get this done this week.

2. Look for new instances of the bug. You could periodically scan for new instances of the bug, or implement a post-commit hook such as the following (written for unix-like systems, sorry):

[[[
# look for a corruption or two
minfo_cnt() {
  dump-noderev.pl $REPOS / $1 | sed -ne 's/minfo-cnt: //p'
}
PREV_REV=`expr $REV - 1`
if expr `minfo_cnt $PREV_REV` - `minfo_cnt $REV` | grep ... >/dev/null; then
  # echo an error to stderr and mail the admin
  exit 1
fi

skipped_root_noderevs() {
  expr $1 - `dump-noderev.pl $REPOS / $1 | sed -ne 's/^count: //p'`
}
if [ `skipped_root_noderevs $PREV_REV` -ne `skipped_root_noderevs $REV` ]; then
  # echo an error to stderr and mail the admin
  exit 2
fi
]]]

I will talk to the build team here about the post-commit hook. We have had the bug occur again since my last reply.

Replied above. The summary is that you have indeed run into the bug, but for some reason not in r61852 but sometime before that, (and why did r61852 trigger the syslog error anyway? Good question) and now we're at the point of trying to identify the cause of the bug --- at least circumstantially.

Thanks for your help so far, Daniel

Hi Daniel.

Replies above. Sorry about the delay in replying. I have been really busy of late. I will try and get the results this week, if not, it will most likely be next week.

Thanks

Jason.
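The "run dump-noderev.pl in a loop" hunt and the skipped_root_noderevs check from the hook above boil down to the same arithmetic, sketched here in Python. The sketch assumes the 'count:' values have already been collected into a dict keyed by revision (how they are collected is left open); new_skips is a hypothetical name, and the second data set is invented purely to show a positive hit.

```python
def skipped_root_noderevs(revision, count):
    """Number of root node-revisions missing from the predecessor chain so far."""
    return revision - count


def new_skips(count_by_revision):
    """Return revisions whose skipped total differs from the preceding revision's."""
    flagged = []
    revisions = sorted(count_by_revision)
    for prev, cur in zip(revisions, revisions[1:]):
        if cur != prev + 1:
            continue  # only successive revisions can be compared
        before = skipped_root_noderevs(prev, count_by_revision[prev])
        after = skipped_root_noderevs(cur, count_by_revision[cur])
        if before != after:
            flagged.append(cur)
    return flagged


# 'count:' values posted in this thread for r61851 and r61852.
from_thread = {61851: 61818, 61852: 61819}

# Invented data: the count fails to advance at r102.
made_up = {100: 100, 101: 101, 102: 101, 103: 102}

print(new_skips(from_thread))
print(new_skips(made_up))
```

For the two revisions posted in the thread the skipped total is 33 both times, so nothing is flagged, which matches the observation that the bug struck before r61852 rather than in it.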
Re: predecessor count for the root node-revision is wrong message
On Mon, Feb 27, 2012 at 8:09 AM, Stefan Sperling s...@elego.de wrote:

On Mon, Feb 27, 2012 at 07:36:39AM -0800, Jason Wong wrote:

This is true. We have seen the bug happen before. The first occurrence of this that we had seen was Dec. 7th, 2011, a few days after we went from 1.6.16 to 1.7.1. That was the first time we had seen that happen. At the time, we did not know about the cause and the developer who had encountered the error didn't report it and was able to work around it. From the Apache logs we have:

[Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor count for the root node-revision is wrong: found 59444, committing r59478 [409, #160004]

Just to be clear: These errors emitted by the 1.7.1 server prevent the bug from corrupting new revisions. With a 1.6 server the problem would go unnoticed and create bad revision data. When this corruption occurs, the repository still works. But the history links for affected revisions are incorrect.

Hi Stefan.

So I think I misunderstood why the error messages were occurring. I had thought that there was a condition done by this check (in 1.7), that was erroneously causing svn to reject the attempt to check-in. I guess I am wondering that if this is the case, then why is it that if the check-in fails, and then we manually check it in again using tortoisesvn, that it works the second time?

So the errors prevent the bug from corrupting new revisions? Is this something between the 1.7 versions or would this have been in 1.6 versions as well? We have been using svn for a while now and I am wondering whether this means that, for 1.6, this issue has been occurring in communications between a 1.6 client and a 1.6 server.

Also, is this bug something that svnadmin verify will not detect? The last time we ran svnadmin verify, it said all was good.

If it is the case that this bug has been occurring for a long time, what are the implications of the history links for affected revisions? When you say the history links are incorrect, does it just put in a random value or is it actually unreadable values? Does this mean subsequent revisions that occur after these bad revisions will propagate this bad information?

A developer asked me to pose the following question. If he was to open a bad revision, would the client fail and give an error prompt or would it display history information which could belong to other files?

Thanks.

Jason.
Re: predecessor count for the root node-revision is wrong message
On Wed, Feb 15, 2012 at 6:15 PM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Wed, Feb 15, 2012 at 10:20:23 -0800:

On Wed, Feb 8, 2012 at 6:22 PM, Nico Kadel-Garcia nka...@gmail.com wrote:

On Wed, Feb 8, 2012 at 7:42 PM, Daniel Shahaf danie...@elego.de wrote:

Daniel Shahaf wrote on Thu, Feb 09, 2012 at 01:46:45 +0200:

Jason Wong wrote on Wed, Feb 08, 2012 at 15:32:05 -0800:

Get xxd.exe from http://www.vim.org/ and cat.exe and sed.exe from http://gnuwin32.sf.net (or from Cygwin). Delete from the script the line that uses the 'head' command.

There is a second use of 'head', which you shouldn't delete. So instead, just get head.exe from the same place as the other two, or use the following kind of statement:

Or install CygWin and run the scripts from inside CygWin. This does present end-of-line issues, so be very careful about using the svn:eol-style native property.

my $line = do { open FOO, "perl -V 2>&1 |"; <FOO>; };

Lastly, there's a 'sed' invocation that uses single-quoted arguments. All it does is print the input up to the first empty line --- feel free to implement it differently. (One way:

my @lines = split /\n/, `command | goes | here`;
$_ and print or last for @lines;

Both of these examples could do with some error checking.)

Daniel
(yes, there's also a neater way to do this without split(). but it's not a Perl class here)

Hello.

Sorry for the delay. Here is an update of what I have done since the last time I posted.

I have run svn log -q ^/ on the repository and it came back with no missing revisions.

I stand corrected, then. I've confirmed on another instance of the bug that 'svn log -q ^/' does not behave abnormally when the bug is present. Sorry for the misinformation.

Question to devs: what operation will walk the predecessor links for the root fspath? (and can therefore be used to identify instances of the bug)

Since I first posted, each of the projects we have tried to build that had failed have since successfully been built without any changes on our side.

What is the significance of this? I don't know how your build process interacts with Subversion.

I was having an issue with converting the script to run in windows as I was only getting the first line returned so I set up cygwin. I ran the script against both of the revisions (61815 and 61852) mentioned in the Apache error log and the output was the same for each.

Commands:

dump-noderev.pl /repository /project/binaries/release/phase1/iteration/81/trunk 61815
dump-noderev.pl /repository /project/binaries/release/phase1/iteration/81/trunk 61852

Output:

id: 9-45362.0-61242.r61424/0
type: dir
pred: 9-45362.0-60310/0

Are you sure that's the value of the pred: field? It contains only one ".", instead of two.

I missed a part of it, you are right. Here is the full pred line:

pred: 9-45362.0-60310.r60310/0

count: 43
text: 58741 121716266 218 218 74eb31e90880ba1345fc49252ca6efe6
cpath: /project/binaries/release/phase1/iteration/81/trunk
copyfrom: 61423 /project/binaries/release/phase1/iteration/80/trunk

Is this information helpful? Let me know if this tells you anything.

Thanks

The fact that the output is identical suggests that the /project/binaries/release/phase1/iteration/81/trunk tree hasn't changed between those two revisions (or that there was a directory replace above it).

However, this is the error you report:

[Tue Jan 31 11:37:23 2012] [error] [client 9.31.13.109] predecessor count for the root node-revision is wrong: found 61815, committing r61852 [409, #160004]

The metadata this error complains about will be output by these two commands:

./dump-noderev.pl /repository / 61851
--
id: 0.0.r61851/33470
type: dir
pred: 0.0.r61850/3844
count: 61818
text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
cpath: /
copyroot: 0 /
minfo-cnt: 25685
--

./dump-noderev.pl /repository / 61852
--
id: 0.0.r61852/27663
type: dir
pred: 0.0.r61851/33470
count: 61819
text: 61852 26417 1233 1233 712fec619d55677e67aca8f7aa4ceb97
cpath: /
copyroot: 0 /
minfo-cnt: 25685

Jason.

Cheers, Daniel

Hi Daniel

Thanks for the quick reply. I have posted the results from the two commands you have asked me to run above as well as the full pred value that was incomplete. Let me know if you need any other information.

Thanks.

Jason
Re: predecessor count for the root node-revision is wrong message
On Wed, Feb 8, 2012 at 6:22 PM, Nico Kadel-Garcia nka...@gmail.com wrote:

On Wed, Feb 8, 2012 at 7:42 PM, Daniel Shahaf danie...@elego.de wrote:

Daniel Shahaf wrote on Thu, Feb 09, 2012 at 01:46:45 +0200:

Jason Wong wrote on Wed, Feb 08, 2012 at 15:32:05 -0800:

Get xxd.exe from http://www.vim.org/ and cat.exe and sed.exe from http://gnuwin32.sf.net (or from Cygwin). Delete from the script the line that uses the 'head' command.

There is a second use of 'head', which you shouldn't delete. So instead, just get head.exe from the same place as the other two, or use the following kind of statement:

Or install CygWin and run the scripts from inside CygWin. This does present end-of-line issues, so be very careful about using the svn:eol-style native property.

my $line = do { open FOO, "perl -V 2>&1 |"; <FOO>; };

Lastly, there's a 'sed' invocation that uses single-quoted arguments. All it does is print the input up to the first empty line --- feel free to implement it differently. (One way:

my @lines = split /\n/, `command | goes | here`;
$_ and print or last for @lines;

Both of these examples could do with some error checking.)

Daniel
(yes, there's also a neater way to do this without split(). but it's not a Perl class here)

Hello.

Sorry for the delay. Here is an update of what I have done since the last time I posted.

I have run svn log -q ^/ on the repository and it came back with no missing revisions.

Since I first posted, each of the projects we have tried to build that had failed have since successfully been built without any changes on our side.

I was having an issue with converting the script to run in windows as I was only getting the first line returned so I set up cygwin. I ran the script against both of the revisions (61815 and 61852) mentioned in the Apache error log and the output was the same for each.

Commands:

dump-noderev.pl /repository /project/binaries/release/phase1/iteration/81/trunk 61815
dump-noderev.pl /repository /project/binaries/release/phase1/iteration/81/trunk 61852

Output:

id: 9-45362.0-61242.r61424/0
type: dir
pred: 9-45362.0-60310/0
count: 43
text: 58741 121716266 218 218 74eb31e90880ba1345fc49252ca6efe6
cpath: /project/binaries/release/phase1/iteration/81/trunk
copyfrom: 61423 /project/binaries/release/phase1/iteration/80/trunk

Is this information helpful? Let me know if this tells you anything.

Thanks

Jason.
Re: predecessor count for the root node-revision is wrong message
Hello and thank you for replying.

On Tue, Feb 7, 2012 at 4:04 PM, Daniel Shahaf danie...@elego.de wrote:

Jason Wong wrote on Tue, Feb 07, 2012 at 13:23:10 -0800:

Any help/comments would be appreciated. Thank you.

As I said, I'd be interested in isolating the cause of these errors. Is there anything common to revisions that triggered the bug (as explained above)? Are they concomitant with concurrent writes (commits, propedits, 'svn lock' operations, 'svnadmin pack' operations)? What version of svn does your server run (1.7.1?)? What operating system does your server run? Is there anything noteworthy about its filesystems or disks?

I am working with our lead developer to come up with more details on our build process. I will post this when I get it.

Our svn repository is 1.7.1 and is hosted on Apache 2.2.21 on a Windows 2003 server. The server is running RAID5 with SCSI disks.

Because my systems are on Windows, I don't think the perl script you had sent me will run as there are a couple of commands in it that are called which I don't have. Do you have any suggestions for how I can run the script?

In the meantime, I am running svn log -q and will go through the output to scan for missing revisions. I will let you know those results when I have them.

Thank you.

Jason Wong