Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message

2012-03-30 Thread Jason Wong
On Wed, Mar 28, 2012 at 12:00 PM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Wed, Mar 28, 2012 at 11:49:20 -0700:
 dump-noderev.pl /repo /
 -
 id: 0.0.r62104/28771
 type: dir
 pred: 0.0.r62103/28680
 count: 62071
 text: 62104 27520 1238 1238 ea635421e867454f9f7bc503c8160a2c
 cpath: /
 copyroot: 0 /
 minfo-cnt: 25707
 -

 dump-noderev.pl /mirror2 /
 ---
 id: 0.0.r62104/6122
 type: dir
 pred: 0.0.r62103/6039
 count: 62104
 text: 62104 4874 1235 1235 1f315ed2437ba5d70dba2587d9ef2d5a
 cpath: /
 copyroot: 0 /
 minfo-cnt: 25707
 ---

 Is this in line with what you expected?

 It's in line with my expectations, insofar as on the mirror the 'count'
 is correct.

 It also indicates that you weren't bitten by the minfo-cnt part of this
 bug.  As you know from the dev@ thread, Philip identified that part and
 fixed it too -- after my above email.

 Thanks again for your help in chasing down this bug.  It was backported
 today towards 1.7.5 too.

 Cheers,

 Daniel

Hi Daniel.

No problem. I am glad the issues are fixed. Thank you for all your help
and patience with my slow replies. It has been a busy couple of months
for me in trying to find the time to do these tests.

So for correcting the count information in our live repository, I
should run svnsync on it at some point? Is there anything I need to do
after running that command in order to have it not link to the original?

Thanks.


Jason Wong


Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message

2012-03-28 Thread Jason Wong
On Thu, Mar 22, 2012 at 11:32 AM, Jason Wong jwong1m...@gmail.com wrote:
 Hello Daniel.

 I will give it a go and let you know what I find.

 Jason

 On Wed, Mar 21, 2012 at 1:39 AM, Daniel Shahaf danie...@elego.de wrote:
 Jason,

 I've learnt yesterday something new about the minfo-cnt corruption bug:
 it can manifest not only as absurdly high values (on the order of 2**70),
 but as far smaller wrong increments too (such as increment of 172
 instead of 0 on one occasion).

 Could you determine whether said bug has occurred in your history?  You
 can do that by duplicating your repository using svnsync or dump|load,
 running dump-noderev.pl on / of both copies at the same revisions, and
 comparing the minfo-cnt values.

 I would be interested in knowing whether they are equal between the two
 copies.

 Thanks,

 Daniel

 Jason Wong wrote on Thu, Feb 16, 2012 at 11:42:42 -0800:
   ./dump-noderev.pl /repository / 61851
 --

 id: 0.0.r61851/33470
 type: dir
 pred: 0.0.r61850/3844
 count: 61818
 text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
 cpath: /
 copyroot: 0 /
 minfo-cnt: 25685
 --

Hello Daniel.

The svnsync took a while to run once I got it going. I ran the command
on the hotcopy I had made originally to keep the results consistent.

I have run the following two commands:
dump-noderev.pl /repo /
dump-noderev.pl /mirror2 /

Here are the outputs from the commands:

dump-noderev.pl /repo /
-
id: 0.0.r62104/28771
type: dir
pred: 0.0.r62103/28680
count: 62071
text: 62104 27520 1238 1238 ea635421e867454f9f7bc503c8160a2c
cpath: /
copyroot: 0 /
minfo-cnt: 25707
-

dump-noderev.pl /mirror2 /
---
id: 0.0.r62104/6122
type: dir
pred: 0.0.r62103/6039
count: 62104
text: 62104 4874 1235 1235 1f315ed2437ba5d70dba2587d9ef2d5a
cpath: /
copyroot: 0 /
minfo-cnt: 25707
---

Is this in line with what you expected?

Jason Wong.
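
The comparison above can also be done mechanically. The following is only a sketch in Python, not part of dump-noderev.pl: the helper names parse_noderev and revision_of are made up for illustration, and the two sample texts are the outputs pasted above. It reads the "key: value" headers into a dict, takes the revision number from the 'id:' header, and reports how far 'count:' lags behind it in each copy.

```python
import re

def parse_noderev(text):
    """Collect the 'key: value' headers printed by dump-noderev.pl into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # separator lines such as '---' contain no ': ' and are skipped
            fields[key] = value
    return fields

def revision_of(fields):
    """Extract the revision number embedded in the 'id:' header."""
    return int(re.search(r"\.r(\d+)/", fields["id"]).group(1))

REPO = """\
-
id: 0.0.r62104/28771
type: dir
pred: 0.0.r62103/28680
count: 62071
text: 62104 27520 1238 1238 ea635421e867454f9f7bc503c8160a2c
cpath: /
copyroot: 0 /
minfo-cnt: 25707
-
"""

MIRROR = """\
---
id: 0.0.r62104/6122
type: dir
pred: 0.0.r62103/6039
count: 62104
text: 62104 4874 1235 1235 1f315ed2437ba5d70dba2587d9ef2d5a
cpath: /
copyroot: 0 /
minfo-cnt: 25707
---
"""

repo, mirror = parse_noderev(REPO), parse_noderev(MIRROR)
for name, fields in (("repo", repo), ("mirror2", mirror)):
    # For the root node-revision, 'count:' is expected to equal the revision.
    deficit = revision_of(fields) - int(fields["count"])
    print(name, "revision - count =", deficit, "minfo-cnt =", fields["minfo-cnt"])
print("minfo-cnt equal:", repo["minfo-cnt"] == mirror["minfo-cnt"])
```

With the two outputs above, the original repository lags by 33 while the mirror does not lag at all, and the minfo-cnt values agree.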


Re: relation to minfo-cnt bug Re: predecessor count for the root node-revision is wrong message

2012-03-22 Thread Jason Wong
Hello Daniel.

I will give it a go and let you know what I find.

Jason

On Wed, Mar 21, 2012 at 1:39 AM, Daniel Shahaf danie...@elego.de wrote:
 Jason,

 I've learnt yesterday something new about the minfo-cnt corruption bug:
 it can manifest not only as absurdly high values (on the order of 2**70),
 but as far smaller wrong increments too (such as increment of 172
 instead of 0 on one occasion).

 Could you determine whether said bug has occurred in your history?  You
 can do that by duplicating your repository using svnsync or dump|load,
 running dump-noderev.pl on / of both copies at the same revisions, and
 comparing the minfo-cnt values.

 I would be interested in knowing whether they are equal between the two
 copies.

 Thanks,

 Daniel

 Jason Wong wrote on Thu, Feb 16, 2012 at 11:42:42 -0800:
   ./dump-noderev.pl /repository / 61851
 --

 id: 0.0.r61851/33470
 type: dir
 pred: 0.0.r61850/3844
 count: 61818
 text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
 cpath: /
 copyroot: 0 /
 minfo-cnt: 25685
 --


Re: predecessor count for the root node-revision is wrong message

2012-03-19 Thread Jason Wong
Hello Daniel, Philip.

I have been following the thread: #4129 is reproducible Re:
predecessor count for the root node-revision is wrong message.
It looks like you all have it figured out now. Good job.

Do you need any more information from me at this point? Thanks.

Jason Wong.


Re: predecessor count for the root node-revision is wrong message

2012-03-19 Thread Jason Wong
On Mon, Mar 19, 2012 at 1:56 PM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Mon, Mar 19, 2012 at 13:41:19 -0700:
 Hello Daniel, Philip.

 I have been following the thread: #4129 is reproducible Re:
 predecessor count for the root node-revision is wrong message.
 It looks like you all have it figured out now. Good job.

 Do you need any more information from me at this point? Thanks.


 Thanks Jason.  It would be useful if you could confirm that you do not
 run into the error after rebuilding the server with r1302399 and
 r1302613 applied.  (If you run the test suite, apply r1302539 and
 r1302591 too.)  These revisions constitute the fix which is nominated
 for inclusion in 1.6.18 and 1.7.5; see ^/subversion/branches/1.7.x/STATUS.


Hi Daniel.

The developer who built the svn client is away and will probably not be back
until later this week. What is your ETA for 1.7.5? Just wondering if that would be
released before the developer I have is back.

Thanks.

Jason

 Cheers,

 Daniel

 Jason Wong.


Re: predecessor count for the root node-revision is wrong message

2012-03-13 Thread Jason Wong
On Fri, Mar 2, 2012 at 8:12 AM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Fri, Mar 02, 2012 at 07:32:38 -0800:
 On Fri, Mar 2, 2012 at 2:58 AM, Daniel Shahaf danie...@elego.de wrote:
  Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:
  I have had a developer here create a build of the latest SVN code
  with your changes you mentioned in r1294470 for the svnadmin verify
 
  Okay, that's great news, for two reasons:
 
  1. It means building svn on windows isn't as painful as it used to be :)

 Actually, it did take some work to get it going as we did not have
 another system available to us and also did not have VC++ 6. We had
 to use VS 2010 in order to do this. Also, for the other components
 required (python,perl etc), the files after the install were copied
 to the workstation to see if it would work as we did not want to
 change the current workstation configuration by running the
 installers. All in all, it did seem to work.


 Okay.  The normal build requires just the *.exe and *.dll files to be
 placed appropriately (such that the *.exe's and httpd's find their
 libsvn_* DLL's at runtime) --- it doesn't require Administrator access,
 for example.

 To clarify, Perl is only required to build OpenSSL; it is not required
 to build APR, Neon, or Subversion.

 
  2. It means I can ask you to build a custom server with the 'inprocess'
  cache disabled, or (if all else fails) to bisect, per my previous email.
 
  One of the things you could try is to disable caching: simply modify
  the function create_cache() in libsvn_fs_fs/caching.c to always return
  NULL in *CACHE_P.  See below for another suggestion.
 
  command. We have run 'svnadmin verify' against every revision of our
  hotcopy of our repository taken when we first brought this issue to
  the forums and are now tracking down each of the revisions to see
  what actions were being done at those times.
 
 
  Thanks!  I do hope this work enables us to pinpoint and fix the bug.

 I will be going through the list to see what else was happening at the
 same time on the apache server since it was alluded to that there may
 be concurrency issues. I know the last two times that this error has
 popped up, we had two svn operations starting at around the same time
 according to the Apache logs. I will go through the previous apache
 history to see if this was always the case or not.


 Thanks, looking forward to hearing what you come up with.

 FWIW, Justin's reply suggests that the error was seen on three different
 platforms --- Windows, Solaris, and FreeBSD --- so that should narrow
 down the range of possible explanations.

 (I'll also note that at ASF's installation we are not running into new
 instances of the bug.)

Hi Daniel.

I haven't gone through all the cases yet, but I have made progress
through quite a number of them and a pattern seems to be coming up.

I have attached 2 txt files. One shows the modified svnadmin verify
output from the binaries we built. The other shows the revisions and
what appears to have been occurring at the time of the bug. I figure
better to provide this now rather than delay any longer for the rest
of the results.

I will continue to go through the rest of the events and see if
there are other differences seen when the issue occurs. I hope
this information helps.

Thanks.

Jason
SVN log history for predecessor node error: from svnadmin verify

svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r45558 has 45557, but r45557 has 45557
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r46947 has 46945, but r46946 has 46945
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r46997 has 46994, but r46996 has 46994
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r47004 has 47000, but r47003 has 47000
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r47006 has 47001, but r47005 has 47001
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r47193 has 47187, but r47192 has 47187
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r47715 has 47708, but r47714 has 47708
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r47718 has 47710, but r47717 has 47710
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r50049 has 50040, but r50048 has 50040
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r50963 has 50953, but r50962 has 50953
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r51481 has 51470, but r51480 has 51470
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r51684 has 51672, but r51683 has 51672
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r52082 has 52069, but r52081 has 52069
svnadmin: E160004: predecessor count for the root node-revision is wrong: 
r53220 has 53205
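
For anyone who wants to tabulate output like the above, here is a rough Python sketch. It assumes the message format shown in this attachment (produced by our custom build, so other builds may print something different), and the names VERIFY_RE and parse_verify are made up for illustration. It joins the wrapped lines and pulls out the revision and count pairs.

```python
import re

# Matches the "rN has C, but rM has D" part of each message shown above.
VERIFY_RE = re.compile(r"r(\d+) has (\d+), but r(\d+) has (\d+)")

def parse_verify(output):
    """Return (rev, count, prev_rev, prev_count) tuples from verify output."""
    flat = " ".join(output.split())  # undo the line wrapping
    return [tuple(int(g) for g in m.groups()) for m in VERIFY_RE.finditer(flat)]

SAMPLE = """\
svnadmin: E160004: predecessor count for the root node-revision is wrong:
r45558 has 45557, but r45557 has 45557
svnadmin: E160004: predecessor count for the root node-revision is wrong:
r46947 has 46945, but r46946 has 46945
svnadmin: E160004: predecessor count for the root node-revision is wrong:
r46997 has 46994, but r46996 has 46994
"""

for rev, count, prev_rev, prev_count in parse_verify(SAMPLE):
    # In each message the count is the same in both revisions, so the
    # difference between revision and count grows by one at that point.
    print(f"r{rev}: count {count}, r{prev_rev}: count {prev_count}, "
          f"revision - count = {rev - count}")
```

On the first three messages the difference between revision and count comes out as 1, 2 and 3, i.e. it grows by one at each reported revision.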

Re: predecessor count for the root node-revision is wrong message

2012-03-02 Thread Jason Wong
On Fri, Mar 2, 2012 at 2:58 AM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:
 I have had a developer here create a build of the latest SVN code
 with your changes you mentioned in r1294470 for the svnadmin verify

 Okay, that's great news, for two reasons:

 1. It means building svn on windows isn't as painful as it used to be :)

Actually, it did take some work to get it going as we did not have
another system available to us and also did not have VC++ 6. We had
to use VS 2010 in order to do this. Also, for the other components
required (python,perl etc), the files after the install were copied
to the workstation to see if it would work as we did not want to
change the current workstation configuration by running the
installers. All in all, it did seem to work.


 2. It means I can ask you to build a custom server with the 'inprocess'
 cache disabled, or (if all else fails) to bisect, per my previous email.

 One of the things you could try is to disable caching: simply modify
 the function create_cache() in libsvn_fs_fs/caching.c to always return
 NULL in *CACHE_P.  See below for another suggestion.

 command. We have run 'svnadmin verify' against every revision of our
 hotcopy of our repository taken when we first brought this issue to
 the forums and are now tracking down each of the revisions to see
 what actions were being done at those times.


 Thanks!  I do hope this work enables us to pinpoint and fix the bug.

I will be going through the list to see what else was happening at the
same time on the apache server since it was alluded to that there may
be concurrency issues. I know the last two times that this error has
popped up, we had two svn operations starting at around the same time
according to the Apache logs. I will go through the previous apache
history to see if this was always the case or not.


 From the results, we see 25 error messages for predecessor count is
 wrong and the first one appeared on January 26, 2011. Near that time
 the following events occurred:
Jan. 14, 2011 - svn upgraded from 1.6.6 to 1.6.15
Jan. 14, 2011 - Apache HTTP server upgraded from 2.2.15 to 2.2.17
Jan. 21, 2011 - repository was pruned to delete some binary files.

 Between January and our upgrade in Dec. to 1.7.1, we have had about
 14,000 revisions and seen only 25 instances of this node revision
 issue. During the times we had these errors, we were using svn
 versions 1.6.15 and 1.6.16.


 Thanks, very valuable information.

 I've reviewed the 1.6.6-1.6.15 diff, and I have the following
 suggestions:

 - Change subversion/libsvn_fs_fs/fs.h such that
  SVN_FS_FS__USE_LOCK_MUTEX is set to 1.  It was set to 1 in 1.6.6
  but to 0 in 1.6.15.

  (This wouldn't explain why ASF saw it, but it might explain why you're
  seeing it.)

 Fail2ban from what I could find does not look like it has a Windows
 port which I currently have my production environment hosted on.


 Yeah, sorry.  But you can write a cron job -- I mean, a Scheduled Task
 -- that greps your error logs for 160004 every night and mails you if
 it found anything, right?

 That's the error code to watch for for many FS error conditions:
 % ./tools/dev/which-error.py E160004
 00160004  SVN_ERR_FS_CORRUPT


I will look into it. We did ask developers to note any error messages
that they see from tortoisesvn now as the last time we saw the error
message pop up, we asked the developer what happened and he said that
an error message popped up and he just tried to check in again and it
worked. We will note the exact message next time.

 Thanks.

 Jason

 For convenience I'm attaching a patch that implements both of my
 suggestions.  Let us know please if it has any effect.


I will forward this to the developer to look at.

 Cheers,

 Daniel


Hi.

See replies above. I will post what we find.

Thanks.

Jason


Re: predecessor count for the root node-revision is wrong message

2012-02-27 Thread Jason Wong
On Thu, Feb 16, 2012 at 12:14 PM, Daniel Shahaf danie...@elego.de wrote:


 The output from these two tells me two things:

 1. The minfo-cnt value is reasonable (within a typical ballpark).
 That's relevant since minfo-cnt abnormalities were seen in another
 instance of the bug.

 2. Everything else looks correct: the 'id:'/'pred:' headers are accurate,
 and the 'count:' header was incremented correctly.  The 'count:' header
 does, however, indicate that your repository has _in the past_ triggered
 an instance of the bug.

This is true. We have seen the bug happen before. The first occurrence
of this that we had seen was Dec. 7th, 2011, a few days after we went
from 1.6.16 to 1.7.1. That was the first time we had seen that happen.
At the time, we did not know about the cause and the developer who
had encountered the error didn't report it and was able to work
around it. From the Apache logs we have:

[Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor
count for the root node-revision is wrong: found 59444,
committing r59478  [409, #160004]
[Wed Dec 07 15:33:47 2011] [error] [client 10.2.3.2] predecessor
count for the root node-revision is wrong: found 59482,
committing r59516  [409, #160004]
[Wed Dec 07 15:35:19 2011] [error] [client 10.2.3.3] predecessor
count for the root node-revision is wrong: found 59488,
committing r59522  [409, #160004]
[Wed Dec 07 15:44:10 2011] [error] [client 10.2.3.4] predecessor
count for the root node-revision is wrong: found 59505,
committing r59539  [409, #160004]

Of the ips above, the last line is from the build machine. The others
were from developer workstations. I mentioned the most recent two
times first as we were actually aware of the issue at that time and
it was recent so we knew to start looking into it. Between Dec. 7 and
Jan. 31, the bug has occurred 12 times, 3 of those times from the
build server. The rest are from workstations. This month, it has only
occurred once and it was from the build server.

Each of these times, the error has occurred in different parts of
the repository.
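
To go through the Apache error log for these entries, something like the following Python sketch could be used. It is an illustration only: LOG_RE and parse_errors are made-up names, the sample lines are the four entries quoted above (unwrapped to one line each, as they would appear in the log file), and no meaning is assumed for the "found" number beyond what the message prints.

```python
import re
from collections import Counter

LOG_RE = re.compile(
    r"\[client (?P<client>[\d.]+)\] predecessor count for the root "
    r"node-revision is wrong: found (?P<found>\d+), "
    r"committing r(?P<rev>\d+)"
)

def parse_errors(lines):
    """Return (client, revision, revision - found) for each matching log line."""
    results = []
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            found, rev = int(m["found"]), int(m["rev"])
            results.append((m["client"], rev, rev - found))
    return results

LOG = [
    "[Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor count for the root node-revision is wrong: found 59444, committing r59478  [409, #160004]",
    "[Wed Dec 07 15:33:47 2011] [error] [client 10.2.3.2] predecessor count for the root node-revision is wrong: found 59482, committing r59516  [409, #160004]",
    "[Wed Dec 07 15:35:19 2011] [error] [client 10.2.3.3] predecessor count for the root node-revision is wrong: found 59488, committing r59522  [409, #160004]",
    "[Wed Dec 07 15:44:10 2011] [error] [client 10.2.3.4] predecessor count for the root node-revision is wrong: found 59505, committing r59539  [409, #160004]",
]

errors = parse_errors(LOG)
for client, rev, diff in errors:
    print(f"{client}: committing r{rev}, revision - found = {diff}")

# Tally how often each client address shows up, e.g. build machine vs workstations.
per_client = Counter(client for client, _, _ in errors)
print(per_client)
```

For the four entries above the difference between the revision being committed and the "found" value is the same each time.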


 In a bit more detail: the value of the 'count:' header should be equal to
 the revision number given as the third argument to dump-noderev.pl.
 (That revision number is also embedded in the 'id:' header, and is
 practically guaranteed to be embedded in the 'text:' header as well.)
 So, there are two things you can do to help us identify the bug:

 1. Hunt for past instance of the bug, identify what revisions triggered
 it, and try and identify a common pattern to those revisions.  (This
 basically calls for running 'dump-noderev.pl $REPOS /' in a loop and
 looking for non-sequential 'count:' or 'pred:' headers in the output for
 a pair of successive revisions.)

I will try and see if I can get this done this week.

 2. Look for new instances of the bug.  You could periodically scan for
 new instances of the bug, or implement a post-commit hook such as the
 following (written for unix-like systems, sorry):

 [[[
 # look for a corruption or two
 minfo_cnt() {
  dump-noderev.pl $REPOS / $1 | sed -ne 's/minfo-cnt: //p'
 }
 PREV_REV=`expr $REV - 1`
 if expr `minfo_cnt $PREV_REV` - `minfo_cnt $REV` | grep ... >/dev/null; then
  # echo an error to stderr and mail the admin
  exit 1
 fi

 skipped_root_noderevs() {
  expr $1 - `dump-noderev.pl $REPOS / $1 | sed -ne 's/^count: //p'`
 }
 if [ `skipped_root_noderevs $PREV_REV` -ne `skipped_root_noderevs $REV` ]; then
  # echo an error to stderr and mail the admin
  exit 2
 fi
 ]]]
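
The second check in the hook above (skipped_root_noderevs) can also be run over past revisions once the 'count:' values have been collected. The following Python sketch is an illustration only: the function names are made up, and the sample numbers are taken from the dump-noderev.pl and svnadmin verify outputs posted elsewhere in this thread. It flags a revision when "revision minus count" differs from that of the revision before it.

```python
def skipped_root_noderevs(rev, count):
    """Same quantity as the shell function above: revision minus 'count:'."""
    return rev - count

def new_instances(counts):
    """counts maps revision -> 'count:' value of the root node-revision.

    Returns the revisions whose skipped count differs from that of the
    immediately preceding revision (when that revision is in the mapping).
    """
    flagged = []
    for rev in sorted(counts):
        prev = rev - 1
        if prev in counts and (
            skipped_root_noderevs(rev, counts[rev])
            != skipped_root_noderevs(prev, counts[prev])
        ):
            flagged.append(rev)
    return flagged

# 61851/61852 come from the dump-noderev.pl output, 45557/45558 from the
# svnadmin verify output; both were posted in this thread.
COUNTS = {61851: 61818, 61852: 61819, 45557: 45557, 45558: 45557}
print(new_instances(COUNTS))
```

With this data only r45558 is flagged; r61851 and r61852 both have the same skipped count of 33, which matches the observation that r61852 itself did not add a new instance.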


I will talk to the build team here about the post-commit hook. We have had
the bug occur again since my last reply.


 Replied above.  The summary is that you have indeed run into the bug,
 but for some reason not in r61852 but sometime before that, (and why
 did r61852 trigger the syslog error anyway?  Good question) and now
 we're at the point of trying to identify the cause of the bug --- at
 least circumstantially.

 Thanks for your help so far,

 Daniel

Hi Daniel.

Replies above. Sorry about the delay in replying. I have been really
busy of late. I will try and get the results this week, if not, it
will most likely be next week.

Thanks

Jason.


Re: predecessor count for the root node-revision is wrong message

2012-02-27 Thread Jason Wong
On Mon, Feb 27, 2012 at 8:09 AM, Stefan Sperling s...@elego.de wrote:
 On Mon, Feb 27, 2012 at 07:36:39AM -0800, Jason Wong wrote:
 This is true. We have seen the bug happen before. The first occurrence
 of this that we had seen was Dec. 7th, 2011, a few days after we went
 from 1.6.16 to 1.7.1. That was the first time we had seen that happen.
 At the time, we did not know about the cause and the developer who
 had encountered the error didn't report it and was able to work
 around it. From the Apache logs we have:

   [Wed Dec 07 15:16:36 2011] [error] [client 10.2.3.1] predecessor
   count for the root node-revision is wrong: found 59444,
   committing r59478  [409, #160004]

 Just to be clear: These errors emitted by the 1.7.1 server prevent the
 bug from corrupting new revisions. With a 1.6 server the problem would
 go unnoticed and create bad revision data.

 When this corruption occurs, the repository still works.
 But the history links for affected revisions are incorrect.

Hi Stefan.

So I think I misunderstood why the error messages were occurring.
I had thought that there was a condition done by this check (in 1.7),
that was erroneously causing svn to reject the attempt to check-in.

I guess I am wondering that if this is the case, then why is it that
if the check-in fails, and then we manually check it in again using
tortoisesvn, that it works the second time?

So the errors prevent the bug from corrupting new revisions? Is this
something between the 1.7 versions or would this have been in 1.6
versions as well? We have been using svn for a while now, and I am
wondering whether this means that, under 1.6, this issue has been
occurring in communications between the 1.6 client and 1.6 server.

Also, is this bug something that svnadmin verify will not detect?
The last time we ran svnadmin verify, it said all was good.

If it is the case that this bug has been occurring for a long time,
what are the implications of the history links for affected
revisions? When you say the history links are incorrect, does it
just put in a random value or is it actually unreadable values?
Does this mean subsequent revisions that occur after these bad
revisions will propagate this bad information?

A developer asked me to pose the following question. If he was to
open a bad revision, would the client fail and give an error prompt
or would it display history information which could belong to other
files?

Thanks.

Jason.


Re: predecessor count for the root node-revision is wrong message

2012-02-16 Thread Jason Wong
On Wed, Feb 15, 2012 at 6:15 PM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Wed, Feb 15, 2012 at 10:20:23 -0800:
 On Wed, Feb 8, 2012 at 6:22 PM, Nico Kadel-Garcia nka...@gmail.com wrote:
  On Wed, Feb 8, 2012 at 7:42 PM, Daniel Shahaf danie...@elego.de wrote:
  Daniel Shahaf wrote on Thu, Feb 09, 2012 at 01:46:45 +0200:
  Jason Wong wrote on Wed, Feb 08, 2012 at 15:32:05 -0800:
 
  Get xxd.exe from http://www.vim.org/ and cat.exe and sed.exe from
  http://gnuwin32.sf.net (or from Cygwin).  Delete from the script the
  line that uses the 'head' command.
 
  There is a second use of 'head', which you shouldn't delete.  So
  instead, just get head.exe from the same place as the other two, or use
  the following kind of statement:
 
  Or install Cygwin and run the scripts from inside Cygwin. This does
  present end-of-line issues, so be very careful about using the
  svn:eol-style native property.
 
 
 my $line = do {
     open FOO, "perl -V 2>&1 |";
     <FOO>;
 };
 
  Lastly, there's a 'sed' invocation that uses single-quoted arguments.
  All it does is print the input up to the first empty line --- feel
  free to implement it differently.  (One way:
 
 my @lines = split /\n/, `command | goes | here`;
 $_ and print or last for @lines;
 
  Both of these examples could do with some error checking.)
 
  Daniel
  (yes, there's also a neater way to do this without split(). but it's
  not a Perl class here)

 Hello.

 Sorry for the delay. Here is an update of what I have done since
 the last time I posted.

 I have run svn log -q ^/ on the repository and it came back with
 no missing revisions.


 I stand corrected, then.  I've confirmed on another instance of the bug
 that 'svn log -q ^/' does not behave abnormally when the bug is present.
 Sorry for the misinformation.

 Question to devs: what operation will walk the predecessor links for
 the root fspath? (and can therefore be used to identify instances of
 the bug)

 Since I first posted, each of the projects we have tried to build
 that had failed have since successfully been built without any changes
 on our side.


 What is the significance of this?  I don't know how your build process
 interacts with Subversion.

 I was having an issue with converting the script to run in windows as
 I was only getting the first line returned so I set up cygwin.

 I ran the script against both of the revisions (61815 and 61852)
 mentioned in the Apache error log and the output was the same for
 each.

 Commands:
dump-noderev.pl /repository
 /project/binaries/release/phase1/iteration/81/trunk 61815
dump-noderev.pl /repository
 /project/binaries/release/phase1/iteration/81/trunk 61852

 Output:
id: 9-45362.0-61242.r61424/0
type: dir
pred: 9-45362.0-60310/0

 Are you sure that's the value of the pred: field?  It contains only
 one ., instead of two.


I missed a part of it, you are right. here is the full pred line:
   pred: 9-45362.0-60310.r60310/0

count: 43
text: 58741 121716266 218 218 74eb31e90880ba1345fc49252ca6efe6
cpath: /project/binaries/release/phase1/iteration/81/trunk
copyfrom: 61423 /project/binaries/release/phase1/iteration/80/trunk

 Is this information helpful? Let me know if this tells you anything. Thanks


 The fact that the output is identical suggests that the
 /project/binaries/release/phase1/iteration/81/trunk tree hasn't changed
 between those two revisions (or that there was a directory replace above
 it).

 However, this is the error you report:

 [Tue Jan 31 11:37:23 2012] [error] [client 9.31.13.109] predecessor count 
 for the root node-revision is wrong: found 61815, committing r61852  [409, 
 #160004]

 The metadata this error complains about will be output by these two
 commands:

  ./dump-noderev.pl /repository / 61851
--

id: 0.0.r61851/33470
type: dir
pred: 0.0.r61850/3844
count: 61818
text: 61851 32225 1232 1232 7555349571e297c23e647cc2441d5b8f
cpath: /
copyroot: 0 /
minfo-cnt: 25685
--

  ./dump-noderev.pl /repository / 61852
--

id: 0.0.r61852/27663
type: dir
pred: 0.0.r61851/33470
count: 61819
text: 61852 26417 1233 1233 712fec619d55677e67aca8f7aa4ceb97
cpath: /
copyroot: 0 /
minfo-cnt: 25685



 Jason.

 Cheers,

 Daniel


Hi Daniel

Thanks for the quick reply. I have posted the results from the two
commands you have asked me to run above as well as the full pred
value that was incomplete.

Let me know if you need any other information.
Thanks.

Jason


Re: predecessor count for the root node-revision is wrong message

2012-02-15 Thread Jason Wong
On Wed, Feb 8, 2012 at 6:22 PM, Nico Kadel-Garcia nka...@gmail.com wrote:
 On Wed, Feb 8, 2012 at 7:42 PM, Daniel Shahaf danie...@elego.de wrote:
 Daniel Shahaf wrote on Thu, Feb 09, 2012 at 01:46:45 +0200:
 Jason Wong wrote on Wed, Feb 08, 2012 at 15:32:05 -0800:

 Get xxd.exe from http://www.vim.org/ and cat.exe and sed.exe from
 http://gnuwin32.sf.net (or from Cygwin).  Delete from the script the
 line that uses the 'head' command.

 There is a second use of 'head', which you shouldn't delete.  So
 instead, just get head.exe from the same place as the other two, or use
 the following kind of statement:

 Or install Cygwin and run the scripts from inside Cygwin. This does
 present end-of-line issues, so be very careful about using the
 svn:eol-style native property.


    my $line = do {
        open FOO, "perl -V 2>&1 |";
        <FOO>;
    };

 Lastly, there's a 'sed' invocation that uses single-quoted arguments.
 All it does is print the input up to the first empty line --- feel
 free to implement it differently.  (One way:

    my @lines = split /\n/, `command | goes | here`;
    $_ and print or last for @lines;

 Both of these examples could do with some error checking.)

 Daniel
 (yes, there's also a neater way to do this without split(). but it's
 not a Perl class here)

Hello.

Sorry for the delay. Here is an update of what I have done since
the last time I posted.

I have run svn log -q ^/ on the repository and it came back with
no missing revisions.

Since I first posted, each of the projects we have tried to build
that had failed have since successfully been built without any changes
on our side.

I was having an issue with converting the script to run in windows as
I was only getting the first line returned so I set up cygwin.

I ran the script against both of the revisions (61815 and 61852)
mentioned in the Apache error log and the output was the same for
each.

Commands:
   dump-noderev.pl /repository
/project/binaries/release/phase1/iteration/81/trunk 61815
   dump-noderev.pl /repository
/project/binaries/release/phase1/iteration/81/trunk 61852

Output:
   id: 9-45362.0-61242.r61424/0
   type: dir
   pred: 9-45362.0-60310/0
   count: 43
   text: 58741 121716266 218 218 74eb31e90880ba1345fc49252ca6efe6
   cpath: /project/binaries/release/phase1/iteration/81/trunk
   copyfrom: 61423 /project/binaries/release/phase1/iteration/80/trunk

Is this information helpful? Let me know if this tells you anything. Thanks

Jason.


Re: predecessor count for the root node-revision is wrong message

2012-02-08 Thread Jason Wong
Hello and thank you for replying.

On Tue, Feb 7, 2012 at 4:04 PM, Daniel Shahaf danie...@elego.de wrote:
 Jason Wong wrote on Tue, Feb 07, 2012 at 13:23:10 -0800:
 Any help/comments would be appreciated. Thank you.


 As I said, I'd be interested in isolating the cause of these errors.
 Is there anything common to revisions that triggered the bug (as
 explained above)?  Are they concomitant with concurrent writes (commits,
 propedits, 'svn lock' operations, 'svnadmin pack' operations)?  What
 version of svn does your server run (1.7.1?)?  What operating system
 does your server run?  Is there anything noteworthy about its
 filesystems or disks?

I am working with our lead developer to come up with more details
on our build process. I will post this when I get it.
Our svn repository is 1.7.1 and is hosted on Apache 2.2.21 on a
Windows 2003 server. The server is running RAID5 with SCSI disks.

Because my systems are on Windows, I don't think the perl script
you had sent me will run as there are a couple commands in it
that are called which I don't have. Do you have any suggestions
for how I can run the script?

In the meantime, I am running svn log -q and will go through the
output to scan for missing revisions. I will let you know those
results when I have them.

Thank you.

Jason Wong