On Wed, Mar 14, 2012 at 5:15 PM, Daniel Shahaf <danie...@elego.de> wrote:
> Jason Wong wrote on Tue, Mar 13, 2012 at 06:57:59 -0700:
>> On Fri, Mar 2, 2012 at 8:12 AM, Daniel Shahaf <danie...@elego.de> wrote:
>> > Jason Wong wrote on Fri, Mar 02, 2012 at 07:32:38 -0800:
>> >> On Fri, Mar 2, 2012 at 2:58 AM, Daniel Shahaf <danie...@elego.de> wrote:
>> >> > Jason Wong wrote on Thu, Mar 01, 2012 at 10:01:26 -0800:
>> >> >> I have had a developer here create a build of the latest SVN code
>> >> >> with your changes you mentioned in r1294470 for the svnadmin verify
>> >> >
>> >> > Okay, that's great news, for two reasons:
>> >> >
>> >> > 1. It means building svn on windows isn't as painful as it used to be :)
>> >>
>> >> Actually, it did take some work to get it going as we did not have
>> >> another system available to us and also did not have VC++ 6. We had
>> >> to use VS 2010 in order to do this. Also, for the other components
>> >> required (python,perl etc), the files after the install were copied
>> >> to the workstation to see if it would work as we did not want to
>> >> change the current workstation configuration by running the
>> >> installers. All in all, it did seem to work.
>> >>
>> >
>> > Okay. The normal build requires just the *.exe and *.dll files to be
>> > placed appropriately (such that the *.exe's and httpd's find their
>> > libsvn_* DLL's at runtime) --- it doesn't require Administrator access,
>> > for example.
>> >
>> > To clarify, Perl is only required to build OpenSSL; it is not required
>> > to build APR, Neon, or Subversion.
>> >
>> >> >
>> >> > 2. It means I can ask you to build a custom server with the 'inprocess'
>> >> > cache disabled, or (if all else fails) to bisect, per my previous email.
>> >> >
>> >> > One of the things you could try is to disable caching: simply modify
>> >> > the function create_cache() in libsvn_fs_fs/caching.c to always return
>> >> > NULL in *CACHE_P. See below for another suggestion.
>> >> >
>> >> >> command. We have run 'svnadmin verify' against every revision of our
>> >> >> hotcopy of our repository taken when we first brought this issue to
>> >> >> the forums and are now tracking down each of the revisions to see
>> >> >> what actions were being done at those times.
>> >> >>
>> >> >
>> >> > Thanks! I do hope this work enables us to pinpoint and fix the bug.
>> >>
>> >> I will be going through the list to see what else was happening at the
>> >> same time on the apache server since it was alluded to that there may
>> >> be concurrency issues. I know the last two times that this error has
>> >> popped up, we had two svn operations starting at around the same time
>> >> according to the Apache logs. I will go through the previous apache
>> >> history to see if this was always the case or not.
>> >>
>> >
>> > Thanks, looking forward to hear what you come up with.
>> >
>> > FWIW, Justin's reply suggests that the error was seen on three different
>> > platforms --- Windows, Solaris, and FreeBSD --- so that should narrow
>> > down the range of possible explanations.
>> >
>> > (I'll also note that at ASF's installation we are not running into new
>> > instances of the bug.)
>>
>> Hi Daniel.
>>
>
> Hi. Sorry for the delay --- was away from svn the last few days.
>

No problem. I have been really busy as well the past couple of weeks on
other tasks.

>> I haven't gone through all the cases yet, but I have made progress
>> through quite a number of them and a pattern seems to be coming up.
>>
>
> Is it safe to summarize your findings as: in every instance of the bug
> (as determined by the new 'svnadmin verify' output), the victim revision
> was started whilst (victim-1) was in progress?
>

From what is there so far, yes. We do have different operations
occurring at the same time, but for these ones, I see MERGE and DELETE
verbs overlapping in the same or near time intervals according to the
Apache logs.

I just did a quick look in the Apache logs during a time window where
the bug wasn't triggered, and was able to see cases where I have the
following:

rev(x-1) merge
rev(x) merge
rev(x) delete
rev(x-1) delete

This seems fine. The pattern I have seen in my reported cases is as
follows:

rev(x-1) merge
rev(x) merge
rev(x-1) delete
rev(x) delete

I would have to look more closely, if needed, to show whether this case
always triggers the bug. We do not have a large userbase here, so it is
harder to get a lot of overlapping hits for this type of case.

> That by itself is an everyday occurrence, but I think it's nonetheless
> a useful piece of information. I'll try and digest it further later
> when I'm less sleepy (it's way past midnight here).
>
> (As I understand ra_dav, the MERGE verb corresponds to the FS level
> svn_fs_commit_txn(). Someone please correct me if I'm wrong.)
>
> Thanks,
>
> Daniel
>
>> I have attached 2 txt files. One shows the modified svnadmin verify
>> output from the binaries we built. The other shows the revisions and
>> what appears to have been occurring at the time of the bug. I figure
>> better to provide this now rather than delay any longer for the rest
>> of the results.
>>
>> I will continue to go through the rest of the events and see if
>> there are other differences seen when the issue occurs. I hope
>> this information helps.
>>
>> Thanks.
>>
>> Jason
>
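
For anyone wanting to check the remaining cases mechanically, the two
MERGE/DELETE orderings described above could be told apart with
something like the sketch below. It is only a rough illustration, not
part of Subversion: the function name classify_pair and the input
format (a list of (verb, revision) tuples in log order) are assumptions,
and mapping each Apache log line to a revision still has to be done by
hand or by a separate script. It only labels the ordering; it does not
prove that a given ordering triggers the bug.

```python
from typing import Iterable, Tuple

# Hypothetical input format: (HTTP verb, revision the request was
# attributed to), listed in the order the requests appear in the log.
Event = Tuple[str, int]


def classify_pair(events: Iterable[Event], x: int) -> str:
    """Label how the MERGE/DELETE requests of rev(x-1) and rev(x) are ordered."""
    prev, cur = x - 1, x
    patterns = {
        # rev(x-1) finished completely before rev(x) started its MERGE.
        (("MERGE", prev), ("DELETE", prev),
         ("MERGE", cur), ("DELETE", cur)): "sequential",
        # Ordering seen in a window where the bug was not triggered.
        (("MERGE", prev), ("MERGE", cur),
         ("DELETE", cur), ("DELETE", prev)): "nested",
        # Ordering seen in the reported cases.
        (("MERGE", prev), ("MERGE", cur),
         ("DELETE", prev), ("DELETE", cur)): "interleaved",
    }
    wanted = {("MERGE", prev), ("MERGE", cur),
              ("DELETE", prev), ("DELETE", cur)}
    # Ignore every request that is not one of the four of interest.
    order = tuple(e for e in events if e in wanted)
    return patterns.get(order, "other")


if __name__ == "__main__":
    reported = [("MERGE", 9), ("MERGE", 10), ("DELETE", 9), ("DELETE", 10)]
    print(classify_pair(reported, 10))  # prints "interleaved"
```

Anything labelled "other" (missing or repeated requests, or a different
ordering) would still need to be looked at by hand.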