Re: 1.8.0 build problem on Solaris Sparc using gcc

2013-06-24 Thread Trent Nelson
On Thu, Jun 20, 2013 at 01:38:11PM -0700, Branko Čibej wrote:
> On 20.06.2013 22:30, Branko Čibej wrote:
> > On 20.06.2013 17:52, Rainer Jung wrote:
> >> 1.8.0 calls gcc with -std=c90. For my Solaris 10 Sparc using gcc 4.7.2
> >> this leads to "_STRICT_STDC" getting defined and then limits.h no longer
> >> defines PATH_MAX. So apr.h bombs out during compile with:
> >>
> >> #error no decision has been made on APR_PATH_MAX for your platform
> >>
> >> Adding e.g. "EXTRA_CPPFLAGS = -D__EXTENSIONS__" fixes on my platform.
> >>
> >> gcc manual as I understand it says -std=c90 and -std=c89 are the same
> >> and both are equivalent to -ansi. Was that intended as a new flag for 
> >> 1.8.0?
> > The intended flag was -std=c90, yes.
> > You should be able to use just plain CFLAGS= instead of EXTRA_CPPFLAGS=,
> > the problem with initial CFLAGS being overriden should have been fixed.
> >
> >> The flag is set by SVN_CC_MODE_SETUP in build/ac-macros/compiler.m4 and
> >> passes along into CMODEFLAGS in the Makefile.
> > Yup.
> >
> >> On Solaris that means "turn off any non c90 features". If you want all
> >> of the c90 compatible extensions on top of c90, then you need to define
> >> __EXTENSIONS__.
> > Could you provide a patch that does that? I expect the best place would
> > be in the SVN_CC_MODE_SETUP macro in compiler.m4.
> 
> By the way, I wouldn't be surprised if this was a problem on HP-UX and
> AIX as well; from what I recall, they used to show similar behaviour.
> Unfortunately I don't have access to any of these platforms to test on.

Send me your ssh keys and I'll set up Snakebite access for you :-)
(That goes for any other committers that want access.)

Current platforms available:

% ./sb
Logged you in from 124.169.107.163.
+--+
| Available Hosts  |
|   (Last Update: 2013-06-17 13:19:38Z)|
+--+
| Alias |  OS |   Arch   | CPUs | RAM  | Compilers |
+---|-|--|--|--|---+
| a7|AIX 7.1  | PowerPC  |2 @ 1.4GHz|   8GB|xlc121 |
| h2|HP-UX 11iv2  | PA-RISC  |2 @ 875MHz|   8GB|hpacc0395  |
| h3|HP-UX 11iv3  | Itanium2 |4 @ 1.3GHz|  64GB|hpacc0626  |
| i6|IRIX 6.5.30  |   MIPS   |4 @ 500MHz|   4GB|gcc34  |
|s10|Solaris 10   |  SPARC   |2 @ 1.2GHz|   8GB|suncc123   |
|s11|Solaris 11   |   x64|2 @ 2.4GHz|  16GB|suncc123   |
| s9|Solaris 9|  SPARC   |2 @ 900MHz|   8GB|suncc123   |
| t5|Tru64 5.1B   |  Alpha   |4 @ 667MHz|   7GB|cc65   |
+--+
Enter alias (^C to exit): 

Doesn't get more exotic than that!  (www.snakebite.net for more
info.)

Trent.


Re: modify-file within an added tree (r1356317) ; possibly rep-cache.db -related

2012-07-23 Thread Trent Nelson
On Mon, Jul 16, 2012 at 10:39:58AM -0700, Trent Nelson wrote:
>
> On Jul 16, 2012, at 12:38 PM, Daniel Shahaf wrote:
>
> > Trent Nelson wrote on Mon, Jul 16, 2012 at 08:58:09 -0700:
> >> Somewhat related: is this a FreeBSD box?
> >
> > Yes, it's eris from http://www.apache.org/dev/machines.
> >
> >> ports/sysutils/mcelog is useful for getting info on any ECC errors
> >> that might have occurred.
> >
> > Thanks for the pointer.  The port description says: "The primary purpose
> > is to provide a way to decode MCE output from the FreeBSD kernel into
> > something more human-readable" --- how to get the "raw" MCE output?  I
> > don't see "mce" mentioned in `sysctl -a` or /var/log/messages.
>
> Yeah it's definitely on the cryptic side.  I'm dubious as to whether or
> not the majority of features mentioned in the man page actually work.
>
> From experience, I simply `pkg_add -r mcelog`'d and then ran `mcelog`
> on a FreeBSD box of mine that looked like it had some wonky DIMMs.  I
> noticed a MCE line in the console log, so I ran `mcelog`, and wallah,
> heaps of info about the error (the exact DIMM/slot was handy).

Just had a box play up again in the same manner.  For the archives,
here's the type of message that'll show up on the system console:

MCA: Address 0x7d4ffbcc0
MCA: Bank 1, Status 0xd4004853
MCA: Global Cap 0x0105, Status 0x
MCA: Vendor "AuthenticAMD", ID 0x20f12, APIC ID 7
MCA: CPU 7 COR OVER BUSLG Source IRD Memory
MCA: Address 0x7c3cffe80
MCA: Bank 2, Status 0xd0004863
MCA: Global Cap 0x0105, Status 0x
MCA: Vendor "AuthenticAMD", ID 0x20f12, APIC ID 7
MCA: CPU 7 COR OVER BUSLG Source PREFETCH Memory

Simply running `mcelog` on that box produces this (snipped the first
123 examples):

STATUS d471c833 MCGSTATUS 0
MCGCAP 105 APICID 7 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 33
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
MCE 124
CPU 7 1 instruction cache TSC 991bb5281cc5 
ADDR 7c3cffe80 
  Instruction cache ECC error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 instruction fetch mem transaction
 memory access, level generic'
STATUS d4004853 MCGSTATUS 0
MCGCAP 105 APICID 7 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 33
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
MCE 125
CPU 7 2 bus unit TSC 991bb528226c 
  L2 cache ECC error
  Bus or cache array error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 prefetch mem transaction
 memory access, level generic'

If running `mcelog` doesn't produce anything, it's probably not an
ECC issue.


Trent.

(P.S. Anyone in the market for a cheap Sun Fire v40z?  Very reliable!)


Re: RFC: Standardizing a 'svn:branch' (boolean) property

2012-07-16 Thread Trent Nelson


On 7/17/12 1:14 AM, "Trent Nelson"  wrote:
> 7. Once we detect a root is affected, evn:roots is updated
>accordingly.  In trac@r175, a new tag is created.  Specifically,
>trunk@175 is copied to /tags/trac-0.5-rc1.  That results in two
  ^

s/trunk@175/trunk@174/




Re: RFC: Standardizing a 'svn:branch' (boolean) property

2012-07-16 Thread Trent Nelson


On 7/16/12 8:57 PM, "Johan Corveleyn"  wrote:

>On Mon, Jul 16, 2012 at 3:33 PM, C. Michael Pilato 
>wrote:
>> On 07/16/2012 08:11 AM, Bert Huijben wrote:
>>> As we couldn't think of a usage of the content I would suggest that we
>>>just
>>> always set the property to '*', just like how we handle svn:executable,
>>> svn:needs-lock, etc. This would also make sure that merges of this
>>>property
>>> won't need special handling.
>>
>> +1.  Let's get the mechanics for recognizing branch roots in place
>>first.
>> We can worry about additional policy matters later when we have a better
>> idea what our users might require.
>
>+1 to some support for branch identification. But I'm not sure if such
>a simple property will provide that in a good way. OTOH I don't have a
>better suggestion now.
>
>First, a couple of use cases I have in mind if branch-roots can be
>identified:

I've noticed this thread has a lot of focus on "how to identify branch
roots" but not so much on what to do with that information.  What are the
specific use cases we want to address?

In my case, I had a client with a *huge* Subversion deployment; nearly a
thousand production repos, tens of thousands of sub-teams, hundreds of
gigabytes of data, etc.

They had an awful experience with merging in particular -- one of the most
prominent teams had to essentially stop development for a few weeks whilst
the SCM admins tried to unbreak their mergeinfo (amongst other things).

50+ developers at 10% productivity x 2+ weeks = expensive.
The-CIO-has-heard-about-this-and-is-super-pissed expensive.

My remit was simple: prevent this sort of problem from ever happening
again.  From that, I extrapolated I needed to block a good 20-30 different
types of commits that contributed to "repo entropy".  For example:

- TagDirectoryCreatedManually
- BranchDirectoryCreatedManually
- BranchRenamedToTrunk
- TrunkRenamedToBranch
- TrunkRenamedToTag
- BranchRenamedToTag
- BranchRenamedOutsideRootBaseDir
- TagSubtreePathRemoved
- RenameAffectsMultipleRoots
- UncleanRenameAffectsMultipleRoots
- MultipleRootsCopied
- TagCopied
- UncleanCopy
- FileRemovedFromTag
- CopyKnownRootSubtreeToValidAbsRootPath
- MixedRootsNotClarifiedByExternals
- CopyKnownRootToIncorrectlyNamedRootPath
- CopyKnownRootSubtreeToIncorrectlyNamedRootPath
- UnknownPathRenamedToIncorrectlyNamedNewRootPath
- RenamedKnownRootToIncorrectlyNamedRootPath
- MixedChangeTypesInMultiRootCommit
- CopyKnownRootToKnownRootSubtree


(Full list of events currently blocked by Enversion:
http://people.apache.org/~trent/events.py)

Other requirements:

- Minimal administrative overhead.  Requiring administrators to
  manually specify branch roots (or trying to use regexes when each
  repo had an entirely different layout) does not scale when you're
  dealing with thousands of repositories.  Ditto for requiring dump/
  load dances.

- 100% accuracy for branch identification.

- No false negatives.  Confidence in Subversion was at an all time
  low -- many teams were threatening to ditch it and set up their
  own P4/git repo.  If anything was introduced that made the user
  experience more painful, there would be anarchy.


With all of those requirements set in stone, I came up with the evn:roots
revprop approach.  Which, I'm happy to report, has been chugging along in
production at this client's site for about 18 months now.  Enversion will
now block some... $(cat events.py | grep '^class ' | wc -l)... 100
different types of commits that contribute to "repo entropy".

For the record, here's an outline of Enversion's evn:roots approach.  It
took a few failed attempts before I came up with the design below... but,
as I mentioned, it's been in production for 18+ months on just under a
thousand repos and has met all the original requirements, so I'm pretty
happy with it.

1. Analyze the repository via `evnadmin analyze `.  This
   processes rev 0 to HEAD sequentially.

2. "In the beginning, there was /trunk".  I'm amazed how much
   mileage I got out of this idiom.  Essentially, the only way
   to 'create' a root from scratch is to `svn mkdir .*/trunk`.

3. Once a .*/trunk mkdir is detected, an evn:roots entry is added
   for it in the revprop it was created in.  For example, after
   analyzing r1 of the trac repo:

% svn pg evn:roots --revprop -r1 `gru trac`
{'/trunk/': {'copies': {},
 'created': 1,
 'creation_method': 'created'}}

4. When processing the next revision, the roots from the previous
   revprop are inherited in a simplified format:

% svn pg evn:roots --revprop -r2 `gru trac`
{'/trunk/': {'created': 1 }}

   i.e. the root name and the revision it was created in.

5. Analysis of each subsequent revision always inherits the previous
   revision's roots (in the sim

Re: modify-file within an added tree (r1356317) ; possibly rep-cache.db -related

2012-07-16 Thread Trent Nelson

On Jul 16, 2012, at 12:38 PM, Daniel Shahaf wrote:

> Trent Nelson wrote on Mon, Jul 16, 2012 at 08:58:09 -0700:
>> Somewhat related: is this a FreeBSD box?
> 
> Yes, it's eris from http://www.apache.org/dev/machines.
> 
>> ports/sysutils/mcelog is useful for getting info on any ECC errors
>> that might have occurred.
> 
> Thanks for the pointer.  The port description says: "The primary purpose
> is to provide a way to decode MCE output from the FreeBSD kernel into
> something more human-readable" --- how to get the "raw" MCE output?  I
> don't see "mce" mentioned in `sysctl -a` or /var/log/messages.

Yeah it's definitely on the cryptic side.  I'm dubious as to whether or
not the majority of features mentioned in the man page actually work.

>From experience, I simply `pkg_add -r mcelog`'d and then ran `mcelog`
on a FreeBSD box of mine that looked like it had some wonky DIMMs.  I
noticed a MCE line in the console log, so I ran `mcelog`, and wallah,
heaps of info about the error (the exact DIMM/slot was handy).

> 
>> Is the repo living on ZFS?
> 
> FreeBSD 8.2, zpool v15, zfs v4

Oh my, v15!  Insert standard recommendation of upgrading to v28 (it
landed in 8-stable around July last year) -- huge improvements in the
latter over v15.

That being said, one of my production boxes (that happens to mirror asf,
incidentally), is still on 8.2-stable w/ v15.  Not ideal, but meh, it's
production, and due for an upgrade in a few months.


> 
>> Don't suppose you've got a non-standard vfs.zfs.txg.timeout (greater
>> than 5 seconds?) set?  That could have exacerbated the situation.
>> 
> 
> We have the default setting, but the default is greater than 5 seconds:
> 
> eris,0:/boot% sysctl -a | grep txg 
> vfs.zfs.txg.write_limit_override: 0
> vfs.zfs.txg.synctime: 5
> vfs.zfs.txg.timeout: 30
> 

Ah, looks like the timeout defaults to 30 on v15.  It's been changed to
5 seconds on v28.

> Is there somewhere documentation of what the txg timeout _is_ (as
> opposed to what are the effects of changing the knob)?  Or is the only
> documentation in the source tree?

It tells ZFS how long it's allowed to postpone writes to stable storage.
(Does Subversion explicitly request synchronous writes?  Or do any sync/
fsync dances?)

The reason I thought of this is because of the timing that would have 
had to play out for the bit flip to be plausible.  It would have had to
have happened after the txn->rev dance, but before the data was written
to storage.

All of ZFS's bad-ass checksumming and data healing measures are useless
if the data becomes corrupted in RAM before it's written to disk; ECC is
the only defense for that.  ECC's good at catching single bit errors; 
not so good at catching anything more, so it's plausible for a bit flip
to go completely undetected, even with ECC -- but you should have at 
least *one* 'corrected' entry in the mcelog (or on the console -- don't 
suppose you log that to a file?).

Might be worth lowering vfs.zfs.txg.timeout down to, say, 5 seconds; at
least until an upgrade to v28 is viable.


Trent.



Re: RFC: Standardizing a 'svn:branch' (boolean) property

2012-07-16 Thread Trent Nelson

On Jul 16, 2012, at 11:17 AM, C. Michael Pilato wrote:

> On 07/16/2012 09:41 AM, Stefan Sperling wrote:
>> On Mon, Jul 16, 2012 at 02:11:10PM +0200, Bert Huijben wrote:
>>> Open questions:
>>> * 'svn:branch' or maybe 'svn:root'?
>> 
>> I'd prefer svn:branch but I don't care strongly.
> 
> And I "svn:branch-root".
> 
>>> * Which UI do/should we provide in 'svn'
>>> svn cp --branch  URL
>>> Performs a copy and makes sure there is a svn:branch property on the target
>>> 
>>> svn mkdir --branch 
>>> Creates a new branch
>> 
>> I would favour a new 'svn branch' subcommand which is equivalent
>> to 'svn copy' including a prop-add of 'svn:branch' at the copy target.
> 
> Hrm.  Here's where I think we see things differently, Stefan.  Your
> prescription changes the branching workflow that folks have been using for,
> possibly, a near-decade.  I see things more simply:  empower and encourage
> users to mark their *trunk* as a branch-root, and now everything follows
> naturally from the existing ('svn cp'-based) workflows.  If trunk has the
> svn:branch-root property, every copy thereof will, too.

That's exactly how I did it with Enversion.  "In the beginning, there was 
/trunk.".  (Or at least, '^.*/trunk$'.)

It worked for ~95% of all cases.  There were 3% of repos that never had a 
trunk, and 2% that had a severely broken branch (i.e. the branch was manually 
created via mkdir, then random code from other parts of the tree was manually 
copied over) that unfortunately was still active.

(So, those corner cases have to be handled manually now by Enversion.  But, as 
we have the framework for storing forward-copy information, I've got an 
enhancement in mind that'll use a heuristic to detect, during analysis, that a 
really-badly-broken-directory-that-just-happens-to-be-a-branch-root is still in 
use as in regularly being copied/branched from.)



Trent.



Re: RFC: Standardizing a 'svn:branch' (boolean) property

2012-07-16 Thread Trent Nelson

On Jul 16, 2012, at 8:11 AM, Bert Huijben wrote:

>   Hi,
> 
> On the Berlin hackathon the suggestion was raised that it might help that we
> standardize a new 'svn:branch' property to give tooling a hint on what
> directories are branches and which aren't.

Automatic branch ("root") identification is Enversion's bread and butter, so I 
might as well weigh in here.

Here's an example I've copied from [1]:

% evnadmin root-info /trunk trac
Looking up root '/trunk' in repository 'trac'...
Found '/trunk' in r10932's roots; created in r1.
Displaying root info for '/trunk' from r1:
{'/trunk/': {
'created': 1,
'creation_method': 'created',
'copies': {
174: [('/tags/trac-0.5-rc1/', 175)],
182: [('/tags/trac-0.5/', 183)],
192: [('/tags/trac-0.5.1/', 193)],
...
1086: [('/branches/0.8-stable/', 1087)],
1185: [('/branches/cmlenz-dev/rearch/', 1186)],
1351: [('/branches/cboos-dev/xref-branch/', 1352)],
1370: [('/branches/cmlenz-dev/vclayer/', 1371)],
...
9871: [('/tags/trac-0.12/', 9872)],
9937: [('/branches/0.12-stable/', 9938)],
10647: [('/sandbox/ticket-3584/', 10648)],
10649: [('/sandbox/ticket-3580/', 10650)]
},
}}

That information (a safe Python dict) happens to be stored in a cumulative 
revprop called evn:roots.  Each revision's evn:roots revprop is populated 
during the repository analysis process (`evnadmin analyze `).

The main difference between this approach and the approach you mentioned is 
that Enversion goes to extremes in order to avoid the need for manual 
intervention.  i.e. the need for an administrator or power-user to come in and 
manually mark specific paths as 'roots'.  I talk more about the motivation 
behind this approach in [2].

Having a cumulative revprop worked well because it catered for, say `svn cp 
^/trunk@50 ^/branches/foo` when HEAD is 100 and svn:branch was set at r90.  r50 
has no svn:branch set, so /branches/foo doesn't pick up the 'I-am-a-root' 
metadata.

which reminds me, I really should spend some time trying to get the word 
out about Enversion ;-)


Trent.

[1]: https://github.com/tpn/enversion/blob/master/doc/quick-start.rst
[2]: http://svn.haxx.se/dev/archive-2011-10/0107.shtml



Re: modify-file within an added tree (r1356317) ; possibly rep-cache.db -related

2012-07-16 Thread Trent Nelson

On Jul 15, 2012, at 9:32 AM, Daniel Shahaf wrote:

> Daniel Shahaf wrote on Tue, Jul 10, 2012 at 09:46:55 +0100:
>> Pending rmuir@'s feedback, then, I'll go ahead and edit the revision
>> file in-place.  No change should be needed to any of the svn mirrors.
> 
> I have now made the following edit:s/modify-file/ add-file  /
> 
> svnsync has successfully synced r1356317.  (As I write this, the
> post-commit fs processing is still running; 'current' on harmonia (the
> mirror) was updated at 13:24:29 UTC.)  I do not plan to edit
> rep-cache.db to remove references to the rep of the file in question.

My mirrors are syncing fine now, thanks!


> I am not opening an svn bug yet as there is no evidence that the bug was
> in svn (as opposed to, say, a single bit flip in svn_fs_path_change_kind_t).

I'm inclined to agree.  A bit flip is an entirely plausible explanation for 
something this odd.

Somewhat related: is this a FreeBSD box?  ports/sysutils/mcelog is useful for 
getting info on any ECC errors that might have occurred.  Is the repo living on 
ZFS?  Don't suppose you've got a non-standard vfs.zfs.txg.timeout (greater than 
5 seconds?) set?  That could have exacerbated the situation.


Trent.

Re: modify-file within an added tree (r1356317) ; possibly rep-cache.db -related

2012-07-09 Thread Trent Nelson


On 7/8/12 11:48 AM, "Daniel Shahaf"  wrote:

>Daniel Shahaf wrote on Sat, Jul 07, 2012 at 15:09:38 +0100:
>> How can we have a 'modify-file' within an added-without-copyfrom tree?

That's a pretty impressive invariant violation.  Enversion would have
caught it, but only because we assert that a modify's previous path must
match its current path -- not because we specifically look for this
situation.

I've also never come across this in the wild.  We should definitely ping
`rmuir` to find out how he committed that revision; platform, version, etc.


>>Isn't svnsync correct to complain in this case?

Definitely.

>I'd like to resolve the situation on the live svn.a.o repository.  I see
>a couple of ways to do so:
>
>- Hand-edit the revision file, changing "modify-file" to "add-file   "
>
>- Eliminate r1356317 from history: create an svnsync copy of /repos/asf
>  using an authz file that excludes
>/lucene/cms/trunk/content/solr/api-4_0_0_ALPHA
>  (or maybe just 
>/lucene/cms/trunk/content/solr/api-4_0_0_ALPHA/org/apache/solr/handler/loa
>der/package-summary.html)
>  and have the Lucene PMC recreate that tag later
>
>Thoughts?

I'd lean towards a scripted version of option 1.  You could then
disseminate the script to svnsync mirror admins (like myself), perhaps via
an ASF/infrastructure blog post?


Trent.



Re: Revprop packing implemented

2012-07-09 Thread Trent Nelson
Hi Stefan,


> This week I had one of my "how hard can it be?" moments
> and finally implemented revprop packing

Lots of tools, like svnsync, store metadata in r0 rev props.  Could this
rev be excluded from packing?

What's the performance penalty for modifying a packed rev props?  From
what you've said, it sounds like packing works by fitting as many rev
props into a 64k chunk, so, let's say I've got a million rev repository.
And the first 1000 revs fit neatly inside the first 64k pack.  What
happens when I propedit r1000 and increase its size so that it no longer
neatly fits inside the 64k pack?

(Just checking that it won't cause a knock-on re-write effect of all the
other 999,000 packed rev props.  Sounds like the manifest map takes care
of avoiding that, but just wanted to make sure.)

And, heh, this is going to kill Enversion, which currently writes
extensive root metadata to each revision's rev prop as it analyzes a
repository.  Will there be a way (even if it's at the API level, not CLI)
to disable rev prop packing?



Trent.



Re: Merge policies

2012-04-19 Thread Trent Nelson

On Apr 19, 2012, at 1:36 PM, Trent Nelson wrote:

> 
> On Apr 19, 2012, at 8:38 AM, Stefan Sperling wrote:
> 
>> On Thu, Apr 19, 2012 at 12:48:44PM +0200, Stefan Fuhrmann wrote:
>>> Hi all,
>>> 
>>> After having a closer look at merge and discussing it
>>> with Julian on IRC, there seems to be no silver bullet.
>>> However, we identified a few things that could be changed
>>> and set of constellations that make merge harder than
>>> it needs to be.
>>> 
>>> For the first, there will be another post soon. The second
>>> boils down to policy. Luckily, SVN has a mechanism to
>>> enforce policies: server-side hook scripts. My proposal
>>> is to develop a small set of scripts that a user can
>>> combine to prevent situations that her life harder than
>>> necessary.
>> 
>> I like the idea of providing pre-commit hook samples that each
>> implement a piece of a merge-policy puzzle.
>> 
>> At elego we often advise customers about branching/merging strategies.
>> It is true that virtually every shop has different requirements (i.e.
>> there is no silver bullet). This simply stems from the fact that every
>> software project is different. At the same time, a lot of shops end up
>> using the same or similar strategy and sometimes reinvent the wheel.
>> 
>> Subversion by itself enforces virtually no restrictions, and adding
>> restrictions to the system is often a major part of the work involved
>> in getting a production setup up and running. Often restrictions are
>> added over time as people repeatedly make mistakes that cause problems.
>> 
>> Having an officially supported set of high-quality hook scripts in
>> our tools/ directory that are documented well and support a wide
>> range of use cases in a modular fashion would help a lot. I believe
>> many users would be happy to deploy a hook collection that is officially
>> maintained by the Apache Subversion project rather than relying on
>> self-written or third-party hook scripts.
>> 
>> I would definitely be interested in contributing, and maybe others
>> at elego would, too. I'd be very happy to install "Apache Subversion"
>> branded hooks rather then "elego" branded hooks at customer sites and
>> share the development effort.
>> 
>> And maybe Trent can contribute some of his experience, if he has time?
> 
> As the spirit of JFDI is so rife before bed, I'm proud to announce enversion 
> 0.0.0: https://github.com/tpn/enversion!
> 

It's probably worth noting this particular file:

https://github.com/tpn/enversion/blob/master/evn/constants.py

That's a list of all the things Enversion currently detects/blocks/warns 
against.  As you can see, lots of merge validation stuff.

I'll follow up tomorrow with more info in a separate thread rather than 
hijacking this one.

Trent.

--
class _Notes(Constant):
MergeinfoRemovedFromRepositoryRoot = 'svn:mergeinfo removed from repository 
root'
SubtreeMergeinfoModified = 'subtree svn:mergeinfo modified'
SubtreeMergeinfoRemoved = 'subtree svn:mergeinfo removed'
Merge = 'merge'
RootRemoved = 'root removed'
ValidMultirootCommit = 'valid multi-root commit'
MultipleUnknownAndKnowRootsVerifiedByExternals = 'multiple known and 
unknown roots verified by svn:externals'
BranchRenamed = 'branch renamed'
TrunkRelocated = 'trunk relocated'
FileReplacedViaCopyDuringMerge = 'file replaced via copy during merge'
FileUnchangedButParentCopyOrRenameBug = 'file is unchanged but there is a 
parent rename or copy action'
DirUnchangedButParentCopyOrRenameBug = 'directory is unchanged but there is 
a parent rename or copy action'
UnchangedFileDuringMerge = 'file unchanged during merge'
UnchangedDirDuringMerge = 'dir unchanged during merge'

n = _Notes()

class _Warnings(Constant):
KnownRootRemoved = 'known root removed'

w = _Warnings()

class _Errors(Constant):
TagRenamed = 'tag renamed'
TagModified = 'tag modified'
MultipleUnknownAndKnownRootsModified = 'multiple known and unknown roots 
modified in the same commit'
MixedRootNamesInMultiRootCommit = 'mixed root names in multi-root commit'
MixedRootTypesInMultiRootCommit = 'mixed root types in multi-root commit'
SubversionRepositoryCheckedIn = 'subversion repository checked in'
MergeinfoAddedToRepositoryRoot = "svn:mergeinfo added to repository root 
'/'"
MergeinfoModifiedOnRepositoryRoot = "

Re: Merge policies

2012-04-19 Thread Trent Nelson

On Apr 19, 2012, at 8:38 AM, Stefan Sperling wrote:

> On Thu, Apr 19, 2012 at 12:48:44PM +0200, Stefan Fuhrmann wrote:
>> Hi all,
>> 
>> After having a closer look at merge and discussing it
>> with Julian on IRC, there seems to be no silver bullet.
>> However, we identified a few things that could be changed
>> and set of constellations that make merge harder than
>> it needs to be.
>> 
>> For the first, there will be another post soon. The second
>> boils down to policy. Luckily, SVN has a mechanism to
>> enforce policies: server-side hook scripts. My proposal
>> is to develop a small set of scripts that a user can
>> combine to prevent situations that her life harder than
>> necessary.
> 
> I like the idea of providing pre-commit hook samples that each
> implement a piece of a merge-policy puzzle.
> 
> At elego we often advise customers about branching/merging strategies.
> It is true that virtually every shop has different requirements (i.e.
> there is no silver bullet). This simply stems from the fact that every
> software project is different. At the same time, a lot of shops end up
> using the same or similar strategy and sometimes reinvent the wheel.
> 
> Subversion by itself enforces virtually no restrictions, and adding
> restrictions to the system is often a major part of the work involved
> in getting a production setup up and running. Often restrictions are
> added over time as people repeatedly make mistakes that cause problems.
> 
> Having an officially supported set of high-quality hook scripts in
> our tools/ directory that are documented well and support a wide
> range of use cases in a modular fashion would help a lot. I believe
> many users would be happy to deploy a hook collection that is officially
> maintained by the Apache Subversion project rather than relying on
> self-written or third-party hook scripts.
> 
> I would definitely be interested in contributing, and maybe others
> at elego would, too. I'd be very happy to install "Apache Subversion"
> branded hooks rather then "elego" branded hooks at customer sites and
> share the development effort.
> 
> And maybe Trent can contribute some of his experience, if he has time?

As the spirit of JFDI is so rife before bed, I'm proud to announce enversion 
0.0.0: https://github.com/tpn/enversion!

Quick start guide (that I sort-of just tested on my dusty ol' Macbook):

% git clone g...@github.com:tpn/enversion.git
% export PYTHONPATH=`pwd`/enversion
% export PATH=$PATH:`pwd`/enversion/bin
% evnadmin  
Type 'evnadmin  help' for help on a specific subcommand.

Available subcommands:
analyze
create
disable-remote-debug (drd)
doctest
dump-config (dc)
dump-default-config (ddc)
dump-hook-code (dhc)
enable
enable-remote-debug (erd)
find-merges (fm)
fix-hooks (fh)
root-info (ri)
run-hook (rh)
show-config-file-load-order (scflo)
show-repo-hook-status (srhs)
show-repo-remote-debug-sessions (srrds)
show-roots (sr)
toggle-remote-debug (trd)
version


If you get that far, take a look at the hasn't-even-been-proof-read-yet quick 
start guide: https://github.com/tpn/enversion/blob/master/doc/quick-start.rst



Trent.



Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql

2012-03-14 Thread Trent Nelson
On 3/14/12 3:40 PM, "Daniel Shahaf"  wrote:

>Philip: I recalled last year's discussions about implied indexes, but
>between Trent's reported observations on IRC and a back-of-the-envelope
>test with sqlite3(1) I was led to believe that an implied index does not
>get created for in this case (due to the TEXT column, as my comment
>says).
>
>I'm more than happy to revert this on trunk (if it hasn't been already)
>assuming it's indeed superfluous.
>
>Trent -- have you looked into things from your end yet?  Can you confirm
>or deny the hypothesis that the explicit INDEX was necessary in your
>environment?

Try as I might, I can't reproduce the original behavior that set us off
down this superfluous index route.  +1 to revert; I was wrong.

Trent.



Re: svn commit: r1296868 - /subversion/trunk/subversion/libsvn_fs_fs/rep-cache-db.sql

2012-03-06 Thread Trent Nelson


On 3/6/12 5:35 PM, "Philip Martin"  wrote:

>Philip Martin  writes:
>
>> It may be TEXT but it is also PRIMARY KEY and according to the SQLite
>> docs:
>>
>> http://sqlite.org/lang_createtable.html
>>
>>INTEGER PRIMARY KEY columns aside, both UNIQUE and PRIMARY KEY
>>constraints are implemented by creating an index in the database (in
>>the same way as a "CREATE UNIQUE INDEX" statement would). Such an
>>index is used like any other index in the database to optimize
>>queries. As a result, there often no advantage (but significant
>>overhead) in creating an index on a set of columns that are already
>>collectively subject to a UNIQUE or PRIMARY KEY constraint.
>
>If I create a repository using 1.7 and look at the rep-cache.db I see:
>
>$ sqlite3 rep-cache.db "select * from sqlite_master" | grep index
>index|sqlite_autoindex_rep_cache_1|rep_cache|4|
>
>An index has been created automatically and so adding another index can
>only slow things down.

Yeah this is interesting; you've provided a wealth of information contrary
to everything I said to Daniel yesterday.  I did some SQLite index tuning
a month ago and I could have sworn having a TEXT field as primary key
inhibited index creation -- but, as you've demonstrated with your explain
plan and sqlite_master query, that's clearly not the case.

To rewind things a little: I was manually repairing my asf mirror
yesterday that happened to sync earlier last week at the wrong time and
picked up a dodgy revision.  I manually deleted the affected rows from
rep-cache.db and noticed that all my select queries seemed to be taking an
inordinate amount of time to complete (5s+ at least).  I explain plan the
problematic query, saw no indexes, created one manually, and wallah,
problem solved, all queries returned instantly and explain plan showed
index usage.

I've got a bunch of zfs snapshots of the repo I can have a play around
with tomorrow to see if I can replicate.

Plausible theories off the top of my head:

a) There's something different with my env and indexes were not being
created automatically.  (I do remember having a lot of trouble getting the
asf repo to load from dumps and then complete from subsequent syncs.)
b) There's something else going on.

(Plausible theories?  Yes.  Good theories?  Debatable ;-)


Trent.



Re: Time for 1.6.18?

2012-02-06 Thread Trent Nelson
On Mon, Feb 06, 2012 at 05:34:48PM -0800, Daniel Shahaf wrote:
> Trent Nelson wrote on Mon, Feb 06, 2012 at 15:22:21 -0500:
> > On Mon, Feb 06, 2012 at 11:39:13AM -0800, Hyrum K Wright wrote:
> > > On Mon, Feb 6, 2012 at 1:25 PM, Blair Zajac  wrote:
> > > > With a fsfs corruption bug found and fixed this week [1], should we cut 
> > > > a
> > > > 1.6.18?  1.6.17 was tagged 8 months ago.  If we don't cut a new release,
> > > > then we should probably let downstream distributions know about the 
> > > > issue so
> > > > they can patch their builds.
> > >
> > > I've got time this week if folks are interested in this.  Any other
> > > pressing items for 1.6.18?
> >
> > Any chance the mergeinfo memory leak fix could be included?  This
>
> Merges cleanly to 1.6.x and passes tests.  Backported, with Greg's vote.

Awesome, thanks!  Very much appreciated.

Trent.


Re: Time for 1.6.18?

2012-02-06 Thread Trent Nelson
On Mon, Feb 06, 2012 at 11:39:13AM -0800, Hyrum K Wright wrote:
> On Mon, Feb 6, 2012 at 1:25 PM, Blair Zajac  wrote:
> > With a fsfs corruption bug found and fixed this week [1], should we cut a
> > 1.6.18?  1.6.17 was tagged 8 months ago.  If we don't cut a new release,
> > then we should probably let downstream distributions know about the issue so
> > they can patch their builds.
>
> I've got time this week if folks are interested in this.  Any other
> pressing items for 1.6.18?

Any chance the mergeinfo memory leak fix could be included?  This
issue is severely affecting one of my clients, and they're still
on 1.6.x at the moment.  If it could be included in .18, I won't
have to patch it manually :-)

Extract from 1.7.x's STATUS log w/ relevant revs below.


r1235929 | danielsh | 2012-01-25 15:36:42 -0500 (Wed, 25 Jan 2012) | 15 lines

Merge the r1235264 group from trunk:

 * r1235264, r1235296, r1235302, r1235736
   Fix a swig-py memory leak.
   Justification:
 Seen in the wild with significant effects.
   Notes:
 r1235264 is the fix.
 r1235296 adds a regression test.
 r1235302 is trivial / obvious fix.
 r1235736 is trivial / obvious fix.
   Votes:
 +1: danielsh
 +0: rhuijben



Regards,

Trent.


Re: Commit Hooks Asynchronous?

2012-02-03 Thread Trent Nelson
On Fri, Feb 03, 2012 at 04:11:56PM -0800, Cory Finger wrote:
> Is there a way in subversion to disable the ability of the post-commit
> hook to run concurrently?
>
> Say Jim and Bob both commit a file at nearly the same time, I would
> like to disable the asynchronous capabilities of commit hooks so that
>
> Jim's commit hooks process, then Bob's commit hooks process.

There's no way of doing that out-of-the-box, as far as I know.

You *can* manually enforce sequential access by blocking from within
the context of a start-commit or pre-commit hook, but... well... it
turns into the mother of all race conditions pretty quickly.

I've got a hook framework* that analyzes commits; pre-commits are
analyzed to see if they should be blocked based on a set of rules,
and post-commits are used to update metadata about the commit which
is stored in the said post-commit's revision property.

The revprop metadata is inherently sequential; r6 depends on r5,
etc.  This design decision bit me in the ass a few months later
because of the asynchronous nature of hooks; I was getting into
situations where r6's post-commit would run before I'd finish
processing r5.

The end result was masses of ugly knee-jerk locking code, which I've
included below as a testament to how fiddly it can be trying to
enforce order with commits.

[*]: framework is called 'Enversion' (Enterprise Subversion).  I'm in
 the process of getting it ready to open source under an Apache 2.0
 license.  I mention it in this thread if you're interested in what
 it attempts to do:


http://mail-archives.apache.org/mod_mbox/subversion-dev/201110.mbox/%3c4e9315db.20...@snakebite.org%3E

Regards,

Trent.


Hideous locking code below.  Note that this isn't indicative of the
general code quality ;-)
--
def _init_evn(self):
self.is_rev_for_empty_repo = self.is_rev and self.rev == 0
self.is_rev_for_first_commit = self.is_rev and self.rev == 1
self.is_txn_for_first_commit = self.is_txn and self.base_rev == 0
self.is_normal_rev_or_txn = (
(self.is_txn and self.base_rev >= 1) or
(self.is_rev and self.rev > 1)
)

# Test mutually-exclusive invariants.
assert (
(self.is_rev_for_empty_repo and (
not self.is_txn_for_first_commit and
not self.is_rev_for_first_commit and
not self.is_normal_rev_or_txn
)) or (self.is_rev_for_first_commit and (
not self.is_rev_for_empty_repo and
not self.is_txn_for_first_commit and
not self.is_normal_rev_or_txn
)) or (self.is_txn_for_first_commit and (
not self.is_rev_for_empty_repo and
not self.is_rev_for_first_commit and
not self.is_normal_rev_or_txn
)) or (self.is_normal_rev_or_txn and (
not self.is_rev_for_empty_repo and
not self.is_rev_for_first_commit and
not self.is_txn_for_first_commit
))
)

rc0 = self.r0_revprop_conf

if self.is_rev_for_empty_repo and 'version' not in rc0:
rc0.last_rev = 0
rc0.version = ESVN_VERSION

version = self._load_evn_revprop_int('version', 1)
if version != ESVN_VERSION:
self.die(e.VersionMismatch % (ESVN_VERSION, version))

if version == 1:
self._init_evn_v1()
else:
raise UnexpectedCodePath

def _init_evn_v1(self):

if self.is_rev_for_empty_repo:
return

if self.is_txn_for_first_commit or self.is_rev_for_first_commit:
assert self.base_rev == 0
k = Dict()
if self.is_txn:
k.base_rev = 0
else:
k.rev = 1
c = self.rconf(**k)
self.__roots = (c, {})
return

assert self.base_rev > 0
if self.is_rev:
assert self.rev >= 2
else:
assert self.base_rev >= 1

max_revlock_waits = self.conf.get('general', 'max-revlock-waits')

# Quick sanity check of last_rev to make sure it's not higher than the
# highest rev of the repository.
self._reload_last_rev()
highest_rev = svn.fs.youngest_rev(self.fs, self.pool)
if self.last_rev > highest_rev:
self.die(e.LastRevTooHigh % (self.last_rev, highest_rev))

if self.is_rev and isinstance(self, RepositoryHook):
with open(self.rev_lockfile, 'w') as f:
f.write(str(os.getpid()))
f.flush()
f.close()

if self.good_last_rev and self.good_base_rev_roots:
self._last_rev_and_base_rev_roots_are_good()
return

dbg = self.log.debug
a = (str(self.rev_or_txn), self.base_rev, self.last_rev)
dbg('rev_or

Re: Commit Hooks Asynchronous?

2012-02-03 Thread Trent Nelson
On Fri, Feb 03, 2012 at 02:42:49PM -0800, Cory Finger wrote:
> Hello, I was just curious. I am developing a script that involves both
> a pre-commit hook and a post commit hook.
>
> The post-commit hook opens a file, reads a file, and saves a file. I
> was wondering, do I need to set up a lock-file system to ensure that 2
> people committing at the same time will not cause a problem in the
> file writing process? Or do the commit hooks run asynchronously?

Assuming you're writing to the same file name during each hook
invocation, yes, you'll definitely need to introduce locking.

Alternatively, you could write to a revision-specific file during
post-commit, then have a little daemon that writes to the intended
file sequentially, as files become available.

If the time it takes you to process a commit is proportional to the
size of a commit, your post-commit hooks will finish out of order.

For example:
Post commit for rev 6 (30MB delta) is running.
Post commit for rev 7 (1 prop change) starts, finishes.
Post commit for rev 6 finishes a minute later.

Unless post-commit r7 logic depends on r6, I'd highly recommend
the daemon approach over locking.

Trent.


Re: [PATCH] Fix memory leak in SWIG Python bindings

2012-01-24 Thread Trent Nelson
On Tue, Jan 24, 2012 at 03:44:04AM -0800, Daniel Shahaf wrote:
> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
> > Index: subversion/bindings/swig/python/tests/mergeinfo.py
> > ===
> > --- subversion/bindings/swig/python/tests/mergeinfo.py  (revision 
> > 1234786)
> > +++ subversion/bindings/swig/python/tests/mergeinfo.py  (working copy)
> > +  def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
> > +"""Ensure that there are no svn_merge_range_t objects being tracked by
> > +   the garbage collector after we explicitly `del` the results returned
> > +   by svn_mergeinfo_parse().  We disable gc before the 
> > svn_mergeinfo_parse
> > +   call and force an explicit collection cycle straight after the 
> > `del`;
> > +   if our reference counts are correct, the allocated svn_merge_range_t
> > +   objects will be garbage collected and thus, not appear in the list 
> > of
> > +   objects returned by gc.get_objects()."""
> > +gc.disable()
> > +mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
> > +del mergeinfo
> > +gc.collect()
>
> It seems you need a gc.enable() call as well, to prevent the test from
> having effects on global state.

You know what, you can get rid of that gc.disable() call.  I put it
in by habit; when I was trying to find the leak, I had breakpoints
set within the CPython garbage collection routines -- if I didn't
disable automatic collection, they'd trigger at extremely annoying
times ;-)

The gc.collect() call is sufficient.

> Do the swig-py tests support parallelized running?  The testsuite for
> 'svn' does, and if the testsuite for swig-py does too then the 'gc' call
> have effects on other threads.

I have... no idea :-)  Given that check-swig-py basically just calls
`python subversion/bindings/swig/python/tests/run_all.py', and that
file just uses the standard Python unit test facilities... I'd say
no, they're not currently designed to support parallel running.

I'm sure we could totally shave that 40s test run time down a few
notches if they, though ;-)

Trent.



Re: [PATCH] Fix memory leak in SWIG Python bindings

2012-01-23 Thread Trent Nelson

On Jan 24, 2012, at 12:11 AM, Daniel Shahaf wrote:

> Trent Nelson wrote on Mon, Jan 23, 2012 at 15:07:20 -0500:
>> Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
>> returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
>> once we've established it's not NULL.
>> 
>> This is required because PyList_Append takes ownership of references passed
>> in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
> 
> Is svn_swig_py_c_strings_to_list() affected too?  It appends a string to
> a list but doesn't Py_DECREF() the string.
> 

Yeah, I saw that, but scheduled it for another day, as I didn't want to clutter 
my main patch.  Also, nothing calls that method -- which also factored into my 
decision to ignore it for this patch. 

>> svn_merge_range_t objects would always be off-by-one, preventing them from
>> ever being garbage collected, having dire effects on the memory usage of
>> long-running programs calling svn_mergeinfo_parse() on real-world data.
>> 
>> Patch by: Trent Nelson 
>> Tested on: FreeBSD, OS X, Windows.
>> 
>> * subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
>>  (convert_rangelist): Make sure we call Py_DECREF on the object returned
>>   from convert_to_swigtype.  PyList_New might return NULL; check for that.
>>   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
>>   to suppress compiler warnings.
> 
> Those casts shouldn't be needed -- casts from 'void *' to other
> pointer types are valid C.  What warnings were you seeing?

I'll double-check in the morning.  Side bar: the bindings generate loads of 
warnings when compiled with the target platform's equivalent of -Wall.

I found a bunch of other things whilst traipsing through the SWIG/Python stuff 
that I'll send follow up patches for this week.  Nothing as bad as the main 
memory leak, though.


> 
>> Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
>> ===
>> --- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c 
>> (revision 1234786)
>> +++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c 
>> (working copy)
>> @@ -653,15 +653,27 @@
>> static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
>> {
>>   int i;
>> +  int result;
>> +  PyObject *obj;
>>   PyObject *list;
>> -  apr_array_header_t *array = value;
>> +  apr_array_header_t *array = (apr_array_header_t *)value;
>> 
>>   list = PyList_New(0);
>> +  if (list == NULL)
>> +return NULL;
>> +
>>   for (i = 0; i < array->nelts; i++)
>> {
>>   svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t 
>> *);
>> -  if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == 
>> -1)
>> +
>> +  obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
>> +  if (obj == NULL)
>> goto error;
>> +
>> +  result = PyList_Append(list, obj);
>> +  Py_DECREF(obj);
>> +  if (result == -1)
>> +goto error;
>> }
>>   return list;
>>  error:



[PATCH] Fix memory leak in SWIG Python bindings

2012-01-23 Thread Trent Nelson
Well, after a week of blood, sweat and tears, I finally tracked down the
memory leak that's been slowly chipping away at my sanity for the past
year.

It's a pretty severe leak that affects the Python SWIG bindings: once
allocated, an svn_merge_range_t object can never be freed.  This is
due to a missing Py_DECREF call in swigutil_py.c's convert_rangelist
method.

The range lists can be huge, too, which makes this quite a nasty leak.
I was seeing up to 750KB of RSS being eaten up on each invocation of
svn_mergeinfo_parse() against non-trivial, real-world svn:mergeinfo
data (such as that on /subversion/trunk, for example).

Any of the mergeinfo methods that return a range list (which almost all
do) are affected.  I haven't looked at the Perl or Ruby bindings in any
capacity.

Patch against trunk attached.

Regards,

Trent.

[[

Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
once we've established it's not NULL.

This is required because PyList_Append takes ownership of references passed
in (i.e. calls Py_INCREF).  Prior to this fix, the reference counts of any
svn_merge_range_t objects would always be off-by-one, preventing them from
ever being garbage collected, having dire effects on the memory usage of
long-running programs calling svn_mergeinfo_parse() on real-world data.

Patch by: Trent Nelson 
Tested on: FreeBSD, OS X, Windows.

* subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
  (convert_rangelist): Make sure we call Py_DECREF on the object returned
   from convert_to_swigtype.  PyList_New might return NULL; check for that.
   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
   to suppress compiler warnings.

* subversion/bindings/swig/python/tests/mergeinfo.py:
  (get_svn_merge_range_t_objects): New helper method that returns a list of
   any 'svn_merge_range_t' objects being tracked by the garbage collector,
   which we use to detect memory leaks.
  (SubversionMergeinfoTestCase): Add two new tests to aid in detecting memory
   leaks.  They essentially test the same thing in two different ways; if one
   fails, the other *should* fail.  (Of course, if one fails and the other does
   not -- that's just as valuable, diagnostically, and points to an issue with
   the 'automatic pool memory management' logic.)
(test_mergeinfo_leakage__incorrect_range_t_refcounts): New test.  Verify
 svn_merge_range_t objects returned from svn_mergeinfo_parse() have the
 correct refcounts set.
(test_mergeinfo_leakage__lingering_range_t_objects_after_del): New test.
 Verify that there are no lingering svn_merge_range_t objects after we
 explicitly delete the results returned from svn_mergeinfo_parse().

]]

[[
Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
===
--- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
(revision 1234786)
+++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
(working copy)
@@ -653,15 +653,27 @@
 static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
 {
   int i;
+  int result;
+  PyObject *obj;
   PyObject *list;
-  apr_array_header_t *array = value;
+  apr_array_header_t *array = (apr_array_header_t *)value;
 
   list = PyList_New(0);
+  if (list == NULL)
+return NULL;
+
   for (i = 0; i < array->nelts; i++)
 {
   svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
-  if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
+
+  obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
+  if (obj == NULL)
 goto error;
+
+  result = PyList_Append(list, obj);
+  Py_DECREF(obj);
+  if (result == -1)
+goto error;
 }
   return list;
  error:
Index: subversion/bindings/swig/python/tests/mergeinfo.py
===
--- subversion/bindings/swig/python/tests/mergeinfo.py  (revision 1234786)
+++ subversion/bindings/swig/python/tests/mergeinfo.py  (working copy)
@@ -18,7 +18,7 @@
 # under the License.
 #
 #
-import unittest, os
+import unittest, os, sys, gc, itertools
 from svn import core, repos, fs
 import utils
 
@@ -29,6 +29,15 @@
 self.start = start
 self.end = end
 
+def get_svn_merge_range_t_objects():
+  """Returns a list 'svn_merge_range_t' objects being tracked by the
+ garbage collector, used for detecting memory leaks."""
+  return [
+o for o in gc.get_objects()
+  if hasattr(o, '__class__') and
+o.__class__.__name__ == 'svn_merge_range_t'
+  ]
+
 class SubversionMergeinfoTestCase(unitte

[PATCH] Improve wording of svn_delta_editor_t's absent_(file|directory) callback functions.

2012-01-10 Thread Trent Nelson
Howdy folks,

The current wording of svn_delta_editor_t's absent_(file|directory)
callback functions is a bit misleading:

"In the (file|directory) ... but cannot be conveyed to the
 consumer (perhaps because of authorization restrictions)."
  ^^^

Because of that wording, I just spent a chunk of time looking through
the code trying to find if these callbacks would be invoked for any
reason *other* than an authorization restriction.

I've concluded that no, they wouldn't :-)

(The only code path that triggers delta.c:absent_file_or_dir() is
 wrapped up in an authz test in delta.c:add_file_or_dir().)

Attached patch improves the wording so that there's no ambiguity
as to when absent_(file|directory) callbacks would be invoked.

Regards,

Trent.

[[[
Improve docstrings for svn_delta_editor_t's absent_(file|directory) callbacks.

* subversion/include/svn_delta.h
  (svn_delta_editor_t): the absent_(file|directory)
   callbacks will *only* be invoked due to authz
   restrictions; state this clearly in the docstrings.

Patch by: Trent Nelson 
]]]

Index: subversion/include/svn_delta.h
===
--- subversion/include/svn_delta.h  (revision 1229889)
+++ subversion/include/svn_delta.h  (working copy)
@@ -916,8 +916,8 @@

   /** In the directory represented by @a parent_baton, indicate that
* @a path is present as a subdirectory in the edit source, but
-   * cannot be conveyed to the edit consumer (perhaps because of
-   * authorization restrictions).
+   * cannot be conveyed to the edit consumer because of authorization
+   * restrictions.
*
* Any temporary allocations may be performed in @a scratch_pool.
*/
@@ -1040,8 +1040,8 @@

   /** In the directory represented by @a parent_baton, indicate that
* @a path is present as a file in the edit source, but cannot be
-   * conveyed to the edit consumer (perhaps because of authorization
-   * restrictions).
+   * conveyed to the edit consumer because of authorization
+   * restrictions.
*
* Any temporary allocations may be performed in @a scratch_pool.
*/




svn_delta.h.patch
Description: svn_delta.h.patch


Re: Identifying branch roots

2011-10-10 Thread Trent Nelson

On 07-Oct-11 6:59 AM, Julian Foad wrote:

On Fri, 2011-10-07 at 11:29 +0100, Julian Foad wrote:

Stefan Sperling wrote:

julianfoad wrote:

+/* This property marks a branch root. Branches with the same value of this
+ * property are mergeable. */
+#define SVN_PROP_BRANCHING_ROOT "svn:ignore" /* ### should be 
"svn:branching-root" */


Hi Stefan. Thanks for picking up on this.


I think your addition of a 'branch root' property is quite a significant
step. Is this really necessary in order to improve the output of
'svn mergeinfo' or do you have additional steps planned that go beyond
tuning output?


Both.  I think knowing whether the (requested) merge source and target
are branch roots (and indeed branches of the *same* "project" or tree)
is important for improving the output and diagnostics of "svn mergeinfo"
and "svn merge" commands.

It could of course enable other new behaviours relating to branches, and
I don't know what those are yet (apart from trivial UI things like
answering "is this a branch?").

So I'm working on the idea that it would be useful to have branch roots
identifiable by some mechanism, so I'll add "some mechanism" (currently
this property, but I'm totally open to a different mechanism such as
branch points being defined in a config file) and see what useful
behaviours I can come up with.


There has been some discussion about adding a property for this
and similar purposes in the past, see
http://svn.haxx.se/dev/archive-2009-09/0156.shtml
(there are probably more threads about this topic)


Yes, and it's time to figure out what we can usefully do with such
information and then we'll know exactly what branch configuration
information we need and what's a good way to store it.

I'll reply to the rest in a further email.

- Julian

Welp, I'm never going to get a better lead in than that, so, hi, folks!
Freelance SCM consultant here; used to specialise in ClearQuest, of all
things, but my last two gigs ended up revolving around Subversion.

Specifically, Subversion merges, in the enterprise, and the, uh, quirks
involved.  Each client had different requirements, and thus, the
solution I ended up delivering to each one differed a bit.  The first
solution was neat, and did all kinds of funky ClearQuest integration
and merge validation, but the second one is more applicable to this
discussion, so I'll describe that first.

In essence, it's a hook framework that attempts to enforce Subversion
best-practices by blocking* incoming commits if it detects one or more
of the following:

(*) Sometimes it'll block, but phrase the error message
along the lines of "if you *really* want to do this,
re-try your commit with the phrase 'CONFIRM MULTI-ROOT
RENAME' somewhere in your commit message".

TagCopied
TagRenamed
TagRemoved
TagModified
TagReplaced
TagSubtreeCopied
TagSubtreeRenamed
TagSubtreeRemoved
TagSubtreeModified
TagSubtreeReplaced
MultipleUnknownAndKnownRootsModified
MixedRootNamesInMultiRootCommit
MixedRootTypesInMultiRootCommit
SubversionRepositoryCheckedIn
MergeinfoAddedToRepositoryRoot
MergeinfoModifiedOnRepositoryRoot
SubtreeMergeinfoAdded
RootMergeinfoRemoved
DirectoryReplacedDuringMerge
EmptyMergeinfoCreated
TagDirectoryCreatedManually
BranchDirectoryCreatedManually
BranchRenamedToTrunk
TrunkRenamedToBranch
TrunkRenamedToTag
BranchRenamedToTag
BranchRenamedOutsideRootBaseDir
TagSubtreePathRemoved
RenameAffectsMultipleRoots
UncleanRenameAffectsMultipleRoots
MultipleRootsCopied
UncleanCopy
FileRemovedFromTag
CopyKnownRootSubtreeToValidAbsRootPath
MixedRootsNotClarifiedByExternals
CopyKnownRootToIncorrectlyNamedRootPath
CopyKnownRootSubtreeToIncorrectlyNamedRootPath
RenamedKnownRootToIncorrectlyNamedRootPath
MixedChangeTypesInMultiRootCommit
CopyKnownRootToKnownRootSubtree
UnknownPathCopiedToIncorrectlyNamedNewRootPath
RenamedKnownRootToKnownRootSubtree
FileUnchangedAndNoParentCopyOrRename
DirUnchangedAndNoParentCopyOrRename
EmptyChangeSet
CopyKnownRootToUnknownPath
CopyKnownRootSubtreeToInvalidRootPath
NewRootCreatedByRenamingUnknownPath
UnknownPathCopiedToKnownRootSubtree
NewRootCreatedByCopyingUnknownPath
PathCopiedFromOutsideRootDuringNonMerge
UnknownDirReplacedViaCopyDuringNonMerge
DirReplacedViaCopyDuringNonMerge
DirectoryReplacedDuringNonMerge
PreviousPathNotMatchedToPathsInMergeinfo
PreviousRevDiffersFromParentCopiedFromRev
PreviousPathDiffersFromParentCopiedFromPath
PreviousRevDiffersFromParentRenamedFromRev
PreviousPathDiffersFromParentRenamedFromPath
KnownRootPathReplacedViaCopy
BranchesDirShouldBeCreatedManuallyNotCopied
TagsDirShouldBeCreatedManuallyNotCopied
CopiedFromPathNotMatchedToPathsInMergeinfo
InvariantViolatedModifyContainsMismatchedPreviousPath
InvariantViolatedModifyC