date:20110615

Re: svn commit: r1136294 - in /subversion/trunk/subversion/libsvn_ra_serf: ra_serf.h util.c

2011-06-15 Thread Lieven Govaerts

Hi Greg,

On Thu, Jun 16, 2011 at 8:00 AM,   wrote:
> Author: gstein
> Date: Thu Jun 16 06:00:55 2011
> New Revision: 1136294
>
> URL: http://svn.apache.org/viewvc?rev=1136294&view=rev
> Log:
> Sketch out some structures for pausing the XML parsing of the "update"
> report. This will allow us to apply a limit to the outstanding number of
> requests queued into the serf context.
>
> No functional changes. This revision simply adds some structures that will
> be needed to perform the pausing.
>
> (here for review; more code to follow...)

Interesting patch.

>
> * subversion/libsvn_ra_serf/ra_serf.h:
>  (svn_ra_serf__xml_parser_t): add PAUSE_ALLOC, PAUSED, and PENDING
>    members to manage the parse-pause process.
>
> * subversion/libsvn_ra_serf/util.c:
>  (PARSE_CHUNK_SIZE, pending_buffer_t, svn_ra_serf__pendint_t,
pendinG_t.

>      SPILL_SIZE): new defines and structures.
>
> Modified:
>    subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
>    subversion/trunk/subversion/libsvn_ra_serf/util.c
>
> Modified: subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
> URL: 
> http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h?rev=1136294&r1=1136293&r2=1136294&view=diff
> ==
> --- subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h (original)
> +++ subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h Thu Jun 16 06:00:55 
> 2011
> @@ -590,6 +590,31 @@ struct svn_ra_serf__xml_parser_t {
>
>   /* If an error occurred, this value will be non-NULL. */
>   svn_error_t *error;
> +
> +  /* If "pausing" is to be allowed, then this must refer to the connection's
> +     bucket allocator. It will be used to hold content in memory, then
> +     return it once it has been consumed.
> +
> +     NOTE: if this field is NULL, then pausing is NOT allowed.  */
> +  serf_bucket_alloc_t *pause_alloc;
> +
> +  /* If pausing is allowed, then callbacks can set this value to pause
> +     the XML parsing. Note that additional elements may be parsed once
> +     this value is set (as the current buffer is consumed; no further
> +     buffers will be parsed until PAUSED is cleared).
> +
> +     At some point, the callbacks should clear this value. The main
> +     serf_context_run() loop will notice the change and resume delivery
> +     of content to the XML parser.  */
> +  svn_boolean_t paused;

If delivery is resumed from say disk, it might be 'a while' before all
cached data is consumed and svn is reading from the socket again. To
avoid dropped connections, we might design it so that even while the
XML parser is not paused, svn still continues to read from the socket
and adds the data to the memory/disk cache.

> +
> +  /* While the XML parser is paused, content arriving from the server
> +     must be saved locally. We cannot stop reading, or the server may
> +     decide to drop the connection. The content will be stored in memory
> +     up to a certain limit, and will then be spilled over to disk.
> +
Yes, this is the idea. I have been thinking of putting this
functionality in a bucket, e.g. a cache_bucket. Besides the callback
it fits the bucket model quite well.

The benefits of such a bucket are improved locality of the code, but
also that you can put this cache_bucket in the chain before the
deflate_bucket, so that serf can stored the incoming data compressed
into memory, possibly not even requiring disk access.

> +     See libsvn_ra_serf/util.c  */
> +  struct svn_ra_serf__pending_t *pending;
>  };
>
>  /*
>
> Modified: subversion/trunk/subversion/libsvn_ra_serf/util.c
> URL: 
> http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/util.c?rev=1136294&r1=1136293&r2=1136294&view=diff
> ==
> --- subversion/trunk/subversion/libsvn_ra_serf/util.c (original)
> +++ subversion/trunk/subversion/libsvn_ra_serf/util.c Thu Jun 16 06:00:55 2011
> @@ -52,6 +52,45 @@
>  #define XML_STATUS_ERROR 0
>  #endif
>
> +
> +#define PARSE_CHUNK_SIZE 8000
> +
> +/* As chunks of content arrive from the server, and we need to hold them
> +   in memory (because the XML parser is paused), they are copied into
> +   these buffers. The buffers are arranged into a linked list.  */
> +struct pending_buffer_t {
> +  apr_size_t size;
> +  char data[PARSE_CHUNK_SIZE];
> +  struct pending_memnode_t *next;
> +};

Why not store this in an aggregate bucket? It has to be wrapped in a
bucket anyway to return it to the caller. You can still calculate the
size of the current buffer outside the aggregate bucket.

> +
> +
> +/* This structure records pending data for the parser in memory blocks,
> +   and possibly into a temporary file if "too much" content arrives.  */
> +struct svn_ra_serf__pending_t {
> +  /* The amount of content in memory.  */
> +  apr_size_t memory_size;
> +
> +  /* HEAD points to the first block of the linked list of buffers.
> +     TAIL points

Re: diff wish

2011-06-15 Thread Daniel Shahaf

Branko Čibej wrote on Wed, Jun 15, 2011 at 14:44:45 +0200:
> On 15.06.2011 14:11, Johan Corveleyn wrote:
> >> If you have a different definition of "mis-synchronizes", please explain.
> > No, I don't mean a broken diff. The diff should at all times be
> > *correct*. That was indeed never questioned.
> >
> > I mean something like the example Neels gave with his initial approach
> > for avoid the mis-matching empty line problem. With the naive
> > solution, he gave an example of where it's not nice:
> [...]
> 
> But when would the current "minimal" diff be preferable to the nicest,
> albeit not minimal, diff we can produce? After all, the fix and/or
> patience diff result is not only nicer to look at, it also gives better
> results for blame, which is the other big diff consumer. Likewise, it'll

Why doesn't 'svn blame' take --diff-cmd then?

> give better locality for resolving merge conflicts. That's why I don't
> understand why Subversion, specifically, would need a --minimal option.
> 
> GNU diff is a general-purpose diff tool. 'svn diff' is not.
> 
> -- Brane
>

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Daniel Shahaf

Brian Neal wrote on Wed, Jun 15, 2011 at 14:22:52 -0500:
> I probably not would have been able to write the initial patch without
> it. Once I had all the pre-requisite build tools installed in Ubuntu,

apt-get build-dep subversion ?

> all I had to do was type "make" and watch it go to town. It was
> amazing. The only thing that kind of tripped me up was that I am using
> Ubuntu 10.04, which uses svn 1.6 to checkout the working copy of svn.
> So at the end, when it does a svnversion I got some strange error
> about my working copy being too old. But after googling I convinced
> myself I could ignore this error. So yes, thank for that Makefile,

I've already seen a fix fly by on commits@ for this :)

> it's very useful in lowering the barrier to contributing.
> 
> Best regards,
> BN

Re: API change - make svn_tristate_to_word/_from_word() private?

2011-06-15 Thread Daniel Shahaf

Stefan Fuhrmann wrote on Wed, Jun 15, 2011 at 21:29:32 +0200:
> On 15.06.2011 16:45, Daniel Shahaf wrote:
> >>   * svnserve:main.c which uses it simply to detect whether a
> >>command-line option is "true"/"yes"/"on", and doesn't actually care
> >>about the tri-state-ness.
> >>
> Well, not specifying the parameter is equivalent to the 3rd state.

The code only cares whether the parameter was specified-and-"yes" or no.
It doesn't behave in three different ways for the three different values
of the parameter.

> But this check actually prompted me *not* to duplicate the
> comparison code again but to factor it out to some common
> utility function. Even 3rd party coders might find it helpful.
> >Why does that svnserve option take a string argument?  If it's just
> >a boolean why not do
> >
> >-{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 1,
> >+{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 0,
> >
> >?
> Because the default can be "true" / "on" and that default
> may change in later versions.

I hadn't considered that.  Fair enough, assuming the documentation
agrees (ie, we don't promise what the default is).

> Furthermore, both options should have the same UI.

+1

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Daniel Shahaf

Stefan Sperling wrote on Wed, Jun 15, 2011 at 19:54:48 +0200:
> On Wed, Jun 15, 2011 at 06:28:28PM +0100, Philip Martin wrote:
> > Stefan Sperling  writes:
> > >> Is cmp as portable as diff?  Is it always available with the same
> > >> behaviour?  autoconf generates calls to diff, so we know that using diff
> > >> will work anywhere that autoconf works.  As far as I can tell autoconf
> > >> doesn't use cmp.
> > >
> > >>From the cmp man page:
> > >
> > > HISTORY
> > >  "A cmp command appeared in Version 1 AT&T UNIX."
> > >
> > > I would say that's ancient enough to be supported anywhere :)
> > 
> > Do they all support -s?
> 
> Very likely. The -s option was present in Unix 7th Edition:
> http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/cmp.c
> It was likely also in 4.4BSD or earlier. NetBSD had the option in 1995.

And Peter observed on IRC that even cmp's that don't support -s will
still DTRT.  (eg, if they error out, the non-zero exit code would
signify a no-op file replace)

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 16:46, Mark Phippard  wrote:
>...
> I want to reiterate my objections to further delaying our release
> process with these interim releases.

In my mind, I was just thinking of a couple weeks with a label
("beta") that people might actually try out.

>...
> I do not think anyone would disagree that if we wait another 2-3
> months then even more bugs would be found and fixed.  That does not
> mean that is the right product release strategy as that logic never
> ends.  There are always going to be more bugs, and as people get
> impatient and start dumping more new features into the release we
> might even be creating more new bugs in the process.

I hear what you're saying. In short, I might rephrase as "we got
nothing blocking us, so let's ship 1.7.0, and be ready for 1.7.1 in
three days." :-)

Cheers,
-g

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Mark Phippard

On Wed, Jun 15, 2011 at 4:19 PM, Hyrum K Wright  wrote:
>> ps. and if you *don't* think it is good enough for 1.7.0, then it sure
>> isn't an RC1. if we roll RC*, then we can't say "oh, but it isn't
>> final. we'll have bugs to fix." that isn't an RC. seen too much of
>> that nonsense in the past...
>
> Completely agree.  A release candidate is a *candidate* for release,
> and if we wouldn't be willing to release it as the final, it's got no
> business being an RC.

Completely agree.

> After we branch, we can take the temperature of
> the community, and determine which of a beta or rc is warranted.  (I
> suspect it will be a beta.)

I want to reiterate my objections to further delaying our release
process with these interim releases.

http://svn.haxx.se/dev/archive-2011-05/0143.shtml

If we want to have beta programs great.  Then we should step up and
advertise them and actively solicit the feedback we want to receive.
Set up polls/surveys whatever, so we know what people tested.  I do
not think we get any value from tossing an alpha/beta over the wall
and then sitting around for a couple weeks and repeating the same
conversations.

1.7 does not introduce a lot of new user facing features so our
existing test framework probably provides us with more value than it
does when we introduce major new features and might not have all the
tests in place that we need.  I am sure the final release, whenever it
happens, will include a few items we wish we will have caught, but
that does not mean we should not release.  We need people to step
forward and say "this is good, let's release" or at least say why we
are not ready.

We have aggressively asked for blockers to get put in the tracker.
People responded and the blockers are getting closed.  I am totally
fine with putting out another call for blockers, but if no one comes
up with any, I do not think the answer is to issue a beta.  We should
issue a release candidate.

I do not think anyone would disagree that if we wait another 2-3
months then even more bugs would be found and fixed.  That does not
mean that is the right product release strategy as that logic never
ends.  There are always going to be more bugs, and as people get
impatient and start dumping more new features into the release we
might even be creating more new bugs in the process.

I think the release is looking good.  When the known blockers are
resolved, I think we are ready to release.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Hyrum K Wright

On Wed, Jun 15, 2011 at 8:10 PM, Greg Stein  wrote:
> On Wed, Jun 15, 2011 at 15:08, Stefan Sperling  wrote:
>> On Wed, Jun 15, 2011 at 02:33:04PM -0400, Mark Phippard wrote:
>>> Given that the most common distinction between alpha and beta is
>>> "feature complete" I have been arguing all along that the existing
>>> "alpha" release should have been labelled "beta".  I would be +1 on
>>> changing that for the next release before we branch.
>>
>> There have been new (small) features and APIs added since alpha1,
>> e.g. mime-type detection with libmagic.
>>
>> Once we've branched we'll start the review process for backports,
>> which will put a definite end to new functionality being added.
>> I'd say just call the first release from the branch "beta" if there are
>> known blockers, and call it "RC1" if there are no known blockers.
>> Until we branch, we should keep calling them "alpha".
>
> I would still like us to consider a "beta" release, regardless. I
> would be very uncomfortable moving from alpha straight to RC1. I just
> don't have the confidence that the codebase is a 1.7.0 release at the
> point we branch.

That's understandable.

> ps. and if you *don't* think it is good enough for 1.7.0, then it sure
> isn't an RC1. if we roll RC*, then we can't say "oh, but it isn't
> final. we'll have bugs to fix." that isn't an RC. seen too much of
> that nonsense in the past...

Completely agree.  A release candidate is a *candidate* for release,
and if we wouldn't be willing to release it as the final, it's got no
business being an RC.  After we branch, we can take the temperature of
the community, and determine which of a beta or rc is warranted.  (I
suspect it will be a beta.)

-Hyrum

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 15:08, Stefan Sperling  wrote:
> On Wed, Jun 15, 2011 at 02:33:04PM -0400, Mark Phippard wrote:
>> Given that the most common distinction between alpha and beta is
>> "feature complete" I have been arguing all along that the existing
>> "alpha" release should have been labelled "beta".  I would be +1 on
>> changing that for the next release before we branch.
>
> There have been new (small) features and APIs added since alpha1,
> e.g. mime-type detection with libmagic.
>
> Once we've branched we'll start the review process for backports,
> which will put a definite end to new functionality being added.
> I'd say just call the first release from the branch "beta" if there are
> known blockers, and call it "RC1" if there are no known blockers.
> Until we branch, we should keep calling them "alpha".

I would still like us to consider a "beta" release, regardless. I
would be very uncomfortable moving from alpha straight to RC1. I just
don't have the confidence that the codebase is a 1.7.0 release at the
point we branch.

Cheers,
-g

ps. and if you *don't* think it is good enough for 1.7.0, then it sure
isn't an RC1. if we roll RC*, then we can't say "oh, but it isn't
final. we'll have bugs to fix." that isn't an RC. seen too much of
that nonsense in the past...

Re: API change - make svn_tristate_to_word/_from_word() private?

2011-06-15 Thread Stefan Fuhrmann


On 15.06.2011 16:45, Daniel Shahaf wrote:

[ bcc stefan2 ]

Julian Foad wrote on Wed, Jun 15, 2011 at 15:14:53 +0100:

New in 1.7 in svn_types.h:
   svn_tristate_t
   svn_tristate_to_word()  # yields "true"/"false"/NULL
   svn_tristate_from_word()  # takes "true"/"yes"/"on"/.../

The _to/from_word() functions don't seem canonically purposed: some
users want a single representation of the values, for use in on-the-wire
protocols and XML output, while other users want a more flexible,
human-readable interpretation that accepts multiple different words
("true" = "on" = "yes").

Well, these look like reasonable conversion utilities to me.
They simply don't make assumptions about the context that
they might be used in.

However, one might argue whether they need to be public.
Before making them private, we need to extend svn_cmdline
to do the conversion. Otherwise, a svn tool (svnserve in that
case) would need to call private API.

The doc string of _from_word() is wrong: it says any input other than
"true" and "false" is unknown.

Yep. Forgot to update that.

The current users are:

   * The main use is in svn_log_changed_path2_t.text_modified
and .props_modified.  For these, we use _to_word() in mod_dav_svn to
write a protocol string (assumes the value is not svn_tristate_unknown),
use _from_word() in ra_neon and ra_serf to interpret a protocol string,
and use _to_word() in "svn log --xml" to output an XML attribute, using
an undocumented feature of svn_xml_make_open_tag() to allow the
attribute string to be NULL.  All of these need only "true"/"false", not
yes/no/off/on.

But they won't be hurt by allowing those alternative values.

   * implementation of svn_config_get_bool() and svn_hash_get_bool(),
which uses _tristate_from_word() to flexibly read "on"/"off"/"yes"/etc.

The is where the more flexible implementation comes from.
It looked a little duplicated.

   * svnserve:main.c which uses it simply to detect whether a
command-line option is "true"/"yes"/"on", and doesn't actually care
about the tri-state-ness.


Well, not specifying the parameter is equivalent to the 3rd state.
But this check actually prompted me *not* to duplicate the
comparison code again but to factor it out to some common
utility function. Even 3rd party coders might find it helpful.

Why does that svnserve option take a string argument?  If it's just
a boolean why not do

-{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 1,
+{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 0,

?

Because the default can be "true" / "on" and that default
may change in later versions. Furthermore, both options
should have the same UI.

At least, and at first, we should remove the second and third usages
listed above.

And replace them by what?

   Those usages are more akin to "svn_cmdline" or
"svn_config" utilities, or could well be private.

Further, I consider moving/renaming these two functions to be a bit more
private even for their main use in svn_log_changed_path2_t.
Fair enough.  I don't see that any of the above uses require third
parties API consumers to parse tristates (from string to enum), so no
objection to making _from_word() private.

Making them private would be fine with me but that requires
svn_cmdline to be extended with some equivalent public function.

-- Stefan^2.

Re: API change - rename svn_hash_get_cstring/bool() to svn_config_hash_get_*()

2011-06-15 Thread Stefan Fuhrmann


On 15.06.2011 15:29, Julian Foad wrote:

A heads-up: I'm renaming svn_hash_get_bool() to
svn_config_hash_get_bool(), and svn_hash_get_cstring() to
svn_config_hash_get(), moving them from svn_hash.h to svn_config.h.
They are being used only for config hashes, and don't seem like
general-purpose hash functions.  In particular, _get_bool() looks for a
variety of human-oriented boolean-like strings (on, off, yes, no, etc.).

Note that we already have svn_config_get_bool() and svn_config_get(),
which are similar but operate on a structured config file object,
whereas the functions I'm renaming operate on a simple hash.

The reason why I explicitly didn't follow that direct in the first place
is that svn_hash is very different from svn_config. Moving the
get_bool function to svn_config would basically make it an orphan:
There is no other hash-related function in there let alone a set of
functions that take hashes as kind of an alternative to svn_config
structures.

So, I'm -0 on the rename and -0 on the move.

-- Stefan^2.


Current APIs:
[[[
# In svn_hash.h:

/** Find the value of a @a key in @a hash, return the value.
  *
  * If @a hash is @c NULL or if the @a key cannot be found, the
  * @a default_value will be returned.
  *
  * @since New in 1.7.
  */
const char *
svn_hash_get_cstring(apr_hash_t *hash,
  const char *key,
  const char *default_value);

/** Like svn_hash_get_cstring(), but for boolean values.
  *
  * Parses the value as a boolean value. The recognized representations
  * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
  * matter.
  *
  * @since New in 1.7.
  */
svn_boolean_t
svn_hash_get_bool(apr_hash_t *hash,
   const char *key,
   svn_boolean_t default_value);

# In svn_config.h:

/** Find the value of a (@a section, @a option) pair in @a cfg, set @a
  * *valuep to the value.
  *
  * If @a cfg is @c NULL, just sets @a *valuep to @a default_value. If
  * the value does not exist, expand and return @a default_value. @a
  * default_value can be NULL.
  *
  * The returned value will be valid at least until the next call to
  * svn_config_get(), or for the lifetime of @a default_value. It is
  * safest to consume the returned value immediately.
  *
  * This function may change @a cfg by expanding option values.
  */
void
svn_config_get(svn_config_t *cfg,
const char **valuep,
const char *section,
const char *option,
const char *default_value);

/** Like svn_config_get(), but for boolean values.
  *
  * Parses the option as a boolean value. The recognized representations
  * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
  * matter. Returns an error if the option doesn't contain a known string.
  */
svn_error_t *
svn_config_get_bool(svn_config_t *cfg,
 svn_boolean_t *valuep,
 const char *section,
 const char *option,
 svn_boolean_t default_value);
]]]

Let me know if you have any concerns.  As I don't anticipate any, I'll
probably make the change before waiting for replies.

- Julian

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Brian Neal

Hi Stefan -

On Wed, Jun 15, 2011 at 9:23 AM, Stefan Sperling  wrote:
>
> Committed in r1136063.

Thank you!

>
>> > Also, did you already run the regression tests with your patch ("make
>> > check")? My suggestion might affect the output of 'svn' so tests would
>> > need to be run again (but I will run them either way before committing).
>>
>> I did not.
>
> They all pass.
>

If you think we need to add a test for this bug maybe I could work one
up. Although at first glance this appears a bit daunting (to me).

>> BTW, I really have to hand it to you guys for putting together some
>> nice online docs that explains your development process and the
>> awesome Makefile.svn which allows a new developer to get started with
>> the code so quickly.
>
> Glad to hear that Makefile.svn is useful to others, too.
> I originally used it as my local build script and Hyrum suggested
> to put in our repo in case someone finds it useful. Now there we go :)
>

I probably not would have been able to write the initial patch without
it. Once I had all the pre-requisite build tools installed in Ubuntu,
all I had to do was type "make" and watch it go to town. It was
amazing. The only thing that kind of tripped me up was that I am using
Ubuntu 10.04, which uses svn 1.6 to checkout the working copy of svn.
So at the end, when it does a svnversion I got some strange error
about my working copy being too old. But after googling I convinced
myself I could ignore this error. So yes, thank for that Makefile,
it's very useful in lowering the barrier to contributing.

Best regards,
BN

Re: [Issue 3922] New - pass client cert as stream to auth provider

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 11:55:24AM -0700, stevek...@tigris.org wrote:
> http://subversion.tigris.org/issues/show_bug.cgi?id=3922
>  Issue #|3922
>  Summary|pass client cert as stream to auth provider
>Component|subversion
>  Version|trunk
> Platform|PC
>  URL|
>   OS/Version|All
>   Status|NEW
>Status whiteboard|
> Keywords|
>   Resolution|
>   Issue type|ENHANCEMENT
> Priority|P3
> Subcomponent|libsvn_subr
>  Assigned to|issues@subversion
>  Reported by|steveking 
> 
> --- Additional comments from stevek...@tigris.org Wed Jun 15 11:55:23 
> -0700 2011 ---

[...]

> I can't set a target milestone, but I think this should be considered post 1.7

I find it quite annoying that it's not possible to set a target
milestone while filing a new issue. It is only possible after the
issue has been filed...

I've added the flurry of issues you've filed to various milestones.
I used my own judgement for this. If someone disagrees feel free
to adjust the milestones again.

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 03:13 PM, Stefan Küng wrote:
> On 14.06.2011 22:22, C. Michael Pilato wrote:
> 
>> If anybody (ahem, Stefan Küng) is sitting on API improvements or
>> requirements tracked solely in their heads, it's time to put them into the
>> tracker.  Please do not delay.
> 
> Point taken :)
> 
> Filed the following issues:
> http://subversion.tigris.org/issues/show_bug.cgi?id=3922
> http://subversion.tigris.org/issues/show_bug.cgi?id=3923
> http://subversion.tigris.org/issues/show_bug.cgi?id=3924
> http://subversion.tigris.org/issues/show_bug.cgi?id=3925
> http://subversion.tigris.org/issues/show_bug.cgi?id=3926
> http://subversion.tigris.org/issues/show_bug.cgi?id=3927
> http://subversion.tigris.org/issues/show_bug.cgi?id=3928

You.  Are.  Awesome.  Thanks, Stefan!

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Stefan Küng


On 14.06.2011 22:22, C. Michael Pilato wrote:


If anybody (ahem, Stefan Küng) is sitting on API improvements or
requirements tracked solely in their heads, it's time to put them into the
tracker.  Please do not delay.


Point taken :)

Filed the following issues:
http://subversion.tigris.org/issues/show_bug.cgi?id=3922
http://subversion.tigris.org/issues/show_bug.cgi?id=3923
http://subversion.tigris.org/issues/show_bug.cgi?id=3924
http://subversion.tigris.org/issues/show_bug.cgi?id=3925
http://subversion.tigris.org/issues/show_bug.cgi?id=3926
http://subversion.tigris.org/issues/show_bug.cgi?id=3927
http://subversion.tigris.org/issues/show_bug.cgi?id=3928

Maybe issue #3924 should be considered for 1.7 and the API review? But 
I'm happy with the current situation as long as the current svn_wc_* API 
stays public.


Issue #3928 would help with performance and reduce a lot of stat() calls 
in TSVN. But I have a workaround that works so it's not too big of a deal.


Stefan

--
   ___
  oo  // \\  "De Chelonian Mobile"
 (_,\/ \_/ \ TortoiseSVN
   \ \_/_\_/>The coolest Interface to (Sub)Version Control
   /_/   \_\ http://tortoisesvn.net

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 02:33:04PM -0400, Mark Phippard wrote:
> Given that the most common distinction between alpha and beta is
> "feature complete" I have been arguing all along that the existing
> "alpha" release should have been labelled "beta".  I would be +1 on
> changing that for the next release before we branch.

There have been new (small) features and APIs added since alpha1,
e.g. mime-type detection with libmagic.

Once we've branched we'll start the review process for backports,
which will put a definite end to new functionality being added.
I'd say just call the first release from the branch "beta" if there are
known blockers, and call it "RC1" if there are no known blockers.
Until we branch, we should keep calling them "alpha".

Re: svn propchange: r1136137 - svn:log

2011-06-15 Thread Hyrum K Wright

On Wed, Jun 15, 2011 at 6:36 PM,   wrote:
> Author: philip
> Revision: 1136137
> Modified property: svn:log
>
> Modified: svn:log at Wed Jun 15 18:36:01 2011
> --
> --- svn:log (original)
> +++ svn:log Wed Jun 15 18:36:01 2011
> @@ -2,7 +2,7 @@ Revert part of r1136038 as it is unneces
>  SVN_DEBUG_WORK_QUEUE code.
>
>  * subversion/libsvn_wc/workqueue.c
> -  (run_file_copy_translated): Revert allowing source to be missing.
> +  (run_file_copy_translated): Do not allowing source to be missing.

"Do not allowing" ?  ;)

-Hyrum

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 02:23 PM, Greg Stein wrote:
> I dunno what kind of review the alpha is getting. Moving it to "beta"
> would get more users. Of course, we still wouldn't know what kind of
> review it is getting, but I would say "more" :-)

Yeah, I rather suspect that the alphas serve exactly one purpose -- getting
the release process kinks worked out.  (Which is valuable -- don't get me
wrong!)  I'd be completely shocked to learn that users are taking it for a
spin on real data.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

signature.asc
Description: OpenPGP digital signature

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Mark Phippard

On Wed, Jun 15, 2011 at 2:23 PM, Greg Stein  wrote:
> On Wed, Jun 15, 2011 at 14:18, Hyrum K Wright  wrote:
>> On Wed, Jun 15, 2011 at 4:40 PM, C. Michael Pilato  
>> wrote:
>>> On 06/15/2011 12:36 PM, Greg Stein wrote:
 I would also request that should the serf issues be resolved during
 our stabilization period, that we ship with the default as serf. ie.
 don't toss it because it may miss the branchpoint by a couple days
 (trying to avoid that, tho!)
>>>
>>> Sure, no sweat.  It's not really until we're close to rolling RC1 that we
>>> need to make the go/no-go call on Serf as the default.
>>
>> I don't envision at lot of time between the branch and the first
>> release candidate for any serf stabilization to occur.  I suppose we
>> could roll a beta shortly after the branch, and then start the RC
>> train a couple weeks after that, though.
>
> I was assuming that we would have a beta.
>
> I dunno what kind of review the alpha is getting. Moving it to "beta"
> would get more users. Of course, we still wouldn't know what kind of
> review it is getting, but I would say "more" :-)

Given that the most common distinction between alpha and beta is
"feature complete" I have been arguing all along that the existing
"alpha" release should have been labelled "beta".  I would be +1 on
changing that for the next release before we branch.

FWIW, while I want to see is move towards RC, I am OK with waiting for
Serf fixes as long as someone will standup for each bug and say they
have the time to fix it and perhaps give a rough estimate of how long
we should wait.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 14:18, Hyrum K Wright  wrote:
> On Wed, Jun 15, 2011 at 4:40 PM, C. Michael Pilato  
> wrote:
>> On 06/15/2011 12:36 PM, Greg Stein wrote:
>>> I would also request that should the serf issues be resolved during
>>> our stabilization period, that we ship with the default as serf. ie.
>>> don't toss it because it may miss the branchpoint by a couple days
>>> (trying to avoid that, tho!)
>>
>> Sure, no sweat.  It's not really until we're close to rolling RC1 that we
>> need to make the go/no-go call on Serf as the default.
>
> I don't envision at lot of time between the branch and the first
> release candidate for any serf stabilization to occur.  I suppose we
> could roll a beta shortly after the branch, and then start the RC
> train a couple weeks after that, though.

I was assuming that we would have a beta.

I dunno what kind of review the alpha is getting. Moving it to "beta"
would get more users. Of course, we still wouldn't know what kind of
review it is getting, but I would say "more" :-)

Cheers,
-g

Re: svn commit: r1136137 - /subversion/trunk/subversion/libsvn_wc/workqueue.c

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 14:12,   wrote:
> Author: philip
> Date: Wed Jun 15 18:12:20 2011
> New Revision: 1136137
>
> URL: http://svn.apache.org/viewvc?rev=1136137&view=rev
> Log:
> Revert part of r1136038 as it is unnecessary.  Add some
> SVN_DEBUG_WORK_QUEUE code.
>
> * subversion/libsvn_wc/workqueue.c
>  (run_file_copy_translated): Revert allowing source to be missing.

That sentence is awkward. "Revert (allowing...)" ... the change here
is to "Do not allow the source to be missing."

>...

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Hyrum K Wright

On Wed, Jun 15, 2011 at 4:40 PM, C. Michael Pilato  wrote:
> On 06/15/2011 12:36 PM, Greg Stein wrote:
>> I would also request that should the serf issues be resolved during
>> our stabilization period, that we ship with the default as serf. ie.
>> don't toss it because it may miss the branchpoint by a couple days
>> (trying to avoid that, tho!)
>
> Sure, no sweat.  It's not really until we're close to rolling RC1 that we
> need to make the go/no-go call on Serf as the default.

I don't envision at lot of time between the branch and the first
release candidate for any serf stabilization to occur.  I suppose we
could roll a beta shortly after the branch, and then start the RC
train a couple weeks after that, though.

-Hyrum

Re: svn commit: r1136101 - /subversion/trunk/notes/api-changes-1.7.txt

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 01:55 PM, Greg Stein wrote:
> Yeah. I think we embedded content within XML using the quoprint style
> when possible. It made them easier to read. ... but then we switched
> to base64 or somesuch. I mean really... who needs to read this stuff?
> :-P
> 
> +1 to full deprecation.

I thought we added it when we started that skunkworks ra_smtp library?


-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

Re: svn commit: r1136101 - /subversion/trunk/notes/api-changes-1.7.txt

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 13:14, Bert Huijben  wrote:
>
>
>> -Original Message-
>> From: julianf...@apache.org [mailto:julianf...@apache.org]
>> Sent: woensdag 15 juni 2011 18:07
>> To: comm...@subversion.apache.org
>> Subject: svn commit: r1136101 - /subversion/trunk/notes/api-changes-
>> 1.7.txt
>>
>> Author: julianfoad
>> Date: Wed Jun 15 16:06:44 2011
>> New Revision: 1136101
>>
>> URL: http://svn.apache.org/viewvc?rev=1136101&view=rev
>> Log:
>> * notes/api-changes-1.7.txt: Update.
>>
>> Modified:
>>     subversion/trunk/notes/api-changes-1.7.txt
>>
>> Modified: subversion/trunk/notes/api-changes-1.7.txt
>> URL: http://svn.apache.org/viewvc/subversion/trunk/notes/api-changes-
>> 1.7.txt?rev=1136101&r1=1136100&r2=1136101&view=diff
>> ==
>> 
>> --- subversion/trunk/notes/api-changes-1.7.txt (original)
>> +++ subversion/trunk/notes/api-changes-1.7.txt Wed Jun 15 16:06:44 2011
>> @@ -12,49 +12,49 @@ Observations are marked with "#".
>>
> 
>
>> -  r1134000 svn_quoprint.h - no change
>
> I think we can deprecate all functions in this file.
> I didn't know this file and a quick check shows that we didn't use these 
> functions at all since a very long time ago (probably somewhere far before 
> 1.0).
>
> The only user of this header is subversion\tests\libsvn_delta\svndiff-test.c, 
> and then only if QUOPRINT_SVNDIFFS is defined.

Yeah. I think we embedded content within XML using the quoprint style
when possible. It made them easier to read. ... but then we switched
to base64 or somesuch. I mean really... who needs to read this stuff?
:-P

+1 to full deprecation.

Cheers,
-g

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 06:28:28PM +0100, Philip Martin wrote:
> Stefan Sperling  writes:
> >> Is cmp as portable as diff?  Is it always available with the same
> >> behaviour?  autoconf generates calls to diff, so we know that using diff
> >> will work anywhere that autoconf works.  As far as I can tell autoconf
> >> doesn't use cmp.
> >
> >>From the cmp man page:
> >
> > HISTORY
> >  "A cmp command appeared in Version 1 AT&T UNIX."
> >
> > I would say that's ancient enough to be supported anywhere :)
> 
> Do they all support -s?

Very likely. The -s option was present in Unix 7th Edition:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/cmp.c
It was likely also in 4.4BSD or earlier. NetBSD had the option in 1995.

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Greg Hudson

On Wed, 2011-06-15 at 13:28 -0400, Philip Martin wrote:
> Do they all support -s?

cmp -s is one of the most portable Unix command invocations in existence
(from general knowledge; I can't give a reference).

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Philip Martin

Stefan Sperling  writes:

> On Wed, Jun 15, 2011 at 06:11:31PM +0100, Philip Martin wrote:
>> pet...@apache.org writes:
>> 
>> > Author: peters
>> > Date: Wed Jun 15 16:43:24 2011
>> > New Revision: 1136114
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1136114&view=rev
>> > Log:
>> > * configure.ac: Followup r1134219: Use 'cmp -s' instead of 'diff' to
>> >detect changes to svn_private_config.h.  Take advantage of the fact
>> >that cmp -s produces no output even if a file does not exist.
>> >Use a temp variable to shorten the lines further.
>> 
>> Is cmp as portable as diff?  Is it always available with the same
>> behaviour?  autoconf generates calls to diff, so we know that using diff
>> will work anywhere that autoconf works.  As far as I can tell autoconf
>> doesn't use cmp.
>
>>From the cmp man page:
>
> HISTORY
>  "A cmp command appeared in Version 1 AT&T UNIX."
>
> I would say that's ancient enough to be supported anywhere :)

Do they all support -s?

> Oh, and it also conforms to POSIX.

configure is supposed to work on anything that is "good enough".  I
don't understand why we would choose to use cmp here, when diff is used
in the rest of the script.

-- 
Philip

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 06:11:31PM +0100, Philip Martin wrote:
> pet...@apache.org writes:
> 
> > Author: peters
> > Date: Wed Jun 15 16:43:24 2011
> > New Revision: 1136114
> >
> > URL: http://svn.apache.org/viewvc?rev=1136114&view=rev
> > Log:
> > * configure.ac: Followup r1134219: Use 'cmp -s' instead of 'diff' to
> >detect changes to svn_private_config.h.  Take advantage of the fact
> >that cmp -s produces no output even if a file does not exist.
> >Use a temp variable to shorten the lines further.
> 
> Is cmp as portable as diff?  Is it always available with the same
> behaviour?  autoconf generates calls to diff, so we know that using diff
> will work anywhere that autoconf works.  As far as I can tell autoconf
> doesn't use cmp.

>From the cmp man page:

HISTORY
 "A cmp command appeared in Version 1 AT&T UNIX."

I would say that's ancient enough to be supported anywhere :)

Oh, and it also conforms to POSIX.

RE: svn commit: r1136101 - /subversion/trunk/notes/api-changes-1.7.txt

2011-06-15 Thread Bert Huijben



> -Original Message-
> From: julianf...@apache.org [mailto:julianf...@apache.org]
> Sent: woensdag 15 juni 2011 18:07
> To: comm...@subversion.apache.org
> Subject: svn commit: r1136101 - /subversion/trunk/notes/api-changes-
> 1.7.txt
> 
> Author: julianfoad
> Date: Wed Jun 15 16:06:44 2011
> New Revision: 1136101
> 
> URL: http://svn.apache.org/viewvc?rev=1136101&view=rev
> Log:
> * notes/api-changes-1.7.txt: Update.
> 
> Modified:
> subversion/trunk/notes/api-changes-1.7.txt
> 
> Modified: subversion/trunk/notes/api-changes-1.7.txt
> URL: http://svn.apache.org/viewvc/subversion/trunk/notes/api-changes-
> 1.7.txt?rev=1136101&r1=1136100&r2=1136101&view=diff
> ==
> 
> --- subversion/trunk/notes/api-changes-1.7.txt (original)
> +++ subversion/trunk/notes/api-changes-1.7.txt Wed Jun 15 16:06:44 2011
> @@ -12,49 +12,49 @@ Observations are marked with "#".
> 


> -  r1134000 svn_quoprint.h - no change

I think we can deprecate all functions in this file. 
I didn't know this file and a quick check shows that we didn't use these 
functions at all since a very long time ago (probably somewhere far before 1.0).

The only user of this header is subversion\tests\libsvn_delta\svndiff-test.c, 
and then only if QUOPRINT_SVNDIFFS is defined.

Bert

Re: svn commit: r1136114 - /subversion/trunk/configure.ac

2011-06-15 Thread Philip Martin

pet...@apache.org writes:

> Author: peters
> Date: Wed Jun 15 16:43:24 2011
> New Revision: 1136114
>
> URL: http://svn.apache.org/viewvc?rev=1136114&view=rev
> Log:
> * configure.ac: Followup r1134219: Use 'cmp -s' instead of 'diff' to
>detect changes to svn_private_config.h.  Take advantage of the fact
>that cmp -s produces no output even if a file does not exist.
>Use a temp variable to shorten the lines further.

Is cmp as portable as diff?  Is it always available with the same
behaviour?  autoconf generates calls to diff, so we know that using diff
will work anywhere that autoconf works.  As far as I can tell autoconf
doesn't use cmp.

-- 
Philip

Re: diff wish

2011-06-15 Thread Johan Corveleyn

On Wed, Jun 15, 2011 at 5:41 PM, Greg Hudson  wrote:
> On Wed, 2011-06-15 at 11:30 -0400, Johan Corveleyn wrote:
>> Okay, I guess we should then also rip out --ignore-space-change and
>> --ignore-eol-style, and perhaps --show-c-function. Or, if it's
>> preferred that ignore-space-change and ignore-eol-style be used by
>> default ("because humans are normally not interested in changes in
>> amount of whitespace"), we should use those options by default, and
>> not provide an option to disable them. Fine by me.
>
> Those are not options for determining the diff algorithm.  They are
> options for preprocessing the diff inputs or postprocessing the output.
> Although they're probably only used by a small minority of users,
> there's pretty strong evidence of a compelling need for them.

Hm, ok. I guess you're right. Having thought some more about it, it's
not the same thing.

-- 
Johan

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 12:56, Julian Foad  wrote:
> On Wed, 2011-06-15, Greg Stein wrote:
>> On Wed, Jun 15, 2011 at 12:32, Julian Foad  wrote:
>> > On Fri, 2011-06-10 at 09:06 +0100, Julian Foad wrote:
>> >> I (Julian Foad) wrote:
>> >> > On Thu, 2011-06-09, I (Julian Foad) wrote:
>> >> > > svn_dirent_uri.h
>> >> > >   svn_relpath_internal_style()
>> >> > >   svn_relpath_local_style()
>> >> > >     # These two are inappropriate: only dirents have a 'local' style.
>> >> >
>> >> > I removed the former and made the latter private in r1133964.
>> >
>> > Bert pointed out that there is a desire and precedent for the Windows
>> > client to display in-repository relpaths using the Windows "\"
>> > separator, so svn_relpath_local_style() may be wanted.  It was currently
>>
>> URLs use forward slashes. Backslashes are not proper separators. The
>> above does not make sense.
>
> We know it doesn't make sense from a logical standpoint, and Bert
> doesn't like it any more than you or I do, but apparently that's what
> the Windows client historically has been displaying.

F' those guys. Always breaking things...


;-)

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Julian Foad

On Wed, 2011-06-15, Greg Stein wrote:
> On Wed, Jun 15, 2011 at 12:32, Julian Foad  wrote:
> > On Fri, 2011-06-10 at 09:06 +0100, Julian Foad wrote:
> >> I (Julian Foad) wrote:
> >> > On Thu, 2011-06-09, I (Julian Foad) wrote:
> >> > > svn_dirent_uri.h
> >> > >   svn_relpath_internal_style()
> >> > >   svn_relpath_local_style()
> >> > > # These two are inappropriate: only dirents have a 'local' style.
> >> >
> >> > I removed the former and made the latter private in r1133964.
> >
> > Bert pointed out that there is a desire and precedent for the Windows
> > client to display in-repository relpaths using the Windows "\"
> > separator, so svn_relpath_local_style() may be wanted.  It was currently
> 
> URLs use forward slashes. Backslashes are not proper separators. The
> above does not make sense.

We know it doesn't make sense from a logical standpoint, and Bert
doesn't like it any more than you or I do, but apparently that's what
the Windows client historically has been displaying.

- Julian

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 12:52, Julian Foad  wrote:
> On Wed, 2011-06-15 at 12:42 -0400, Greg Stein wrote:
>...
> And this step of splitting _dirent_skip_ancestor into two variants with
> the two different semantic meanings which are currently conflated - does
> that now make sense to you as it does to me?

Whatever helps you sleep at night is good with me :-), as long as we
expose the minimal set possible.

Thanks,
-g

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Julian Foad

On Wed, 2011-06-15 at 12:42 -0400, Greg Stein wrote:
> On Wed, Jun 15, 2011 at 12:32, Julian Foad  wrote:
> > On Fri, 2011-06-10 at 09:06 +0100, Julian Foad wrote:
> >> I (Julian Foad) wrote:
> >> > On Thu, 2011-06-09, I (Julian Foad) wrote:
> >> > > svn_dirent_uri.h
> >> > >   svn_relpath_internal_style()
> >> > >   svn_relpath_local_style()
> >> > > # These two are inappropriate: only dirents have a 'local' style.
> >> >
> >> > I removed the former and made the latter private in r1133964.
> >
> > Bert pointed out that there is a desire and precedent for the Windows
> > client to display in-repository relpaths using the Windows "\"
> > separator, so svn_relpath_local_style() may be wanted.  It was currently
> 
> URLs use forward slashes. Backslashes are not proper separators. The
> above does not make sense.
> 
> >...
> > Many callers of svn_dirent_skip_ancestor(), on the other hand, expect
> > the 'child' input to be returned if it is not in fact a child.  I am
> > splitting the function into two:
> >
> >  svn_dirent_skip_ancestor() - return NULL if not a child
> >  [some other function name] - return 'child' input if not a child
> 
> PLEASE not another function. There are already too many variants of
> the same thing:
> 
> svn_dirent_is_child()
> svn_dirent_skip_ancestor()
> svn_dirent_is_ancestor()
> 
> These are all trivial variants which leads the reader to say "umm...
> which one is this?" and the developer to say "umm... which do I use
> here?". Please do not add a fourth... it will only exacerbate an
> already awful situation.
> 
> If anything, please reduce the number to ONE.

Absolutely.  I intend to reduce the number.  But the first and more
important thing is to reduce the number of *semantic* variations, rather
than strictly the number of exposed API symbols.  After this round of
changes, the semantics of *_skip_ancestor will match *_is_ancestor
exactly, so the latter becomes strictly redundant and I'll certainly
consider removing it, and even if I don't then its implementation will
be a simple wrapper rather than the current long-winded and separate
code.

As a second change, I think *_is_child() should go, because that's a
semantic variant that can easily be derived from _skip_ancestor.
svn_dirent_is_child() was already public in 1.6, though, so that one
must stay, but we could deprecate that and prefer using
dirent_skip_ancestor().

I guess, to speed this up in readiness for 1.7 branching, the sensible
thing would be to make the _is_child functions private at first.
Adapting their callers to use _skip_ancestor instead will be a bigger
change so that can come later.

So that's what I intend to do.

And this step of splitting _dirent_skip_ancestor into two variants with
the two different semantic meanings which are currently conflated - does
that now make sense to you as it does to me?

- Julian

Re: API review - svn_dirent_uri.h

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 12:42 PM, Greg Stein wrote:
> On Wed, Jun 15, 2011 at 12:32, Julian Foad  wrote:
>> Many callers of svn_dirent_skip_ancestor(), on the other hand, expect
>> the 'child' input to be returned if it is not in fact a child.  I am
>> splitting the function into two:
>>
>>  svn_dirent_skip_ancestor() - return NULL if not a child
>>  [some other function name] - return 'child' input if not a child
> 
> PLEASE not another function.

Indeed.  Can these many callers not just do:

   result = svn_dirent_skip_ancestor(parent, child) || child;

?

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

Re: API change - make svn_tristate_to_word/_from_word() private?

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 10:14, Julian Foad  wrote:
> New in 1.7 in svn_types.h:
>  svn_tristate_t
>  svn_tristate_to_word()  # yields "true"/"false"/NULL
>  svn_tristate_from_word()  # takes "true"/"yes"/"on"/.../
>...
> At least, and at first, we should remove the second and third usages
> listed above.  Those usages are more akin to "svn_cmdline" or
> "svn_config" utilities, or could well be private.
>
> Further, I consider moving/renaming these two functions to be a bit more
> private even for their main use in svn_log_changed_path2_t.

Concur. Those things always made me a little hinky.

Cheers,
-g

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 12:40, C. Michael Pilato  wrote:
> On 06/15/2011 12:36 PM, Greg Stein wrote:
>...
>> I would also request that should the serf issues be resolved during
>> our stabilization period, that we ship with the default as serf. ie.
>> don't toss it because it may miss the branchpoint by a couple days
>> (trying to avoid that, tho!)
>
> Sure, no sweat.  It's not really until we're close to rolling RC1 that we
> need to make the go/no-go call on Serf as the default.

Thanks. I just called it out because it is arguable to (say) ship a
beta with one default, and RC1 with another. Figured people should be
aware (even though that may not play in my favor :-P )

Cheers,
-g

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Greg Stein

On Wed, Jun 15, 2011 at 12:32, Julian Foad  wrote:
> On Fri, 2011-06-10 at 09:06 +0100, Julian Foad wrote:
>> I (Julian Foad) wrote:
>> > On Thu, 2011-06-09, I (Julian Foad) wrote:
>> > > svn_dirent_uri.h
>> > >   svn_relpath_internal_style()
>> > >   svn_relpath_local_style()
>> > >     # These two are inappropriate: only dirents have a 'local' style.
>> >
>> > I removed the former and made the latter private in r1133964.
>
> Bert pointed out that there is a desire and precedent for the Windows
> client to display in-repository relpaths using the Windows "\"
> separator, so svn_relpath_local_style() may be wanted.  It was currently

URLs use forward slashes. Backslashes are not proper separators. The
above does not make sense.

>...
> Many callers of svn_dirent_skip_ancestor(), on the other hand, expect
> the 'child' input to be returned if it is not in fact a child.  I am
> splitting the function into two:
>
>  svn_dirent_skip_ancestor() - return NULL if not a child
>  [some other function name] - return 'child' input if not a child

PLEASE not another function. There are already too many variants of
the same thing:

svn_dirent_is_child()
svn_dirent_skip_ancestor()
svn_dirent_is_ancestor()

These are all trivial variants which leads the reader to say "umm...
which one is this?" and the developer to say "umm... which do I use
here?". Please do not add a fourth... it will only exacerbate an
already awful situation.

If anything, please reduce the number to ONE.

Cheers,
-g

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 12:36 PM, Greg Stein wrote:
> On Tue, Jun 14, 2011 at 16:22, C. Michael Pilato  wrote:
>> I'm cautiously optimistic that we're approaching a branchable point, and
>> just wanted to give a heads-up that the very second that I notice that there
>> remain zero or Serf-only issues into the 1.7.0 milestones, I will -- in
>> accordance with sentiments formerly expressed by others among us -- formally
>> propose that we branch this release.
> 
> Sure, though I'm with Hyrum on waiting about a week before the branch
> is made. Invariably, something will come up.

Fair enough.

> I would also request that should the serf issues be resolved during
> our stabilization period, that we ship with the default as serf. ie.
> don't toss it because it may miss the branchpoint by a couple days
> (trying to avoid that, tho!)

Sure, no sweat.  It's not really until we're close to rolling RC1 that we
need to make the go/no-go call on Serf as the default.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

Re: 1.7.0 blocking issues / branching

2011-06-15 Thread Greg Stein

On Tue, Jun 14, 2011 at 16:22, C. Michael Pilato  wrote:
> Our 1.7.0 blocking issues collection currently looks like so:
>
> 3875    Serf SEGV in pool handling on error
> 3888    ra_serf unbound memory usage on checkout/export/update
> 3899    Auto-resolve conflicts at wc-wc copy/move destination
> 3915    upgrade should detect checksum mismatch
> 3917    can't checkout from s.a.o with serf 0.7.2
> 3920    wq item refers to file that does not exist
>
> Six (6) issues.  That's it.
>
> Three of those (3875, 3888, and 3917) are Serf-related.

I'm actively working on 3888. Haven't looked at the other two.

>
> Issue 3899 is not, in my opinion, convincingly blocker-worthy, but...
>
> That leaves 3915 (which could be as easy as "just fail the upgrade") and the
> newly opened 3920.
>
> I'm cautiously optimistic that we're approaching a branchable point, and
> just wanted to give a heads-up that the very second that I notice that there
> remain zero or Serf-only issues into the 1.7.0 milestones, I will -- in
> accordance with sentiments formerly expressed by others among us -- formally
> propose that we branch this release.

Sure, though I'm with Hyrum on waiting about a week before the branch
is made. Invariably, something will come up.

I would also request that should the serf issues be resolved during
our stabilization period, that we ship with the default as serf. ie.
don't toss it because it may miss the branchpoint by a couple days
(trying to avoid that, tho!)

> If anybody is sitting on blocking issues which are being tracked solely in
> their heads, it's time to put them into the tracker.  Please do not delay.

I started reviewing ### markers in the code. I would suggest that
others consider doing the same, particularly in libsvn_wc.

> If anybody (ahem, Stefan Küng) is sitting on API improvements or
> requirements tracked solely in their heads, it's time to put them into the
> tracker.  Please do not delay.
>
> Finally, if anybody is planning to submit code changes that are *not*
> directly aimed at getting 1.7.0 out the door, I beg of you to please
> carefully consider if now is the best time for those changes, or if it
> wouldn't be wiser (and more respectful of the community's overall desire to
> see this release come to a conclusion) to defer until the trunk becomes
> "1.8.0-dev".

As we were reaching the 1.6.x branchpoint, Hyrum and I started the
wc-ng work on a branch. After 1.6.x branched, we reintegrated the work
back onto trunk. If "you" are considering some cool new work, then I'd
offer that suggestion as a good way to avoid being blocked.

Re: API review - mod_*.h, svn_auth ... svn_hash.h

2011-06-15 Thread Julian Foad

An update on my observations on this batch of headers.  I am updating
 to reflect this too.

On Thu, 2011-06-09, I (Julian Foad) wrote:
> svn_cache_config.h
>   # Why is the cache config header separate from the cache header?

Was answered to my satisfaction.

>   svn_get_cache_config()
>   svn_set_cache_config()
> # Rename these two to svn_cache_config_*().

Done.

> svn_checksum.h
>   svn_checksum_mismatch_err()
> # Make this private, as it's not general-purpose enough?

Idea rejected.

> svn_config.h
>   SVN_CONFIG_OPTION_DIFF_EXTENSIONS  "diff-extensions"
>   SVN_CONFIG_OPTION_FORCE_USERNAME_CASE  "force-username-case"
> # These two need @since.

Done.

> svn_dav.h
>   SVN_SKEL_MIME_TYPE "application/vnd.svn-skel"
> # Need @since.

Done.

>   SVN_DAV_..._HEADER "SVN-..."
> # 12 new constants, mostly for HTTP protocol v2.
> # Need @since.

Done.

> svn_delta.h
>   SVN_NO_COMPRESSION_LEVEL 0
>   SVN_MAX_COMPRESSION_LEVEL 9
>   SVN_DEFAULT_COMPRESSION_LEVEL 5
> # Use a library prefix SVN_DELTA for these three.
> # Use a common prefix *_COMPRESSION_LEVEL for these three.

Done.

> svn_diff.h
>   svn_diff_fns2_t
> # Needs @since.

Done.

> svn_dirent_uri.h
>   # Rename 'dirent' to something like 'ospath'?  Too late, I guess.

Idea abandoned.

>   svn_relpath_internal_style()
>   svn_relpath_local_style()
> # These two are inappropriate: only dirents have a 'local' style.

One made private, the other removed.  Details elsethread.

>   svn_uri_is_root()
> # Document what "is root" means: "can't split into dir/basename"?

TODO.

>   svn_dirent_skip_ancestor()
> # See ### comment.

TODO.  Details elsethread.

>   svn_relpath_skip_ancestor()
> # See ### comment.
>   svn_uri_skip_ancestor()

Done - both relpath and uri versions.

> svn_error.h
>   svn_error_return()
> # Needs @since.

Done.

> svn_error_codes.h
>   SVN_ERR_WC_CHANGELIST_MOVE - now deprecated
> # Should say "for compatibility with 1.6 API".

Done.


- Julian

Re: API review - svn_dirent_uri.h

2011-06-15 Thread Julian Foad

On Fri, 2011-06-10 at 09:06 +0100, Julian Foad wrote:
> I (Julian Foad) wrote:
> > On Thu, 2011-06-09, I (Julian Foad) wrote:
> > > svn_dirent_uri.h
> > >   svn_relpath_internal_style()
> > >   svn_relpath_local_style()
> > > # These two are inappropriate: only dirents have a 'local' style.
> > 
> > I removed the former and made the latter private in r1133964.

Bert pointed out that there is a desire and precedent for the Windows
client to display in-repository relpaths using the Windows "\"
separator, so svn_relpath_local_style() may be wanted.  It was currently
not used (and is now deleted) so presumably the functionality is
currently implemented by using svn_dirent_local_style().  I suggest we
bring it back as a private function if and when we want to use it.

> > >   svn_dirent_skip_ancestor()
> > > # See ### comment.
> > >   svn_relpath_skip_ancestor()
> > > # See ### comment.
> > >   svn_uri_skip_ancestor()
> > 
> > I intend to make the three skip_ancestor functions return NULL if the
> > supposed child is not in fact a child of (or equal to) the parent path.
> > No callers appear to want the current behaviour of returning the child
> > in that case - I've just confirmed that both by testing and by
> > inspection.
> 
> Scratch that "confirmation": I've only inspected the relpath one, and my
> testing of all of them was flawed.

I intend all the _skip_ancestor functions to return NULL when the
supposed child is not in fact a child.  I have changed the fspath,
relpath and uri functions to behave like that, and all their callers
were either happy with that or were trivially updated to cope.

Many callers of svn_dirent_skip_ancestor(), on the other hand, expect
the 'child' input to be returned if it is not in fact a child.  I am
splitting the function into two:

  svn_dirent_skip_ancestor() - return NULL if not a child
  [some other function name] - return 'child' input if not a child

- Julian

Re: diff wish

2011-06-15 Thread Morten Kloster

On Wed, Jun 15, 2011 at 1:08 AM, Johan Corveleyn  wrote:
> On Tue, Jun 14, 2011 at 5:33 PM, Stefan Sperling  wrote:
>> On Tue, Jun 14, 2011 at 05:21:27PM +0200, Neels J Hofmeyr wrote:
>>> Hi Johan,
>>>
>>> it's been a while and I still haven't sent you my diff wish we briefly
>>> touched on the Subversion hackathon.
>
> Hi Neels, thanks for pursuing this further.
>
>>> Here is a fabricated example of why I don't like diff to match empty lines:
>>
>>> A couple of lines get replaced by completely different ones. By matching the
>>> blank line in the middle, it becomes far less readable, IMHO. In my fantasy
>>> dream world, this diff would print:
>>>
>>> [[[
>>> Index: x
>>> ===
>>> --- x (revision 1)
>>> +++ x (working copy)
>>> @@ -4,11 +4,13 @@
>>>
>>>  void aaa()
>>>  {
>>> -  if (x)
>>> -    do(things);
>>> -
>>> -  if (y)
>>> -    do(stuff);
>>> +  while (x || y)
>>> +  {
>>> +    check(something);
>>> +    notify(stuff);
>>> +
>>> +    try(somethingelse);
>>> +  }
>>>
>>>    bb(b);
>>>  }
>>> ]]]
>
> Yeah, that's certainly a nicer diff for human consumption :-). But
> strictly speaking it's a larger diff (more lines marked as +/-), so
> that makes it less optimal for the current algorithm.
>
> The "minimality" criterion of diff (with the LCS) makes it easy to
> reason about, and makes for a nice and clear mathematical definition
> of the requested diff result. But I agree that it doesn't necessarily
> lead to "good-quality" diffs for human readers.
>
> So: good-quality != minimal, but it's more of a "soft" criterion,
> depends on the language, context, ... what lines are important to the
> user, ...
>

Only for a given definition of "minimal" :-). In computer science, it
makes as much sense or more to let minimal mean the amount of
information needed to encode the diff. With that definition, it is worth
less information to match common lines than uncommon lines (with
unique lines being worth the most), and very common lines are only
worth matching if the surrounding lines also match. The minimal diff
in that sense would also be of high quality from a human perspective.
The downside is that finding a minimal diff in that sense is much
harder (and the precise definition of "minimal" depends on the
encoding used, while the optimal encoding in turn depends on
the statistics of the data we want diffed... so, yeah, it gets messy).

> Introducing heuristics in one form or another is probably the only way
> to improve this. I'm not an expert in this area myself (I'm actually
> more of a theoretical mathematician, so I'm naturally skeptical of
> anything without a formal proof :-)). But I have also read some good
> things about patience diff, like Stefan suggested ...
>
>> Do you know about patience diff?
>> http://bramcohen.livejournal.com/73318.html
>> I think we should try teaching this algorithm to svn diff at some point.
>> It's a lot more generic than just checking for empty lines and should
>> yield the results you want.
>
[]
>
> Intuitively, I'd say: let's look into patience diff (or a variation
> thereof), because it's already being used in several (D)VCS'es, so it
> has already had a lot of exposure. But that's not really a strong
> argument :-). Maybe another approach is easier to implement in
> libsvn_diff, and yields equally good or even better results ... I
> don't know.
>

Actually, patience diff doesn't solve this issue at all - once it has found
an optimal match for the unique lines, it then performs regular minimal
matches on the remaining sections, and it will be about as likely to
generate spurious matches of blank lines as our current diff when
there are large sections of non-matching code (or it can find spurious
matches of unique lines, which can mess up things even worse).

> []

---
Morten

Re: diff wish

2011-06-15 Thread Greg Hudson

On Wed, 2011-06-15 at 11:30 -0400, Johan Corveleyn wrote:
> Okay, I guess we should then also rip out --ignore-space-change and
> --ignore-eol-style, and perhaps --show-c-function. Or, if it's
> preferred that ignore-space-change and ignore-eol-style be used by
> default ("because humans are normally not interested in changes in
> amount of whitespace"), we should use those options by default, and
> not provide an option to disable them. Fine by me.

Those are not options for determining the diff algorithm.  They are
options for preprocessing the diff inputs or postprocessing the output.
Although they're probably only used by a small minority of users,
there's pretty strong evidence of a compelling need for them.

Re: diff wish

2011-06-15 Thread Johan Corveleyn

2011/6/15 Branko Čibej :
> On 15.06.2011 15:38, Johan Corveleyn wrote:
>> 2011/6/15 Branko Čibej :
>>> On 15.06.2011 14:11, Johan Corveleyn wrote:
> If you have a different definition of "mis-synchronizes", please explain.
 No, I don't mean a broken diff. The diff should at all times be
 *correct*. That was indeed never questioned.

 I mean something like the example Neels gave with his initial approach
 for avoid the mis-matching empty line problem. With the naive
 solution, he gave an example of where it's not nice:
>>> [...]
>>>
>>> But when would the current "minimal" diff be preferable to the nicest,
>>> albeit not minimal, diff we can produce? After all, the fix and/or
>>> patience diff result is not only nicer to look at, it also gives better
>>> results for blame, which is the other big diff consumer.
>> Please define "nicest".
>
> The "nicest" is the one that looks most natural to a human. We will
> never achieve that, but can certainly try.

I think we need something more machine-understandable / programmable
than "most natural to a human" :-).

>> Note that I gave an example where f.i. "patience diff" produces worse
>> results IMHO than the "minimal diff" (right below Neels' example):
>>
>> [[[
>> file a
>>
>> aa
>> aa
>> bb
>> bb
>> cc
>> cc
>> abc
>>
>> file b
>>
>> abc
>> aa
>> aa
>> bb
>> bb
>> cc
>> cc
>>
>> Patience diff will give:
>>
>> -aa
>> -aa
>> -bb
>> -bb
>> -cc
>> -cc
>> abc
>> +aa
>> +aa
>> +bb
>> +bb
>> +cc
>> +cc
>>
>> Minimal diff will give:
>>
>> +abc
>> aa
>> aa
>> bb
>> bb
>> cc
>> cc
>> -abc
>> ]]]
>>
>> Which one is the nicest?
>
> Clearly, patience diff gets that heuristic wrong. :)

I guess it depends. Maybe some humans are really focused on the
'abc'-line staying the same, and everything else being moved around
it.

> My point is this: If you start having options for specific diff
> algorithm in "svn diff", are you going to extend them to "svn blame" and
> "svn merge", too? Will different people get different blame results,
> based on which algorithm is the default?

Yes of course. Just like you can have different blame/merge results
with --ignore-space-change and --ignore-eol-style. But gut feeling: if
there is any sanity to both diff algorithms, blame will attribute the
most significant lines to the same person/revision in both cases.
That's just a guess though.

> [...]
>
>> But I don't like the hand-waving discussion that it will always be
>> superior, period. That's just not true. And it would be a big mistake,
>> IMHO, to only support a heuristic diff.
>
> I'm not selling patience diff as generally superior. Clearly it has its
> drawbacks, and there's no reason to assume those can't be fixed, whether
> in the patience algorithm itself, or in an adaptation of that and other
> ideas to "svn diff". I don't think there's a requirement to get from
> "what we have now" to "best possible" in one iteration, or even in one
> dot seven.
>
> Also please explain how "it would be a big mistake IMHO" is anything but
> hand-waving. :)

Yes, you're right. Sorry for the additional hand-waving :).

-- 
Johan

Re: diff wish

2011-06-15 Thread Julian Foad

Let's all take a breather and calm down before saying any more, shall
we?

- Julian

Re: diff wish

2011-06-15 Thread Johan Corveleyn

On Wed, Jun 15, 2011 at 5:15 PM, Greg Hudson  wrote:
> On Wed, 2011-06-15 at 09:38 -0400, Johan Corveleyn wrote:
>> But I don't like the hand-waving discussion that it will always be
>> superior, period. That's just not true. And it would be a big mistake,
>> IMHO, to only support a heuristic diff.
>
> If it's a big mistake to use a "heuristic" diff by default, then adding
> options to change the diff algorithm will not mitigate this mistake.
>
> Similarly, adding options to support a heuristic diff as not-the-default
> is almost completely useless.
>
> I know from experience that it's very easy to stare at a problem for
> long enough to convince yourself that other people care about it as much
> as you do, but in reality, to a very good approximation, nobody wants to
> play around with diff algorithm options.  There are probably a few dozen
> people out there who have configured "git diff" to use --patience by
> default and like it, but in the scheme of things, it's dead code.
>
> Options come at a cost in code complexity and documentation bulk.
> Supporting options for the sake of a very small fraction of users,
> without strong evidence of a compelling need for those users, is not the
> right tradeoff for a code base.

Okay, I guess we should then also rip out --ignore-space-change and
--ignore-eol-style, and perhaps --show-c-function. Or, if it's
preferred that ignore-space-change and ignore-eol-style be used by
default ("because humans are normally not interested in changes in
amount of whitespace"), we should use those options by default, and
not provide an option to disable them. Fine by me.

-- 
Johan

Re: diff wish

2011-06-15 Thread Greg Hudson

On Wed, 2011-06-15 at 09:38 -0400, Johan Corveleyn wrote:
> But I don't like the hand-waving discussion that it will always be
> superior, period. That's just not true. And it would be a big mistake,
> IMHO, to only support a heuristic diff.

If it's a big mistake to use a "heuristic" diff by default, then adding
options to change the diff algorithm will not mitigate this mistake.

Similarly, adding options to support a heuristic diff as not-the-default
is almost completely useless.

I know from experience that it's very easy to stare at a problem for
long enough to convince yourself that other people care about it as much
as you do, but in reality, to a very good approximation, nobody wants to
play around with diff algorithm options.  There are probably a few dozen
people out there who have configured "git diff" to use --patience by
default and like it, but in the scheme of things, it's dead code.

Options come at a cost in code complexity and documentation bulk.
Supporting options for the sake of a very small fraction of users,
without strong evidence of a compelling need for those users, is not the
right tradeoff for a code base.

Re: svn commit: r1136067 - /subversion/site/publish/download/download.html

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 10:27 AM, rhuij...@apache.org wrote:
> Author: rhuijben
> Date: Wed Jun 15 14:27:45 2011
> New Revision: 1136067
> 
> URL: http://svn.apache.org/viewvc?rev=1136067&view=rev
> Log:
> * site/publish/download/download.html
>   Move the selected="selected" inside the start element for ftp.

Oop!  Thanks, Bert.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

Re: API change - make svn_tristate_to_word/_from_word() private?

2011-06-15 Thread Daniel Shahaf

[ bcc stefan2 ]

Julian Foad wrote on Wed, Jun 15, 2011 at 15:14:53 +0100:
> New in 1.7 in svn_types.h:
>   svn_tristate_t
>   svn_tristate_to_word()  # yields "true"/"false"/NULL
>   svn_tristate_from_word()  # takes "true"/"yes"/"on"/.../
> 
> The _to/from_word() functions don't seem canonically purposed: some
> users want a single representation of the values, for use in on-the-wire
> protocols and XML output, while other users want a more flexible,
> human-readable interpretation that accepts multiple different words
> ("true" = "on" = "yes").
> 
> The doc string of _from_word() is wrong: it says any input other than
> "true" and "false" is unknown.
> 
> The current users are:
> 
>   * The main use is in svn_log_changed_path2_t.text_modified
> and .props_modified.  For these, we use _to_word() in mod_dav_svn to
> write a protocol string (assumes the value is not svn_tristate_unknown),
> use _from_word() in ra_neon and ra_serf to interpret a protocol string,
> and use _to_word() in "svn log --xml" to output an XML attribute, using
> an undocumented feature of svn_xml_make_open_tag() to allow the
> attribute string to be NULL.  All of these need only "true"/"false", not
> yes/no/off/on.
> 
>   * implementation of svn_config_get_bool() and svn_hash_get_bool(),
> which uses _tristate_from_word() to flexibly read "on"/"off"/"yes"/etc.
> 
>   * svnserve:main.c which uses it simply to detect whether a
> command-line option is "true"/"yes"/"on", and doesn't actually care
> about the tri-state-ness.
> 

Why does that svnserve option take a string argument?  If it's just
a boolean why not do

-{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 1, 
+{"cache-txdeltas", SVNSERVE_OPT_CACHE_TXDELTAS, 0, 

?

> 
> At least, and at first, we should remove the second and third usages
> listed above.  Those usages are more akin to "svn_cmdline" or
> "svn_config" utilities, or could well be private.
> 
> Further, I consider moving/renaming these two functions to be a bit more
> private even for their main use in svn_log_changed_path2_t.
> 

Fair enough.  I don't see that any of the above uses require third
parties API consumers to parse tristates (from string to enum), so no
objection to making _from_word() private.

> - Julian
> 
> 

Thanks,

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 07:41:47AM -0500, Brian Neal wrote:
> > Instead of doing nothing, I think this should set the *state to 'merged'
> > using
> >
> >        set_prop_merge_state(state, svn_wc_notify_state_merged);
> >
> > Just like apply_single_prop_add() already does.
> >
> > Do you agree? If so I will apply this patch with that modification.
> 
> Yes, that seems very reasonable since the other routines for merging
> props (add/delete) call that function under similar circumstances (the
> new value is the same as the working value).

Committed in r1136063.

> > Also, did you already run the regression tests with your patch ("make
> > check")? My suggestion might affect the output of 'svn' so tests would
> > need to be run again (but I will run them either way before committing).
> 
> I did not.

They all pass.
 
> BTW, I really have to hand it to you guys for putting together some
> nice online docs that explains your development process and the
> awesome Makefile.svn which allows a new developer to get started with
> the code so quickly.

Glad to hear that Makefile.svn is useful to others, too.
I originally used it as my local build script and Hyrum suggested
to put in our repo in case someone finds it useful. Now there we go :)

Re: SVN 1.7.0. alpha source code

2011-06-15 Thread C. Michael Pilato

On 06/15/2011 03:39 AM, Tony Stevenson wrote:
> I have also made sure it has been applied to the EU, and US mirrors.  So
> it should be live for all now.

Looks good.  Thanks, Tony.

-- 
C. Michael Pilato 
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand



signature.asc
Description: OpenPGP digital signature

API change - make svn_tristate_to_word/_from_word() private?

2011-06-15 Thread Julian Foad

New in 1.7 in svn_types.h:
  svn_tristate_t
  svn_tristate_to_word()  # yields "true"/"false"/NULL
  svn_tristate_from_word()  # takes "true"/"yes"/"on"/.../

The _to/from_word() functions don't seem canonically purposed: some
users want a single representation of the values, for use in on-the-wire
protocols and XML output, while other users want a more flexible,
human-readable interpretation that accepts multiple different words
("true" = "on" = "yes").

The doc string of _from_word() is wrong: it says any input other than
"true" and "false" is unknown.

The current users are:

  * The main use is in svn_log_changed_path2_t.text_modified
and .props_modified.  For these, we use _to_word() in mod_dav_svn to
write a protocol string (assumes the value is not svn_tristate_unknown),
use _from_word() in ra_neon and ra_serf to interpret a protocol string,
and use _to_word() in "svn log --xml" to output an XML attribute, using
an undocumented feature of svn_xml_make_open_tag() to allow the
attribute string to be NULL.  All of these need only "true"/"false", not
yes/no/off/on.

  * implementation of svn_config_get_bool() and svn_hash_get_bool(),
which uses _tristate_from_word() to flexibly read "on"/"off"/"yes"/etc.

  * svnserve:main.c which uses it simply to detect whether a
command-line option is "true"/"yes"/"on", and doesn't actually care
about the tri-state-ness.


At least, and at first, we should remove the second and third usages
listed above.  Those usages are more akin to "svn_cmdline" or
"svn_config" utilities, or could well be private.

Further, I consider moving/renaming these two functions to be a bit more
private even for their main use in svn_log_changed_path2_t.

- Julian

Re: diff wish

2011-06-15 Thread Branko Čibej

On 15.06.2011 15:38, Johan Corveleyn wrote:
> 2011/6/15 Branko Čibej :
>> On 15.06.2011 14:11, Johan Corveleyn wrote:
 If you have a different definition of "mis-synchronizes", please explain.
>>> No, I don't mean a broken diff. The diff should at all times be
>>> *correct*. That was indeed never questioned.
>>>
>>> I mean something like the example Neels gave with his initial approach
>>> for avoid the mis-matching empty line problem. With the naive
>>> solution, he gave an example of where it's not nice:
>> [...]
>>
>> But when would the current "minimal" diff be preferable to the nicest,
>> albeit not minimal, diff we can produce? After all, the fix and/or
>> patience diff result is not only nicer to look at, it also gives better
>> results for blame, which is the other big diff consumer.
> Please define "nicest".

The "nicest" is the one that looks most natural to a human. We will
never achieve that, but can certainly try.

> Note that I gave an example where f.i. "patience diff" produces worse
> results IMHO than the "minimal diff" (right below Neels' example):
>
> [[[
> file a
>
> aa
> aa
> bb
> bb
> cc
> cc
> abc
>
> file b
>
> abc
> aa
> aa
> bb
> bb
> cc
> cc
>
> Patience diff will give:
>
> -aa
> -aa
> -bb
> -bb
> -cc
> -cc
> abc
> +aa
> +aa
> +bb
> +bb
> +cc
> +cc
>
> Minimal diff will give:
>
> +abc
> aa
> aa
> bb
> bb
> cc
> cc
> -abc
> ]]]
>
> Which one is the nicest?

Clearly, patience diff gets that heuristic wrong. :)

My point is this: If you start having options for specific diff
algorithm in "svn diff", are you going to extend them to "svn blame" and
"svn merge", too? Will different people get different blame results,
based on which algorithm is the default?

[...]

> But I don't like the hand-waving discussion that it will always be
> superior, period. That's just not true. And it would be a big mistake,
> IMHO, to only support a heuristic diff.

I'm not selling patience diff as generally superior. Clearly it has its
drawbacks, and there's no reason to assume those can't be fixed, whether
in the patience algorithm itself, or in an adaptation of that and other
ideas to "svn diff". I don't think there's a requirement to get from
"what we have now" to "best possible" in one iteration, or even in one
dot seven.

Also please explain how "it would be a big mistake IMHO" is anything but
hand-waving. :)

-- Brane

Re: diff wish

2011-06-15 Thread Johan Corveleyn

2011/6/15 Branko Čibej :
> On 15.06.2011 14:11, Johan Corveleyn wrote:
>>> If you have a different definition of "mis-synchronizes", please explain.
>> No, I don't mean a broken diff. The diff should at all times be
>> *correct*. That was indeed never questioned.
>>
>> I mean something like the example Neels gave with his initial approach
>> for avoid the mis-matching empty line problem. With the naive
>> solution, he gave an example of where it's not nice:
> [...]
>
> But when would the current "minimal" diff be preferable to the nicest,
> albeit not minimal, diff we can produce? After all, the fix and/or
> patience diff result is not only nicer to look at, it also gives better
> results for blame, which is the other big diff consumer.

Please define "nicest".

Note that I gave an example where f.i. "patience diff" produces worse
results IMHO than the "minimal diff" (right below Neels' example):

[[[
file a

aa
aa
bb
bb
cc
cc
abc

file b

abc
aa
aa
bb
bb
cc
cc

Patience diff will give:

-aa
-aa
-bb
-bb
-cc
-cc
abc
+aa
+aa
+bb
+bb
+cc
+cc

Minimal diff will give:

+abc
aa
aa
bb
bb
cc
cc
-abc
]]]

Which one is the nicest?

> Likewise, it'll
> give better locality for resolving merge conflicts. That's why I don't
> understand why Subversion, specifically, would need a --minimal option.

Because it could very well be possible that --minimal will give less
merge conflicts in a lot of cases. I simply don't know. Do you know of
any research into this?

I just found an interesting (though long) mail thread in the archives
of g...@vger.kernel.org, discussing pros and cons of patience vs.
regular diff [1]. There is some analysis in there, comparing diffs and
comparing merge conflicts. An interesting one is [2], where some
numbers are gathered on the number of merge conflicts, and how large
they are.

Some quotes:
[[[
The most interesting thing to me was: of the 4072 merges I have in my
local git.git clone, only 66 show a difference.

The next interesting thing: none -- I repeat, none! -- resulted in only
one of both methods having conflicts.  In all cases, if patience merge had
conflicts, so had the classical merge, and vice versa.  I would have
expected patience merge to handle some conflicts more gracefully.

...

So I restricted the analysis to the non-subtree merges, and now
non-patience merge comes out 6.97297297297297 conflict lines fewer than
patience merge, with a standard deviation of 58.941106657867 (with a total
count of 37 merges).

Note that ~7 lines difference with a standard deviation of ~59 lines is
pretty close to ~0 lines difference.

In the end, the additional expense of patience merge might just not be
worth it.
]]]

I still agree that patience diff often produces nicer diff output for
humans (especially for moves of blocks of code, and for re-indentation
and stuff like that). Because by focusing on the unique lines it has a
simple heuristic to focus primarily on those lines that are most
interesting, most significant for humans (*usually*). So I too really
like patience diff.

But I don't like the hand-waving discussion that it will always be
superior, period. That's just not true. And it would be a big mistake,
IMHO, to only support a heuristic diff.

-- 
Johan

[1] http://git.661346.n2.nabble.com/libxdiff-and-patience-diff-td1452272.html
[2] 
http://git.661346.n2.nabble.com/libxdiff-and-patience-diff-td1452272i40.html#a2124969

Re: API change - rename svn_hash_get_cstring/bool() to svn_config_hash_get_*()

2011-06-15 Thread Julian Foad

n.b. These two functions are new in 1.7.

- Julian


On Wed, 2011-06-15 at 14:29 +0100, Julian Foad wrote:
> A heads-up: I'm renaming svn_hash_get_bool() to
> svn_config_hash_get_bool(), and svn_hash_get_cstring() to
> svn_config_hash_get(), moving them from svn_hash.h to svn_config.h.
> They are being used only for config hashes, and don't seem like
> general-purpose hash functions.  In particular, _get_bool() looks for a
> variety of human-oriented boolean-like strings (on, off, yes, no, etc.).
> 
> Note that we already have svn_config_get_bool() and svn_config_get(),
> which are similar but operate on a structured config file object,
> whereas the functions I'm renaming operate on a simple hash.
> 
> 
> Current APIs:
> [[[
> # In svn_hash.h:
> 
> /** Find the value of a @a key in @a hash, return the value.
>  *
>  * If @a hash is @c NULL or if the @a key cannot be found, the
>  * @a default_value will be returned. 
>  * 
>  * @since New in 1.7.
>  */
> const char *
> svn_hash_get_cstring(apr_hash_t *hash,
>  const char *key,
>  const char *default_value);
> 
> /** Like svn_hash_get_cstring(), but for boolean values.
>  *
>  * Parses the value as a boolean value. The recognized representations
>  * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
>  * matter.
>  * 
>  * @since New in 1.7.
>  */
> svn_boolean_t 
> svn_hash_get_bool(apr_hash_t *hash,
>   const char *key,
>   svn_boolean_t default_value);
> 
> # In svn_config.h:
> 
> /** Find the value of a (@a section, @a option) pair in @a cfg, set @a
>  * *valuep to the value.
>  *
>  * If @a cfg is @c NULL, just sets @a *valuep to @a default_value. If
>  * the value does not exist, expand and return @a default_value. @a
>  * default_value can be NULL.
>  *
>  * The returned value will be valid at least until the next call to
>  * svn_config_get(), or for the lifetime of @a default_value. It is
>  * safest to consume the returned value immediately.
>  *
>  * This function may change @a cfg by expanding option values.
>  */
> void
> svn_config_get(svn_config_t *cfg,
>const char **valuep,
>const char *section,
>const char *option,
>const char *default_value);
> 
> /** Like svn_config_get(), but for boolean values.
>  *
>  * Parses the option as a boolean value. The recognized representations
>  * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
>  * matter. Returns an error if the option doesn't contain a known string.
>  */
> svn_error_t *
> svn_config_get_bool(svn_config_t *cfg,
> svn_boolean_t *valuep,
> const char *section,
> const char *option,
> svn_boolean_t default_value);
> ]]]
> 
> Let me know if you have any concerns.  As I don't anticipate any, I'll
> probably make the change before waiting for replies.
> 
> - Julian
> 
>

API change - rename svn_hash_get_cstring/bool() to svn_config_hash_get_*()

2011-06-15 Thread Julian Foad

A heads-up: I'm renaming svn_hash_get_bool() to
svn_config_hash_get_bool(), and svn_hash_get_cstring() to
svn_config_hash_get(), moving them from svn_hash.h to svn_config.h.
They are being used only for config hashes, and don't seem like
general-purpose hash functions.  In particular, _get_bool() looks for a
variety of human-oriented boolean-like strings (on, off, yes, no, etc.).

Note that we already have svn_config_get_bool() and svn_config_get(),
which are similar but operate on a structured config file object,
whereas the functions I'm renaming operate on a simple hash.


Current APIs:
[[[
# In svn_hash.h:

/** Find the value of a @a key in @a hash, return the value.
 *
 * If @a hash is @c NULL or if the @a key cannot be found, the
 * @a default_value will be returned. 
 * 
 * @since New in 1.7.
 */
const char *
svn_hash_get_cstring(apr_hash_t *hash,
 const char *key,
 const char *default_value);

/** Like svn_hash_get_cstring(), but for boolean values.
 *
 * Parses the value as a boolean value. The recognized representations
 * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
 * matter.
 * 
 * @since New in 1.7.
 */
svn_boolean_t 
svn_hash_get_bool(apr_hash_t *hash,
  const char *key,
  svn_boolean_t default_value);

# In svn_config.h:

/** Find the value of a (@a section, @a option) pair in @a cfg, set @a
 * *valuep to the value.
 *
 * If @a cfg is @c NULL, just sets @a *valuep to @a default_value. If
 * the value does not exist, expand and return @a default_value. @a
 * default_value can be NULL.
 *
 * The returned value will be valid at least until the next call to
 * svn_config_get(), or for the lifetime of @a default_value. It is
 * safest to consume the returned value immediately.
 *
 * This function may change @a cfg by expanding option values.
 */
void
svn_config_get(svn_config_t *cfg,
   const char **valuep,
   const char *section,
   const char *option,
   const char *default_value);

/** Like svn_config_get(), but for boolean values.
 *
 * Parses the option as a boolean value. The recognized representations
 * are 'TRUE'/'FALSE', 'yes'/'no', 'on'/'off', '1'/'0'; case does not
 * matter. Returns an error if the option doesn't contain a known string.
 */
svn_error_t *
svn_config_get_bool(svn_config_t *cfg,
svn_boolean_t *valuep,
const char *section,
const char *option,
svn_boolean_t default_value);
]]]

Let me know if you have any concerns.  As I don't anticipate any, I'll
probably make the change before waiting for replies.

- Julian

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Philip Martin

Brian Neal  writes:

> I was wondering if you had unit/regression tests
> but I did not see anything about that in the docs, although it is
> quite possible I missed it.

http://subversion.apache.org/docs/community-guide/building.html#automated-tests

Short answer 'make check'.

-- 
Philip

Re: diff wish

2011-06-15 Thread Julian Foad

Johan Corveleyn wrote:
> On Wed, Jun 15, 2011 at 1:24 PM, Julian Foad  wrote:
> > On Wed, 2011-06-15 at 13:16 +0200, Johan Corveleyn wrote:
> >> On Wed, Jun 15, 2011 at 12:53 PM, Stefan Sperling  wrote:
> >> But I'm not convinced that we should simply drop support for "minimal
> >> diffs" [...]
> >>
> >> The minimal diff can produce ugly diffs, but there is one certainty:
> >> it's always a minimal one.
> >
> > But so what?  It's only "minimal" according to the current definition of
> > "minimal" [...]

My point was meant to be: GNU diff is intended to be a general and
complete "diff" utility, and so it makes sense for it to implement all
options that might be useful in any usage, for research purposes, for
readability in typical cases, for balancing speed against quality, etc.
But Subversion's built-in "diff" does not need to be complete and
general because users always have the option of plugging in another diff
that suits their needs better.  It is intended to give users a
convenient way to see what they've changed (in "text" files), and to
communicate such changes in the form of a patch, and that's *all* it
needs to do.  

> The current implementation is "minimal" according to the *most
> natural* definition of "minimal" for diff output, which is still line
> based.

Logically simple, yes.  Venerable, yes.  Natural, in the sense of "makes
most sense to humans"?  No.  It totally disregards the content of a
line, which humans don't.

>  Ok, you can have another definition of "minimal", but it will
> always have its vulnerabilities in certain cases. That's mainly my
> point.

I totally agree with you there, that any definition will have some kind
of "vulnerabilities", except that's perhaps too strong a word.  I didn't
read that as being mainly your point.

- Julian

Re: diff wish

2011-06-15 Thread Branko Čibej

On 15.06.2011 14:11, Johan Corveleyn wrote:
>> If you have a different definition of "mis-synchronizes", please explain.
> No, I don't mean a broken diff. The diff should at all times be
> *correct*. That was indeed never questioned.
>
> I mean something like the example Neels gave with his initial approach
> for avoid the mis-matching empty line problem. With the naive
> solution, he gave an example of where it's not nice:
[...]

But when would the current "minimal" diff be preferable to the nicest,
albeit not minimal, diff we can produce? After all, the fix and/or
patience diff result is not only nicer to look at, it also gives better
results for blame, which is the other big diff consumer. Likewise, it'll
give better locality for resolving merge conflicts. That's why I don't
understand why Subversion, specifically, would need a --minimal option.

GNU diff is a general-purpose diff tool. 'svn diff' is not.

-- Brane

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Brian Neal

Hi Stefan,

> thanks for this patch.
>
> I have one suggestion:
>
> [...]
>
> Instead of doing nothing, I think this should set the *state to 'merged'
> using
>
>        set_prop_merge_state(state, svn_wc_notify_state_merged);
>
> Just like apply_single_prop_add() already does.
>
> Do you agree? If so I will apply this patch with that modification.

Yes, that seems very reasonable since the other routines for merging
props (add/delete) call that function under similar circumstances (the
new value is the same as the working value).

> Also, did you already run the regression tests with your patch ("make
> check")? My suggestion might affect the output of 'svn' so tests would
> need to be run again (but I will run them either way before committing).

I did not.

BTW, I really have to hand it to you guys for putting together some
nice online docs that explains your development process and the
awesome Makefile.svn which allows a new developer to get started with
the code so quickly. I was wondering if you had unit/regression tests
but I did not see anything about that in the docs, although it is
quite possible I missed it.

Thank you very much!
-BN

Re: diff wish

2011-06-15 Thread Johan Corveleyn

On Wed, Jun 15, 2011 at 1:24 PM, Julian Foad  wrote:
> On Wed, 2011-06-15 at 13:16 +0200, Johan Corveleyn wrote:
>> On Wed, Jun 15, 2011 at 12:53 PM, Stefan Sperling  wrote:
>> > On Wed, Jun 15, 2011 at 12:34:31PM +0200, Branko Čibej wrote:
>> >> I'd say not to worry about --minimal and --nice and whatnot. Just make
>> >> diff output the sanest, nicest diff it can find. I think it's a bad idea
>> >> to give diff user-visible options that change the output in ways that
>> >> are hard to explain (shuffling lines around, as opposed to, e.g., using
>> >> a completely different diff format).
>> >
>> > +1
>>
>> Certainly we need to pick the best possible default, which satisfies
>> most users most of the time.
>>
>> But I'm not convinced that we should simply drop support for "minimal
>> diffs" when we arrive at the point that we have a "nice" format. A
>> "nice" diff will always be based on heuristics, taking guesses at what
>> should be considered a deletion, an addition, or a common line. It's a
>> matter of interpretation. So there will always be a chance that it
>> guesses wrong, and totally mis-synchronizes. It may be rare, but IMHO
>> it's impossible to completely avoid this.
>>
>> The minimal diff can produce ugly diffs, but there is one certainty:
>> it's always a minimal one.
>
> But so what?  It's only "minimal" according to the current definition of
> "minimal" which is something like "number of lines removed + number of
> lines added".  A "better" solution might have a "better" definition of
> "minimal", maybe involving something like "total number of unique groups
> of characters".

The current implementation is "minimal" according to the *most
natural* definition of "minimal" for diff output, which is still line
based. Ok, you can have another definition of "minimal", but it will
always have its vulnerabilities in certain cases. That's mainly my
point.

It's very tricky to come up with a good specification (or mathematical
definition) of what you consider a "better" diff. And then someone
comes along with an example where your "better" diff produces a very
ugly result (although it will still be "minimal" according to your
"nice" specification).  Also, keep in mind that people use Subversion
for lots of things that can be vastly different from source code.

That's why I think it would be very unwise to ever simply drop support
for "standard minimal diffing", something that carries with it at
least 30 years of CS research, and has had a *huge* amount of
applications and usage.

As for what I understand under missynchronization ...

On Wed, Jun 15, 2011 at 1:30 PM, Markus Schaber
 wrote:
> If "mis-synchronizes" means that it produces a broken output when applied on 
> the input, then this should be avoided for every price. A "nice" diff must 
> still be a valid diff producing the correct output. But AFAICS this was never 
> questioned.
>
> If you have a different definition of "mis-synchronizes", please explain.

No, I don't mean a broken diff. The diff should at all times be
*correct*. That was indeed never questioned.

I mean something like the example Neels gave with his initial approach
for avoid the mis-matching empty line problem. With the naive
solution, he gave an example of where it's not nice:

On Tue, Jun 14, 2011 at 5:21 PM, Neels J Hofmeyr  wrote:
> The adverse effects of that is that any single line change shows any
> following empty lines as also changed:
>
> [[[
>  context
> -foo
> -
> -
> +foobar
> +
> +
>  Context
> ]]]
>
> and that empty-line-changes also show their preceding non-empty line as
> changed even if it hasn't:
>
> [[[
>  context
> -foo
> -
> +foo
> +
> +
>  Context
> ]]]

Similarly, with patience diff, one can easily come up with examples
where it produces a result that certainly seems sub-optimal to me.
This example was given in a comment to a blog entry about patience
diff [1]:

[[[
file a

aa
aa
bb
bb
cc
cc
abc

file b

abc
aa
aa
bb
bb
cc
cc

would the matching result be

-aa
-aa
-bb
-bb
-cc
-cc
abc
+aa
+aa
+bb
+bb
+cc
+cc
]]]

Ok, this is still "minimal" according to the definition of "patience
diff" (get the maximum amount of common lines when only looking at the
lines which are unique to each side). But I think you would agree it's
not a nice diff.

The "standard minimal diff" algorithm would produce the more sensible
diff in this case:
[[[
+abc
aa
aa
bb
bb
cc
cc
-abc
]]]

-- 
Johan

[1] http://alfedenzo.livejournal.com/170301.html?thread=195901#t195901

Re: [PATCH] Issue #3919: fix for spurious property conflict during merge

2011-06-15 Thread Stefan Sperling

On Sat, Jun 11, 2011 at 03:05:59PM -0500, Brian Neal wrote:
> Hello -
> 
> Attached is a patch for issue #3919 [1]. Please review when you have a
> chance. Thanks!
> 
> Possible commit message:
> 
> [[[
> Fix issue #3919. During a merge of a property, add a check against the
> incoming new property value and the working copy value. If they
> already match, then the merge trivially succeeds.
> 
> * subversion/libsvn_wc/props.c
>   (apply_single_generic_prop_change): Do nothing if the incoming new
> property value already matches the working value.
> 
> ]]]
> 
> 
> [1]: http://subversion.tigris.org/issues/show_bug.cgi?id=3919
> 
> 
> Best regards,
> BN

Hi Brian,

thanks for this patch.

I have one suggestion:

> Index: subversion/libsvn_wc/props.c
> ===
> --- subversion/libsvn_wc/props.c  (revision 1134717)
> +++ subversion/libsvn_wc/props.c  (working copy)
> @@ -1280,8 +1280,9 @@ apply_single_mergeinfo_prop_change(svn_wc_notify_s
>  }
>  
>  /* Merge a change to a property, using the rule that if the working value
> -   is the same as OLD_VAL then apply the change as a simple update
> -   (replacement), otherwise invoke maybe_generate_propconflict().
> +   is equal to the new value then there is nothing we need to do. Else, if
> +   the working value is the same as the old value then apply the change as 
> +   a simple update (replacement), otherwise invoke 
> maybe_generate_propconflict().
> The definition of the arguments and behaviour is the same as
> apply_single_prop_change(). */
>  static svn_error_t *
> @@ -1308,8 +1309,14 @@ apply_single_generic_prop_change(svn_wc_notify_sta
>  
>SVN_ERR_ASSERT(old_val != NULL);
>  
> +  /* If working_val is the same as new_val already then there is nothing to 
> do */
> +  if (working_val && new_val
> +  && svn_string_compare(working_val, new_val))
> +{
> +   ;  /* do nothing */

Instead of doing nothing, I think this should set the *state to 'merged'
using

set_prop_merge_state(state, svn_wc_notify_state_merged);

Just like apply_single_prop_add() already does.

Do you agree? If so I will apply this patch with that modification.

Also, did you already run the regression tests with your patch ("make
check")? My suggestion might affect the output of 'svn' so tests would
need to be run again (but I will run them either way before committing).

> +}
>/* If working_val is the same as old_val... */
> -  if (working_val && old_val
> +  else if (working_val && old_val
>&& svn_string_compare(working_val, old_val))
>  {
>/* A trivial update: change it to new_val. */

AW: diff wish

2011-06-15 Thread Markus Schaber

Hi, Johan,

Von: Johan Corveleyn [mailto:jcor...@gmail.com]
> But I'm not convinced that we should simply drop support for "minimal
> diffs" when we arrive at the point that we have a "nice" format. A "nice"
> diff will always be based on heuristics, taking guesses at what should be
> considered a deletion, an addition, or a common line. It's a matter of
> interpretation. So there will always be a chance that it guesses wrong,
> and totally mis-synchronizes. It may be rare, but IMHO it's impossible to
> completely avoid this. 

If "mis-synchronizes" means that it produces a broken output when applied on 
the input, then this should be avoided for every price. A "nice" diff must 
still be a valid diff producing the correct output. But AFAICS this was never 
questioned.

If you have a different definition of "mis-synchronizes", please explain.

Best regards

Markus Schaber

Re: diff wish

2011-06-15 Thread Julian Foad

On Wed, 2011-06-15 at 13:16 +0200, Johan Corveleyn wrote:
> On Wed, Jun 15, 2011 at 12:53 PM, Stefan Sperling  wrote:
> > On Wed, Jun 15, 2011 at 12:34:31PM +0200, Branko Čibej wrote:
> >> I'd say not to worry about --minimal and --nice and whatnot. Just make
> >> diff output the sanest, nicest diff it can find. I think it's a bad idea
> >> to give diff user-visible options that change the output in ways that
> >> are hard to explain (shuffling lines around, as opposed to, e.g., using
> >> a completely different diff format).
> >
> > +1
> 
> Certainly we need to pick the best possible default, which satisfies
> most users most of the time.
> 
> But I'm not convinced that we should simply drop support for "minimal
> diffs" when we arrive at the point that we have a "nice" format. A
> "nice" diff will always be based on heuristics, taking guesses at what
> should be considered a deletion, an addition, or a common line. It's a
> matter of interpretation. So there will always be a chance that it
> guesses wrong, and totally mis-synchronizes. It may be rare, but IMHO
> it's impossible to completely avoid this.
> 
> The minimal diff can produce ugly diffs, but there is one certainty:
> it's always a minimal one.

But so what?  It's only "minimal" according to the current definition of
"minimal" which is something like "number of lines removed + number of
lines added".  A "better" solution might have a "better" definition of
"minimal", maybe involving something like "total number of unique groups
of characters".

- Julian


>  I think that we always need to have that
> around, even if only as a fallback option (just another one of the
> '-x' options) just in case the nice diff gets it wrong. Just like GNU
> diff still has the --minimal option.
>

Re: Fresh checkout vs 'svn upgrade': How good is good enough?

2011-06-15 Thread Johan Corveleyn

On Wed, Jun 15, 2011 at 1:52 AM, Philip Martin
 wrote:
> Philip Martin  writes:
>
>> Johan Corveleyn  writes:
>>
>>> Also, has anyone tested this on an NFS-working copy? Or CIFS?
>>
>> Working copy on NFS, Linux client and server.
>>
>> Checkout using 1.6:
>> Elapsed:  96s CPU: 16s
>>
>> Upgrade using 1.7:
>> Elapsed: 147s CPU: 21s
>>
>> Checkout using 1.7
>> Elapsed: 216s CPU: 26s
>>
>> So upgrade is faster than checkout, but only because checkout is slow.
>
> That's using the sync option on the NFS server.  Using async
>
> Checkout using 1.6:
> Elapsed:  73s CPU: 16s
>
> Upgrade using 1.7:
> Elapsed: 113s CPU: 20s
>
> Checkout using 1.7
> Elapsed: 180s CPU: 26s

Wow, that looks pretty bad. Especially the checkout time is a cause
for concern IMHO. That may point to a very significant performance
regression accross the board when working with NFS working copies.

Philip, any chance you can perform a complete test run (comparing 1.6
to trunk) of Mark Phippard's perf testsuite? Or Neels' one?

-- 
Johan

Re: diff wish

2011-06-15 Thread Johan Corveleyn

On Wed, Jun 15, 2011 at 12:53 PM, Stefan Sperling  wrote:
> On Wed, Jun 15, 2011 at 12:34:31PM +0200, Branko Čibej wrote:
>> I'd say not to worry about --minimal and --nice and whatnot. Just make
>> diff output the sanest, nicest diff it can find. I think it's a bad idea
>> to give diff user-visible options that change the output in ways that
>> are hard to explain (shuffling lines around, as opposed to, e.g., using
>> a completely different diff format).
>
> +1

Certainly we need to pick the best possible default, which satisfies
most users most of the time.

But I'm not convinced that we should simply drop support for "minimal
diffs" when we arrive at the point that we have a "nice" format. A
"nice" diff will always be based on heuristics, taking guesses at what
should be considered a deletion, an addition, or a common line. It's a
matter of interpretation. So there will always be a chance that it
guesses wrong, and totally mis-synchronizes. It may be rare, but IMHO
it's impossible to completely avoid this.

The minimal diff can produce ugly diffs, but there is one certainty:
it's always a minimal one. I think that we always need to have that
around, even if only as a fallback option (just another one of the
'-x' options) just in case the nice diff gets it wrong. Just like GNU
diff still has the --minimal option.

-- 
Johan

Re: diff wish

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 12:34:31PM +0200, Branko Čibej wrote:
> I'd say not to worry about --minimal and --nice and whatnot. Just make
> diff output the sanest, nicest diff it can find. I think it's a bad idea
> to give diff user-visible options that change the output in ways that
> are hard to explain (shuffling lines around, as opposed to, e.g., using
> a completely different diff format).

+1

Re: svn commit: r1135635 - /subversion/trunk/subversion/libsvn_wc/props.c

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 12:22:13PM +0200, Bert Huijben wrote:
> The problem for application builders is that you can't just get the property
> here.
> 
> You can get the new BASE version of the property and the new ACTUAL version
> of the property (which is usually the old actual property on property
> conflicts), but you can't access the other property values (read: ORIGINAL)
> as you don't know the old revision. (Unless you captured it somewhere before
> the operation)
> 

I see, but how is this problem solved by putting a hexdump into
the prej file?

I think that property conflicts aren't being represented very well even
in internal data structures. E.g. in svn_wc_conflict_description2_t some
unrelated field is being used to store the path to the prej file (a
comment added by you points this out).

> The new conflict store is designed to fix this problem, but that will have
> to wait for 1.8.

I also think it is too late now to change any of this for 1.7.
But it needs some serious work for 1.8.

Re: diff wish

2011-06-15 Thread Branko Čibej

On 15.06.2011 01:08, Johan Corveleyn wrote:
> On Tue, Jun 14, 2011 at 5:33 PM, Stefan Sperling  wrote:
>> On Tue, Jun 14, 2011 at 05:21:27PM +0200, Neels J Hofmeyr wrote:
>>> Hi Johan,
>>>
>>> it's been a while and I still haven't sent you my diff wish we briefly
>>> touched on the Subversion hackathon.
> Hi Neels, thanks for pursuing this further.
>
>>> Here is a fabricated example of why I don't like diff to match empty lines:
>>> A couple of lines get replaced by completely different ones. By matching the
>>> blank line in the middle, it becomes far less readable, IMHO. In my fantasy
>>> dream world, this diff would print:
>>>
>>> [[[
>>> Index: x
>>> ===
>>> --- x (revision 1)
>>> +++ x (working copy)
>>> @@ -4,11 +4,13 @@
>>>
>>>  void aaa()
>>>  {
>>> -  if (x)
>>> -do(things);
>>> -
>>> -  if (y)
>>> -do(stuff);
>>> +  while (x || y)
>>> +  {
>>> +check(something);
>>> +notify(stuff);
>>> +
>>> +try(somethingelse);
>>> +  }
>>>
>>>bb(b);
>>>  }
>>> ]]]
> Yeah, that's certainly a nicer diff for human consumption :-). But
> strictly speaking it's a larger diff (more lines marked as +/-), so
> that makes it less optimal for the current algorithm.
>
> The "minimality" criterion of diff (with the LCS) makes it easy to
> reason about, and makes for a nice and clear mathematical definition
> of the requested diff result. But I agree that it doesn't necessarily
> lead to "good-quality" diffs for human readers.
>
> So: good-quality != minimal, but it's more of a "soft" criterion,
> depends on the language, context, ... what lines are important to the
> user, ...
>
> Introducing heuristics in one form or another is probably the only way
> to improve this. I'm not an expert in this area myself (I'm actually
> more of a theoretical mathematician, so I'm naturally skeptical of
> anything without a formal proof :-)). But I have also read some good
> things about patience diff, like Stefan suggested ...
>
>> Do you know about patience diff?
>> http://bramcohen.livejournal.com/73318.html
>> I think we should try teaching this algorithm to svn diff at some point.
>> It's a lot more generic than just checking for empty lines and should
>> yield the results you want.
> Or if Morten has some great inspiration along similar lines, that
> might be equally good or better...
>
> On Tue, Jun 14, 2011 at 7:53 PM, Morten Kloster  wrote:
>> Actually, I was already planning on making the LCS algorithm estimate
>> how statistically significant each matching section is (just a simple
>> heuristic, of course, nothing mathematically exact) - I need this for the
>> proposals in my recent thread "Improvements to diff3 (merge)
>> performance" - and the standard diff could then take an option -noflukes
>> (for instance) which would only keep the significant matches. That
>> should eliminate (or at least reduce) both problems with blank lines and
>> similar issues with braces being matched in unrelated code.
>>
>> Estimating the significance should be quite quick, so no worries there.
>> ---
>> Morten
> Intuitively, I'd say: let's look into patience diff (or a variation
> thereof), because it's already being used in several (D)VCS'es, so it
> has already had a lot of exposure. But that's not really a strong
> argument :-). Maybe another approach is easier to implement in
> libsvn_diff, and yields equally good or even better results ... I
> don't know.
>
> One thing I'm not sure about: suppose we have a really good "heuristic
> diff", should we then make it the default (and make --minimal an
> option), or should we make it optional (--nice)? I guess we'll see
> whenever we get at the point where a heuristic diff is implemented
> :-).
>
> Note that GNU diff uses some heuristics by default (and --minimal is
> an option). Maybe GNU diff would already give a "nice" result with
> your example with its built-in heuristics?

I'd say not to worry about --minimal and --nice and whatnot. Just make
diff output the sanest, nicest diff it can find. I think it's a bad idea
to give diff user-visible options that change the output in ways that
are hard to explain (shuffling lines around, as opposed to, e.g., using
a completely different diff format).

-- Brane

RE: svn commit: r1135635 - /subversion/trunk/subversion/libsvn_wc/props.c

2011-06-15 Thread Bert Huijben



> -Original Message-
> From: Stefan Sperling [mailto:s...@elego.de]
> Sent: woensdag 15 juni 2011 12:12
> To: dev@subversion.apache.org
> Subject: Re: svn commit: r1135635 -
> /subversion/trunk/subversion/libsvn_wc/props.c
> 
> On Wed, Jun 15, 2011 at 06:55:19AM +0300, Daniel Shahaf wrote:
> > Daniel Shahaf wrote on Wed, Jun 15, 2011 at 06:33:46 +0300:
> > > s...@apache.org wrote on Tue, Jun 14, 2011 at 15:10:50 -:
> > > > +  svn_stringbuf_appendcstr(buf, _("Local property value:\n"));
> > > > +  if (mine_is_binary)
> > > > +svn_stringbuf_appendcstr(buf, _("Cannot display: property
value
> is "
> > > > +"binary data\n"));
> > >
> > > Could we just print a hex dump of the value?
> >
> > Concretely:
> >
> > [[[
> > but the property has already been locally changed to a different value.
> > Local property value:
> > mine
> > Incoming property value:
> >
> 0x7E454B460201010002003D000100A0144000
> 40
> > ]]]
> 
> While I understand your reasoning I think this would be very
> confusing to many users. Not all SVN users are programmers.
> 
> And I guess people will in general be using special tools anyway to view
> such properties. E.g. if the prop value was a PNG image, they will want
> to see the image.  A string of hex numbers won't help them much.
> 
> There is precedence here, too. We don't show diffs of binary files,
> we show a placeholder instead that says a diff cannot be shown.
> Would you also prefer svn diff to print a diff of the hex dumps?
> I'd doubt it.
> 
> I'd say let them propget the value and look at it in whatever manner
> they see fit. This is a total edge case anyway. No user using standard
> svn: properties is ever going to run into this.
> 
> > Patch to produce that:

The problem for application builders is that you can't just get the property
here.

You can get the new BASE version of the property and the new ACTUAL version
of the property (which is usually the old actual property on property
conflicts), but you can't access the other property values (read: ORIGINAL)
as you don't know the old revision. (Unless you captured it somewhere before
the operation)

The new conflict store is designed to fix this problem, but that will have
to wait for 1.8.

Bert

Re: svn commit: r1135635 - /subversion/trunk/subversion/libsvn_wc/props.c

2011-06-15 Thread Stefan Sperling

On Wed, Jun 15, 2011 at 06:55:19AM +0300, Daniel Shahaf wrote:
> Daniel Shahaf wrote on Wed, Jun 15, 2011 at 06:33:46 +0300:
> > s...@apache.org wrote on Tue, Jun 14, 2011 at 15:10:50 -:
> > > +  svn_stringbuf_appendcstr(buf, _("Local property value:\n"));
> > > +  if (mine_is_binary)
> > > +svn_stringbuf_appendcstr(buf, _("Cannot display: property value 
> > > is "
> > > +"binary data\n"));
> > 
> > Could we just print a hex dump of the value?
> 
> Concretely:
> 
> [[[
> but the property has already been locally changed to a different value.
> Local property value:
> mine
> Incoming property value:
> 0x7E454B460201010002003D000100A014400040
> ]]]

While I understand your reasoning I think this would be very
confusing to many users. Not all SVN users are programmers.

And I guess people will in general be using special tools anyway to view
such properties. E.g. if the prop value was a PNG image, they will want
to see the image.  A string of hex numbers won't help them much.

There is precedence here, too. We don't show diffs of binary files,
we show a placeholder instead that says a diff cannot be shown.
Would you also prefer svn diff to print a diff of the hex dumps?
I'd doubt it.

I'd say let them propget the value and look at it in whatever manner
they see fit. This is a total edge case anyway. No user using standard
svn: properties is ever going to run into this.

> Patch to produce that:

FSFS caching doc strings

2011-06-15 Thread Julian Foad

Stefan2, I'm posting this attempt to add more detail to some of the FSFS
caching doc strings.  It's incomplete but I've had this sitting in my WC
for a couple of weeks now and am unlikely to look deeper in the short
term.

[[[
Index: subversion/libsvn_fs_fs/fs.h
===
--- subversion/libsvn_fs_fs/fs.h(revision 1135952)
+++ subversion/libsvn_fs_fs/fs.h(working copy)
@@ -238,8 +238,12 @@ typedef struct fs_fs_data_t
  rep key to svn_string_t. */
   svn_cache__t *fulltext_cache;
 
-  /* Pack manifest cache; maps revision numbers to offsets in their respective
- pack files. */
+  /* Pack manifest cache; a cache mapping (svn_revnum_t) shard number to
+ a manifest; and a manifest is a mapping from (svn_revnum_t) revision
+ number offset within a shard to (apr_off_t) byte-offset in the
+ respective pack file.
+ ### Is this right?
+*/
   svn_cache__t *packed_offset_cache;
 
   /* Cache for txdelta_window_t objects; the key is (revFilePath, offset) */
Index: subversion/libsvn_fs_fs/temp_serializer.h
===
--- subversion/libsvn_fs_fs/temp_serializer.h   (revision 1135952)
+++ subversion/libsvn_fs_fs/temp_serializer.h   (working copy)
@@ -93,8 +93,8 @@ svn_fs_fs__deserialize_txdelta_window(vo
   apr_pool_t *pool);
 
 /**
- * Implements #svn_cache__serialize_func_t for manifests
- * (#apr_array_header_t).
+ * Implements #svn_cache__serialize_func_t for a manifest
+ * (@a in is an #apr_array_header_t of apr_off_t elements).
  */
 svn_error_t *
 svn_fs_fs__serialize_manifest(char **data,
@@ -103,8 +103,8 @@ svn_fs_fs__serialize_manifest(char **dat
   apr_pool_t *pool);
 
 /**
- * Implements #svn_cache__deserialize_func_t for manifests
- * (#apr_array_header_t).
+ * Implements #svn_cache__deserialize_func_t for a manifest
+ * (@a *out is an #apr_array_header_t of apr_off_t elements).
  */
 svn_error_t *
 svn_fs_fs__deserialize_manifest(void **out,
@@ -167,8 +167,13 @@ svn_fs_fs__deserialize_dir_entries(void
apr_pool_t *pool);
 
 /**
- * Implements #svn_cache__partial_getter_func_t for a single element
- * identified by its offset in @a baton within a serialized manifest array.
+ * Implements #svn_cache__partial_getter_func_t.
+ * Set @a *out to the (apr_off_t) byte-offset within a pack file of the
+ * revision 
+ * Set (apr_off_t) @a *out to the element
+ * indexed by (apr_int64_t) @a *baton within the serialized manifest array
+ * @a data and @a data_len.
+ * ### Caller thinks *out is apr_off_t; impl thinks it is apr_int64_t.
  */
 svn_error_t *
 svn_fs_fs__get_sharded_offset(void **out,
@@ -180,7 +185,7 @@ svn_fs_fs__get_sharded_offset(void **out
 /**
  * Implements #svn_cache__partial_getter_func_t for a single 
  * #svn_fs_dirent_t within a serialized directory contents hash,
- * identified by its name in @a baton.
+ * identified by its name (const char @a *baton).
  */
 svn_error_t *
 svn_fs_fs__extract_dir_entry(void **out,
]]]

- Julian

Re: SVN 1.7.0. alpha source code

2011-06-15 Thread Tony Stevenson

On Wed, Jun 15, 2011 at 04:25:07AM +0400, Konstantin Kolinko wrote:

> Looking at the mirrors.cgi script at
> /site/trunk/content/dyn/mirrors/mirrors.cgi  at the above link, I
> think there is a bug there.
> 
> There is the following code fragment in parse_mirrors():
> 
>   # Check if the requested Preferred mirror is in the list
>   # Note the user-requested mirror doesn't have a trailing-slash
>   prefmir=None
>   if preferred:
> for mir in mirrors:
>   if mir[2][:-1]==preferred:
> prefmir=mir
> break
> 

> Index: mirrors.cgi
> ===
> --- mirrors.cgi   (revision 1135864)
> +++ mirrors.cgi   (working copy)
> @@ -91,11 +91,11 @@
>break
> 
># Check if the requested Preferred mirror is in the list
> -  # Note the user-requested mirror doesn't have a trailing-slash
> +  # Note the user-requested mirror may have no trailing-slash
>prefmir=None
>if preferred:
>  for mir in mirrors:
> -  if mir[2][:-1]==preferred:
> +  if mir[2].startswith(preferred):
>  prefmir=mir
>  break
># Otherwise pick a preferred mirror from our country

I have patched mirrors.cgi, and initial testing suggests that it has had 
desired effect.  

"Patch submitted by Konstantin Kolinko @gmail.com  --  Message-ID: 
"

Sendingx1/www/www.apache.org/dyn/mirrors/mirrors.cgi
Transmitting file data .
Committed revision 790972.


I have also made sure it has been applied to the EU, and US mirrors.  So it 
should be live for all now. 

Thanks for this. 

-- 

Cheers,
Tony


---
Tony Stevenson

t...@pc-tony.com  //  pct...@apache.org
t...@caret.cam.ac.uk

http://blog.pc-tony.com

GPG - 1024D/51047D66
--"



signature.asc
Description: Digital signature

75 matches

Mail list logo