Re: svn commit: r995478 [1/2] - in /subversion/branches/performance: ./ subversion/bindings/javahl/native/ subversion/bindings/javahl/src/org/apache/subversion/javahl/ subversion/bindings/javahl/src/o

2010-09-10 Thread Branko Čibej
 On 10.09.2010 01:45, Hyrum Wright wrote:
 [1] Well, I've no way of knowing if it was infinite or just taking a
 long time, but I'll defer a solution to the Halting Problem as an
 exercise for the reader. :)
Step 1: Pull plug from power socket.
Step 2: Check if program is still running. If it is not, the question
will this program ever halt has been trivially answered.
QED

NP-complete my foot. :-P


Re: atomic-revprop: help tracking down ra_serf failures

2010-09-10 Thread Daniel Shahaf
Lieven Govaerts wrote on Fri, Sep 10, 2010 at 07:46:55 +0200:
 On Fri, Sep 10, 2010 at 2:12 AM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
  [ I'm only talking about the ra_serf tests in this email ]
 
  When I build the atomic-revprop branch against serf r1408, I see that:
 
  * diff_tests 24 29 (and many others [1]) fail with HEAD of the branch.
     (IIRC these two failures persist if I switch to serf 
  ^/branches/0@head.)
  * diff_tests 24 29 fail with r983269 reverted. (test output attached)
 
 In this log file, the only difference I see is that the output of
 'iota' and 'A' are in reverse order in expected versus actual output.
 

How do you see in a reverse order?  This is what I see:

[[[
DIFF STDOUT (unordered):
--- EXPECTED STDOUT
+++ ACTUAL STDOUT
@@ -1,17 +1,8 @@
-Index: A
-===
 A  (revision 2)
-+++ A  (revision 1)
-
-Property changes on: A
-___
-Deleted: dirprop
-## -1 +0,0 ##
--r2value
 Index: iota
 ===
 --- iota   (revision 2)
 +++ iota   (revision 1)
+
 Property changes on: iota
 ___
 Deleted: fileprop
]]]

which means that the entire diff for A has disappeared.


 Isn't this just a question of ra_serf not guaranteeing depth-first output?
 

No.  The above paste indicates that the expected output is already
marked unordered and that the output for A's dirprop is missing.

 Lieven
 [..]




Re: svn commit: r995475 - /subversion/trunk/subversion/libsvn_repos/load.c

2010-09-10 Thread Stefan Sperling
On Thu, Sep 09, 2010 at 08:58:10PM +0200, Bert Huijben wrote:
  @@ -524,9 +524,11 @@ parse_property_block(svn_stream_t *strea
 else if ((buf[0] == 'D')  (buf[1] == ' '))
   {
 char *keybuf;
  +  apr_int64_t len;
  
  +  SVN_ERR(svn_cstring_atoi64(len, buf + 2));
 SVN_ERR(read_key_or_val(keybuf, actual_length,
  -  stream, atoi(buf + 2), proppool));
  +  stream, (apr_size_t)len, proppool));
 
 What would this do if you have 4GB + 1 byte?

 I expected that we would use the new svn_cstring conversion variants
 to check for that kind of errors (for overflows, etc.), but now we
 just ignore the error at a different level.

How so? apr_strtoi64() will set errno upon overflow so we'll error out.
atoi() did no such thing.

BTW, I've been pondering about remaining problems with apr_strtoi64().
It doesn't seem to be a very good API.
It doesn't tell us reliably whether the string that it parsed was
in fact a valid number. E.g. I think that if it parses the string 0xYZ
in base 16, the endptr will point to Y, errno isn't set, and the returned
value is zero (or possibly whatever we initialised the output argument to).
Which means that we probably can't tell apart 0x00YZ from 0xYZ without
actually looking at the string ourselves. Note that 0x00YZ is supposed to be
valid and parse as 0, because processing stops at the first invalid character,
rather than at '\0' -- which IMO is another misfeature of this API.
I've seen a comment somewhere in our code indicating that we're relying on
apr_strtoi64() to stop processing when it encounters a newline character.
But this could be fixed to pass in a NUL-terminated string instead.

Would it be OK to use the C99 strtol() family functions instead? That
would allow an easy and proper svn_cstring_atoi() etc. implementation.

I know we're using C89, but maybe it's time to move on and upgrade to C99
where the benefits are desirable? When Subversion was started, C89 was about
a decade old, and C99 is just as old now...

Stefan


Re: svn commit: r995603 - /subversion/branches/performance/subversion/libsvn_subr/svn_string.c

2010-09-10 Thread Stefan Sperling
On Thu, Sep 09, 2010 at 10:58:06PM -, stef...@apache.org wrote:
 Author: stefan2
 Date: Thu Sep  9 22:58:06 2010
 New Revision: 995603
 
 URL: http://svn.apache.org/viewvc?rev=995603view=rev
 Log:
 Try really hard to help the compiler generate good code
 for this frequently called function.
 
 * subversion/libsvn_subr/svn_string.c
   (svn_stringbuf_appendbyte): reformulate
 
 Modified:
 subversion/branches/performance/subversion/libsvn_subr/svn_string.c
 
 Modified: subversion/branches/performance/subversion/libsvn_subr/svn_string.c
 URL: 
 http://svn.apache.org/viewvc/subversion/branches/performance/subversion/libsvn_subr/svn_string.c?rev=995603r1=995602r2=995603view=diff
 ==
 --- subversion/branches/performance/subversion/libsvn_subr/svn_string.c 
 (original)
 +++ subversion/branches/performance/subversion/libsvn_subr/svn_string.c Thu 
 Sep  9 22:58:06 2010
 @@ -403,8 +403,11 @@ svn_stringbuf_appendbyte(svn_stringbuf_t
apr_size_t old_len = str-len;
if (str-blocksize  old_len + 1)
  {
 -  str-data[old_len] = byte;
 -  str-data[old_len+1] = '\0';
 +  char *dest = str-data;
 +
 +  dest[old_len] = byte;
 +  dest[old_len+1] = '\0';
 +
str-len = old_len+1;
  }
else
 @@ -412,7 +415,8 @@ svn_stringbuf_appendbyte(svn_stringbuf_t
/* we need to re-allocate the string buffer
 * - let the more generic implementation take care of that part
 */
 -  svn_stringbuf_appendbytes(str, byte, 1);
 +  char b = byte;
 +  svn_stringbuf_appendbytes(str, b, 1);
  }
  }
  
 

If you're expecting this function to stay written as it now,
you need to add a comment explaining why it is laid out the way it is.
Otherwise people might change it, and the resulting generated code will
be less optimal again.

That said, I have my doubts whether tweaking C code in order to tune the
output of certain compilers in certain versions is a useful thing to do.
Will this still work as expected when new compiler versions are released?
It seems you'd rather want to fix the optimizer of the compiler.

Stefan


Re: svn commit: r995475 - /subversion/trunk/subversion/libsvn_repos/load.c

2010-09-10 Thread Stefan Sperling
On Fri, Sep 10, 2010 at 04:02:52PM +0200, Branko Čibej wrote:
  On 10.09.2010 15:26, Stefan Sperling wrote:
  I know we're using C89, but maybe it's time to move on and upgrade to C99
  where the benefits are desirable? When Subversion was started, C89 was about
  a decade old, and C99 is just as old now...
 
 Microsoft's C compiler, to name only one, still does not provide most of
 the C99 features. I don't know about standard library support.
 
 However, I don't see where you gain with using strtol(). First of all,
 it was already in C89 and has effectively the same interface as the APR
 conversions. C99 added strtoll() but it has the same interface. What's
 the benefit?

Hmmm.. looking closer, you're right, the interface is the same.
It's just that the programmer (me) has to use it properly.
The strtol(3) man page has an interesting hint which is missing from
the apr_strtoi64() documentation. It wasn't clear to me how to make
sure that the entire string was a valid number, but this explains it:

 If endptr is non-null, strtol() stores the address of the first invalid
 character in *endptr.  If there were no digits at all, however, strtol()
 stores the original value of nptr in *endptr.  (Thus, if *nptr is not
 `\0' but **endptr is `\0' on return, the entire string was valid.)

I found this snippet in fs_fs.c (which has since been changed, I'm showing
an old version), where the caller is happy with a string like 42\n,
but potentially accepts 42foo:

  /* note that apr_atoi64() will stop reading as soon as it encounters
 the final newline. */
  if (changes_offset)
*changes_offset = rev_offset + apr_atoi64(buf[i]);

So that caller will need to be adjusted.

But yes, checking whether *endptr == '\0' should do the trick.
So I guess we can keep using apr_atoi64(). Thanks for the cluestick :)

Stefan


Re: NODE_DATA / NODES status

2010-09-10 Thread Philip Martin
Erik Huelsmann ehu...@gmail.com writes:

 There are however, a few queries in 'entries.c' which operate directly on
 BASE_/WORKING_NODE. These queries will need to be migrated. However, in our
 old entries, we don't have the concept of op_depths and op roots. That makes
 it a bit hard to migrate the entries file to the exact semantics of the
 NODES table. When we fix the WORKING_NODE concept to have an op_depth == 1
 during migration, however, conversion of the queries in that file isn't much
 of a problem.

 Does anybody expect serious issues from all working nodes having the same
 op_depth?


 The alternative would be to set the op_depth of each working node to the
 path component count of its local_relpath (making each node a stand-alone
 change).


 Now that I write the above, I think it's the sanest to make each working
 node its own oproot. That would be roughly as simple to code as the
 everything is 1 assumption.


 Better ideas? Comments?

The code in entries.c that writes to the database is used by upgrade,
and only by upgrade.  So it's got to produce the correct depth when
upgrading a working copy, but it only has to cope with things that can
be represented in pre-1.7 working copies.  When upgrade writes a node
into the database the parent should already be present and correct.

-- 
Philip


Re: svn commit: r995475 - /subversion/trunk/subversion/libsvn_repos/load.c

2010-09-10 Thread Stefan Sperling
On Fri, Sep 10, 2010 at 04:47:39PM +0200, Bert Huijben wrote:
 
 
  -Original Message-
  From: Stefan Sperling [mailto:s...@elego.de]
  Sent: vrijdag 10 september 2010 15:26
  To: Bert Huijben
  Cc: dev@subversion.apache.org
  Subject: Re: svn commit: r995475 -
  /subversion/trunk/subversion/libsvn_repos/load.c
  
  On Thu, Sep 09, 2010 at 08:58:10PM +0200, Bert Huijben wrote:
@@ -524,9 +524,11 @@ parse_property_block(svn_stream_t *strea
   else if ((buf[0] == 'D')  (buf[1] == ' '))
 {
   char *keybuf;
+  apr_int64_t len;
   
+  SVN_ERR(svn_cstring_atoi64(len, buf + 2));
   SVN_ERR(read_key_or_val(keybuf, actual_length,
-  stream, atoi(buf + 2), proppool));
+  stream, (apr_size_t)len,
 proppool));
  
   What would this do if you have 4GB + 1 byte?
  
   I expected that we would use the new svn_cstring conversion variants
   to check for that kind of errors (for overflows, etc.), but now we
   just ignore the error at a different level.
  
  How so? apr_strtoi64() will set errno upon overflow so we'll error out.
  atoi() did no such thing.
 
 Yes, after 2^64 apr_stroi64() will set an error. But you truncate it to 2^32
 in the read_key_or_val() line. (You explicitly suppressed the warning for
 this problem)
 
 So instead of erroring at 2^32 , you just created a different problem, with
 possible different security implications.

Ah, I see. You're talking about the case where size_t is 32bit.
 
 I think we need a special svn_string_atosize() (Beter name welcome) to
 compensate for this problem.

We can use svn_cstring_strtoui64() with minval = 0 and maxval == APR_SIZE_MAX.
Since APR_SIZE_MAX will expand to the correct max value on either 32bit and
64bit platforms, this should trigger an error if the result won't fit into
a 32bit size_t.

Stefan


[PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Trebbien
[[[
Add a command line option to the svnsync init, sync, and copy-revprops
subcommands (--source-encoding) that allows the user to specify the
character encoding of translatable properties from the source
repository. This is needed to allow svnsync to import some older
Subversion repositories that have properties that were not encoded in
UTF-8.

As discussed at:
http://thread.gmane.org/gmane.comp.version-control.subversion.user/100020

* subversion/svnsync/main.c
  (svnsync__opt) Added svnsync_opt_source_encoding.
  (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
of acceptable options for the init, sync, and copy-revprops
subcommands.
  (opt_baton_t, subcommand_baton_t): Added source_encoding field.
  (log_properties_normalized): Removed this obsolete function.
  (copy_revprops): Added a parameter to receive the
subcommand_baton_t*. Removed the quiet parameter as this can be found
from the subcommand baton. Removed the int* normalized_count
parameter, as the counting of normalized properties is an obsolete
function.
  (make_subcommand_baton): Set the source_encoding field of the
resulting subcommand_baton_t object to the value of source_encoding
from opt_baton_t.
  (do_initialize, do_synchronize, do_copy_revprops): Removed counting
and reporting of normalized properties. Pass source_encoding through
to svnsync_* functions.
  (replay_baton_t): Removed obsolete fields normalized_rev_props_count
and normalized_node_props_count
  (main): Handled the case when the command line option is
--source-encoding. Set the source_encoding field of the opt_baton_t*
to either the ARG of --source-encoding or NULL.

* subversion/svnsync/sync.c
  Standardized terminology by replacing normalize with translate.
  (translate_string): Added the encoding parameter. Handle the case
when encoding is not NULL by calling svn_subst_translate_string().
  (svnsync_translate_revprops): Added the encoding parameter. Removed
counting of normalized properties.
  (edit_baton_t): Added the prop_encoding field. Removed the obsolete
int* normalized_node_props_counter field.
  (svnsync_get_sync_editor): Added a prop_encoding parameter, the
value of which is set in the prop_encoding field of the edit_baton_t*.

* subversion/svnsync/sync.h
  Updated the declarations and documentation of the svnsync_* functions.
]]]


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Shahaf
The patch didn't make it to the list (only to my personal copy), could
you please re-send with the patch in text/plain MIME type?

Daniel Trebbien wrote on Fri, Sep 10, 2010 at 09:10:08 -0700:
 [[[
 Add a command line option to the svnsync init, sync, and copy-revprops
 subcommands (--source-encoding) that allows the user to specify the
 character encoding of translatable properties from the source
 repository. This is needed to allow svnsync to import some older
 Subversion repositories that have properties that were not encoded in
 UTF-8.
 
 As discussed at:
 http://thread.gmane.org/gmane.comp.version-control.subversion.user/100020
 
 * subversion/svnsync/main.c
   (svnsync__opt) Added svnsync_opt_source_encoding.
   (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
 of acceptable options for the init, sync, and copy-revprops
 subcommands.
   (opt_baton_t, subcommand_baton_t): Added source_encoding field.
   (log_properties_normalized): Removed this obsolete function.
   (copy_revprops): Added a parameter to receive the
 subcommand_baton_t*. Removed the quiet parameter as this can be found
 from the subcommand baton. Removed the int* normalized_count
 parameter, as the counting of normalized properties is an obsolete
 function.
   (make_subcommand_baton): Set the source_encoding field of the
 resulting subcommand_baton_t object to the value of source_encoding
 from opt_baton_t.
   (do_initialize, do_synchronize, do_copy_revprops): Removed counting
 and reporting of normalized properties. Pass source_encoding through
 to svnsync_* functions.
   (replay_baton_t): Removed obsolete fields normalized_rev_props_count
 and normalized_node_props_count
   (main): Handled the case when the command line option is
 --source-encoding. Set the source_encoding field of the opt_baton_t*
 to either the ARG of --source-encoding or NULL.
 
 * subversion/svnsync/sync.c
   Standardized terminology by replacing normalize with translate.
   (translate_string): Added the encoding parameter. Handle the case
 when encoding is not NULL by calling svn_subst_translate_string().
   (svnsync_translate_revprops): Added the encoding parameter. Removed
 counting of normalized properties.
   (edit_baton_t): Added the prop_encoding field. Removed the obsolete
 int* normalized_node_props_counter field.
   (svnsync_get_sync_editor): Added a prop_encoding parameter, the
 value of which is set in the prop_encoding field of the edit_baton_t*.
 
 * subversion/svnsync/sync.h
   Updated the declarations and documentation of the svnsync_* functions.
 ]]]



Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Philip Martin
Daniel Trebbien dtrebb...@gmail.com writes:

 [[[
 Add a command line option to the svnsync init, sync, and copy-revprops
 subcommands (--source-encoding) that allows the user to specify the
 character encoding of translatable properties from the source
 repository. This is needed to allow svnsync to import some older
 Subversion repositories that have properties that were not encoded in
 UTF-8.

I don't see a patch.

-- 
Philip


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Trebbien
On Fri, Sep 10, 2010 at 9:16 AM, Daniel Shahaf d...@daniel.shahaf.name wrote:
 The patch didn't make it to the list (only to my personal copy), could
 you please re-send with the patch in text/plain MIME type?

Hmmm. I wonder if adding a .txt extension will work...
Index: subversion/svnsync/sync.h
===
--- subversion/svnsync/sync.h   (revision 995839)
+++ subversion/svnsync/sync.h   (working copy)
@@ -33,15 +33,15 @@
 #include svn_delta.h
 
 
-/* Normalize the line ending style of the values of properties in REV_PROPS
- * that need translation (according to svn_prop_needs_translation(),
- * currently all svn:* props) so that they contain only LF (\n) line endings.
- * The number of properties that needed normalization is returned in
- * *NORMALIZED_COUNT.
+/* Translate the encoding and line ending style of the values of properties
+ * in rev_props that need translation (according to
+ * svn_prop_needs_translation(), which is currently all svn:* props) so that
+ * they are encoded in UTF-8 and contain only LF (\n) line endings. No
+ * re-encoding is performed if encoding is NULL.
  */
 svn_error_t *
-svnsync_normalize_revprops(apr_hash_t *rev_props,
-   int *normalized_count,
+svnsync_translate_revprops(apr_hash_t *rev_props,
+   const char *encoding,
apr_pool_t *pool);
 
 
@@ -52,18 +52,20 @@
  * which the commit is being made.
  *
  * As the sync editor encounters property values, it might see the need to
- * normalize them (to LF line endings). Each carried out normalization adds 1
- * to the *NORMALIZED_NODE_PROPS_COUNTER (for notification).
+ * translate them (re-encode and/or change to LF line endings).
+ * If PROP_ENCODING is NULL, then property values are presumed to be encoded
+ * in UTF-8 and are not re-encoded. Otherwise, the property values are
+ * presumed to be encoded in PROP_ENCODING, and are translated to UTF-8.
  */
 svn_error_t *
 svnsync_get_sync_editor(const svn_delta_editor_t *wrapped_editor,
 void *wrapped_edit_baton,
 svn_revnum_t base_revision,
 const char *to_url,
+const char *prop_encoding,
 svn_boolean_t quiet,
 const svn_delta_editor_t **editor,
 void **edit_baton,
-int *normalized_node_props_counter,
 apr_pool_t *pool);
 
 
Index: subversion/svnsync/main.c
===
--- subversion/svnsync/main.c   (revision 995839)
+++ subversion/svnsync/main.c   (working copy)
@@ -61,6 +61,7 @@
   svnsync_opt_sync_password,
   svnsync_opt_config_dir,
   svnsync_opt_config_options,
+  svnsync_opt_source_encoding,
   svnsync_opt_disable_locking,
   svnsync_opt_version,
   svnsync_opt_trust_server_cert,
@@ -104,8 +105,8 @@
  the destination repository by any method other than 'svnsync'.\n
  In other words, the destination repository should be a read-only\n
  mirror of the source repository.\n),
-  { SVNSYNC_OPTS_DEFAULT, 'q', svnsync_opt_allow_non_empty,
-svnsync_opt_disable_locking } },
+  { SVNSYNC_OPTS_DEFAULT, svnsync_opt_source_encoding, 'q',
+svnsync_opt_allow_non_empty, svnsync_opt_disable_locking } },
 { synchronize, synchronize_cmd, { sync },
   N_(usage: svnsync synchronize DEST_URL [SOURCE_URL]\n
  \n
@@ -117,7 +118,8 @@
  source URL.  Specifying SOURCE_URL is recommended in particular\n
  if untrusted users/administrators may have write access to the\n
  DEST_URL repository.\n),
-  { SVNSYNC_OPTS_DEFAULT, 'q', svnsync_opt_disable_locking } },
+  { SVNSYNC_OPTS_DEFAULT, svnsync_opt_source_encoding, 'q',
+svnsync_opt_disable_locking } },
 { copy-revprops, copy_revprops_cmd, { 0 },
   N_(usage:\n
  \n
@@ -137,7 +139,8 @@
  DEST_URL repository.\n
  \n
  Form 2 is deprecated syntax, equivalent to specifying 
\-rREV[:REV2]\.\n),
-  { SVNSYNC_OPTS_DEFAULT, 'q', 'r', svnsync_opt_disable_locking } },
+  { SVNSYNC_OPTS_DEFAULT, svnsync_opt_source_encoding, 'q', 'r',
+svnsync_opt_disable_locking } },
 { info, info_cmd, { 0 },
   N_(usage: svnsync info DEST_URL\n
  \n
@@ -200,6 +203,12 @@
   For example:\n

   servers:global:http-library=serf)},
+{source-encoding, svnsync_opt_source_encoding, 1,
+   N_(convert translatable properties from encoding ARG\n
+   
+  to UTF-8. If not specified, then properties are\n
+   
+  presumed to be 

Re: [Issue 3474] making a new subdir, moving files into it and then renaming the subdir, breaks history of the moved files

2010-09-10 Thread Johan Corveleyn
On Fri, Sep 10, 2010 at 11:45 PM,  joha...@tigris.org wrote:
 http://subversion.tigris.org/issues/show_bug.cgi?id=3474

 --- Additional comments from joha...@tigris.org Fri Sep 10 14:45:17 -0700 
 2010 ---
 This issue seems to be fixed on trunk. The described scenario now goes as 
 follows:

 (assuming we're in a working copy with a versioned file a.txt)
 [[[
 $ svn mkdir subdir
 A         subdir             # same as in 1.6

 $ svn mv a.txt subdir
 A         subdir\a.txt
 D         a.txt              # same as in 1.6

 $ svn st
 A       subdir
 A  +    subdir\a.txt
 D       a.txt                # same as in 1.6

 $ svn mv subdir subdir2
 A         subdir2
 D         subdir\a.txt
 D         subdir             # different! will ask on dev list about this.

Is the above output an expected change of behavior? Previously (in
1.6) the above action generated the following output (as can be seen
in the original description of issue 3474):

[[[
$ svn mv subdir/ subdir2
A subdir2
A subdir2\a.txt
D subdir
]]]

It's no problem for me (the result is perfectly fine), just wondering ...

In fact, I never quite understood the output in 1.6 (why are all
children of a moved directory shown as deleted, and not as added in
the new dir?). But I don't understand it now either (I can reverse the
question). Is there any rational explanation, any underlying
reasoning, principles, ... ?

Cheers,
--
Johan


Wrong reply-to address in mails from issue tracker

2010-09-10 Thread Johan Corveleyn
Hi,

The mails coming from the SVN issue tracker (on the issues mailing
list) still contain the old d...@s.t.o address as reply-to. Could
someone change this to the d...@s.a.o address?

Cheers,
-- 
Johan


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Shahaf
Daniel Trebbien wrote on Fri, Sep 10, 2010 at 11:12:20 -0700:
 On Fri, Sep 10, 2010 at 9:16 AM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
  The patch didn't make it to the list (only to my personal copy), could
  you please re-send with the patch in text/plain MIME type?
 
 Hmmm. I wonder if adding a .txt extension will work...

 [[[
 Add a command line option to the svnsync init, sync, and copy-revprops
 subcommands (--source-encoding) that allows the user to specify the
 character encoding of translatable properties from the source
 repository. This is needed to allow svnsync to import some older

s/import/sync/

 Subversion repositories that have properties that were not encoded in
 UTF-8.
 
 As discussed at:
 http://thread.gmane.org/gmane.comp.version-control.subversion.user/100020
 

Nice, I see in this link even the posts that were CCed only to d...@.  gmane++

 * subversion/svnsync/main.c
   (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
 of acceptable options for the init, sync, and copy-revprops
 subcommands.

Missing indentation on the non-first lines.  Should be in one of these forms:

[[[
* subversion/svnsync/main.c
  (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
 of acceptable options for the init, sync, and copy-revprops subcommands.
]]]

[[[
* subversion/svnsync/main.c
  (svnsync_cmd_table):
 Added svnsync_opt_source_encoding to the list of acceptable options for
 the init, sync, and copy-revprops subcommands.
]]]


   (log_properties_normalized): Removed this obsolete function.

In other words, you dropped the entire count changes feature, without
any discussion.  (It's good that the log message is clear about this
change, though.)

Please don't do that.  Someone spent some time writing and debugging
this feature, you can't just come in and drop their code.  If you really
think this is redundant, then start a discussion and get consensus to
drop this functionality.

And at any rate, your recode log messages work needn't depend on it.
It can (IMO: should) even introduce counters for how many properties it
recoded.

 * subversion/svnsync/sync.c
   Standardized terminology by replacing normalize with translate.

Please don't do this.  It makes it harder to read the diff (it obscures
the functional changes), it's a bikeshed (http://12a2a2.bikeshed.com),
and in my opinion normalize is the more accurate term for what you're
doing now after the patch.  :-)

Just stick to the existing terminology please.

 * subversion/svnsync/sync.h

Overall, log message looks good, wrt amount of detail, etc.  (But
I haven't compared it point-by-point with the actual patch.)

 ]]]
 

 +++ subversion/svnsync/main.c (working copy)
 @@ -104,8 +105,8 @@
   the destination repository by any method other than 'svnsync'.\n
   In other words, the destination repository should be a read-only\n
   mirror of the source repository.\n),
 -  { SVNSYNC_OPTS_DEFAULT, 'q', svnsync_opt_allow_non_empty,
 -svnsync_opt_disable_locking } },
 +  { SVNSYNC_OPTS_DEFAULT, svnsync_opt_source_encoding, 'q',
 +svnsync_opt_allow_non_empty, svnsync_opt_disable_locking } },

Apparently our convention is to append new options to the end of the
array.  But that's not a major concern.

 @@ -200,6 +203,12 @@
 +{source-encoding, svnsync_opt_source_encoding, 1,
 +   N_(convert translatable properties from encoding 
 ARG\n
 +  to UTF-8. If not specified, then properties are\n
 +  presumed to be encoded in UTF-8.)},

Translatable properties may be confusing to an end user, but I don't
have a better suggestion.

 @@ -227,6 +236,7 @@
const char *sync_password;
const char *config_dir;
apr_hash_t *config;
 +  const char *source_encoding;
svn_boolean_t disable_locking;
svn_boolean_t quiet;
svn_boolean_t allow_non_empty;
 @@ -352,6 +362,9 @@
svn_boolean_t quiet;
svn_boolean_t allow_non_empty;
const char *to_url;
 +  
 +  /* initialize, synchronize, and copy-revprops only */
 +  const char *source_encoding;
  
/* initialize only */
const char *from_url;

That's just my preference now, but could you use 'svn diff -x-p' so that
function/struct names show in the hunk headers?

@dev: how about making the patch submission guidelines recommend '-x-p'
be used regularly in patch submissions?

 @@ -785,7 +772,7 @@
  {
const char *to_url, *from_url;
svn_ra_session_t *to_session;
 -  opt_baton_t *opt_baton = b;
 +  opt_baton_t *opt_baton = (opt_baton_t*)b;

Unnecessary change.  As a rule, patches shouldn't have them.

(the same cast is added again later in the diff)

 @@ -1625,9 +1589,9 @@
  /* SUBCOMMAND: help */
  static svn_error_t *
 -help_cmd(apr_getopt_t *os, void *baton, apr_pool_t *pool)
 +help_cmd(apr_getopt_t *os, void *b, apr_pool_t *pool)
  {

Why did you rename the formal parameter?  Isn't it another unnecessary
change (as I 

Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Shahaf
Daniel Trebbien wrote on Fri, Sep 10, 2010 at 12:24:29 -0700:
 The test repository (attached) was created with Subversion version
 1.6.12 (r955767) on a Debian Squeeze AMD64 system.

Since you're working on this code, let me point out that 'svnadmin load'
works at a level below the UTF-8 validation (the latter occurs in
libsvn_repos).

 After applying my patch and recompiling `svnsync`, QA can test with
 the script at:  http://pastebin.com/g0n9uqYr

QA?  That's not how open source works :-)

Daniel
(re QA remark, I couldn't find an appropriate section in Karl's book...)


Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8

2010-09-10 Thread Daniel Trebbien
 * subversion/svnsync/main.c
   (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
 of acceptable options for the init, sync, and copy-revprops
 subcommands.

 Missing indentation on the non-first lines.  Should be in one of these forms:

 [[[
 * subversion/svnsync/main.c
  (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list
     of acceptable options for the init, sync, and copy-revprops subcommands.
 ]]]

 [[[
 * subversion/svnsync/main.c
  (svnsync_cmd_table):
     Added svnsync_opt_source_encoding to the list of acceptable options for
     the init, sync, and copy-revprops subcommands.
 ]]]

Gmail added in the line breaks, but they aren't in my log message text
file. Should I wrap the log message to a certain number of characters?

   (log_properties_normalized): Removed this obsolete function.

 In other words, you dropped the entire count changes feature, without
 any discussion.  (It's good that the log message is clear about this
 change, though.)

 Please don't do that.  Someone spent some time writing and debugging
 this feature, you can't just come in and drop their code.  If you really
 think this is redundant, then start a discussion and get consensus to
 drop this functionality.

 And at any rate, your recode log messages work needn't depend on it.
 It can (IMO: should) even introduce counters for how many properties it
 recoded.

When I was working on my changes, I was looking for a to UTF-8
function that would return whether it actually re-encoded the input
string, but did not find one. The re-encoding function that I used,
`svn_subst_translate_string`, actually converts line endings to LF as
well as re-encodes from the given encoding to UTF-8, but it does not
inform the caller of whether it took either action. I guess that I
could write a utility function, kind of like a `strcmp`, but which
ignores any differences at line endings. Unfortunately, this adds
another scan through every property value that is encountered. Already
there is a noticeable decrease in the performance of the modified
`svnsync` as a result of calling `svn_subst_translate_string` on
basically every property value, and adding an additional scan through
each property value would decrease performance further.

I for one do not think that the final NOTE: Normalized ## properties
to LF line endings messages are particularly helpful because they do
not say *which* properties of specific revisions were normalized. As a
possible compromise, I could modify a Perl script that I wrote for the
purpose of identifying all, non-ASCII property values from the GNU
Nano repository to list revisions and property values that *would*
require normalization and/or re-encoding. Just a thought...

 * subversion/svnsync/sync.c
   Standardized terminology by replacing normalize with translate.

 Please don't do this.  It makes it harder to read the diff (it obscures
 the functional changes), it's a bikeshed (http://12a2a2.bikeshed.com),
 and in my opinion normalize is the more accurate term for what you're
 doing now after the patch.  :-)

 Just stick to the existing terminology please.

Okay. I'll change it back.

 @@ -785,7 +772,7 @@
  {
    const char *to_url, *from_url;
    svn_ra_session_t *to_session;
 -  opt_baton_t *opt_baton = b;
 +  opt_baton_t *opt_baton = (opt_baton_t*)b;

 Unnecessary change.  As a rule, patches shouldn't have them.

Reverted.

 @@ -1625,9 +1589,9 @@
  /* SUBCOMMAND: help */
  static svn_error_t *
 -help_cmd(apr_getopt_t *os, void *baton, apr_pool_t *pool)
 +help_cmd(apr_getopt_t *os, void *b, apr_pool_t *pool)
  {

 Why did you rename the formal parameter?  Isn't it another unnecessary
 change (as I explained above) that shouldn't be randomly included in
 a patch?

I renamed it because all of the other functions that take a void*
parameter for passing the opt_baton_t* call it `b`. It doesn't matter,
though. I'll change it back.

 @@ -1982,6 +1951,8 @@

    config = apr_hash_get(opt_baton.config, SVN_CONFIG_CATEGORY_CONFIG,
                          APR_HASH_KEY_STRING);
 +
 +  opt_baton.source_encoding = source_encoding;


 It seems that most other options are parsed into opt_baton.foo directly,
 skipping a local variable.

Okay. I'll change that.

 Index: subversion/svnsync/sync.c
 ===
 --- subversion/svnsync/sync.c (revision 995839)
 +++ subversion/svnsync/sync.c (working copy)
 @@ -45,29 +45,39 @@
 -/* Normalize the line ending style of *STR, so that it contains only
 - * LF (\n) line endings. After return, *STR may point at a new
 - * svn_string_t* allocated from POOL.
 +/* Translate the encoding and line ending style of *STR, so that it contains
 + * only LF (\n) line endings and is encoded in UTF-8. After return, *STR may
 + * point at a new svn_string_t* allocated from POOL if *WAS_TRANSLATED. If
 + * ENCODING is not NULL, then *STR is presumed to be encoded in UTF-8.
   *
 - * *WAS_NORMALIZED is set to TRUE when