Daniel Trebbien wrote on Fri, Sep 10, 2010 at 11:12:20 -0700: > On Fri, Sep 10, 2010 at 9:16 AM, Daniel Shahaf <d...@daniel.shahaf.name> > wrote: > > The patch didn't make it to the list (only to my personal copy), could > > you please re-send with the patch in "text/plain" MIME type? > > Hmmm. I wonder if adding a .txt extension will work...
> [[[ > Add a command line option to the svnsync init, sync, and copy-revprops > subcommands (--source-encoding) that allows the user to specify the > character encoding of translatable properties from the source > repository. This is needed to allow svnsync to import some older s/import/sync/ > Subversion repositories that have properties that were not encoded in > UTF-8. > > As discussed at: > http://thread.gmane.org/gmane.comp.version-control.subversion.user/100020 > Nice, I see in this link even the posts that were CCed only to d...@. gmane++ > * subversion/svnsync/main.c > (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list > of acceptable options for the init, sync, and copy-revprops > subcommands. Missing indentation on the non-first lines. Should be in one of these forms: [[[ * subversion/svnsync/main.c (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list of acceptable options for the init, sync, and copy-revprops subcommands. ]]] [[[ * subversion/svnsync/main.c (svnsync_cmd_table): Added svnsync_opt_source_encoding to the list of acceptable options for the init, sync, and copy-revprops subcommands. ]]] > (log_properties_normalized): Removed this obsolete function. In other words, you dropped the entire "count changes" feature, without any discussion. (It's good that the log message is clear about this change, though.) Please don't do that. Someone spent some time writing and debugging this feature, you can't just come in and drop their code. If you really think this is redundant, then start a discussion and get consensus to drop this functionality. And at any rate, your "recode log messages" work needn't depend on it. It can (IMO: should) even introduce counters for how many properties it recoded. > * subversion/svnsync/sync.c > Standardized terminology by replacing "normalize" with "translate". Please don't do this. It makes it harder to read the diff (it obscures the functional changes), it's a bikeshed (http://12a2a2.bikeshed.com), and in my opinion "normalize" is the more accurate term for what you're doing now after the patch. :-) Just stick to the existing terminology please. > * subversion/svnsync/sync.h Overall, log message looks good, wrt amount of detail, etc. (But I haven't compared it point-by-point with the actual patch.) > ]]] > > +++ subversion/svnsync/main.c (working copy) > @@ -104,8 +105,8 @@ > "the destination repository by any method other than 'svnsync'.\n" > "In other words, the destination repository should be a read-only\n" > "mirror of the source repository.\n"), > - { SVNSYNC_OPTS_DEFAULT, 'q', svnsync_opt_allow_non_empty, > - svnsync_opt_disable_locking } }, > + { SVNSYNC_OPTS_DEFAULT, svnsync_opt_source_encoding, 'q', > + svnsync_opt_allow_non_empty, svnsync_opt_disable_locking } }, Apparently our convention is to append new options to the end of the array. But that's not a major concern. > @@ -200,6 +203,12 @@ > + {"source-encoding", svnsync_opt_source_encoding, 1, > + N_("convert translatable properties from encoding > ARG\n" > + "to UTF-8. If not specified, then properties are\n" > + "presumed to be encoded in UTF-8.")}, "Translatable properties" may be confusing to an end user, but I don't have a better suggestion. > @@ -227,6 +236,7 @@ > const char *sync_password; > const char *config_dir; > apr_hash_t *config; > + const char *source_encoding; > svn_boolean_t disable_locking; > svn_boolean_t quiet; > svn_boolean_t allow_non_empty; > @@ -352,6 +362,9 @@ > svn_boolean_t quiet; > svn_boolean_t allow_non_empty; > const char *to_url; > + > + /* initialize, synchronize, and copy-revprops only */ > + const char *source_encoding; > > /* initialize only */ > const char *from_url; That's just my preference now, but could you use 'svn diff -x-p' so that function/struct names show in the hunk headers? @dev: how about making the patch submission guidelines recommend '-x-p' be used regularly in patch submissions? > @@ -785,7 +772,7 @@ > { > const char *to_url, *from_url; > svn_ra_session_t *to_session; > - opt_baton_t *opt_baton = b; > + opt_baton_t *opt_baton = (opt_baton_t*)b; Unnecessary change. As a rule, patches shouldn't have them. (the same cast is added again later in the diff) > @@ -1625,9 +1589,9 @@ > /* SUBCOMMAND: help */ > static svn_error_t * > -help_cmd(apr_getopt_t *os, void *baton, apr_pool_t *pool) > +help_cmd(apr_getopt_t *os, void *b, apr_pool_t *pool) > { Why did you rename the formal parameter? Isn't it another unnecessary change (as I explained above) that shouldn't be randomly included in a patch? > @@ -1982,6 +1951,8 @@ > > config = apr_hash_get(opt_baton.config, SVN_CONFIG_CATEGORY_CONFIG, > APR_HASH_KEY_STRING); > + > + opt_baton.source_encoding = source_encoding; > It seems that most other options are parsed into opt_baton.foo directly, skipping a local variable. > Index: subversion/svnsync/sync.c > =================================================================== > --- subversion/svnsync/sync.c (revision 995839) > +++ subversion/svnsync/sync.c (working copy) > @@ -45,29 +45,39 @@ > -/* Normalize the line ending style of *STR, so that it contains only > - * LF (\n) line endings. After return, *STR may point at a new > - * svn_string_t* allocated from POOL. > +/* Translate the encoding and line ending style of *STR, so that it contains > + * only LF (\n) line endings and is encoded in UTF-8. After return, *STR may > + * point at a new svn_string_t* allocated from POOL if *WAS_TRANSLATED. If > + * ENCODING is not NULL, then *STR is presumed to be encoded in UTF-8. > * > - * *WAS_NORMALIZED is set to TRUE when *STR needed to be normalized, > + * *WAS_TRANSLATED is set to TRUE when *STR needed to be translated, Do you want to maintain separate counters for "encoding had to be changed" and "linefeeds had to be fixed", or a combined counter? (As I said: if you think counters should be removed completely, make your case in a separate thread. That's orthogonal to the encoding work.) > @@ -501,13 +507,11 @@ > + SVN_ERR(translate_string(&value, &was_translated, eb->prop_encoding, > pool)); Please wrap to 80 characters. > Bottom line, could you fix these issues (in particular the "dropped counters" and the "changed terminology" issues) and re-send the patch? Thanks, Daniel