Re: rebased patches?
Bo Borgerson <[EMAIL PROTECTED]> wrote: > I've pushed a version of the sort branch that contains the following > updates: > > 1. Try to minimize changes to translatable strings > 2. Improve diagnostic messages for files0-from edge-cases > 3. Use the new standardized files0-from test script format > 4. Avoid use of the '>' operator > 5. Follow the log message summary template in HACKING Hi Bo, Thanks for doing that. I've made the following changes (wording, scoping, avoid-sprintf-when-possible, code-aesthetics), and expect to squash them into yours and push the result later today. diff --git a/NEWS b/NEWS index 2fc676e..7b1df75 100644 --- a/NEWS +++ b/NEWS @@ -25,7 +25,7 @@ GNU coreutils NEWS-*- outline -*- sort accepts a new option --batch-size=NMERGE, where NMERGE represents the maximum number of inputs that will be merged at once. - When more than NMERGE inputs are present temporary files are used. + When more than NMERGE inputs are present, sort uses temporary files. ** Bug fixes diff --git a/doc/coreutils.texi b/doc/coreutils.texi index c4a6c9c..f7b9e78 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -3809,8 +3809,9 @@ Example: To sort on the second field, use @option{--key=2,2} @cindex number of inputs to merge, nmerge Merge at most @var{nmerge} inputs at once. -If more than @var{nmerge} inputs are to be merged, then temporary files -will be used. +When @command{sort} has to merge more than @var{nmerge} inputs, +it merges them in groups of @var{nmerge}, saving the result in +a temporary file, which is then used as an input in a subsequent merge. A large value of @var{nmerge} may improve merge performance and decrease temporary storage utilization at the expense of increased memory usage @@ -3821,8 +3822,9 @@ merge performance. The value of @var{nmerge} must be at least 2. The value of @var{nmerge} may be bounded by a resource limit for open -file descriptors. Try @samp{ulimit -n} to check for such a limit. If -the value of @var{nmerge} exceeds this limit, then @command{sort} will +file descriptors. Try @samp{ulimit -n} or @samp{getconf OPEN_MAX} to +to display the limit for a particular system. +If the value of @var{nmerge} exceeds this limit, then @command{sort} will issue a warning to standard error and exit with a nonzero status. @item -o @var{output-file} diff --git a/src/sort.c b/src/sort.c index 2bc3524..1393521 100644 --- a/src/sort.c +++ b/src/sort.c @@ -1077,7 +1077,6 @@ specify_nmerge (int oi, char c, char const *s) uintmax_t n; struct rlimit rlimit; unsigned int max_nmerge = (unsigned int) MAX_NMERGE; - char max_nmerge_buf[INT_BUFSIZE_BOUND (unsigned int)]; enum strtol_error e = xstrtoumax (s, NULL, 10, &n, NULL); /* Try to find out how many file descriptors we'll be able @@ -1089,7 +1088,9 @@ specify_nmerge (int oi, char c, char const *s) if (e == LONGINT_OK) { nmerge = n; - if (nmerge == n) + if (nmerge != n) + e = LONGINT_OVERFLOW; + else { if (nmerge < 2) { @@ -1113,18 +1114,14 @@ specify_nmerge (int oi, char c, char const *s) return; } } - else - e = LONGINT_OVERFLOW; } if (e == LONGINT_OVERFLOW) { - - sprintf (max_nmerge_buf, "%u", max_nmerge); - + char max_nmerge_buf[INT_BUFSIZE_BOUND (unsigned int)]; + uinttostr (max_nmerge, max_nmerge_buf); error (0, 0, _("--%s argument %s too large"), long_options[oi].name, quote(s)); - error (SORT_FAILURE, 0, _("maximum --%s argument with current rlimit is %s"), long_options[oi].name, quote (max_nmerge_buf)); ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: rebased patches?
Bo Borgerson <[EMAIL PROTECTED]> wrote: > Jim Meyering wrote: >> This brings up another (as yet unwritten) guideline: >> Don't change translatable strings if you can avoid it. >> If you must rearrange lines, extract and create new strings, rather than >> extracting and moving into existing blocks. This avoids making unnecessary >> work for translators. ... >> Yes, I'll add this to HACKING RSN ;-) > > If it helps I've pushed a branch called HACKING that adds a new section > for translatability tips and includes your guideline above verbatim. ;) It does, indeed. Thank you. I've pushed this addition: Be nice to translators == Don't change translatable strings if you can avoid it. If you must rearrange individual lines (e.g., in multi-line --help strings), extract and create new strings, rather than extracting and moving into existing blocks. This avoids making unnecessary work for translators. ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: rebased patches?
Jim Meyering wrote: > This brings up another (as yet unwritten) guideline: > Don't change translatable strings if you can avoid it. > If you must rearrange lines, extract and create new strings, rather than > extracting and moving into existing blocks. This avoids making unnecessary > work for translators. Hi Jim, I've pushed a version of the sort branch that contains the following updates: 1. Try to minimize changes to translatable strings 2. Improve diagnostic messages for files0-from edge-cases 3. Use the new standardized files0-from test script format 4. Avoid use of the '>' operator 5. Follow the log message summary template in HACKING > > Yes, I'll add this to HACKING RSN ;-) If it helps I've pushed a branch called HACKING that adds a new section for translatability tips and includes your guideline above verbatim. ;) Thanks, Bo ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: rebased patches?
Bo Borgerson <[EMAIL PROTECTED]> wrote: > Jim Meyering wrote: >> Also, I made some syntactic changes to fit with my policy >> preferences (no ">" operators, and adjusted "const" placement): > > Thanks Jim. > > BTW - Do you have these policy preferences collected somewhere? I don't > remember seeing some of them in the general GNU standards document. > If they are coreutils-specific would it make sense to have a section in > HACKING (or maybe a dedicated standards.txt supplement) to explain them? Those two are my own preferences. Yes, mentioning them in HACKING is on my list. I prefer < to >, and caught the bug from Paul. I'm not alone: http://thread.gmane.org/gmane.comp.version-control.git/3903/focus=4126 I'll dig up justification for my const placement preference later. ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: rebased patches?
Jim Meyering wrote: > Also, I made some syntactic changes to fit with my policy > preferences (no ">" operators, and adjusted "const" placement): Thanks Jim. BTW - Do you have these policy preferences collected somewhere? I don't remember seeing some of them in the general GNU standards document. If they are coreutils-specific would it make sense to have a section in HACKING (or maybe a dedicated standards.txt supplement) to explain them? I think that could be a useful educational tool for relatively inexperienced contributors (like me), and could help reduce the noise in patches that are submitted to the list. Thanks, Bo ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: rebased patches?
Bo Borgerson <[EMAIL PROTECTED]> wrote: > Jim Meyering wrote: >> P.S. no need to send the tarball. >> I like using repo.or.cz. >> >> Also, I think you made it so your new decl, >> >> /* Output columns will be delimited with this string, which may be set >> on the command-line with --output-delimiter=STR. The default is a >> single TAB character. */ >> static char *delimiter; >> >> has two empty lines after it. >> Just one is better. > > Hi Jim, > > I've re-pushed the comm branch with the changes you suggested. Hi Bo, Thanks! I've made some minor editing changes, e.g., [better to use an active voice than a passive one] diff --git a/doc/coreutils.texi b/doc/coreutils.texi index ff6bbb0..d05577d 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -4476,8 +4476,8 @@ Do not check that both input files are in sorted order. Other options are: @item [EMAIL PROTECTED] -Columns will be delimited by @var{str} in output, rather than the default -single TAB character. +Print @var{str} between adjacent output columns, +rather than the default of a single TAB character. The delimiter @var{str} may not be empty. and tweaked the log messages. Also, I made some syntactic changes to fit with my policy preferences (no ">" operators, and adjusted "const" placement): Here are your two adjusted change sets, then mine at the end. >From 98a96822d9dac92de719fa340fe326e1fe0427fe Mon Sep 17 00:00:00 2001 From: Bo Borgerson <[EMAIL PROTECTED]> Date: Sun, 20 Apr 2008 21:24:16 -0400 Subject: [PATCH] comm: ensure that input files are sorted * NEWS: List new behavior. * doc/coreutils.texi (checkOrderOption) New macro for describing `--check-order' and `--nocheck-order', used in both join and comm. * src/comm.c (main): Initialize new options. (usage): Describe new options. (compare_files): Keep an extra pair of buffers for the previous line from each file to check the internal order. (check_order): If an order-check is required, compare and handle the result appropriately. (copylinebuffer): Copy a linebuffer; used for copy before read. * tests/misc/Makefile.am: List new test. * tests/misc/comm: Tests for the comm program, including the new order-checking functionality and attendant command-line options. --- NEWS |3 + doc/coreutils.texi | 39 src/comm.c | 168 +-- tests/Makefile.am |1 + tests/misc/comm| 124 ++ 5 files changed, 303 insertions(+), 32 deletions(-) create mode 100755 tests/misc/comm diff --git a/NEWS b/NEWS index 97f3162..ba39d2f 100644 --- a/NEWS +++ b/NEWS @@ -9,6 +9,9 @@ GNU coreutils NEWS-*- outline -*- ** New features + comm now verifies that the inputs are in sorted order. This check can + be turned off with the --nocheck-order option. + md5sum now accepts the new option, --quiet, to suppress the printing of 'OK' messages. sha1sum, sha224sum, sha384sum, and sha512sum accept it, too. diff --git a/doc/coreutils.texi b/doc/coreutils.texi index a626b45..3bedd73 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -4449,6 +4449,32 @@ status that does not depend on the result of the comparison. Upon normal completion @command{comm} produces an exit code of zero. If there is an error it exits with nonzero status. [EMAIL PROTECTED] checkOrderOption{cmd} +If the @option{--check-order} option is given, unsorted inputs will +cause a fatal error message. If the option @option{--nocheck-order} +is given, unsorted inputs will never cause an error message. If +neither of these options is given, wrongly sorted inputs are diagnosed +only if an input file is found to contain unpairable lines. If an +input file is diagnosed as being unsorted, the @command{\cmd\} command +will exit with a nonzero status (and the output should not be used). + +Forcing @command{\cmd\} to process wrongly sorted input files +containing unpairable lines by specifying @option{--nocheck-order} is +not guaranteed to produce any particular output. The output will +probably not correspond with whatever you hoped it would be. [EMAIL PROTECTED] macro [EMAIL PROTECTED] + [EMAIL PROTECTED] @samp + [EMAIL PROTECTED] --check-order +Fail with an error message if either input file is wrongly ordered. + [EMAIL PROTECTED] --nocheck-order +Do not check that both input files are in sorted order. + [EMAIL PROTECTED] table + @node tsort invocation @section @command{tsort}: Topological sort @@ -5290,18 +5316,7 @@ c c1 c2 b b1 b2 @end example -If the @option{--check-order} option is given, unsorted inputs will -cause a fatal error message. If the option @option{--nocheck-order} -is given, unsorted inputs will never cause an error message. If -neither of these options is given, wrongly sorted inputs are diagnosed -only if an input file is found to contain unpairable lines. If an -input file is diagnosed as being unsorted, the @command{jo
Re: rebased patches?
Bo Borgerson <[EMAIL PROTECTED]> wrote: > Here are my outstanding patches. These are also available for fetch at: > > git://repo.or.cz/coreutils/bo.git > > They are in the following branches: ... > error-msgs (and syntax checks) Thanks, Bo, I've just pushed that one, slightly adjusted: One of the new checks caught a capitalization nit in truncate.c, too ;-) commit 6eec737ade63bd48e0cccd66c021dd5523100f06 Author: Bo Borgerson <[EMAIL PROTECTED]> Date: Fri Apr 4 11:13:13 2008 -0400 standardize some error messages * maint.mk: (sc_error_message_warn_fatal, sc_error_message_uppercase): (sc_error_message_period): Add automatic checks for non-standard error messages. * .x-sc_error_message_uppercase: explicit exclusion for this check * src/cp.c: Standardize some error messages. * src/date.c: Likewise. * src/dircolors.c: Likewise. * src/du.c: Likewise. * src/expr.c: Likewise. * src/install.c: Likewise. * src/join.c: Likewise. * src/ln.c: Likewise. * src/mv.c: Likewise. * src/od.c: Likewise. * src/pr.c: Likewise. * src/split.c: Likewise. * src/truncate.c: Likewise. * src/wc.c: Likewise. * tests/du/files0-from: Expect new error message. * tests/misc/join: Likewise. * tests/misc/split-a: Likewise. * tests/misc/wc-files0-from: Likewise. * tests/misc/xstrtol: Likewise. * lib/xmemxfrm.c: Likewise. diff --git a/src/truncate.c b/src/truncate.c index f353067..f26fd45 100644 --- a/src/truncate.c +++ b/src/truncate.c @@ -326,7 +326,7 @@ main (int argc, char **argv) /* must specify either size or reference file */ if ((ref_file && got_size) || (!ref_file && !got_size)) { - error (0, 0, _("You must specify one of %s or %s"), + error (0, 0, _("you must specify one of %s or %s"), quote_n (0, "--size"), quote_n (1, "--reference")); usage (EXIT_FAILURE); } ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils