Re: [coreutils] [PATCH] uniq: don't continue field processing after end of line

2011-01-17 Thread Pádraig Brady
On 16/01/11 23:53, Sami Kerola wrote:
> Hi,
>
> I noticed that uniq -f 'insane_large_number' keeps the process busy
> for a long time for no good reason. The attached patch should fix this symptom.

I'd slightly amend that to the following,
to match the other limit checks in the function.

diff --git a/src/uniq.c b/src/uniq.c
index 7bdbc4f..9c7e37c 100644
--- a/src/uniq.c
+++ b/src/uniq.c
@@ -214,7 +214,7 @@ find_field (struct linebuffer const *line)
   size_t size = line->length - 1;
   size_t i = 0;

-  for (count = 0; count < skip_fields; count++)
+  for (count = 0; count < skip_fields && i < size; count++)
     {
       while (i < size && isblank (to_uchar (lp[i])))

> I found the bug after a friend of mine asked why uniq does not allow
> specifying a field separator, the way sort -t does. I could not give
> a rational answer, so I looked at the code and tested how it works. That
> inspired me to send the bug fix, which is the obvious thing to do. But what
> about -t: do you think it would be a worthwhile addition to
> uniq?

Yes. Basically `uniq` should support the same -k and -t
functionality that `sort` does. See also http://debbugs.gnu.org/5832

cheers,
Pádraig.
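
For readers following along, the amended loop can be sketched outside of
uniq.c roughly as follows. This is only an illustration, not the coreutils
source: the function name skip_fields_offset and its parameters are made up
for the example, the cast to unsigned char stands in for to_uchar, and the
second inner loop (skipping the non-blank part of a field) is an assumption,
since the hunk quoted above shows only the first one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-alone variant of the loop in find_field():
   return the offset in BUF (SIZE bytes, trailing newline excluded)
   just past the first SKIP_FIELDS blank-separated fields.  */
static size_t
skip_fields_offset (char const *buf, size_t size, size_t skip_fields)
{
  size_t i = 0;

  assert (buf != NULL || size == 0);

  /* The "i < size" test is the fix under discussion: once the line is
     exhausted no iteration can advance i, so stop right away instead
     of counting all the way up to skip_fields.  */
  for (size_t count = 0; count < skip_fields && i < size; count++)
    {
      /* Skip the blanks that precede the field.  */
      while (i < size && isblank ((unsigned char) buf[i]))
        i++;
      /* Skip the field itself (assumed; not shown in the quoted hunk).  */
      while (i < size && !isblank ((unsigned char) buf[i]))
        i++;
    }
  return i;
}
```

With the "i < size" test in place, the outer loop runs at most once per
field actually present on the line, so even the largest possible
skip_fields value returns as soon as the end of the line is reached.
Without it, the loop would keep incrementing count with nothing left to do.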



Re: [coreutils] [PATCH] doc: show how to shred using a single zero-writing pass

2011-01-17 Thread Pádraig Brady
On 17/01/11 10:36, Jim Meyering wrote:
> From 7dc6335653afcdad9a3ffa327877571734644285 Mon Sep 17 00:00:00 2001
> From: Jim Meyering meyer...@redhat.com
> Date: Mon, 17 Jan 2011 11:32:35 +0100
> Subject: [PATCH] doc: show how to shred using a single zero-writing pass
>
> * doc/coreutils.texi (shred invocation): Give an example showing how
> to invoke shred in its most basic (fastest) write-only-zeros mode.
> ---
>  doc/coreutils.texi |9 +
>  1 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index 9c3e2ed..8fb9f0c 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -8892,6 +8892,15 @@ shred invocation
>  shred --verbose /dev/sda5
>  @end example
>
> +On modern disks, a single pass that writes only zeros may be enough,
> +and it will be much faster than the default.

Well, only 3 times faster, since the disk is the bottleneck
(now that we have switched to the fast internal PRNG by default).
Also for security, writing random data would probably be more effective.
So I'd reword the above sentence to:

To simply clear a disk

> +Use a command like this to tell @command{shred} to skip all random
> +passes and to perform only a final zero-writing pass:
> +
> +@example
> +shred --verbose -n0 --zero /dev/sda5
> +@end example

It's probably not worth noting the equivalent:
dd conv=fdatasync bs=2M < /dev/zero > /dev/sda5

cheers,
Pádraig.



Re: [coreutils] [PATCH] uniq: don't continue field processing after end of line

2011-01-17 Thread Jim Meyering
Pádraig Brady wrote:

> On 16/01/11 23:53, Sami Kerola wrote:
>> Hi,
>>
>> I noticed that uniq -f 'insane_large_number' keeps the process busy
>> for a long time for no good reason. The attached patch should fix this symptom.
>
> I'd slightly amend that to the following,
> to match the other limit checks in the function.
>
> diff --git a/src/uniq.c b/src/uniq.c
> index 7bdbc4f..9c7e37c 100644
> --- a/src/uniq.c
> +++ b/src/uniq.c
> @@ -214,7 +214,7 @@ find_field (struct linebuffer const *line)
>    size_t size = line->length - 1;
>    size_t i = 0;
>
> -  for (count = 0; count < skip_fields; count++)
> +  for (count = 0; count < skip_fields && i < size; count++)
>      {
>        while (i < size && isblank (to_uchar (lp[i])))

Thank you!
I've also adjusted NEWS.
Here's your adjusted patch, followed by another to add a test
to exercise the bug/fix:

From bf0ed321332b01fc38eb892d1deac16629aea07c Mon Sep 17 00:00:00 2001
From: Sami Kerola kerol...@iki.fi
Date: Mon, 17 Jan 2011 00:27:06 +0100
Subject: [PATCH] uniq: don't continue field processing after end of line

* NEWS (Bug fixes): Mention it.
* src/uniq.c (find_field): Stop processing loop when end of line
is reached.  Before this fix, 'uniq -f 100 /etc/passwd'
would run for a very long time.
---
 NEWS   |2 ++
 src/uniq.c |2 +-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/NEWS b/NEWS
index 9ccad63..3ec35c7 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,8 @@ GNU coreutils NEWS -*- outline -*-
   rm -f no longer fails for EINVAL or EILSEQ on file systems that
   reject file names invalid for that file system.

+  uniq -f NUM no longer tries to process fields after end of line.
+

 * Noteworthy changes in release 8.9 (2011-01-04) [stable]

diff --git a/src/uniq.c b/src/uniq.c
index 7bdbc4f..9c7e37c 100644
--- a/src/uniq.c
+++ b/src/uniq.c
@@ -214,7 +214,7 @@ find_field (struct linebuffer const *line)
   size_t size = line->length - 1;
   size_t i = 0;

-  for (count = 0; count < skip_fields; count++)
+  for (count = 0; count < skip_fields && i < size; count++)
     {
       while (i < size && isblank (to_uchar (lp[i])))
         i++;
--
1.7.3.5


From 660d57085140c86387f84eb508f4e15caa972bee Mon Sep 17 00:00:00 2001
From: Jim Meyering meyer...@redhat.com
Date: Mon, 17 Jan 2011 12:27:55 +0100
Subject: [PATCH] tests: add a test for today's uniq bug

* tests/misc/uniq-perf: New file.
* tests/Makefile.am (TESTS): Add it.
---
 tests/Makefile.am|1 +
 tests/misc/uniq-perf |   25 +
 2 files changed, 26 insertions(+), 0 deletions(-)
 create mode 100755 tests/misc/uniq-perf

diff --git a/tests/Makefile.am b/tests/Makefile.am
index a5dbd3e..1e4e300 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -286,6 +286,7 @@ TESTS = \
   misc/tty-eof \
   misc/unexpand\
   misc/uniq\
+  misc/uniq-perf   \
   misc/xattr   \
   tail-2/wait  \
   chmod/c-option   \
diff --git a/tests/misc/uniq-perf b/tests/misc/uniq-perf
new file mode 100755
index 000..f000e76
--- /dev/null
+++ b/tests/misc/uniq-perf
@@ -0,0 +1,25 @@
+#!/bin/sh
+# before coreutils-8.10, seq 10|uniq -f 100 would run for days
+
+# Copyright (C) 2011 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. ${srcdir=.}/init.sh; path_prepend_ ../src
+print_ver_ uniq
+
+seq 100 > in || fail=1
+timeout 1 uniq -f 100 in || fail=1
+
+Exit $fail
--
1.7.3.5



[coreutils] Re: [PATCH] uniq: don't continue field processing after end of line

2011-01-17 Thread Andreas Schwab


Jim Meyering jim-oxw1nkzkivjk1umjsbk...@public.gmane.org writes:

> +timeout 1 uniq -f 100 in || fail=1

The Cray-3 is so fast it can execute an infinite loop in under 2
seconds!

Andreas. :-)

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.