bug#28468: Bug in wc -l found

2017-09-15 Thread Assaf Gordon
tag 28468 notabug
stop

Hello Rob,

On 2017-09-15 03:03 AM, Weidner, Robert (I/EE-31, extern) wrote:
> seems I found a bug in wc, have a look:

[[
the attached screenshot shows:
$ wc -l monitore-serNr_all-run2.txt
16
while the attached file appears to have 17 lines.
]]

This is not a bug, but the expected behavior.
The file you attached does not have a '\n' character
at the end of the last line.

Technically, 'wc' counts the number of '\n' characters,
not the number of lines as they appear to a reader.

This is explained here:
   http://www.pixelbeat.org/docs/coreutils-gotchas.html#wc
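
To make the distinction concrete, here is a minimal Python sketch (not part of the original reply). The function names and the `sample` data are hypothetical stand-ins; the real attachment is only assumed to look like this, with 17 entries and no final newline.

```python
def count_newlines(data: bytes) -> int:
    """Count newline bytes, which is what `wc -l` reports."""
    return data.count(b"\n")


def count_apparent_lines(data: bytes) -> int:
    """Count lines as a reader sees them: an unterminated last line counts too."""
    newlines = data.count(b"\n")
    if data and not data.endswith(b"\n"):
        return newlines + 1
    return newlines


# Hypothetical stand-in for the attached file: 17 entries, no trailing newline.
sample = b"\n".join(b"ID%02d" % i for i in range(17))

print(count_newlines(sample))        # 16
print(count_apparent_lines(sample))  # 17
```

The two counts differ only when the last line is missing its terminating newline.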


As such, I'm marking this as not-a-bug, but discussion can
continue by replying to this thread.

regards,
 - assaf






bug#28468: Bug in wc -l found

2017-09-15 Thread Ruediger Meier
On Friday 15 September 2017, Weidner, Robert (I/EE-31, extern) wrote:
> Dear GNU Team,
>
> seems I found a bug in wc, have a look:
>
> [cid:image001.png@01D32E12.3F5A7C20]
>
> Despite that, I really want to say a BIG thank you for the great
> tool-set, especially tree, which I have used for 20 years now!

This is not a bug. wc -l only counts newlines but the last line of your 
input file does not end with a newline, see
 
https://unix.stackexchange.com/questions/314256/wc-l-not-returning-correct-value/314258#314258


cu,
Rudi





bug#28468: Bug in wc -l found

2017-09-15 Thread Weidner, Robert (I/EE-31, extern)
Dear GNU Team,

seems I found a bug in wc, have a look:

[cid:image001.png@01D32E12.3F5A7C20]

Despite that, I really want to say a BIG thank you for the great tool-set,
especially tree, which I have used for 20 years now!

THX
Rob


With kind regards

Robert Weidner
FAS Architecture / zFAS Platform Test

BFFT Gesellschaft für Fahrzeugtechnik mbH

On behalf of AUDI AG
I/EE-31, extern
85045 Ingolstadt
M: +49 (0) 1520-3427 961
mailto:extern.robert.weid...@audi.de
www.audi.com

Dr.-Ludwig-Kraus-Straße 2, 85080 Gaimersheim, Headquarters, 3rd floor

BFFT Gesellschaft für Fahrzeugtechnik mbH | Dr.-Ludwig-Kraus-Straße 2 | 85080 Gaimersheim
Limited liability company (GmbH) headquartered in Gaimersheim | District court: Ingolstadt - HRB 2775
Managing directors: Markus Fichtner, Michael Mittag | VAT ID No. DE 812 875 182

6CM5021BZW
3CQ4511G1Q
6CM5021XF2
6CM5021Z95
3CQ5133F89
3CQ5131MTH
6CM5021Z93
6CM5021XF1
6CM5021Y39
6CM5021YNY
3CQ5131MS3
3CQ5133F8T
6CM5021Z94
6CM5261TT0
6CM5261TSZ
6CM4F33RZ
6CM5021Y8N

Re: RFC: wc --max-line-length vs. TABs [Re: Bug in wc

2008-08-23 Thread Jim Meyering
Bruno Haible <[EMAIL PROTECTED]> wrote:
> Hi Jim,
>
>> This behavior is not specified, and is currently untested.
>> (it's a GNU invention, from Bruno Haible in textutils-1.22d,
>> which was back in 1997)
>
> The intention of this option is and was to measure the maximum number of
> screen columns used by a file. For many purposes, people are encouraged
> to create/send/commit files with at most 80 screen columns. Or at most 79
> screen columns for others. Or at most 74 columns for GNU texinfo files.
> The option '-L' was intended as a fast check for this metric.
>
> The original mail, sent to bug-gnu-utils on 1997-10-31, had this explanation:
>
>   "While GNU "wc" returns the vertical extent of a piece of text - i.e. the
>    number of lines - it does not yet return the horizontal extent of a piece
>    of text - i.e. the number of columns. This is a useful functionality, if
>    you want to know
>
>     - whether a text will fit on the paper when sent to the printer,
>     - whether an email exceeds the recommended 72 character limit,
>     - (in combination with "nm") how long the identifiers were that made
>       `ranlib' dump core,
>     - etc."
>
> I propose a clarification in the documentation (see below).

Hi Bruno,

Thanks for the patch.  I've applied it.
Indeed, I've used it to check code for lines longer than 80.
Obviously, changing how TAB or multi-byte characters are counted
would break that.


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: RFC: wc --max-line-length vs. TABs [Re: Bug in wc

2008-08-22 Thread Bruno Haible
Hi Jim,

> This behavior is not specified, and is currently untested.
> (it's a GNU invention, from Bruno Haible in textutils-1.22d,
> which was back in 1997)

The intention of this option is and was to measure the maximum number of
screen columns used by a file. For many purposes, people are encouraged
to create/send/commit files with at most 80 screen columns. Or at most 79
screen columns for others. Or at most 74 columns for GNU texinfo files.
The option '-L' was intended as a fast check for this metric.

The original mail, sent to bug-gnu-utils on 1997-10-31, had this explanation:

  "While GNU "wc" returns the vertical extent of a piece of text - i.e. the
   number of lines - it does not yet return the horizontal extent of a piece
   of text - i.e. the number of columns. This is a useful functionality, if
   you want to know

 - whether a text will fit on the paper when sent to the printer,
 - whether an email exceeds the recommended 72 character limit,
 - (in combination with "nm") how long the identifiers were that made
   `ranlib' dump core,
 - etc."

I propose a clarification in the documentation (see below).

> I'm tempted to make the change, but it seems too drastic, after 11 years.
> Do any of you rely on the current TAB-counting behavior of GNU wc?
> 
> Bruno, what do you think?

If you change the option to count every tab as 1, or every character as 1
regardless of its screen width, the option -L is not usable for its main
purpose any more.
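
As an illustration of that main purpose (a sketch added here, not code from the thread), the following Python function flags lines wider than a column limit, expanding tabs to every 8th column. The name `lines_over_limit` is hypothetical, and the sketch assumes every non-tab character occupies exactly one screen column, which does not hold for wide characters.

```python
def lines_over_limit(text: str, limit: int = 80, tabsize: int = 8):
    """Return (line_number, width) for each line wider than `limit` columns.

    Assumption: apart from tabs, each character is one screen column wide.
    """
    offenders = []
    for number, line in enumerate(text.splitlines(), start=1):
        # str.expandtabs pads each tab out to the next multiple of `tabsize`.
        width = len(line.expandtabs(tabsize))
        if width > limit:
            offenders.append((number, width))
    return offenders


text = "short\n" + "\t" * 10 + "x\n" + "y" * 80 + "\n"
print(lines_over_limit(text))  # [(2, 81)]
```

If each tab were counted as a single column, line 2 above would measure 11 and pass the check even though it spans 81 screen columns.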

Bruno


2008-08-22  Bruno Haible  <[EMAIL PROTECTED]>

* doc/coreutils.texi (wc invocation): Explain what the option -L
measures.

--- coreutils.texi.bak  2008-08-22 23:55:47.0 +0200
+++ coreutils.texi  2008-08-22 23:59:03.0 +0200
@@ -3137,7 +3137,9 @@
 
 With the @option{--max-line-length} option, @command{wc} prints the length
 of the longest line per file, and if there is more than one file it
-prints the maximum (not the sum) of those lengths.
+prints the maximum (not the sum) of those lengths.  The line lengths here
+are measured in screen columns, according to the current locale and
+assuming tab positions in every 8th column.
 
 The program accepts the following options.  Also see @ref{Common options}.
 





Re: RFC: wc --max-line-length vs. TABs [Re: Bug in wc

2008-08-22 Thread Arnaldo Mandel
Bo Borgerson wrote (on Aug 22, 2008):
 > 
 > Does it make sense to change the behavior for TAB, but not for "wide"
 > characters?

Relying on an undocumented tab length seems bad.  However, on chars I
suggest you just apply the bug->feature operator: document that line
length is in chars, and explain that chars is a locale-dependent
concept.

Just my 2 cents.

am





Re: RFC: wc --max-line-length vs. TABs [Re: Bug in wc

2008-08-22 Thread Bo Borgerson
Jim Meyering wrote:
> 
> I'm tempted to make the change, but it seems too drastic, after 11 years.
> Do any of you rely on the current TAB-counting behavior of GNU wc?
> 

Hi,

It looks like TAB characters aren't alone in being counted by printed
width rather than count:

$ echo '好' | wc -L
2

Does it make sense to change the behavior for TAB, but not for "wide"
characters?

Bo
diff --git a/src/wc.c b/src/wc.c
index 0bb1929..b3f1ab2 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -378,7 +378,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus)
 		{
 		  int width = wcwidth (wide_char);
 		  if (width > 0)
-			linepos += width;
+			linepos ++;
 		  if (iswspace (wide_char))
 			goto mb_word_separator;
 		  in_word = true;
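
The wide-character case can be approximated in Python as follows. This is only a sketch: `display_width` is a hypothetical helper, and it stands in for `wcwidth` by using the Unicode East Asian Width property, so it ignores zero-width combining characters and control characters.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate screen columns: wide and fullwidth characters count as 2.

    Assumption: every other character is exactly one column wide.
    """
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


print(len("好"), display_width("好"))  # 1 2
```

One character, two columns: the same gap between character count and printed width that the `wc -L` example above shows.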


RFC: wc --max-line-length vs. TABs [Re: Bug in wc

2008-08-22 Thread Jim Meyering
Jim Meyering <[EMAIL PROTECTED]> wrote:
> Arnaldo Mandel <[EMAIL PROTECTED]> wrote:
>> Dear maintainers,
>>
>> There is a bug in the implementation of the -L parameter in wc.
>> It is triggered by
>>
>> http://www.ime.usp.br/~am/122/eps/gapqm2.gz
>>
>> Check this out:
>>
>> $ zcat gapqm2.gz |wc -l -c -L
>>   1 6297954 6353180
>>
>> That is, the single line is longer than the whole file.
>>
>> This was pointed out by
>>
>>   William A. M. Gnann <[EMAIL PROTECTED]>
>
> Thanks for reporting it and for giving credit.
> FYI, here's a smaller reproducer:
>
>   $ printf '\t'|wc -L
>   8

This behavior is not specified, and is currently untested.
(it's a GNU invention, from Bruno Haible in textutils-1.22d,
which was back in 1997)

http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=ab5ff1597f5d734b711fbd95389cefcc8203d51c

I.e., the following change to make --max-line-length (-L)
never count a TAB as more than one byte does not induce
any test failure.

I'm tempted to make the change, but it seems too drastic, after 11 years.
Do any of you rely on the current TAB-counting behavior of GNU wc?

Bruno, what do you think?


diff --git a/src/wc.c b/src/wc.c
index 0bb1929..d44cf96 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -363,7 +363,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus)
                   linepos = 0;
                   goto mb_word_separator;
                 case '\t':
-                  linepos += 8 - (linepos % 8);
+                  linepos++;
                   goto mb_word_separator;
                 case ' ':
                   linepos++;
@@ -437,7 +437,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus)
               linepos = 0;
               goto word_separator;
             case '\t':
-              linepos += 8 - (linepos % 8);
+              linepos++;
               goto word_separator;
             case ' ':
               linepos++;




Bug in wc (cont.)

2008-08-22 Thread Arnaldo Mandel
My earlier bug report lacked a possibly relevant piece of info:

The bug showed up with versions 6.10 and 5.97 of wc, on Linux 2.6.24
and 2.6.11, i686 and x86_64, LC_ALL=C.

am

-- 
Arnaldo Mandel
Departamento de Ciência da Computação - Computer Science Department
Universidade de São Paulo, Bra[sz]il  
[EMAIL PROTECTED]
Talvez você seja um Bright http://the-brights.net Maybe you are a Bright.




Re: Bug in wc

2008-08-22 Thread Jim Meyering
Arnaldo Mandel <[EMAIL PROTECTED]> wrote:
> Dear maintainers,
>
> There is a bug in the implementation of the -L parameter in wc.
> It is triggered by
>
> http://www.ime.usp.br/~am/122/eps/gapqm2.gz
>
> Check this out:
>
> $ zcat gapqm2.gz |wc -l -c -L
>   1 6297954 6353180
>
> That is, the single line is longer than the whole file.
>
> This was pointed out by
>
>   William A. M. Gnann <[EMAIL PROTECTED]>

Thanks for reporting it and for giving credit.
FYI, here's a smaller reproducer:

  $ printf '\t'|wc -L
  8
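
For readers wondering how a line can come out "longer" than its byte count, here is a small Python sketch of the column arithmetic (an illustration, not the wc source). `columns_used` is a hypothetical name, and the input is made-up data; the contents of the reported file are not shown in this thread.

```python
def columns_used(line: bytes, tab_stop: int = 8) -> int:
    """Advance a column counter the way a terminal would for one line."""
    linepos = 0
    for byte in line:
        if byte == 0x09:  # TAB: jump to the next multiple of tab_stop
            linepos += tab_stop - (linepos % tab_stop)
        else:             # assumption: every other byte is one column
            linepos += 1
    return linepos


data = b"\t" * 1000
print(len(data), columns_used(data))  # 1000 8000
```

Each tab is one byte but can advance the column by up to eight, so the column total can exceed the byte total.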




Bug in wc

2008-08-22 Thread Arnaldo Mandel
Dear maintainers,

There is a bug in the implementation of the -L parameter in wc.
It is triggered by 

http://www.ime.usp.br/~am/122/eps/gapqm2.gz

Check this out:

$ zcat gapqm2.gz |wc -l -c -L
  1 6297954 6353180

That is, the single line is longer than the whole file.

This was pointed out by 

  William A. M. Gnann <[EMAIL PROTECTED]>

Have fun!

-- 
Arnaldo Mandel
Departamento de Ciência da Computação - Computer Science Department
Universidade de São Paulo, Bra[sz]il  
[EMAIL PROTECTED]
Talvez você seja um Bright http://the-brights.net Maybe you are a Bright.




Re: bug in 'wc -l'

2004-02-06 Thread Andreas Schwab
[EMAIL PROTECTED] writes:

> On Fri, 6 Feb 2004 13:05:09 +0100 (MET), Alfred M. Szmidt
> <[EMAIL PROTECTED]> posted to gmane.comp.gnu.core-utils.bugs:
>  >I am sending you a patch to solve a 'problem' at the wc program.
>  >When used with -l option (to count the number of lines) the last
>  >line isn't counted.
>  > It counts occurrences of '\n' (i.e. newline).  So I guess that the
>  > behaviour is correct.
>
> If that's your definition of a line, obviously.

It's POSIX's definition of a line.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


___
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-coreutils


Re: bug in 'wc -l'

2004-02-06 Thread Alfred M. Szmidt
>I am sending you a patch to solve a 'problem' at the wc
>program.  When used with -l option (to count the number of
>lines) the last line isn't counted.
> It counts occurrences of '\n' (i.e. newline).  So I guess that
> the behaviour is correct.

   If that's your definition of a line, obviously. However, the poster
   is probably right that the last line of a file should count as a
   line even if it lacks a final newline.

I was quoting the manual; `wc -l' counts how many times it hits a
`newline' (\n).  Not `lines' per se (since that is hard to define
exactly).

,[ (coreutils)wc invocation ]
| `-l'
| `--lines'
|  Print only the newline counts.
`


Cheers!




Re: bug in 'wc -l'

2004-02-06 Thread era+gmane
On Fri, 6 Feb 2004 13:05:09 +0100 (MET), Alfred M. Szmidt
<[EMAIL PROTECTED]> posted to gmane.comp.gnu.core-utils.bugs:
 >I am sending you a patch to solve a 'problem' at the wc program.
 >When used with -l option (to count the number of lines) the last
 >line isn't counted.
 > It counts occurrences of '\n' (i.e. newline).  So I guess that the
 > behaviour is correct.

If that's your definition of a line, obviously. However, the poster is
probably right that the last line of a file should count as a line
even if it lacks a final newline.

 > Well, if it doesn't end with a newline (\n) then it isn't really a
 > line...

Is too. Is not. Is too. Your reasoning smacks of a circular argument,
anyway (nothing personal :-)

/* era */

-- 
formail -s procmail http://www.euro.cauce.org/
cat | more | cat    http://www.debian.org/





Re: bug in 'wc -l'

2004-02-06 Thread Alfred M. Szmidt
   I am sending you a patch to solve a 'problem' at the wc program.
   When used with -l option (to count the number of lines) the last
   line isn't counted.

It counts occurrences of '\n' (i.e. newline).  So I guess that the
behaviour is correct.

   The problem is: if the input does not end with a newline "\n", this
   line is not counted.

Well, if it doesn't end with a newline (\n) then it isn't really a
line...

Cheers.




bug in 'wc -l'

2004-02-06 Thread Thobias Salazar Trevisan

hi,

I am sending you a patch to solve a 'problem' in the wc program.
When used with the -l option (to count the number of lines), the last
line isn't counted.

First of all, I do not know if it is really a bug, because a
line must end with a '\n' char. But consider a file whose
last line does not end with a '\n': does that mean the last line
does not exist?!

Summarizing:
The problem is: if the input does not end with a newline "\n",
this line is not counted. The classic example:

$ echo -n 'Test line' | wc -l
  0

The patch below solves the problem.

PS: coreutils-5.0

HTH,

thobias
---
echo 13344956207444746332132269002206986P | dc
---
http://thobias.org


--- old_wc.c  2004-02-06 08:39:16.0 -0200
+++ wc.c  2004-02-06 09:25:52.0 -0200
@@ -276,6 +276,9 @@
}
  bytes += bytes_read;
}
+   /* input might not end with a newline */
+   if ((buf[strlen(buf)-1]) != '\n')
+   ++lines;
 }
 #if HAVE_MBRTOWC && (MB_LEN_MAX > 1)
 # define SUPPORT_OLD_MBRTOWC 1
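
For comparison, the idea behind the patch (count an unterminated final line as a line) can be sketched in Python as below. This is an illustration only, not GNU wc's behavior and not a translation of the patch: `count_lines_with_partial` is a hypothetical name, and the sketch remembers the last byte actually read instead of calling `strlen` on the read buffer, since a buffer filled by `read` is not guaranteed to be NUL-terminated and the input may be empty.

```python
import io


def count_lines_with_partial(stream, chunk_size: int = 16384) -> int:
    """Count newlines, plus one if the input ends without a newline."""
    lines = 0
    last_byte = b"\n"  # empty input is treated as having no partial line
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        lines += chunk.count(b"\n")
        last_byte = chunk[-1:]  # remember the final byte seen so far
    if last_byte != b"\n":
        lines += 1
    return lines


print(count_lines_with_partial(io.BytesIO(b"Test line")))  # 1
```

With this definition the `echo -n 'Test line'` example yields 1 instead of 0, at the cost of no longer matching the "newline count" that the manual documents for -l.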




Re: Bug in 'wc' --help output

2003-09-06 Thread DervishD
Hi Jim :)

 * Jim Meyering <[EMAIL PROTECTED]> dixit:
> > "Print newline, word and byte counts for each FILE"
> Thanks for the report.
> That was fixed for coreutils-5.0.90.

My apologies, I forgot to take a look at alpha.gnu.org :(
Anyway, thanks for such a good job. The free software community owes
you a lot ;) Thanks for answering.

Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/




Re: Bug in 'wc' --help output

2003-09-06 Thread Jim Meyering
DervishD <[EMAIL PROTECTED]> wrote:
> In GNU coreutils 5.0, the 'wc' binary outputs this when invoked
> with --help option:
...
> Print byte, word, and newline counts for each FILE, and a total line if
...
> POSIX, the default output of wc are newlines, words and bytes, not
> the reverse.
> but the --help text is not correct. It should read:
>
> "Print newline, word and byte counts for each FILE"

Thanks for the report.
That was fixed for coreutils-5.0.90.

  ftp://alpha.gnu.org/gnu/coreutils/coreutils-5.0.90.tar.bz2
  (coreutils is the union of fileutils, textutils, and sh-utils)




Bug in 'wc' --help output

2003-09-06 Thread DervishD
Hi all :)

In GNU coreutils 5.0, the 'wc' binary outputs this when invoked
with --help option:

Usage: wc [OPTION]... [FILE]...
Print byte, word, and newline counts for each FILE, and a total line if
more than one FILE is specified.  With no FILE, or when FILE is -,
read standard input.
[...]

Well, according to the Single Unix Specification v3 and (AFAIK)
POSIX, the default output of wc is newlines, words and bytes, not
the reverse. 'wc' *correctly* outputs the information in that order,
but the --help text is not correct. It should read:

"Print newline, word and byte counts for each FILE"
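
The order in question can be shown with a minimal Python sketch (an illustration, not the wc implementation). `wc_counts` is a hypothetical helper; it splits words on ASCII whitespace and does not reproduce wc's column padding.

```python
def wc_counts(data: bytes):
    """Return (newlines, words, bytes), the order of wc's default output."""
    newlines = data.count(b"\n")
    words = len(data.split())  # split on runs of ASCII whitespace
    return newlines, words, len(data)


print("%d %d %d" % wc_counts(b"hello world\nfoo\n"))  # 2 3 16
```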

Thanks a lot for coreutils, finally I have almost all my system
utilities updated together! That's nice, truly.

Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736
http://www.pleyades.net & http://raul.pleyades.net/

