Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-10 Thread Jordan Geoghegan



On 1/9/21 1:59 AM, Otto Moerbeek wrote:
> On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote:
>
>> On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:
>>> Hey folks,
>>>
>>> I've noticed some surprising behaviour from cmp(1) when using the '-s'
>>> flag.
>>>
>>> It appears that cmp -s is ignoring the byte offset arguments I'm giving
>>> it.
>> 
>>> Not sure what to make of this, I noticed this same behaviour on
>>> DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious.
>>> This certainly caused some frustration before I figured out what was going
>>> on.
>> The bug seems to be in the short-circuit optimization for regular files[1]:
>>
>> void
>>   c_regular(int fd1, char *file1, off_t skip1, off_t len1,
>>   int fd2, char *file2, off_t skip2, off_t len2)
>>   {
>>  u_char ch, *p1, *p2;
>>  off_t byte, length, line;
>>  int dfound;
>>   
>>  if (sflag && len1 != len2)
>>  exit(1);
>>   
>>  if (skip1 > len1)
>>  eofmsg(file1);
>>  len1 -= skip1;
>>  if (skip2 > len2)
>>  eofmsg(file2);
>>  len2 -= skip2;
>>
>> The short-circuit should probably be moved below the subsequent chunk of
>> code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag,
>> so it'll be quiet.[2] Doing this works for me. See patch at end of message.
>>
>> Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I
>> can confirm FreeBSD still has the problem. (DragonflyBSD has nearly
>> identical code.) But that implementation duplicates the short-circuit, along
>> with the bug of not accounting for skip1 and skip2, in cmp.c as part of
>> implementing the -z flag[5]:
>>
>>  if (special)
>>  c_special(fd1, file1, skip1, fd2, file2, skip2);
>>  else {
>>  if (zflag && sb1.st_size != sb2.st_size) {
>>  if (!sflag)
>>  (void) printf("%s %s differ: size\n",
>>  file1, file2);
>>  exit(DIFF_EXIT);
>>  }
>>  c_regular(fd1, file1, skip1, sb1.st_size,
>>  fd2, file2, skip2, sb2.st_size);
>>  }
>>  exit(0);
>>
>> It appears that the June 20, 2000 fix to the short-circuit in regular.c
>> wasn't recognized during the July 14, 2000 -z feature addition.[6][7]
>>
>> [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12
>> [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7
>> [3] 
>> https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c
>> [4] 
>> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551
>> [5] 
>> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193
>> [6] 
>> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup
>> [7] 
>> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157
>>
>> --- regular.c6 Feb 2015 23:21:59 -   1.12
>> +++ regular.c9 Jan 2021 07:51:13 -
>> @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
>>  off_t byte, length, line;
>>  int dfound;
>>  
>> -if (sflag && len1 != len2)
>> -exit(1);
>> -
>>  if (skip1 > len1)
>>  eofmsg(file1);
>>  len1 -= skip1;
>>  if (skip2 > len2)
>>  eofmsg(file2);
>>  len2 -= skip2;
>> +
>> +if (sflag && len1 != len2)
>> +exit(1);
>>  
>>  length = MINIMUM(len1, len2);
>>  if (length > SIZE_MAX) {
>>
> I came to the same diff independently. In the meantime it has been committed.
>
>   -Otto
>

Hi Otto and William,

Thanks for confirming that this is a bug, and also for the fixes!

Regards,

Jordan



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-09 Thread Otto Moerbeek
On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote:

> On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:
> > Hey folks,
> > 
> > I've noticed some surprising behaviour from cmp(1) when using the '-s'
> > flag.
> > 
> > It appears that cmp -s is ignoring the byte offset arguments I'm giving
> > it.
> 
> > Not sure what to make of this, I noticed this same behaviour on
> > DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious.
> > This certainly caused some frustration before I figured out what was going
> > on.
> 
> The bug seems to be in the short-circuit optimization for regular files[1]:
> 
> void
>   c_regular(int fd1, char *file1, off_t skip1, off_t len1,
>   int fd2, char *file2, off_t skip2, off_t len2)
>   {
>   u_char ch, *p1, *p2;
>   off_t byte, length, line;
>   int dfound;
>   
>   if (sflag && len1 != len2)
>   exit(1);
>   
>   if (skip1 > len1)
>   eofmsg(file1);
>   len1 -= skip1;
>   if (skip2 > len2)
>   eofmsg(file2);
>   len2 -= skip2;
> 
> The short-circuit should probably be moved below the subsequent chunk of
> code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag,
> so it'll be quiet.[2] Doing this works for me. See patch at end of message.
> 
> Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I
> can confirm FreeBSD still has the problem. (DragonflyBSD has nearly
> identical code.) But that implementation duplicates the short-circuit, along
> with the bug of not accounting for skip1 and skip2, in cmp.c as part of
> implementing the -z flag[5]:
> 
>   if (special)
>   c_special(fd1, file1, skip1, fd2, file2, skip2);
>   else {
>   if (zflag && sb1.st_size != sb2.st_size) {
>   if (!sflag)
>   (void) printf("%s %s differ: size\n",
>   file1, file2);
>   exit(DIFF_EXIT);
>   }
>   c_regular(fd1, file1, skip1, sb1.st_size,
>   fd2, file2, skip2, sb2.st_size);
>   }
>   exit(0);
> 
> It appears that the June 20, 2000 fix to the short-circuit in regular.c
> wasn't recognized during the July 14, 2000 -z feature addition.[6][7]
> 
> [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12
> [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7
> [3] 
> https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c
> [4] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551
> [5] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193
> [6] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup
> [7] 
> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157
> 
> --- regular.c 6 Feb 2015 23:21:59 -   1.12
> +++ regular.c 9 Jan 2021 07:51:13 -
> @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
>   off_t byte, length, line;
>   int dfound;
>  
> - if (sflag && len1 != len2)
> - exit(1);
> -
>   if (skip1 > len1)
>   eofmsg(file1);
>   len1 -= skip1;
>   if (skip2 > len2)
>   eofmsg(file2);
>   len2 -= skip2;
> +
> + if (sflag && len1 != len2)
> + exit(1);
>  
>   length = MINIMUM(len1, len2);
>   if (length > SIZE_MAX) {
> 

I came to the same diff independently. In the meantime it has been committed.

-Otto



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-09 Thread William Ahern
On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote:

> Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I
> can confirm FreeBSD still has the problem. (DragonflyBSD has nearly
> identical code.) But that implementation duplicates the short-circuit, along
> with the bug of not accounting for skip1 and skip2, in cmp.c as part of
> implementing the -z flag[5]:
> 
>   if (special)
>   c_special(fd1, file1, skip1, fd2, file2, skip2);
>   else {
>   if (zflag && sb1.st_size != sb2.st_size) {
>   if (!sflag)
>   (void) printf("%s %s differ: size\n",
>   file1, file2);
>   exit(DIFF_EXIT);
>   }
>   c_regular(fd1, file1, skip1, sb1.st_size,
>   fd2, file2, skip2, sb2.st_size);
>   }
>   exit(0);
> 
> It appears that the June 20, 2000 fix to the short-circuit in regular.c
> wasn't recognized during the July 14, 2000 -z feature addition.[6][7]

Note that zflag is set with sflag earlier in FreeBSD's regular.c:

case 's':   /* silent run */
sflag = 1;
zflag = 1;
break;



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-09 Thread William Ahern
On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:
> Hey folks,
> 
> I've noticed some surprising behaviour from cmp(1) when using the '-s'
> flag.
> 
> It appears that cmp -s is ignoring the byte offset arguments I'm giving
> it.

> Not sure what to make of this, I noticed this same behaviour on
> DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious.
> This certainly caused some frustration before I figured out what was going
> on.

The bug seems to be in the short-circuit optimization for regular files[1]:

void
  c_regular(int fd1, char *file1, off_t skip1, off_t len1,
  int fd2, char *file2, off_t skip2, off_t len2)
  {
u_char ch, *p1, *p2;
off_t byte, length, line;
int dfound;
  
if (sflag && len1 != len2)
exit(1);
  
if (skip1 > len1)
eofmsg(file1);
len1 -= skip1;
if (skip2 > len2)
eofmsg(file2);
len2 -= skip2;

The short-circuit should probably be moved below the subsequent chunk of
code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag,
so it'll be quiet.[2] Doing this works for me. See patch at end of message.

Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I
can confirm FreeBSD still has the problem. (DragonflyBSD has nearly
identical code.) But that implementation duplicates the short-circuit, along
with the bug of not accounting for skip1 and skip2, in cmp.c as part of
implementing the -z flag[5]:

if (special)
c_special(fd1, file1, skip1, fd2, file2, skip2);
else {
if (zflag && sb1.st_size != sb2.st_size) {
if (!sflag)
(void) printf("%s %s differ: size\n",
file1, file2);
exit(DIFF_EXIT);
}
c_regular(fd1, file1, skip1, sb1.st_size,
fd2, file2, skip2, sb2.st_size);
}
exit(0);

It appears that the June 20, 2000 fix to the short-circuit in regular.c
wasn't recognized during the July 14, 2000 -z feature addition.[6][7]

[1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12
[2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7
[3] 
https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c
[4] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551
[5] 
https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193
[6] 
https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup
[7] 
https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157

--- regular.c   6 Feb 2015 23:21:59 -   1.12
+++ regular.c   9 Jan 2021 07:51:13 -
@@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
off_t byte, length, line;
int dfound;
 
-   if (sflag && len1 != len2)
-   exit(1);
-
if (skip1 > len1)
eofmsg(file1);
len1 -= skip1;
if (skip2 > len2)
eofmsg(file2);
len2 -= skip2;
+
+   if (sflag && len1 != len2)
+   exit(1);
 
length = MINIMUM(len1, len2);
if (length > SIZE_MAX) {



Re: cmp(1) '-s' flag ignoring byte offset argument?

2021-01-08 Thread Otto Moerbeek
On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote:

> Hey folks,
> 
> I've noticed some surprising behaviour from cmp(1) when using the '-s' flag.
> 
> It appears that cmp -s is ignoring the byte offset arguments I'm giving it.
> 
> I don't want to waste time babbling, so here's an example snippet to show 
> what I'm talking about:
> 
> #!/bin/sh
> 
> echo 'my line' > /tmp/1.txt
> echo 'my other line' >> /tmp/1.txt
> echo 'same same' >> /tmp/1.txt
> 
> echo 'my differnt line' > /tmp/2.txt
> echo 'my other different line' >> /tmp/2.txt
> echo 'same same' >> /tmp/2.txt
> 
> # Determine byte offsets (we only want to compare lines >= 3)
> offset1="$(head -2 /tmp/1.txt | wc -c)"
> offset2="$(head -2 /tmp/2.txt | wc -c)"
> 
> # Compare files and show exit code
> cmp /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
> printf '\nReturn code = %s\n' "$?"
> 
> cmp -s /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
> printf '\nReturn code with "-s" = %s\n' "$?"
> 
> As you can see, 'cmp -s' returns an exit code of '1', unlike cmp without the 
> '-s' which returns '0'.
> 
> Not sure what to make of this, I noticed this same behaviour on DragonflyBSD 
> and FreeBSD, so maybe I'm just missing something obvious. This certainly 
> caused some frustration before I figured out what was going on.
> 
> Regards,
> 
> Jordan
> 

This is a bug. It has been there since the beginning, according to
http://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c

FreeBSD has it fixed, NetBSD not.

-Otto

Index: regular.c
===
RCS file: /cvs/src/usr.bin/cmp/regular.c,v
retrieving revision 1.12
diff -u -p -r1.12 regular.c
--- regular.c   6 Feb 2015 23:21:59 -   1.12
+++ regular.c   9 Jan 2021 06:53:20 -
@@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk
off_t byte, length, line;
int dfound;
 
-   if (sflag && len1 != len2)
-   exit(1);
-
if (skip1 > len1)
eofmsg(file1);
len1 -= skip1;
if (skip2 > len2)
eofmsg(file2);
len2 -= skip2;
+
+   if (sflag && len1 != len2)
+   exit(1);
 
length = MINIMUM(len1, len2);
if (length > SIZE_MAX) {



cmp(1) '-s' flag ignoring byte offset argument?

2021-01-08 Thread Jordan Geoghegan
Hey folks,

I've noticed some surprising behaviour from cmp(1) when using the '-s' flag.

It appears that cmp -s is ignoring the byte offset arguments I'm giving it.

I don't want to waste time babbling, so here's an example snippet to show what 
I'm talking about:

#!/bin/sh

echo 'my line' > /tmp/1.txt
echo 'my other line' >> /tmp/1.txt
echo 'same same' >> /tmp/1.txt

echo 'my differnt line' > /tmp/2.txt
echo 'my other different line' >> /tmp/2.txt
echo 'same same' >> /tmp/2.txt

# Determine byte offsets (we only want to compare lines >= 3)
offset1="$(head -2 /tmp/1.txt | wc -c)"
offset2="$(head -2 /tmp/2.txt | wc -c)"

# Compare files and show exit code
cmp /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
printf '\nReturn code = %s\n' "$?"

cmp -s /tmp/1.txt /tmp/2.txt "$offset1" "$offset2"
printf '\nReturn code with "-s" = %s\n' "$?"

As you can see, 'cmp -s' returns an exit code of '1', unlike cmp without the 
'-s' which returns '0'.

Not sure what to make of this, I noticed this same behaviour on DragonflyBSD 
and FreeBSD, so maybe I'm just missing something obvious. This certainly caused 
some frustration before I figured out what was going on.

Regards,

Jordan