Re: cmp(1) '-s' flag ignoring byte offset argument?
On 1/9/21 1:59 AM, Otto Moerbeek wrote: > On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote: > >> On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote: >>> Hey folks, >>> >>> I've noticed some surprising behaviour from cmp(1) when using the '-s' >>> flag. >>> >>> It appears that cmp -s is ignoring the byte offset arguments I'm giving >>> it. >> >>> Not sure what to make of this, I noticed this same behaviour on >>> DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious. >>> This certainly caused some frustration before I figured out what was going >>> on. >> The bug seems to be in the short-circuit optimization for regular files[1]: >> >> void >> c_regular(int fd1, char *file1, off_t skip1, off_t len1, >> int fd2, char *file2, off_t skip2, off_t len2) >> { >> u_char ch, *p1, *p2; >> off_t byte, length, line; >> int dfound; >> >> if (sflag && len1 != len2) >> exit(1); >> >> if (skip1 > len1) >> eofmsg(file1); >> len1 -= skip1; >> if (skip2 > len2) >> eofmsg(file2); >> len2 -= skip2; >> >> The short-circuit should probably be moved below the subsequent chunk of >> code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag, >> so it'll be quiet.[2] Doing this works for me. See patch at end of message. >> >> Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I >> can confirm FreeBSD still has the problem. (DragonflyBSD has nearly >> identical code.) But that implementation duplicates the short-circuit, along >> with the bug of not accounting for skip1 and skip2, in cmp.c as part of >> implementing the -z flag[5]: >> >> if (special) >> c_special(fd1, file1, skip1, fd2, file2, skip2); >> else { >> if (zflag && sb1.st_size != sb2.st_size) { >> if (!sflag) >> (void) printf("%s %s differ: size\n", >> file1, file2); >> exit(DIFF_EXIT); >> } >> c_regular(fd1, file1, skip1, sb1.st_size, >> fd2, file2, skip2, sb2.st_size); >> } >> exit(0); >> >> It appears that the June 20, 2000 fix to the short-circuit in regular.c >> wasn't recognized during the July 14, 2000 -z feature addition.[6][7] >> >> [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12 >> [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7 >> [3] >> https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c >> [4] >> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551 >> [5] >> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193 >> [6] >> https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup >> [7] >> https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157 >> >> --- regular.c6 Feb 2015 23:21:59 - 1.12 >> +++ regular.c9 Jan 2021 07:51:13 - >> @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk >> off_t byte, length, line; >> int dfound; >> >> -if (sflag && len1 != len2) >> -exit(1); >> - >> if (skip1 > len1) >> eofmsg(file1); >> len1 -= skip1; >> if (skip2 > len2) >> eofmsg(file2); >> len2 -= skip2; >> + >> +if (sflag && len1 != len2) >> +exit(1); >> >> length = MINIMUM(len1, len2); >> if (length > SIZE_MAX) { >> > I came to the same diff independently. In the meantime it has been committed. > > -Otto > Hi Otto and William, Thanks for confirming that this is a bug, and also for the fixes! Regards, Jordan
Re: cmp(1) '-s' flag ignoring byte offset argument?
On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote: > On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote: > > Hey folks, > > > > I've noticed some surprising behaviour from cmp(1) when using the '-s' > > flag. > > > > It appears that cmp -s is ignoring the byte offset arguments I'm giving > > it. > > > Not sure what to make of this, I noticed this same behaviour on > > DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious. > > This certainly caused some frustration before I figured out what was going > > on. > > The bug seems to be in the short-circuit optimization for regular files[1]: > > void > c_regular(int fd1, char *file1, off_t skip1, off_t len1, > int fd2, char *file2, off_t skip2, off_t len2) > { > u_char ch, *p1, *p2; > off_t byte, length, line; > int dfound; > > if (sflag && len1 != len2) > exit(1); > > if (skip1 > len1) > eofmsg(file1); > len1 -= skip1; > if (skip2 > len2) > eofmsg(file2); > len2 -= skip2; > > The short-circuit should probably be moved below the subsequent chunk of > code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag, > so it'll be quiet.[2] Doing this works for me. See patch at end of message. > > Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I > can confirm FreeBSD still has the problem. (DragonflyBSD has nearly > identical code.) But that implementation duplicates the short-circuit, along > with the bug of not accounting for skip1 and skip2, in cmp.c as part of > implementing the -z flag[5]: > > if (special) > c_special(fd1, file1, skip1, fd2, file2, skip2); > else { > if (zflag && sb1.st_size != sb2.st_size) { > if (!sflag) > (void) printf("%s %s differ: size\n", > file1, file2); > exit(DIFF_EXIT); > } > c_regular(fd1, file1, skip1, sb1.st_size, > fd2, file2, skip2, sb2.st_size); > } > exit(0); > > It appears that the June 20, 2000 fix to the short-circuit in regular.c > wasn't recognized during the July 14, 2000 -z feature addition.[6][7] > > [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12 > [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7 > [3] > https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c > [4] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551 > [5] > https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193 > [6] > https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup > [7] > https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157 > > --- regular.c 6 Feb 2015 23:21:59 - 1.12 > +++ regular.c 9 Jan 2021 07:51:13 - > @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk > off_t byte, length, line; > int dfound; > > - if (sflag && len1 != len2) > - exit(1); > - > if (skip1 > len1) > eofmsg(file1); > len1 -= skip1; > if (skip2 > len2) > eofmsg(file2); > len2 -= skip2; > + > + if (sflag && len1 != len2) > + exit(1); > > length = MINIMUM(len1, len2); > if (length > SIZE_MAX) { > I came to the same diff independently. In the meantime it has been committed. -Otto
Re: cmp(1) '-s' flag ignoring byte offset argument?
On Sat, Jan 09, 2021 at 12:05:31AM -0800, William Ahern wrote: > Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I > can confirm FreeBSD still has the problem. (DragonflyBSD has nearly > identical code.) But that implementation duplicates the short-circuit, along > with the bug of not accounting for skip1 and skip2, in cmp.c as part of > implementing the -z flag[5]: > > if (special) > c_special(fd1, file1, skip1, fd2, file2, skip2); > else { > if (zflag && sb1.st_size != sb2.st_size) { > if (!sflag) > (void) printf("%s %s differ: size\n", > file1, file2); > exit(DIFF_EXIT); > } > c_regular(fd1, file1, skip1, sb1.st_size, > fd2, file2, skip2, sb2.st_size); > } > exit(0); > > It appears that the June 20, 2000 fix to the short-circuit in regular.c > wasn't recognized during the July 14, 2000 -z feature addition.[6][7] Note that zflag is set with sflag earlier in FreeBSD's regular.c: case 's': /* silent run */ sflag = 1; zflag = 1; break;
Re: cmp(1) '-s' flag ignoring byte offset argument?
On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote: > Hey folks, > > I've noticed some surprising behaviour from cmp(1) when using the '-s' > flag. > > It appears that cmp -s is ignoring the byte offset arguments I'm giving > it. > Not sure what to make of this, I noticed this same behaviour on > DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious. > This certainly caused some frustration before I figured out what was going > on. The bug seems to be in the short-circuit optimization for regular files[1]: void c_regular(int fd1, char *file1, off_t skip1, off_t len1, int fd2, char *file2, off_t skip2, off_t len2) { u_char ch, *p1, *p2; off_t byte, length, line; int dfound; if (sflag && len1 != len2) exit(1); if (skip1 > len1) eofmsg(file1); len1 -= skip1; if (skip2 > len2) eofmsg(file2); len2 -= skip2; The short-circuit should probably be moved below the subsequent chunk of code (i.e. below `len2 -= skip2`). The eofmsg function already obeys sflag, so it'll be quiet.[2] Doing this works for me. See patch at end of message. Interestingly, DragonflyBSD and FreeBSD already do it this way[3][4], yet I can confirm FreeBSD still has the problem. (DragonflyBSD has nearly identical code.) But that implementation duplicates the short-circuit, along with the bug of not accounting for skip1 and skip2, in cmp.c as part of implementing the -z flag[5]: if (special) c_special(fd1, file1, skip1, fd2, file2, skip2); else { if (zflag && sb1.st_size != sb2.st_size) { if (!sflag) (void) printf("%s %s differ: size\n", file1, file2); exit(DIFF_EXIT); } c_regular(fd1, file1, skip1, sb1.st_size, fd2, file2, skip2, sb2.st_size); } exit(0); It appears that the June 20, 2000 fix to the short-circuit in regular.c wasn't recognized during the July 14, 2000 -z feature addition.[6][7] [1] https://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c?rev=1.12 [2] https://cvsweb.openbsd.org/src/usr.bin/cmp/misc.c?rev=1.7 [3] https://gitweb.dragonflybsd.org/dragonfly.git/blob/4d4f84f:/usr.bin/cmp/regular.c [4] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=344551 [5] https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?revision=344551&view=markup#l193 [6] https://svnweb.freebsd.org/base/head/usr.bin/cmp/regular.c?revision=61883&view=markup [7] https://svnweb.freebsd.org/base/head/usr.bin/cmp/cmp.c?view=markup&pathrev=63157 --- regular.c 6 Feb 2015 23:21:59 - 1.12 +++ regular.c 9 Jan 2021 07:51:13 - @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk off_t byte, length, line; int dfound; - if (sflag && len1 != len2) - exit(1); - if (skip1 > len1) eofmsg(file1); len1 -= skip1; if (skip2 > len2) eofmsg(file2); len2 -= skip2; + + if (sflag && len1 != len2) + exit(1); length = MINIMUM(len1, len2); if (length > SIZE_MAX) {
Re: cmp(1) '-s' flag ignoring byte offset argument?
On Fri, Jan 08, 2021 at 07:09:01PM -0800, Jordan Geoghegan wrote: > Hey folks, > > I've noticed some surprising behaviour from cmp(1) when using the '-s' flag. > > It appears that cmp -s is ignoring the byte offset arguments I'm giving it. > > I don't want to waste time babbling, so here's an example snippet to show > what I'm talking about: > > #!/bin/sh > > echo 'my line' > /tmp/1.txt > echo 'my other line' >> /tmp/1.txt > echo 'same same' >> /tmp/1.txt > > echo 'my differnt line' > /tmp/2.txt > echo 'my other different line' >> /tmp/2.txt > echo 'same same' >> /tmp/2.txt > > # Determine byte offsets (we only want to compare lines >= 3) > offset1="$(head -2 /tmp/1.txt | wc -c)" > offset2="$(head -2 /tmp/2.txt | wc -c)" > > # Compare files and show exit code > cmp /tmp/1.txt /tmp/2.txt "$offset1" "$offset2" > printf '\nReturn code = %s\n' "$?" > > cmp -s /tmp/1.txt /tmp/2.txt "$offset1" "$offset2" > printf '\nReturn code with "-s" = %s\n' "$?" > > As you can see, 'cmp -s' returns an exit code of '1', unlike cmp without the > '-s' which returns '0'. > > Not sure what to make of this, I noticed this same behaviour on DragonflyBSD > and FreeBSD, so maybe I'm just missing something obvious. This certainly > caused some frustration before I figured out what was going on. > > Regards, > > Jordan > This is a bug. It has been there since the beginning, according to http://cvsweb.openbsd.org/src/usr.bin/cmp/regular.c FreeBSD has it fixed, NetBSD not. -Otto Index: regular.c === RCS file: /cvs/src/usr.bin/cmp/regular.c,v retrieving revision 1.12 diff -u -p -r1.12 regular.c --- regular.c 6 Feb 2015 23:21:59 - 1.12 +++ regular.c 9 Jan 2021 06:53:20 - @@ -51,15 +51,15 @@ c_regular(int fd1, char *file1, off_t sk off_t byte, length, line; int dfound; - if (sflag && len1 != len2) - exit(1); - if (skip1 > len1) eofmsg(file1); len1 -= skip1; if (skip2 > len2) eofmsg(file2); len2 -= skip2; + + if (sflag && len1 != len2) + exit(1); length = MINIMUM(len1, len2); if (length > SIZE_MAX) {
cmp(1) '-s' flag ignoring byte offset argument?
Hey folks, I've noticed some surprising behaviour from cmp(1) when using the '-s' flag. It appears that cmp -s is ignoring the byte offset arguments I'm giving it. I don't want to waste time babbling, so here's an example snippet to show what I'm talking about: #!/bin/sh echo 'my line' > /tmp/1.txt echo 'my other line' >> /tmp/1.txt echo 'same same' >> /tmp/1.txt echo 'my differnt line' > /tmp/2.txt echo 'my other different line' >> /tmp/2.txt echo 'same same' >> /tmp/2.txt # Determine byte offsets (we only want to compare lines >= 3) offset1="$(head -2 /tmp/1.txt | wc -c)" offset2="$(head -2 /tmp/2.txt | wc -c)" # Compare files and show exit code cmp /tmp/1.txt /tmp/2.txt "$offset1" "$offset2" printf '\nReturn code = %s\n' "$?" cmp -s /tmp/1.txt /tmp/2.txt "$offset1" "$offset2" printf '\nReturn code with "-s" = %s\n' "$?" As you can see, 'cmp -s' returns an exit code of '1', unlike cmp without the '-s' which returns '0'. Not sure what to make of this, I noticed this same behaviour on DragonflyBSD and FreeBSD, so maybe I'm just missing something obvious. This certainly caused some frustration before I figured out what was going on. Regards, Jordan