On Mon, Feb 8, 2021 at 2:06 AM Yi-yo Chiang via Toybox < toybox@lists.landley.net> wrote:
> > > On Mon, Feb 8, 2021 at 5:19 PM Rob Landley <r...@landley.net> wrote: > >> On 2/7/21 10:10 AM, Yi-Yo Chiang via Toybox wrote: >> > If file type is symlink and readlink() fails or returns unexpected link >> > size, then the file body wouldn't be written, resulting in a misaligned >> > archive. >> >> Yup, misaligned archive is definitely a bug. I wonder if I can reproduce >> it... >> >> > As explained in the comments, >> >> That's a bigger code comment than I'm checking in, FYI. (Large comments >> go in >> commit messages. In code, comments aren't executable and thus don't get >> regression tested, so having big ones inline is dubious unless they have >> a TODO >> in them.) >> >> > there are some cases where the link size >> > returned by lstat() and readlink() may be different. >> > To accommodate this, we readlink() and update the value of st_size >> > before writing the file header. >> >> I.E. the lstat() link size is not authoritative, the readlink() size is. >> >> > This may happen on Android if the userdata filesystem is encrypted. >> > >> > --- >> > tests/cpio.test | 2 +- >> > toys/posix/cpio.c | 24 ++++++++++++++++++++---- >> > 2 files changed, 21 insertions(+), 5 deletions(-) >> > >> > diff --git a/tests/cpio.test b/tests/cpio.test >> > index 11b3a5bf..a12edd09 100755 >> > --- a/tests/cpio.test >> > +++ b/tests/cpio.test >> > @@ -39,7 +39,7 @@ rm a bb ccc dddd >> > >> > # archive dangling symlinks and empty files even if we cannot open them >> > touch a; chmod a-rwx a; ln -s a/cant b >> > -toyonly testing "archives unreadable empty files" "cpio -o -H >> newc|cpio -it" "a\nb\n" "" "a\nb\n" >> > +toyonly testing "archives unreadable empty files" "cpio -o -H >> newc|cpio -it" "a\nb\na\n" "" "a\nb\na\n" >> > chmod u+rw a; rm -f a b >> >> What does this change accomplish? The changed test still passes without >> your >> code change, so what did you actually demonstrate? >> >> Ah, this fails on your broken filesystem that's returning inconsistent >> data. So >> really you just need to swap the order of a and b so the symlink has an >> entry >> after it... >> >> > diff --git a/toys/posix/cpio.c b/toys/posix/cpio.c >> > index 09c99ae1..200fb365 100644 >> > --- a/toys/posix/cpio.c >> > +++ b/toys/posix/cpio.c >> > @@ -230,6 +230,7 @@ void cpio_main(void) >> > unsigned nlen, error = 0, zero = 0; >> > int len, fd = -1; >> > ssize_t llen; >> > + char *lnkbuf = NULL; >> > >> > len = getline(&name, &size, stdin); >> > if (len<1) break; >> > @@ -242,6 +243,22 @@ void cpio_main(void) >> > continue; >> > } >> > >> > + // It is possible that readlink() may fail or the actual link >> size is >> > + // different from that indicated by lstat(). This may be due to >> a race >> > + // condition (the link is removed/changed between the call to >> lstat() and >> > + // readlink()), >> >> In which case it should get caught with the lstat failure above and >> reported as >> an error before we write out a header for it. (Better to avoid writing a >> bad >> entry at all than emit nonsense for padding reasons...) >> >> > or may happen on encrypted filesystem where lstat() >> > + // returns the size of encrypted link and readlink() returns the >> size of >> > + // decrypted link. >> >> Still really seems like a filesystem bug to me, but we must work around >> the bugs >> that exist. (And it's not as broken as some of the crap reiserfs pulled >> back in >> the day...) > > > Yeah I think this is a filesystem or kernel bug (fscrypt?) and the real > good fix is for the kernel to report consistent lstat() results, but we > have to remedy for existing devices out there that's running with this > buggy lstat() :/ > I've raised a question regarding this issue to our kernel team. > i'll be curious to hear whether it's actually a bug, or a deliberate obfuscation: "no, i'm not even going to tell you the _length_ of this encrypted information because i worry you might be able to make valid inferences from that". https://en.wikipedia.org/wiki/Padding_(cryptography)#Traffic_analysis_and_protection_via_padding kind of thing. > > Thus we first readlink() and update the st_size, bail >> > + // out early if readlink() fails, then write the file header and >> the link. >> > + if (S_ISLNK(st.st_mode)) { >> > + lnkbuf = xreadlink(name); >> > + if (!lnkbuf) { >> > + perror_msg("readlink '%s'", name); >> > + continue; >> > + } >> > + st.st_size = strlen(lnkbuf); >> > + } >> > + >> > if (FLAG(no_preserve_owner)) st.st_uid = st.st_gid = 0; >> > if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) st.st_size = 0; >> > if (st.st_size >> 32) perror_msg("skipping >2G file '%s'", name); >> > @@ -262,9 +279,7 @@ void cpio_main(void) >> > // Write out body for symlink or regular file >> > llen = st.st_size; >> > if (S_ISLNK(st.st_mode)) { >> > - if (readlink(name, toybuf, sizeof(toybuf)-1) == llen) >> > - xwrite(afd, toybuf, llen); >> > - else perror_msg("readlink '%s'", name); >> >> My original bug was that should have been perror_exit() because the file >> was >> corrupted otherwise. Or the xwrite() should have been unconditional so it >> didn't >> get misaligned. Ala I _should_ have written: >> >> if (readlink(name, toybuf, sizeof(toybuf)-1) != llen) >> perror_msg("readlink '%s'", name); >> xwrite(afd, toybuf, llen); >> >> But doing the readlink with the open() and noticing failure before >> writing the >> header is the better fix, and then using the link string length to >> override what >> stat() says becomes a one line bug workaround for the new issue you hit... >> >> > + xwrite(afd, lnkbuf, llen); >> > } else while (llen) { >> > nlen = llen > sizeof(toybuf) ? sizeof(toybuf) : llen; >> > llen -= nlen; >> > @@ -276,7 +291,8 @@ void cpio_main(void) >> > llen = st.st_size & 3; >> > if (llen) xwrite(afd, &zero, 4-llen); >> > } >> > - close(fd); >> > + if (lnkbuf) free(lnkbuf); >> > + if (fd != -1) close(fd); >> >> free(0) and close(-1) are both defined as NOPs by posix. > > > Good to know this!. > > >> I usually test for the >> close one because it's an exra syscall and noise in strace, but free(0) >> is just >> a libc call. And in fact, I can just modify xclose() to do the -1 test >> for me... >> >> (I've gotten REALLY lax about using xclose because as far as I know the >> only >> time close() can EVER fail is on NFS, but I admit "not having to test for >> -1" is >> a reason to use it...) >> >> > } >> > free(name); >> >> I checked in a slightly different fix. Does that solve your problem? >> > > Yup, verified on my android test bench. > > >> >> Thanks, >> >> Rob >> > > > -- > > Yi-yo Chiang > Software Engineer > yochi...@google.com > _______________________________________________ > Toybox mailing list > Toybox@lists.landley.net > http://lists.landley.net/listinfo.cgi/toybox-landley.net >
_______________________________________________ Toybox mailing list Toybox@lists.landley.net http://lists.landley.net/listinfo.cgi/toybox-landley.net