On Wed, May 23, 2018 at 11:34:52AM +0900, Junio C Hamano wrote:

> > @@ -90,13 +99,32 @@ foreach my $tar_file (@ARGV)
> >                     Z8 Z1 Z100 Z6
> >                     Z2 Z32 Z32 Z8 Z8 Z*', $_;
> >             }
> > -           next if $name =~ m{/\z};
> >             $mode = oct $mode;
> >             $size = oct $size;
> >             $mtime = oct $mtime;
> >             next if $typeflag == 5; # directory
> >  
> > -           if ($typeflag != 1) { # handle hard links later
> > +           if ($typeflag eq 'x') { # extended header
> > +                   # If extended header, check for path
> > +                   my $pax_header = '';
> > +                   while ($size > 0 && read(I, $_, 512) == 512) {
> 
> Would we ever get a short-read (i.e. we ask to read 512 bytes,
> syscall returns after reading only 256 bytes, even though next call
> to read would give the remaining 256 bytes and later ones)?

No, because Perl's read() is buffered (you need sysread() to get a real
read syscall). We might read fewer than 512 bytes if we hit EOF, but I
think that would mean truncated input, since ustar does everything in
512-byte records.

I do think we'd fail to notice the truncation, which isn't ideal. But it
looks like the rest of the script suffers from the same issue.

If anybody cares, it might not be too hard to wrap all of the 512-byte
read calls into a helper that dies on bogus input. I sort of assumed
this was mostly a proof of concept script and nobody used it, though. :)
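
Something like the following is roughly what I have in mind. It is only
a sketch: read_record() is a made-up name, nothing like it exists in the
script today, and it takes a lexical filehandle rather than the bareword
handle the script actually uses. It relies on read() being buffered, so
a count other than 512 can only mean EOF or an error.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the existing script): return the next
# 512-byte ustar record from $fh, undef on a clean EOF, and die if the
# input ends in the middle of a record.
sub read_record {
    my ($fh) = @_;
    my $got = read($fh, my $buf, 512);
    die "read error: $!\n" unless defined $got;
    return undef if $got == 0;    # EOF exactly on a record boundary
    die "truncated tar input: got $got bytes, expected 512\n"
        if $got != 512;
    return $buf;
}

# Demo on an in-memory handle: one full record followed by a partial one.
my $demo_data = ("A" x 512) . ("B" x 100);
open(my $demo_fh, '<', \$demo_data) or die "cannot open in-memory handle: $!";
my $first = read_record($demo_fh);            # full 512-byte record
my $survived = eval { read_record($demo_fh); 1 };  # dies, so eval is false
```

The existing loops would then test the return value for definedness
instead of comparing read()'s result to 512, and a truncated archive
would abort the import instead of being silently accepted.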

It makes me wonder if there is a better-tested tar-reading module on
CPAN that could be used (though at the expense of requiring an extra
dependency).

-Peff
