garydgregory commented on code in PR #710:
URL: https://github.com/apache/commons-compress/pull/710#discussion_r2385629397
##########
src/main/java/org/apache/commons/compress/archivers/tar/TarFile.java:
##########
@@ -65,7 +66,7 @@ private final class BoundedTarEntryInputStream extends
BoundedArchiveInputStream
BoundedTarEntryInputStream(final TarArchiveEntry entry, final
SeekableByteChannel channel) throws IOException {
super(entry.getDataOffset(), entry.getRealSize());
if (channel.size() - entry.getSize() < entry.getDataOffset()) {
- throw new ArchiveException("Entry size exceeds archive size");
+ throw new EOFException("Truncated TAR archive: entry size
exceeds archive size");
Review Comment:
I was hoping that we could distinguish between low-level "real"
`IOException`s (reading issues from the underlying stream) and higher-level
formatting and parsing issues using `ArchiveException`. Perhaps this was
misguided. Since `ArchiveException` is a kind of `IOException`, call sites may
choose to catch `IOException` anyway and leave it to readers of stack traces to
figure out the difference.
In this case, and probably others in Compress, eager validation is better I
think. If a call site cares about preserving the state of a stream they can
always use a `PushbackInputStream`.
> I generally favor eager validation, but only when we know the stream will
be fully consumed before handing control back to the caller.
That one doesn't make sense to me. That means that before throwing an
exception, we'd have to read all the way to the end of an input stream? That
sounds like a DoS waiting to happen. Even if you are saying that we should read
to the end only if we throw `EOFException`, then it's still a bad idea for the
same reason IMO.
All of that to say:
- This instance is not an EOF, it's a formatting error.
- It doesn't seem like changing exceptions in this PR is a good idea: It's
thematically different, and the PR is already 42 files, making it harder to
review.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]