On Mon, 8 Jan 2024 15:43:34 GMT, Jaikiran Pai <j...@openjdk.org> wrote:

> The crucial part here is that the "current" `ZipEntry` is the one which was 
> created by a previous call to `ZipInputStream.getNextEntry()` method, which 
> is a public overridable method. Furthermore, the `ZipEntry.getExtra()` method 
> too is overridable. So in theory, the byte[] data returned by the 
> `ZipEntry.getExtra()` isn't controlled or validated and can be anything that 
> the overridden method might return. 

Yes, we cannot rely on the extra field contents to be well-formed and valid. 
Because of this, `hasZip64Extra` must be coded in a defensive fashion, meaning 
it returns true only when it finds a valid Zip64 extra field, false otherwise. 
It should also never throw exceptions. This is the case regardless of ZipEntry 
concerns, since the input ZIP data may already be malformed.

> The bigger question however is, should `readEnd()` now be entrusted the 
> responsibility of reading the `ZipEntry.getExtra()` byte[] contents that can 
> be anything and even if the content of the byte[] seem to represent a zip64 
> entry (i.e. the newly introduced `hasZip64Extra()` method returning `true`), 
> should that decision be trusted to further lead the `readEnd()` method to 
> read the compressed/uncompressed sizes in data descriptor as 8 bytes? Would 
> this need additional checks of some form?

To avoid readEND relying on "untrusted" ZipEntry contents, I moved the call to 
`hasZip64Extra` from `readEND` to `readLOC`. This allows `hasZip64Extra` to 
work on the raw extra data.

 For this to work, I needed to add a new internal boolean field 
`ZipInputStream.expectEightBitDataDescriptor`. When true, this flag indicates 
to `readEND` that the next data descriptor has 8-byte size fields.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/12524#issuecomment-1883339627

Reply via email to