Github user myui commented on a diff in the pull request:
https://github.com/apache/incubator-hivemall/pull/97#discussion_r127502672
--- Diff: core/src/main/java/hivemall/utils/io/IOUtils.java ---
@@ -129,6 +131,29 @@ public static int readInt(final InputStream in) throws IOException {
return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
}
+ /**
+ * Look ahead InputStream and decompress it as GZIPInputStream if needed
+ *
+ * @link https://stackoverflow.com/a/4818946
+ */
+ public static InputStream decompressStream(final InputStream is) throws IOException {
+ final PushbackInputStream pb = new PushbackInputStream(is, 2);
+
+ // look ahead
+ int b1 = pb.read();
--- End diff --
This is a potential bug: `read()` returns -1 on EOF, so the result has to be checked before it is used.
Here is a revised version.
```java
// look ahead
final byte[] signature = new byte[2];
// If no byte is available because the stream is at the end of the file,
// the value -1 is returned; otherwise, at least one byte is read and
// stored into the buffer.
final int nread = pb.read(signature);
if (nread > 0) { // may be -1 (EOF) or 1 or 2
    pb.unread(signature, 0, nread); // push back
}
final int streamHeader = ((int) signature[0] & 0xff) | ((signature[1] << 8) & 0xff00);
if (streamHeader == GZIPInputStream.GZIP_MAGIC) {
    return new GZIPInputStream(pb);
} else {
    return pb;
}
```
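For trying the snippet outside the PR, here is a minimal self-contained sketch. The class name `GzipSniffDemo` and the helpers `gzip` and `readAll` are hypothetical and exist only for this illustration. Unlike the snippet above, it loops on the read, on the assumption that the underlying stream may return only one byte per call, and it compares against the magic number only when both bytes were read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of hivemall.
final class GzipSniffDemo {

    /** Wraps the stream in a GZIPInputStream if it starts with the gzip magic number. */
    static InputStream decompressStream(final InputStream is) throws IOException {
        final PushbackInputStream pb = new PushbackInputStream(is, 2);
        final byte[] signature = new byte[2];

        // A single read may return fewer bytes than requested, so keep
        // reading until both bytes are filled or EOF (-1) is reached.
        int nread = 0;
        while (nread < signature.length) {
            final int n = pb.read(signature, nread, signature.length - nread);
            if (n < 0) {
                break; // EOF
            }
            nread += n;
        }
        if (nread > 0) {
            pb.unread(signature, 0, nread); // push back only what was read
        }
        if (nread == signature.length) {
            // The magic number is stored little-endian: 0x1f, then 0x8b.
            final int header = (signature[0] & 0xff) | ((signature[1] & 0xff) << 8);
            if (header == GZIPInputStream.GZIP_MAGIC) {
                return new GZIPInputStream(pb);
            }
        }
        return pb;
    }

    /** Helper for the demo: gzip-compresses a byte array in memory. */
    static byte[] gzip(final byte[] raw) throws IOException {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    /** Helper for the demo: drains a stream into a byte array. */
    static byte[] readAll(final InputStream in) throws IOException {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        final byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf)) != -1) {
            bos.write(buf, 0, n);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        final byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        // Same text, once gzip-compressed, once plain, plus an empty stream.
        final byte[][] inputs = {gzip(raw), raw, new byte[0]};
        for (byte[] input : inputs) {
            final InputStream in = decompressStream(new ByteArrayInputStream(input));
            System.out.println(in.getClass().getSimpleName() + ": '"
                    + new String(readAll(in), StandardCharsets.UTF_8) + "'");
        }
    }
}
```

The empty and one-byte cases are the ones the original `pb.read()` look-ahead was at risk of mishandling, so they are worth covering in a unit test.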