PARQUET-783: Close the underlying stream when an H2SeekableInputStream is closed
This PR addresses https://issues.apache.org/jira/browse/PARQUET-783.

`ParquetFileReader` opens a `SeekableInputStream` to read a footer. In the process, it opens a new `FSDataInputStream` and wraps it. However, `H2SeekableInputStream` does not override the `close` method. Therefore, when `ParquetFileReader` closes it, the underlying `FSDataInputStream` is not closed. As a result, these stale connections can exhaust a cluster's data nodes' connection resources and lead to mysterious HDFS read failures in HDFS clients, e.g.

```
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-905337612-172.16.70.103-1444328960665:blk_1720536852_646811517
```

Author: Michael Allman <mich...@videoamp.com>

Closes #388 from mallman/parquet-783-close_underlying_inputstream and squashes the following commits:

f4b27c1 [Michael Allman] PARQUET-783 Close the underlying stream when an H2SeekableInputStream is closed

Project: http://git-wip-us.apache.org/repos/asf/parquet-mr/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-mr/commit/091ea27c
Tree: http://git-wip-us.apache.org/repos/asf/parquet-mr/tree/091ea27c
Diff: http://git-wip-us.apache.org/repos/asf/parquet-mr/diff/091ea27c

Branch: refs/heads/parquet-1.8.x
Commit: 091ea27cb4989fe4692b2af095d4d99b2c596ce8
Parents: 670acd1
Author: Michael Allman <mich...@videoamp.com>
Authored: Mon Dec 5 15:27:14 2016 -0800
Committer: Ryan Blue <b...@apache.org>
Committed: Mon Jan 9 16:58:15 2017 -0800

----------------------------------------------------------------------
 .../org/apache/parquet/hadoop/util/H2SeekableInputStream.java | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/parquet-mr/blob/091ea27c/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
----------------------------------------------------------------------
diff --git a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
index a706546..ec4567e 100644
--- a/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
+++ b/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/H2SeekableInputStream.java
@@ -45,6 +45,11 @@ class H2SeekableInputStream extends SeekableInputStream {
   }
 
   @Override
+  public void close() throws IOException {
+    stream.close();
+  }
+
+  @Override
   public long getPos() throws IOException {
     return stream.getPos();
   }
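
For readers who want to see the failure mode in isolation, here is a minimal, self-contained sketch that needs no Hadoop or Parquet on the classpath. It relies on the fact that `java.io.InputStream.close()` does nothing by default, so a wrapper that merely holds a delegate never closes it unless it overrides `close`. The class names `TrackedStream`, `LeakyWrapper`, and `ClosingWrapper` are hypothetical stand-ins for `FSDataInputStream` and `H2SeekableInputStream` before and after this change; they are not part of parquet-mr.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseDelegationSketch {

  // Hypothetical stand-in for the underlying FSDataInputStream:
  // it only records whether close() was ever called on it.
  static class TrackedStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackedStream(byte[] data) {
      super(data);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // Mirrors the wrapper before the patch: close() is inherited from
  // InputStream, whose default implementation is a no-op.
  static class LeakyWrapper extends InputStream {
    protected final InputStream stream;

    LeakyWrapper(InputStream stream) {
      this.stream = stream;
    }

    @Override
    public int read() throws IOException {
      return stream.read();
    }
  }

  // Mirrors the wrapper after the patch: close() is forwarded to the delegate.
  static class ClosingWrapper extends LeakyWrapper {
    ClosingWrapper(InputStream stream) {
      super(stream);
    }

    @Override
    public void close() throws IOException {
      stream.close();
    }
  }

  public static void main(String[] args) throws IOException {
    TrackedStream leaked = new TrackedStream(new byte[] {1, 2, 3});
    new LeakyWrapper(leaked).close();
    System.out.println("without override, underlying closed: " + leaked.closed); // false

    TrackedStream released = new TrackedStream(new byte[] {1, 2, 3});
    new ClosingWrapper(released).close();
    System.out.println("with override, underlying closed: " + released.closed); // true
  }
}
```

In the sketch the only cost of the missing override is a flag that stays `false`; in the scenario described above, the unclosed delegate is an open HDFS connection.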