Github user keith-turner commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/134#discussion_r72681960
--- Diff:
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/BoundedRangeFileInputStream.java
---
@@ -62,12 +62,15 @@ public BoundedRangeFileInputStream(FSDataInputStream
in, long offset, long lengt
@Override
public int available() throws IOException {
- int avail = in.available();
- if (pos + avail > end) {
- avail = (int) (end - pos);
- }
+ final FSDataInputStream inLocal = in;
+ synchronized (inLocal) {
+ int avail = inLocal.available();
--- End diff --
> I am considering this, but others had commented to me that adhering to
the API as close as possible was a good thing.
Thats a hard call to make, I don't know what the right answer is. We can
make BoundedRangeInputStream work with the code that currently calls it. To
make it work with code that might call it in the future, need to follow the
interface javadoc. For example the way hadoop uses it could change in the
future.
Calling `in.seek()` then `in.available()` seems like the correct thing to
do. However, based on what @phrocker said, `in.seek()` call may have a
performance penalty, with no real benefit. I don't really like that. One
thing I am thinking is that always returning `end-pos` seems more correct than
only calling `in.available()` w/o calling `in.seek()`.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---