ctubbsii commented on a change in pull request #300: ACCUMULO-4708 Limit RFile block size to 2GB
URL: https://github.com/apache/accumulo/pull/300#discussion_r140379315

##########
File path: core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/BCFile.java
##########
@@ -287,8 +287,13 @@ public void finish() throws IOException {
        */
       public long getRawSize() throws IOException {
         /**
-         * Expecting the size() of a block not exceeding 4GB. Assuming the size() will wrap to negative integer if it exceeds 2GB.
+         * size() comes from DataOutputStream, which returns Integer.MAX_VALUE on an overflow, which means we do not know if 2GB or more data has been written.
+         * Because the data is of an unknown length, we cannot know the block size. To avoid corrupt RFiles, we throw an exception. This should be addressed by
+         * whatever object is putting data into the stream to ensure this condition is never reached.
          */
+        if (size() == Integer.MAX_VALUE) {

Review comment:
Just to be safe, we could also use `>=` here in case the implementation changes. We know the value should never be larger than max int, but if the implementation starts returning some larger value (for whatever reason), we don't want to let that through either.
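
To illustrate the suggestion in the review comment above, here is a minimal sketch of the defensive `>=` check. It is not the actual BCFile code: the class `RawSizeGuard` and the helper `checkedRawSize` are hypothetical names, and the `long` parameter is an assumption that stands in for a future `size()` implementation that could report values above `Integer.MAX_VALUE`. With today's `int` return type, `==` and `>=` behave the same.

```java
import java.io.IOException;

// Hypothetical illustration only; not part of Accumulo.
class RawSizeGuard {

  // reportedSize stands in for the value returned by size(). It is a long
  // here (an assumption) so that values above Integer.MAX_VALUE can be shown.
  static long checkedRawSize(long reportedSize) throws IOException {
    // DataOutputStream.size() saturates at Integer.MAX_VALUE on overflow,
    // so that value means "2GB or more, actual length unknown".
    // Using >= also rejects any larger value a changed implementation might return.
    if (reportedSize >= Integer.MAX_VALUE) {
      throw new IOException("Block size is 2GB or more; actual length is unknown");
    }
    return reportedSize;
  }

  public static void main(String[] args) {
    long[] samples = {1024L, Integer.MAX_VALUE, Integer.MAX_VALUE + 1L};
    for (long sample : samples) {
      try {
        System.out.println("accepted: " + checkedRawSize(sample));
      } catch (IOException e) {
        System.out.println("rejected " + sample + ": " + e.getMessage());
      }
    }
  }
}
```

An `==` check would accept the third sample, which is the case the comment is guarding against.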