[ https://issues.apache.org/jira/browse/IO-429?focusedWorklogId=596327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-596327 ]
ASF GitHub Bot logged work on IO-429:
-------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/May/21 19:32
            Start Date: 13/May/21 19:32
    Worklog Time Spent: 10m
      Work Description:
leskin-in commented on a change in pull request #175:
URL: https://github.com/apache/commons-io/pull/175#discussion_r632053610

##########
File path: src/main/java/org/apache/commons/io/IOUtils.java
##########
@@ -2243,10 +2243,13 @@ public static BufferedReader toBufferedReader(final Reader reader, final int siz
      * @param input the <code>InputStream</code> to read from
      * @return the requested byte array
      * @throws IOException if an I/O error occurs
+     * @throws IllegalArgumentException if input is longer than the maximum Java array length
      */
     public static byte[] toByteArray(final InputStream input) throws IOException {
         try (final ByteArrayOutputStream output = new ByteArrayOutputStream()) {
-            copy(input, output);
+            if (copy(input, output) == -1) {
+                throw new IllegalArgumentException("Stream cannot be longer than Integer max value bytes");

Review comment:
That is correct if the `InputStream` is a `ByteArrayInputStream`. However, this method accepts a generic `InputStream`, which may wrap more data. In that case, the implementation of [`copy()`](https://github.com/apache/commons-io/blob/4dc7b2462ef0b6345828a13d358e34bfc9309ce2/src/main/java/org/apache/commons/io/IOUtils.java#L842-L869) would return `-1`. The `ByteArrayOutputStream` created in this method as an intermediate buffer, however, does not check for overflow of its *underlying* buffer in [`write()`](https://github.com/apache/commons-io/blob/b2165b7b8888be8500768b6e27e090f89a621510/src/main/java/org/apache/commons/io/output/ByteArrayOutputStream.java#L54-L68).
The checks in `write()` only validate the arguments passed to it; they are valid in the case of [`copyLarge()`](https://github.com/apache/commons-io/blob/4dc7b2462ef0b6345828a13d358e34bfc9309ce2/src/main/java/org/apache/commons/io/IOUtils.java#L1148-L1174) (ultimately called by the `copy()` mentioned above). The `ByteArrayOutputStream`, however, can store more than `Integer.MAX_VALUE` bytes because it [can use multiple underlying byte arrays](https://github.com/apache/commons-io/blob/401d17349e7ec52d8fa866c35efd24103f332c29/src/main/java/org/apache/commons/io/output/AbstractByteArrayOutputStream.java#L79-L109).

Issue Time Tracking
-------------------

    Worklog Id:     (was: 596327)
    Time Spent: 1h 20m  (was: 1h 10m)

> ByteArrayOutputStream can overflow
> ----------------------------------
>
>                 Key: IO-429
>                 URL: https://issues.apache.org/jira/browse/IO-429
>             Project: Commons IO
>          Issue Type: Bug
>          Components: Utilities
>            Reporter: Fabian Lange
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> There are many places involved in the problem, and a good fix might be
> problematic performance-wise.
> For example:
> IOUtils.toByteArray(InputStream input) invoked with a stream which feeds more
> than Integer.MAX_VALUE bytes will either crash with
> NegativeArraySizeException or, maybe worse, overflow in such a way that it
> returns fine (but only with partial data).
> The ByteArrayOutputStream will happily consume the full stream, but "int
> count" will overflow. At some point toByteArray is then invoked, which will do
> something like new byte[count].
> Maybe "needNewBuffer" can throw the IllegalArgumentException, as it gets the
> count and could check for the overflow.
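
As a rough illustration of the counter overflow described in the issue, here is a minimal Java sketch. It does not use Commons IO or real streams: `CountOverflowSketch`, `uncheckedCount`, and `checkedCount` are hypothetical names that only model the arithmetic of an `int count` incremented once per written chunk. The guarded variant shows one possible way to fail early and is not necessarily the fix adopted in the pull request.

```java
// Sketch only: models the byte counter with plain arithmetic, no real streams or buffers.
final class CountOverflowSketch {

    // Hypothetical helper: sums chunk sizes into an int, the way an unchecked "int count" would.
    static int uncheckedCount(long totalBytes, int chunkSize) {
        int count = 0;
        for (long written = 0; written < totalBytes; written += chunkSize) {
            // The last chunk may be shorter; the addition silently wraps past Integer.MAX_VALUE.
            count += (int) Math.min(chunkSize, totalBytes - written);
        }
        return count;
    }

    // Hypothetical guard: tracks the total in a long and rejects sizes a single byte[] cannot hold.
    static int checkedCount(long totalBytes, int chunkSize) {
        long count = 0;
        for (long written = 0; written < totalBytes; written += chunkSize) {
            count += Math.min(chunkSize, totalBytes - written);
            if (count > Integer.MAX_VALUE) {
                throw new IllegalArgumentException("Stream cannot be longer than Integer max value bytes");
            }
        }
        return (int) count;
    }

    public static void main(String[] args) {
        long justOver = Integer.MAX_VALUE + 1L; // 2^31 bytes
        long wrapsAround = (1L << 32) + 5;      // 2^32 + 5 bytes

        // Negative count: "new byte[count]" would throw NegativeArraySizeException.
        System.out.println(uncheckedCount(justOver, 8192));    // prints -2147483648

        // Small positive count: "new byte[count]" succeeds but holds only partial data.
        System.out.println(uncheckedCount(wrapsAround, 8192)); // prints 5

        try {
            checkedCount(justOver, 8192);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second case is the one the issue calls "maybe worse": the wrapped counter looks like a valid length, so nothing fails and the caller receives truncated data.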