[ https://issues.apache.org/jira/browse/SPARK-26068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687497#comment-16687497 ]
Apache Spark commented on SPARK-26068:
--------------------------------------

User 'linhong-intel' has created a pull request for this issue:
https://github.com/apache/spark/pull/23040

> ChunkedByteBufferInputStream is truncated by empty chunk
> --------------------------------------------------------
>
>                 Key: SPARK-26068
>                 URL: https://issues.apache.org/jira/browse/SPARK-26068
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Liu, Linhong
>            Priority: Major
>
> If a ChunkedByteBuffer contains an empty chunk in the middle, the
> ChunkedByteBufferInputStream is truncated: no data after the empty
> chunk is ever read.
> The problematic code:
> {code:java}
> // ChunkedByteBuffer.scala
> // If chunks.next() returns an empty chunk, we fall into the else branch
> // whether or not chunks.hasNext is still true, so the remaining data is lost.
> override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
>   if (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
>     currentChunk = chunks.next()
>   }
>   if (currentChunk != null && currentChunk.hasRemaining) {
>     val amountToGet = math.min(currentChunk.remaining(), length)
>     currentChunk.get(dest, offset, amountToGet)
>     amountToGet
>   } else {
>     close()
>     -1
>   }
> }
> {code}
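
The truncation described in this issue can be reproduced and avoided outside Spark. The following is a minimal sketch of one possible approach, not necessarily the change made in the pull request: the single `if` that advances to the next chunk becomes a `while` loop, so any run of empty chunks is skipped before reading. `SkippingChunkInputStream` and `advance` are hypothetical names used only for illustration; the class is a simplified stand-in for ChunkedByteBufferInputStream and takes a plain `Seq[ByteBuffer]`.

```scala
import java.io.InputStream
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for ChunkedByteBufferInputStream.
// It is not Spark's class; it only illustrates skipping empty chunks.
class SkippingChunkInputStream(chunkSeq: Seq[ByteBuffer]) extends InputStream {
  private var chunks: Iterator[ByteBuffer] = chunkSeq.iterator
  private var currentChunk: ByteBuffer =
    if (chunks.hasNext) chunks.next() else null

  // Move past every exhausted or empty chunk, not just the first one.
  private def advance(): Unit = {
    while (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
      currentChunk = chunks.next()
    }
  }

  override def read(): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      currentChunk.get() & 0xFF // return the byte as an unsigned value
    } else {
      close()
      -1
    }
  }

  override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      val amountToGet = math.min(currentChunk.remaining(), length)
      currentChunk.get(dest, offset, amountToGet)
      amountToGet
    } else {
      close()
      -1
    }
  }

  override def close(): Unit = {
    currentChunk = null
    chunks = Iterator.empty
  }
}
```

With chunks holding the bytes `[1, 2]`, an empty buffer, and `[3]`, reading this stream to the end yields all three bytes, whereas the `if`-based version quoted in the issue stops after the second byte. An alternative with the same effect would be to filter out empty chunks once, when the iterator is created.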