Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17295#discussion_r107007132

    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
    @@ -56,6 +57,43 @@ private[spark] class BlockResult(
         val bytes: Long)

     /**
    + * Abstracts away how blocks are stored and provides different ways to read the underlying block
    + * data. The data for a BlockData instance can only be read once, since it may be backed by open
    + * file descriptors that change state as data is read.
    + */
    +private[spark] trait BlockData {
    +
    +  def toInputStream(): InputStream
    +
    +  def toManagedBuffer(): ManagedBuffer
    +
    +  def toByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer
    +
    +  def size: Long
    +
    +  def dispose(): Unit
    +
    +}
    +
    +private[spark] class ByteBufferBlockData(
    +    val buffer: ChunkedByteBuffer,
    +    autoDispose: Boolean = true) extends BlockData {
    +
    +  override def toInputStream(): InputStream = buffer.toInputStream(dispose = autoDispose)
    +
    +  override def toManagedBuffer(): ManagedBuffer = new NettyManagedBuffer(buffer.toNetty)
    +
    +  override def toByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer = {
    +    buffer.copy(allocator)
    +  }
    --- End diff --

    So I had traced through that stuff 2 or 3 times, and now I did it again and I think I finally understand everything that's going on. Basically, the old code was really bad at explicitly disposing of the buffers: a number of paths (like the ones that used managed buffers) didn't bother to do it and just left the work to the GC. I changed the code a bit to make the dispose more explicit and added comments in a few key places.
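To illustrate the read-once/explicit-dispose contract the trait describes, here is a minimal, self-contained sketch. `SimpleBlockData` and `SimpleByteBufferBlockData` are hypothetical stand-ins for the real Spark types: the actual trait also exposes `ManagedBuffer` and `ChunkedByteBuffer` views (Spark-internal types omitted here), and the real dispose path frees direct/off-heap buffers rather than just flipping a flag.

```scala
import java.io.InputStream
import java.nio.ByteBuffer

// Hypothetical, simplified version of the BlockData contract: the data may
// only be read once, and dispose() releases the underlying resources.
trait SimpleBlockData {
  def toInputStream(): InputStream
  def size: Long
  def dispose(): Unit
}

// Mirrors ByteBufferBlockData's autoDispose flag: when true, fully draining
// the returned stream disposes the backing buffer, so callers that forget to
// clean up are still covered on the common read path.
class SimpleByteBufferBlockData(buffer: ByteBuffer, autoDispose: Boolean = true)
    extends SimpleBlockData {

  private var disposed = false

  override def toInputStream(): InputStream = new InputStream {
    override def read(): Int =
      if (buffer.hasRemaining) {
        buffer.get() & 0xff
      } else {
        if (autoDispose) dispose()
        -1
      }
  }

  override def size: Long = buffer.remaining()

  override def dispose(): Unit = {
    // The real implementation frees direct buffers here; we just record state.
    disposed = true
  }

  def isDisposed: Boolean = disposed
}

object BlockDataDemo {
  def main(args: Array[String]): Unit = {
    val data = new SimpleByteBufferBlockData(
      ByteBuffer.wrap("spark".getBytes("UTF-8")))
    val in = data.toInputStream()
    val bytes = Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
    println(new String(bytes, "UTF-8"))   // spark
    println(data.isDisposed)              // true: draining the stream auto-disposed
  }
}
```

The point of the sketch is the comment's last paragraph: without an explicit `dispose()` (or an auto-disposing read path), a buffer-backed block is left for the GC to reclaim, which for off-heap buffers can delay freeing native memory.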