Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17295#discussion_r107007132
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala 
---
    @@ -56,6 +57,43 @@ private[spark] class BlockResult(
         val bytes: Long)
     
     /**
    + * Abstracts away how blocks are stored and provides different ways to 
read the underlying block
    + * data. The data for a BlockData instance can only be read once, since it 
may be backed by open
    + * file descriptors that change state as data is read.
    + */
    +private[spark] trait BlockData {
    +
    +  def toInputStream(): InputStream
    +
    +  def toManagedBuffer(): ManagedBuffer
    +
    +  def toByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer
    +
    +  def size: Long
    +
    +  def dispose(): Unit
    +
    +}
    +
    +private[spark] class ByteBufferBlockData(
    +    val buffer: ChunkedByteBuffer,
    +    autoDispose: Boolean = true) extends BlockData {
    +
    +  override def toInputStream(): InputStream = buffer.toInputStream(dispose 
= autoDispose)
    +
    +  override def toManagedBuffer(): ManagedBuffer = new 
NettyManagedBuffer(buffer.toNetty)
    +
    +  override def toByteBuffer(allocator: Int => ByteBuffer): 
ChunkedByteBuffer = {
    +    buffer.copy(allocator)
    +  }
    --- End diff --
    
    So I had traced through that stuff 2 or 3 times, and now I did it again and 
I think I finally understood all that's going on. Basically, the old code was 
really bad at explicitly disposing of the buffers, meaning a bunch of paths 
(like the ones that used managed buffers) didn't bother to do it and just left 
the work to the GC.
    
    I changed the code a bit to make the dispose more explicit and added 
comments in a few key places.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to