GitHub user mridulm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1678#discussion_r15683205
  
    --- Diff: core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
    @@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter(
     
       override def isOpen: Boolean = objOut != null
     
    -  override def commit(): Long = {
    +  override def commitAndClose(): Unit = {
         if (initialized) {
           // NOTE: Because Kryo doesn't flush the underlying stream we explicitly flush both the
           //       serializer stream and the lower level stream.
           objOut.flush()
           bs.flush()
    -      val prevPos = lastValidPosition
    -      lastValidPosition = channel.position()
    -      lastValidPosition - prevPos
    -    } else {
    -      // lastValidPosition is zero if stream is uninitialized
    -      lastValidPosition
    +      close()
         }
    +    finalPosition = file.length()
       }
     
    -  override def revertPartialWrites() {
    -    if (initialized) {
    -      // Discard current writes. We do this by flushing the outstanding writes and
    -      // truncate the file to the last valid position.
    -      objOut.flush()
    -      bs.flush()
    -      channel.truncate(lastValidPosition)
    +  // Discard current writes. We do this by flushing the outstanding writes and then
    +  // truncating the file to its initial position.
    +  override def revertPartialWritesAndClose() {
    +    try {
    +      if (initialized) {
    +        objOut.flush()
    +        bs.flush()
    +        close()
    +      }
    +
    +      val truncateStream = new FileOutputStream(file, true)
    +      try {
    +        truncateStream.getChannel.truncate(initialPosition)
    +      } finally {
    +        truncateStream.close()
    +      }
    +    } catch {
    +      case e: Exception =>
    +        logError("Uncaught exception while reverting partial writes to file " + file, e)
    --- End diff --
    
    I meant the former case: close() on one writer fails with an exception after the earlier writers' streams closed successfully.
    We then have some writers whose committed data is not removed by the subsequent revert, while the others are reverted.
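To make the scenario concrete, here is a minimal sketch using a toy writer. `ToyWriter`, its `failOnClose` flag, and `PartialCommitDemo` are hypothetical names invented for illustration; this is not Spark's `DiskBlockObjectWriter`. The sketch also assumes, per the comment above, that a revert leaves already-committed data in place.

```scala
import java.io.{File, FileOutputStream, IOException}

// Hypothetical stand-in for a disk-backed writer; NOT Spark's DiskBlockObjectWriter.
final class ToyWriter(val file: File, failOnClose: Boolean = false) {
  private val initialPosition: Long = file.length()
  private val out = new FileOutputStream(file, true) // append mode
  private var committed = false
  private var closed = false

  def write(bytes: Array[Byte]): Unit = out.write(bytes)

  def commitAndClose(): Unit = {
    out.flush()
    out.close()
    closed = true
    // Simulate close() failing on this writer only.
    if (failOnClose) throw new IOException(s"simulated close failure on $file")
    committed = true
  }

  def revertPartialWritesAndClose(): Unit = {
    // Assumption: a revert does not touch data that was already committed.
    if (!committed) {
      if (!closed) { out.flush(); out.close(); closed = true }
      val truncateStream = new FileOutputStream(file, true)
      try truncateStream.getChannel.truncate(initialPosition)
      finally truncateStream.close()
    }
  }
}

object PartialCommitDemo {
  // Returns the final length of each file after a failed batch commit.
  def run(): Seq[Long] = {
    val files = Seq.fill(3)(File.createTempFile("toy-writer", ".bin"))
    files.foreach(_.deleteOnExit())
    val writers = files.zipWithIndex.map { case (f, i) =>
      new ToyWriter(f, failOnClose = i == 1) // the second writer fails on close
    }
    writers.foreach(_.write(Array[Byte](1, 2, 3)))
    try writers.foreach(_.commitAndClose())
    catch { case _: IOException => writers.foreach(_.revertPartialWritesAndClose()) }
    files.map(_.length())
  }
}
```

In this toy model the first file keeps its three committed bytes while the second and third are truncated back to their initial length, which is the mixed committed/reverted state described above.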
    
    On the face of it, I agree that it should not cause issues. But the expectation from this class is never enforced, so a violation can fail silently.
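One possible way to enforce such an expectation is an explicit state check, sketched below. `GuardedWriter` and `checkOpen` are hypothetical names, and the rule being enforced (neither commit nor revert may run on a writer that is already closed) is an assumed reading of the expectation, not something this PR implements.

```scala
import scala.util.Try

// Hypothetical sketch: a writer that rejects commit/revert once it is closed.
final class GuardedWriter {
  private var closed = false

  // Fail loudly instead of silently when the usage contract is violated.
  private def checkOpen(op: String): Unit =
    if (closed) throw new IllegalStateException(s"$op called after the writer was closed")

  def commitAndClose(): Unit = {
    checkOpen("commitAndClose")
    // ... flush and close the underlying streams here ...
    closed = true
  }

  def revertPartialWritesAndClose(): Unit = {
    checkOpen("revertPartialWritesAndClose")
    // ... close the streams and truncate the file here ...
    closed = true
  }
}

object GuardedWriterDemo {
  // Returns true if reverting after a commit is rejected with IllegalStateException.
  def revertAfterCommitIsRejected(): Boolean = {
    val w = new GuardedWriter
    w.commitAndClose()
    Try(w.revertPartialWritesAndClose()).failed.toOption
      .exists(_.isInstanceOf[IllegalStateException])
  }
}
```

Whether throwing is the right reaction is a design choice; logging and ignoring the call would keep cleanup paths simpler but would bring back the silent failure.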


