[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572634#comment-17572634
 ] 

Dongjoon Hyun commented on SPARK-34788:
---

No, what I meant is ensuring enough disk at your underlying node level in HDFS 
DataNode in your case. For example, K8s has a nice isolate and guarantee 
(required and limit) in terms of the disk, memory, and CPU.

> On the write side, since Spark usually uses iterators to write data into 
>disk, it's not possible to know the size in advance.

> On the read side (the error in this ticket is actually raised when reading a 
> file), it can be overkill to me if we'd check if the disk is full each time 
> before reading a file.

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.Th

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread wenweijian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572536#comment-17572536
 ] 

wenweijian commented on SPARK-34788:


I can reproduce it. 

If the performance impact of the proposal (SyncFailedException) is little, I 
think it is worth fixing

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Meth

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572506#comment-17572506
 ] 

wuyi commented on SPARK-34788:
--

> Why don't you ensure enough disks from the beginning?

On the write side, since Spark usually uses iterators to write data into disk, 
it's not possible to know the size in advance.

On the read side (the error in this ticket is actually raised when reading a 
file), it can be overkill to me if we'd check if the disk is full each time 
before reading a file.

 

> This goal of this Jira issue seems to aim `cosmetic` change which has not 
>much benefit.

Yea, this won't fix the root cause but gives a better error msg for users.

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecu

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572493#comment-17572493
 ] 

Dongjoon Hyun commented on SPARK-34788:
---

Why don't you ensure enough disks from the beginning?

Basically, throwing `IOException` instead of `FileNotFoundException` doesn't 
remove the root cause at all.

This goal of this Jira issue seems to aim `cosmetic` change which has not much 
benefit.

If this is due to a JVM bug still, I'd recommend to add one or two sentence to 
Apache Spark documentation about it.

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /local

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-28 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572478#comment-17572478
 ] 

wuyi commented on SPARK-34788:
--

[~leo wen] are you able to reproduce in your environment? If yes, we could 
verify the proposal 
([http://weblog.janek.org/Archive/2004/12/20/ExceptionWhenWritingToAFu.html] ).

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data
>  (No such file or directory)
>   at j

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2022-07-27 Thread wenweijian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17572212#comment-17572212
 ] 

wenweijian commented on SPARK-34788:


I got this FileNotFoundException when the 
hdfs(/tmp/logs/root/bucket-logs-tfile) disk is full.

when I delele some file in that dir, the exception disappeared. 

 

logs:
{code:java}
org.apache.spark.shuffle.FetchFailedException: Error in reading 
FileSegmentManagedBuffer[file=/home/install/hadoop/data2/hadoop-3.3.0/nm-local-dir/usercache/root/appcache/application_1658221318180_0002/blockmgr-001c6d1a-bda3-4f8e-8998-42e3a8fbaaa9/0d/shuffle_0_7622_0.data,offset=57062119,length=26794]
    at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:770)
    at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:649)
    at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:70)
    at 
org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
    at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:200)
    at 
org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:128)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:106)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Error in reading 
FileSegmentManagedBuffer[file=/home/install/hadoop/data2/hadoop-3.3.0/nm-local-dir/usercache/root/appcache/application_1658221318180_0002/blockmgr-001c6d1a-bda3-4f8e-8998-42e3a8fbaaa9/0d/shuffle_0_7622_0.data,offset=57062119,length=26794]
    at 
org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:112)
    at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:637)
    ... 23 more
Caused by: java.io.FileNotFoundException: 
/home/install/hadoop/data2/hadoop-3.3.0/nm-local-dir/usercache/root/appcache/application_1658221318180_0002/blockmgr-001c6d1a-bda3-4f8e-8998-42e3a8fbaaa9/0d/shuffle_0_7622_0.data
 (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.(FileInputStream.java:138)
    at 
org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:101)
    ... 24 more {code}

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlo

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2021-03-18 Thread wuxiaofei (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304603#comment-17304603
 ] 

wuxiaofei commented on SPARK-34788:
---

Please assign it to me, I will try to work it out later.

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method)
>   at java.io.FileInputStream.open(FileInputStream.java:195)

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2021-03-18 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304581#comment-17304581
 ] 

wuyi commented on SPARK-34788:
--

[~dongjoon] I have pasted the error stack. It happened when we reading the 
shuffle data. The cause only says the file is not found, but we believe it's 
highly possible due to the disk full issue after investigation.

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 3.2.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users:
> {code:java}
> 9/03/26 09:03:45 ERROR ShuffleBlockFetcherIterator: Failed to create input 
> stream from local block
> java.io.IOException: Error in reading 
> FileSegmentManagedBuffer{file=/local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data,
>  offset=110254956, length=1875458}
>   at 
> org.apache.spark.network.buffer.FileSegmentManagedBuffer.createInputStream(FileSegmentManagedBuffer.java:111)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:442)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:64)
>   at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>   at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
>   at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown
>  Source)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:98)
>   at 
> org.apache.spark.sql.execution.aggregate.SortAggregateExec$$anonfun$doExecute$1$$anonfun$3.apply(SortAggregateExec.scala:95)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$12.apply(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:60)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:340)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:304)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
>   at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
>   at org.apache.spark.scheduler.Task.run(Task.scala:112)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /local_disk0/spark-c2f26f02-2572-4764-815a-cbba65ddb315/executor-b4b76a4c-788c-4cb6-b904-664a883be1aa/blockmgr-36804371-24fe-4131-a3dc-00b7f98f3a3e/11/shuffle_113_1029_0.data
> 

[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2021-03-18 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304511#comment-17304511
 ] 

Dongjoon Hyun commented on SPARK-34788:
---

Could you be more specific, [~Ngone51]? In case of disk full, many parts are 
related. Which code path do you see `FileNotFoundException` instead of 
`IOException`?

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 2.4.7, 3.0.0, 3.1.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users.
>  
> And there's probably a way to detect the disk full: when we get 
> `FileNotFoundException`, we try 
> [http://weblog.janek.org/Archive/2004/12/20/ExceptionWhenWritingToAFu.html] 
> to see if SyncFailedException throws. If SyncFailedException throws, then we 
> throw IOException with the disk full hint.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-34788) Spark throws FileNotFoundException instead of IOException when disk is full

2021-03-18 Thread wuyi (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304185#comment-17304185
 ] 

wuyi commented on SPARK-34788:
--

[~wuxiaofei] Could you take a look at this?

> Spark throws FileNotFoundException instead of IOException when disk is full
> ---
>
> Key: SPARK-34788
> URL: https://issues.apache.org/jira/browse/SPARK-34788
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.7, 3.0.0, 3.1.0
>Reporter: wuyi
>Priority: Major
>
> When the disk is full, Spark throws FileNotFoundException instead of 
> IOException with the hint. It's quite a confusing error to users.
>  
> And there's probably a way to detect the disk full: when we get 
> `FileNotFoundException`, we try 
> [http://weblog.janek.org/Archive/2004/12/20/ExceptionWhenWritingToAFu.html] 
> to see if SyncFailedException throws. If SyncFailedException throws, then we 
> throw IOException with the disk full hint.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org