[ https://issues.apache.org/jira/browse/SPARK-37710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eren Avsarogullari updated SPARK-37710:
---------------------------------------
    Description: 
*Input/output error* usually points to environmental issues such as disk 
read/write failures due to disk corruption, network access failures, etc. This 
PR aims to add a detailed error message that catches this kind of environmental 
case occurring on a problematic BlockManager and logs it with the *BlockManager 
hostname, blockId and blockPath* details.
The following stack trace occurred on disk corruption:
{code:java}
com.esotericsoftware.kryo.KryoException: java.io.IOException: Input/output error
Serialization trace:
buffers (org.apache.spark.sql.execution.columnar.DefaultCachedBatch)
    at com.esotericsoftware.kryo.io.Input.fill(Input.java:166)
    at com.esotericsoftware.kryo.io.Input.require(Input.java:196)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:346)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:326)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:55)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:38)
    at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:789)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:381)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
    at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:789)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:816)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:296)
    at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:168)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
    at org.apache.spark.storage.BlockManager.maybeCacheDiskValuesInMemory(BlockManager.scala:1569)
    at org.apache.spark.storage.BlockManager.getLocalValues(BlockManager.scala:877)
    at org.apache.spark.storage.BlockManager.get(BlockManager.scala:1163)
...
Caused by: java.io.IOException: Input/output error
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at net.jpountz.lz4.LZ4BlockInputStream.tryReadFully(LZ4BlockInputStream.java:269)
    at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:280)
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:243)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:157)
    at com.esotericsoftware.kryo.io.Input.fill(Input.java:164)
    ... 87 more {code}
*Proposed Error Message:*
{code:java}
java.io.IOException: Input/output error. BlockManagerId(driver, localhost, 49455, None) - blockId: test_my-block-id - blockDiskPath: /private/var/folders/kj/mccyycwn6mjdwnglw9g3k6pm0000gq/T/blockmgr-12dba181-771e-4ff9-a2bc-fa3ce6dbabfa/11/test_my-block-id
 {code}
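The idea behind the proposed message can be sketched roughly as follows. This is only an illustration in plain Java, not the actual Spark change: the class {{BlockIoErrors}} and its methods {{findIoCause}} and {{withBlockContext}} are hypothetical names, and the BlockManager id, block id and disk path are passed in as plain strings. It assumes the I/O failure arrives wrapped in another exception (as the KryoException above wraps the java.io.IOException) and rebuilds the message in the proposed format while keeping the original exception as the cause.

```java
import java.io.IOException;

/** Hypothetical helper for illustration only; not part of Spark's API. */
final class BlockIoErrors {
    private BlockIoErrors() {}

    /** Walks the cause chain and returns the first IOException found, or null. */
    static IOException findIoCause(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof IOException) {
                return (IOException) cur;
            }
        }
        return null;
    }

    /**
     * Returns a new IOException whose message appends the block context to the
     * original message. The original exception is kept as the cause so its
     * stack trace is not lost.
     */
    static IOException withBlockContext(IOException e, String blockManagerId,
                                        String blockId, String blockDiskPath) {
        String message = String.format("%s. %s - blockId: %s - blockDiskPath: %s",
                e.getMessage(), blockManagerId, blockId, blockDiskPath);
        return new IOException(message, e);
    }
}
```

A caller that catches the wrapping exception could look up the underlying IOException with {{findIoCause}} and, if one is found, rethrow or log the result of {{withBlockContext}} so that the failing host and block file are visible in the log.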

  was:
*Input/output error* usually points to environmental issues such as disk 
read/write failures due to disk corruption, network access failures, etc. This 
PR aims to add a clear message to catch this kind of environmental case 
occurring on BlockManager and log it with the {*}BlockManager hostname, blockId and 
blockPath{*}.
The following stack trace occurred on disk corruption:
{code:java}
com.esotericsoftware.kryo.KryoException: java.io.IOException: Input/output error
Serialization trace:
buffers (org.apache.spark.sql.execution.columnar.DefaultCachedBatch)
    at com.esotericsoftware.kryo.io.Input.fill(Input.java:166)
    at com.esotericsoftware.kryo.io.Input.require(Input.java:196)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:346)
    at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:326)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:55)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:38)
    at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:789)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:381)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
    at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:789)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:816)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:296)
    at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:168)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
    at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
    at org.apache.spark.storage.BlockManager.maybeCacheDiskValuesInMemory(BlockManager.scala:1569)
    at org.apache.spark.storage.BlockManager.getLocalValues(BlockManager.scala:877)
    at org.apache.spark.storage.BlockManager.get(BlockManager.scala:1163)
...
Caused by: java.io.IOException: Input/output error
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at net.jpountz.lz4.LZ4BlockInputStream.tryReadFully(LZ4BlockInputStream.java:269)
    at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:280)
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:243)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:157)
    at com.esotericsoftware.kryo.io.Input.fill(Input.java:164)
    ... 87 more {code}
*Proposed Error Message:*
{code:java}
java.io.IOException: Input/output error usually occurs due to environmental problems (e.g: disk corruption, network failure etc) so please check env status if healthy. BlockManagerId(driver, localhost, 54937, None) - blockName: test_my-block-id - blockDiskPath: /private/var/folders/kj/mccyycwn6mjdwnglw9g3k6pm0000gq/T/blockmgr-e86d8f67-a993-407f-ad3b-3cfb667b4ad4/11/test_my-block-id
{code}


> Add detailed error message for java.io.IOException occurring on Kryo flow
> -------------------------------------------------------------------------
>
>                 Key: SPARK-37710
>                 URL: https://issues.apache.org/jira/browse/SPARK-37710
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.1.1
>            Reporter: Eren Avsarogullari
>            Priority: Major



