Hi,
I encountered a strange issue. I run spark-shell in client mode on
Kubernetes and load a dataset as follows:
val data=spark.read.parquet("datapath")

When I run "data.show", it sometimes throws exceptions. The log and stack trace look like this:

DEBUG BlockManagerMasterEndpoint: Updating block info on master taskresult_3 form BlockManagerId(2, 192.168.167.22, 7079, None)
INFO BlockManagerInfo: Added taskresult_3 in memory on 192.168.167.22, 7079 (size: 173 KiB, free: 12 GB)
DEBUG TaskResultGetter: Fetching indirect task result for task 0.2 in stage 1.0 (TID3)
DEBUG BlockManager: Getting remote block taskresult_3
DEBUG BlockManager: Getting remote block taskresult_3 from BlockManagerId(2, 192.168.167.22, 7079, None)
INFO TransportClientFactory: Found inactive connection to /192.168.167.22:7079, creating a new one
DEBUG TransportClientFactory: Creating new connection to /192.168.167.22:7079
DEBUG TransportClientFactory: Connection to /192.168.167.22:7079 successful, running bootstraps..
INFO TransportClientFactory: Successfully created connection to /192.168.167.22:7079 after 1 ms (0 ms spent in bootstraps)
ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from <unknown remote> is closed
ERROR OneForOneBlockFetcher: Failed while starting block fetches
java.io.IOException: Connection from <unknown remote> closed
at org.apache.spark.network.client.TransportResponseHandler.channelInactive(TransportResponseHandler.java:147)
at org.apache.spark.network.client.TransportChannelHandler.channelInactive(TransportChannelHandler.java:147)
......
ERROR TransportClient: Failed to send RPC RPC 223311111333 to <unknown remote>: io.netty.channel.StacklessClosedChannelException
......

The exceptions seem to depend on the data: on the same dataset, running
"data.show" raises the exceptions, but running "data.show(2)" does not.
Does it depend on whether the task result has to be fetched indirectly?
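
To make the hypothesis concrete, here is a minimal sketch of how I
understand the choice between the two result paths. It is a simplified
model, not Spark's actual code: the names ResultPath, choose and
maxDirectBytes are made up for illustration, and I am assuming the limit
corresponds to the effective spark.task.maxDirectResultSize in my cluster.
The 100 KiB limit and the 8 KiB result are invented numbers; only the
173 KiB size comes from the log above.

```scala
// Simplified model of how a task result reaches the driver.
// Assumption: a result larger than the direct-result limit is stored as a
// taskresult_<tid> block and fetched by the driver; a smaller one is sent inline.
object ResultPath {
  sealed trait Path
  case object Direct extends Path
  case object Indirect extends Path

  // maxDirectBytes is a hypothetical stand-in for the effective limit in the cluster.
  def choose(resultBytes: Long, maxDirectBytes: Long): Path =
    if (resultBytes > maxDirectBytes) Indirect else Direct
}

// Illustrative limit only; the real value depends on the configuration.
val limit = 100L * 1024

println(ResultPath.choose(173L * 1024, limit)) // size of taskresult_3 in the log
println(ResultPath.choose(8L * 1024, limit))   // a much smaller, made-up result
```

If this model is right, "data.show(2)" would produce a result small enough
to be sent inline, so the block fetch that fails in the log would never be
attempted.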

Also, under what conditions do the exceptions report <unknown remote>
instead of a specific IP and port? That seems very strange.

Does anybody know how to fix this, and why the remote is reported as
<unknown remote>?

Thanks.
