Re: Spark SQL met "Block broadcast_xxx not found"

2019-05-07 Thread Jacek Laskowski
Hi,

I'm curious about "I found the bug code". Can you point me at it? Thanks.

Best regards,
Jacek Laskowski

https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski


On Tue, May 7, 2019 at 9:34 AM Xilang Yan  wrote:

> OK... I am sure it is a bug in Spark. I found the buggy code, but that code
> was removed in 2.2.3, so I just upgraded Spark to fix the problem.
>
>
>


Re: Spark SQL met "Block broadcast_xxx not found"

2019-05-07 Thread Xilang Yan
OK... I am sure it is a bug in Spark. I found the buggy code, but that code
was removed in 2.2.3, so I just upgraded Spark to fix the problem.
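For anyone else hitting this, the upgrade itself is a one-line change in the build definition. This assumes an sbt build; adjust the artifact list and scope to whatever modules the project already depends on:

```scala
// build.sbt -- bump the Spark version to pick up the fix
// (artifact names and "provided" scope are assumptions; match your project)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.3" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.2.3" % "provided"
)
```

With a Maven build, the equivalent is bumping `<spark.version>` to 2.2.3 in the POM properties.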






Spark SQL met "Block broadcast_xxx not found"

2019-04-27 Thread Xilang Yan
We hit a broadcast issue in some of our applications, though not on every
run; it usually goes away when we rerun the application. In the exception
logs, I see the two types of exception below:

Exception 1:
10:09:20.295 [shuffle-server-6-2] ERROR org.apache.spark.network.server.TransportRequestHandler - Error opening block StreamChunkId{streamId=365584526097, chunkIndex=0} for request from /10.33.46.33:19866
org.apache.spark.storage.BlockNotFoundException: Block broadcast_334_piece0 not found
    at org.apache.spark.storage.BlockManager.getBlockData(BlockManager.scala:361) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$1.apply(NettyBlockRpcServer.scala:61) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$1.apply(NettyBlockRpcServer.scala:60) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:363) ~[scala-library-2.11.0.jar:?]
    at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:31) ~[scala-library-2.11.0.jar:?]
    at org.apache.spark.network.server.OneForOneStreamManager.getChunk(OneForOneStreamManager.java:87) ~[spark-network-common_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.network.server.TransportRequestHandler.processFetchRequest(TransportRequestHandler.java:125) [spark-network-common_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:103) [spark-network-common_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:118) [spark-network-common_2.11-2.2.1.jar:2.2.1]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-all-4.0.23.Final.jar:4.0.23.Final]


Exception 2:
10:14:37.906 [Executor task launch worker for task 430478] ERROR org.apache.spark.util.Utils - Exception encountered
org.apache.spark.SparkException: Failed to get broadcast_696_piece0 of broadcast_696
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:178) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at scala.collection.immutable.List.foreach(List.scala:383) ~[scala-library-2.11.0.jar:?]
    at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:150) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:222) ~[spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303) [spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206) [spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66) [spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66) [spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96) [spark-core_2.11-2.2.1.jar:2.2.1]
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70) [spark-core_2.11-2.2.1.jar:2.2.1]


I think exception 2 is caused by exception 1, so the issue is this: when
executor A tries to fetch a broadcast block from executor B, executor B
cannot find the block locally. That is strange, because broadcast blocks are
stored in both memory and disk, and they should only be removed when the
driver asks for it; the driver, in turn, removes a broadcast only once the
broadcast variable is no longer used.
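To make the lifecycle above concrete, here is a minimal sketch of how a broadcast behaves on the executor side. It assumes a running Spark 2.2.x session (so it is not standalone-runnable) and the variable names are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object BroadcastLifecycleSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("broadcast-sketch").getOrCreate()
    val sc = spark.sparkContext

    val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))

    // Tasks fetch broadcast pieces lazily on first access, from the driver
    // or from other executors that already hold them. A piece missing on
    // the serving executor surfaces as "Block broadcast_xxx_piece0 not
    // found" on that side (exception 1) and, if no other source has the
    // piece either, as "Failed to get broadcast_xxx" on the fetching side
    // (exception 2).
    val resolved = sc.parallelize(Seq(1, 2))
      .map(k => lookup.value.getOrElse(k, "?"))
      .collect()

    // unpersist() only drops the cached copies on executors; they can be
    // re-fetched on the next access. destroy() removes all state for the
    // broadcast, so any task that touches it afterwards fails.
    lookup.unpersist(blocking = true)
    lookup.destroy()

    spark.stop()
  }
}
```

So under normal operation a missing executor-side copy is recoverable; the fetch only fails hard once the driver has fully removed the broadcast.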

Could anyone give some clues on how to find the root cause of this issue?
Thanks a lot!
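One place worth checking while debugging is the ContextCleaner configuration, since it is the component that removes broadcast blocks once the driver-side reference is garbage-collected. A fragment like the following (values shown are the Spark 2.x defaults; verify them against your version's configuration docs) controls that behaviour:

```
# spark-defaults.conf -- settings governing when broadcast blocks are cleaned
spark.cleaner.referenceTracking            true
spark.cleaner.referenceTracking.blocking   true
spark.cleaner.periodicGC.interval          30min
```

Turning on DEBUG logging for org.apache.spark.ContextCleaner can also show exactly when broadcast_334 / broadcast_696 were cleaned relative to the failing fetches.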






--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org