I agree, it was a mistake.
I just posted the update so that the next person looking into torrent
broadcast issues will have a hint :)

Thank you.
Daniel

On Sun, Jun 19, 2016 at 5:26 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> I think good practice is not to hold on to SparkContext in mapFunction.
>
> On Sun, Jun 19, 2016 at 7:10 AM, Takeshi Yamamuro <linguin....@gmail.com>
> wrote:
>
>> How about using `transient` annotations?
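
A minimal sketch of what `@transient` does, in case it helps the next reader. It uses plain Java serialization rather than Spark, and `FakeContext`, `Holder`, `TransientHolder` and `TransientSketch` are hypothetical names standing in for SparkContext and whatever class holds it:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for SparkContext: deliberately NOT Serializable.
class FakeContext

// Holds the context as an ordinary field, so Java serialization tries to write it.
class Holder(val ctx: FakeContext) extends Serializable {
  def mapFunction(x: Int): Int = x + 1
}

// Same shape, but the annotated field is skipped during serialization.
class TransientHolder(@transient val ctx: FakeContext) extends Serializable {
  def mapFunction(x: Int): Int = x + 1
}

object TransientSketch {
  // Returns true if `obj` can be written with Java serialization.
  def canSerialize(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try { out.writeObject(obj); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    println(canSerialize(new Holder(new FakeContext)))          // false
    println(canSerialize(new TransientHolder(new FakeContext))) // true
  }
}
```

Note that a transient field comes back as null after deserialization, so it is only usable on the side that created it (the driver, in Spark terms).
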
>>
>> // maropu
>>
>> On Sun, Jun 19, 2016 at 10:51 PM, Daniel Haviv <
>> daniel.ha...@veracity-group.com> wrote:
>>
>>> Hi,
>>> Just updating on my findings for future reference.
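>>> The problem was that, after refactoring my code, I ended up with a Scala
>>> object which held a SparkContext as a member, e.g.:
>>> import org.apache.spark.SparkContext
>>>
>>> object A {
>>>   // SparkContext held as a member of the object (the problematic part)
>>>   val sc: SparkContext = new SparkContext()
>>>   // placeholder signature and body; the real function is not shown
>>>   def mapFunction(x: Int): Int = x
>>> }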
>>>
>>> and when I called rdd.map(A.mapFunction) it failed as A.sc is not
>>> serializable.
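
One way to restructure this, as a sketch only: keep the functions passed to `map` in an object that has no SparkContext member, and create the context in the driver's `main`. `MapFunctions`, `Driver` and the app name are hypothetical, and the master is assumed to be supplied by spark-submit:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Only plain functions live here; nothing in this object refers to a SparkContext.
object MapFunctions {
  def mapFunction(x: Int): Int = x * 2
}

object Driver {
  def main(args: Array[String]): Unit = {
    // The context is created and owned by the driver code only.
    val conf = new SparkConf().setAppName("broadcast-hint-sketch") // hypothetical name
    val sc = new SparkContext(conf)
    try {
      val doubled = sc.parallelize(1 to 5).map(MapFunctions.mapFunction).collect()
      println(doubled.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```
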
>>>
>>> Thanks,
>>> Daniel
>>>
>>> On Tue, Jun 7, 2016 at 10:13 AM, Takeshi Yamamuro <linguin....@gmail.com
>>> > wrote:
>>>
>>>> Hi,
>>>>
>>>> `HttpBroadcastFactory` has already been removed in master, so you will
>>>> not be able to use that broadcast mechanism in future releases.
>>>>
>>>> Anyway, I couldn't find a root cause from the stack traces alone...
>>>>
>>>> // maropu
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 6, 2016 at 2:14 AM, Daniel Haviv <
>>>> daniel.ha...@veracity-group.com> wrote:
>>>>
>>>>> Hi,
>>>>> I've set spark.broadcast.factory to
>>>>> org.apache.spark.broadcast.HttpBroadcastFactory and it indeed resolved my
>>>>> issue.
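
For reference, a sketch of how that setting could be applied programmatically on Spark 1.6, assuming the job builds its own SparkConf (the object name and app name are hypothetical). The key and class name are the ones quoted above; as noted elsewhere in this thread, `HttpBroadcastFactory` has been removed in master, so this is a 1.x-only workaround:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HttpBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("http-broadcast-sketch") // hypothetical app name
      // Spark 1.x only: use the HTTP broadcast instead of the torrent one.
      .set("spark.broadcast.factory", "org.apache.spark.broadcast.HttpBroadcastFactory")
    val sc = new SparkContext(conf)
    try {
      // Print the effective value to confirm the setting was picked up.
      println(sc.getConf.get("spark.broadcast.factory"))
    } finally {
      sc.stop()
    }
  }
}
```
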
>>>>>
>>>>> I'm creating a dataframe which creates a broadcast variable internally
>>>>> and then fails due to the torrent broadcast with the following stacktrace:
>>>>> Caused by: org.apache.spark.SparkException: Failed to get broadcast_3_piece0 of broadcast_3
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
>>>>>         at scala.Option.getOrElse(Option.scala:120)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
>>>>>         at scala.collection.immutable.List.foreach(List.scala:318)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
>>>>>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
>>>>>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1220)
>>>>>
>>>>> I'm using Spark 1.6.0 on CDH 5.7.
>>>>>
>>>>> Thanks,
>>>>> Daniel
>>>>>
>>>>>
>>>>> On Wed, Jun 1, 2016 at 5:52 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>
>>>>>> I found spark.broadcast.blockSize but no parameter to switch
>>>>>> broadcast method.
>>>>>>
>>>>>> Can you describe the issues with torrent broadcast in more detail?
>>>>>>
>>>>>> Which version of Spark are you using?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, Jun 1, 2016 at 7:48 AM, Daniel Haviv <
>>>>>> daniel.ha...@veracity-group.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> Our application is failing due to issues with the torrent broadcast.
>>>>>>> Is there a way to switch to another broadcast method?
>>>>>>>
>>>>>>> Thank you.
>>>>>>> Daniel
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ---
>>>> Takeshi Yamamuro
>>>>
>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>
