Hi,
Just wondering if anyone has any advice about this issue, as I am
experiencing the same thing. I'm working with multiple broadcast variables
in PySpark, most of which are small, but one is around 4.5 GB. I'm using 10
workers with 31 GB of memory each and a driver with the same spec. It's not
running out of memory.
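For context, a minimal sketch of the setup being described; the variable
names and contents are hypothetical, and the "large" object is shrunk so
the sketch actually runs:

    # Hypothetical repro sketch: several small broadcasts plus one large
    # one, all read on the executors via .value.
    from pyspark import SparkContext

    sc = SparkContext(appName="broadcast-repro")

    small_lookup = {"a": 1, "b": 2}                  # one of the small variables
    big_lookup = {i: str(i) for i in range(10**6)}   # stand-in for the ~4.5 GB one

    bc_small = sc.broadcast(small_lookup)
    bc_big = sc.broadcast(big_lookup)

    # Tasks access broadcast values through .value on the executors.
    counts = (sc.parallelize(range(1000))
                .map(lambda i: (bc_small.value["a"], bc_big.value.get(i)))
                .count())
    print(counts)
    sc.stop()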
There is an open PR [1] that adds support for broadcasting variables larger than 2 GB; could you try it?
[1] https://github.com/apache/spark/pull/2659
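Until that lands, a possible workaround (my sketch, not something from the
PR or this thread) is to shard the one large object across several
broadcasts so that each stays under the 2 GB limit:

    # Hypothetical sharding workaround; names and shard count are made up.
    import zlib

    NUM_SHARDS = 4

    def _shard_of(key):
        # crc32 is deterministic across driver and executors, unlike the
        # built-in hash() for strings on Python 3 without PYTHONHASHSEED.
        return zlib.crc32(str(key).encode("utf-8")) % NUM_SHARDS

    def broadcast_in_shards(sc, big_dict):
        shards = [dict() for _ in range(NUM_SHARDS)]
        for k, v in big_dict.items():
            shards[_shard_of(k)][k] = v
        return [sc.broadcast(s) for s in shards]

    def shard_lookup(shard_bcs, key):
        # A task only deserializes the shard its key hashes into.
        return shard_bcs[_shard_of(key)].value.get(key)

The trade-off is one extra hash per lookup; in exchange, each executor only
fetches the shards its tasks actually touch.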
Hi All,
I am relatively new to Spark and am currently having trouble broadcasting
large variables (~500 MB in size). The broadcast fails with the error shown
below, and memory usage on the hosts also blows up.
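Not from the original message, but for reference: when a broadcast of this
size blows up memory, the first knobs usually raised are the driver and
executor heap sizes. The values below are illustrative only, and note that
spark.driver.memory must be set before the driver JVM starts (e.g. via
spark-submit --driver-memory or spark-defaults.conf), not from inside an
already-running application:

    # Illustrative memory settings; values are guesses, tune to the cluster.
    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("large-broadcast")
            .set("spark.executor.memory", "8g")   # per-executor JVM heap
            .set("spark.driver.memory", "8g"))    # only effective pre-launch
    sc = SparkContext(conf=conf)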
Our hardware consists of 8 hosts (1 x 64 GB (driver) and 7 x 32 GB (workers))
and we