I've already done that:

From the SparkUI Environment tab, the Spark properties include:

spark.shuffle.spill    false
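
For reference, a minimal sketch of setting that property programmatically
(assuming the Spark 1.x Scala API; only the property name and value are taken
from the thread, the rest is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Shuffle properties must be set before the context is created; values
    // are passed as strings, exactly as they appear in the Environment tab.
    val conf = new SparkConf()
      .setAppName("SimpleApp")              // hypothetical app name
      .set("spark.shuffle.spill", "false")
    val sc = new SparkContext(conf)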

On Wed, Mar 18, 2015 at 6:34 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> I think you can disable it with spark.shuffle.spill=false
>
> Thanks
> Best Regards
>
> On Wed, Mar 18, 2015 at 3:39 PM, Darren Hoo <darren....@gmail.com> wrote:
>
>> Thanks, Shao
>>
>> On Wed, Mar 18, 2015 at 3:34 PM, Shao, Saisai <saisai.s...@intel.com>
>> wrote:
>>
>>>  Yeah, as I said, your job's processing time is much larger than the
>>> slide interval, and streaming jobs are executed one by one in sequence, so
>>> each job waits until the previous one has finished and the total latency
>>> accumulates.
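>>>
>>> As a minimal sketch of what that implies (assuming Spark Streaming's Scala
>>> API and the 10-second batch interval mentioned later in the thread):
>>>
>>>     import org.apache.spark.SparkConf
>>>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>>>
>>>     val conf = new SparkConf().setAppName("SimpleApp")  // hypothetical name
>>>     // A batch is created every 10 s, but batch jobs run one at a time,
>>>     // so any batch that takes longer than 10 s delays every later batch.
>>>     val ssc = new StreamingContext(conf, Seconds(10))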
>>>
>>>
>>>
>>> I think you need to identify the bottleneck of your job first. If the
>>> shuffle is slow, you can enlarge the shuffle fraction of memory to reduce
>>> spilling, but the shuffle data is ultimately written to disk regardless;
>>> that cannot be disabled unless you mount your spark.local.dir on a ramdisk.
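>>>
>>> For illustration, a sketch of those two knobs on a Spark 1.x SparkConf
>>> (/mnt/ramdisk is a hypothetical tmpfs mount point, not from the thread):
>>>
>>>     import org.apache.spark.SparkConf
>>>
>>>     val conf = new SparkConf()
>>>       .set("spark.shuffle.memoryFraction", "0.8") // enlarge shuffle fraction (1.x default: 0.2)
>>>       .set("spark.local.dir", "/mnt/ramdisk")     // scratch dir on tmpfs, so spills stay in RAM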
>>>
>>>
>>>
>> I have increased spark.shuffle.memoryFraction to 0.8, which I can confirm
>> in SparkUI's environment variables.
>>
>> But spilling still happens right from the start, even while the latency is
>> less than the slide interval (I changed it to 10 seconds). The shuffle data
>> written to disk snowballs, and the job eventually slows down.
>>
>> I noticed that the files spilled to disk are all very small in size but
>> huge in number:
>>
>> total 344K
>> drwxr-xr-x  2 root root 4.0K Mar 18 16:55 .
>> drwxr-xr-x 66 root root 4.0K Mar 18 16:39 ..
>> -rw-r--r--  1 root root  80K Mar 18 16:54 shuffle_47_519_0.data
>> -rw-r--r--  1 root root  75K Mar 18 16:54 shuffle_48_419_0.data
>> -rw-r--r--  1 root root  36K Mar 18 16:54 shuffle_48_518_0.data
>> -rw-r--r--  1 root root  69K Mar 18 16:55 shuffle_49_319_0.data
>> -rw-r--r--  1 root root  330 Mar 18 16:55 shuffle_49_418_0.data
>> -rw-r--r--  1 root root  65K Mar 18 16:55 shuffle_49_517_0.data
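>>
>> An illustrative aside, not something suggested in the thread: Spark 1.x's
>> hash-based shuffle has a setting aimed at exactly this many-small-files
>> pattern; it merges map-side outputs into fewer, larger files per core:
>>
>>     // on the SparkConf used above (hash shuffle manager only)
>>     conf.set("spark.shuffle.consolidateFiles", "true")  // default: false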
>>
>> MemoryStore says:
>>
>> 15/03/18 17:59:43 WARN MemoryStore: Failed to reserve initial memory 
>> threshold of 1024.0 KB for computing block rdd_1338_2 in memory.
>> 15/03/18 17:59:43 WARN MemoryStore: Not enough space to cache rdd_1338_2 in 
>> memory! (computed 512.0 B so far)
>> 15/03/18 17:59:43 INFO MemoryStore: Memory use = 529.0 MB (blocks) + 0.0 B 
>> (scratch space shared across 0 thread(s)) = 529.0 MB. Storage limit = 529.9 
>> MB.
>>
>> Not enough space even for 512 bytes??
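>>
>> A hedged aside: in Spark 1.x's static memory model, the shuffle and storage
>> regions are separate fractions of the same heap, so raising
>> spark.shuffle.memoryFraction to 0.8 while spark.storage.memoryFraction stays
>> at its 0.6 default over-commits the heap and squeezes the block store. A
>> sketch of keeping the fractions within bounds (the 4g heap is an assumed
>> value, not from the thread):
>>
>>     import org.apache.spark.SparkConf
>>
>>     val conf = new SparkConf()
>>       .set("spark.executor.memory", "4g")         // assumed: raise per-executor heap
>>       .set("spark.shuffle.memoryFraction", "0.4") // keep shuffle + storage <= 1.0
>>       .set("spark.storage.memoryFraction", "0.5")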
>>
>>
>> The executors still have plenty of free memory (from the SparkUI Executors
>> tab; columns: ID, Address, RDD Blocks, Memory Used, Disk Used,
>> Active/Failed/Complete/Total Tasks, Task Time, Input, Shuffle Read,
>> Shuffle Write):
>>
>> 0         slave1:40778  0    0.0 B / 529.9 MB    0.0 B  16/0/15047/15063  2.17 h  0.0 B     402.3 MB  768.0 B
>> 1         slave2:50452  0    0.0 B / 529.9 MB    0.0 B  16/0/14447/14463  2.17 h  0.0 B     388.8 MB  1248.0 B
>> 1         lvs02:47325   116  27.6 MB / 529.9 MB  0.0 B  8/0/58169/58177   3.16 h  893.5 MB  624.0 B   1189.9 MB
>> <driver>  lvs02:47041   0    0.0 B / 529.9 MB    0.0 B  0/0/0/0           0 ms    0.0 B     0.0 B     0.0 B
>>
>>
>>> Besides, if CPU or network is the bottleneck, you might need to add more
>>> resources to your cluster.
>>>
>>>
>>>
>> 3 dedicated servers, each with 16 CPU cores + 16 GB memory and a Gigabit
>> network.
>> CPU load is quite low, about 1~3 in top, and network usage is far from
>> saturated.
>>
>> I don't even do any useful complex calculations in this small Simple App
>> yet.
>>
>>
>>
>
