Hi All,

I am trying to run spark-itemsimilarity on a dataset of 160M user
interactions. The job launches and runs successfully on a small sample of
about 1M actions. However, on the full dataset some Spark stages repeatedly
fail with an out-of-memory exception.

I tried changing spark.storage.memoryFraction from the Spark default
configuration, but I hit the same issue. How should I configure Spark when
running spark-itemsimilarity, or how else can I work around this
out-of-memory problem?
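
For reference, this is roughly how I am invoking the driver. The input and
output paths, master URL, and memory values below are placeholders, and I
am assuming a Mahout 0.10+ build where the Spark drivers accept
--sparkExecutorMem and the -D: passthrough for arbitrary Spark properties:

    mahout spark-itemsimilarity \
      --input hdfs:///data/interactions.csv \
      --output hdfs:///data/similarity-out \
      --master spark://<master-host>:7077 \
      --sparkExecutorMem 8g \
      -D:spark.storage.memoryFraction=0.3 \
      -D:spark.default.parallelism=600

Raising executor memory and increasing the parallelism (so each partition
is smaller) are the two knobs I have experimented with so far, but I may be
missing a setting that matters more for this job.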

Can you please advise?

Thanks,
Hani.

Hani Al-Shater | Data Science Manager - Souq.com <http://souq.com/>
Mob: +962 790471101 | Phone: +962 65821236 | Skype:
hani.alsha...@outlook.com | halsha...@souq.com <lgha...@souq.com> |
www.souq.com
Nouh Al Romi Street, Building number 8, Amman, Jordan
