Hi
I am working on building a recommender system on learning-content data. The
data is a user-item matrix of view counts, similar to the one below.
NS
These are the settings I have:
# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory
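For reference, active (uncommented) entries in spark-defaults.conf are one `key value` pair per line. A sketch of how the settings above would look once enabled; the `6g` value is an assumption for illustration, not taken from the thread:

```
spark.master              spark://master:7077
spark.eventLog.enabled    true
spark.eventLog.dir        hdfs://namenode:8021/directory
spark.serializer          org.apache.spark.serializer.KryoSerializer
spark.driver.memory       6g
```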
The data set is not big: 56K x 9K, though the column names are long
strings.
It fits very easily in pandas, which is also an in-memory tool, so I am not
sure memory is the issue here. If pandas can hold it easily and work on it
fast, then Spark shouldn't have problems either, right?
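For scale, a back-of-envelope estimate of the dense in-memory footprint of that matrix, assuming 8 bytes per cell (float64, pandas' default numeric dtype):

```python
# Rough dense footprint of a 56K x 9K user-item matrix.
# Assumption: 8-byte float64 cells, as pandas would use by default.
n_users, n_items = 56_000, 9_000
bytes_total = n_users * n_items * 8
gib = bytes_total / (1024 ** 3)
print(f"~{gib:.2f} GiB dense")  # roughly 3.8 GiB
```

Even a matrix pandas handles comfortably can exceed a Spark driver left at its default `spark.driver.memory` of 1g, which would be consistent with the OOM.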
I set the driver memory to 6 GB instead of 8 (half of 16). But would 2 GB
make this difference?
On Tuesday, September 13, 2016, neil90 [via Apache Spark User List] <
ml-node+s1001560n27704...@n3.nabble.com> wrote:
> Double check your Driver Memory in your Spark Web UI make sure the driver
> Memory is
Hi
I even tried dataframe.cache() before carrying out the crosstab
transformation. However, I still get the same OOM error.
recommender_ct.cache()
---
Py4JJavaError Traceback (most recent
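One thing worth noting: cache() is lazy and only marks the DataFrame for caching; nothing is materialized until an action runs. A minimal sketch, assuming a live SparkSession and the `recommender_ct` DataFrame from the thread (the column names are assumptions for illustration):

```python
# Sketch, assuming `recommender_ct` is the DataFrame from the thread.
recommender_ct.cache()
recommender_ct.count()  # an action: this is where the data is actually cached

# Caveat: crosstab builds one output column per distinct value of the
# second column, so on high-cardinality item IDs the result itself can
# exhaust driver memory regardless of caching.
ct = recommender_ct.crosstab("user_id", "item_id")  # column names assumed
```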
Hi, thanks
I tried that, but got the same OOM error again. I am not sure what to do
now. For spark.driver.maxResultSize I kept 2g; the rest I set as mentioned
above: 16g for the driver and 2g for the executor. I have a 16 GB Mac.
Please help. I am badly delayed on my work because of this and unable to
move ahead.
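As a general note (not from the thread): in local mode, spark.driver.memory must be set before the driver JVM starts, e.g. in spark-defaults.conf or at SparkSession creation; changing it on an already-running session has no effect. A sketch of setting it at session creation; the values shown are illustrative assumptions, not recommendations:

```python
from pyspark.sql import SparkSession

# Sketch: driver memory settings must be supplied before the JVM starts.
# The 6g / 2g values here are assumptions for illustration.
spark = (
    SparkSession.builder
    .master("local[*]")
    .config("spark.driver.memory", "6g")
    .config("spark.driver.maxResultSize", "2g")
    .getOrCreate()
)
```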