Difference between collect() and take(n)

MEETHU MATHEW Thu, 10 Jul 2014 05:53:59 -0700

Hi all,

I want to know how collect() works and how it differs from take(n). I am 
reading a 330 MB file with 43 lakh (4.3 million) rows and 13 columns, and 
calling take(4300000) to save the result to a variable works. The same does 
not work with collect(). Is there any difference in how the two operate?
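
To make the comparison concrete, here is a rough sketch in plain Python of how the two calls are documented to behave. It is a toy model, not Spark's code: an RDD is stood in for by a list of partitions, and the function names collect and take below are hypothetical stand-ins for the RDD methods. collect() returns every element of every partition to the driver, while take(n) reads partitions in order and stops once it has n elements. Real Spark scans one partition first and then larger batches of partitions; the sketch assumes one partition at a time for simplicity.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collect(partitions: Sequence[Sequence[T]]) -> List[T]:
    """Toy model of collect(): fetch every partition and concatenate them."""
    out: List[T] = []
    for part in partitions:
        out.extend(part)
    return out


def take(partitions: Sequence[Sequence[T]], n: int) -> List[T]:
    """Toy model of take(n): read partitions in order, stop after n elements."""
    out: List[T] = []
    for part in partitions:
        if len(out) >= n:
            break  # enough elements gathered, later partitions are never read
        out.extend(part[: n - len(out)])
    return out


# Hypothetical data: three partitions holding eight rows in total.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8]]

first_four = take(partitions, 4)   # only the first two partitions are read
everything = collect(partitions)   # all three partitions are read
```

In this model, take(n) with n at least the total row count returns the same list as collect(), so in both cases the whole dataset ends up in the driver's memory.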



Also, I wanted to set the Java heap size for my Spark program. I set it using 
spark.executor.extraJavaOptions in spark-defaults.conf. Now I want to set the 
same for the worker. Can I do that with SPARK_DAEMON_JAVA_OPTS? Is the 
following syntax correct?

SPARK_DAEMON_JAVA_OPTS="-XX:+UseCompressedOops -Xmx3g"


Thanks & Regards, 
Meethu M
