Hi all,

I want to know how collect() works and how it differs from take(). I am reading a 330 MB file that has 4.3 million (43 lakh) rows and 13 columns, and calling take(4300000) to save the result to a variable works. The same does not work with collect(). Is there any difference in how the two operate?

Also, I wanted to set the Java heap size for my Spark program. I set it using spark.executor.extraJavaOptions in spark-defaults.conf. Now I want to set the same for the worker. Can I do that with SPARK_DAEMON_JAVA_OPTS? Is the following syntax correct?

    SPARK_DAEMON_JAVA_OPTS="-XX:+UseCompressedOops -Xmx3g"

Thanks & Regards,
Meethu M