Hi All,

I have been facing memory issues with Spark. I'm using Spark SQL on AWS EMR. I
have a roughly 50 GB file in AWS S3 that I want to read in a BI tool (QlikView,
Tableau) connected to Spark SQL through the Thrift server over ODBC. From the
BI tool I'm executing: select * from table.
Sometimes I run into an OOM error, and other times executors are lost
(LOST_EXECUTOR). I'm really confused, because Spark runs fine for smaller data
sets.

I have a 3-node EMR cluster of m3.2xlarge instances.

I have set the following configuration for Spark:

export SPARK_EXECUTOR_INSTANCES=16
export SPARK_EXECUTOR_CORES=16
export SPARK_EXECUTOR_MEMORY=15G
export SPARK_DRIVER_MEMORY=12G
spark.kryoserializer.buffer.max 1024m
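
For reference, here are the same settings expressed as standard Spark
properties in spark-defaults.conf (a sketch of what I believe are the
equivalents of the environment variables above):

```
# spark-defaults.conf — assumed equivalents of the env vars above
spark.executor.instances         16
spark.executor.cores             16
spark.executor.memory            15g
spark.driver.memory              12g
spark.kryoserializer.buffer.max  1024m
```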

Even after setting SPARK_EXECUTOR_INSTANCES to 16, only 2 executors come up.

This has been a roadblock for a long time. Any help would be appreciated.

Thanks
Arun.

