Hi all, I have been facing memory issues in Spark. I'm using Spark SQL on AWS EMR. I have a file of around 50 GB in AWS S3. I want to read this file in a BI tool connected to Spark SQL through the Thrift server over ODBC. I'm executing select * from table in the BI tool (QlikView, Tableau). Sometimes I run into an OOM error and sometimes a LOST_EXECUTOR error. I'm really confused. Spark runs fine for smaller data sets.
I have a 3-node EMR cluster of m3.2xlarge instances. I have set the configuration below for Spark:

export SPARK_EXECUTOR_INSTANCES=16
export SPARK_EXECUTOR_CORES=16
export SPARK_EXECUTOR_MEMORY=15G
export SPARK_DRIVER_MEMORY=12G
spark.kryoserializer.buffer.max 1024m

Even after setting SPARK_EXECUTOR_INSTANCES to 16, only 2 executors come up. This has been a roadblock for a long time. Any help would be appreciated.

Thanks,
Arun
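For anyone checking the numbers, here is a rough sketch of the YARN container arithmetic that might explain why only 2 executors start. Everything in it is an assumption rather than something confirmed on this cluster: that one of the 3 nodes is the EMR master (leaving 2 worker nodes), that each worker offers about 23040 MB to YARN (a value I believe is the EMR default for m3.2xlarge, but it should be checked against yarn.nodemanager.resource.memory-mb), and that Spark adds a memory overhead of 10% of executor memory with a 384 MB floor (this fraction differs between Spark versions). The function names are made up for illustration.

```python
# Back-of-the-envelope estimate of how many Spark executors YARN can place.
# All default values below are assumptions, not measurements from the cluster.

def executors_per_node(node_yarn_mem_mb, executor_mem_mb,
                       overhead_percent=10, min_overhead_mb=384):
    """Return how many executor containers fit in one node's YARN memory."""
    # Spark requests executor memory plus an off-heap overhead per container.
    overhead_mb = max(min_overhead_mb, executor_mem_mb * overhead_percent // 100)
    container_mb = executor_mem_mb + overhead_mb
    return node_yarn_mem_mb // container_mb


def max_executors(worker_nodes, node_yarn_mem_mb, executor_mem_mb):
    """Upper bound on executors across all worker nodes (memory only)."""
    return worker_nodes * executors_per_node(node_yarn_mem_mb, executor_mem_mb)


if __name__ == "__main__":
    worker_nodes = 2            # assumed: 3 nodes minus 1 master
    node_yarn_mem_mb = 23040    # assumed YARN memory per m3.2xlarge worker
    executor_mem_mb = 15 * 1024 # SPARK_EXECUTOR_MEMORY=15G

    print(max_executors(worker_nodes, node_yarn_mem_mb, executor_mem_mb))
```

Under these assumptions each executor needs a container of roughly 16.5 GB, so only one fits per worker, which would give 2 executors regardless of SPARK_EXECUTOR_INSTANCES=16. Separately, an m3.2xlarge has 8 vCPUs, so SPARK_EXECUTOR_CORES=16 asks for more cores than a node has. Trying smaller executors (for example 4 to 5 GB each) in this calculation shows how many more could fit, though that alone may not fix the OOM if the BI tool pulls the full select * result through the driver.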