Re: Troubleshooting Spark OOM

2019-01-09 Thread Dillon Dukek
I think most Spark technical support people would recommend upgrading to Spark 2.0+ for starters. However, I understand that's not always possible. In that case, I would double-check that you don't have a join key that has many records associated with i
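
One rough way to check for the kind of heavy join key mentioned above is to count occurrences per key on a sample of the join column. The sketch below is plain Scala with no Spark dependency. `JoinKeySkewSketch`, `heaviestKeys`, and the sample values are hypothetical names and data chosen for illustration, and it assumes the sampled keys fit in driver memory.

```scala
object JoinKeySkewSketch {
  // Count how often each key occurs and return the `top` most frequent keys,
  // most frequent first.
  def heaviestKeys[K](sampledKeys: Seq[K], top: Int): Seq[(K, Int)] =
    sampledKeys
      .groupBy(identity)
      .map { case (key, occurrences) => key -> occurrences.size }
      .toSeq
      .sortBy { case (_, count) => -count }
      .take(top)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample of join-key values (assumption, not from the thread).
    val sample = Seq("a", "b", "a", "c", "a", "b")
    heaviestKeys(sample, 2).foreach { case (key, count) =>
      println(s"$key -> $count")
    }
  }
}
```

A key whose count dwarfs the others in the sample is a candidate for the skew described in the message.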

Re: Troubleshooting Spark OOM

2019-01-09 Thread William Shen
Thank you for the tips. We are running Spark 1.6 (Scala), and the OOM happens when Spark SQL tries to join a few large datasets together for processing/transformation...

On Wed, Jan 9, 2019 at 3:42 PM Ramandeep Singh wrote:
> Hi,
>
> Here are a few suggestions that you can try.
>
> OOM Issues that, I

Re: Troubleshooting Spark OOM

2019-01-09 Thread Ramandeep Singh
Hi,

Here are a few suggestions that you can try.

OOM issues that I have faced with Spark:

*Not enough shuffle partitions*: Increase them.
Memory overhead set too low: Boost it to around 12 percent. You usually get this as an error message in your executors.
*Large Executor Configs*: They can
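
The first two suggestions above map onto two Spark settings: `spark.sql.shuffle.partitions` and, on YARN with Spark 1.x, `spark.yarn.executor.memoryOverhead` (a value in MB). The following is a minimal plain-Scala sketch of the arithmetic, assuming a YARN deployment. `OomTuningSketch`, its method names, and the example sizes (8192 MB executors, 800 partitions) are hypothetical and chosen only for illustration. The 384 MB floor mirrors the documented minimum for the YARN overhead setting.

```scala
object OomTuningSketch {
  // Overhead in MB: the given fraction of executor memory, but never below the floor.
  def memoryOverheadMb(executorMemoryMb: Int,
                       fraction: Double = 0.12, // "around 12 percent" from the message
                       floorMb: Int = 384): Int =
    math.max(floorMb, math.ceil(executorMemoryMb * fraction).toInt)

  // Render the two settings as key=value pairs suitable for `--conf`.
  def confFlags(executorMemoryMb: Int, shufflePartitions: Int): Seq[String] = Seq(
    s"spark.sql.shuffle.partitions=$shufflePartitions",
    s"spark.yarn.executor.memoryOverhead=${memoryOverheadMb(executorMemoryMb)}"
  )

  def main(args: Array[String]): Unit =
    // Illustrative values only (assumptions, not from the thread).
    confFlags(executorMemoryMb = 8192, shufflePartitions = 800)
      .foreach(flag => println(s"--conf $flag"))
}
```

The printed pairs can be passed to `spark-submit` with `--conf`. The right partition count depends on the data volume, so treat the numbers as starting points.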

Re: Troubleshooting Spark OOM

2019-01-09 Thread Dillon Dukek
Hi William, Just to get started, can you describe the Spark version you are using and the language? It doesn't sound like you are using PySpark; however, problems arising from that can be different, so I just want to be sure. As well, can you talk through the scenario under which you are dealing wi

Troubleshooting Spark OOM

2019-01-09 Thread William Shen
Hi there, We've encountered Spark executor Java OOM issues in our Spark application. Any tips on how to troubleshoot and identify what objects are occupying the heap? In the past, when dealing with JVM OOMs, we've analyzed heap dumps, but we are having a hard time locating Spark heap d
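
One common way to make executor JVMs write a heap dump when they run out of memory is to pass the standard HotSpot flags `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath` through `spark.executor.extraJavaOptions`. The sketch below is plain Scala that only assembles the option string. `HeapDumpOptionsSketch`, `heapDumpJavaOptions`, and the directory `/tmp/spark-heap-dumps` are hypothetical names chosen for illustration, not something the thread prescribes.

```scala
object HeapDumpOptionsSketch {
  // Build the JVM options that request a heap dump on OutOfMemoryError.
  // `dumpDir` is a directory on the executor host's local filesystem (assumption).
  def heapDumpJavaOptions(dumpDir: String): String =
    s"-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$dumpDir"

  def main(args: Array[String]): Unit = {
    val opts = heapDumpJavaOptions("/tmp/spark-heap-dumps")
    // Print the setting in the form accepted by `spark-submit --conf`.
    println("--conf \"spark.executor.extraJavaOptions=" + opts + "\"")
  }
}
```

The dump is written on the node where the failing executor ran, so the directory has to exist and be writable there, and the file has to be collected from that host.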