Hi, all:
I'm using the Spark SQL thrift server under Spark 1.3.1 to run Hive SQL queries. I started the thrift server with:

    ./sbin/start-thriftserver.sh --master yarn-client --num-executors 12 --executor-memory 5g --driver-memory 5g

and then sent continuous Hive SQL queries to it. However, about 20 minutes later I got an OOM error in the thrift server log:

    Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "sparkDriver-scheduler-1"

and the resident memory of the thrift server process had grown to 5.7 GB (measured with the top command on the Linux server). With the help of Google I found the following patch and applied it to Spark 1.3.1, but it didn't help:

https://github.com/apache/spark/pull/12932/commits/559db12bf0b708d95d5066d4c41220ab493c70c9

The following is my jmap output:

 num     #instances         #bytes  class name
----------------------------------------------
   1:      21844706     1177481920  [C
   2:      21842972      524231328  java.lang.String
   3:       5429362      311283856  [Ljava.lang.Object;
   4:       3619510      296262792  [Ljava.util.HashMap$Entry;
   5:       8887511      284400352  java.util.HashMap$Entry
   6:       3618802      202652912  java.util.HashMap
   7:         51304      150483664  [B
   8:       5421523      130116552  java.util.ArrayList
   9:       4514237      108341688  org.apache.hadoop.hive.metastore.api.FieldSchema
  10:       3611642       57786272  java.util.HashMap$EntrySet

So, how can I keep my Spark SQL thrift server running for longer, other than simply giving the driver more memory? Has anyone else run into the same situation?

Sincerely,
Young
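
P.S. In case it helps anyone reproduce or dig into this: the histogram above came from something like the first command below, and the second is only a sketch of how one could capture a full heap dump on the next OOM for offline analysis. The PID and dump path are placeholders, and --driver-java-options with the -XX flags are just standard spark-submit / HotSpot options, nothing thrift-server specific:

    # Heap histogram of the driver process (similar to the output above);
    # <driver-pid> is a placeholder for the thrift server's JVM PID.
    jmap -histo:live <driver-pid> | head -n 20

    # Optionally restart with a heap dump written on the next OOM, so the
    # dominating objects can be inspected in a heap analyzer afterwards.
    ./sbin/start-thriftserver.sh \
      --master yarn-client \
      --num-executors 12 \
      --executor-memory 5g \
      --driver-memory 5g \
      --driver-java-options "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/thriftserver.hprof"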