Hi, all:

I'm using the Spark SQL Thrift Server under Spark 1.3.1 to run Hive SQL queries. I started the thrift server with

    ./sbin/start-thriftserver.sh --master yarn-client \
        --num-executors 12 --executor-memory 5g --driver-memory 5g

and then sent a continuous stream of Hive SQL queries to it.


However, about 20 minutes later, I got an OOM error in the thrift server log:

    Exception: java.lang.OutOfMemoryError thrown from the
    UncaughtExceptionHandler in thread "sparkDriver-scheduler-1"

and the memory of the thrift server process had grown to 5.7 GB (measured with the top command on the Linux server).
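
In case anyone wants to inspect the full heap, I believe a heap dump can be captured at the moment of the OOM by passing standard HotSpot flags through to the driver JVM, something like the sketch below (the dump path is just an example, and I'm assuming start-thriftserver.sh forwards --driver-java-options on to spark-submit):

    # assumption: extra options are forwarded to spark-submit by the start script
    ./sbin/start-thriftserver.sh --master yarn-client \
        --num-executors 12 --executor-memory 5g --driver-memory 5g \
        --driver-java-options "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/thrift-driver.hprof"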


With the help of Google, I found the following patch and applied it to Spark 1.3.1, but it didn't help:
https://github.com/apache/spark/pull/12932/commits/559db12bf0b708d95d5066d4c41220ab493c70c9


The following is my jmap output:


 num     #instances         #bytes  class name
----------------------------------------------
   1:      21844706     1177481920  [C
   2:      21842972      524231328  java.lang.String
   3:       5429362      311283856  [Ljava.lang.Object;
   4:       3619510      296262792  [Ljava.util.HashMap$Entry;
   5:       8887511      284400352  java.util.HashMap$Entry
   6:       3618802      202652912  java.util.HashMap
   7:         51304      150483664  [B
   8:       5421523      130116552  java.util.ArrayList
   9:       4514237      108341688  org.apache.hadoop.hive.metastore.api.FieldSchema
  10:       3611642       57786272  java.util.HashMap$EntrySet
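
For reference, the histogram above came from something like the following (the grep pattern for locating the driver PID is just an example):

    # assumption: the driver shows up in jps with the thrift server class in its arguments
    jps -lm | grep -i thriftserver
    # class histogram of the driver heap; <pid> is the PID found above
    jmap -histo <pid> | head -15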


From the histogram, the org.apache.hadoop.hive.metastore.api.FieldSchema instances (and the strings, maps, and lists presumably hanging off them) seem to be what keeps accumulating. So, how can I make my Spark SQL Thrift Server run longer, other than simply giving the driver more memory? Has anyone encountered the same situation?
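
In the meantime, to check whether this is a genuine leak rather than the heap simply growing until it hits its limit, I guess GC logging on the driver would show whether full GCs reclaim anything over time. A sketch with standard HotSpot flags (the log path is just an example):

    --driver-java-options "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/thrift-gc.log"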


Sincerely,
Young
