ZhengYaofeng created SPARK-10529:
------------------------------------

             Summary: When creating multiple HiveContext objects in one JVM, JDBC connections to the metastore can't be released, which may cause a PermGen OutOfMemoryError.
                 Key: SPARK-10529
                 URL: https://issues.apache.org/jira/browse/SPARK-10529
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.4.1
            Reporter: ZhengYaofeng


Test code as follows:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object SqlTest {
  def main(args: Array[String]): Unit = {
    var port = 4042

    // Build a fresh SparkContext for each iteration, bumping the UI port every time.
    def createSc: SparkContext = {
      val sparkConf = new SparkConf().setAppName(s"SqlTest_$port")
        .setMaster("spark://zdh221:7077")
        .set("spark.ui.port", port.toString)
        .set("spark.executor.memory", "4g")
        .set("spark.executor.cores", "2")
        .set("spark.cores.max", "6")
      port += 1
      new SparkContext(sparkConf)
    }

    while (port - 4042 < 200) {
      println(s"============Current app id:${port - 4042}=============")

      // Create a new HiveContext per iteration, run one query, then stop the context.
      val hc = new HiveContext(createSc)
      hc.sql("show databases").collect().foreach(println)
      hc.sparkContext.stop()
      // Neither of the following releases the metastore JDBC connections:
      // Hive.closeCurrent()
      // hc = null  (would also require 'var hc')
      Thread.sleep(3000)
    }

    Thread.sleep(1000000)
  }
}

Testing on Spark 1.4.1 with the run command below:
        export CLASSPATH="$CLASSPATH:/home/hadoop/spark/conf:/home/hadoop/spark/lib/*:/home/hadoop/zyf/lib/*"
        java -Xmx8096m -Xms1024m -XX:MaxPermSize=1024m -cp $CLASSPATH SqlTest

Files list:
        /home/hadoop/spark/conf: core-site.xml; hdfs-site.xml; hive-site.xml; slaves; spark-defaults.conf; spark-env.sh
        /home/hadoop/zyf/lib: hadoop-lzo-0.4.20.jar; mysql-connector-java-5.1.28-bin.jar; sqltest-1.0-SNAPSHOT.jar
        
MySQL is used as the metastore. While the test app is running, you can clearly see the number of JDBC connections to MySQL grow steadily via the command "show status like 'Threads_connected';". Even invoking 'Hive.closeCurrent()' does not release the current JDBC connections, and I cannot find any other way to release them. If you run the same test on Spark 1.3.1, the JDBC connections do not grow.
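For reference, a minimal sketch of how the connection count can be polled from a separate JVM while the test runs. This is not part of the original test; the metastore URL and credentials below are placeholders and must be replaced with the values from hive-site.xml:

import java.sql.DriverManager

object MetastoreConnCount {
  def main(args: Array[String]): Unit = {
    // Older setups may need the driver loaded explicitly.
    Class.forName("com.mysql.jdbc.Driver")
    // Placeholder metastore URL and credentials (assumed, not the actual test values).
    val conn = DriverManager.getConnection("jdbc:mysql://zdh221:3306/", "hive", "hive")
    try {
      val rs = conn.createStatement().executeQuery("show status like 'Threads_connected'")
      while (rs.next()) {
        // Column 1 is the variable name, column 2 its value.
        println(s"${rs.getString(1)} = ${rs.getString(2)}")
      }
    } finally {
      conn.close()
    }
  }
}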

Meanwhile, the test ends with 'java.lang.OutOfMemoryError: PermGen space' after 45 iterations, i.e. after 45 HiveContext objects have been created. Interestingly, with MaxPermSize set to '2048m' it runs for 93 iterations, and with '3072m' for 141 iterations. This indicates that each HiveContext loads roughly the same amount of new classes into PermGen and that these classes are never unloaded.
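Rough back-of-the-envelope arithmetic on those numbers (assuming PermGen growth per iteration is roughly constant):

        1024 MB extra PermGen (1024m -> 2048m) buys 93 - 45 = 48 extra iterations
        1024 MB extra PermGen (2048m -> 3072m) buys 141 - 93 = 48 extra iterations
        => roughly 1024 MB / 48 ≈ 21 MB of PermGen leaked per HiveContext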



