Suhas Nalapure created PHOENIX-4502:
---------------------------------------

             Summary: Phoenix-Spark plugin doesn't release zookeeper connections
                 Key: PHOENIX-4502
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4502
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.11.0
         Environment: HBase 1.2 on Linux (Ubuntu, CentOS)
            Reporter: Suhas Nalapure


*1. The Phoenix-Spark plugin doesn't release ZooKeeper connections*
Example:
    for (int i = 0; i < 50; i++) {
        Dataset<Row> df = sqlContext.read().format("org.apache.phoenix.spark")
                .option("table", "\"Sales\"")
                .option("zkUrl", "localhost:2181")
                .load();
        df.show(2);
    }
    Thread.sleep(1000 * 60);
When the above snippet is executed, the number of established connections to port 2181 keeps increasing and is not released until the main thread wakes up from its sleep and the program ends, as shown below (14 is the number of connections even before the program starts to run):
netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
16:52:05
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
22
16:52:15
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
38
16:52:18
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
68
16:52:23
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
100
16:52:27
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
116
16:52:32
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
116
16:52:38
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
116
16:52:52
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
116
16:53:00
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
116
16:53:24
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
16:53:32
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
16:53:34
root@user1 ~ $

*2. If the "jdbc" format is used instead to create the Spark DataFrame, the connection count doesn't shoot up*
Example:
    for (int i = 0; i < 50; i++) {
        Dataset<Row> df = sqlContext.read().format("jdbc")
                .option("url", "jdbc:phoenix:localhost:2181")
                .option("dbtable", "\"Sales\"")
                .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
                .load();
        df.show(2);
    }
    Thread.sleep(1000 * 60);
Connection counts during program execution (14 being the count before execution starts):

root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
17:00:42
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
17:00:43
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:00:46
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:00:50
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:00:55
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:12
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:18
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:28
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:34
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:37
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
16
17:01:39
root@user1 ~ $ netstat -anp | grep 2181|grep EST| wc -l; date +"%H:%M:%S"
14
17:02:07
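
For comparison outside Spark, here is a minimal plain-JDBC sketch, assuming the same local cluster and the "Sales" table used above; the class name and the LIMIT query are illustrative only. It opens and explicitly closes a Phoenix connection on every iteration, so the same netstat command can be run alongside it to compare the connection counts against the two Spark cases:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PhoenixJdbcBaseline {
        public static void main(String[] args) throws Exception {
            // Mirror the Spark loops above, but open and explicitly close a plain
            // Phoenix JDBC connection on every iteration via try-with-resources.
            for (int i = 0; i < 50; i++) {
                try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
                     Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery("SELECT * FROM \"Sales\" LIMIT 2")) {
                    while (rs.next()) {
                        // Touch the first column so the query runs end to end.
                        rs.getString(1);
                    }
                }
            }
            // Keep the JVM alive so the connection count can still be sampled afterwards.
            Thread.sleep(1000 * 60);
        }
    }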


