ken created DRILL-6935:
--------------------------

             Summary: Apache drill show a lot of CLOSE_WAIT states when we access https://ip address:8047
                 Key: DRILL-6935
                 URL: https://issues.apache.org/jira/browse/DRILL-6935
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill
    Affects Versions: 1.13.0
            Reporter: ken

Hi Team,

Hope all is good. We need your help.

Here is the Apache Drill process running on our server:

drill 19220 1 17 16:48 ? 00:15:32 /usr/java/jdk/bin/java -Xms8G -Xmx8G -XX:MaxDirectMemorySize=96G -XX:ReservedCodeCacheSize=1024m -Ddrill.exec.enable-epoll=false -XX:+CMSClassUnloadingEnabled -XX:+UseG1GC -Dlog.path=/var/log/drill/drillbit.log -Dlog.query.path=/var/log/drill/drillbit_queries.json -cp /usr/local/apache-drill-1.13.1/conf:/usr/local/apache-drill-1.13.1/jars/*:/usr/local/apache-drill-1.13.1/jars/ext/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/*:/usr/local/apache-drill-1.13.1/jars/classb/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/linux/* org.apache.drill.exec.server.Drillbit
root 23651 23227 0 18:16 pts/1 00:00:00 grep --color=auto java

Question 1:

Many connections are left in the CLOSE_WAIT state when I access Apache Drill at https://ip address:8047 <https://theremin.digitalalchemy.net.au:8047/> (I have changed our server IP to xxxx in the output below for security reasons). As a result, we can't access Apache Drill at https://ip address:8047, so we can't check which SQL queries failed.

tcp6       0      0 :::8047              :::*                   LISTEN      19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54132  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.100.222:52986  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:53009  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54131  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:61202    CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54366  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54129  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:58627  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:58486  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54134  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:53008  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:56226    CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:52991  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:51172    CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:36136    CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54133  CLOSE_WAIT  19220/java
tcp6      24      0 192.168.xxxx:8047    192.168.100.131:57474  ESTABLISHED 19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54069  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54130  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:53001  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:52985  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:52990  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54212  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.100.131:58628  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.100.131:53955  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:57391  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:41219    CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54307  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:53000  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.222:52984  CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54308  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047    192.168.3.119:46189    CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047    192.168.100.131:54211  CLOSE_WAIT  19220/java

Question 2:

Our Apache Drill instance goes down frequently; it seems to be due to a memory leak.
However, we have configured 96G of memory for Apache Drill, so can you please advise how we can identify which SQL query used a lot of memory, and how we can improve performance?

Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on theremin.root.digitalalchemy:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (67043328)
Allocator(op:14:0:0:HashPartitionSender) 1000000/67043328/101535744/10000000000 (res/actual/peak/limit)


Fragment 14:0

[Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on theremin.root.digitalalchemy:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:300) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.13.0.jar:1.13.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]

)

Thank you.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
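
For triage, the socket listing under Question 1 and the allocator line in the Question 2 error can be summarized with a short script. This is only a sketch: it assumes the listing uses the column order shown in the report (protocol, Recv-Q, Send-Q, local address, foreign address, state), and that the allocator line follows the `res/actual/peak/limit` layout printed in the error message. The function names `count_states` and `parse_allocator` are hypothetical helpers and are not part of Drill. CLOSE_WAIT means the remote peer has closed the connection and the local process has not yet closed its socket, so a per-client tally shows which clients the stuck sockets belong to.

```python
import re
from collections import Counter

# Hypothetical helper: tally (peer IP, TCP state) pairs for one local port.
# Assumes whitespace-separated columns in the order:
#   proto recv-q send-q local-address foreign-address state [pid/program]
def count_states(netstat_text, port=8047):
    counts = Counter()
    suffix = f":{port}"
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers, blank lines, non-TCP rows
        local, foreign, state = fields[3], fields[4], fields[5]
        if not local.endswith(suffix) or state == "LISTEN":
            continue  # only connections on the chosen port; skip the listener
        peer_ip = foreign.rsplit(":", 1)[0]  # drop the remote port
        counts[(peer_ip, state)] += 1
    return counts

# Pattern for the allocator line as it appears in the error message:
#   Allocator(<name>) <res>/<actual>/<peak>/<limit> (res/actual/peak/limit)
ALLOCATOR_RE = re.compile(
    r"Allocator\((?P<name>[^)]*)\)\s+"
    r"(?P<res>\d+)/(?P<actual>\d+)/(?P<peak>\d+)/(?P<limit>\d+)"
)

# Hypothetical helper: extract the allocator name and its four byte counts.
def parse_allocator(text):
    match = ALLOCATOR_RE.search(text)
    if match is None:
        return None
    info = {key: int(match.group(key)) for key in ("res", "actual", "peak", "limit")}
    info["name"] = match.group("name")
    return info
```

Calling `count_states` on the listing above and sorting the result with `most_common()` groups the stuck sockets by client address. Calling `parse_allocator` on the error text returns the allocator name together with the byte counts, where `actual` corresponds to the "Memory leaked" figure in the message.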