By itself, the CLOSE_WAIT state does not indicate a problem. Check the jstack output of the Drillbit, the Jetty worker threads in particular. Try increasing the drill.exec.http.jetty.server.selector setting.
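
One way to go through the jstack output is a small script that tallies thread states for the Jetty workers. This is a minimal Python sketch, not part of Drill. It assumes a HotSpot-style thread dump (as produced by `jstack <pid>`), and it assumes the Jetty worker threads carry Jetty's default `qtp` name prefix, which may differ in your installation, so check the thread names in your own dump first. The sample dump in the code is made up for illustration.

```python
import re
from collections import Counter

# Thread header line in a HotSpot dump: the thread name in double quotes.
NAME_RE = re.compile(r'^"([^"]+)"')
# State line that follows the header of each Java thread.
STATE_RE = re.compile(r"^\s+java\.lang\.Thread\.State: (\w+)")


def thread_states(dump_text, name_prefix="qtp"):
    """Count Thread.State values for threads whose name starts with name_prefix.

    name_prefix="qtp" is an assumption (Jetty's default pool naming).
    """
    counts = Counter()
    current = None
    for line in dump_text.splitlines():
        header = NAME_RE.match(line)
        if header:
            current = header.group(1)
            continue
        state = STATE_RE.match(line)
        if state and current is not None and current.startswith(name_prefix):
            counts[state.group(1)] += 1
            current = None
    return counts


# Hypothetical dump excerpt, for illustration only.
SAMPLE_DUMP = """\
"qtp1234-50" #50 prio=5 os_prio=0 tid=0x00007f0000000001 nid=0x4b20 waiting on condition [0x00007f0000001000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"qtp1234-51" #51 prio=5 os_prio=0 tid=0x00007f0000000002 nid=0x4b21 runnable [0x00007f0000002000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

"main" #1 prio=5 os_prio=0 tid=0x00007f0000000003 nid=0x4b00 in Object.wait() [0x00007f0000003000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
"""

if __name__ == "__main__":
    for state, n in sorted(thread_states(SAMPLE_DUMP).items()):
        print(state, n)
```

To use it on a real dump, read the file saved from jstack and pass its text to `thread_states`. If most of the worker threads turn out to be blocked or parked in the same place, that may be a more useful lead than the CLOSE_WAIT count.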

Thank you,

Vlad

On 6/26/18 01:26, Ken Qi (Guangquan) wrote:
Hi Team,

Hope all is good.

We need your help.

Here is the Apache Drill process running on our server.

drill    19220     1 17 16:48 ?        00:15:32 /usr/java/jdk/bin/java
-Xms8G -Xmx8G -XX:MaxDirectMemorySize=96G -XX:ReservedCodeCacheSize=1024m
-Ddrill.exec.enable-epoll=false -XX:+CMSClassUnloadingEnabled -XX:+UseG1GC
-Dlog.path=/var/log/drill/drillbit.log
-Dlog.query.path=/var/log/drill/drillbit_queries.json -cp
/usr/local/apache-drill-1.13.1/conf:/usr/local/apache-drill-1.13.1/jars/*:/usr/local/apache-drill-1.13.1/jars/ext/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/*:/usr/local/apache-drill-1.13.1/jars/classb/*:/usr/local/apache-drill-1.13.1/jars/3rdparty/linux/*
org.apache.drill.exec.server.Drillbit
root     23651 23227  0 18:16 pts/1    00:00:00 grep --color=auto java

Question 1:

There are a lot of connections in the CLOSE_WAIT state when I access Apache
Drill at https://ip address:8047 <https://theremin.digitalalchemy.net.au:8047/>
(I have replaced our server IP with xxxx in the output below for security
reasons). As a result, we can't access Apache Drill at https://ip address:8047
<https://theremin.digitalalchemy.net.au:8047/>, so we can't check which SQL
queries failed.

tcp6       0      0 :::8047                 :::*                    LISTEN      19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54132   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.100.222:52986   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:53009   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54131   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:61202     CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54366   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54129   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:58627   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:58486   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54134   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:53008   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:56226     CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:52991   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:51172     CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:36136     CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54133   CLOSE_WAIT  19220/java
tcp6      24      0 192.168.xxxx:8047      192.168.100.131:57474   ESTABLISHED 19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54069   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54130   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:53001   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:52985   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:52990   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54212   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.100.131:58628   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.100.131:53955   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:57391   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:41219     CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54307   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:53000   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.222:52984   CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54308   CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:46189     CLOSE_WAIT  19220/java
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54211   CLOSE_WAIT  19220/java
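
A listing like the one above is easier to read once it is grouped by client and state. The following is a rough Python sketch for that, not something Drill provides. It assumes netstat-style text in which the remote address is an IPv4 `ip:port` followed (possibly after a line break) by the TCP state; the function name and the three-connection sample are illustrative only.

```python
import re
from collections import Counter

# Remote IPv4 address and port, then whitespace (possibly a line break),
# then the TCP state. A redacted local address such as 192.168.xxxx:8047
# does not match, so only the remote side is captured.
CONN_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+\s+([A-Z][A-Z_0-9]+)")


def tally_states(netstat_text):
    """Return a Counter keyed by (remote_ip, state)."""
    return Counter(CONN_RE.findall(netstat_text))


# Small excerpt in the same shape as the listing above.
SAMPLE = """\
tcp6     518      0 192.168.xxxx:8047      192.168.100.131:54132
  CLOSE_WAIT  19220/java
tcp6       1      0 192.168.xxxx:8047      192.168.3.119:61202
  CLOSE_WAIT  19220/java
tcp6      24      0 192.168.xxxx:8047      192.168.100.131:57474
  ESTABLISHED 19220/java
"""

if __name__ == "__main__":
    for (ip, state), n in sorted(tally_states(SAMPLE).items()):
        print(f"{ip:<18}{state:<12}{n}")
```

Running the tally at intervals would show whether the CLOSE_WAIT count per client keeps growing or stays flat.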



Question 2:


Our Apache Drill goes down frequently, apparently because of a memory
leak. We have configured 96G of memory for Apache Drill, so can you
please advise how we can identify which SQL queries use a lot of memory,
and how we can improve performance?


Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on
theremin.root.digitalalchemy:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (67043328)
Allocator(op:14:0:0:HashPartitionSender) 1000000/67043328/101535744/10000000000 (res/actual/peak/limit)


Fragment 14:0

[Error Id: 40d789a6-91ee-4e0b-bfc9-a26358a43df3 on
theremin.root.digitalalchemy:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:300) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266) [drill-java-exec-1.13.0.jar:1.13.0]
        at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.13.0.jar:1.13.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]

)

Thank you.


Regards.

Ken Qi

System Operations Department Leader

Digital Alchemy (Nanjing) Limited Company

T : +86 25 83177103 (Ext:2003)
M: +86 13913876298

https://www.digitalalchemy.asia/

<https://www.linkedin.com/pulse/digital-alchemy-expands-north-america-regan-yan/>

