Re: Workers CPU leak [epoll_wait,epoll_ctl]
Hello! On Sat, Jul 29, 2017 at 05:35:08AM -0400, lamsung wrote: > Hello, I have strange issuses with nginx workers. For some time after start > Nginx I notice that some process of workers cause high load to CPU ( > principally sys CPU). [...] > Nginx has such version and modules: > > nginx version: nginx/1.9.11 > built with OpenSSL 1.0.2f 28 Jan 2016 > TLS SNI support enabled > configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf > --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid > --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include > --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log > --http-client-body-temp-path=/var/lib/nginx/tmp/client > --http-proxy-temp-path=/var/lib/nginx/tmp/proxy > --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi > --http-scgi-temp-path=/var/lib/nginx/tmp/scgi > --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-file-aio --with-ipv6 > --with-pcre --with-threads --without-http_autoindex_module > --without-http_fastcgi_module --without-http_geo_module > --without-http_limit_req_module --without-http_limit_conn_module > --without-http_memcached_module --without-http_uwsgi_module > --with-http_flv_module --with-http_gzip_static_module --with-http_mp4_module > --with-http_perl_module > --add-module=external_module/headers-more-nginx-module-0.261 > --add-module=external_module/ngx_estreaming_module-0.01 > --add-module=external_module/ngx_slice_module-0.01 --with-http_ssl_module > --without-mail_imap_module --without-mail_pop3_module > --without-mail_smtp_module --user='www --group=www' > > and using for video streaming. > > Has anyone encountered such behavior ? Help please. An obvious first step would be to try to reproduce the problem without 3rd party modules. Trying something newer than 1.9.11 is also recommended - it is not supported since release of nginx 1.9.12 on 24 Feb 2016. Current versions are 1.13.3 (mainline) and 1.12.1 (stable). -- Maxim Dounin http://nginx.org/ ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Hello! On Sun, Mar 20, 2016 at 04:34:14PM -0400, vizl wrote: > We found, that we are running 'truncate -s 0' to file before removing them. > Can it potentially cause the mentioned above problems ? Yes, for sure. This is a modification of a file being served, and it's expected to cause the CPU hog in question when using sendfile in threads on Linux. In nginx 1.9.13 nginx will log an alert instead, see this commit: http://hg.nginx.org/nginx/rev/4df3d9fcdee8 -- Maxim Dounin http://nginx.org/ ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Thank you. Waiting for 1.9.13 branch. Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265536#msg-265536 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
On 3/20/16 11:34 PM, vizl wrote: > We found, that we are running 'truncate -s 0' to file before removing them. > Can it potentially cause the mentioned above problems ? > Yes, quite possible. The fix was committed to the mainline branch and will be available in 1.9.13: http://mailman.nginx.org/pipermail/nginx-devel/2016-March/008012.html -- Maxim Konovalov ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
We found, that we are running 'truncate -s 0' to file before removing them. Can it potentially cause the mentioned above problems ? Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265516#msg-265516 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Hello! On Wed, Mar 09, 2016 at 10:56:40AM -0500, vizl wrote: > Thank you. > We don't make any changes with files or overwrite them while sendfile > processing it. > Only create temp file and then mv it. The debug log suggests this is not true. > Maybe Is it the same bug concerned with treads aio like in this messeges, > https://forum.nginx.org/read.php?21,264701,265016#msg-265016 > and it would be fixed in future ? Yes, and the patch provided links to the very same thread and resolves the problem in nginx, i.e., CPU hog. With the patch an alert will be correctly logged. Note well that the root cause of CPU hog observed are non-atomic file updates. You will still see other problems till this is resolved (including data corruption), even with the patch. -- Maxim Dounin http://nginx.org/ ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Thank you. We don't make any changes with files or overwrite them while sendfile processing it. Only create temp file and then mv it. Maybe Is it the same bug concerned with treads aio like in this messeges, https://forum.nginx.org/read.php?21,264701,265016#msg-265016 and it would be fixed in future ? Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265193#msg-265193 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Hello! On Wed, Mar 09, 2016 at 09:28:20AM -0500, vizl wrote: > Debug log regarding to hanged PID 7479 http://dev.vizl.org/debug.log.txt This looks like a threads + sendfile() loop due to non-atomic updates of underlying file, similar to one recently reported on the Russian mailing list. Correct solution would be to fix your system to update files atomically instead of overwriting them in place. The patch below will resolve CPU hog and will log an alert instead: # HG changeset patch # User Maxim Dounin# Date 1457536139 -10800 # Wed Mar 09 18:08:59 2016 +0300 # Node ID e96e5dfe4ff8ffe301264c3eb2771596fae24d38 # Parent 93049710cb7f6ea91fa9bd707e88fbe79d82d0ef Truncation detection in sendfile() on Linux. This addresses connection hangs as observed in ticket #504, and CPU hogs with "aio threads; sendfile on" as reported in the mailing list, see http://mailman.nginx.org/pipermail/nginx-ru/2016-March/057638.html. The alert is identical to one used on FreeBSD. diff --git a/src/os/unix/ngx_linux_sendfile_chain.c b/src/os/unix/ngx_linux_sendfile_chain.c --- a/src/os/unix/ngx_linux_sendfile_chain.c +++ b/src/os/unix/ngx_linux_sendfile_chain.c @@ -292,6 +292,19 @@ eintr: } } +if (n == 0) { +/* + * if sendfile returns zero, then someone has truncated the file, + * so the offset became beyond the end of the file + */ + +ngx_log_error(NGX_LOG_ALERT, c->log, 0, + "sendfile() reported that \"%s\" was truncated at %O", + file->file->name.data, file->file_pos); + +return NGX_ERROR; +} + ngx_log_debug3(NGX_LOG_DEBUG_EVENT, c->log, 0, "sendfile: %z of %uz @%O", n, size, file->file_pos); @@ -349,6 +362,19 @@ ngx_linux_sendfile_thread(ngx_connection return NGX_ERROR; } +if (ctx->err != NGX_AGAIN && ctx->sent == 0) { +/* + * if sendfile returns zero, then someone has truncated the file, + * so the offset became beyond the end of the file + */ + +ngx_log_error(NGX_LOG_ALERT, c->log, 0, + "sendfile() reported that \"%s\" was truncated at %O", + file->file->name.data, file->file_pos); + +return NGX_ERROR; +} + *sent = ctx->sent; return (ctx->sent == ctx->size) ? NGX_DONE : NGX_AGAIN; -- Maxim Dounin http://nginx.org/ ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Debug log regarding to hanged PID 7479 http://dev.vizl.org/debug.log.txt Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265188#msg-265188 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
On Friday 04 March 2016 09:42:45 vizl wrote: > Sorry, my misprint. > > Config whithout aio on; > > only aio threads=default; > > > do you or some tool periodically change the files ? > no, files are unchanged, just periodically some new are added and some > expired are deleted > Could you provide the debug log? wbr, Valentin V. Bartenev ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
user www; worker_processes 16; thread_pool default threads=128 max_queue=1024; worker_rlimit_nofile 65536; ###timer_resolution 100ms; #error_log /home/logs/error_log.nginx error; error_log /home/logs/error_log.nginx.debug debug; events { worker_connections 3; use epoll; } http { include mime.types; default_type application/octet-stream; index index.html index.htm; output_buffers2 256k; read_ahead256k; # was 1m; aio threads=default; aio on; sendfile on; sendfile_max_chunk256k; server { listen *:80 default rcvbuf=32768 backlog=2048 reuseport deferred; listen *:443 ssl default rcvbuf=32768 backlog=2048 reuseport deferred; server_name localhost; access_log /home/logs/access.log; error_log /home/logs/error.log warn; root /mnt; expires 20m; location ~ ^/crossdomain.xml { } location ~ \.[Ff][Ll][Vv]$ { flv; } location ~ \.[Mm][Pp]4$ { mp4; } } } Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265089#msg-265089 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
On Friday 04 March 2016 07:12:09 vizl wrote: > Sorry for long answer, but we have doing some tests, and notice that probles > is appear when thread_pool enabled. > > thread_pool default threads=128 max_queue=1024 > > We need to use thread_pool, and can't permenent disable it unfortunately > [..] Do you modify files that are served by nginx? Do you have open_file_cache enabled? wbr, Valentin V. Bartenev ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
Sorry for long answer, but we have doing some tests, and notice that probles is appear when thread_pool enabled. thread_pool default threads=128 max_queue=1024 We need to use thread_pool, and can't permenent disable it unfortunately Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,265085#msg-265085 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
On Wednesday 24 February 2016 09:17:01 vizl wrote: > Hello, I have strange issuses with nginx workers. For some time after start > Nginx I notice that some process of workers cause high load to CPU ( > principally sys CPU). > > At first I've got syscall traces from one of such process: > > futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0, > FUTEX_OP_CMP_GT, 1}) = 1 > epoll_wait(38, {{EPOLLIN, {u32=7156096, u64=7156096}}}, 512, -1) = 1 > epoll_ctl(38, EPOLL_CTL_ADD, 178, {EPOLLOUT|EPOLLET, {u32=3888102096, > u64=140028411886288}}) = 0 > epoll_wait(38, {{EPOLLOUT, {u32=3888102096, u64=140028411886288}}}, 512, -1) > = 1 > epoll_ctl(38, EPOLL_CTL_DEL, 178, 7ffda2bc7f30) = 0 > futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0, > FUTEX_OP_CMP_GT, 1}) = 1 > epoll_wait(38, {{EPOLLIN, {u32=7156096, u64=7156096}}}, 512, -1) = 1 > epoll_ctl(38, EPOLL_CTL_ADD, 178, {EPOLLOUT|EPOLLET, {u32=3888102096, > u64=140028411886288}}) = 0 > epoll_wait(38, {{EPOLLOUT, {u32=3888102096, u64=140028411886288}}}, 512, -1) > = 1 > epoll_ctl(38, EPOLL_CTL_DEL, 178, 7ffda2bc7f30) = 0 > futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0, > FUTEX_OP_CMP_GT, 1}) = 1 > futex(0x157d8d0, FUTEX_WAKE_PRIVATE, 1) = 1 > > epoll_wait, epoll_ctl, futex are repeated circularly. > > Then I've got lsof of process and see who owns of 38 file descriptor: > > nginx 18862 www 38u a_inode0,9 0 6 > [eventpoll] > > also I see several CLOSE_WAIT sockets > > nginx 18862 www 101u IPv4 85643376 0t0 TCP > 154.59.82.194:http->105.107.179.210:24519 (CLOSE_WAIT) > nginx 18862 www 133r REG8,3 0 4743 > /mnt/ssd1/wwwroot/71/7/27394667.mp4 (deleted) > nginx 18862 www 178u IPv4 86054929 0t0 TCP > 154.59.82.194:http->5adc98ed.bb.sky.com:45665 (CLOSE_WAIT) > nginx 18862 www 179r REG8,3 0 5098 > /mnt/ssd1/wwwroot/21/9/29603499.mp4 (deleted) > > > Nginx has such version and modules: > > nginx version: nginx/1.9.11 > built with OpenSSL 1.0.2f 28 Jan 2016 > TLS SNI support enabled > configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf > --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid > --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include > --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log > --http-client-body-temp-path=/var/lib/nginx/tmp/client > --http-proxy-temp-path=/var/lib/nginx/tmp/proxy > --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi > --http-scgi-temp-path=/var/lib/nginx/tmp/scgi > --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-file-aio --with-ipv6 > --with-pcre --with-threads --without-http_autoindex_module > --without-http_fastcgi_module --without-http_geo_module > --without-http_limit_req_module --without-http_limit_conn_module > --without-http_memcached_module --without-http_uwsgi_module > --with-http_flv_module --with-http_gzip_static_module --with-http_mp4_module > --with-http_perl_module > --add-module=external_module/headers-more-nginx-module-0.261 > --add-module=external_module/ngx_estreaming_module-0.01 > --add-module=external_module/ngx_slice_module-0.01 --with-http_ssl_module > --without-mail_imap_module --without-mail_pop3_module > --without-mail_smtp_module --user='www --group=www' > > and using for video streaming. > > Has anyone encountered such behavior ? Help please. > [..] Could you provide a minimal configuration that is causing problems with debug log? See: http://nginx.org/en/docs/debugging_log.html wbr, Valentin V. Bartenev ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx
Re: Workers CPU leak [epoll_wait,epoll_ctl]
P.S: we are using Gentoo with 4.4.1 kernel and CPU X3330 @ 2.66GHz GenuineIntel GNU/Linux Posted at Nginx Forum: https://forum.nginx.org/read.php?2,264764,264766#msg-264766 ___ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx