Re: Workers CPU leak [epoll_wait,epoll_ctl]

2017-07-29 Thread Maxim Dounin
Hello!

On Sat, Jul 29, 2017 at 05:35:08AM -0400, lamsung wrote:

> Hello, I have strange issues with nginx workers. Some time after starting
> nginx, I notice that some worker processes cause high CPU load
> (principally sys CPU). 

[...]

> Nginx has such version and modules: 
> 
> nginx version: nginx/1.9.11 
> built with OpenSSL 1.0.2f 28 Jan 2016 
> TLS SNI support enabled 
> configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf
> --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid
> --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include
> --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log
> --http-client-body-temp-path=/var/lib/nginx/tmp/client
> --http-proxy-temp-path=/var/lib/nginx/tmp/proxy
> --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi
> --http-scgi-temp-path=/var/lib/nginx/tmp/scgi
> --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-file-aio --with-ipv6
> --with-pcre --with-threads --without-http_autoindex_module
> --without-http_fastcgi_module --without-http_geo_module
> --without-http_limit_req_module --without-http_limit_conn_module
> --without-http_memcached_module --without-http_uwsgi_module
> --with-http_flv_module --with-http_gzip_static_module --with-http_mp4_module
> --with-http_perl_module
> --add-module=external_module/headers-more-nginx-module-0.261
> --add-module=external_module/ngx_estreaming_module-0.01
> --add-module=external_module/ngx_slice_module-0.01 --with-http_ssl_module
> --without-mail_imap_module --without-mail_pop3_module
> --without-mail_smtp_module --user='www --group=www' 
> 
> and it is used for video streaming. 
> 
> Has anyone encountered such behavior? Please help.

An obvious first step would be to try to reproduce the problem 
without 3rd party modules.

Trying something newer than 1.9.11 is also recommended - it has not 
been supported since the release of nginx 1.9.12 on 24 Feb 2016.  Current 
versions are 1.13.3 (mainline) and 1.12.1 (stable).

-- 
Maxim Dounin
http://nginx.org/
___
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx


Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-21 Thread Maxim Dounin
Hello!

On Sun, Mar 20, 2016 at 04:34:14PM -0400, vizl wrote:

> We found that we are running 'truncate -s 0' on files before removing them.
> Can this potentially cause the problems mentioned above?

Yes, for sure.  This is a modification of a file being served, and 
it's expected to cause the CPU hog in question when using sendfile 
in threads on Linux.  Starting with nginx 1.9.13, an alert will be 
logged instead, see this commit:

http://hg.nginx.org/nginx/rev/4df3d9fcdee8
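
A minimal Python sketch of the behaviour described above, assuming Linux: once a file has been truncated below the current offset, sendfile() returns 0 rather than failing, so a caller that simply retries makes no progress. The function name sendfile_after_truncate is hypothetical, and a local socket pair stands in for a client connection; this is an illustration, not nginx code.

```python
import os
import socket
import tempfile


def sendfile_after_truncate():
    """Return (bytes sent before truncation, bytes sent after truncation)."""
    # A connected local socket pair stands in for the client connection.
    a, b = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f:
            f.write(b"x" * 4096)
            f.flush()
            # First call succeeds and sends data from offset 0.
            first = os.sendfile(a.fileno(), f.fileno(), 0, 1024)
            # Same effect as running `truncate -s 0` on the served file.
            os.ftruncate(f.fileno(), 0)
            # The offset is now beyond end of file: nothing can be sent.
            second = os.sendfile(a.fileno(), f.fileno(), first, 1024)
    finally:
        a.close()
        b.close()
    return first, second
```

A zero return with data still expected is the condition the commit turns into a logged alert and an error instead of a retry.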

-- 
Maxim Dounin
http://nginx.org/



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-21 Thread vizl
Thank you. Waiting for the 1.9.13 release.

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265536#msg-265536



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-21 Thread Maxim Konovalov
On 3/20/16 11:34 PM, vizl wrote:
> We found that we are running 'truncate -s 0' on files before removing them.
> Can this potentially cause the problems mentioned above?
> 
Yes, quite possible.

The fix was committed to the mainline branch and will be available
in 1.9.13:

http://mailman.nginx.org/pipermail/nginx-devel/2016-March/008012.html
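
As a hedged illustration of why truncating before removal is unnecessary: on POSIX systems, unlinking a file leaves its data readable through descriptors that are already open, whereas truncating it first makes the data disappear under any reader. The helper name read_after is hypothetical; this is a Python sketch, not part of nginx.

```python
import os
import tempfile


def read_after(action, payload=b"x" * 4096):
    """Open a file, apply `action` ("unlink" or "truncate") to it,
    then return what the already-open reader can still read."""
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    os.close(fd)
    with open(path, "rb") as reader:
        if action == "truncate":
            # Equivalent to `truncate -s 0 file` before `rm file`.
            os.truncate(path, 0)
        # Plain removal: the name goes away, the open descriptor stays valid.
        os.unlink(path)
        return reader.read()
```

With "unlink" the reader still gets the full payload; with "truncate" it gets nothing, which is the situation a worker in the middle of sendfile() runs into.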

-- 
Maxim Konovalov



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-20 Thread vizl
We found that we are running 'truncate -s 0' on files before removing them.
Can this potentially cause the problems mentioned above?

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265516#msg-265516



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-09 Thread Maxim Dounin
Hello!

On Wed, Mar 09, 2016 at 10:56:40AM -0500, vizl wrote:

> Thank you. 
> We don't make any changes to files or overwrite them while sendfile
> is processing them. 
> We only create a temp file and then mv it.

The debug log suggests this is not true.

> Could it be the same bug related to aio threads as in this message, 
> https://forum.nginx.org/read.php?21,264701,265016#msg-265016
> and will it be fixed in the future?

Yes, and the patch provided links to the very same thread and 
resolves the problem in nginx, i.e., the CPU hog.  With the patch, an 
alert will be correctly logged.

Note well that the root cause of the observed CPU hog is non-atomic 
file updates.  You will still see other problems (including data 
corruption) until this is resolved, even with the patch.

-- 
Maxim Dounin
http://nginx.org/



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-09 Thread vizl
Thank you. 
We don't make any changes to files or overwrite them while sendfile
is processing them. 
We only create a temp file and then mv it.

Could it be the same bug related to aio threads as in this message, 
https://forum.nginx.org/read.php?21,264701,265016#msg-265016
and will it be fixed in the future?

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265193#msg-265193



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-09 Thread Maxim Dounin
Hello!

On Wed, Mar 09, 2016 at 09:28:20AM -0500, vizl wrote:

> Debug log for the hung PID 7479: http://dev.vizl.org/debug.log.txt

This looks like a threads + sendfile() loop due to non-atomic updates of 
the underlying file, similar to the one recently reported on the Russian 
mailing list.

The correct solution would be to fix your system to update files 
atomically instead of overwriting them in place.
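
One common way to update a file atomically is to write the new content to a temporary file in the same directory and then rename it over the target. The following is a minimal Python sketch under that assumption; the helper name atomic_write and the ".tmp-" prefix are hypothetical and not taken from this thread.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so that readers see either the old
    file or the new one, never a partially written or truncated file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        # rename(2) semantics: the old inode stays intact for processes
        # that already have it open.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A process that opened the old file before the replacement keeps reading the old, complete content, so sendfile() never sees the file shrink underneath it. Note that mkstemp creates the file with mode 0600, so permissions may need adjusting before the rename.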

The patch below will resolve the CPU hog and will log an alert 
instead:

# HG changeset patch
# User Maxim Dounin 
# Date 1457536139 -10800
#  Wed Mar 09 18:08:59 2016 +0300
# Node ID e96e5dfe4ff8ffe301264c3eb2771596fae24d38
# Parent  93049710cb7f6ea91fa9bd707e88fbe79d82d0ef
Truncation detection in sendfile() on Linux.

This addresses connection hangs as observed in ticket #504, and
CPU hogs with "aio threads; sendfile on" as reported in the mailing list,
see http://mailman.nginx.org/pipermail/nginx-ru/2016-March/057638.html.

The alert is identical to one used on FreeBSD.

diff --git a/src/os/unix/ngx_linux_sendfile_chain.c b/src/os/unix/ngx_linux_sendfile_chain.c
--- a/src/os/unix/ngx_linux_sendfile_chain.c
+++ b/src/os/unix/ngx_linux_sendfile_chain.c
@@ -292,6 +292,19 @@ eintr:
         }
     }
 
+    if (n == 0) {
+        /*
+         * if sendfile returns zero, then someone has truncated the file,
+         * so the offset became beyond the end of the file
+         */
+
+        ngx_log_error(NGX_LOG_ALERT, c->log, 0,
+                      "sendfile() reported that \"%s\" was truncated at %O",
+                      file->file->name.data, file->file_pos);
+
+        return NGX_ERROR;
+    }
+
     ngx_log_debug3(NGX_LOG_DEBUG_EVENT, c->log, 0, "sendfile: %z of %uz @%O",
                    n, size, file->file_pos);
 
@@ -349,6 +362,19 @@ ngx_linux_sendfile_thread(ngx_connection
             return NGX_ERROR;
         }
 
+        if (ctx->err != NGX_AGAIN && ctx->sent == 0) {
+            /*
+             * if sendfile returns zero, then someone has truncated the file,
+             * so the offset became beyond the end of the file
+             */
+
+            ngx_log_error(NGX_LOG_ALERT, c->log, 0,
+                          "sendfile() reported that \"%s\" was truncated at %O",
+                          file->file->name.data, file->file_pos);
+
+            return NGX_ERROR;
+        }
+
         *sent = ctx->sent;
 
         return (ctx->sent == ctx->size) ? NGX_DONE : NGX_AGAIN;

-- 
Maxim Dounin
http://nginx.org/



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-09 Thread vizl
Debug log for the hung PID 7479: http://dev.vizl.org/debug.log.txt

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265188#msg-265188



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-04 Thread Valentin V. Bartenev
On Friday 04 March 2016 09:42:45 vizl wrote:
> Sorry, my misprint. 
> 
> The config is without   aio on;
> 
> only  aio threads=default;
> 
> > do you or some tool periodically change the files?
> no, the files are unchanged; new ones are periodically added and expired
> ones are deleted
> 

Could you provide the debug log?

  wbr, Valentin V. Bartenev



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-04 Thread vizl
user www;
worker_processes 16;
thread_pool default threads=128 max_queue=1024;
worker_rlimit_nofile 65536;
###timer_resolution 100ms;

#error_log /home/logs/error_log.nginx error;
error_log /home/logs/error_log.nginx.debug debug;

events {
  worker_connections 3;
  use epoll;
}

http {
  include       mime.types;
  default_type  application/octet-stream;
  index         index.html index.htm;

  output_buffers      2 256k;
  read_ahead          256k;   # was 1m;
  aio                 threads=default;
  aio                 on;
  sendfile            on;
  sendfile_max_chunk  256k;

  server {
    listen *:80 default rcvbuf=32768 backlog=2048 reuseport deferred;
    listen *:443 ssl default rcvbuf=32768 backlog=2048 reuseport deferred;
    server_name  localhost;
    access_log  /home/logs/access.log;
    error_log   /home/logs/error.log warn;
    root /mnt;
    expires 20m;

    location ~ ^/crossdomain.xml { }
    location ~ \.[Ff][Ll][Vv]$ {
      flv;
    }
    location ~ \.[Mm][Pp]4$ {
      mp4;
    }
  }
}

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265089#msg-265089



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-04 Thread Valentin V. Bartenev
On Friday 04 March 2016 07:12:09 vizl wrote:
> Sorry for the slow reply, but we have been doing some tests and noticed
> that the problem appears when thread_pool is enabled.
> 
>  thread_pool default threads=128 max_queue=1024
> 
> We need to use thread_pool and unfortunately can't disable it permanently
> 
[..]

Do you modify files that are served by nginx?
Do you have open_file_cache enabled?

  wbr, Valentin V. Bartenev



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-03-04 Thread vizl
Sorry for the slow reply, but we have been doing some tests and noticed
that the problem appears when thread_pool is enabled.

 thread_pool default threads=128 max_queue=1024

We need to use thread_pool and unfortunately can't disable it permanently

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,265085#msg-265085



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-02-24 Thread Valentin V. Bartenev
On Wednesday 24 February 2016 09:17:01 vizl wrote:
> Hello, I have strange issues with nginx workers. Some time after starting
> nginx, I notice that some worker processes cause high CPU load
> (principally sys CPU). 
> 
> First, I got syscall traces from one such process: 
> 
> futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
> epoll_wait(38, {{EPOLLIN, {u32=7156096, u64=7156096}}}, 512, -1) = 1
> epoll_ctl(38, EPOLL_CTL_ADD, 178, {EPOLLOUT|EPOLLET, {u32=3888102096,
> u64=140028411886288}}) = 0
> epoll_wait(38, {{EPOLLOUT, {u32=3888102096, u64=140028411886288}}}, 512, -1)
> = 1
> epoll_ctl(38, EPOLL_CTL_DEL, 178, 7ffda2bc7f30) = 0
> futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
> epoll_wait(38, {{EPOLLIN, {u32=7156096, u64=7156096}}}, 512, -1) = 1
> epoll_ctl(38, EPOLL_CTL_ADD, 178, {EPOLLOUT|EPOLLET, {u32=3888102096,
> u64=140028411886288}}) = 0
> epoll_wait(38, {{EPOLLOUT, {u32=3888102096, u64=140028411886288}}}, 512, -1)
> = 1
> epoll_ctl(38, EPOLL_CTL_DEL, 178, 7ffda2bc7f30) = 0
> futex(0x157d914, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x157d910, {FUTEX_OP_SET, 0,
> FUTEX_OP_CMP_GT, 1}) = 1
> futex(0x157d8d0, FUTEX_WAKE_PRIVATE, 1) = 1
> 
> epoll_wait, epoll_ctl and futex are repeated in a loop.
> 
> Then I ran lsof on the process to see what file descriptor 38 is:
> 
> nginx   18862  www   38u  a_inode  0,9  0  6
> [eventpoll]
> 
> I also see several CLOSE_WAIT sockets:
> 
> nginx   18862  www  101u IPv4   85643376   0t0   TCP
> 154.59.82.194:http->105.107.179.210:24519 (CLOSE_WAIT)
> nginx   18862  www  133r  REG  8,3  0  4743
> /mnt/ssd1/wwwroot/71/7/27394667.mp4 (deleted)
> nginx   18862  www  178u IPv4   86054929   0t0   TCP
> 154.59.82.194:http->5adc98ed.bb.sky.com:45665 (CLOSE_WAIT)
> nginx   18862  www  179r  REG  8,3  0  5098
> /mnt/ssd1/wwwroot/21/9/29603499.mp4 (deleted)
> 
> 
> Nginx has such version and modules:
> 
> nginx version: nginx/1.9.11
> built with OpenSSL 1.0.2f  28 Jan 2016
> TLS SNI support enabled
> configure arguments: --prefix=/usr --conf-path=/etc/nginx/nginx.conf
> --error-log-path=/var/log/nginx/error_log --pid-path=/run/nginx.pid
> --lock-path=/run/lock/nginx.lock --with-cc-opt=-I/usr/include
> --with-ld-opt=-L/usr/lib64 --http-log-path=/var/log/nginx/access_log
> --http-client-body-temp-path=/var/lib/nginx/tmp/client
> --http-proxy-temp-path=/var/lib/nginx/tmp/proxy
> --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi
> --http-scgi-temp-path=/var/lib/nginx/tmp/scgi
> --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --with-file-aio --with-ipv6
> --with-pcre --with-threads --without-http_autoindex_module
> --without-http_fastcgi_module --without-http_geo_module
> --without-http_limit_req_module --without-http_limit_conn_module
> --without-http_memcached_module --without-http_uwsgi_module
> --with-http_flv_module --with-http_gzip_static_module --with-http_mp4_module
> --with-http_perl_module
> --add-module=external_module/headers-more-nginx-module-0.261
> --add-module=external_module/ngx_estreaming_module-0.01
> --add-module=external_module/ngx_slice_module-0.01 --with-http_ssl_module
> --without-mail_imap_module --without-mail_pop3_module
> --without-mail_smtp_module --user='www --group=www'
> 
> and it is used for video streaming.
> 
> Has anyone encountered such behavior? Please help.
> 
[..]

Could you provide a minimal configuration that causes the problem,
along with a debug log?  See: http://nginx.org/en/docs/debugging_log.html

  wbr, Valentin V. Bartenev



Re: Workers CPU leak [epoll_wait,epoll_ctl]

2016-02-24 Thread vizl
P.S.: we are using Gentoo with a 4.4.1 kernel and an X3330 @ 2.66GHz CPU
(GenuineIntel, GNU/Linux)

Posted at Nginx Forum: 
https://forum.nginx.org/read.php?2,264764,264766#msg-264766
