A month has passed without a single suggestion on how to debug this
problem. Is the assp project dead?
strace summary taken over 10 seconds on the server that is not experiencing the problem:
root@sv1 [/root]# strace -f -p 29554 -c
Process 29554 attached with 8 threads - interrupt to quit
^CProcess 29554 detached
Process 29645 detached
Process 29970 detached
Process 29991 detached
Process 30032 detached
Process 30441 detached
Process 30658 detached
Process 31035 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.47    0.116007        1208        96           poll
 31.64    0.074212         299       248           nanosleep
 17.06    0.040001        8000         5           restart_syscall
  1.71    0.004001          37       109        32 futex
  0.12    0.000272           0      3259           sched_yield
  0.01    0.000030           0        70           fcntl
  0.00    0.000000           0        31         6 read
  0.00    0.000000           0        38           write
  0.00    0.000000           0         3           open
  0.00    0.000000           0         8           close
  0.00    0.000000           0       575         3 stat
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         9         8 lseek
  0.00    0.000000           0         1           rt_sigaction
  0.00    0.000000           0        58           rt_sigprocmask
  0.00    0.000000           0         9         9 ioctl
  0.00    0.000000           0         6           select
  0.00    0.000000           0        34           alarm
  0.00    0.000000           0         2           socket
  0.00    0.000000           0         4         2 connect
  0.00    0.000000           0         1           accept
  0.00    0.000000           0         1           sendto
  0.00    0.000000           0         1           recvfrom
  0.00    0.000000           0         6           getsockname
  0.00    0.000000           0         3         1 getpeername
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           rename
  0.00    0.000000           0         1           prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.234523                  4585        61 total
strace summary taken over 10 seconds on the server that is experiencing the problem:
Process 22699 attached with 8 threads - interrupt to quit
^CProcess 22699 detached
Process 22703 detached
Process 22714 detached
Process 22870 detached
Process 22885 detached
Process 22959 detached
Process 23071 detached
Process 23248 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.46    2.232141       54442        41           poll
 28.29    1.887313       14630       129           nanosleep
 20.51    1.368086      456029         3           restart_syscall
 17.43    1.162859           1   1190382    271641 futex
  0.30    0.020000        5000         4           close
  0.00    0.000161           0     21696           sched_yield
  0.00    0.000018           0      4753         1 stat
  0.00    0.000000           0        17         3 read
  0.00    0.000000           0        16           write
  0.00    0.000000           0         4           open
  0.00    0.000000           0         3           fstat
  0.00    0.000000           0         8         6 lseek
  0.00    0.000000           0         1           rt_sigaction
  0.00    0.000000           0        54           rt_sigprocmask
  0.00    0.000000           0         8         8 ioctl
  0.00    0.000000           0        11           select
  0.00    0.000000           0        32           alarm
  0.00    0.000000           0         1           socket
  0.00    0.000000           0         2         1 connect
  0.00    0.000000           0         1           accept
  0.00    0.000000           0        15           sendto
  0.00    0.000000           0        14           recvfrom
  0.00    0.000000           0         6           getsockname
  0.00    0.000000           0        17        15 getpeername
  0.00    0.000000           0        54           fcntl
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           prctl
------ ----------- ----------- --------- --------- ----------------
100.00    6.670578               1217277    271675 total
It is immediately apparent that there is a much higher number of errors
(271641) on futex calls. Why is that? I will just repeat that the OS is the
same on both servers (cloned partition), the hardware is the same, and assp
is the same.
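
For anyone who wants to compare the two summaries numerically, here is a minimal Python sketch. It assumes the `strace -c` rows are whitespace-separated as pasted above (five fields when the errors column is empty, six when it is filled) and that both traces covered roughly 10 seconds. The names `parse_summary`, `compare`, `HEALTHY` and `PROBLEM` are made up for this illustration, and only three syscalls are copied in to keep it short.

```python
# Rows copied from the two `strace -c` summaries in this post (a subset).
HEALTHY = """\
1.71 0.004001 37 109 32 futex
0.12 0.000272 0 3259 sched_yield
0.00 0.000000 0 575 3 stat
"""
PROBLEM = """\
17.43 1.162859 1 1190382 271641 futex
0.00 0.000161 0 21696 sched_yield
0.00 0.000018 0 4753 1 stat
"""
SECONDS = 10  # assumption: both traces ran for about 10 seconds


def parse_summary(text):
    """Parse `strace -c` data rows into {syscall: (calls, errors)}."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (errors column empty) or 6 (errors present).
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            float(parts[0])
            float(parts[1])
            calls = int(parts[3])
            errors = int(parts[4]) if len(parts) == 6 else 0
        except ValueError:
            continue  # header or separator line
        table[parts[-1]] = (calls, errors)
    return table


def compare(healthy, problem):
    """Return (name, healthy calls/s, problem calls/s, call ratio, error share)."""
    rows = []
    for name, (p_calls, p_errs) in problem.items():
        h_calls, _ = healthy.get(name, (0, 0))
        ratio = p_calls / h_calls if h_calls else float("inf")
        err_share = p_errs / p_calls if p_calls else 0.0
        rows.append((name, h_calls / SECONDS, p_calls / SECONDS, ratio, err_share))
    return rows


if __name__ == "__main__":
    for name, h_rate, p_rate, ratio, err in compare(
        parse_summary(HEALTHY), parse_summary(PROBLEM)
    ):
        print(f"{name:<12} {h_rate:>9.1f}/s {p_rate:>10.1f}/s "
              f"x{ratio:<9.1f} errors {err:.1%}")
```

Fed with these rows, the share of futex calls that return an error comes out in the same range on both servers (32 of 109 versus 271641 of 1190382), so the larger difference appears to be the sheer number of futex calls rather than the fraction that fail.
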
I speculate that this is caused by all the futex calls made during the
constant access of /etc/localtime by the assp process. Why does assp
constantly access it? How can I fix it?
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
[pid 22959] <... futex resumed> ) = 0
[pid 22714] stat("/etc/localtime", <unfinished ...>
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22714] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
[pid 22959] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855243, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855245, NULL <unfinished ...>
[pid 22959] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
[pid 22714] <... futex resumed> ) = 0
[pid 22959] futex(0x7f11d6efea5c, FUTEX_WAIT_PRIVATE, 1750855247, NULL <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 22714] futex(0x7f11d6efea5c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f11d6efea58, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 22959] <... futex resumed> ) = 0
--
[pid 22714] <... futex resumed> ) = 0
[pid 22959] <... sched_yield resumed> ) = 0
[pid 22959] stat("/etc/localtime", <unfinished ...>
[pid 22714] futex(0x7f11d6efea20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 22959] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
[pid 22714] <... futex resumed> ) = 0
[pid 22959] sched_yield( <unfinished ...>
[pid 22714] sched_yield( <unfinished ...>
[pid 22959] <... sched_yield resumed> ) = 0
[pid 22959] sched_yield( <unfinished ...>
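
To check whether the /etc/localtime lookups and the failing futex calls come from the same threads, a full `strace -f` capture could be tallied per thread. The following is only a sketch: it assumes each trace record sits on a single line (mail-wrapped lines would have to be re-joined first), and the names `tally` and `SAMPLE` are hypothetical. The sample lines are taken from the excerpt above.

```python
import re
from collections import Counter

# A few records from the trace excerpt in this post, one record per line.
SAMPLE = """\
[pid 22714] stat("/etc/localtime", <unfinished ...>
[pid 22714] <... stat resumed> {st_mode=S_IFREG|0644, st_size=2679, ...}) = 0
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22714] <... futex resumed> ) = 0
[pid 22959] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 22959] stat("/etc/localtime", <unfinished ...>
"""

PID_RE = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")


def tally(trace):
    """Count, per thread id, stat() calls on /etc/localtime and futex EAGAINs."""
    localtime_stats = Counter()
    futex_eagain = Counter()
    for line in trace.splitlines():
        match = PID_RE.match(line)
        if not match:
            continue  # separators such as "--" or wrapped fragments
        pid, rest = match.group(1), match.group(2)
        # Count only the start of the call, not the "<... stat resumed>" half.
        if rest.startswith('stat("/etc/localtime"'):
            localtime_stats[pid] += 1
        is_futex = rest.startswith("futex(") or rest.startswith("<... futex resumed>")
        if is_futex and "EAGAIN" in rest:
            futex_eagain[pid] += 1
    return localtime_stats, futex_eagain


if __name__ == "__main__":
    stats, eagain = tally(SAMPLE)
    for pid in sorted(set(stats) | set(eagain)):
        print(f"pid {pid}: localtime stat={stats[pid]} futex EAGAIN={eagain[pid]}")
```

Run against a longer capture saved to a file, the per-thread counts would show whether the threads that stat /etc/localtime are also the ones collecting EAGAIN results, which is the correlation the speculation above depends on.
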