Hi,
Following on from the previous post. Everything has been pretty stable it's
just the pop3d locking up which just happens randomly. I've not ran under gdb
yet but I did get some debug output and a few other bits.
Under normal load dbmail-pop3d is doing around 5-6% CPU and is accepting 250
connections per minute (grep 'connection accepted' over 1 minute at various
times today)
53559 nobody 2 20 0 192M 112M uwait 11 40:40 4.88%
dbmail-pop3d
At completely random times (once thus far today - UK time at 12:20~) pop3d
locks up in that it answers no new queries but continues to run and the CPU
spikes up to 50% or more
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
53559 nobody 2 20 0 172M 95588K uwait 11 37:25 50.88%
dbmail-pop3d
On freebsd I ran truss against the process which is just a constant scroll of
the following:-
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
write(15,"gv\r\n+Skf/hFP/wCXEU37AviO3VWk+L"...,262143) ERR#35 'Resource
temporarily unavailable'
mytop at the time shows little to no activity:-
MySQL on localhost (5.6.12-log)
up
0+22:09:06 [12:30:15]
Queries: 121.6M qps: 1599 Slow: 22.0 Se/In/Up/De(%): 05/01/00/00
qps now: 56 Slow qps: 0.0 Threads: 16 ( 1/ 6) 07/00/00/00
Key Efficiency: 100.0% Bps in/out: 133.3k/305.1k Now in/out: 812.2/16.8k
Master: mysql-bin.003692/80665838 do: ign:
Id User Host/IP DB Time Cmd Query or State
-- ---- ------- -- ---- --- ----------
1 dbmail localhost dbmail 0 Sleep
1397 root localhost test 0 Query show full processlist
2 dbmail localhost dbmail 2 Sleep
3 dbmail localhost dbmail 6 Sleep
4 dbmail localhost dbmail 18 Sleep
1393 root localhost 69 Sleep
5 dbmail localhost dbmail 87 Sleep
6 dbmail localhost dbmail 87 Sleep
514 dbmail localhost dbmail 203 Sleep
515 dbmail localhost dbmail 207 Sleep
7 dbmail localhost dbmail 356 Sleep
8 dbmail localhost dbmail 667 Sleep
9 dbmail localhost dbmail 667 Sleep
766 dbmail localhost dbmail 2087 Sleep
10 dbmail localhost dbmail 2156 Sleep
11 dbmail localhost dbmail 2156 Sleep
To put it into a little more perspective with debugging enabled at the moment
it normally logs between 50000 to 80000 lines 'per minute' in the log file but
when pop3d locks up it's logging around 20 lines a minute.
[root@redacted-4a ~]# cat dbmail | grep " 12:20" | wc -l
80757
[root@ redacted -4a ~]# cat dbmail | grep " 12:24" | wc -l
61002
[root@ redacted -4a ~]# cat dbmail | grep " 12:25" | wc -l
56135
[root@ redacted -4a ~]# cat dbmail | grep " 12:26" | wc -l
63500
[root@ redacted -4a ~]# cat dbmail | grep " 12:27" | wc -l
20
The debug output for pop3 during the issue (the 20 lines from 12:27) is below
and it just seems like it's spending a whole minute dealing with 1 users
mailbox:-
[root@mail4-db-4a ~]# cat dbmail | grep " 12:27"
Mar 14 12:27:00 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16040/1222118:gwQyZ31AUn^M
RxR4h6+QsBAMUZnWBV/7gdwugzE]
Mar 14 12:27:03 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16040/1206078:EGzXAi+Xx4tL4PofLBol72tlxXk/^M
uIcxMLvT4]
Mar 14 12:27:05 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S >
[17448/1190038:puZkW65uMzYTpQK5Mx1HPY4jjLPwu2ajaDrQ2Ee]
Mar 14 12:27:08 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S >
[16040/1172590:w8L0ZpbHRlci9GbGF0ZURlY29kZS9MZW5ndGggM]
Mar 14 12:27:11 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [17448/1156550:WMYMB9^M
gNLuXhXn3sTLjaRfsZ5jh2vWdvXsnlC]
Mar 14 12:27:14 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1139102:tY1FPeh2Xl/a9ffIuZPHZX^M
IRqKnCTz2dz5Jbt]
Mar 14 12:27:18 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [19112/1122678:WUYRjHUbm72iU5Ll5GUAqOyKaj^M
d3Vsq3IqZtv]
Mar 14 12:27:21 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1103566:Ljs040^M
WrbdLM6r9sDW7PxK/hXX9Om3293pbzQ]
Mar 14 12:27:24 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [17832/1087142:E3VqVKfeU8^M
E9DXq4mE6kVCC3v+h6tqcqMoz2N]
Mar 14 12:27:26 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [17832/1069310:OBk0DWpptxhT^M
JOTtbHPXvTooZI5FchSO65rpo]
Mar 14 12:27:30 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1051478:8A7PU/8FT/APkC^M
rNZ38ik/8IjqQLEls6NN/wD]
Mar 14 12:27:33 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1035054:qOHpypuS3jGU8RSk43^M
+1yRT+zofHVfFWHO/ZY]
Mar 14 12:27:36 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1018630:iORk2Ixxn7ua0jmsbFWlfU^M
KfH4iMbEq5UEetX]
Mar 14 12:27:39 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/1002206:8AwViP/moP7Ip/zfgFH/ESV+3j^M
/wBEv+EX/hO]
Mar 14 12:27:42 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S >
[19112/985782:kTWEUkp1Kzg1I6jBcfaInRnW20uCL7^M Q8Uhh2R]
Mar 14 12:27:45 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/966670:KKACigAooA^M
KKACigAooAKKACigAooAKKACigA]
Mar 14 12:27:49 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/950246:v/AA/+Oq+Gf2cP^M
+CeP7Udp8AvEVv4WNlq3wL8]
Mar 14 12:27:51 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/933822:gFH/D9/wD6w0fsFf8A^M
iO//AN3Uf8TR/wDVD8O]
Mar 14 12:27:54 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/917398:xGVrDPFUaFGtXq0vrXvULU^M
8PSq1KjlFOTVGE3]
Mar 14 12:27:58 mail4-db-4a dbmail/pop3d[53559]: Debug:[clientbase]
ci_write(+343): [0x805334000] S > [16424/900974:pXnhfZRpYyTj7DDUKvPRot1HH2^M
sav1/iVg+HO]
I'm happy to send a copy of the debug off list including the section from 12:26
if it helps at all. Not sure if it will.
Best regards,
James Greig
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of
Paul J Stevens
Sent: 13 March 2014 07:51
To: DBMail mailinglist
Subject: Re: [Dbmail] Iphone HTML emails
On 12-03-14 22:05, James Greig wrote:
> That should read 'no seg faults' at all :)
Most excellent!
> Should also point out dbmail-imapd and dbmail-lmtpd both running on
> the same box continue to run fine and accept connections whilst pop3d
> is unavailable. The machine itself is pretty busy but at the time
> it happens there's no sudden rush of connections or anything that
> stands out, but like I say it's a pretty busy box anyway.
Next time that happens, you might do:
lsof -np `pidof dbmail-imapd`
perhaps there's a large amount of connections in WAIT_STATE or a similar
suspicious state.
a short strace might be helpful, or even a gdb backtrace. Anything that might
help discover what pop3d is *doing*. Is it blocking, stuck in a loop, suffering
some kind of resource depletion, anything.
--
________________________________________________________________
Paul J Stevens pjstevns @ gmail, twitter, github, linkedin
www.nfg.nl/[email protected]/+31.85.877.99.97
_______________________________________________
DBmail mailing list
[email protected]
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
_______________________________________________
DBmail mailing list
[email protected]
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail