I'm still unable to run versions of ASSP newer than build 16018. I'm
seeing the same exact behavior on any version newer than 16018, but
reverting to 16018 fixes the problem.

Something is killing the main thread, or causing long delays, which just
makes everything start to fail.

ASSP will run for about five minutes after it is started, with no
problems. Then it will start getting slow. I'll see things like this in
the log:

Feb-10-16 10:28:29 [Worker_10000] Warning: Worker_10000 - check the
'ADO' database connections has taken 32.774 seconds (max=1.000s)
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server x.x.x.x:
ResponseTime = 1232 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server y.y.y.y:
ResponseTime = 282 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: Name Server z.z.z.z:
ResponseTime = 11 ms for sourceforge.net
Feb-10-16 10:28:31 [Worker_10000] Info: switched (DNS) nameserver order
from x.x.x.x , y.y.y.y , z.z.z.z to z.z.z.z , x.x.x.x , y.y.y.y

The nameserver on IP x.x.x.x is on the same machine as ASSP, and
absolutely nothing is going on with the machine. All eight cores of the
CPU are essentially sitting idle, there are many GB of memory free, the
disk queue is sitting at near zero - the machine is essentially idle, so
there is no reason why a DNS query would take 1232 ms. I can only assume
then that the DNS query test that is being run by ASSP is taking too
long because of a problem within ASSP, and the response time being shown
is invalid.

Similarly, the SQL server, also running on the local machine, is 100%
healthy and is sitting idle, waiting for a query from ASSP. There is no
way it should be sitting there for 32 seconds waiting for a connection.

After this ASSP starts sending some queued email:

Feb-10-16 10:28:32 [Worker_10000] Info: looking for files to (re)send
Feb-10-16 10:28:42 [Worker_10000] (re)send - try to open:
D:/ASSP/resendmail/n000000153.eml
Feb-10-16 10:28:48 [Worker_10000] (re)send - process:
D:/ASSP/resendmail/n000000153.eml (first time)
Feb-10-16 10:28:49 [Worker_10000] (re)send -
D:/ASSP/resendmail/n000000153.eml - From: r...@domain.com - To:
r...@domain.com
Feb-10-16 10:28:49 [Worker_10000] (re)send
D:/ASSP/resendmail/n000000153.eml to host: 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:28:50 NB-18099-00275 [Worker_2] [MessageOK] y.y.y.y
<u...@domain.com> to: u...@domain.com message ok - (whiteListedIPs
'y.y.y.y/32 New Web server') - [New password activation] ->
D:/ASSP/notspam/275.eml
Feb-10-16 10:28:58 [Worker_10000] Info: Net::SMTP is used to send mail
Feb-10-16 10:29:00 [Worker_5] Worker_5 wakes up
Feb-10-16 10:29:01 [Main_Thread] IP 127.0.0.1 matches
allowStatConnectionsFrom - with 127.0.0.1/32
Feb-10-16 10:29:03 [Worker_10000] Info: successful sent file
D:/ASSP/resendmail/n000000153.eml to 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:29:04 [Worker_10000] (re)send - try to open:
D:/ASSP/resendmail/n000000224.eml
Feb-10-16 10:29:04 [Worker_10000] (re)send - process:
D:/ASSP/resendmail/n000000224.eml (first time)
Feb-10-16 10:29:04 [Worker_10000] (re)send -
D:/ASSP/resendmail/n000000224.eml - From: r...@netbound.com - To:
r...@netbound.com
Feb-10-16 10:29:04 [Worker_10000] (re)send
D:/ASSP/resendmail/n000000224.eml to host: 127.0.0.1:6027 (smtpDestination)
Feb-10-16 10:29:04 [Worker_10000] Info: Net::SMTP is used to send mail
Feb-10-16 10:29:05 [Main_Thread] Info: no (more) data readable
(connection possibly closed by browser)
Feb-10-16 10:29:06 [Worker_10000] Info: successful sent file
D:/ASSP/resendmail/n000000224.eml to 127.0.0.1:3027 (smtpDestination)
Feb-10-16 10:29:07 [Main_Thread] Info: no (more) data readable
(connection possibly closed by browser)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:29:32 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)

And then it dies. It will write literally thousands and thousands of
these "unable to detect any running worker" lines into the log - several
hundred of them per second. This goes on for a couple of minutes until
it just gives up and restarts:

Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:36 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:37 [Main_Thread] Info: unable to detect any running
worker for a new connection - wait (max 30 seconds)
Feb-10-16 10:31:37 [Main_Thread] Info: ConnectionTransferTimeOut (30
seconds) is now reached
Feb-10-16 10:31:37 [Main_Thread] Warning: Main_Thread is unable to
transfer connection to any worker - try again!
Feb-10-16 10:31:37 [Main_Thread] Info: notification message queued to
sent to r...@netbound.com
Feb-10-16 10:31:37 [Main_Thread] Error: Main_Thread is unable to
transfer connection to any worker within 120 seconds - restart ASSP!
Feb-10-16 10:31:38 [Main_Thread] Initializing shutdown sequence
Feb-10-16 10:31:40 [Shutdown] Info: removing all SMTP and Proxy listeners
Feb-10-16 10:31:41 [Shutdown] Tell Worker 3 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 4 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 5 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 1 - QUIT
Feb-10-16 10:31:41 [Shutdown] Tell Worker 2 - QUIT
Feb-10-16 10:31:41 [Shutdown] Waiting for all SMTP-Workers to be finished
Feb-10-16 10:31:45 [Shutdown] Warning: poll cycle (2) has taken
3.1647617816925 seconds - this is very much is too long
Feb-10-16 10:32:08 NB-18036-09167 [Worker_3] 22.22.22.22
<u...@domain.ca> to: u...@domain.on.ca info: received all data - all
data moved to send queue (8)
Feb-10-16 10:32:08 [Worker_3] Worker_3 has active connections. Will wait
until all connections are finished but max 45 seconds!
Feb-10-16 10:32:31 [Shutdown] Error: at least one of the SMTP workers
has not finished work within 50 seconds
Feb-10-16 10:32:33 [Shutdown] Closing all databases
Feb-10-16 10:32:45 [Shutdown] Info: removing all WEB listeners
Feb-10-16 10:32:45 [Shutdown] Info: shutdown reason was: restarting
Feb-10-16 10:32:45 [Shutdown] ASSP finished work

I need to find out what has changed in ASSP, and what I can do to fix
this problem! Thomas, I did a compare between builds 16018 and 16021
(16018 works fine, 16021 and newer exhibits the above behavior), and see
very little - some SSL changes, griplist upload, not much else.

Any idea where I could start to try to figure out what is going on?


------------------------------------------------------------------------------
Site24x7 APM Insight: Get Deep Visibility into Application Performance
APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month
Monitor end-to-end web transactions and take corrective actions now
Troubleshoot faster and improve end-user experience. Signup Now!
http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test

Reply via email to