>Number:         5485
>Category:       mod_jserv
>Synopsis:       Servlets stop responding after working correctly for an extended period
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    jserv
>State:          open
>Class:          sw-bug
>Submitter-Id:   apache
>Arrival-Date:   Thu Dec 16 14:10:01 PST 1999
>Last-Modified:
>Originator:     [EMAIL PROTECTED]
>Organization:
apache
>Release:        apache 1.3.6 and jserv 1.0
>Environment:
SunOS dat-chi 5.6 Generic_105181-13 sun4u sparc SUNW,Ultra-4
JDBC access to an oracle listener
Sun javac compiler
>Description:
We are using Apache and JServ as an Oracle database front end for a web-based
application. Approximately 40 servlets work together to produce the HTML
displays and control database activity.

The web server and JServ run for extended periods with no problems, and then
the servlets stop responding. The web server itself keeps running. Each servlet
logs its transactions and status to log files, but those files show no sign of
a problem. The files under /usr/local/apache/jserv/logs show no sign of a
problem either. /usr/local/apache/logs/access_log shows this request in
progress when the servlets stopped responding:

[15/Dec/1999:22:29:49 -0600] "GET /mdex/DirectorySelectForAdd?. . . . . . 
(deleted this user's personal info)

By checking the logfile of a different servlet, which records activity about
once a second, I was able to see that that servlet stopped responding at the
same time.

The next lines in the access_log show this request repeated 5 more times at
intervals of 5 minutes 2 seconds. I'm guessing the repeats come from the web
server itself, since our user applications do not automatically send retries.
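
The spacing between the repeated access_log entries can be checked mechanically.
The following is only a rough sketch, assuming each entry carries a bracketed
timestamp in the format shown above. The class name, the method names, and the
"GET /example" request text are hypothetical, and only the first timestamp in
the sample comes from the real log; the other two are made up to illustrate a
5 minute 2 second spacing.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: measures the time between consecutive access_log entries.
class RetryIntervals {
    // Timestamp layout seen in the access_log, e.g. 15/Dec/1999:22:29:49 -0600
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("dd/MMM/uuuu:HH:mm:ss Z", Locale.ENGLISH);

    // Extracts and parses the text between the first '[' and the following ']'.
    static OffsetDateTime parseStamp(String line) {
        int open = line.indexOf('[');
        int close = line.indexOf(']', open + 1);
        if (open < 0 || close < 0) {
            throw new IllegalArgumentException("no bracketed timestamp: " + line);
        }
        return OffsetDateTime.parse(line.substring(open + 1, close), STAMP);
    }

    // Returns the gap between each entry and the one before it.
    static List<Duration> gaps(List<String> lines) {
        List<Duration> result = new ArrayList<>();
        OffsetDateTime previous = null;
        for (String line : lines) {
            OffsetDateTime current = parseStamp(line);
            if (previous != null) {
                result.add(Duration.between(previous, current));
            }
            previous = current;
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative entries; the request text is a placeholder.
        List<String> sample = List.of(
                "[15/Dec/1999:22:29:49 -0600] \"GET /example\"",
                "[15/Dec/1999:22:34:51 -0600] \"GET /example\"",
                "[15/Dec/1999:22:39:53 -0600] \"GET /example\"");
        for (Duration gap : gaps(sample)) {
            System.out.println(gap.getSeconds() + " seconds");
        }
    }
}
```

A constant gap across all repeats would be consistent with a fixed timeout
somewhere between the client and the servlet engine, though this sketch cannot
show where that timeout lives.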

The /usr/local/apache/logs/error_log file has no entry timestamped at the
moment the servlets stopped responding, but the following lines appear between
the last entry and the point where the web server was restarted:

thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.
thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.
thr_continue of 0xeabc07f8(16) failed: 3 = ESRCH.
thr_continue of 0xeabc06d0(-280320148) failed: 3 = ESRCH.
thr_continue of 0xeabc06e0(0) failed: 3 = ESRCH.
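
Because most of these lines are repeats, a per-address tally makes the pattern
easier to read. This is a minimal sketch, assuming the lines have the exact
layout shown above. The class and method names are hypothetical, and the sketch
makes no claim about what the hexadecimal value or the number in parentheses
represents.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts thr_continue failure lines per reported address.
class ThrContinueTally {
    // Matches e.g. "thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH."
    private static final Pattern LINE = Pattern.compile(
            "thr_continue of (0x[0-9a-fA-F]+)\\((-?\\d+)\\) failed: (\\d+) = (\\w+)\\.");

    // Returns address -> number of occurrences, in order of first appearance.
    static Map<String, Integer> countByAddress(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
            // Lines that do not match the layout are skipped.
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few lines copied from the error_log excerpt above.
        List<String> sample = List.of(
                "thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.",
                "thr_continue of 0xeabc0748(1124094120) failed: 3 = ESRCH.",
                "thr_continue of 0xeabc0740(5367200) failed: 3 = ESRCH.",
                "thr_continue of 0xeabc06d0(-280320148) failed: 3 = ESRCH.");
        countByAddress(sample)
                .forEach((address, n) -> System.out.println(address + " " + n));
    }
}
```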

The servlets did not respond until the problem was discovered about 12 hours
later. The web server was restarted with "apachectl graceful" and the servlets
started working again.
>How-To-Repeat:
This problem occurs randomly as far as we can tell. We have not been able to
cause it to happen. Since the web server is restarted frequently as we make
changes to the servlets (the product is still in beta), it is not clear whether
the problem is related to the amount of time or the number of transactions
since start time. This has occurred about once a week. Our application is a
7x24 service used by major cell phone carriers, and even short outages are not
permitted.
>Fix:
Don't have the slightest idea, but it has the feel of a breakdown in
communications between the web server and JServ, or of JServ locking up on an
internal error.
>Audit-Trail:
>Unformatted:
[In order for any reply to be added to the PR database, you need]
[to include <[EMAIL PROTECTED]> in the Cc line and make sure the]
[subject line starts with the report component and number, with ]
[or without any 'Re:' prefixes (such as "general/1098:" or      ]
["Re: general/1098:").  If the subject doesn't match this       ]
[pattern, your message will be misfiled and ignored.  The       ]
["apbugs" address is not added to the Cc line of messages from  ]
[the database automatically because of the potential for mail   ]
[loops.  If you do not include this Cc, your reply may be ig-   ]
[nored unless you are responding to an explicit request from a  ]
[developer.  Reply only with text; DO NOT SEND ATTACHMENTS!     ]


