Hi,
Periodically, my web app logs show "timeout expired" errors when
connecting to my SQL Server. The timeout errors seem to coincide with
a single error stating that the SQL server can't be found. This only happens
for about one second, twice per day (sometimes more often), and other
concurrent requests do not seem to time out. This is a new error that just
started appearing about a week ago.
I've spent the past week attempting to diagnose the problem.
What I have done thus far:
- My SQL Server was performing an average of 40 queries per second. I reduced
this to about 18 by removing some duplicate queries and rewriting others that
weren't written optimally.
- Ran a dbreindex for the entire db and removed old records which were no longer
used. Created some new indexes, which seemed to improve performance somewhat
(average CPU is now about 6% during peak hours). Statistics are updated nightly.
- Upgraded my network card to a gigabit card, connected through a new gigabit
switch. At one point, I assumed this must be a networking issue, but I believe
I've ruled this out. Neither server shows any logs of any network disconnects
at any time.
- Changed the schedule of my backups, as they seemed to be related. However, the
backup times don't correspond to the errors at this point.
- Checked for long running queries. Corrected one, which was running an average
of 3 to 5 seconds, and reduced it to a max of 500ms, average of 63ms. This is
still the longest running query on the server, but there's really nothing more
that can be done to reduce the response time at this point (other than maybe
upgrading the server).
- I/O seems normal. CPU never peaks above 50%, and is usually averaging 6%. DB
server has 1 GB of RAM and the DB is 1.5 GB, but paging statistics don't show
anything significant.
The errors happen during peak times, but seem more frequent during non-peak
hours, generally between 11pm and 1am. There is nothing scheduled during this
time frame.
I have "maintain database connection" set to yes.
All of the queries that are timing out use with (nolock), so they shouldn't
be deadlocking. Typically these queries average between 0ms and 16ms, rarely
longer. None of the timed out queries are updates, only nolock reads.
I've covered everything that I can think of... is there anything else that I
should be looking at?
I almost suspect that the connection between the servers is dropping, though
there is no reason for this, and nothing that would indicate this is the case
that I can find...
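
One way I could test the dropped-connection theory is to run a small probe on
the web server that opens a TCP connection to the database port once a second
and logs any failure or slow connect, then compare that log against the
timeout errors. A minimal sketch in Python, assuming the default SQL Server
port 1433; the host name "dbserver", the function names, the 500ms threshold,
and the log file name are all placeholders I made up for illustration:

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Try one TCP connect; return (ok, elapsed_ms, error_text)."""
    start = time.monotonic()
    try:
        # Open and immediately close a connection to the database port.
        with socket.create_connection((host, port), timeout=timeout):
            ok, err = True, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and name lookup failures.
        ok, err = False, str(exc)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return ok, elapsed_ms, err


def watch(host, port, interval=1.0, slow_ms=500.0, log_path="probe.log"):
    """Probe forever, appending a line for every failed or slow connect."""
    while True:
        ok, elapsed_ms, err = probe(host, port)
        if not ok or elapsed_ms > slow_ms:
            with open(log_path, "a") as log:
                stamp = datetime.now().isoformat()
                log.write(f"{stamp} ok={ok} ms={elapsed_ms:.0f} {err}\n")
        time.sleep(interval)
```

Calling watch("dbserver", 1433) on the web server and leaving it running
overnight should show whether plain TCP connects fail or stall at the same
moments as the timeouts. This only tests reachability of the port, not the
database login or the queries themselves.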
Geoff B