> SQL Server 2000 (we don't have a lot of money and the licence was
> cheaper... but we are planning to upgrade sometime soon)
Ok. 2000 doesn't give you near the info 2005 does, but it will give you quite a bit. First you need to keep an eye on when your performance spikes hit (which I think you're already doing) so you can try and get a handle on what is happening right then with the currently running requests.

When a slowdown starts, you need to get a snapshot of what is going on, first of all on your CF server. Grab a full stack trace. This is trivial with SeeFusion, and I assume FusionReactor as well. You can scan through it to look for patterns. Were multiple pages running the same query, etc.? If FusionReactor allows you to log slow DB calls, turn that on with a threshold low enough to log the stuff running at the time of the slowdown.

Now, for your SQL server:

    select * from master.dbo.sysprocesses

http://msdn.microsoft.com/en-us/library/aa260456(SQL.80).aspx

will return a record for every process on the SQL server. This will tell you which server the request is coming from (the DB doesn't know the client's IP -- only the web servers'). Usually I filter out sleeping processes and order by CPU desc. Of course, if the problem is locks, then the CPU usage won't be high for the problem processes since they aren't doing anything but waiting. You can get the command each process is running, but it is just basic info like "select" or "update".

Also in the sysprocesses table is the blocked column. It contains the spid of whatever process is blocking it. If

    select * from master.dbo.sysprocesses where blocked > 0

returns a lot of records, you are dealing with a lot of locks. Oftentimes one process will chain down and block a number of requests, each of which is blocking others...
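To make that concrete, here's the kind of filtered query I mean (the columns are standard ones from the SQL 2000 sysprocesses table; the sleeping filter and the ordering are just my habit, adjust to taste):

```sql
-- Busiest non-sleeping processes first
SELECT spid, status, hostname, program_name, cmd, cpu, physical_io, blocked
FROM master.dbo.sysprocesses
WHERE status <> 'sleeping'
ORDER BY cpu DESC

-- Quick check: is anybody being blocked right now?
SELECT spid, blocked, waittime, lastwaittype
FROM master.dbo.sysprocesses
WHERE blocked > 0
```

Run both during a spike and save the output somewhere -- comparing a few snapshots taken a minute apart tells you a lot more than one in isolation.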
You've got to keep climbing the lock tree to see who's blocking whom until you find a process that is blocking other processes, but not being blocked itself:

    select * from master.dbo.sysprocesses p1
    where blocked = 0
    and spid in (select blocked from master.dbo.sysprocesses p2 where blocked > 0)

I didn't test that, but I think it is right. If you have one process blocking everyone else, then you need to pull out sp_lock, which is a system proc that shows you all the locks that exist in the database right then. You might get a TON of records back, but you need to know which ones to worry about. The spid column tells you what spid is holding or waiting for the lock. Cross-reference this with sysprocesses.

Locks are a topic of their own. In general, keep your eyes peeled for the following:

- A TON of RID locks for one process. This may indicate a table scan or an update hitting many records at a time.
- TAB locks are locks on an entire table.
- FIL locks can show up occasionally while writing a large transaction log file to a slow disk.

Of course, the lock type has to be taken in conjunction with the lock mode: shared locks are generally harmless, but it depends on your isolation level. Exclusive locks are the ones to look out for. The lock status tells you if the lock is acquired or still being waited for.

The database id and object id are your key to finding what object (usually a table) was being locked. select * from master.dbo.sysdatabases will give you the database ids; then, in that database (use databasename), object_name(object_id) will give you the object's name. If you want to make your life easier, google for sp_lock2, which is a proc that looks up the object names for you.

Your goal is to find the pattern of which SQL statements/tables are the repeat offenders and which ones are creating a large number of locks or leaving locks open for a long time before committing them. If you are using transactions in your SQL, you need to be careful how long you wait to commit those.
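Spelled out end to end, the lookup goes something like this (the dbid of 5, the ObjId of 123456789, and the database name yourdatabase are all made-up placeholders -- plug in whatever values sp_lock actually hands you):

```sql
-- All current locks: spid, dbid, ObjId, Type (RID/TAB/FIL...), Mode, Status
EXEC sp_lock

-- Which database is dbid 5? (placeholder value from the sp_lock output)
SELECT name FROM master.dbo.sysdatabases WHERE dbid = 5

-- Then, inside that database, resolve the ObjId (placeholder value)
USE yourdatabase
SELECT OBJECT_NAME(123456789)
```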
This would be bad:

    begin transaction
    update table with(holdlock) set col = foo
    -- run database maintenance plan...
    commit transaction

Once you know what statement is giving you problems, take it and run it in Query Analyzer to get the real execution plan (not the estimated one). Reading an execution plan is a whole topic of its own, but you want to find the slowest parts of it. Look for any table scans or an extremely large number of records being returned by one of the inner operations. An index seek is almost always better than an index scan. If you are doing a large amount of reads and the data doesn't need to be transactionally safe, use with(nolock), which sets the isolation level for that statement to read uncommitted. Make sure you are updating tables on an indexed key to avoid table scans, which want to escalate to exclusive table locks.

> Error Executing Database Query. [Macromedia][SQLServer JDBC
> Driver][SQLServer]Transaction (Process ID 110) was deadlocked on lock
> resources with another process and has been chosen as the deadlock
> victim. Rerun the transaction.
>
> This kind of error message has only happened 8 times today... but
> maybe it's related. I don't have a CLUE how to deal with SQL Server
> deadlocks!

Whoa! 8 times? Let's make it your mission to see that happen 0 times a day. :)

You're probably familiar with what a deadlock is. Process 1 has resource A and needs resource B. Process 2 has resource B and needs resource A. The resources are most commonly tables, but can also be rows in a table if two processes have row or page locks on different parts of a table and want to escalate, but can't. In general, if you have a series of tables you commonly update in a transaction, make sure you always update them in the same order. If you are logging your errors and you know what queries are getting deadlocked, then you can make a good guess what tables are the problem.
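Here's a quick sketch of that "same order" rule (orders and order_items are hypothetical table names; the point is that both transactions touch the tables in the same sequence, so neither can end up holding what the other one needs):

```sql
-- Transaction A: orders first, then order_items
BEGIN TRANSACTION
UPDATE orders SET status = 'shipped' WHERE order_id = 42
UPDATE order_items SET shipped = 1 WHERE order_id = 42
COMMIT TRANSACTION

-- Transaction B: ALSO orders first, then order_items.
-- If B touched order_items first instead, A and B could each
-- grab one table and deadlock waiting on the other.
BEGIN TRANSACTION
UPDATE orders SET status = 'cancelled' WHERE order_id = 99
UPDATE order_items SET shipped = 0 WHERE order_id = 99
COMMIT TRANSACTION
```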
A lot of deadlocks can be avoided in the first place by making sure your locks stay open as short a time as possible. This means calculating what needs to be changed, then getting in, doing the update/insert, and committing quickly. Large stored procs with a begin trans at the top and a commit trans at the bottom are bad news.

At any rate, you don't have to just sit and guess what your deadlocks are. Run a trace for deadlocks with SQL Profiler. It will tell you the two processes involved as well as the resource they were fighting over. Use the object id to find the table in question. That should help you track down where that is coming from.

> WHENEVER there is a spike, I see the disk WRITE on the C/D drive jump
> to 100% while the disk read jumps up and down between 100% and 0%...
> during this time, I see disk activity on the E/F drive

Where is your tempdb stored? It would be nice to have your page file elsewhere, but it shouldn't be a big problem unless your server is tight on RAM and a lot of stuff is being swapped out to disk. Look at Windows Task Manager to see the total amount of memory in use. If it is greater than the amount of RAM installed, you are probably swapping badly.

Find out what kind of RAID you have. Mirroring provides the much-needed redundancy, but no performance benefit unless it is RAID 10 or 01, which is basically a mirrored set of striped drives or vice versa. Also, does your RAID controller have built-in RAM to cache commonly used data?

You will want to get a feel for what your hard drive activity normally looks like, to compare it to the levels you see when the database is misbehaving. Things like table scans on a few million records will cause a lot of read activity, but you seem to be getting a lot of write activity. This is why I asked about your tempdb. SQL Server uses internal temp tables you don't even know about.
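The "get in and get out" pattern looks something like this (orders and daily_summary are hypothetical tables; this only works if the data can drift a little between the read and the write):

```sql
DECLARE @total money

-- Bad: the slow aggregate runs inside the transaction,
-- so the lock on daily_summary is held the whole time
BEGIN TRANSACTION
SELECT @total = SUM(amount) FROM orders  -- slow scan
UPDATE daily_summary SET total = @total WHERE summary_date = '2008-06-01'
COMMIT TRANSACTION

-- Better: do the expensive calculation first, then lock briefly
SELECT @total = SUM(amount) FROM orders  -- slow scan, no transaction open
BEGIN TRANSACTION
UPDATE daily_summary SET total = @total WHERE summary_date = '2008-06-01'
COMMIT TRANSACTION
```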
If you are doing some monster selects with a few dozen tables all joined at the same time, there might be some massive internal temp tables happening, depending on the indexes available. This is where your execution plans come in and tell you what is happening.

> So I'm using perfmon to monitor % disk read and write on the two
> physical sets of disks.

You might get some more meaningful data from Avg. Disk sec/Read (and Write) and Avg. Disk Queue Length. Once again, there are about a billion things that factor into this, but for the first counter, you probably don't want to see your disks' average reads and writes taking more than 50 ms. A good value for this counter would be more like 10 ms, which means that the average read or write only takes 10 ms to happen. The second counter is more complex and depends on your RAID or SAN configuration. The basic rule of thumb is you don't want to see more than a small handful of queued I/Os (think under 5-10) PER disk in your array. An array of 20 disks with an avg disk queue length of 40 under load would be pretty good (2 per disk). Of course, your hard disk performance is often the symptom, not the problem. You wouldn't want to go out and buy more drives just because you had some poor SQL that wasn't taking advantage of your indexes.

> It can't be a coincidence that when these spikes occur are also the
> ONLY time I see disk write activity to the E/F drives, where the
> transaction logs are.

A lot of transaction log activity probably means a lot of data is being updated/inserted somewhere. Make sure updates only affect the necessary rows. Don't do "delete from #temp_table" when a truncate could do.

Well, it seems I have typed much more than I intended to. I guess SQL Server performance can be an expansive topic. Good luck, and let us know what you find.
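For the record, the delete-vs-truncate difference looks like this (#temp_table is a stand-in for whatever temp table you're clearing; truncate won't work if the table is referenced by a foreign key):

```sql
-- Logs every single row it removes -- lots of transaction log traffic
DELETE FROM #temp_table

-- Deallocates the data pages instead -- minimally logged and much faster
TRUNCATE TABLE #temp_table
```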
~Brad

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:319108