> SQL Server 2000 (we don't have a lot of money and the licence was
> cheaper... but we are planning to upgrade sometime soon)
Ok. 2000 doesn't give you near the info 2005 does, but it will give you quite a bit. First you need to keep an eye on when your performance spikes hit (which I think you're already doing) so you can try and get a handle on what is happening right then with the currently running requests.

When a slowdown starts, you need to get a snapshot of what is going on, first of all on your CF server. Grab a full stack trace. This is trivial with SeeFusion, and I assume FusionReactor as well. You can scan through it to look for patterns. Were multiple pages running the same query, etc.? If FusionReactor allows you to log slow DB calls, turn that on with a threshold low enough to log the stuff running at the time of the slowdown.

Now, for your SQL server:

    select * from master.dbo.sysprocesses

http://msdn.microsoft.com/en-us/library/aa260456(SQL.80).aspx

will return a record for every process on the SQL server. This will tell you which server the request is coming from (the DB doesn't know the client's IP -- only the web servers'). Usually I filter out sleeping processes and order by CPU desc. Of course, if the problem is locks, then the CPU usage won't be high for the problem processes since they aren't doing anything but waiting. You can get the command each process is running, but it is just basic info like "select" or "update".

Also in the sysprocesses table is the blocked column. It contains the spid of whatever process is blocking it. If

    select * from master.dbo.sysprocesses where blocked > 0

returns a lot of records, you are dealing with a lot of locks. Oftentimes one process will chain down and block a number of requests, each of which is blocking others...
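To make that concrete, here's the kind of filtered query I mean (the columns are standard ones from the SQL 2000 sysprocesses table; the sleeping filter and the ordering are just my habit, adjust to taste):

```sql
-- Busiest non-sleeping processes first
SELECT spid, status, hostname, program_name, cmd, cpu, physical_io, blocked
FROM master.dbo.sysprocesses
WHERE status <> 'sleeping'
ORDER BY cpu DESC

-- Quick check: is anybody being blocked right now?
SELECT spid, blocked, waittime, lastwaittype
FROM master.dbo.sysprocesses
WHERE blocked > 0
```

Run both during a spike and save the output somewhere -- comparing a few snapshots taken a minute apart tells you a lot more than one in isolation.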
You've got to keep climbing the lock tree to see who's blocking whom until you find a process that is blocking other processes, but not being blocked itself:

    select * from master.dbo.sysprocesses p1
    where blocked = 0
    and spid in (select blocked from master.dbo.sysprocesses p2 where blocked > 0)

I didn't test that, but I think it is right. If you have one process blocking everyone else, then you need to pull out sp_lock, which is a system proc that shows you all the locks that exist in the database right then. You might get a TON of records back, but you need to know which ones to worry about. The spid column tells you what spid is holding or waiting for the lock. Cross-reference this with sysprocesses.

Locks are a topic of their own. In general, keep your eyes peeled for the following:

- A TON of RID locks for one process. This may indicate a table scan or an update hitting many records at a time.
- TAB locks are locks on an entire table.
- FIL locks can show up occasionally while writing a large transaction log file to a slow disk.

Of course, the lock type has to be taken in conjunction with the lock mode: shared locks are generally harmless, but it depends on your isolation level. Exclusive locks are the ones to look out for. The lock status tells you if the lock is acquired or still being waited for.

The database id and object id are your key to finding what object (usually a table) was being locked. select * from master.dbo.sysdatabases will give you the database ids; then, in that database (use databasename), object_name(object_id) will give you the object's name. If you want to make your life easier, google for sp_lock2, which is a proc that looks up the object names for you.

Your goal is to find the pattern of which SQL statements/tables are the repeat offenders and which ones are creating a large number of locks or leaving locks open for a long time before committing them. If you are using transactions in your SQL, you need to be careful how long you wait to commit those.
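Spelled out end to end, the lookup goes something like this (the dbid of 5, the ObjId of 123456789, and the database name yourdatabase are all made-up placeholders -- plug in whatever values sp_lock actually hands you):

```sql
-- All current locks: spid, dbid, ObjId, Type (RID/TAB/FIL...), Mode, Status
EXEC sp_lock

-- Which database is dbid 5? (placeholder value from the sp_lock output)
SELECT name FROM master.dbo.sysdatabases WHERE dbid = 5

-- Then, inside that database, resolve the ObjId (placeholder value)
USE yourdatabase
SELECT OBJECT_NAME(123456789)
```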
This would be bad:

    begin transaction
    update table with(holdlock) set col = foo
    -- run database maintenance plan...
    commit transaction

Once you know what statement is giving you problems, take it and run it in Query Analyzer to get the real execution plan (not the estimated one). Reading an execution plan is a whole topic of its own, but you want to find the slowest parts of it. Look for any table scans or an extremely large number of records being returned by one of the inner operations. An index seek is almost always better than an index scan. If you are doing a large amount of reads and the data doesn't need to be transactionally safe, use with(nolock), which sets the isolation level for that statement to read uncommitted. Make sure you are updating tables on an indexed key to avoid table scans, which want to escalate to exclusive table locks.

> Error Executing Database Query. [Macromedia][SQLServer JDBC
> Driver][SQLServer]Transaction (Process ID 110) was deadlocked on lock
> resources with another process and has been chosen as the deadlock
> victim. Rerun the transaction.
>
> This kind of error message has only happened 8 times today... but
> maybe it's related. I don't have a CLUE how to deal with SQL Server
> deadlocks!

Whoa! 8 times? Let's make it your mission to see that happen 0 times a day. :)

You're probably familiar with what a deadlock is. Process 1 has resource A and needs resource B. Process 2 has resource B and needs resource A. The resources are most commonly tables, but can also be rows in a table if two processes have row or page locks on different parts of a table and want to escalate, but can't. In general, if you have a series of tables you commonly update in a transaction, make sure you always update them in the same order. If you are logging your errors and you know what queries are getting deadlocked, then you can make a good guess what tables are the problem.
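Here's a quick sketch of that "same order" rule (orders and order_items are hypothetical table names; the point is that both transactions touch the tables in the same sequence, so neither can end up holding what the other one needs):

```sql
-- Transaction A: orders first, then order_items
BEGIN TRANSACTION
UPDATE orders SET status = 'shipped' WHERE order_id = 42
UPDATE order_items SET shipped = 1 WHERE order_id = 42
COMMIT TRANSACTION

-- Transaction B: ALSO orders first, then order_items.
-- If B touched order_items first instead, A and B could each
-- grab one table and deadlock waiting on the other.
BEGIN TRANSACTION
UPDATE orders SET status = 'cancelled' WHERE order_id = 99
UPDATE order_items SET shipped = 0 WHERE order_id = 99
COMMIT TRANSACTION
```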
A lot of deadlocks can be avoided in the first place by making sure your locks stay open as short a time as possible. This means calculating what needs to be changed, then getting in, doing the update/insert, and committing quickly. Large stored procs with a begin trans at the top and a commit trans at the bottom are bad news.

At any rate, you don't have to just sit and guess what your deadlocks are. Run a trace for deadlocks with SQL Profiler. It will tell you the two processes involved as well as the resource they were fighting over. Use the object id to find the table in question. That should help you track down where that is coming from.

> WHENEVER there is a spike, I see the disk WRITE on the C/D drive jump
> to 100% while the disk read jumps up and down between 100% and 0%...
> during this time, I see disk activity on the E/F drive

Where is your tempdb stored? It would be nice to have your page file elsewhere, but it shouldn't be a big problem unless your server is tight on RAM and a lot of stuff is being swapped out to disk. Look at Windows Task Manager to see the total amount of memory in use. If it is greater than the amount of RAM installed, you are probably swapping badly.

Find out what kind of RAID you have. Mirroring provides the much-needed redundancy, but no performance benefit unless it is RAID 10 or 01, which is basically a mirrored set of striped drives or vice versa. Also, does your RAID controller have built-in RAM to cache commonly used data?

You will want to get a feel for what your hard drive activity normally looks like, to compare it to the levels you see when the database is misbehaving. Things like table scans on a few million records will cause a lot of read activity, but you seem to be getting a lot of write activity. This is why I asked about your tempdb. SQL Server uses internal temp tables you don't even know about.
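The "get in and get out" pattern looks something like this (orders and daily_summary are hypothetical tables; this only works if the data can drift a little between the read and the write):

```sql
DECLARE @total money

-- Bad: the slow aggregate runs inside the transaction,
-- so the lock on daily_summary is held the whole time
BEGIN TRANSACTION
SELECT @total = SUM(amount) FROM orders  -- slow scan
UPDATE daily_summary SET total = @total WHERE summary_date = '2008-06-01'
COMMIT TRANSACTION

-- Better: do the expensive calculation first, then lock briefly
SELECT @total = SUM(amount) FROM orders  -- slow scan, no transaction open
BEGIN TRANSACTION
UPDATE daily_summary SET total = @total WHERE summary_date = '2008-06-01'
COMMIT TRANSACTION
```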
If you are doing some monster selects with a few dozen tables all joined at the same time, there might be some massive internal temp tables happening, depending on the indexes available. This is where your execution plans come in and tell you what is happening.

> So I'm using perfmon to monitor % disk read and write on the two
> physical sets of disks.

You might get some more meaningful data from Avg. Disk sec/Read (and Write) and Avg. Disk Queue Length. Once again, there are about a billion things that factor into this, but for the first counter, you probably don't want to see your disks' average reads and writes taking more than 50 ms. A good value for this counter would be more like 10 ms, which means that the average read or write only takes 10 ms to happen. The second counter is more complex and depends on your RAID or SAN configuration. The basic rule of thumb is you don't want to see more than a small handful of queued I/Os (think under 5-10) PER disk in your array. An array of 20 disks with an avg disk queue length of 40 under load would be pretty good (2 per disk). Of course, your hard disk performance is often the symptom, not the problem. You wouldn't want to go out and buy more drives just because you had some poor SQL that wasn't taking advantage of your indexes.

> It can't be a coincidence that when these spikes occur are also the
> ONLY time I see disk write activity to the E/F drives, where the
> transaction logs are.

A lot of transaction log activity probably means a lot of data is being updated/inserted somewhere. Make sure updates only affect the necessary rows. Don't do "delete from #temp_table" when a truncate could do.

Well, it seems I have typed much more than I intended to. I guess SQL Server performance can be an expansive topic. Good luck, and let us know what you find.
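For the record, the delete-vs-truncate difference looks like this (#temp_table is a stand-in for whatever temp table you're clearing; truncate won't work if the table is referenced by a foreign key):

```sql
-- Logs every single row it removes -- lots of transaction log traffic
DELETE FROM #temp_table

-- Deallocates the data pages instead -- minimally logged and much faster
TRUNCATE TABLE #temp_table
```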
~Brad

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:319108