On 24-7-2015 00:23, conver...@gmail.com [firebird-support] wrote:
> Thanks for your insightful response. FWIW, I would like to mention that,
> in the same server, we have another database (same size ~7 GB) no one
> connects to, it's a restore of the production database from January this
> year. This database works perfectly even when the production database is
> down. We try only a few test connections though.
>
> Below is some of the requested information, at a time when the
> production database performance is normal.
>
> Firebird.conf:
> ---------------------
>
> DefaultDbCachePages = 1024
> #FileSystemCacheThreshold = 65536 (commented out)
> #FileSystemCacheSize = 0 (commented out)
>
>
> Server environment:
> --------------------------
>
> CPU utilization: 11%
> Memory utilization: 11 GB (out of 32)
>
> Note.- Even when the DB performance is down, these values are in the same
> range or even lower. No swapping.
>
> gstat output (normal performance):
> ---------------------------------------------------------
>
> Database header page information:
> Flags                   0
> Checksum                12345
> Generation              19572161
> Page size               16384
> ODS version             11.2
> Oldest transaction      18709808
> Oldest active           18953295
> Oldest snapshot         18851591
> Next transaction        19520857

The large gap between the oldest transaction and the next transaction
indicates that you have long-running transactions, which can lead to
performance problems due to garbage accumulation.

> Bumped transaction      1
> Sequence number         0
> Next attachment ID      50438
> Implementation ID       26
> Shadow count            0
> Page buffers            3000

This might be a bit high for Classic. With a page size of 16384 bytes, it
means that each connection can take about 47 MB in cached pages
(3000 x 16 KB). However, with 32 GB available, that might not be that
relevant.

> Next header page        0
> Database dialect        1
> Creation date           Jul 7, 2015 7:00:57
> Attributes              no reserve

As already noted by Thomas: don't use "no reserve" (from the gstat manual:
"All pages will be filled to 100% and will be most useful on read-only
databases. No space is reserved in each page for updates and/or
deletions.")

> Variable header data:
> Database backup GUID: {BF8D26E0-970E-431A-7FAD-E2D9BDB2E4DA}
> Sweep interval: 0
> *END*
>
> Note.- We sweep the database manually each night.
>
> fb_lock_print output (normal performance):
> ----------------------------------------------------------------
>
> LOCK_HEADER BLOCK
> Version: 145, Active owner: 0, Length: 28311552, Used: 27588104
> Flags: 0x0001
> Enqs: 69364533, Converts: 192066, Rejects: 36029, Blocks: 282250
> Deadlock scans: 7, Deadlocks: 0, Scan interval: 10
> Acquires: 77720068, Acquire blocks: 2159883, Spin count: 0
> Mutex wait: 2.8%
> Hash slots: 1009, Hash lengths (min/avg/max): 51/ 66/ 81
> Remove node: 0, Insert queue: 0, Insert prior: 0
> Owners (145): forward: 441288, backward: 98120
> Free owners (11): forward: 24695928, backward: 23070064
> Free locks (2963): forward: 22024, backward: 27499760
> Free requests (42905): forward: 22145288, backward: 25253392
> Lock Ordering: Enabled

You need to increase the value of LockHashSlots in firebird.conf, as the
hash lengths (min/avg/max of 51/66/81 over 1009 slots) are rather long.

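To put numbers on the remarks above about the transaction gap and the
per-connection cache, here is a minimal sketch in Python. The values are
copied from the quoted gstat header; the dictionary keys and function names
are only illustrative and not part of any Firebird tool. It assumes, as
stated above for Classic, that every connection has its own page cache.

```python
# Values copied from the gstat header output quoted in this message.
header = {
    "oldest_transaction": 18709808,
    "oldest_active": 18953295,
    "oldest_snapshot": 18851591,
    "next_transaction": 19520857,
    "page_size": 16384,      # bytes
    "page_buffers": 3000,    # pages per cache
}


def transaction_gaps(h):
    """Return the distance from each 'oldest' marker to the next transaction."""
    nxt = h["next_transaction"]
    return {
        "oldest_transaction": nxt - h["oldest_transaction"],
        "oldest_active": nxt - h["oldest_active"],
        "oldest_snapshot": nxt - h["oldest_snapshot"],
    }


def cache_bytes_per_connection(h):
    """Page cache size in bytes for one cache (one per connection in Classic)."""
    return h["page_buffers"] * h["page_size"]


if __name__ == "__main__":
    for marker, gap in transaction_gaps(header).items():
        print(f"next transaction - {marker}: {gap}")
    mib = cache_bytes_per_connection(header) / 2**20
    print(f"page cache per connection: {mib:.1f} MiB")
```

The gap from the oldest transaction to the next transaction works out to
roughly 811,000 transactions, and the cache to roughly 47 MB per connection,
which matches the figure mentioned above.
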
> Firebird.log (IBMCASA is the server's host name)
> ------------------------------------------------------------------
>
> The log is literally FULL of 10053 and 10054 error entries like the
> following:
>
> IBMCASA Thu Jul 23 10:27:27 2015
> Unable to complete network request to host "IBMCASA".
> Error writing data to the connection.
>
>
> IBMCASA Thu Jul 23 10:27:29 2015
> Unable to complete network request to host "IBMCASA".
> Error reading data from the connection.
>
>
> IBMCASA Thu Jul 23 10:27:30 2015
> INET/inet_error: read errno = 10054
>
>
> According to the log, these errors seem to be happening every second or
> every few seconds/minutes, since March 8 2014 and until today even as
> I'm writing this. Each day, these errors stop at 11:49 PM when the last
> users stop working on the client apps, then they'll start again every
> morning at 6:00 AM when the first client apps connect to the database.

Error 10054 is "connection reset by peer": the connection was terminated
without properly signalling a connection close to the server. This might
indicate a problem in the application: not properly closing connections, or
applications being closed/killed before the connection could be closed
properly.

Combined with error 10053 (connection aborted), it might mean that you are
also using events and that the server tries to notify a client of an event
when the client is no longer there.

It would still be interesting to see the values when there is a performance
problem.

Mark
-- 
Mark Rotteveel