Re: Re: [firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-19 Thread liviuslivius liviusliv...@poczta.onet.pl [firebird-support]
Sorry, not Shane but Sean.
 
regards,
Karol Bieniaszewski
 

Re: [firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-19 Thread liviuslivius liviusliv...@poczta.onet.pl [firebird-support]
>>1) We do monitor the transaction gap and if needed, we interact. 
 
How often, and how? Via the MON$ tables, or by reading the header page?
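For reference, a gap check over the MON$ tables can be sketched roughly like this. The MON$DATABASE column names are standard since Firebird 2.1; the helper itself and the cursor handling are a hypothetical sketch over any DB-API-style cursor (e.g. from the fdb driver):

```python
def transaction_gap(cur):
    """Return the distance between the oldest interesting
    transaction (OIT) and the next transaction number.

    A steadily growing gap means garbage from old record
    versions is piling up and GC/sweep cannot keep pace.
    """
    cur.execute(
        "SELECT MON$OLDEST_TRANSACTION, MON$NEXT_TRANSACTION "
        "FROM MON$DATABASE"
    )
    oldest, next_txn = cur.fetchone()
    return next_txn - oldest
```

Reading the header page instead (e.g. via `gstat -h`) avoids starting a transaction just to take the measurement, which matters on a loaded Classic server.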
 
>>2) We do run gbak daily. We do not use nbackup - we would like to and tried 
>>it but it's corrupting the db.
 
And this slowdown does not occur during the backup, I suppose?
It would be good if you could test this corruption scenario on a newer Firebird 
version, especially FB 3, and send us your feedback.
 
>>3) We do use read-only read committed transactions with record versioning as 
>>much as we can.
>>Otherwise, same settings but not read-only.
 
good
 
More questions:
1. As Shane asked: what is your CPU utilization %, but per core?
2. How did you determine that your system slows down? Are some queries slower, 
or all of them? Did you run some monitoring through the MON$ tables then?
3. And, same as Shane asked: how often do you run a manual sweep? And did you 
consider changing FB to a newer version? In newer versions sweep was optimized 
and does not visit unnecessary pages; a page now carries a flag indicating 
whether it needs sweeping.
4. What is your TempCacheLimit?
5. And I am also interested in the disk test with CrystalDiskMark.
 
regards,
Karol Bieniaszewski
 

RE: [firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-19 Thread 'Leyne, Sean' s...@broadviewsoftware.com [firebird-support]


> 1) We have sweep set to 0. We do monitor the transaction gap and if
> needed, we interact. Any kind of automatic sweep under high load will kill
> the server :)

1- Do you run a manual sweep on a regular basis?


2- What is the CPU % like when Firebird "slows down"?


3- Have you tried using CrystalDiskMark to measure your disk IOPs?

My local workstation SSD results are:

---
CrystalDiskMark 5.0.2 x64 (C) 2007-2015 hiyohiyo
---
Sequential Read   (Q=  2, T= 4) : 564.293 MB/s
Sequential Write  (Q=  2, T= 4) : 535.874 MB/s
Random Read 4KiB  (Q=256, T= 1) : 227.652 MB/s [ 55579.1 IOPS]
Random Write 4KiB (Q=256, T= 1) : 195.258 MB/s [ 47670.4 IOPS]
Sequential Read   (T= 1)        : 545.480 MB/s
Sequential Write  (T= 1)        : 520.939 MB/s
Random Read 4KiB  (Q=  1, T= 1) :  36.577 MB/s [  8929.9 IOPS]
Random Write 4KiB (Q=  1, T= 1) :  90.805 MB/s [ 22169.2 IOPS]


4- Have you tried to analyze the long-running SQLs being executed, to determine 
whether queries are using wrong (or no) indexes, or whether other optimizations 
could be made?
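One way to capture those statements while the slowdown is happening is the MON$STATEMENTS table (available since FB 2.1). The column names below are real; the helper function and cursor handling are an illustrative sketch only:

```python
def active_statements(cur, limit=20):
    """Return currently executing statements, oldest first,
    so the longest-running SQL floats to the top."""
    cur.execute(
        "SELECT FIRST %d MON$ATTACHMENT_ID, MON$TIMESTAMP, MON$SQL_TEXT "
        "FROM MON$STATEMENTS "
        "WHERE MON$STATE = 1 "        # 1 = statement is currently running
        "ORDER BY MON$TIMESTAMP" % int(limit)
    )
    return cur.fetchall()
```

Once a suspect statement is identified, its plan can be inspected in isql with SET PLAN ON to see whether an index is used at all.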


Sean



[firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-19 Thread thetr...@yahoo.com [firebird-support]
We have now split our system into 2 databases, reducing the connections to about 
250 per DB, running on the same hardware (same SAN storage) with the CPU 
resources split between them.
 

 Both systems seem stable now, but that's not really the solution we preferred; 
it just helps us not to lose customers at the moment.



[firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-19 Thread thetr...@yahoo.com [firebird-support]
1) We have sweep set to 0. We do monitor the transaction gap and, if needed, we 
interact. Any kind of automatic sweep under high load will kill the server :)
 

 2) We do run gbak daily. We do not use nbackup - we would like to and tried it, 
but it's corrupting the db.
 

 3) We do use read-only read committed transactions with record versioning as 
much as we can.
 Otherwise, same settings but not read-only.
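For reference, the transaction mode described in point 3 corresponds to Firebird's SET TRANSACTION syntax. A minimal sketch over a DB-API-style cursor; the statement text is real Firebird syntax, the helper name is made up:

```python
# Read-only + read committed + record versioning: such a transaction
# is pre-committed by the engine, never blocks writers, and does not
# hold back the OAT the way a long-lived snapshot would.
READ_ONLY_TXN_SQL = "SET TRANSACTION READ ONLY READ COMMITTED RECORD_VERSION"

def start_reporting_txn(cur):
    """Start the kind of transaction point 3 describes."""
    cur.execute(READ_ONLY_TXN_SQL)
```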



[firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-12 Thread thetr...@yahoo.com [firebird-support]
Hey Thomas,
thanks for your extensive reply.
Unfortunately we're still bound to some old 32-bit UDF functionality which we 
can't get in 64-bit. 
I think you know about the issue of using SuperClassic with a 32-bit server - 
the 2 GB RAM limit :)
It's not impossible, but also not really a fast route we can go. But for sure 
again a reason to talk about moving the switch to 2.5.

We ran some disk IO benchmarks (with AS SSD) today, and in the age of SSDs the 
results are kind of depressing :D
The thing is: sure, these numbers look really low, but the system never reaches 
them. The monitoring of the SAN shows that these loads never occur. The 
single-4k-read result is worrying me, but I lean towards our 500 processes 
being more like the 64-thread test. But even then, we only measured 100 IOPS 
reading on the live system.

Sequential Read speed: ~450 MB/s
Sequential Write speed: ~500 MB/s
4k read: 196 IOPS
4k write: 1376 IOPS
4k-64-thread read: 15945 IOPS
4k-64-thread write: 7361 IOPS


 Garbage info still needs to be collected, but first signs show that this 
indeed could be a potential problem.
From Sintatica: every 20 minutes there is a peak in GC for ~15,000 
transactions. This gets fixed by the server in a relatively small amount of 
time (I think < 1 minute), since it's really only a single peak in the graph 
every time.
When the GC backlog stops increasing and the server starts to collect it, we 
see an increase in concurrently running transactions (= transactions stay open 
longer and are processed more slowly).

We don't have data from the live system yet to see whether this behaviour kind 
of "snowballs" when there is really high load on the server.

Best Regards,

---In firebird-support@yahoogroups.com,  wrote :

 Hi Patrick,
 
 > Hi Thomas, nice to get a response from you. We already met in ~2010 in Linz 
 > at
 > your office :)
 > (ex. SEM GmbH, later Playmonitor GmbH)
 
 I know. XING (Big Brother) is watching you. Nice to see that you are still 
running with Firebird. ;-)
 
 
 > First, sorry for posting a mixed state of information. The config settings I
 > posted are the current settings.
 > But the Lock-Table header was from last Saturday (day of total system crash) 
 > -
 > we changed the hash slot value since then, but it didn't work. The new table
 > looks like:
 > 
 > 
 > LOCK_HEADER BLOCK
 > Version: 16, Active owner: 0, Length: 134247728, Used: 55790260
 > Semmask: 0x0, Flags: 0x0001
 > Enqs: 1806423519, Converts: 4553851, Rejects: 5134185, Blocks: 56585419
 > Deadlock scans: 82, Deadlocks: 0, Scan interval: 10
 > Acquires: 2058846891, Acquire blocks: 321584126, Spin count: 0
 > Mutex wait: 15.6%
 > Hash slots: 20011, Hash lengths (min/avg/max): 0/ 7/ 18
 > Remove node: 0, Insert queue: 0, Insert prior: 0
 > Owners (297): forward: 385160, backward: 38086352
 > Free owners (43): forward: 52978748, backward: 20505128
 > Free locks (41802): forward: 180712, backward: 3620136
 > Free requests (-1097572396): forward: 46948676, backward: 13681252
 > Lock Ordering: Enabled
 > 
 > 
 > The Min/Avg/Max hash lengths look better now, but as you mentioned the Mutex
 > wait is worrying us too.
 > We have 2 direct questions about that.
 > 
 > 
 > 1) What are the negative effects of increasing Hash-Slots (too high)?
 
 It somehow defines the initial size of a hash table which is used for lock(ed) 
object lookup by a key (= hash value), ideally with constant O(1) run-time 
complexity. If the hash table is too small, due to a too small value for hash 
slots, it starts to degenerate into a linked/linear list per hash slot. Worst 
case resulting in O(n) complexity for lookups. The above 20011 setting shows an 
AVG hash length which looks fine.
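As a back-of-the-envelope check of that relationship: the average chain length is simply locked objects divided by hash slots. A toy sketch (the helper is hypothetical; the object counts are inverted from the averages the two fb_lock_print outputs in this thread report):

```python
def avg_chain_length(locked_objects, hash_slots):
    """Average nodes a lock lookup must walk inside one hash slot.

    Close to 1 means near-O(1) lookups; large values mean each
    slot has degenerated into a long linked list, i.e. the table
    is too small for the number of locked objects.
    """
    return locked_objects / hash_slots

# 20011 slots at avg length  7 -> ~140,000 locked objects
# 15077 slots at avg length 12 -> ~181,000 locked objects
```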
 
 As you might know, Classic having a dedicated process per connection model 
somehow needs a (global) mechanism to synchronize/protect shared data 
structures across these processes via IPC. This is what the lock manager and 
the lock table is used for.
 
 > 2) As far as we know, we can't influence Mutex wait directly (it's just
 > informational). But do you think that's the reason the underlying hardware is
 > not utilized?
 
 I don't think you are disk IO bound. Meaning, I'm not convinced that faster IO 
will help, which is somewhat backed by the high mutex wait. Under normal 
operations you see 100-500 IOPS with some room for further increase, as shown 
in the 1700 IOPS backup use case. I don't know how random the disk IO is in 
these two scenarios. Any chance to run some sort of disk IO benchmark, or do 
you already know your upper limits for your SAN, IOPS-wise?
 
 > 
 > 
 > We do consider to upgrade to 2.5, but had our eyes on FB 3 over the last 
 > year,
 > waiting for it to get ready.
 > With 2.5.x we tested around a long time now, but never found a real reason to
 > upgrade - since it's a reasonable amount of work for us. When you say it
 > improves the lock contention, this sound pretty good. But again the question,
 > do you think lock contention is limiting our system?
 
 Dmitry, Vlad etc. will correct me (in case he is following the thread), but I 
recall t

[firebird-support] Re: [FB 2.1] Firebird engine seems to slow down on high load without utilizing hardware

2016-04-12 Thread thetr...@yahoo.com [firebird-support]

 Hey Alexey,
thank you for your input. I think what you say is correct, and we reviewed our 
disk setup again.
We are utilizing mechanical disks, so it's kind of hard to compare SSD 
performance to them.
But they should provide enough IOPS for our load.

Unfortunately we can't just switch to a single SSD, since we would lose the 
replication and failover the SAN provides, which is a critical demand for us. 
I'm afraid for now we have to stick with it, until we have some facts to prove 
that the SAN setup is our limiting factor. And the data does not show that for 
me currently.

On a side note, we do own a server with an SSD setup, but in tests we couldn't 
get a noticeable performance gain through the increased IO this way. (The 
tests were generic and not real-world load, unfortunately.)

Best Regards,
Patrick

---In firebird-support@yahoogroups.com,  wrote :

 Hi Patrick,
 
 If you say that problem occurred recently, I would suggest you to check SAN 
disks health.
 
 However, these values:
 > Average system IOPS under load read: 100
 > Average system IOPS under load write: 550
 > Backup Restore IOPS read: 1700
 > Backup Restore IOPS write: 250
 are really, really low. 
 1700 IOPS for a database with a 4k page size means 6.8 MB/sec (in the case of 
random reads).
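The arithmetic behind that figure, as a quick sketch (the helper name is made up; the 6.8 value comes from treating 4 KB as 4000 bytes, with 4096-byte pages it is ~7 MB/s):

```python
def random_read_throughput_mb(iops, page_bytes=4096):
    """MB/s achievable when every read fetches one random database page."""
    return iops * page_bytes / 1_000_000

# 1700 random reads/s * 4096 B pages = 6.9632 MB/s -- an order of
# magnitude below what a single SSD delivers even at queue depth 1.
```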
 
 I suggest to install a single SSD drive and check how it will work.
 SSD IOPS looks like
   Random Read 4KB (QD=32) :   283.050 MB/s [ 69104.0 IOPS]
   Random Write 4KB (QD=32) :   213.837 MB/s [ 52206.2 IOPS]
 
 
 From our optimization practice we found that if you need to optimize only a 
single instance of the database, the most cost-effective way is to upgrade to 
SSD first, and only then fix other problems.
 
 Regards,
 Alexey Kovyazin
 IBSurgeon HQbird www.ib-aid.com http://www.ib-aid.com
 
 
 

   Hi,
 recently we had some strange performance issues with our Firebird DB server.
 On high load, our server started to slow down. Select and update SQL query 
times went up by more than 500% on average,
 reaching unreasonably high execution times in the worst case (several minutes 
instead of < 1 sec).
 

 The OIT/OAT/Next Transaction statistics stayed within 1000 the whole time.
 We were not able to measure any hardware limiting factor. Indeed, this system 
was running with only 8 cores at about 70% CPU usage on max load.
 We decided that this might be our problem, since we experienced a similar 
problem at about 80% CPU load in the past.
 So we upgraded the hardware. As expected, the CPU load dropped to ~35% usage 
in the max-load scenario.
 But this did not solve the problem.
 Same story for the hard disk system: the usage is not even near its max 
capacity.
 

 We also can't see any impact on the hard disk.
 We're kind of stuck with our ideas, because we have no idea what could be a 
potential bottleneck in the system.
 Since the hardware doesn't show a limit, there has to be something else - most 
likely Firebird engine related - that's limiting our system.
 We would be very grateful if anyone can give us hints where we can search 
further,
 or if someone has similar experiences to share with us.
 

 

 Operating System: Windows Server 2003
 Firebird: 2.1.5 Classic
 Dedicated database server (VMWare)
 

 CPU: 16 cores, each 2.4 GHz
 RAM: 32 GB
 About 14GB are used from OS and firebird processes under max load.
 
 HDD: SAN Storage System
 

 Average system IOPS under load read: 100
 Average system IOPS under load write: 550
 Backup Restore IOPS read: 1700
 Backup Restore IOPS write: 250
 SAN IOPS Limit (max): 3000
 

 Firebird Config Settings, based on defaults
 DefaultDbCachePages = 1024
 LockMemSize = 134247728
 LockHashSlots = 20011
 
 Database
 size: about 45 GB
 450 to 550 concurrent connections
 Daily average of 65 transactions / second (peak should be higher)
 

 FB_LOCK_PRINT (without any params) while system was slowing down (~4 days 
uptime).
 I have to note that Firebird was not able to print the complete output (the 
stats were not cropped by me).
 

 LOCK_HEADER BLOCK
 Version: 16, Active owner:  0, Length: 134247728, Used: 82169316
 Semmask: 0x0, Flags: 0x0001
 Enqs: 4211018659, Converts: 10050437, Rejects: 9115488, Blocks: 105409192
 Deadlock scans:   1049, Deadlocks:  0, Scan interval:  10
 Acquires: 4723416170, Acquire blocks: 640857597, Spin count:   0
 Mutex wait: 13.6%
 Hash slots: 15077, Hash lengths (min/avg/max):3/  12/  25
 Remove node:  0, Insert queue: 36, Insert prior: 74815332
 Owners (456): forward: 131316, backward: 14899392
 Free owners (9): forward: 39711576, backward: 49867232
 Free locks (42409): forward: 65924212, backward: 23319052
 

 
 With best Regards,
 

 Patrick Friessnegg
 Synesc GmbH