3.x SuperServer: quick degradation of performance when the same (trivial) query is executed by more than two sessions
------------------------------------------------------------------------------------------------------
Key: CORE-5962
URL: http://tracker.firebirdsql.org/browse/CORE-5962
Project: Firebird Core
Issue Type: Bug
Affects Versions: 3.0.4
Environment: Firebird 3.0.1 and 3.0.4, SuperServer
OS: CentOS-7; Windows-2008 Server R2
Hardware:
1) POSIX: server with 2 CPU sockets x 12 physical cores x 2 HT;
2) POSIX: server with 2 CPU sockets x 6 physical cores x 2 HT;
3) Windows: machine with 1 socket x 4 physical cores x 4 HT
Reporter: Pavel Zotov

It was encountered (unexpectedly) that the performance of any trivial SQL statement is *very* poor when testing against SuperServer. I have a lot of results for a fresh 3.0.4 build, but the same results hold for 3.0.1 (Sep-2016).

This is reproduced in 100% of runs when the statement does something with DB tables (rather than a "no-DB" statement that only evaluates something, e.g. checks pattern matching of some text or calls a deterministic PSQL function in a loop, etc). The table can be either a fixed table or a GTT - it does not matter, but performance on a GTT degrades more slowly. The checked table always contained only 1 record and a single field: ID int.

Steps to reproduce.

0) create an empty database (see below: "E:\FB30.TMPINSTANCE\examples\empbuild\tmptest.fdb") and prepare the FB instance to work as SuperServer;

1) create a batch scenario and change its variables to match your environment ("fbhome", "fbport", "dbname", ISC_USER & ISC_PASSWORD):
===
@echo off
setlocal enabledelayedexpansion enableextensions
set fbhome=E:\FB30.TMPINSTANCE
set fbhost=localhost
set fbport=3030
set dbname=E:\FB30.TMPINSTANCE\examples\empbuild\tmptest.fdb
set sql=%~dpn0.sql
set ISC_USER=SYSDBA
set ISC_PASSWORD=masterkey

set att_cnt=%1
if .!att_cnt!.==.. (
  echo Must specify number of attachments to be established.
  exit
)

md %~dp0logs 2>nul

rem change ".1.==.0." to ".1.==.1." below to (re)create the test objects in the DB (see notes further down):
if .1.==.0. (
  !fbhome!\gfix.exe -mode read_write !fbhost!/!fbport!:!dbname!
  (
    echo set bail on;
    echo recreate table fix_test(id int^);
    echo insert into fix_test(id^) values(1^);
    echo commit;
    echo recreate global temporary table gtt_test(id int^) on commit delete rows;
    echo insert into gtt_test(id^) values(1^);
    echo commit;
  ) > %~dpn0.tmp

  !fbhome!\isql.exe !fbhost!/!fbport!:!dbname! -i %~dpn0.tmp
  if errorlevel 1 (
    echo Error while preparing test DB.
    exit
  )
  del %~dpn0.tmp
  !fbhome!\gfix.exe -mode read_only !fbhost!/!fbport!:!dbname!
)

rem launch the requested number of ISQL sessions asynchronously:
for /l %%i in (1,1,!att_cnt!) do (
  set run_isql=!fbhome!\isql.exe !fbhost!/!fbport!:!dbname! -i !sql!
  set /a k=1000+%%i
  set log=%~dp0logs\tmp.!k:~1,3!.log
  echo "!run_isql! 1^>!log! 2^>&1"
  start /min cmd /c "!run_isql! 1>!log! 2>&1"
)
===

2) create an SQL script in the same folder, with the same base name as the batch (i.e. "ss_test.bat" --> "ss_test.sql"):
===
set autoddl off;
commit;
set transaction read only;
set bail on;
set plan on;
set explain on;
set stat on;
set list on;
set term ^;
execute block returns(dts_before timestamp, dts_after timestamp) as
    declare c int = 1373078;
    declare i int;
begin
    dts_before = 'now';
    while (c > 0) do
    begin
        select 1 from rdb$database into i; --------------------------------- [ 1 ]
        c = c - 1;
    end
    dts_after = 'now';
    suspend;
end
^
set term ;^
quit;
===
Please look at this SQL and note that the statement marked as " [ 1 ] " will be executed in a PSQL loop about 1.3E+6 times, although it is absolutely trivial.
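(Side note: if you prefer each log to show the elapsed time directly instead of two raw timestamps, the block can return one extra column computed with DATEDIFF; this is only an illustrative variation, "elapsed_ms" is just a name chosen here.)
===
set term ^;
execute block returns(dts_before timestamp, dts_after timestamp, elapsed_ms bigint) as
    declare c int = 1373078;
    declare i int;
begin
    dts_before = 'now';
    while (c > 0) do
    begin
        select 1 from rdb$database into i; --------------------------------- [ 1 ]
        c = c - 1;
    end
    dts_after = 'now';
    -- elapsed time of this attachment, in milliseconds
    elapsed_ms = datediff(millisecond from dts_before to dts_after);
    suspend;
end
^
set term ;^
===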
3) write down the number of physical cores of your CPU:
WMIC.exe /locale:ms_409 cpu get NumberOfCores /format:list > c:\temp\my_cpu_phys_cores.txt

4) launch trace with the following config:
===
database = (%[\\/](security3).fdb|(security.db))
{
    enabled = false
}
database =
{
    enabled = true
    log_errors = true
    log_initfini = false
    time_threshold = 0
    log_statement_finish = true
    print_plan = true
}
===

5) run the batch, specifying as its first (and only) argument the number of ISQL sessions that should be launched in asynchronous mode (i.e. using the START command; on POSIX this is done by appending "&" to the command line).

Please start with "ss_test.bat 1" and repeat it 2-3 times -- this will give something like reference (baseline) results.
Stop the trace and filter its data (we are interested only in FETCHES):
findstr /c:"fetch(es)" trace.log | findstr /v "mark(s)" > filtered_trace.log

NOTE! After each start you have to wait until ALL child windows are closed (and no further activity can be seen in the trace).

After running "ss_test.bat 1", repeat with "ss_test.bat 2", then with "ss_test.bat 4", "ss_test.bat 8" and so on (and WAIT until all sessions complete!). Also, restart the trace before each launch and stop it afterwards, then filter it into a separate text file (see above).

You will see that:
1) elapsed time grows approximately linearly - EVEN WHEN THE NUMBER OF ATTACHMENTS IS LESS THAN THE NUMBER OF PHYSICAL CPU CORES;
2) elapsed times of different sessions start to DIFFER CONSIDERABLY when their number is greater than the number of physical CPU cores. Some attachments (usually 2-3) can finish their job MUCH FASTER than others. The difference can be up to 2...3 times. Workload balance is poor here (unfair);
3) growth of elapsed time remains linear with an increasing number of attachments, but the Y/X coefficient is much greater after exceeding the number of physical cores.

If you look again into the batch, you can see there: "if .1.==.0. (". Change it to ".1.==.1." -- and try with other types of DB objects: GTT, selectable procedures, views etc. Of course, after this you have to change the SQL script so that it handles these (new) objects.

I tried the following (a sketch of these loop bodies is shown right after the list):
* "dummy loop" with only "i = i+1", no query to any DB object. Result: elapsed time did not increase for 1, 2, 4... attachments until all physical cores were involved in work (so this is 100% scalability);
* loop with evaluating: bool_var = <LONG_STRING> LIKE '%qwerty%'. Result: good scalability until all physical cores were involved; for 8 and 16 attachments there was a considerable difference in elapsed time (unfair balancing of CPU time);
* loop with a query to the GTT instead of the fixed table RDB$DATABASE. Result: POOR, no scalability. Linear growth of elapsed time, and again a considerable difference in speed between attachments (unfair balancing). Good news: the avg time for the GTT is about 3-4 times less than for RDB$DATABASE;
* loop with a query to a FIXED USER-DEFINED table (i.e. not from the DB dictionary). Result: POOR, the same as for RDB$DATABASE.
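To make these variants concrete, here is a minimal sketch of the GTT variant of the execute block from step 2 (not necessarily the exact script used for the measurements); gtt_test and fix_test are the tables created by the batch in step 1, and the other variants differ only in the marked line:
===
set term ^;
execute block returns(dts_before timestamp, dts_after timestamp) as
    declare c int = 1373078;
    declare i int;
begin
    dts_before = 'now';
    while (c > 0) do
    begin
        -- GTT variant of the trivial statement; for the fixed user-defined table
        -- use "select id from fix_test into i;", and for the "dummy loop" variant
        -- replace this line with "i = i + 1;"
        select id from gtt_test into i;
        c = c - 1;
    end
    dts_after = 'now';
    suspend;
end
^
set term ;^
===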
The test was also repeated:
* for different values of DB cache pages;
* for database access = read_only;
* for FW = OFF;
* for different transaction isolation levels (including snapshot with table reserving).
Results were the same.

::::::::::::::::::::::::::::::::::: NOTE :::::::::::::::::::::::::::::::::::::::
NO such troubles in Classic and SuperClassic!
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Please look into the attached .xlsx: it contains the data of the measurements on the Windows host.

Sheets:
* "Cs vs sC vs SS" -- results for Classic, SuperClassic and SuperServer, for "select 1 from rdb$database", for attachment counts 1, 2, 4, 8 and 16;
* "benchmark query to FIX table" -- benchmark of the fixed table vs the GTT; also results for old FB 3.0.1.32610, 27.09.2016;
* "benchmark query to GTT table" -- miscellaneous results for the GTT when we query it directly, via a VIEW, via an SP or via a function; results for a deterministic PSQL function; results for DIFFERENT GTTs;
* "benchmark string LIKE pattern" -- single measurement for the loop with the <LONG_STRING> LIKE '%qwerty%' statement;
* ".bat, .sql" -- content of the batch and SQL scripts;
* "sys info" -- firebird.conf for 3.0.x that was used; also miscellaneous system info about the machine hardware.