On 2015-03-30 11:46 AM, Jeff Roux wrote: > Hi everyone, > > I have a daemon that collects information and stores it in a SQLite > database. The table has 1 million rows. > > This daemon is running on a HP server with 12 cores, 32 GB of RAM, > and a SSD drive. I have performance issues with some requests. For > instance, the following request takes more than 5 seconds to > accomplish with SQlite3 (in that particular case, the WHERE clause > selects all the data in the database, i.e. 1000000 rows): > > SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item > FROM flows > WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN (17, 6) > GROUP BY portLan ORDER BY vol DESC LIMIT 6; > > I have done some tests with or without "INDEXED BY" clauses and got > nearly the same results. > > I compared the performance with a mySQL and the same request takes > less than 1 second to accomplish. > > Could you give me some directions to optimize this kind of request > with SQlite3 when there is a big amount of data in the table ? I > need to increase 3 times the number of rows and the performance will > become unacceptable for my application.
I think something else is wrong, we might need your Schema to better judge. The reason I say something else is wrong, is that I made this script quick to reproduce what I could establish from your writings might be what the schema looks like, and then added a single index and tried the query. I deliberately added a primary key and then did not use it at all to avoid those advantages as it may not suit your needs. You will see I used 3 million rows in stead of 1 million (because you said your data may grow three times bigger) and some random values in the appropriate ranges to fake the data (where I had an idea, and guessed the rest). The query takes less than 50 milliseconds, even over 3 million rows. In fact the entire script creating the DB, inserting 3 million rows, creating an index, and then doing 3 queries all took only 18 seconds in total (without running analyze). This will even go a bit faster if I turn stats off, but I left it in for clarity. If this does not enlighten your quest, kindly provide your full schema and example data shapes which might give us a clue. NOTE: In my data it works out that about ~12K rows satisfies the where clause - if this is significantly more in your case it might change the timing a bit, but nothing like your suggested times. See results below: -- Processing Script for File: D:\Documents\TestScript3.txt -- Script Items: 7 Parameter Count: 0 -- 2015-03-30 16:16:11.795 | [Success] Script Started... -- ================================================================================================ DROP TABLE IF EXISTS flows; -- Item Stats: Item No: 1 Query Size (Chars): 29 -- VM Work Steps: 92 Rows Modified: 0 -- Full Query Time: 0d 00h 00m and 00.485s -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ CREATE TABLE flows( id INTEGER PRIMARY KEY, ipLan TEXT, ipWan TEXT, portLan INT, portWan INT, protocol INT, nbBytesDecompOut INT, nbBytesCompIn INT, tos INT, timestamp INT ); -- Item Stats: Item No: 2 Query Size (Chars): 207 -- VM Work Steps: 31 Rows Modified: 0 -- Full Query Time: -- --- --- --- --.---- -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ /* POPULATE the table with random + scaled data to emulate an actual dataset of 3M rows */ WITH acc(x,mx8,dx8,mxK,dxK,rK) AS ( SELECT 0,0,0,0,0,100 UNION ALL SELECT x+1, (x%8), CAST(round(x/8,0) AS INT), (x%1024), CAST(round(x/1024,0) AS INT), CAST(abs(round(random()/1024000000000,0)) AS INT) FROM acc WHERE (x<3000000) -- Testing 3 million rows ) INSERT INTO flows SELECT x, '192.168.1.'||mx8, '55.66.77.'||(dx8%256), -- ipLan, ipWan 1000+mx8, 5000+mx8, (x%18), -- portlan, portWan, protocol 64+(rk%1024000), -- nbBytesDecompOut 1024+(rk%1024000), -- nbBytesDecompIn (dx8%3), (1380000000+(x*10)) -- tos, timestamp FROM acc; -- Item Stats: Item No: 3 Query Size (Chars): 691 -- Result Columns: 0 Result Rows: 0 -- VM Work Steps: 180000087 Rows Modified: 3000001 -- Full Query Time: 0d 00h 00m and 12.849s -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ CREATE INDEX idxFlowTimeProt ON flows (timestamp, protocol); -- Item Stats: Item No: 4 Query Size (Chars): 62 -- VM Work Steps: 27000038 Rows Modified: 0 -- Full Query Time: 0d 00h 00m and 04.169s -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ SELECT * FROM flows LIMIT 20 -- Test to see if Data actually populated as intended - First 20 rows ; -- id ipLan ipWan portLan portWan protocol nbBytesDecompOut nbBytesCompIn tos timestamp -- -- ----- ----- ------- ------- -------- ---------------- ------------- --- --------- -- 0 192.168.1.0 55.66.77.0 1000 5000 0 164 1124 0 1380000000 -- 1 192.168.1.0 55.66.77.0 1000 5000 1 228007 228967 0 1380000010 -- 2 192.168.1.1 55.66.77.0 1001 5001 2 937308 938268 0 1380000020 -- 3 192.168.1.2 55.66.77.0 1002 5002 3 932047 933007 0 1380000030 -- 4 192.168.1.3 55.66.77.0 1003 5003 4 18772 19732 0 1380000040 -- 5 192.168.1.4 55.66.77.0 1004 5004 5 107330 108290 0 1380000050 -- 6 192.168.1.5 55.66.77.0 1005 5005 6 450805 451765 0 1380000060 -- 7 192.168.1.6 55.66.77.0 1006 5006 7 44990 45950 0 1380000070 -- 8 192.168.1.7 55.66.77.0 1007 5007 8 29031 29991 0 1380000080 -- 9 192.168.1.0 55.66.77.1 1000 5000 9 254758 255718 1 1380000090 -- 10 192.168.1.1 55.66.77.1 1001 5001 10 372799 373759 1 1380000100 -- 11 192.168.1.2 55.66.77.1 1002 5002 11 674595 675555 1 1380000110 -- 12 192.168.1.3 55.66.77.1 1003 5003 12 607498 608458 1 1380000120 -- 13 192.168.1.4 55.66.77.1 1004 5004 13 593350 594310 1 1380000130 -- 14 192.168.1.5 55.66.77.1 1005 5005 14 604620 605580 1 1380000140 -- 15 192.168.1.6 55.66.77.1 1006 5006 15 305027 305987 1 1380000150 -- 16 192.168.1.7 55.66.77.1 1007 5007 16 882158 883118 1 1380000160 -- 17 192.168.1.0 55.66.77.2 1000 5000 17 848808 849768 2 1380000170 -- 18 192.168.1.1 55.66.77.2 1001 5001 0 166423 167383 2 1380000180 -- 19 192.168.1.2 55.66.77.2 1002 5002 1 559863 560823 2 1380000190 -- Item Stats: Item No: 5 Query Size (Chars): 104 -- Result Columns: 10 Result Rows: 20 -- VM Work Steps: 268 Rows Modified: 0 -- Full Query Time: -- --- --- --- --.---- -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ SELECT * FROM flows LIMIT 2999980, 22 -- Last ~20 rows of 3M ; -- id ipLan ipWan portLan portWan protocol nbBytesDecompOut nbBytesCompIn tos timestamp -- -- ----- ----- ------- ------- -------- ---------------- ------------- --- --------- -- 2999980 192.168.1.3 55.66.77.213 1003 5003 10 139933 140893 0 1409999800 -- 2999981 192.168.1.4 55.66.77.213 1004 5004 11 72353 73313 0 1409999810 -- 2999982 192.168.1.5 55.66.77.213 1005 5005 12 532504 533464 0 1409999820 -- 2999983 192.168.1.6 55.66.77.213 1006 5006 13 326813 327773 0 1409999830 -- 2999984 192.168.1.7 55.66.77.213 1007 5007 14 849854 850814 0 1409999840 -- 2999985 192.168.1.0 55.66.77.214 1000 5000 15 182420 183380 1 1409999850 -- 2999986 192.168.1.1 55.66.77.214 1001 5001 16 415822 416782 1 1409999860 -- 2999987 192.168.1.2 55.66.77.214 1002 5002 17 529187 530147 1 1409999870 -- 2999988 192.168.1.3 55.66.77.214 1003 5003 0 926383 927343 1 1409999880 -- 2999989 192.168.1.4 55.66.77.214 1004 5004 1 376955 377915 1 1409999890 -- 2999990 192.168.1.5 55.66.77.214 1005 5005 2 259516 260476 1 1409999900 -- 2999991 192.168.1.6 55.66.77.214 1006 5006 3 1021395 1022355 1 1409999910 -- 2999992 192.168.1.7 55.66.77.214 1007 5007 4 926643 927603 1 1409999920 -- 2999993 192.168.1.0 55.66.77.215 1000 5000 5 642814 643774 2 1409999930 -- 2999994 192.168.1.1 55.66.77.215 1001 5001 6 571990 572950 2 1409999940 -- 2999995 192.168.1.2 55.66.77.215 1002 5002 7 193349 194309 2 1409999950 -- 2999996 192.168.1.3 55.66.77.215 1003 5003 8 362891 363851 2 1409999960 -- 2999997 192.168.1.4 55.66.77.215 1004 5004 9 343189 344149 2 1409999970 -- 2999998 192.168.1.5 55.66.77.215 1005 5005 10 429932 430892 2 1409999980 -- 2999999 192.168.1.6 55.66.77.215 1006 5006 11 705548 706508 2 1409999990 -- 3000000 192.168.1.7 55.66.77.215 1007 5007 12 395805 396765 2 1410000000 -- Item Stats: Item No: 6 Query Size (Chars): 65 -- Result Columns: 10 Result Rows: 21 -- VM Work Steps: 9000248 Rows Modified: 0 -- Full Query Time: 0d 00h 00m and 00.389s -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ /* Actual Query in Question */ SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item FROM flows WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN (17, 6) GROUP BY portLan ORDER BY vol DESC LIMIT 6; -- vol item -- --- ---- -- 1422217500 1007 -- 1417060608 1005 -- 1403637410 1002 -- 1395339326 1003 -- 1389292160 1000 -- 1380818028 1006 -- Item Stats: Item No: 7 Query Size (Chars): 213 -- Result Columns: 2 Result Rows: 6 -- VM Work Steps: 872434 Rows Modified: 0 -- Full Query Time: 0d 00h 00m and 00.039s // <--- Less than 0.05 seconds -- Query Result: Success. -- ------------------------------------------------------------------------------------------------ -- Script Stats: Total Script Execution Time: 0d 00h 00m and 18.221s -- Total Script Query Time: 0d 00h 00m and 17.931s -- Total Database Rows Changed: 3000001 -- Total Virtual-Machine Steps: 216873198 -- Last executed Item Index: 7 -- Last Script Error: -- ------------------------------------------------------------------------------------------------ -- 2015-03-30 16:16:30.001 | [Success] Script Success. -- 2015-03-30 16:16:33.247 | [Success] Transaction Committed. -- ------- DB-Engine Logs (Contains logged information from all DB connections during run) ------ -- [2015-03-30 16:16:11.764] APPLICATION : Script D:\Documents\TestScript3.txt started with Initialization at 16:16:11.764 on 30 March. -- ================================================================================================