On 2015-03-30 11:46 AM, Jeff Roux wrote:
> Hi everyone,
>
> I have a daemon that collects information and stores it in a SQLite
> database. The table has 1 million rows.
>
> This daemon is running on a HP server with 12 cores, 32 GB of RAM,
> and a SSD drive. I have performance issues with some requests. For
> instance, the following request takes more than 5 seconds to
> accomplish with SQlite3 (in that particular case, the WHERE clause
> selects all the data in the database, i.e. 1000000 rows):
>
> SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item
> FROM flows
> WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN (17, 6)
> GROUP BY portLan ORDER BY vol DESC LIMIT 6;
>
> I have done some tests with or without "INDEXED BY" clauses and got
> nearly the same results.
>
> I compared the performance with a mySQL and the same request takes
> less than 1 second to accomplish.
>
> Could you give me some directions to optimize this kind of request
> with SQlite3 when there is a big amount of data in the table ? I
> need to increase 3 times the number of rows and the performance will
> become unacceptable for my application.
I think something else is wrong, we might need your Schema to better
judge. The reason I say something else is wrong, is that I made this
script quick to reproduce what I could establish from your writings
might be what the schema looks like, and then added a single index and
tried the query.
I deliberately added a primary key and then did not use it at all to
avoid those advantages as it may not suit your needs.
You will see I used 3 million rows in stead of 1 million (because you
said your data may grow three times bigger) and some random values in
the appropriate ranges to fake the data (where I had an idea, and
guessed the rest). The query takes less than 50 milliseconds, even over
3 million rows. In fact the entire script creating the DB, inserting 3
million rows, creating an index, and then doing 3 queries all took only
18 seconds in total (without running analyze). This will even go a bit
faster if I turn stats off, but I left it in for clarity.
If this does not enlighten your quest, kindly provide your full schema
and example data shapes which might give us a clue.
NOTE: In my data it works out that about ~12K rows satisfies the where
clause - if this is significantly more in your case it might change the
timing a bit, but nothing like your suggested times.
See results below:
-- Processing Script for File: D:\Documents\TestScript3.txt
-- Script Items: 7 Parameter Count: 0
-- 2015-03-30 16:16:11.795 | [Success] Script Started...
--
================================================================================================
DROP TABLE IF EXISTS flows;
-- Item Stats: Item No: 1 Query Size
(Chars): 29
-- VM Work Steps: 92 Rows
Modified: 0
-- Full Query Time: 0d 00h 00m and 00.485s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
CREATE TABLE flows(
id INTEGER PRIMARY KEY,
ipLan TEXT,
ipWan TEXT,
portLan INT,
portWan INT,
protocol INT,
nbBytesDecompOut INT,
nbBytesCompIn INT,
tos INT,
timestamp INT
);
-- Item Stats: Item No: 2 Query Size
(Chars): 207
-- VM Work Steps: 31 Rows
Modified: 0
-- Full Query Time: -- --- --- --- --.----
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
/* POPULATE the table with random + scaled data to emulate an actual
dataset of 3M rows */
WITH acc(x,mx8,dx8,mxK,dxK,rK) AS (
SELECT 0,0,0,0,0,100
UNION ALL
SELECT x+1, (x%8), CAST(round(x/8,0) AS INT), (x%1024),
CAST(round(x/1024,0) AS INT),
CAST(abs(round(random()/1024000000000,0)) AS INT)
FROM acc
WHERE (x<3000000) -- Testing 3 million rows
)
INSERT INTO flows SELECT
x, '192.168.1.'||mx8, '55.66.77.'||(dx8%256), -- ipLan, ipWan
1000+mx8, 5000+mx8, (x%18), -- portlan, portWan,
protocol
64+(rk%1024000), -- nbBytesDecompOut
1024+(rk%1024000), -- nbBytesDecompIn
(dx8%3), (1380000000+(x*10)) -- tos, timestamp
FROM acc;
-- Item Stats: Item No: 3 Query Size
(Chars): 691
-- Result Columns: 0 Result
Rows: 0
-- VM Work Steps: 180000087 Rows
Modified: 3000001
-- Full Query Time: 0d 00h 00m and 12.849s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
CREATE INDEX idxFlowTimeProt ON flows (timestamp, protocol);
-- Item Stats: Item No: 4 Query Size
(Chars): 62
-- VM Work Steps: 27000038 Rows
Modified: 0
-- Full Query Time: 0d 00h 00m and 04.169s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
SELECT * FROM flows LIMIT 20 -- Test to see if Data actually populated
as intended - First 20 rows
;
-- id ipLan ipWan portLan portWan protocol
nbBytesDecompOut nbBytesCompIn tos timestamp
-- -- ----- ----- ------- ------- --------
---------------- ------------- --- ---------
-- 0 192.168.1.0 55.66.77.0 1000 5000 0 164 1124
0 1380000000
-- 1 192.168.1.0 55.66.77.0 1000 5000 1 228007
228967 0 1380000010
-- 2 192.168.1.1 55.66.77.0 1001 5001 2 937308
938268 0 1380000020
-- 3 192.168.1.2 55.66.77.0 1002 5002 3 932047
933007 0 1380000030
-- 4 192.168.1.3 55.66.77.0 1003 5003 4 18772
19732 0 1380000040
-- 5 192.168.1.4 55.66.77.0 1004 5004 5 107330
108290 0 1380000050
-- 6 192.168.1.5 55.66.77.0 1005 5005 6 450805
451765 0 1380000060
-- 7 192.168.1.6 55.66.77.0 1006 5006 7 44990
45950 0 1380000070
-- 8 192.168.1.7 55.66.77.0 1007 5007 8 29031
29991 0 1380000080
-- 9 192.168.1.0 55.66.77.1 1000 5000 9 254758
255718 1 1380000090
-- 10 192.168.1.1 55.66.77.1 1001 5001 10 372799
373759 1 1380000100
-- 11 192.168.1.2 55.66.77.1 1002 5002 11 674595
675555 1 1380000110
-- 12 192.168.1.3 55.66.77.1 1003 5003 12 607498
608458 1 1380000120
-- 13 192.168.1.4 55.66.77.1 1004 5004 13 593350
594310 1 1380000130
-- 14 192.168.1.5 55.66.77.1 1005 5005 14 604620
605580 1 1380000140
-- 15 192.168.1.6 55.66.77.1 1006 5006 15 305027
305987 1 1380000150
-- 16 192.168.1.7 55.66.77.1 1007 5007 16 882158
883118 1 1380000160
-- 17 192.168.1.0 55.66.77.2 1000 5000 17 848808
849768 2 1380000170
-- 18 192.168.1.1 55.66.77.2 1001 5001 0 166423
167383 2 1380000180
-- 19 192.168.1.2 55.66.77.2 1002 5002 1 559863
560823 2 1380000190
-- Item Stats: Item No: 5 Query Size
(Chars): 104
-- Result Columns: 10 Result
Rows: 20
-- VM Work Steps: 268 Rows
Modified: 0
-- Full Query Time: -- --- --- --- --.----
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
SELECT * FROM flows LIMIT 2999980, 22 -- Last ~20 rows of 3M
;
-- id ipLan ipWan portLan portWan protocol
nbBytesDecompOut nbBytesCompIn tos timestamp
-- -- ----- ----- ------- ------- --------
---------------- ------------- --- ---------
-- 2999980 192.168.1.3 55.66.77.213 1003 5003 10
139933 140893 0 1409999800
-- 2999981 192.168.1.4 55.66.77.213 1004 5004 11
72353 73313 0 1409999810
-- 2999982 192.168.1.5 55.66.77.213 1005 5005 12
532504 533464 0 1409999820
-- 2999983 192.168.1.6 55.66.77.213 1006 5006 13
326813 327773 0 1409999830
-- 2999984 192.168.1.7 55.66.77.213 1007 5007 14
849854 850814 0 1409999840
-- 2999985 192.168.1.0 55.66.77.214 1000 5000 15
182420 183380 1 1409999850
-- 2999986 192.168.1.1 55.66.77.214 1001 5001 16
415822 416782 1 1409999860
-- 2999987 192.168.1.2 55.66.77.214 1002 5002 17
529187 530147 1 1409999870
-- 2999988 192.168.1.3 55.66.77.214 1003 5003 0
926383 927343 1 1409999880
-- 2999989 192.168.1.4 55.66.77.214 1004 5004 1
376955 377915 1 1409999890
-- 2999990 192.168.1.5 55.66.77.214 1005 5005 2
259516 260476 1 1409999900
-- 2999991 192.168.1.6 55.66.77.214 1006 5006 3
1021395 1022355 1 1409999910
-- 2999992 192.168.1.7 55.66.77.214 1007 5007 4
926643 927603 1 1409999920
-- 2999993 192.168.1.0 55.66.77.215 1000 5000 5
642814 643774 2 1409999930
-- 2999994 192.168.1.1 55.66.77.215 1001 5001 6
571990 572950 2 1409999940
-- 2999995 192.168.1.2 55.66.77.215 1002 5002 7
193349 194309 2 1409999950
-- 2999996 192.168.1.3 55.66.77.215 1003 5003 8
362891 363851 2 1409999960
-- 2999997 192.168.1.4 55.66.77.215 1004 5004 9
343189 344149 2 1409999970
-- 2999998 192.168.1.5 55.66.77.215 1005 5005 10
429932 430892 2 1409999980
-- 2999999 192.168.1.6 55.66.77.215 1006 5006 11
705548 706508 2 1409999990
-- 3000000 192.168.1.7 55.66.77.215 1007 5007 12
395805 396765 2 1410000000
-- Item Stats: Item No: 6 Query Size
(Chars): 65
-- Result Columns: 10 Result
Rows: 21
-- VM Work Steps: 9000248 Rows
Modified: 0
-- Full Query Time: 0d 00h 00m and 00.389s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
/* Actual Query in Question */
SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item
FROM flows
WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN
(17, 6)
GROUP BY portLan ORDER BY vol DESC
LIMIT 6;
-- vol item
-- --- ----
-- 1422217500 1007
-- 1417060608 1005
-- 1403637410 1002
-- 1395339326 1003
-- 1389292160 1000
-- 1380818028 1006
-- Item Stats: Item No: 7 Query Size
(Chars): 213
-- Result Columns: 2 Result
Rows: 6
-- VM Work Steps: 872434 Rows
Modified: 0
-- Full Query Time: 0d 00h 00m and 00.039s
// <--- Less than 0.05 seconds
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
-- Script Stats: Total Script Execution Time: 0d 00h 00m and
18.221s
-- Total Script Query Time: 0d 00h 00m and
17.931s
-- Total Database Rows Changed: 3000001
-- Total Virtual-Machine Steps: 216873198
-- Last executed Item Index: 7
-- Last Script Error:
--
------------------------------------------------------------------------------------------------
-- 2015-03-30 16:16:30.001 | [Success] Script Success.
-- 2015-03-30 16:16:33.247 | [Success] Transaction Committed.
-- ------- DB-Engine Logs (Contains logged information from all DB
connections during run) ------
-- [2015-03-30 16:16:11.764] APPLICATION : Script
D:\Documents\TestScript3.txt started with Initialization at 16:16:11.764
on 30 March.
--
================================================================================================