On 2015-03-30 11:46 AM, Jeff Roux wrote:
> Hi everyone,
>
> I have a daemon that collects information and stores it in a SQLite
> database. The table has 1 million rows.
>
> This daemon is running on a HP server with 12 cores, 32 GB of RAM,
> and a SSD drive. I have performance issues with some requests. For
> instance, the following request takes more than 5 seconds to
> accomplish with SQlite3 (in that particular case, the WHERE clause
> selects all the data in the database, i.e. 1000000 rows):
>
> SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item
> FROM  flows
> WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN (17, 6)
> GROUP BY portLan ORDER BY vol DESC LIMIT 6;
>
> I have done some tests with or without "INDEXED BY" clauses and got
> nearly the same results.
>
> I compared the performance with a mySQL and the same request takes
> less than 1 second to accomplish.
>
> Could you give me some directions to optimize this kind of request
> with SQlite3 when there is a big amount of data in the table ? I
> need to increase 3 times the number of rows and the performance will
> become unacceptable for my application.

I think something else is wrong, we might need your Schema to better 
judge. The reason I say something else is wrong, is that I made this 
script quick to reproduce what I could establish from your writings 
might be what the schema looks like, and then added a single index and 
tried the query.

I deliberately added a primary key and then did not use it at all to 
avoid those advantages as it may not suit your needs.

You will see I used 3 million rows in stead of 1 million (because you 
said your data may grow three times bigger) and some random values in 
the appropriate ranges to fake the data (where I had an idea, and 
guessed the rest). The query takes less than 50 milliseconds, even over 
3 million rows. In fact the entire script creating the DB, inserting 3 
million rows, creating an index, and then doing 3 queries all took only 
18 seconds in total (without running analyze).  This will even go a bit 
faster if I turn stats off, but I left it in for clarity.

If this does not enlighten your quest, kindly provide your full schema 
and example data shapes which might give us a clue.

NOTE: In my data it works out that about ~12K rows satisfies the where 
clause - if this is significantly more in your case it might change the 
timing a bit, but nothing like your suggested times.
See results below:

   -- Processing Script for File: D:\Documents\TestScript3.txt
   -- Script Items: 7          Parameter Count: 0

   -- 2015-03-30 16:16:11.795  |  [Success]    Script Started...
   -- 
================================================================================================

DROP TABLE IF EXISTS flows;

   --    Item Stats:  Item No:           1             Query Size 
(Chars):  29
   --                 VM Work Steps:     92            Rows 
Modified:       0
   --                 Full Query Time:   0d 00h 00m and 00.485s
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

CREATE TABLE flows(
   id INTEGER PRIMARY KEY,
   ipLan TEXT,
   ipWan TEXT,
   portLan INT,
   portWan INT,
   protocol INT,
   nbBytesDecompOut INT,
   nbBytesCompIn INT,
   tos INT,
   timestamp INT
);

   --    Item Stats:  Item No:           2             Query Size 
(Chars):  207
   --                 VM Work Steps:     31            Rows 
Modified:       0
   --                 Full Query Time:   -- --- --- --- --.----
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

/* POPULATE the table with random + scaled data to emulate an actual 
dataset of 3M rows */

WITH acc(x,mx8,dx8,mxK,dxK,rK) AS (
     SELECT 0,0,0,0,0,100
   UNION ALL
     SELECT x+1, (x%8), CAST(round(x/8,0) AS INT), (x%1024), 
CAST(round(x/1024,0) AS INT),
            CAST(abs(round(random()/1024000000000,0)) AS INT)
       FROM acc
      WHERE (x<3000000)  -- Testing 3 million rows
)
INSERT INTO flows SELECT
   x, '192.168.1.'||mx8, '55.66.77.'||(dx8%256),  -- ipLan, ipWan
   1000+mx8, 5000+mx8, (x%18),                    -- portlan, portWan, 
protocol
   64+(rk%1024000),                               -- nbBytesDecompOut
   1024+(rk%1024000),                             -- nbBytesDecompIn
   (dx8%3), (1380000000+(x*10))                   -- tos, timestamp
   FROM acc;


   --    Item Stats:  Item No:           3             Query Size 
(Chars):  691
   --                 Result Columns:    0             Result 
Rows:         0
   --                 VM Work Steps:     180000087      Rows 
Modified:       3000001
   --                 Full Query Time:   0d 00h 00m and 12.849s
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

CREATE INDEX idxFlowTimeProt ON flows (timestamp, protocol);

   --    Item Stats:  Item No:           4             Query Size 
(Chars):  62
   --                 VM Work Steps:     27000038      Rows 
Modified:       0
   --                 Full Query Time:   0d 00h 00m and 04.169s
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

SELECT * FROM flows LIMIT 20  -- Test to see if Data actually populated 
as intended - First 20 rows
;

   -- id    ipLan    ipWan    portLan    portWan    protocol 
nbBytesDecompOut    nbBytesCompIn    tos    timestamp
   -- --    -----    -----    -------    -------    -------- 
----------------    -------------    ---    ---------
   -- 0    192.168.1.0    55.66.77.0    1000    5000    0    164 1124    
0    1380000000
   -- 1    192.168.1.0    55.66.77.0    1000    5000    1 228007    
228967    0    1380000010
   -- 2    192.168.1.1    55.66.77.0    1001    5001    2 937308    
938268    0    1380000020
   -- 3    192.168.1.2    55.66.77.0    1002    5002    3 932047    
933007    0    1380000030
   -- 4    192.168.1.3    55.66.77.0    1003    5003    4    18772 
19732    0    1380000040
   -- 5    192.168.1.4    55.66.77.0    1004    5004    5 107330    
108290    0    1380000050
   -- 6    192.168.1.5    55.66.77.0    1005    5005    6 450805    
451765    0    1380000060
   -- 7    192.168.1.6    55.66.77.0    1006    5006    7    44990 
45950    0    1380000070
   -- 8    192.168.1.7    55.66.77.0    1007    5007    8    29031 
29991    0    1380000080
   -- 9    192.168.1.0    55.66.77.1    1000    5000    9 254758    
255718    1    1380000090
   -- 10    192.168.1.1    55.66.77.1    1001    5001    10 372799    
373759    1    1380000100
   -- 11    192.168.1.2    55.66.77.1    1002    5002    11 674595    
675555    1    1380000110
   -- 12    192.168.1.3    55.66.77.1    1003    5003    12 607498    
608458    1    1380000120
   -- 13    192.168.1.4    55.66.77.1    1004    5004    13 593350    
594310    1    1380000130
   -- 14    192.168.1.5    55.66.77.1    1005    5005    14 604620    
605580    1    1380000140
   -- 15    192.168.1.6    55.66.77.1    1006    5006    15 305027    
305987    1    1380000150
   -- 16    192.168.1.7    55.66.77.1    1007    5007    16 882158    
883118    1    1380000160
   -- 17    192.168.1.0    55.66.77.2    1000    5000    17 848808    
849768    2    1380000170
   -- 18    192.168.1.1    55.66.77.2    1001    5001    0 166423    
167383    2    1380000180
   -- 19    192.168.1.2    55.66.77.2    1002    5002    1 559863    
560823    2    1380000190

   --    Item Stats:  Item No:           5             Query Size 
(Chars):  104
   --                 Result Columns:    10            Result 
Rows:         20
   --                 VM Work Steps:     268           Rows 
Modified:       0
   --                 Full Query Time:   -- --- --- --- --.----
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

SELECT * FROM flows LIMIT 2999980, 22 -- Last ~20 rows of 3M
;

   -- id    ipLan    ipWan    portLan    portWan    protocol 
nbBytesDecompOut    nbBytesCompIn    tos    timestamp
   -- --    -----    -----    -------    -------    -------- 
----------------    -------------    ---    ---------
   -- 2999980    192.168.1.3    55.66.77.213    1003    5003    10 
139933    140893    0    1409999800
   -- 2999981    192.168.1.4    55.66.77.213    1004    5004    11 
72353    73313    0    1409999810
   -- 2999982    192.168.1.5    55.66.77.213    1005    5005    12 
532504    533464    0    1409999820
   -- 2999983    192.168.1.6    55.66.77.213    1006    5006    13 
326813    327773    0    1409999830
   -- 2999984    192.168.1.7    55.66.77.213    1007    5007    14 
849854    850814    0    1409999840
   -- 2999985    192.168.1.0    55.66.77.214    1000    5000    15 
182420    183380    1    1409999850
   -- 2999986    192.168.1.1    55.66.77.214    1001    5001    16 
415822    416782    1    1409999860
   -- 2999987    192.168.1.2    55.66.77.214    1002    5002    17 
529187    530147    1    1409999870
   -- 2999988    192.168.1.3    55.66.77.214    1003    5003    0 
926383    927343    1    1409999880
   -- 2999989    192.168.1.4    55.66.77.214    1004    5004    1 
376955    377915    1    1409999890
   -- 2999990    192.168.1.5    55.66.77.214    1005    5005    2 
259516    260476    1    1409999900
   -- 2999991    192.168.1.6    55.66.77.214    1006    5006    3 
1021395    1022355    1    1409999910
   -- 2999992    192.168.1.7    55.66.77.214    1007    5007    4 
926643    927603    1    1409999920
   -- 2999993    192.168.1.0    55.66.77.215    1000    5000    5 
642814    643774    2    1409999930
   -- 2999994    192.168.1.1    55.66.77.215    1001    5001    6 
571990    572950    2    1409999940
   -- 2999995    192.168.1.2    55.66.77.215    1002    5002    7 
193349    194309    2    1409999950
   -- 2999996    192.168.1.3    55.66.77.215    1003    5003    8 
362891    363851    2    1409999960
   -- 2999997    192.168.1.4    55.66.77.215    1004    5004    9 
343189    344149    2    1409999970
   -- 2999998    192.168.1.5    55.66.77.215    1005    5005    10 
429932    430892    2    1409999980
   -- 2999999    192.168.1.6    55.66.77.215    1006    5006    11 
705548    706508    2    1409999990
   -- 3000000    192.168.1.7    55.66.77.215    1007    5007    12 
395805    396765    2    1410000000

   --    Item Stats:  Item No:           6             Query Size 
(Chars):  65
   --                 Result Columns:    10            Result 
Rows:         21
   --                 VM Work Steps:     9000248       Rows 
Modified:       0
   --                 Full Query Time:   0d 00h 00m and 00.389s
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

/* Actual Query in Question */

SELECT SUM(nbBytesDecompOut + nbBytesCompIn) as vol, portLan as item
   FROM  flows
  WHERE timestamp>=1383770600 AND timestamp<=1384770600 AND protocol IN 
(17, 6)
  GROUP BY portLan ORDER BY vol DESC
  LIMIT 6;

   -- vol    item
   -- ---    ----
   -- 1422217500    1007
   -- 1417060608    1005
   -- 1403637410    1002
   -- 1395339326    1003
   -- 1389292160    1000
   -- 1380818028    1006

   --    Item Stats:  Item No:           7             Query Size 
(Chars):  213
   --                 Result Columns:    2             Result 
Rows:         6
   --                 VM Work Steps:     872434        Rows 
Modified:       0
   --                 Full Query Time:   0d 00h 00m and 00.039s         
    // <---  Less than  0.05  seconds
   --                 Query Result:      Success.
   -- 
------------------------------------------------------------------------------------------------

   --   Script Stats: Total Script Execution Time:     0d 00h 00m and 
18.221s
   --                 Total Script Query Time:         0d 00h 00m and 
17.931s
   --                 Total Database Rows Changed:     3000001
   --                 Total Virtual-Machine Steps:     216873198
   --                 Last executed Item Index:        7
   --                 Last Script Error:
   -- 
------------------------------------------------------------------------------------------------

   -- 2015-03-30 16:16:30.001  |  [Success]    Script Success.
   -- 2015-03-30 16:16:33.247  |  [Success]    Transaction Committed.
   -- -------  DB-Engine Logs (Contains logged information from all DB 
connections during run)  ------
   -- [2015-03-30 16:16:11.764] APPLICATION : Script 
D:\Documents\TestScript3.txt started with Initialization at 16:16:11.764 
on 30 March.
   -- 
================================================================================================


Reply via email to