I also added v.shrink_to_fit() after the resize and tried again with Size[0]
and Size[1] swapped. The time difference reappeared, with the second pass 4+
seconds faster than the first.
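
For reference, the altered loop looked roughly like this (just a sketch; the
swapped sizes and the extra shrink_to_fit call are the only changes to the
code quoted further down):

        const int Size[]={1000000000,112000000}; // swapped: 1 billion first, then 112 million

        for (int i=0; i<2; i++)
        {
                v.resize(Size[i]);
                v.shrink_to_fit();   // ask the vector to give back the excess capacity

                clock_t Start=clock();
                while (sqlite3_step(stmt)==SQLITE_ROW) {}
                std::cout << 1.0*(clock()-Start)/CLOCKS_PER_SEC << std::endl;

                sqlite3_reset(stmt);
        }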



________________________________
From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on behalf of 
Abroży Nieprzełoży <abrozynieprzelozy314...@gmail.com>
Sent: Saturday, May 26, 2018 5:39:03 PM
To: SQLite mailing list
Subject: Re: [sqlite] This is driving me nuts

> I swapped Size[0] with Size[1] so that the resize was going from large to
> small. The timings on each pass were then similar but were still around the
> 24.xxx mark associated with the i==1 pass.
Note the difference between the size and the capacity of the vector.
Reducing the size does not necessarily reduce the capacity, so the
vector may keep hold of its allocated memory when it shrinks.
You can call shrink_to_fit to request that the unused capacity be released.
http://en.cppreference.com/w/cpp/container/vector/shrink_to_fit
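
A minimal standalone illustration of the size/capacity distinction (just a
sketch, separate from your program):

    #include <cstdint>
    #include <iostream>
    #include <vector>

    int main()
    {
        std::vector<int64_t> v;
        v.resize(1000000000);  // size == 1 billion, ~8 GB allocated
        v.resize(112000000);   // size drops, but the capacity (and the ~8 GB) normally stays
        std::cout << v.size() << " " << v.capacity() << std::endl;
        v.shrink_to_fit();     // non-binding request to trim the capacity down to the size
        std::cout << v.size() << " " << v.capacity() << std::endl;
        return 0;
    }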


What's the value returned by sqlite3_memory_highwater after executing the query?
http://www.sqlite.org/c3ref/memory_highwater.html
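
For example, right after the stepping loop (sketch):

    while (sqlite3_step(stmt)==SQLITE_ROW) {}
    // 0 = report the high-water mark without resetting it
    sqlite3_int64 hw = sqlite3_memory_highwater(0);
    std::cout << "sqlite3_memory_highwater: " << hw << " bytes" << std::endl;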


2018-05-26 17:43 GMT+02:00, x :
> I'm using C++ Builder 10.2 Tokyo on Windows 10 and SQLite 3.23.1. I was
> working on code that stored RowIDs in a vector. I was sizing the vector
> beforehand and then timing how long it took to store the RowIDs returned by
> an SQLite query. By accident I sized the vector too big (by a factor of 10)
> and found the timings doubled (sometimes tripled), despite the fact that it
> was still only stepping through the same query. We're talking going from
> 20 secs to 40-60 secs.
>
> At first I thought 'memory problem', BUT as I'm using an MS Surface Pro 4
> with 16 GB RAM and a 512 GB SSD, and running a 64-bit release version, I
> thought how can that be? I mean the mistakenly sized vector only used 8 GB?
>
> I've managed to whittle the problem down to the following console
> application. The 'Big' table in the following code has just under 112
> million records and contains 2 integer columns and 4 text columns
> (full_name_1, full_name_2, sort_name_1, sort_name_2). It has one (unique)
> index on the 2 integer columns. I did a vacuum on the database beforehand.
> Apart from background tasks the console app was the only one running.
>
> #include <vcl.h>
> #include <windows.h>
> #pragma hdrstop
> #pragma argsused
> #include <tchar.h>
> #include <stdio.h>
> #include <conio.h>
> #include <ctime>     // clock(), clock_t, CLOCKS_PER_SEC
> #include <iostream>
> #include <vector>
> #include "sqlite.h"
>
> std::vector<int64_t> v;
> const int Size[]={112000000,1000000000}; // 112 million, 1 billion
> int _tmain(int argc, _TCHAR* argv[])
> {
>         sqlite3 *DB;
>         sqlite3_open("c:/SQLiteData/Formbook.db",&DB);
>         sqlite3_stmt *stmt;
>         sqlite3_prepare_v2(DB,"select RowID from Big order by RowID",-1,&stmt,NULL);
>
>         for (int i=0; i<2; i++)
>         {
>                 v.resize(Size[i]);
>
>                 // NB the exact same code is executed whether i is 0 or 1.
>                 // The only thing that changes is the size() of v and v
>                 // isn't even used in the timed code below.
>
>                 clock_t Start=clock();
>
>                 while (sqlite3_step(stmt)==SQLITE_ROW) {}
>
>                 // Above just steps through stmt (111,724,900 steps to be exact).
>
>                 std::cout << 1.0*(clock()-Start)/CLOCKS_PER_SEC << std::endl;
>
>                 sqlite3_reset(stmt);
>         }
>         sqlite3_finalize(stmt);
>         sqlite3_close(DB);
>         getch();
>         return 0;
> }
>
> 5 sets of timings (secs) were as follows
>
> i==0      i==1
> 17.610    24.172
> 20.344    24.594
> 19.953    24.375
> 19.891    23.594
> 19.938    25.516
>
> I can't understand why the second pass (exact same code executed) takes an
> average of 4.8 secs longer.
>
> To add to the puzzle
>
> I tried making v an int64_t*, replaced the resize with v=new
> int64_t[Size[i]] and added delete [] v at the end of the inner block. Pretty
> much the same thing as far as memory goes, yet the average timings for i==0
> and i==1 were almost identical at around 17.4 secs.
>
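
[Presumably that raw-array variant looked something like the following sketch,
with the global declaration changed to int64_t *v. Note that new[] leaves the
elements uninitialised, whereas resize() value-initialises every element it
adds.]

        v=new int64_t[Size[i]];

        clock_t Start=clock();
        while (sqlite3_step(stmt)==SQLITE_ROW) {}
        std::cout << 1.0*(clock()-Start)/CLOCKS_PER_SEC << std::endl;

        sqlite3_reset(stmt);
        delete [] v;
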
> I tried replacing the SQLite-related code with non-SQLite code, e.g.
> populating v up to v.begin()+112000000 using rand(). Again the timing
> anomaly disappeared.
>
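
[Presumably along the lines of: std::generate(v.begin(), v.begin()+112000000, std::rand);
which needs <algorithm> and <cstdlib>.]
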
> I swapped Size[0] with Size[1] so that the resize was going from large to
> small. The timings on each pass were then similar but were still around the
> 24.xxx mark associated with the i==1 pass.
>
> I copied the Big table from its current db to a new db where it was the only
> table. I then redid the timings using the new db but there was no real
> change.
>
> i==0      i==1
> 17.594    25.672
> 20.563    24.406
> 21.219    23.843
> 20.484    25.343
> 20.562    25.172
>
> I lastly tried gradually lowering Size[1] from 1 billion to 200 million in
> steps of 200 million. There was little difference in the results for
> Size[1]=800,000,000, but at Size[1]=600,000,000 the difference was down to
> under a second. At Size[1]=400,000,000 the timings for i==0 and i==1 were
> pretty much the same.
>
>
> While that last test suggests it must be a memory issue, it raises the
> question of why there was no difference in timings for i==0 & i==1 when an
> array was used rather than a vector. Surely the memory requirements are very
> similar?
>
> The timing anomaly was still apparent when Size[1]=800 million, so why, when
> you've got 16 GB of RAM, does a 6.4 GB vector cause any problems?
>
> What's it got to do with SQLite, if anything? Why was sqlite3_step slowed
> down?
>
> Any suggestions appreciated.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users