On 2015-07-03 12:16 AM, Rob Willett wrote:
> To try and isolate the problem away from the Perl program, we then generated 
> the 5,000 SQL calls that would be made by the Perl program into a file. e.g
>
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'traffic is 
> moving well on diversion.';
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'approach 
> with caution.';
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'traffic is 
> currently flowing well in the area.';
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'traffic is 
> slow moving past the restrictions.';
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'traffic is 
> slow moving on approach to closures in both directions.';
> select Id,Calculation,Peak,Red,Amber,Green from RAG where text = 'expect 
> delays.';
>
> ... similar but slightly different SQL for the next 5,000 lines.
>
> We then piped that straight into sqlite and timed the output.
>
> time cat ,test1 | sqlite3 db.sqlite > /dev/null
>
> real  0m49.610s
> user  0m31.328s
> sys   0m18.272s
>
> This took around 50 seconds which is faster than the Perl program (122 
> seconds). Since the Perl program has to read in a 10 x 5MB XML file, build up 
> XML structures, walk the XML, extract lines from the XML and do lots of other 
> programming stuff this doesn't seem unreasonable.
>
> Whilst piping a large amount of SQL into sqlite doesn't feel quite right, 
> we're struggling with how else we can benchmark this to see if we have a 
> problem or not. All suggestions welcomed.

I can tell you that the SQLite3 command line program is extremely fast and 
well optimised (and uses threading), so it is no surprise that the same 
queries take much longer in other programs that go through a library or 
even compiled-in SQLite code. If it takes 50 seconds piped into SQLite, 
then 100+ seconds in Perl is not completely weird.

You say that the SELECT on the table is taking 13ms/call, but I assume 
you mean the SELECT statement takes 13ms to run from start to finish and 
copy all of the selected DB content into Perl text arrays... this may be 
where the bottleneck is. How many lines (records) are returned by the 
queries on average? For 10K+ returned lines, 13ms would be a fantastic 
result - but on a 43K row table I doubt you are getting that many back.
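
One quick way to check from the shell is to count how many rows match 
each distinct text (this assumes nothing beyond the column names you 
already posted):

select text, count(*) from RAG group by text order by count(*) desc limit 10;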

The most immediate optimisation I can see is declaring the NOCASE 
collation on the column itself (as opposed to only on the index). This 
avoids running the collation function on every compared item. SQLite may 
already optimise this - I don't know whether it does - but it's worth 
testing. Another thing I would do is add a column that holds an integer 
token for each string; these look like standard, repeating phrases that 
can be reduced to a simple integer, which is much faster to look up than 
a long string comparison.
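
For the collation, that would mean declaring NOCASE on the column itself; 
a rough sketch only, since I'm guessing at your actual column types and 
you would have to recreate (or copy) the table to change this:

CREATE TABLE RAG (
  Id          INTEGER PRIMARY KEY,
  Calculation TEXT,             -- column types here are guesses
  Peak        TEXT,
  Red         TEXT,
  Amber       TEXT,
  Green       TEXT,
  text        TEXT COLLATE NOCASE   -- NOCASE on the column; the index inherits it
);
CREATE INDEX RAG_text_idx ON RAG(text);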

You could add the token replacement in Perl or SQL, and update the 
current DB with something as simple as:
UPDATE RAG SET token = 1 WHERE text = 'traffic is moving well on diversion.';
UPDATE RAG SET token = 2 WHERE text = 'approach with caution.';
etc.
Then create an index for token (a sketch of the statements follows below) 
and select like this:
select Id,Calculation,Peak,Red,Amber,Green from RAG where token = 2;
and so on.
The gain will be minimal for any single select, but over 5K selects it 
should add up to something noticeable.
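
For completeness, adding the token column and its index would be 
something along these lines (the names are just placeholders):

ALTER TABLE RAG ADD COLUMN token INTEGER;
CREATE INDEX RAG_token_idx ON RAG(token);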

If it isn't already part of your usual maintenance, try re-packing the 
DB file using VACUUM.
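
From the shell that is just (using the same db.sqlite as in your timing test):

sqlite3 db.sqlite "VACUUM;"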

In normal code we would of course prepare the statements once and re-use 
them, shaving off another significant bit of time, but I'm not sure 
whether the Perl interface allows for this or how much it would save on 
such straightforward queries.
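
I don't use Perl myself, but if you are going through DBI it should 
support placeholders and statement re-use; a minimal sketch only, 
assuming DBD::SQLite, your db.sqlite file and the message texts pulled 
out of the XML:

use DBI;

# connect once; RaiseError makes DBI die on errors instead of failing quietly
my $dbh = DBI->connect("dbi:SQLite:dbname=db.sqlite", "", "",
                       { RaiseError => 1 });

# prepare the statement once, with a placeholder for the text value
my $sth = $dbh->prepare(
    "select Id,Calculation,Peak,Red,Amber,Green from RAG where text = ?");

# in your program these would come from the XML walk
my @texts = ('traffic is moving well on diversion.', 'approach with caution.');

# re-use the same prepared statement for every lookup
for my $text (@texts) {
    $sth->execute($text);
    while (my $row = $sth->fetchrow_arrayref) {
        # @$row is (Id, Calculation, Peak, Red, Amber, Green)
    }
}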

Looking at your system specs, it's hard to imagine your results NOT 
being significantly faster than the quoted test on the SQLite site 
(since then, even SQLite itself has undergone many significant 
performance improvements). If the above still doesn't make it much 
faster, I'd start checking for hardware bottlenecks or other settings 
that might influence the speed.

