The title mentions mobile devices for the sole reason - the criteria for the
performance are much higher than in the desktop world.
For example our application (where we consider adding FTS) may run into
memory problems on some weaker devices when we increase memory consumption
by approx. 10 MB. In other words, every MB saved counts.

I constructed a rather complex test of various activities on a sample table
(accounts) and an FTS index built over selected table columns. The purpose
was to select best policies and identify potential risks.

Main steps incl. the duration and max. memory used by SQLite during that
step:
 1. (0.06 sec, 1.6 MB)  Dropping old FTS table (if any)
 2. (0.05 sec, 2.5 MB)  Deleting all account records (the table is empty at
the end)
 3. (2.31 sec, 2.5 MB)  Bulk insert of accounts
 4. (0.03 sec, 2.5 MB)  Creating (empty) FTS index
 5. (1.51 sec, 7.4 MB)  Rebuilding FTS index
 6. (4.42 sec, 8.0 MB)  Deleting all accounts (FTS online, i.e. triggers
automatically update FTS index) *)
 7. (7.49 sec, 2.7 MB)  Bulk insert of accounts. (again, FTS is online)
 8. (0.02 sec, 2.6 MB)  Testing FTS index using sample FTS query
 9. (0.69 sec, 2.7 MB)  FTS index optimization (INSERT INTO
rs_FTSEmails_email(rs_FTSEmails_email) VALUES('optimize'))
10. (0.08 sec, 2.8 MB)  Drop FTS index and accounts
11. (0.01 sec, 2.5 MB)  Creating FTS index
12. (6.99 sec, 2.7 MB)  Bulk insert of accounts (FTS online)

*) SQLITE CRASH
When I added FTS index optimization after the accounts table was emptied, I
got a crash:
 Access violation in fts3SegWriterFlush: pWriter=NULL
 Called by fts3SegmentMerge: pWriter=NULL
 By a brief code inspection it looks like pWriter cannot be NULL, but that's
what the debugger shows.

DB details:
- Account table has 10000 records and occupies around 3.2 MB
- Bulk insert was done with the script created by sqlite shell dump command.
(without CREATE TABLE, wrapped in a transaction)
- FTS index is external content FTS4 table over selected acount columns
- Creating FTS index: Includes also triggers maintaining this index (as
described in http://www.sqlite.org/fts3.html, chapter 6.2.2).

Test setup:
 SQLite v3.7.15.2
 DB setup: WAL, cache_size=2000, page_size=1024, private NOCASE collation
used on all text columns.
 Hardware: Desktop computer (W7, x64, intel i5 2.4GHz)
 The test was run by modified sqlite shell utility.

Below is the summary of the most important findings. In case of interest I
can add detailed test output.


***** MEMORY CONSUMPTION *****

Memory-critical actions:
- Rebuilding FTS index from scratch (step 5) used 7.4 MB.
- Deleting accounts with online FTS index (step 6) used 8 MB.

Memory consumption for all other actions was around 2.5 MB. (The full page
cache takes ~2.3 MB.)
This includes FTS queries, but also:
- (!) Bulk insert of accounts with online FTS index. One intuitively feels
that this should be the most demanding action.

Another thing of interest is that after all accounts were deleted, the FTS
index (although empty) apparently occupies the same space as before.
It is a question what does this mean for the future.


***** SPEED *****

Operations using triggers are very slow. (Too slow for my expectation.)
E.g. bulk insert of accounts with online FTS (step 7) takes 7.49 sec, while
the same work done in 2 steps (steps 3-5) takes 3.82 sec.

On a positive note, FTS queries are really substantially faster than similar
LIKE queries.
(Following test with a huge table showed 500x speedup.)


***** TESTS WIT A HUGE DB *****

Email table has 10000 records and occupies around 100 MB.

 1. (0.39 sec, 2.5 MB) Dropping old FTS table (if any)
 2. (0.04 sec, 2.4 MB) Creating FTS index
 3. (24.8 sec, 7.4 MB) Rebuilding FTS index
 4. (54 sec, 8.5 MB)   SELECT *, length(block) AS len_blob FROM
rs_FTSEmails_email_segments ORDER BY len_blob DESC LIMIT 5
 5. (0.06 sec, 3.4 MB) Sample FTS query 'ivan -resco*'
 6. (31.9 sec, 2.7 MB) SQL query using LIKE
 7. (5.52 sec, 7.9 MB) FTS index optimization
 8. (0.05 sec, 3.4 MB) Sample FTS query 'ivan -resco*'
 9. (36 sec, 13.3 MB)  SELECT *, length(block) AS len_blob FROM
rs_FTSEmails_email_segments ORDER BY len_blob DESC LIMIT 5

Influence of the FTS index optimization:
- The optimization itself takes 5.52 sec.
- FTS query seems to be faster, but the effect is hardly measurable
- Larger memory consumption for some queries.
   Possible reason: rs_FTSEmails_email_segments table (internal FTS table)
has larger blobs (max. size 1.49MB instead of 670K for non-optimized index).

Here are some data collected in older tests. The numbers represent seconds.
 Bulk insert of emails (FTS offline) 8.27, 8.64, 9.64
 FTSindex creation     22.56, 22.50, 23.49
 FTSindex optimization    3.92, 3.92, 3.84
 FTSindex integrity check   12.85, 12.71, 12.68
 Bulk insert of emails (FTS online) 64.48


***** SUMMARY *****

(Note that above times refer to desktop; mobile devices will be a few times
slower.)

1. Triggers are memory-friendly, but too slow. They can be used for small
changes only.
In case of bulk actions, the triggers should be dropped and the FTS index
rebuilt after the action completes.

2. Rebuilding FTS index takes a lot of memory, whereby the amount does not
depend on the table size.
   Question: Would it be possible to reduce this number? Apparently FTS
index can be built with less memory.

3. Don't use FTS index optimization. Small gains in terms of speed, risk of
high memory consumption.

4. There is some problem in SQLite FTS code as the described crash proves.
(Might relate to empty index only)

5. DB size: This is not described above, but it is important to realize,
that SQLite DB does not shrink.
   FTS index takes compareble space to the data being indexed and it will be
stored first in the WAL log, then in the DB itself.
   In the worst case the DB size may blow up 2-3x.
   Solution for the WAL log: PRAGMA journal_size_limit=10485760 (fast and
easy)
   Solution for the DB: VACUUM (slow, that's way difficult)

Any questions/comments are welcome.


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to