"Brett Wilson" <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I'm working on using sqlite for Firefox's bookmarks and history
> system. We need to expire history older than N days when the browser
> shuts down. It's currently slower than I would like and I'd appreciate
> any tricks you might be able to suggest.
> 
> Database layout: We have a table of URLs and titles called
> moz_history. Then there is a separate table called moz_historyvisit
> that associates a date and some other stats with the ID of the correct
> entry in moz_history. There may be many visits for each URL (each time
> you visit the page). We have a table of bookmarks that references the
> history table as well.
> 
> Deleting the visits is easy, we have an index over the dates so we just say
>   "DELETE FROM moz_historyvisit WHERE visit_date < ?1"
> This seems fast enough.
> 
> Now we need to expire history entries that have no visits, are not
> bookmarked, and do not begin with "place:" (which are used for
> internal stuff). The simple command is
>   DELETE from moz_history WHERE
>   id NOT IN (SELECT page_id FROM moz_historyvisit)
>   AND id NOT IN (SELECT page_id FROM moz_bookmarks)
>   AND SUBSTR(url,0,6) <> 'place:';
> This is pretty slow. We have indices over page_ids in both bookmarks
> and visits. The string compare seems like it might be slow but it
> doesn't seem to matter much whether we keep it in or not (99% of the
> URLs will fail after the first character compare because they begin
> with 'http:').
> 

Some ideas:

  CREATE INDEX moz_history_id ON moz_history(id);
  DELETE FROM moz_history
   WHERE id IN (SELECT page_id FROM moz_historyvisit WHERE visit_date<?1)
     AND id NOT IN (SELECT page_id FROM moz_bookmarks)
     AND SUBSTR(url,0,6) <> 'place:';

Do the above before deleting from moz_historyvisit, of course.  This
is similar to your trigger idea.  Note that it only works if moz_history.id
is indexed.  The optimizer rule is this:

   column IN (subquery)

will use an index if an index is available on column.  But

   column NOT IN (subquery)

always does a full table scan.

Another thought, though less likely to help much is:

  DELETE FROM moz_history
   WHERE id NOT IN (SELECT page_id FROM moz_hsitoryvisit
                    UNION SELECT page_id FROM moz_bookmarks)
     AND SUBSTR(url,0,6) <> 'place:';

--
D. Richard Hipp   <[EMAIL PROTECTED]>

Reply via email to