Thanks for the replies - I'll try and read through them all thoroughly a
bit later.

But for now a bit of background.

My software creates a large table containing anything between about 250K
and Millions of rows when first run, the indexes are created immediately
after the table is populated and the tables do not change afterwards. This
process generally takes 10 minutes to an hour or so.

The reason for the indexes is that the data is then reviewed by a user and
each user will have a different end result in mind and so will be looking
at (lots of) different subsets of the data but will want pretty much
instant results, i.e. a few seconds not minutes. I have no idea which
columns, other than a few of the main columns, that they will need to do
selects and sorts on. It's unlikely to be all of them for a specific user
but all users are very likely to use all columns but wont really know which
until they start looking at the data and even then the desired end reult is
sometimes fluid so requirements change.

The size of the database is never an issue. Memory is unlikely to be an
issue. The initial processing time (data loading and indexing) is not an
issue.

The whole target here is to drive the end user experience and reduce any
wait time when they are at the keyboard to a minimum.

I have considered just indexing the main columns and indexing other columns
on demand (first use if you like) but can't see the benefit of that other
than saving time and disk space at stage 1, but there is no need for this




On 26 November 2012 01:52, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 25 Nov 2012, at 11:30pm, "Jay A. Kreibich" <j...@kreibi.ch> wrote:
>
> > If you view an index as an optimization, then the idea
> >  is usually to increase overall performance, so that there is net win.
> [snip]
> >
> >  I disagree with this idea, as it implies there is a 1:1 exchange in
> >  read performance gains and write performance loss.
>
> To enlarge upon Jay's monotribe, I'll put these two together.
>  Optimization of a system sometimes doesn't mean optimizing /net/
> performance, it means speeding things up when time matters.  I once got an
> ovation from a roomfull of staff by deleting a couple of indexes, not
> adding them.  Because they spent most of their time entering data, but
> reports were generated by other people they never met and didn't care
> about. My changes reduced a two-second pause after hitting 'send' to half a
> second, massively reducing false keystrokes.
>
> The report-needing people, on the other hand didn't care.  It already took
> the system 30 minutes to generate the reports: they pressed the button and
> came back an hour later.  Changing the 30 minutes to 50 minutes didn't
> impact their workflow badly.  Out of courtesy I explained why the
> end-of-day reports were taking longer and nobody complained.  (By the way
> this sort of thing is why banks used to close to customers at 3:30pm even
> though the staff still worked until 5pm.)
>
> This was decades ago on a pre-SQL system.  One thing I like about SQL is
> that the language used to change and consult the database doesn't say
> anything about indexes (except in some non-standard extensions).  So you
> can add or delete indexes when you want without needing to change a line of
> code in any of your software and having to deploy new versions of your
> software.  [insert dramatic evil genius music here]
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to