On Sun, Nov 25, 2012 at 01:29:48PM +0000, Paul Sanderson scratched on the wall:
> Yes NULL -
> 
> I underastand that ecvery coumn needs to be read, that is self evident,
> however my feeling (not tested) is that the process is much slower than it
> needs to be, i..e the process of creating an index on a column whos values
> are all NULL takes longer than just reading all of the columns - I have to
> admit to not testing/timimg this though, I'll have a go when time permits.

  An index is essentially a specialized table that uses a different key.
  Tables and indexes use an almost identical data structure when
  stored.  The main difference is that tables are sorted and keyed via
  the ROWID column, while indexes are sorted and keyed by one or more
  indexed columns.  Every index has one row for each row in the table.

  So the question then becomes, if you were creating a one column table
  and inserted a large number of rows, would you expect a significant
  difference in the insert time if all the values were the same, or if
  they were different?

> That does lead to another question. Is their a method of creating multiple
> indexes at the same time, e.g. create an index on each (or specified)
> column in a table in one pass - rather than do each column in turn. This
> would save on the overhead of reading the entire table for each column.

  Scanning the table is not the expensive part of creating an index.
  Creating a large index requires a significant amount of memory to sort
  and rebalance the B-Tree used to store the index data.  Doing more
  than one index at a time would make that problem much worse, and
  would likely be slower, overall, then creating the indexes one at a
  time.


  As others have pointed out, it might be worth backing and asking what
  you're trying to achieve with all these indexes.  Putting an index on
  each column is usually undesirable.  A given SELECT can usually only
  use one index per query (or sub-query), so it rarely makes sense to
  stack up the indexes... adding unused indexes only slows down
  insert/update/deletes, as well as makes the file size much larger.
  Additionally, indexing any column (even one used heavily in a query
  filter) is not useful if the index doesn't reduce the working data
  set to ~10% of the rows.  In other words, having an index on a column
  that has all one value (be it NULL or anything else) accomplishes
  nothing but significantly slowing down queries that use the index.
  Same is true, for example, of a True/False column with an even
  distribution.  Generally, unless the index can be used to eliminate 
  the vast majority of rows, it will slow down, rather than speed up, a
  query.

  Indexes are not magic bullets, and using them properly requires
  understanding how they work and how they are used.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to