P.S. When experimenting with this, try naming this 'second' table first
to reduce the work the database has to do in selecting/rejecting
records. Ypu can experiment with this by creating this new table,
populating it with a SELECT from the existing table (only the records
without statistics), and then timing the SELECT/JOIN.
-ken
On 15-Apr-05, at 6:15 AM, Ken & Deb Allen wrote:
I cannot speak for SQLITE for certain, but I know with a number of
commercial RDBMS packages the index is not used unless it contains a
sufficient degree of distribution. Microsoft SQL Server, for example,
is quite explicit in stating that if the distribution of values across
the range of records in an index falls below a certain level, then the
index is ignored, which is why you are used to put the most dynamic
column first in a multi-column index. I do not know if SQLITE performs
any of this analysis and makes decisions on whether to use the index
or not, but it is possible, especially if a sort is involved, that the
index is not helping much if all records have one of two values, and
as the majority of he records have the same value, you are looking
only for the minority records. I do not know if using a date/time for
this files will speed it up or not, since your statistical inclusion
query will be for records "WHERE statistic IS NULL" or "WHERE
statistic = 0", and as the database grows in size this will be the
minority set of records.
One 'trick' I have seen in the past was to create a second table that
contained an entry for new records, also keyed by the same "id"; as
each sale is created, an entry is also made in that table, possibly by
a trigger. You select the records for statistics using a JOIN (inner),
which will only return records that are in both tables. As the records
are processed, the record is removed from the second table, and
therefore the records will not be part of the JOIN in the future. In
some databases this may be faster than using an index; I have not
experimented with this in SQLITE, however.
-ken
On 15-Apr-05, at 4:30 AM, msaka msaka wrote:
i dont need working with dates and times..
i only set 1 records on which was done..
which index is better to create for this problem?
must i cut db files? or speed of statistic will be same now and after
1000 000 records?
-----Pôvodná správa-----
Od: Gert Rijs [mailto:[EMAIL PROTECTED]
Odoslané: 15. apríla 2005 7:13
Komu: sqlite-users@sqlite.org
Predmet: Re: [sqlite] optimize table
msaka msaka wrote:
CREATE TABLE sale(
id INTEGER NOT NULL UNIQUE PRIMARY KEY,
bill_no integer,
bill_item_no integer,
item_name varchar(20),
qty integer,
amount DOUBLE,
statistic integer
)
Why don't you replace "statistic" with a timestamp formatted like
yyyymmdd, perhaps add the time as well. If you add an index on it you
can efficiently retrieve the rows for certain date periods without
needing to update the table.
Gert