Hi Dennis,
I've been reading your email carefully and I'd like to comment on it.
On 28/03/2006, at 14:24, Dennis Cote wrote:
With these tables you will have 25K rows in the File table, one per
file, and 250K rows in the Attribute table assuming an average of
10 attributes per file (your example had 14).
An index on the attribute name and value pairs will greatly speed up
lookups of a particular attribute.
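(For other readers: I assume the index Dennis describes, on the two-table
design, would look something like this; the table and column names are my
guesses, not his exact schema:

    CREATE INDEX attr_name_value_idx ON attribute(name, value);
)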
There is still some redundancy in this set of tables since the
names of the attributes are being repeated over and over in the
Attribute table. This may not be a big problem, but you really only
have 100 attributes, so there is no sense storing 25K copies of the
name of a common attribute that applies to all files. This
information can be factored out into a third table as shown below.
[...] This will give you three tables (which will not cause a
noticeable difference in the database open time). One with 25K rows
of file names. One with 100 rows of attribute definition data,
which includes the attribute name. And one with 250K rows of
attribute value data.
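(Again for other readers, here is how I understand the proposed three-table
layout; table and column names are my own guesses:

    CREATE TABLE file (
        file_id  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL               -- 25K rows, one per file
    );

    CREATE TABLE attribute_def (
        attr_id  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE        -- ~100 rows, one per attribute
    );

    CREATE TABLE attribute_value (
        file_id  INTEGER REFERENCES file(file_id),
        attr_id  INTEGER REFERENCES attribute_def(attr_id),
        value    TEXT                        -- ~250K rows
    );
    CREATE INDEX attr_value_idx ON attribute_value(attr_id, value);
)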
Due to application requirements, I must rely on LIKE and GLOB to match
data that contains some value (case-sensitive or case-insensitive
matching).
Now, it seems to me that using either LIKE or GLOB will force a row
scan anyhow, since they can't use the index, correct? So your solution
would force me to scan all 250K rows when matching values. That is 10
times more data to scan through than in my single-table design.
I'm not sure if there is a way to speed up LIKE or GLOB queries such as:
SELECT myvalue FROM files WHERE myvalue LIKE '%finit%';
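As far as I understand the SQLite docs (please correct me if I'm wrong),
an infix pattern like '%finit%' can never use an index, but a pattern with
a literal prefix sometimes can: GLOB can use an ordinary (BINARY) index,
and LIKE can use one if case_sensitive_like is turned on, or if the column
is declared COLLATE NOCASE for the case-insensitive case. For example:

    -- can use an index on myvalue (GLOB is case sensitive):
    SELECT myvalue FROM files WHERE myvalue GLOB 'finit*';

    -- with PRAGMA case_sensitive_like = ON, this can use the index too:
    SELECT myvalue FROM files WHERE myvalue LIKE 'finit%';

Unfortunately that doesn't help my "contains" matches, which will scan
the table either way.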
Thanks a lot for the advice.
Regards,
-- Tito