Robb Kerr <[EMAIL PROTECTED]> wrote on 11/25/2005 11:59:48 AM:

> I'm building a new clipart site. I need to have keyword searching. I'm
> seeking opinions about table design. Here are my proposed options. If
> anyone has any other suggestions, please make them.
> 
> Option One
> Related tables. Table one (clipart pieces) contains ClipartID and
> ClipartName fields. Table two (keywords) contains KeywordID, ClipartID 
and
> Keyword fields. This option will create an incredibly large related 
table
> (keywords) with each piece of clipart having tens of related fields in 
the
> keyword table. But, searching ought to be fast.
> 
> Option Two
> Single table. Table one (clipart pieces) contains ClipartID, ClipartName
> and Keywords fields. The Keywords field would be a long text field that
> would be searched with a full-text search. Searching and maintenance 
would
> be easier but would searching be slowed down significantly?
> 
> Please provide any input you have and make any alternate suggestions.
> 
> Robb Kerr
> Digital IGUANA
> 

If speed and flexibility is your priority, use option one. 

Remember: FT indexing will skip all stopwords and words smaller than the 
minimum FT length. By default, that length is set to 4 but there are ways 
to make it smaller and you can also provide an empty stopword list so you 
can get around that, too.

For option 1 I think you need a third table that maps keywords to clipart. 
That way you only need to store the string value of a keyword once. It 
will keep your keywords table smaller and make lookups faster. Technically 
speaking, doing it this way "normalizes" your keyword data. Normalized 
databases most often perform much better than denomalized databases. There 
are some notable exceptions but as a general rule this is true.

Making a separate keyword table makes it easier to search for a list of 
keywords and locate not just full matches but  partial matches, too 
(matched 6 of 8 search terms). The FT search returns a relevance number 
but as applied to "smaller" chunks of text (a list of 20 or 30 keywords or 
key phrases), that may not be very useful.

Search the archives for various techniques of finding lists of values from 
normalized data. This type of question has appeared frequently.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Reply via email to