We don't have a SQL server yet. The speed of the archives comes (or will come) from a 
really funky caching system where each thread, with all of its messages, stays cached until 
a new message is added to that thread. The indexing will be done per message 
rather than per thread, which allows incremental additions. Once we get a SQL 
box, I'll look into what you suggested. As a side note, we have over 12,000 threads 
and 66,000 messages in the CF-Talk archive alone. My concern with the indexing is whether 
a Verity collection can handle 100,000 or more items in a single index. 
The current archives start in January 2001. I'm getting Access 2000 to convert the older 
posts into archivable threads as well, so we can go back to the start. This will at 
least double the current message count.
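
For anyone who wants to picture the caching idea above, here is a minimal sketch in Python rather than CF. It is not the actual archive code: the class name `ThreadCache`, its methods, and the `---` separator are all made up for illustration. It only shows the rule described above, that a thread's cached page is thrown away when a new message lands in that thread and is rebuilt on the next read.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadCache:
    """Hypothetical per-thread cache: one rendered page per thread."""
    messages: dict = field(default_factory=dict)  # thread_id -> list of message bodies
    rendered: dict = field(default_factory=dict)  # thread_id -> cached page text
    render_count: int = 0                         # number of times a page was rebuilt

    def add_message(self, thread_id, body):
        """Store a new message and invalidate only the thread it belongs to."""
        self.messages.setdefault(thread_id, []).append(body)
        self.rendered.pop(thread_id, None)

    def get_thread(self, thread_id):
        """Return the cached page, rebuilding it only if it was invalidated."""
        if thread_id not in self.rendered:
            self.render_count += 1
            bodies = self.messages.get(thread_id, [])
            self.rendered[thread_id] = "\n---\n".join(bodies)
        return self.rendered[thread_id]
```

Reading the same thread twice rebuilds it only once, and adding a message to one thread leaves every other thread's cached copy alone.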


At 12:05 PM 6/5/02, you wrote:
>> It took me a few days to port over all the CF-Talk messages
>> and thread them. Indexing for verity searches is next. The
>> indexes will be based on the year, so if you want to search
>> both this year and last, you have to select two indexes to
>> search through. It's needed to avoid monster indexes of 250,000
>> messages each. Of course, if someone has a suggestion such as
>> having all the messages in a single index and can show that
>> it will not be a major problem, I'm all for it.
>
>Are you porting this to SQL Server as well?  I know there was some talk of
>that but I wasn't sure what came of it.  If you port it to SQL Server, I
>would use the full-text search for two reasons:
>
>1. You can search on individual fields.  You can say I want to see all the
>messages by Person A, posted between date B and date C, with the words "blah
>blah" in the subject line and "coldfusion" in the body.  In Verity, however,
>I believe you have to search on all the fields.  So it would be all the
>messages that have Person A, "blah blah" and "coldfusion" in either the
>sender line, subject line, or body text.
>
>2. I'm not positive that you can't do this in Verity, but SQL Server allows
>for incremental indexing.  All you have to do is put a timestamp field on
>your tables and SQL Server will only re-index fields that have been added or
>changed since the last indexing.
>
>
>
>Ben Johnson
>Hostworks, Inc.
>
>
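
The timestamp-based incremental indexing mentioned in point 2 of the quoted message can be sketched roughly as follows. This is only the general idea in Python, not how SQL Server implements it: the function name `incremental_reindex`, the row layout (`id`, `body`, `modified`), and the use of plain integers as change markers are all assumptions made for the example.

```python
def incremental_reindex(rows, index, last_indexed_at):
    """Re-index only rows changed since the previous run.

    rows: iterable of dicts with 'id', 'body' and 'modified' keys
          (hypothetical layout; 'modified' is any comparable change marker).
    index: dict mapping row id -> list of lowercased words (a toy index).
    Returns the new high-water mark and the ids that were re-indexed.
    """
    newest = last_indexed_at
    touched = []
    for row in rows:
        if row["modified"] > last_indexed_at:
            index[row["id"]] = row["body"].lower().split()
            touched.append(row["id"])
            newest = max(newest, row["modified"])
    return newest, touched
```

Each run stores the returned high-water mark and passes it to the next run, so unchanged messages are never touched again.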
