Thanks Mark:

----- Original Message -----
From: "Mark C. Roduner, Jr." <[EMAIL PROTECTED]>
Sent: Tuesday, March 25, 2003 3:45 PM
Subject: RE: Your professional opinion Please...

> Brian,
> Here are some hints on an efficient way to index the data.
>
> Regular Expressions:
>     ([\w\d]{5,64}) - Matches all word and numeric data in a given string
>
> Database
>   Tables
>     file : [int id][char*255 name]
>         (Populate this with file names)
>     word : [int id][char*64 name]
>         (Populate this with *unique* words)
>     map  : [int id][int word][int file]
>         (Populate this with `file`.`id`, `word`.`id`
>          where `word`.`name` is found in the file named by `file`.`name`)
>   Queries
>     To find a file with given words:
>         SELECT `file`.`name` FROM `file`, `word`, `map`
>         WHERE (`word`.`name` IN ('word1', 'word2', 'word3')) AND
>               (`map`.`word` = `word`.`id` AND `map`.`file` = `file`.`id`)
>         GROUP BY `file`.`name`;
>
> Room for Improvement
>     Add a field to the MAP table that gives the offset (in words)
>     where the word was found. This would prove useful for
>     "Quoted Queries" (i.e., phrase searching).
>     Add a blob segment to the FILE table for easier access
>     to the data (very optional, _will_ bloat your database).

Probably a little more than I can do in the allotted time.

> If you're willing to pay for it, I'll write it for you.

Unfortunately there is no budget for this project.

> BTW, I recommend Java for writing the reader program,
> much easier and clean cut to do regular expressions, and
> PHP (v4.x) for the search program (easier UI).

Understood. Appreciate the feedback.

Best regards,
Brian

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
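
For anyone reading this thread in the archive, here is a minimal sketch of the file/word/map scheme described in the quoted message. It is only an illustration under a few assumptions: it uses Python with an in-memory SQLite database rather than MySQL so that it runs standalone, the function names `build_index` and `search` are hypothetical, and words are lowercased before storage (the original message does not say how case should be handled). Note that the quoted query returns files containing any of the listed words; the optional `require_all` flag below adds a `HAVING` clause to return only files containing every word.

```python
import re
import sqlite3

# Token pattern from the message: runs of 5 to 64 word characters.
# (\d is already covered by \w, so [\w\d] behaves the same as \w.)
TOKEN_RE = re.compile(r"[\w\d]{5,64}")


def build_index(documents):
    """Build an in-memory index from a {file_name: text} mapping."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE file (id INTEGER PRIMARY KEY, name VARCHAR(255) UNIQUE);
        CREATE TABLE word (id INTEGER PRIMARY KEY, name VARCHAR(64) UNIQUE);
        CREATE TABLE map  (id INTEGER PRIMARY KEY,
                           word INTEGER REFERENCES word(id),
                           file INTEGER REFERENCES file(id),
                           UNIQUE (word, file));
        """
    )
    for name, text in documents.items():
        conn.execute("INSERT INTO file (name) VALUES (?)", (name,))
        file_id = conn.execute(
            "SELECT id FROM file WHERE name = ?", (name,)
        ).fetchone()[0]
        # Assumption: lowercase tokens so that searches are case-insensitive.
        for token in {t.lower() for t in TOKEN_RE.findall(text)}:
            # Store each distinct word once, then link it to the file.
            conn.execute("INSERT OR IGNORE INTO word (name) VALUES (?)", (token,))
            word_id = conn.execute(
                "SELECT id FROM word WHERE name = ?", (token,)
            ).fetchone()[0]
            conn.execute(
                "INSERT OR IGNORE INTO map (word, file) VALUES (?, ?)",
                (word_id, file_id),
            )
    conn.commit()
    return conn


def search(conn, words, require_all=False):
    """Return names of files containing any (or all) of the given words."""
    words = sorted({w.lower() for w in words})
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT file.name FROM file, word, map
        WHERE word.name IN ({placeholders})
          AND map.word = word.id AND map.file = file.id
        GROUP BY file.name
    """
    params = list(words)
    if require_all:
        # Keep only files matched by every distinct search word.
        sql += " HAVING COUNT(DISTINCT word.id) = ?"
        params.append(len(words))
    sql += " ORDER BY file.name"
    return [row[0] for row in conn.execute(sql, params)]
```

Usage would look something like `conn = build_index({"a.txt": "some text here"})` followed by `search(conn, ["word1", "word2"], require_all=True)`. Because the pattern requires at least five characters, shorter words are never indexed and so can never match.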