Thanks, Mark.

----- Original Message ----- 
From: "Mark C. Roduner, Jr." <[EMAIL PROTECTED]>
Sent: Tuesday, March 25, 2003 3:45 PM
Subject: RE: Your professional opinion Please...


> Brian,
> Here are some hints on an efficient way
> to index the data.
 
> Regular Expressions:
> ([\w\d]{5,64}) - Matches every run of 5 to 64 word or numeric
> characters in a given string
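
A minimal sketch of the tokenizing step that pattern implies, written in Python rather than the Java suggested further down. The function name `extract_words` and the lowercasing are my assumptions, not part of Mark's outline. Note that `\w` already covers digits, so `[\w\d]` behaves the same as `\w` here, and that a run longer than 64 characters is split into chunks rather than skipped.

```python
import re

# Same pattern as in the message; \w already includes digits and underscore.
WORD_RE = re.compile(r"[\w\d]{5,64}")

def extract_words(text):
    """Return the runs of 5 to 64 word characters in text, lowercased.

    Lowercasing is an assumption made so that "MySQL" and "mysql"
    end up as the same index entry.
    """
    return [m.lower() for m in WORD_RE.findall(text)]

sample = "MySQL indexes words; tiny ones are skipped."
print(extract_words(sample))
```

Words shorter than five characters ("tiny", "ones", "are") never reach the index, which keeps the word table small but also makes them unsearchable.
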
> Database
> Tables
> file : [int id][char*255 name]
> (Populate this with file names)
> word : [int id][char*64 name]
> (Populate this with *unique* words)
> map : [int id][int word][int file]
> (Populate this with `word`.`id`, `file`.`id` pairs
> where `word`.`name` is found in the file named by
> `file`.`name`)
> Queries
> To find the files that contain any of the given words
> SELECT `file`.`name` FROM `file`, `word`, `map`
> WHERE (`word`.`name` IN ('word1', 'word2', 'word3')) AND
> (`map`.`word` = `word`.`id` AND `map`.`file` = `file`.`id`)
> GROUP BY `file`.`name`;
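
To see the tables and the query working together, here is a rough sketch that builds the three tables and runs the lookup. It makes several assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for MySQL, the table and column names follow the ones the query uses (`file`.`name`, `word`.`name`, `map`.`word`, `map`.`file`), and the helpers `build_index` and `find_files` are hypothetical names of mine. The quoted query returns files containing any of the listed words; the `require_all` flag is my addition and adds a `HAVING` clause for the case where every word must be present.

```python
import re
import sqlite3

WORD_RE = re.compile(r"[\w\d]{5,64}")

def build_index(conn, documents):
    """Create the file/word/map tables and fill them from {file name: text}."""
    conn.executescript("""
        CREATE TABLE file (id INTEGER PRIMARY KEY, name VARCHAR(255) UNIQUE);
        CREATE TABLE word (id INTEGER PRIMARY KEY, name VARCHAR(64) UNIQUE);
        CREATE TABLE map  (id INTEGER PRIMARY KEY, word INTEGER, file INTEGER,
                           UNIQUE (word, file));
    """)
    for name, text in documents.items():
        file_id = conn.execute(
            "INSERT INTO file (name) VALUES (?)", (name,)).lastrowid
        # A set gives one map row per (word, file) pair.
        for w in {m.lower() for m in WORD_RE.findall(text)}:
            conn.execute("INSERT OR IGNORE INTO word (name) VALUES (?)", (w,))
            word_id = conn.execute(
                "SELECT id FROM word WHERE name = ?", (w,)).fetchone()[0]
            conn.execute(
                "INSERT INTO map (word, file) VALUES (?, ?)", (word_id, file_id))

def find_files(conn, words, require_all=False):
    """Return sorted file names containing any (or all) of the given words."""
    words = sorted({w.lower() for w in words})
    marks = ",".join("?" * len(words))  # one placeholder per word
    sql = (
        "SELECT file.name FROM file, word, map "
        f"WHERE word.name IN ({marks}) "
        "AND map.word = word.id AND map.file = file.id "
        "GROUP BY file.name "
    )
    params = list(words)
    if require_all:
        # Assumption: keep only files that matched every distinct word.
        sql += "HAVING COUNT(DISTINCT word.id) = ? "
        params.append(len(words))
    sql += "ORDER BY file.name"
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
build_index(conn, {
    "a.txt": "database indexing makes searching faster",
    "b.txt": "searching without indexing is slower",
})
print(find_files(conn, ["indexing", "faster"]))
print(find_files(conn, ["indexing", "faster"], require_all=True))
```

On MySQL the same statements should carry over with minor changes (for example `INSERT IGNORE` instead of `INSERT OR IGNORE`), though I have not run this sketch against MySQL.
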
> Room for Improvement
> Add a field to the map table that gives the offset (in words)
> where the word was found. This would prove useful for
> "Quoted Queries" (i.e. phrase searching).
> Add a blob column to the file table for easier access to the
> data (very optional, _will_ bloat your database).
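
For the phrase-searching idea, a sketch of how a word offset in the map table could be used. Everything here is an assumption on my part: the column name `pos`, the helper names, SQLite standing in for MySQL, and the choice to count offsets over the extracted words only (so words shorter than five characters are invisible to the phrase match). A phrase matches when its words appear in the same file at consecutive offsets.

```python
import re
import sqlite3

WORD_RE = re.compile(r"[\w\d]{5,64}")

def tokens(text):
    """Lowercased 5 to 64 character words, in order of appearance."""
    return [m.lower() for m in WORD_RE.findall(text)]

def build_positional_index(conn, documents):
    """Like the plain index, but map gets one row per occurrence with its offset."""
    conn.executescript("""
        CREATE TABLE file (id INTEGER PRIMARY KEY, name VARCHAR(255) UNIQUE);
        CREATE TABLE word (id INTEGER PRIMARY KEY, name VARCHAR(64) UNIQUE);
        CREATE TABLE map  (id INTEGER PRIMARY KEY, word INTEGER, file INTEGER,
                           pos INTEGER);  -- pos: hypothetical offset column
    """)
    for name, text in documents.items():
        file_id = conn.execute(
            "INSERT INTO file (name) VALUES (?)", (name,)).lastrowid
        for pos, w in enumerate(tokens(text)):
            conn.execute("INSERT OR IGNORE INTO word (name) VALUES (?)", (w,))
            word_id = conn.execute(
                "SELECT id FROM word WHERE name = ?", (w,)).fetchone()[0]
            conn.execute(
                "INSERT INTO map (word, file, pos) VALUES (?, ?, ?)",
                (word_id, file_id, pos))

def find_phrase(conn, phrase):
    """Return sorted names of files where the phrase's words occur consecutively."""
    words = tokens(phrase)
    if not words:
        return []
    # One (word, map) alias pair per phrase word; word i must sit at pos0 + i.
    joins, where, params = [], [], []
    for i, w in enumerate(words):
        joins.append(f"word w{i}, map m{i}")
        where.append(f"w{i}.name = ? AND m{i}.word = w{i}.id")
        params.append(w)
        if i > 0:
            where.append(f"m{i}.file = m0.file AND m{i}.pos = m0.pos + {i}")
    sql = (
        "SELECT DISTINCT file.name FROM file, " + ", ".join(joins) +
        " WHERE file.id = m0.file AND " + " AND ".join(where) +
        " ORDER BY file.name"
    )
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
build_positional_index(conn, {
    "a.txt": "searching indexed documents quickly",
    "b.txt": "indexed searching documents quickly",
})
print(find_phrase(conn, "indexed documents"))
```

The cost is one map row per occurrence instead of one per distinct word per file, so the map table grows with document length.
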

Probably a little more than I can do in the allotted time.

> If you're willing to pay for it, I'll Write it for you. 

Unfortunately there is no budget for this project.

> BTW, I recommend Java for writing the reader program
> (regular expressions are much easier and cleaner there), and
> PHP (v4.x) for the search program (easier UI).

Understood.

Appreciate the feedback.

Best regards,

Brian


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
