At 7:37 AM +1000 4/11/06, Taco Fleur wrote:
Hi all,

I am trying to find out whether it is possible to return the count of
keywords found in a text field, does anyone know?
For example; "ColdFusion or Java" is entered in the search string and 20
records are found that match, I need to count how many times "ColdFusion"
and "Java" appears in each match, add those two and than sort descending on
that total.

I was actually using verity for the search, but the client insists he sees
the number of keywords found, which Verity does not do.

The text searched are Résumé's, I initially thought that MySQL could search
the résumé's when stored as binary data, but I was wrong, so I now have the
CV's converted to HTML and then store them in the DB as VARCHAR

Any help would be much appreciated, I am having a hard time coming from a MS
SQL background ;-)


There's no function that I know of to do that directly; however, you
could do something like this:

   select
      (length(your_text_field)-length(replace(your_text_field,
'coldfusion','')))/length('coldfusion')
      as wordcount
   from
      your_table

That is, it removes all instances of 'coldfusion' from  your string,
gets the difference in length from the unaltered string, and divides
that by the number of characters in your search string. I've used
this method several times.

Of course, if you want to avoid matching against terms like
'javalike' or 'coldfusionista' then you've gotta do some additional
checking, for example:

   select
      (length(your_text_field)-length(replace(concat('
',your_text_field,' '), ' coldfusion ','')))/length(' coldfusion ')
      as wordcount
   from
      your_table

See

   http://dev.mysql.com/doc/refman/4.1/en/string-functions.html

for more info.

If you're doing this alot, it might be more efficient to build a word
index table like:

   word        char(32) not null         # or whatever your max word
length is likely to be
   word_count  integer unsigned not null
   document_id integer unsigned not null

where document_id is a foreign key pointing at the table containing
your fulltext. This would be easier to extend to handle synonym
handling too, and you could do all the suffix handling/stemming you
need (eg; to take care of plurals). I've done something like that as
well, and included an extra field for the metaphone version of the
word, to match approximate spellings. If the text fields were all in
plain text, you could even include character positions like

   word          char(32) not null
   word_position integer unsigned not null
   document_id   integer unsigned not null

then you could get word counts by doing a select count(word).


        steve


 Kind regards,


Taco Fleur

Free Call 1800 032 982 or Mobile 0421 851 786
Pacific Fox  <http://www.pacificfox.com.au/> http://www.pacificfox.com.au an
industry leader with commercial IT experience since 1994 …

*

        Web Design and Development
*

        SMS Solutions, including developer API
*

        Domain Registration, .COM for as low as fifteen dollars a year,
.COM.AU for fifty dollars two years!




--
+--------------- my people are the people of the dessert, ---------------+
| Steve Edberg                                http://pgfsun.ucdavis.edu/ |
| UC Davis Genome Center                            [EMAIL PROTECTED] |
| Bioinformatics programming/database/sysadmin             (530)754-9127 |
+---------------- said t e lawrence, picking up his fork ----------------+

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to