Re: text mining under mysql

M. A. Alves Thu, 10 Jan 2002 10:57:18 -0800

> > > . . .
> > Isn't function MATCH what you want?
> > . . .
> The problem is that I don't know the expression for the 'AGAINST' part.
> Given a document I'd like to know what it is about without reading it.
> . . .


So you want a list of _descriptors_ for each document. That is
_precoordination_, which is always computationally expensive.
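
To make "a list of descriptors" concrete, here is a minimal sketch in
Python (outside MySQL) that scores each word by TF-IDF and keeps the top
few per document. TF-IDF is only a stand-in here, not the only or the best
descriptor model, and the function names and sample documents are made up
for illustration. Note that it has to see every document to compute the
document frequencies, which hints at where the cost comes from.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words of 3+ letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def descriptors(docs, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document.

    docs: dict mapping a document id to its text (hypothetical layout).
    """
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    n_docs = len(docs)
    result = {}
    for doc_id, words in tokens.items():
        tf = Counter(words)
        scores = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[doc_id] = ranked[:top_n]
    return result


sample = {
    1: "mysql index index fulltext",
    2: "mysql cooking recipe recipe",
}
print(descriptors(sample, top_n=2))
# {1: ['index', 'fulltext'], 2: ['recipe', 'cooking']}
```

"mysql" occurs in both sample documents, so its score is zero and it never
becomes a descriptor.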

/* And yes, I think you have to (re)implement it, as access to the MySQL
full-text indexes is hard or impossible and/or undocumented (correct me if
I am wrong). */

But I wonder whether you really lack the latitude to refocus the problem
from the user side: the user enters a search expression, and that is your
"AGAINST part".
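
As a rough sketch of that idea, assuming a Python client: the function
below builds a MATCH ... AGAINST query with the user's expression passed as
a parameter rather than pasted into the SQL. The table name `documents`,
the columns `id`, `title` and `body`, and the function name are all
hypothetical, and the `%s` placeholder assumes a driver that uses that
parameter style. The columns need a FULLTEXT index covering them.

```python
import re

# Plain identifiers only, so table/column names can be interpolated safely.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_search_query(table, columns, expression, limit=20):
    """Return (sql, params) for a full-text search ranked by relevance.

    The user's search expression is the "AGAINST part"; it travels as a
    bound parameter, never as part of the SQL string.
    """
    for name in [table, *columns]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    cols = ", ".join(columns)
    sql = (
        f"SELECT id, MATCH ({cols}) AGAINST (%s) AS score "
        f"FROM {table} "
        f"WHERE MATCH ({cols}) AGAINST (%s) "
        f"ORDER BY score DESC LIMIT {int(limit)}"
    )
    # The expression appears twice: once for the score, once for the filter.
    return sql, (expression, expression)


sql, params = build_search_query("documents", ["title", "body"], "text mining", 5)
print(sql)
print(params)
```

With a DB-API cursor this would be run as `cursor.execute(sql, params)`.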

Of course it may take a few seconds, and if your document base is huge it
may be impractical, but try it anyway ;-)

Sure, this is not 'hard' text mining but it is a good first step in the
incremental development of an information retrieval system. Also, it gives
you more data to mine: the entered search expressions :-)
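
A small sketch of what mining those search expressions could start with,
assuming they have been saved somewhere as plain strings (the list below
and the function name are invented for illustration): count in how many
logged searches each term appears.

```python
from collections import Counter


def top_query_terms(logged_expressions, n=5):
    """Return the n terms that occur in the most logged search expressions."""
    counts = Counter()
    for expression in logged_expressions:
        # Count each term once per search, however often it is repeated.
        counts.update(set(expression.lower().split()))
    # Most frequent first; ties broken alphabetically for determinism.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


log = ["text mining", "mining mysql", "Text Mining tools"]
print(top_query_terms(log, n=2))
# [('mining', 3), ('text', 2)]
```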

Afterwards you can---and you should!---improve the system with
precoordination and text mining. The current best models I know for
document description are by the research group GLINT of the CENTRIA. I
used to be among them. Call [EMAIL PROTECTED] and tell him I say
hello :-)

-- 
   ,
 M A R I O   data miner, LIACC, room 221   tel 351+226078830, ext 121
 A M A D O   Rua Campo Alegre, 823         fax 351+226003654
 A L V E S   P-4150-180 PORTO, Portugal    mob 351+939354002


