I think I see what you are after.  I'm after the same knowledge. :)

The only things that I can recommend are books:
 Modern Information Retrieval
 Managing Gigabytes

And online resources like:
 http://finance.groups.yahoo.com/group/mg/ (note the weird host name)
 http://www.sims.berkeley.edu/~hearst/irbook/

There is a pile of stuff in Citeseer, but those papers never really dig
into the details and always require solid previous knowledge of the
field.  They are no replacement for a class or a textbook.

If you find a good forum for IR, please share.

Otis


I don't know of any IR forums, but the following link may serve as an introduction for those not familiar with the field.
It gives an overview of the weighting schemes that can be used with the vector space model:


http://www.dcs.gla.ac.uk/~tombrosa/AIS/SMART-tutorial/weights.html

These weights have been implemented in SMART, a well-known retrieval system developed at Cornell University by Gerard Salton, one of the big names in the history of IR (see http://www.cs.cornell.edu/Info/Department/Annual96/Beginning/salton.html).

The weighting methods used in SMART can be coded with three characters:

     The first character gives the term-frequency procedure to be used.
     The second character gives the inverse-document-frequency procedure to be used.
     The third character gives the normalization procedure to be used.

Any combination of three letters is acceptable in theory. The system accommodates the Boolean model (e.g. via the bnn scheme) as well as more sophisticated weights.
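
To make the three-character codes concrete, here is a minimal sketch in Python. It is not SMART's own code. The function name smart_weights and its parameters are made up for illustration, and the letter meanings (b = binary, n = natural/none, l = logarithmic, a = augmented, t = log(N/df), c = cosine) follow the convention commonly given in textbooks, which may differ in detail from the tutorial linked above. Only a subset of the letters is covered.

```python
import math
from collections import Counter


def smart_weights(doc_tokens, doc_freq, num_docs, scheme="ntc"):
    """Weight the terms of one document using a three-letter code.

    Hypothetical helper, not part of SMART. Assumes doc_freq maps every
    term in doc_tokens to a document frequency of at least 1.
    """
    tf_code, idf_code, norm_code = scheme  # e.g. "ntc" -> "n", "t", "c"
    counts = Counter(doc_tokens)
    if not counts:
        return {}
    max_tf = max(counts.values())

    weights = {}
    for term, tf in counts.items():
        # First character: term-frequency component.
        if tf_code == "b":      # binary: the term is present or not
            w = 1.0
        elif tf_code == "n":    # natural: raw count
            w = float(tf)
        elif tf_code == "l":    # logarithmic damping of the count
            w = 1.0 + math.log(tf)
        elif tf_code == "a":    # augmented: scaled by the max count
            w = 0.5 + 0.5 * tf / max_tf
        else:
            raise ValueError("unsupported tf code: " + tf_code)

        # Second character: document-frequency component.
        if idf_code == "t":     # idf = log(N / df), not squared
            w *= math.log(num_docs / doc_freq[term])
        elif idf_code != "n":   # "n" means no idf factor
            raise ValueError("unsupported idf code: " + idf_code)

        weights[term] = w

    # Third character: normalization.
    if norm_code == "c":        # cosine: divide by the vector length
        length = math.sqrt(sum(w * w for w in weights.values()))
        if length > 0:
            weights = {t: w / length for t, w in weights.items()}
    elif norm_code != "n":      # "n" means no normalization
        raise ValueError("unsupported normalization code: " + norm_code)

    return weights
```

With this sketch, the bnn scheme gives every term that occurs in the document a weight of 1, which is the Boolean case mentioned above, while ntc gives cosine-normalized tf*idf weights.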

While these schemes are theoretically attractive, other weightings seem to have proven more useful empirically (e.g. not squaring the idf term).

Hope this helps,
roxana


