Dirk Meyer wrote:

We search for the label, split on the _. So we search for 'fellowship
ext 1'. The 1 gives us many results we don't want. But for other cases,
we need the numbers (Babylon 5, or sequels).

The big question now: how can we make it produce better results? What
about:

1. search like we search now, the list may be long
2. if the number of returned items is greater than 10, remove all titles
   which don't include at least one _word_. So results without
   'fellowship' or 'ext' (only containing 1) will be deleted.
3. sort the results:
   a) Most popular searches to the top
   b) Inside the two areas (popular and not so popular), sort by number
      of matched words: each word in the title that is not in the
      search string gives hitpoint -= 1, each search word in the title
      gives hitpoint += 5.

What do you think?
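
Steps 2 and 3b of the proposal could be sketched roughly as follows. This is only an illustration, not existing code: the function names, the max_results parameter, and the rule that a "word" is any token that is not purely digits are all assumptions. Step 3a (popularity) is left out because it needs popularity data that is not described here.

```python
import re


def _tokens(text):
    """Lower-case a string and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def filter_and_rank(query, titles, max_results=10):
    """Hypothetical sketch of steps 2 and 3b; not the actual search code."""
    search = _tokens(query)
    # Assumption: a "word" is any search token that is not purely digits.
    search_words = [w for w in search if not w.isdigit()]

    # Step 2: prune only when the result list is long.
    if len(titles) > max_results:
        titles = [t for t in titles
                  if any(w in _tokens(t) for w in search_words)]

    def score(title):
        toks = _tokens(title)
        points = 0
        for w in set(search):
            if w in toks:
                points += 5      # search word found in the title
        for tok in toks:
            if tok not in search:
                points -= 1      # title word missing from the search
        return points

    # Step 3b: highest score first.
    return sorted(titles, key=score, reverse=True)


titles = ["The Fellowship of the Ring", "Fellowship Ext", "Apollo 1"]
ranked = filter_and_rank("fellowship ext 1", titles, max_results=2)
```

With max_results=2 the list counts as "long", so "Apollo 1" is dropped because it matches only the number, and "Fellowship Ext" sorts ahead of "The Fellowship of the Ring".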

I've made a small change locally that (1) throws out any non-word characters from the name (\W) and (2) throws out any single-character words from the name. This seems to produce much better matches.

In the example above, it would search for "dvd fellowship ext" and find it, instead of searching for "dvd [fellowship ext d 1]".
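
A minimal sketch of that kind of cleanup, not the actual local change: the helper name clean_label and the sample label are made up for illustration. Note that \W does not match the underscore, so the sketch strips it explicitly.

```python
import re


def clean_label(label):
    """Hypothetical helper: turn a disc label into a search string."""
    # Replace runs of non-word characters (and underscores, which \W
    # does not cover) with a single space, then split into words.
    words = re.sub(r"[\W_]+", " ", label).split()
    # Throw out single-character words such as "d" or "1".
    return " ".join(w.lower() for w in words if len(w) > 1)


cleaned = clean_label("FELLOWSHIP_EXT_D_1")   # assumed example label
```

For the assumed label this yields "fellowship ext". The trade-off Dirk mentions still applies: a label like "BABYLON_5" is reduced to "babylon", so the number is lost.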

Lars
--
Lars Eggert <[EMAIL PROTECTED]>           USC Information Sciences Institute
