I started with 1.2.8 and noticed the following problems/questions:
(http://search.rz.uni-osnabrueck.de/cgi-bin/s.cgi for regression ...)

1. Order of search expressions

The results differ depending on the order of the search expressions:

Elsner -Meyer -> OK
-Meyer Elsner -> No search expression(s)

+Elsner +Meyer -> No search expression(s)
Elsner +Meyer -> OK

A - or + prefixed to the first search expression should be possible.
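
To illustrate the behaviour I would expect, here is a minimal sketch in Python.
It is not ASPseek's actual query parser; the function name parse_query and the
three-way split into required/excluded/optional terms are my own assumptions.
The point is only that the prefix is handled per term, so its position in the
query does not matter.

```python
def parse_query(query):
    """Split a query into (required, excluded, optional) term lists.

    Hypothetical helper: a leading '+' marks a required term, a leading '-'
    an excluded term, regardless of where the term appears in the query.
    """
    required, excluded, optional = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


# Both orders from the report above yield the same parsed query.
print(parse_query("Elsner -Meyer"))
print(parse_query("-Meyer Elsner"))
print(parse_query("+Elsner +Meyer"))
```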

2. "Not indexed yet"

It is not possible to get ALL pages indexed, even when using index -s 0.
About 160 pages are always left over when I try to index.
A true "index what is left to index" option would be helpful.

3. Delete entries

I changed the robots.txt file of a WWW server (www.rz.uni-osnabrueck.de)
so that some directories are no longer to be indexed.

After re-indexing, these directories are still in the database.
I know that I can completely delete the database and re-build it,
but I would like to index into an existing database.

How can I instruct index to delete URLs which are now excluded by
robots.txt?

The same applies to a server xyz that once had a "Server xyz" line in
aspseek.conf and is now excluded by commenting it out ("# Server xyz").
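
As a workaround sketch, the URLs that are now excluded could be determined
outside of ASPseek and then removed by whatever means the database offers.
The Python below uses only the standard urllib.robotparser module; the
function name urls_to_delete, the robots.txt rule and the example paths are
made up for illustration, and how to obtain the list of indexed URLs from the
database is left open.

```python
from urllib.robotparser import RobotFileParser


def urls_to_delete(robots_lines, indexed_urls, agent="*"):
    """Return the indexed URLs that the given robots.txt rules now exclude.

    robots_lines: the lines of the current robots.txt
    indexed_urls: URLs currently in the database (how they are exported
                  from the database is an open question)
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in indexed_urls if not parser.can_fetch(agent, url)]


# Hypothetical rule and URLs, for illustration only.
robots = ["User-agent: *", "Disallow: /private/"]
indexed = [
    "http://www.rz.uni-osnabrueck.de/index.html",
    "http://www.rz.uni-osnabrueck.de/private/notes.html",
]
print(urls_to_delete(robots, indexed))
```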


4. Option selected

The feature <option value="..." selected="$dp">, as described on
page 33 of the 2002/02/18 manual (1.2.8), does not work, except for $ps.

It would be helpful if all ASPseek "display/sort/order/date" variables were
treated as cookie variables like $ps. At the moment, I see the following
candidates:
dp, dx, dm, dd, dy, db, de, fm, ps, c, s, np, gr, cs, ad, bd, ds, kw, tl

5. Disallow / Allow

Using Disallow and Allow is still hard work.

These are my main scenarios:

(a) Allow some known good extensions and Disallow everything else.

Allow \.htm$
Allow \.shtm$
Allow \.cfm$
Allow \.html$
Allow \.shtml$
Allow \.cfml$
Allow \.ps$
Allow \.pdf$
Disallow *

(b) Disallow some known bad extensions and Allow everything else

DisallowNoMatch /$  | \.htm$ | \.shtm$ | \.cfm$
Disallow /cgi-bin/ /script \.cgi$  /nph
Disallow \?
Disallow \?D=A$ \?D=D$ \?M=A$ \?M=D$ \?N=A$ \?N=D$ \?S=A$ \?S=D$
Disallow /[.]{1,2} /\%2e /\%2f
Disallow [^:]//
Disallow ///
Allow *

Does this index .htm, .shtm and .cfm only?

Did anybody create a list of "good" extensions for DisallowNoMatch?

What is the difference between Allow and DisallowNoMatch?
Should DisallowNoMatch \.htm$ be the same as Allow \.htm$?
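
To make the last question concrete, here is a minimal Python sketch of how I
assume the rules might be evaluated: top to bottom, with the first applicable
rule deciding. This is an assumption, not verified against the ASPseek source.
The function name decide, the treatment of "*" as "match everything", the
default of allowing a URL when no rule applies, and the example URLs are all
made up for illustration. Under these assumptions the two directives are not
equivalent.

```python
import re


def decide(rules, url):
    """Return True if url would be indexed, under ASSUMED first-match rules.

    Allow p           : accept url if p matches, otherwise try the next rule
    Disallow p        : reject url if p matches, otherwise try the next rule
    DisallowNoMatch p : reject url if p does NOT match, otherwise try the next
    """
    for kind, pattern in rules:
        # "*" is treated as "match everything" (assumption).
        matched = pattern == "*" or re.search(pattern, url) is not None
        if kind == "Allow" and matched:
            return True
        if kind == "Disallow" and matched:
            return False
        if kind == "DisallowNoMatch" and not matched:
            return False
    return True  # assumed default when no rule applies


with_allow = [("Allow", r"\.htm$"), ("Disallow", r"/cgi-bin/"), ("Allow", "*")]
with_nomatch = [("DisallowNoMatch", r"\.htm$"), ("Disallow", r"/cgi-bin/"), ("Allow", "*")]

for url in ("http://host/a.htm", "http://host/doc.pdf", "http://host/cgi-bin/a.htm"):
    print(url, decide(with_allow, url), decide(with_nomatch, url))
```

In this model, Allow \.htm$ accepts matching URLs at once and leaves all others
to the later rules, while DisallowNoMatch \.htm$ rejects all non-matching URLs
at once and leaves the matching ones to the later rules.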

Regards Frank Elsner
#-------------------------------------------------------#
Dipl.-Math. Frank Elsner
Universitaet Osnabrueck (University of Osnabrueck)
- Rechenzentrum - (Computing Center)
Albrechstrasse 28, AVZ
D-49076 Osnabrueck
Deutschland (Germany)

Tel. (Phone): ++49 (0)541/969-2343 Fax: -2470
E-Mail: [EMAIL PROTECTED]
#-------------------------------------------------------#
