On Thu, 15 Nov 2001, Kir Kolyshkin wrote:

> Gregory Kozlovsky wrote:
> > 
> > Here are some small problems I found that would be nice to have fixed.
> > 
> > 1. ASPSeek tries to follow <IMG SRC ...> links to images. This wastes time
> > and space.
> 
> I believe you are wrong here. Please re-check this and prove ;)

Images are only retrieved as a side effect of indexing dynamic content, since
the content type is not known until the document headers are examined. By
that time the URL must already have been stored in urlword (via href
discovery), yet the image content is not stored when index processes the URL.
This is a side effect of how the indexer works.

It does not index IMG SRC tags directly. 
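
To make the behaviour above concrete, here is a rough sketch in Python. It is
not ASPSeek's actual code (which is C++); the function names and the dict
standing in for the content store are hypothetical, and it assumes the decision
is made purely on the Content-Type response header. The point is only that the
URL has to be fetched before its type is known, and that an image response is
then dropped instead of stored.

```python
def should_store(content_type_header: str) -> bool:
    """Return False for image responses, True for everything else."""
    # Strip parameters such as "; charset=iso-8859-1" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return not media_type.startswith("image/")


def process_fetched_url(url: str, headers: dict, body: bytes, store: dict) -> bool:
    """Store the body under url unless the headers say it is an image.

    `store` is a plain dict standing in for the indexer's content storage
    (an assumption for illustration only).
    """
    # A missing header is treated as "not an image" in this sketch.
    content_type = headers.get("Content-Type", "")
    if not should_store(content_type):
        return False  # the URL was fetched, but its content is discarded
    store[url] = body
    return True
```

With this sketch, a dynamic URL that turns out to return image/gif is fetched
but leaves the store untouched, while one that returns text/html is stored.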


Matt.

> > 2. There are a lot of messages like:
> >    mailto:somebody@somewhere
> >    Unsupported protocol
> > The same thing for javascript:
> 
> This is just a debugging message, it should not be considered as error.
> 
> > 3. The French stopwords list does not include "de" which is one of the most
> > common words in French.
> 
> Ok, I will include it. Can somebody from France confirm it?
> 
> -- 
> [EMAIL PROTECTED]  ICQ 7551596  Phone +7 903 6722750
> Hard work may not kill you,  but why take chances?
> --
> 


