On 11 Apr, Ying Zhang wrote:
>
> Hi
>
> I recently installed ht://Dig (3.1.1) on my site, but it won't index the
> site. When I try to index other sites, it does work.
>
> I think it may have something to do with the pages on my site -- they are
> generated dynamically with PHP whereas the sites that do work are static
> html pages. To confirm this, I created a directory with some pages inside
> that are plain html files and they do get indexed. Has anyone had similar
> problems?
>
> This is what I get running htdig -vvv:
>
> Warning: unknown locale!
> 1:0:http://dcfonline.sfu.ca/
> New server: dcfonline.sfu.ca, 80
> Retrieval command for http://dcfonline.sfu.ca/robots.txt: GET /robots.txt HTTP/1.0
> User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
> Host: dcfonline.sfu.ca
>
> Header line: HTTP/1.1 404 Not Found
> Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
> Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 1
> pushed
> pick: dcfonline.sfu.ca, # servers = 1
> 0:0:0:http://dcfonline.sfu.ca/: Retrieval command for http://dcfonline.sfu.ca/: GET / HTTP/1.0
> User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
> Host: dcfonline.sfu.ca
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
> Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 0
> Read 4081 from document
> Read a total of 4081 bytes
> size = 4081
> pick: dcfonline.sfu.ca, # servers = 1
>
> Kind of strange, because the source of that page looks perfectly normal
> when viewed through a browser. Hope someone can help.
>
> Ying
Well, it can be done :-)
I run a PHP mirror site, which is indexed using ht://Dig. The first things
I would check are:
1) Make sure you are not trying to use the 'local files' option - it
won't work without setting PHP up as a parser for htdig.
2) Ensure that your exclude_urls and limit_urls_to values don't exclude
what you want to index.
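
To illustrate point 2, here is a rough Python sketch of how I understand the
URL filtering to behave. It assumes plain substring matching on
space-separated patterns, which may not match every ht://Dig version exactly.
The function name is_indexable, the pattern values and the index.php3 page
are hypothetical, made up for illustration only; they are not taken from
your config.

```python
# Rough model of ht://Dig URL filtering (an approximation, not htdig's code).
# Assumption: both attributes are space-separated lists of substrings.

def is_indexable(url, limit_urls_to, exclude_urls):
    """Return True if url would pass both filters under this model."""
    limits = limit_urls_to.split()
    excludes = exclude_urls.split()
    # The URL must contain at least one limit pattern (if any are set)...
    if limits and not any(pattern in url for pattern in limits):
        return False
    # ...and must not contain any exclude pattern.
    return not any(pattern in url for pattern in excludes)


# Hypothetical config values: an over-eager exclude pattern for PHP pages.
limit = "http://dcfonline.sfu.ca/"
exclude = "/cgi-bin/ .cgi .php3"

for url in ("http://dcfonline.sfu.ca/index.php3",
            "http://dcfonline.sfu.ca/static/page.html"):
    print(url, is_indexable(url, limit, exclude))
```

If the dynamic pages come back as not indexable with your real values, the
patterns are the first thing to loosen.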
Cheers
--
David Robley
WEBMASTER | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES | http://www.nisu.flinders.edu.au/
AusEinet | http://auseinet.flinders.edu.au/
Flinders University, ADELAIDE, SOUTH AUSTRALIA
Visit the PHP mirror at http://au.php.net:81/
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the SUBJECT of the message.