On 11 Apr, Ying Zhang wrote:
> 
> Hi
> 
> I recently installed ht://Dig (3.1.1) on my site, but it won't index the
> site.  When I try to index other sites, it does work.
> 
> I think it may have something to do with the pages on my site -- they are
> generated dynamically with PHP whereas the sites that do work are static
> html pages.  To confirm this, I created a directory with some pages inside
> that are plain html files and they do get indexed.  Has anyone had similar
> problems?
> 
> This is what I get running htdig -vvv:
> 
> Warning: unknown locale!
>         1:0:http://dcfonline.sfu.ca/
> New server: dcfonline.sfu.ca, 80
> Retrieval command for http://dcfonline.sfu.ca/robots.txt: GET /robots.txt
> HTTP/1.0
> User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
> Host: dcfonline.sfu.ca
> 
> Header line: HTTP/1.1 404 Not Found
> Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
> Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 1
>  pushed
> pick: dcfonline.sfu.ca, # servers = 1
> 0:0:0:http://dcfonline.sfu.ca/: Retrieval command for
> http://dcfonline.sfu.ca/: GET / HTTP/1.0
> User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
> Host: dcfonline.sfu.ca
> 
> Header line: HTTP/1.1 200 OK
> Header line: Date: Mon, 12 Apr 1999 04:39:47 GMT
> Header line: Server: Apache/1.3.4 (Unix) PHP/3.0.7
> Header line: Connection: close
> Header line: Content-Type: text/html
> Header line:
> returnStatus = 0
> Read 4081 from document
> Read a total of 4081 bytes
>  size = 4081
> pick: dcfonline.sfu.ca, # servers = 1
> 
> Kind of strange, because looking at the source of that page through a
> browser it looks perfectly normal. Hope someone can help.
> 
> Ying

Well, it can be done :-)

I run a PHP mirror site, which is indexed using HTDig. The first things
I would check are:

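1) Make sure you are not trying to use the 'local files' option
(local_urls) - htdig would then read the raw files from disk instead of
fetching them through the web server, so it won't work without setting
PHP up as a parser for htdig.

2) Ensure that your exclude_urls and limit_urls_to values don't exclude
what you want to index.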
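
As a rough sketch of both points, the relevant part of htdig.conf might
look something like the fragment below. The values are only illustrative
(the start URL is taken from your log, the filesystem path is made up),
so compare them against your own config rather than copying them:

```
# Illustrative htdig.conf fragment; all values are assumptions.
start_url:      http://dcfonline.sfu.ca/

# Only URLs containing one of these strings are followed.
limit_urls_to:  ${start_url}

# Any URL containing one of these strings is skipped. Check that
# nothing listed here also matches your PHP pages.
exclude_urls:   /cgi-bin/ .cgi

# Leave local_urls unset so pages are fetched over HTTP and PHP gets
# to run first. The path below is hypothetical.
#local_urls:    http://dcfonline.sfu.ca/=/path/to/htdocs/
```

Both limit_urls_to and exclude_urls are plain substring matches, so an
entry that looks harmless can still knock out every dynamic page if it
happens to appear in their URLs.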

Cheers
-- 
David Robley

WEBMASTER                           | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES  | http://www.nisu.flinders.edu.au/
AusEinet                            | http://auseinet.flinders.edu.au/
            Flinders University, ADELAIDE, SOUTH AUSTRALIA
            Visit the PHP mirror at http://au.php.net:81/
