I am unable to reach the server you are referencing. It times out.

The htdig error is perhaps a little misleading. htdig always requests robots.txt first, whether or not the file exists. The problem is not robots.txt itself but the fact that dsscos.prod.fedex.com is unreachable on port 443, so htdig cannot retrieve robots.txt (or anything else) from it.
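
To make the distinction concrete, here is a minimal Python sketch (not part of htdig) that tells "the server answered but the file is missing" apart from "the server could not be reached at all". The function name classify_fetch and the returned strings are my own invention for illustration. Proxies are deliberately disabled so the result reflects a direct connection from the machine you run it on.

```python
import urllib.error
import urllib.request


def classify_fetch(url, timeout=10):
    """Return 'ok', 'http <status>' or 'unreachable' for a single GET.

    Hypothetical helper for illustration only; it is not an htdig API.
    """
    # An empty ProxyHandler ignores any proxy settings in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as err:
        # The server answered, e.g. with 404 for a missing robots.txt.
        return "http %d" % err.code
    except (urllib.error.URLError, OSError):
        # No answer: DNS failure, refused connection, timeout, TLS failure.
        return "unreachable"
```

Calling classify_fetch("https://dsscos.prod.fedex.com/robots.txt") from the machine that runs htdig would return "http 404" if the only problem were a missing robots.txt. A result of "unreachable" would match the "Unable to establish the connection" lines in your log.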

If you are inside a corporate firewall, you may well have access to that server on that port, but from outside it is not reachable.

Keep in mind that if the machine htdig runs on can reach the site through a web browser, htdig should be able to reach it as well.
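
If there is no browser on that machine, a raw TCP check does much the same job. This is only a sketch: can_connect is a hypothetical helper name, and it tests nothing beyond whether a TCP connection to the given host and port opens within the timeout. It does not attempt an SSL handshake.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout.

    Hypothetical helper for illustration only; it is not an htdig API.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Run on the htdig machine, can_connect("dsscos.prod.fedex.com", 443) returning False would point at routing or a firewall rather than at htdig.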

One more thing: you may need to configure htdig for SSL connections.

Good luck.

Ted Stresen-Reuter

On Apr 4, 2006, at 6:15 PM, Junie Ablay wrote:

Hi,
 
My systems are RH7 with htdig-3.2.0b6 installed. I'm trying to get htdig to index the site running on these servers. The URL I'm trying to index is a virtual IP that fronts 2 servers: if I type in this URL, the request goes to whichever server has less traffic. I'm trying to build the database on just one of the servers. Ports 80 and 443 are open.
 
This is the error I get. I tried to locate robots.txt, and the only copy is at /opt/fedex/htdig/htdig-3.2.0b6/test/htdocs/robots.txt, not in the Document Root. If I type in the URL https://dsscos.prod.fedex.com/robots.txt, it says the page cannot be found.
 
httpd is running, and if I go to the site https://dsscos.prod.fedex.com/, I can see data, which means this is a valid URL.
 
+++
[EMAIL PROTECTED] bin]# ./rundig -vvv
ht://dig Start Time: Tue Apr  4 10:32:47 2006
        1:1:https://dsscos.prod.fedex.com/
New server: dsscos.prod.fedex.com, 443
 - Persistent connections: enabled
 - HEAD before GET: enabled
 - Timeout: 30
 - Connection space: 0
 - Max Documents: -1
 - TCP retries: 1
 - TCP wait time: 5
 - Accept-Language:
Trying to retrieve robots.txt file
Making HTTP request on https://dsscos.prod.fedex.com/robots.txt
Unable to establish the connection with host: dsscos.prod.fedex.com (port 443)
Request time: 35 secs
Unable to establish the connection with host: dsscos.prod.fedex.com (port 443)
Request time: 35 secs
.Unable to establish the connection with host: dsscos.prod.fedex.com (port 443)
Request time: 35 secs
. pushed
pick: dsscos.prod.fedex.com, # servers = 1
> dsscos.prod.fedex.com with a traditional HTTP connection
ht://dig End Time: Tue Apr  4 10:34:32 2006
htpurge: Database is empty!
 
Preamble text:
+++++

 
To keep things simple, I just followed the standard procedure:

The standard GNU installation process works for ht://Dig.
./configure --prefix=/usr/local
make
make install
vi /usr/local/conf/htdig.conf
/usr/local/bin/rundig
(The final three commands must be issued as root.)
If I change "start_url" to a non-SSL site, rundig runs fine.
 
Any help would be greatly appreciated.
 
Junie Ablay III


_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
