I have found htdig very reliable in digging simple sites, but the following
site seems to have broken it:

http://www.xml-cml.org/


The first four documents on this site, as determined by a (successful)
use of w3mir, are as follows. All are within framesets.

w3mir: index.html
w3mir: front_lower.html
w3mir: front_page.html
w3mir: left_index.html

htdig 3.1.2 indexes the first three, and then stops. left_index.html is
called from a frameset declaration in front_page.html, itself called from a
frameset declaration in index.html.
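
To make the nesting concrete, here is a minimal sketch in Python of the
traversal I would expect from a crawler that follows frames inside frames.
It is not htdig's code. The page bodies in PAGES are invented for
illustration (only the file names come from the site), and the names
FrameSrcParser and crawl are my own.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "http://www.xml-cml.org/"

# Hypothetical page bodies: only the file names are taken from the site,
# the markup is an assumption made up for this sketch.
PAGES = {
    BASE + "index.html":
        '<frameset rows="80,*"><frame src="front_lower.html">'
        '<frame src="front_page.html"></frameset>',
    BASE + "front_lower.html": "<html><body>lower</body></html>",
    BASE + "front_page.html":
        '<frameset cols="200,*"><frame src="left_index.html"></frameset>',
    BASE + "left_index.html": "<html><body>index</body></html>",
}


class FrameSrcParser(HTMLParser):
    """Collect the src attribute of every <frame> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "frame":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def crawl(start, pages):
    """Breadth-first walk over frame links, returning URLs in visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        parser = FrameSrcParser()
        parser.feed(pages.get(url, ""))
        for src in parser.sources:
            target = urljoin(url, src)  # resolve relative frame src
            if target not in seen:      # the "push" step for new documents
                seen.add(target)
                queue.append(target)
    return order


for visited in crawl(BASE + "index.html", PAGES):
    print(visited)
```

With these made-up pages the walk reaches all four documents, with
left_index.html last, which is the behaviour w3mir shows and htdig does not.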

Running htdig with -vvv gives the following diagnostics:
===================
title: Untitled Document
href: http://www.xml-cml.org/left_index.html ()
resolving 'http://www.xml-cml.org/left_index.html'
*href: http://www.xml-cml.org/front_page.html ()
resolving 'http://www.xml-cml.org/front_page.html'

   pushing http://www.xml-cml.org/front_page.html
+ size = 359
pick: www.xml-cml.org, # servers = 1
3:3:2:http://www.xml-cml.org/front_page.html: Retrieval command for http://www.xml-cml.org/front_page.html: GET /front_page.html HTTP/1.0
User-Agent: htdig/3.1.2 ([EMAIL PROTECTED])
Referer: http://www.xml-cml.org/front_lower.html
Host: www.xml-cml.org
==================

i.e. left_index.html is recognised as present in the frameset, but no
"push" of this document occurs.

Is anyone aware of any issues with the use of framesets that call other
framesets? There was some discussion of nested framesets on June 11,
but I was not sure from that whether htdig has specific problems with such
framesets.

Dr Henry Rzepa,  Dept. Chemistry,  Imperial College,  LONDON SW7 2AY;
mailto:[EMAIL PROTECTED]; Tel  (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/ 
