A surprising omission in the 5-year history of indexing HTML is the
cgi-bin request. Normally such URLs are excluded, for obvious reasons,
by the default conf file settings.

It does mean, however, that even the existence of such URLs is not captured by the digging.

Both the 3.2 and 4.0 HTML specs allow
<A TITLE=string HREF=cgi-bin request>

where at least the title can describe the type of resource accessed by the
HREF pointer, e.g. a remote database. My question is:

Can the TITLE of a cgi-bin anchor be indexed easily by htdig?
It would in effect represent metadata about the resource. I am not sure other
metadata schemas (e.g. DC) can easily flag such information.
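
To make the idea concrete, here is a minimal sketch of what I have in mind, written with Python's standard html.parser rather than anything inside htdig itself. It only shows that the TITLE of a cgi-bin anchor can be harvested without ever requesting the URL. The class and function names, the "/cgi-bin/" substring test and the sample markup are all my own assumptions for illustration.

```python
from html.parser import HTMLParser


class CgiAnchorTitles(HTMLParser):
    """Collect (href, title) pairs for anchors whose HREF points into cgi-bin."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A TITLE=...> matches.
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        title = attr.get("title")
        # Assumption: a cgi-bin request is recognised by "/cgi-bin/" in the URL.
        if "/cgi-bin/" in href and title:
            self.found.append((href, title))


def cgi_anchor_titles(html_text):
    """Return the (href, title) pairs found in html_text; no URL is fetched."""
    parser = CgiAnchorTitles()
    parser.feed(html_text)
    parser.close()
    return parser.found


# Hypothetical sample page, not taken from any real site.
sample = (
    '<A TITLE="Crystal structure database" '
    'HREF="/cgi-bin/search?db=crystal">Search</A> '
    '<A HREF="/index.html">Home</A>'
)
print(cgi_anchor_titles(sample))
```

The pairs returned could then be handed to an indexer as descriptive text for the excluded URL, which is the behaviour I am asking about.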

My next point is to observe that, amazingly, the formal 3.2 and 4.0
definitions of the <FORM> attributes do not include a title! This means
that <FORM ACTION=cgi-bin request> cannot have a title
attribute. Formally, <FORM> and <A> should be treated equivalently:
if one has a title, so should the other.

Actually, on this final point, can someone let me know whether htdig
conforms exactly to any W3C spec of HTML? For example, does it
track TITLE attributes in the various elements that have them, e.g.
<object>?
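
By "track" I mean something like the following sketch, again using Python's standard html.parser and not htdig's own parser. It records every TITLE attribute it meets, grouped by the element that carried it. The class name, function name and grouping are hypothetical choices of mine.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleAttributeSurvey(HTMLParser):
    """Record TITLE attribute values, keyed by the (lowercased) element name."""

    def __init__(self):
        super().__init__()
        self.titles = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        # Look at every element, not just <a>; skip empty or valueless titles.
        for name, value in attrs:
            if name == "title" and value:
                self.titles[tag].append(value)


def survey_titles(html_text):
    """Return {element name: [title values]} for html_text."""
    parser = TitleAttributeSurvey()
    parser.feed(html_text)
    parser.close()
    return dict(parser.titles)
```

Note that this looks only at TITLE attributes; the text of the document's <TITLE> element is a separate matter and is not collected here.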

If anyone has any thoughts on how to handle meta indexing of cgi-bin
requests, please let me know.

Thanks! 

Dr Henry Rzepa,  Dept. Chemistry,  Imperial College,  LONDON SW7 2AY;
mailto:[EMAIL PROTECTED]; Tel  (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/ 

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word unsubscribe in
the SUBJECT of the message.
