Greetings,

Has anyone else run across this?  Somebody somewhere on our website made a
link like...
  http://www.goshen.edu./foo
(Notice the period after the domain name).

Now I'm getting duplicates of many pages in my htdig (3.2...) index.
E.g.
  http://www.goshen.edu./foo and
  http://www.goshen.edu/foo
are indexed as two different pages.

I was able to axe the duplicates with this configuration file 
directive (yeah, it would be more elegant to do a URL rewrite...)
  exclude_urls:   goshen.edu./

But this got me thinking... I thought I should/could really get rid of
these with (apache) webserver rewrite rules, but after a bit of trying,
I could not get apache's mod_rewrite to treat
goshen.edu./ and goshen.edu/ in any distinguishable way.

Now I'm wondering if this is a deeper problem.  This URL:
  http://httpd.apache.org/docs-2.0/mod/mod_proxy.html
seems to indicate that a domain with a trailing period might in principle
be considered equivalent to one without.  (Search the page for "trailing
period".)

If so, it might be nice if htdig could automagically treat URLs like
mydomain.com. and mydomain.com as the same.  Those other guys, Google et
al., must surely compensate for this sort of thing?  But perhaps someone who
knows more about DNS than I do could straighten me out?
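
To make the idea concrete, here is a rough sketch of the sort of
normalization I mean.  It is written in Python purely for illustration; it
is not htdig code, and the function name strip_trailing_dot is made up.  It
assumes that dropping a single trailing period from the host name is all
that is wanted.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_dot(url):
    """Return url with one trailing period removed from its host name.

    Hypothetical helper for illustration only; URLs whose host has no
    trailing period are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith("."):
        return url

    # Drop the trailing period, then rebuild the network location,
    # keeping any user info and port that were present.
    netloc = host[:-1]
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo = f"{userinfo}:{parts.password}"
        netloc = f"{userinfo}@{netloc}"

    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    for u in ("http://www.goshen.edu./foo", "http://www.goshen.edu/foo"):
        print(u, "->", strip_trailing_dot(u))
```

With something like this applied before URLs are compared, both forms of
the /foo link above would end up as the same string.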

-Paul

-- 
Paul Meyer Reimer      paulmr at goshen.edu
Goshen College          
Goshen, IN  46526       

_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general