Ah, well, sure, too easy to hide it ;-)

find /www/htdocs/ -type f -name '*.htm*' |
  sed 's|^/www/htdocs|http://www.yourdomain.com|' \
  > /where/ever/you/need/it/allfiles.list

This limits the list to files matching *.htm* (-type f also
skips directories named "foo.html"), so you don't end up with
tons of image files in the file list.
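
If you would rather keep this in a small script, here is a minimal
sketch of the same idea as a shell function. The function name
make_url_list and its two arguments are made up for illustration and
are not part of ht://Dig. Instead of sed it strips the document root
with shell prefix removal, so characters in the path cannot be
misread as a regular expression. It assumes no file name contains a
newline.

```shell
#!/bin/sh
# Sketch only: make_url_list is a hypothetical helper, not an ht://Dig tool.
# Usage: make_url_list <local document root> <base URL>
make_url_list() {
    docroot=${1%/}    # local document root, one trailing slash stripped
    baseurl=${2%/}    # URL prefix that replaces the document root

    # -type f skips directories (even ones named "foo.html").
    # Quoting the pattern stops the shell from expanding it
    # before find gets to see it.
    find "$docroot" -type f -name '*.htm*' |
    while IFS= read -r file; do
        # Remove the leading document root, then prepend the base URL.
        printf '%s%s\n' "$baseurl" "${file#"$docroot"}"
    done
}

# Example call, using the placeholder paths from this message:
# make_url_list /www/htdocs/ http://www.yourdomain.com \
#     > /where/ever/you/need/it/allfiles.list
```

Run it from cron before each dig so the list stays in step with
the files on disk.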

In my config file I have:
start_url:  `/where/ever/you/need/it/allfiles.list`
  (Note: this does not have to be within the htdocs tree)

local_urls: http://www.yourdomain.com=/www/htdocs/
  If you have server-parsed HTML (like PHP), you certainly
  won't want to use local_urls, although it speeds things up
  quite a bit.

  Maybe you would also like to add
limit_urls_to:          ${start_url}
  as well.

Marcel



On 24 May 00, at 16:06, J. op den Brouw wrote:

> 
> Are you willing to share this script with me? I need exactly that
> what you wrote.
> 
> Thanx in advance.
> 
> On Wed, 24 May 2000, Marcel Hicking wrote:
> 
> > Since I don't have a document referring all files
> > to be indexed, I'm thinking of generating a
> > start_url file "on the fly".
> > 
> > I have been doing this for a much smaller site: 
> > I have set up a little shell script to generate
> > a list with all available files and send it through
> > sed to convert local paths to http://...-URLs. 
> > ht://dig is set up with  start_url=allfiles.list
> > and a local_urls line to "undo" the above mapping 
> > again.
> > 
> > Do you think this is appropriate for a larger search
> > or do you have any other suggestions?
> > 
> > Marcel
> 
> --jesse
> --------------------------------------------------------------------
> J. op den Brouw                           Johanna Westerdijkplein 75
> Haagse Hogeschool                                  2521 EN  DEN HAAG
> Faculty of Engeneering                                   Netherlands
> Electrical Engeneering                                +31 70 4458936
> -------------------- [EMAIL PROTECTED] --------------------
> 
> Linux - because reboots are for hardware changes
> 


