According to Dave Parfitt:
> I want people to be able to create pdf files and save them to a
> directory on our intranet.  I don't want to make a hyperlink to each
> new pdf file on a webpage in order for that new pdf file to be
> indexed.  Can this be done?

It can be done pretty easily if you have shell access to the web server,
and can run a "find" command to get a list of all your PDF files, as
explained in http://www.htdig.org/FAQ.html#q5.25

To give you a more concrete example, more relevant to your PDF files,
on my system the DocumentRoot for Apache is set to /home/httpd/html,
so I can use this command:

find /home/httpd/html -type f -name '*.[Pp][Dd][Ff]' -print |
   sed -e 's|/home/httpd/html/|http://www.scrc.umanitoba.ca/|' \
   > /etc/htdig/pdflist.txt

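to build the list of URLs of all PDF files on my server (the pattern
matches the .pdf extension in any mix of upper and lower case).
I can then put this attribute setting in my htdig.conf to use this list;
the backquotes tell htdig to read the attribute's value from that file: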

start_url: `/etc/htdig/pdflist.txt`

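Just change the directory name (in both the find and sed commands) and
the URL (in the sed command) to whatever you need for your server.  You
can use whatever file name you want for the PDF list, as long as
start_url points to it.  Rerun the find command before each indexing
run so that newly saved PDF files get picked up.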
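
If you end up doing this regularly, the find and sed steps above could
be wrapped in a small shell function along these lines.  This is only a
sketch: the function name build_pdf_list and its three arguments are
made up for illustration, and it assumes the directory name contains no
"|" or other characters special to sed, and that your file names need
no URL encoding (no spaces, for example).

```shell
#!/bin/sh
# build_pdf_list DOCROOT BASE_URL OUTFILE
# Hypothetical helper (not part of ht://Dig): lists every PDF file
# under DOCROOT and rewrites each path into a URL under BASE_URL.
build_pdf_list() {
    docroot=${1%/}      # drop one trailing slash, if present
    base_url=${2%/}
    outfile=$3

    # Write to a temporary file first, so a reader of OUTFILE never
    # sees a half-written list; rename only if the pipeline succeeds.
    tmpfile="$outfile.tmp.$$"
    find "$docroot" -type f -name '*.[Pp][Dd][Ff]' -print |
        sed -e "s|^$docroot/|$base_url/|" > "$tmpfile" &&
        mv "$tmpfile" "$outfile"
}
```

With the paths from my example, the call would be:

    build_pdf_list /home/httpd/html http://www.scrc.umanitoba.ca/ /etc/htdig/pdflist.txt

You would run that just before your usual indexing run, for instance
from the same cron job.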

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


