OK.
I assume that your web server runs Linux or Unix,
and that you can log in to it and get a shell.
This is your account prompt:
$ _
I also assume that you have permission to write a file in the main
directory of your server.
Change to that directory with the command:
$ cd /home/httpd/html
I assume this is the main directory (document root) of your web server.
Now I am going to create an HTML file that links to every file in every
directory below /home/httpd/html, and name it
all_links_of_my_web_site.html, with this command:
$ find . -depth -print | awk '{ print "<a href=\"" $0 "\">" $0 "</a>" }' > all_links_of_my_web_site.html
Type that command on one line; the $ symbol is your prompt, not part of the command.
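If you would rather keep this in a small script, here is a rough sketch of the same idea. The function name make_link_page is just something I made up, and it differs a little from the one-liner: it uses -type f so only regular files are listed, it skips the link page itself, and it adds a <br> after each link. It does not escape characters such as & or < in file names, so treat it as a starting point only.

```shell
#!/bin/sh
# Sketch only: build an HTML page with one link per regular file under a
# document root. make_link_page is a hypothetical helper name.
make_link_page() {
    docroot=$1   # e.g. /home/httpd/html (assumed document root)
    out=$2       # e.g. all_links_of_my_web_site.html (created inside docroot)
    (
        cd "$docroot" || exit 1
        # -type f keeps regular files only; "! -name" leaves out the link
        # page itself, which already exists when find starts running.
        find . -type f ! -name "$out" -print |
            sed 's|^\./||' |
            awk '{ print "<a href=\"" $0 "\">" $0 "</a><br>" }' > "$out"
    )
}
```

You would call it as: make_link_page /home/httpd/html all_links_of_my_web_site.html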
Check the size of the file and, if it is larger than max_doc_size in
htdig.conf, raise max_doc_size so that htdig reads the file to the end.
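A quick way to do that check from the shell might look like the sketch below. The function name needs_bigger_limit is made up for this example, and you have to pass in whatever value your own htdig.conf uses for max_doc_size; the script does not read htdig.conf for you.

```shell
#!/bin/sh
# Sketch only: compare a file's size in bytes with a limit you supply.
# needs_bigger_limit is a hypothetical helper name.
needs_bigger_limit() {
    file=$1
    limit=$2    # the max_doc_size value from your htdig.conf, in bytes
    # The arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -gt "$limit" ]; then
        echo "$file is $size bytes: raise max_doc_size above $size"
        return 0
    fi
    echo "$file is $size bytes: fits within max_doc_size $limit"
    return 1
}
```

For example, needs_bigger_limit all_links_of_my_web_site.html 100000 would tell you whether the file fits in a limit of 100000 bytes (that figure is only an illustration).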
Now you have a file that points to every file on your site. Tell
htdig to start indexing from this file by setting start_url:
start_url: http://your.domain/all_links_of_my_web_site.html
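As an alternative, Geoff's message quoted below says start_url can also include a file of URLs. A rough sketch of producing such a file with the same find approach follows. The function name make_url_list is made up, and http://your.domain is only the placeholder used above; spaces and other special characters in file names are not URL-encoded here.

```shell
#!/bin/sh
# Sketch only: print one absolute URL per regular file under a document root.
# make_url_list is a hypothetical helper name.
make_url_list() {
    docroot=$1   # e.g. /home/httpd/html (assumed document root)
    base=$2      # e.g. http://your.domain (placeholder, no trailing slash)
    (
        cd "$docroot" || exit 1
        # Strip the leading "./" that find prints, then prefix the base URL.
        find . -type f -print |
            awk -v base="$base" '{ sub(/^\.\//, ""); print base "/" $0 }'
    )
}
```

You could run make_url_list /home/httpd/html http://your.domain > /path/to/urls.txt and then point start_url at that file as Geoff describes.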
Good Luck
----
Heriberto Cantu
http://www.elinux.com.mx
Monterrey, Mexico
Tel: (8)129-1121
Cel: 0448-256-8807
At 05:10 p.m. 15/12/00 +0000, you wrote:
>Trying to understand the last message from Heriberto:
>
>Are you saying that you can create a file which contains all URLs for the
>site and thereby aid in indexing the site?
>
>What does $ find . -depth -print mean? Is it supposed to be typed somewhere?
>If so, where?
>
>When you say "complete path to file" do you actually mean "files" (plural)?
>
>Where would you enter or use it? In the server? Under a given
>sub-directory? At the prompt on the browser?
>
>What does "awk" mean?
>
>What do you mean by "pipe the output"?
>
>What do you mean by "print the link"?
>
>What is the significance of "<a href=$1>$1</a>"?
>
>What is someone supposed to do with this? Type it? Insert it?
>If so, where?
>
>If you could be more specific, I'll try to follow.
>
>Thanks.
>
>At 04:49 PM 12/15/00 -0600, you wrote:
>>Maybe a better idea is to use find to create such a file.
>>$ find . -depth -print
>>
>>And now you have the complete path to the file.
>>
>>You just need to pipe the output to awk and print the
>>link "<a href=$1>$1</a>"
>>
>>Good Luck
>>
>>At 09:09 a.m. 15/12/00 +0000, you wrote:
>>>At 18:46 14/12/2000 -0500, Geoff Hutchison wrote:
>>>>You can list as many URLs as you want in the start_url attribute, or you
>>>>can also include a file into the htdig.conf. e.g.:
>>>>
>>>>start_url: `/path/to/urls.txt`
>>>
>>>
>>>I guess this would be the way to do it, excuse me if I'm stating the
>>>obvious.
>>>
>>>Go to your root directory (For your web docs) eg /news/archive
>>>ls -R > temp.file
>>>
>>>you'll get e.g.
>>>
>>>/news/archive:
>>>
>>>file1 file2 file3
>>>
>>>write a short script to parse temp.file
>>>
>>>find a line that ends in :
>>>strip the :
>>>
>>>write to urls.txt
>>>
>>>the line that ended in : (- the colon)/file1
>>>the line that ended in : (- the colon) /file2
>>>...
>>>till you find another line that ends in :
>>>
>>>Actually I think there's a far easier way to do this
>>>in perl but I can't think of it off the top of my head.
>>>
>>>Maybe a Feature Request? - Ability to give a start directory
>>>and index the files in the directory tree? (Or was that another ht://
>>>product)
>>>
>>> Dunk
>>>
>>>
>>>
>>>>--
>>>>-Geoff Hutchison
>>>>Williams Students Online
>>>>http://wso.williams.edu/
>>>>
>>>>
>>>>------------------------------------
>>>>To unsubscribe from the htdig mailing list, send a message to
>>>>[EMAIL PROTECTED]
>>>>You will receive a message to confirm this.
>>>>List archives: <http://www.htdig.org/mail/menu.html>
>>>>FAQ: <http://www.htdig.org/FAQ.html>
>>>
>>>
>
>-------------------------------------------------------------
>The Nationalist Movement
>PO Box 2000
>Learned MS 39154
>(601) 885-2288
>Clinic: http://www.nationalist.org/board/html/index.php
>Crosstarlist: http://www.nationalist.org/docs/resources/list.html
>E-mail: mailto:[EMAIL PROTECTED]
>Forum: http://www.nationalist.org/forum/index.php
>Home Page: http://www.nationalist.org
>ICQ: 5429992
>Newsgroup: alt.national
>Views not necessarily those of The Nationalist Movement
>© 2000 by The Nationalist Movement
>-------------------------------------------------------------
>
>END
>
>