On Tue, 13 Apr 2004, Tinni wrote:

> For URL indexing, I found there are lots of URLs which are different from our site.
> In your explanation below, it is point one that you mentioned...
> 
> I have used 'start_url' though, but it is still spidering different URLs. Now I
> have used the following parameters.

What do your start_url and limit_urls_to attributes look like? These are
the most important settings for keeping the dig within the intended
sites.
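
As a rough sketch, the two attributes might look like the following in
htdig.conf. The host name is only the placeholder from your example, not
your real setting. If I remember the documentation correctly,
limit_urls_to is a list of patterns, a URL is only followed if it matches
at least one of them, and it defaults to the value of start_url when left
unset.

```
# Hypothetical values; substitute your own site.
start_url:      http://www.example.com/
limit_urls_to:  http://www.example.com/
```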

> common_url_parts: http://www.example.com/

This attribute just provides a way to reduce the amount of space used for
common strings in the database. It doesn't affect which URLs are indexed.

> local_urls:     http://www.example.com/

This attribute just requests that htdig grab the files directly from the
local file system rather than going through the web server. Again, it
doesn't affect which URLs are indexed.
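
For what it's worth, the documented form of this attribute maps a URL
prefix to a local directory with an equals sign. A minimal sketch,
assuming the document root is /var/www/html (a made-up path; use wherever
your files actually live):

```
# Hypothetical mapping: URL prefix on the left, local directory on the right.
local_urls:     http://www.example.com/=/var/www/html/
```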

> local_urls_only: true

This attribute says that only files available through the local filesystem
are to be indexed. It might very well be limiting the URLs being indexed,
but in a roundabout way, and perhaps even in an incorrect way, depending on
exactly what you are trying to accomplish.

> I am merging all the files now. There are 20 sites I need to merge. While I was
> creating the individual databases, I found that one huge file named "core" was
> being created. What is the file for? I have deleted the file; it seems to be a
> binary.

In most cases finding a big file named core is a bad thing. It means that
some program is crashing. The core file contains a lot of information
about the program and the state it was in when it crashed. Running the
command 'file core' might provide some insight into which program is
crashing.

Jim


_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
