thanks.  that's more cumbersome than just indexing the new file every day.

-----Original Message-----
From: David Adams [mailto:[EMAIL PROTECTED]
Sent: Monday, March 22, 2004 4:14 AM
To: Erick Calder; [EMAIL PROTECTED]
Subject: Re: [htdig] how to suppress indexing /


Your index page is being generated by the Apache server, which helpfully (?)
provides several versions with different sort orders.  The bad_querystr:
statement excludes all these.
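
As a rough illustration of the effect, here is a minimal Python sketch. It assumes htdig rejects a URL when its query string contains any of the listed patterns as a substring; the function name is_excluded and the sample URLs are made up for illustration, and this is not htdig's actual code.

```python
from urllib.parse import urlsplit

# Patterns copied from the Apache 2 style bad_querystr setting quoted below.
BAD_QUERYSTR = [
    "C=D&O=A", "C=D&O=D",
    "C=M&O=A", "C=M&O=D",
    "C=N&O=A", "C=N&O=D",
    "C=S&O=A", "C=S&O=D",
]


def is_excluded(url, patterns=BAD_QUERYSTR):
    """Return True if the URL's query string contains any pattern.

    Substring matching is an assumption about how htdig applies the list.
    """
    query = urlsplit(url).query
    return any(pattern in query for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical URLs: one sorted variant of the listing, and the plain listing.
    for url in (
        "http://www.arix.com/wotd/data/?C=N&O=D",
        "http://www.arix.com/wotd/data/",
    ):
        print(url, is_excluded(url))
```

Under that assumption the sorted variants of the listing are skipped, while the plain listing and the individual pages are still fetched.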

If you want to exclude the index.html page altogether then you can do this
by writing your own and including

<meta name="robots" content="noindex, follow">

in the head part of the page.
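
A minimal sketch of generating such a page, in Python. The function name build_index, the page title, and the choice to list every .html file in the directory are assumptions for illustration. It also assumes the server is configured to serve index.html in place of its generated listing.

```python
import html
from pathlib import Path
from urllib.parse import quote


def build_index(directory):
    """Return the text of an index page linking to the .html files in `directory`.

    The robots meta tag asks the indexer to skip this page itself
    while still following the links on it.
    """
    names = sorted(
        p.name for p in Path(directory).glob("*.html") if p.name != "index.html"
    )
    links = "\n".join(
        f'<li><a href="{html.escape(quote(name), quote=True)}">'
        f"{html.escape(name)}</a></li>"
        for name in names
    )
    return (
        "<html>\n<head>\n"
        '<meta name="robots" content="noindex, follow">\n'
        "<title>Word of the day pages</title>\n"
        "</head>\n<body>\n<ul>\n"
        f"{links}\n"
        "</ul>\n</body>\n</html>\n"
    )
```

The result could then be written next to the data files, for example with Path("/var/www/html/wotd/data/index.html").write_text(build_index("/var/www/html/wotd/data")), each time a new file is added.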

David Adams

----- Original Message -----
From: "Erick Calder" <[EMAIL PROTECTED]>
To: "David Adams" <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Friday, March 19, 2004 5:20 PM
Subject: RE: [htdig] how to suppress indexing /


> I take it that ignores links that match (any of?) those urls... but isn't
> there a way to index /wotd/data (which produces a listing of the .html files
> in that directory) without having that index page itself get indexed?
>
> it's almost as though an exclusive recursion level == 1 would need to be
> specified.
>
> -----Original Message-----
> From: David Adams [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 19, 2004 5:45 AM
> To: David Adams; Erick Calder; [EMAIL PROTECTED]
> Subject: Re: [htdig] how to suppress indexing /
>
>
> Sorry, that should of course be:
>
> bad_querystr:    C=D&O=A C=D&O=D \
>                          C=M&O=A C=M&O=D \
>                          C=N&O=A C=N&O=D \
>                          C=S&O=A C=S&O=D
>
> or to include Apache 1 servers as well:
>
> bad_querystr:    D=A  D=D \
>                          M=A  M=D \
>                          N=A  N=D \
>                          S=A  S=D \
>                          C=D&O=A C=D&O=D \
>                          C=M&O=A C=M&O=D \
>                          C=N&O=A C=N&O=D \
>                          C=S&O=A C=S&O=D
>
> David Adams
>
> ----- Original Message -----
> From: "David Adams" <[EMAIL PROTECTED]>
> To: "Erick Calder" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Sent: Friday, March 19, 2004 1:22 PM
> Subject: Re: [htdig] how to suppress indexing /
>
>
> > It looks as though you have an Apache 2 server.  You can reduce those
> > multiple /wotd/data/ entries to one by including this in your
> > configuration file:
> >
> > bad_querstr:    C=D&O=A C=D&O=D \
> >                        C=M&O=A C=M&O=D \
> >                        C=N&O=A C=N&O=D \
> >                        C=S&O=A C=S&O=D
> >
> > David Adams
> > Corporate Information Services
> > Information Systems Services
> > University of Southampton
> >
> > ----- Original Message -----
> > From: "Erick Calder" <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Thursday, March 18, 2004 8:35 PM
> > Subject: [htdig] how to suppress indexing /
> >
> >
> > > hello everyone,
> > >
> > > I publish an index of the word-of-the-day from yourdictionary.com, which
> > > may be found at: http://www.arix.com/wotd/
> > >
> > > I create the index by grabbing the daily WOTD and writing a .html file
> > > into /var/www/html/wotd/data.  I create a config file (today.conf) to
> > > index the new file and call "htdig -c today.conf; htmerge -c today.conf" -
> > > a sample config file is included below.
> > >
> > > my question is: when I search for a word I get a bunch of hits like:
> > >
> > > Index of /wotd/data
> > >
> > > try it yourself by searching for "prince".  why is this and how can I
> > > suppress it?
> > >
> > > tia - erick
> > >
> > > --- today.conf ---
> > >
> > > common_dir:     /var/www/html/wotd
> > > database_dir:   ${common_dir}/db
> > > start_url:              http://www.arix.com/wotd/data/prince.html
> > > limit_urls_to:  ${start_url}
> > > max_head_length:        10000
> > > max_doc_size:           200000
> > > maintainer:             [EMAIL PROTECTED]
> > > no_excerpt_show_top:    true
> > > excerpt_length: 300
> > > template_map:   Long long ${common_dir}/long.html
> > > template_name:  long
> > > search_algorithm:       exact:1 synonyms:0.5 endings:0.1
> > > search_results_header: ${common_dir}/header.html
> > > search_results_footer: ${common_dir}/footer.html
> > > nothing_found_file: ${common_dir}/nichts.html
> > >
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.Net email is sponsored by: IBM Linux Tutorials
> > > Free Linux tutorial presented by Daniel Robbins, President and CEO of
> > > GenToo technologies. Learn everything from fundamentals to system
> > > administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> > > _______________________________________________
> > > ht://Dig general mailing list: <[EMAIL PROTECTED]>
> > > ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> > > List information (subscribe/unsubscribe, etc.)
> > > https://lists.sourceforge.net/lists/listinfo/htdig-general
> > >
> >
> >
> >
> >
>
>


