Or, unless I'm missing something, there's an -f option to the indexer that
you can use to tell the indexer to read a specific list of URLs from a
file. So if you need to reindex www.foo.com/apple.html and
www.bar.com/orange.html every time the indexer starts up, you could just:
echo "http://www.foo.com/apple.html" > startup_urls.txt
echo "http://www.bar.com/orange.html" >> startup_urls.txt
/path/to/indexer -a -f startup_urls.txt
... at least, that's what I seem to remember. Unfortunately, I haven't
upgraded udmsearch since I contributed a patch to add the -f option way
back around version 2.2.1, so maybe it's not there anymore ;-)
(For my purposes, I *know* that http://www.whatever.com/newfiles.html will
contain a list of newly added files to the sites on my server(s) ... so I
simply tell the indexer to start at http://www.whatever.com/newfiles.html,
reindex whatever documents it finds there, and then give up.)
Cheers,
Charlie
On Thu, 22 Jun 2000, Willem Brown wrote:
> Hi,
>
> I expire those pages manualy before starting the indexer. The
> command is like this.
>
> indexer -a -u %/~lists/%/index.html -u %/~lists/%/mail%.html -u %/~lists/%/thr%.html
>
> The idea is to select certain pages (URLs) with the -u patterns and
> mark them as expired, which is what the -a option does.
>
> Regards
> Willem Brown
>
> On Thu, Jun 22, 2000 at 12:11:12PM -0700, J C Lawrence wrote:
> > On Thu, 22 Jun 2000 15:46:20 -0300 (ADT)
> > The Hermit Hacker <[EMAIL PROTECTED]> wrote:
> >
> > > Try setting your Period higher and using the -n option to restrict
> > > the number of pages it does in an invocation ...
> >
> > > for instance, set Period to 1week, and -n option to 20k ...
> >
> > > this way it only processes 20k expired pages, and they will only
> > > expire again in a week ...
> >
> > This creates a different problem, and is why I have the Period set
> > low:
> >
> > There are certain pages that must be spidered on every indexing run.
> >
> > Specifically, most of my site consists of mailing list archives.
> > The various index pages for the messages in those archives need to
> > be spidered every time to pick up the new messages.
> >
> > Summary:
> >
> > I need a long Period to prevent everything being reindexed every
> > time, but I also need certain pages to be spidered for new URLs
> > every single time the indexer runs. How can I do both?
> >
> > --
> > J C Lawrence Home: [EMAIL PROTECTED]
> > ----------(*) Other: [EMAIL PROTECTED]
> > --=| A man is as sane as he is dangerous to his environment |=--
> > ______________
> > If you want to unsubscribe send "unsubscribe udmsearch"
> > to [EMAIL PROTECTED]
> >
>
> --
> /* =============================================================== */
> /* Linux, FreeBSD, NetBSD, OpenBSD. The choice is yours. */
> /* =============================================================== */
>
> "The release of emotion is what keeps us healthy. Emotionally healthy."
>
> "That may be, Doctor. However, I have noted that the healthy release
> of emotion is frequently unhealthy for those closest to you."
> -- McCoy and Spock, "Plato's Stepchildren", stardate 5784.3
>