Given the following URL:

  http://www.kanga.nu/archives/MUD-Dev-L/2000Q2/index.php

where the indexer is configured to honour meta/robots tags, and the HTML
of that page contains the following meta line:

  <META NAME="robots" CONTENT="noindex,follow">

(as I don't want that page indexed, but I do want it spidered), and
where the top three archived messages listed there, and their URLs, are
__NOT__ present in UdmSearch's index, and the following indexer
command line is executed:

  indexer -N 10 -a -u %/archives/%`date +%Y`%/index.php -e -n 20000

Why aren't the three new messages indexed?
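
To spell out what I expect that meta tag to mean, here is a minimal
Python sketch of the usual robots meta semantics. It is only an
illustration of the expected behaviour, not how UdmSearch's indexer is
actually implemented; the names RobotsMetaParser and robots_policy are
made up for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        for item in (attr.get("content") or "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_policy(html):
    """Return (may_index, may_follow) for a page's HTML (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    may_index = not ({"noindex", "none"} & found)
    may_follow = not ({"nofollow", "none"} & found)
    return may_index, may_follow


page = '<html><head><META NAME="robots" CONTENT="noindex,follow"></head></html>'
print(robots_policy(page))
```

With "noindex,follow" the page itself should stay out of the index
while the links on it remain eligible for spidering, which is the
behaviour I am after.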

Output from indexer:

  Indexer[14442]: indexer from UdmSearch v.3.0.18/MySQL started with '/usr/local/udmsearch/etc/indexer.conf'
  Indexer[14444]: [1] http://www.kanga.nu/archives/IRead-L/2000Q2/index.php
  Indexer[14445]: [2] http://www.kanga.nu/archives/MUD-Dev-L/2000Q2/index.php
  Indexer[14447]: [4] http://www.kanga.nu/archives/MUD-Dev-L/2000Q1/index.php
  Indexer[14448]: [5] http://www.kanga.nu/archives/Meta-L/2000Q2/index.php
  Indexer[14450]: [7] http://www.kanga.nu/archives/Meta-L/2000Q1/index.php
  Indexer[14445]: [2] Done
  Indexer[14452]: [9] Done
  Indexer[14446]: [3] Done
  Indexer[14453]: [10] Done
  Indexer[14449]: [6] Done
  Indexer[14447]: [4] Done
  Indexer[14444]: [1] Done
  Indexer[14448]: [5] Done
  Indexer[14450]: [7] Done
  Indexer[14451]: [8] Done

Which is totally correct aside from not noticing the three new URLs
and indexing them.  This is supported by the Apache logs, BTW, which
report only:

  bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/IRead-L/2000Q2/index.php HTTP/1.0" 200 2412 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
  bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/MUD-Dev-L/2000Q1/index.php HTTP/1.0" 200 143686 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
  bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/Meta-L/2000Q2/index.php HTTP/1.0" 200 84618 "-" "UdmSearch/3.0.18" 0 www.kanga.nu
  bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/Meta-L/2000Q1/index.php HTTP/1.0" 200 10740 "-" "UdmSearch/3.0.18" 0 www.kanga.nu
  bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/MUD-Dev-L/2000Q2/index.php HTTP/1.0" 200 158 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
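
To compare the requests against what the indexer reported, the log
entries above can be reduced to path, status and response size. This is
a minimal Python sketch, assuming each entry is on a single line; the
LOG sample is copied from the entries above, and REQUEST and fetched
are names invented for the example.

```python
import re

# Sample copied from the Apache log entries quoted above, one per line.
LOG = """\
bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/IRead-L/2000Q2/index.php HTTP/1.0" 200 2412 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/MUD-Dev-L/2000Q1/index.php HTTP/1.0" 200 143686 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/Meta-L/2000Q2/index.php HTTP/1.0" 200 84618 "-" "UdmSearch/3.0.18" 0 www.kanga.nu
bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/Meta-L/2000Q1/index.php HTTP/1.0" 200 10740 "-" "UdmSearch/3.0.18" 0 www.kanga.nu
bush.kanga.nu - - [24/Jun/2000:00:07:54 -0700] "GET /archives/MUD-Dev-L/2000Q2/index.php HTTP/1.0" 200 158 "-" "UdmSearch/3.0.18" 1 www.kanga.nu
"""

# Matches the quoted request plus the status and byte count that follow it.
REQUEST = re.compile(
    r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+)'
)


def fetched(log_text):
    """Map each requested path to its (status, size-in-bytes) pair."""
    result = {}
    for line in log_text.splitlines():
        match = REQUEST.search(line)
        if match:
            result[match.group("path")] = (
                int(match.group("status")),
                int(match.group("size")),
            )
    return result


# List the fetched pages from smallest to largest response.
for path, (status, size) in sorted(fetched(LOG).items(), key=lambda kv: kv[1][1]):
    print(f"{size:>8}  {status}  {path}")
```

In this sample the MUD-Dev-L/2000Q2 index is logged at 158 bytes,
against 143686 for 2000Q1; whether that is significant is not something
the log alone establishes.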

Ideas?

-- 
J C Lawrence                                 Home: [EMAIL PROTECTED]
----------(*)                              Other: [EMAIL PROTECTED]
--=| A man is as sane as he is dangerous to his environment |=--