> You may delete pages with "html;" substring
> indexer  -Cw -u "%html;%"

Does this command also remove the indexed entries in the ndict table? (I am
running version 3.0.10, I think.)
I have been trying to monitor the (very slow) progress as indexer removes
entries from the url table 50 at a time - slowed no doubt by the size of the
database I have (over 135,000 webpages indexed so far and about 30000 or so
that need to be removed). It has taken over 8 hours so far to remove 12,500
entries or so. However, watching the database using phpMyAdmin, it does not
seem to be removing the entries from ndict at all.

If it is not affecting ndict, perhaps I could just issue a MySQL command to
the effect of:
DELETE FROM url WHERE url LIKE '%html;%';
and it would be faster. A follow-up script could then remove the entries
from ndict that no longer need to be there. Not sure about the latter, though.
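
To make the idea concrete, here is a rough sketch of the two-step cleanup I
have in mind. It rests on assumptions I have not checked against the 3.0.10
schema: that ndict has a url_id column referring to url.rec_id, and that
nothing else needs updating. The function names are made up for illustration,
and an in-memory SQLite database stands in for MySQL so the example runs on
its own. The ids are fetched first rather than using a subquery, since older
MySQL releases do not support subqueries.

```python
import sqlite3


def purge_urls(conn, pattern):
    """Delete url rows whose url matches the LIKE pattern, plus their ndict rows.

    Assumes ndict.url_id refers to url.rec_id (an assumption, not verified
    against the real UdmSearch schema). Returns the number of urls removed.
    """
    cur = conn.cursor()
    # Step 1: collect the ids of the pages to remove.
    cur.execute("SELECT rec_id FROM url WHERE url LIKE ?", (pattern,))
    ids = [(row[0],) for row in cur.fetchall()]
    # Step 2: remove the word entries that belong to those pages.
    cur.executemany("DELETE FROM ndict WHERE url_id = ?", ids)
    # Step 3: remove the pages themselves.
    cur.executemany("DELETE FROM url WHERE rec_id = ?", ids)
    conn.commit()
    return len(ids)


def make_demo_db():
    """Build a tiny stand-in database with a guessed-at url/ndict layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT);
        CREATE TABLE ndict (url_id INTEGER, word_id INTEGER, intag INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO url VALUES (?, ?)",
        [
            (1, "http://example.com/a.html"),
            (2, "http://example.com/a.html;sid=1"),
            (3, "http://example.com/b.html;sid=2"),
        ],
    )
    conn.executemany(
        "INSERT INTO ndict VALUES (?, ?, ?)",
        [(1, 10, 1), (2, 10, 1), (2, 11, 1), (3, 12, 1)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    removed = purge_urls(db, "%html;%")
    print("urls removed:", removed)
```

Against the real database I would want a backup first, and would probably
delete in batches rather than all at once.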

A related question: how much faster is crc/multi than the mode I am using
(which is crc alone, I think, since I have only the ndict, url, robots, and
stopwords tables in my database)? Would I see substantial improvements using
a different style of Udmsearch database?

I don't really want to have to respider the entire database but if upgrading
my udmsearch would add substantial speed improvements then it would be worth
it. I like Udmsearch, but it's slow on the system I am running it on (Solaris
7 on a Sun box clone of some sort - not my system. Not sure of the RAM).

Thanks,

Atho
