Atho wrote:
> 
> > You may delete pages with the "html;" substring:
> > indexer -Cw -u "%html;%"
> 
> Does this command also remove indexed entries in the ndict table? (ver
> 3.0.10 I think)

Yes.

> I have been trying to monitor the (very slow) progress as indexer removes
> entries from the url table 50 at a time - slowed no doubt by the size of the
> database I have (over 135,000 webpages indexed so far and about 30000 or so
> that need to be removed). It has taken over 8 hours so far to remove 12,500
> entries or so. However, watching the database using phpMyAdmin, it does not
> seem to be removing the entries from ndict at all.

It does remove them. Check the status of those URLs:

indexer -S -u "%html;%"

Maybe they were indexed with a non-200 status. That would mean those URLs
have no corresponding records in the word tables.
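
If you would rather check this directly in the database, here is a rough sketch of the idea. It assumes a simplified schema, url(rec_id, url, status) and ndict(url_id, word_id); these column names are an assumption, so compare them with your real tables first. The function name status_report and the sample rows are only for illustration, and SQLite is used so the example runs on its own. The query itself is plain SQL and should translate to MySQL with little change.

```python
import sqlite3


def status_report(conn, pattern):
    """Return {status: (url_count, urls_with_words)} for URLs matching a LIKE pattern.

    Assumed schema (hypothetical, simplified):
      url(rec_id, url, status) and ndict(url_id, word_id)
    """
    rows = conn.execute(
        """
        SELECT u.status,
               COUNT(DISTINCT u.rec_id),   -- all matching URLs with this status
               COUNT(DISTINCT n.url_id)    -- those that also have word records
        FROM url AS u
        LEFT JOIN ndict AS n ON n.url_id = u.rec_id
        WHERE u.url LIKE ?
        GROUP BY u.status
        """,
        (pattern,),
    ).fetchall()
    return {status: (total, with_words) for status, total, with_words in rows}


def demo():
    """Build a tiny in-memory database and report on the "html;" URLs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE url (rec_id INTEGER PRIMARY KEY, url TEXT, status INTEGER);
        CREATE TABLE ndict (url_id INTEGER, word_id INTEGER);
        INSERT INTO url VALUES (1, 'http://example.com/a.html;x=1', 200);
        INSERT INTO url VALUES (2, 'http://example.com/b.html;x=2', 404);
        INSERT INTO url VALUES (3, 'http://example.com/c.html', 200);
        INSERT INTO ndict VALUES (1, 101);
        INSERT INTO ndict VALUES (1, 102);
        INSERT INTO ndict VALUES (3, 103);
        """
    )
    return status_report(conn, "%html;%")


if __name__ == "__main__":
    for status, (total, with_words) in sorted(demo().items()):
        print(status, total, with_words)
```

In the sample data the 404 URL matches the pattern but has no rows in ndict, so deleting it changes nothing in that table. If most of your "html;" URLs look like that, it would explain why ndict does not seem to shrink.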


> If it is not affecting ndict, perhaps I could just issue a mysql command to
> the effect of:
> "DELETE FROM url WHERE url LIKE '%html;%'";
> and it would be faster. Further creation of a script might remove entries
> from ndict that did not need to be there. Not sure on the latter though.
> A related question would be how much faster is crc/multi over the version I
> am using (which is crc alone I think, since I have only the ndict, url,
> robots, and stopwords tables in my database)? Would I see substantial
> improvements using a different style of Udmsearch database?

Yes, crc-multi is faster.

> I don't really want to have to respider the entire database but if upgrading
> my udmsearch would add substantial speed improvements then it would be worth
> it. I like Udmsearch, but it's slow on the system I am running it on (Solaris
> 7 on a Sun box clone of some sort - not my system. Not sure of the RAM).

You can create a new database and respider into it while the old one
continues to serve searches.


-- 
Alexander Barkov
IZHCOM, Izhevsk
email:    [EMAIL PROTECTED]      | http://www.izhcom.ru
Phone:    +7 (3412) 51-32-11 | Fax: +7 (3412) 51-20-80
ICQ:      7748759
______________
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]
