> I too wonder how to get rid of all those URLs with a non-200 status. I 
> have a few million 404s, and 404 means not found. It also means the 
> site owner removed those pages from their HTML tree, so they will most 
> likely never be available again. Actually, I don't really care to keep 
> anything above 200 around. Why keep all this around? I think I should 
> be able to remove anything that was not returned as a 200, but I don't 
> know how to do this.
>
> Maybe Kir will be so kind as to tell us how to get rid of all the 
> non-200 status URLs without breaking something. Maybe he'll also have 
> an answer as to why index -D causes this error. Who knows?

Do you actually bother looking at the manuals or the help?

[index --help]
-C            Clear database
-s status     Limit index to documents matching status (HTTP Status code)

I think this will help you quite a bit along the way...
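
As a rough sketch of how the two options might be combined: the small
shell function below only prints one clear command per status code, it
does not run anything. It assumes that -C honours the -s limit when both
are given (which is what the help text suggests, but check your version),
that -s takes one status code at a time, and that the binary is called
"index" as above. The function name purge_commands and the list of status
codes are made up for illustration.

```shell
#!/bin/sh
# Dry run: print (do not execute) one clear command per HTTP status code.
# Assumption: "index -C -s <status>" clears only documents with that status.
purge_commands() {
    for status in "$@"; do
        printf 'index -C -s %s\n' "$status"
    done
}

# Example list of unwanted status codes; adjust to what is in your database.
purge_commands 404 403 500
```

Look over the printed lines first, and try them against a backup copy of
the database before touching the real one.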

- G
