> I too wonder how to get rid of all those non status 200 urls. I have a
> few million 404's and you know 404 means not found. It also means the
> web site owner removed them from their html tree so they will most
> likely never be available. Actually anything above 200 I don't really
> care to have around. Why keep all this around? I think anything that's
> not returned as a 200 should be removed if I want to remove them, but
> I don't know how to do this.
>
> Maybe Kir will be so kind to tell us how to get rid of all the non 200
> status urls without us breaking something. Maybe he'll have an answer
> why index -D causes this error too. Who knows?

Do you actually bother looking at the manuals or the help?

[index --help]
  -C           Clear database
  -s status    Limit index to documents matching status (HTTP Status code)

I think this will help you quite a bit along the way...

- G
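
For what it's worth, here is a rough sketch of how the two options quoted from the help might be put together. It assumes that `-s` narrows what `-C` clears, which the help text suggests but does not spell out, so check `index --help` on your own install first. The `clear_by_status` helper and the `RUN` variable are made up for this example. By default it only prints the commands so nothing gets deleted by accident.

```shell
#!/bin/sh
# Dry run by default: prints each command instead of running it.
# Set RUN=1 to really run them. That deletes data, so back up first.
# ASSUMPTION: -C combined with -s clears only the documents whose
# HTTP status matches; verify this against your own `index --help`.

clear_by_status() {
    for status in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            index -C -s "$status"
        else
            echo "index -C -s $status"
        fi
    done
}

# Start with the 404s; add other status codes as arguments if wanted.
clear_by_status 404
```

Once the printed commands look right, running the script again with `RUN=1` would execute them for real.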
