Re: Delete corrupted doc

2007-07-26 Thread Rafael Rossini
I see, thanks. On 7/26/07, Mike Klaas <[EMAIL PROTECTED]> wrote: On 26-Jul-07, at 10:18 AM, Rafael Rossini wrote: Yes, I optimized, but with SOLR. I don't know why, but when I optimize an index with SOLR, it leaves you with about 15 files, instead of the 3... You are probably no

Re: Delete corrupted doc

2007-07-26 Thread Mike Klaas
On 26-Jul-07, at 10:18 AM, Rafael Rossini wrote: Yes, I optimized, but with SOLR. I don't know why, but when I optimize an index with SOLR, it leaves you with about 15 files, instead of the 3... You are probably not using the compound file format. Try setting: true in solrconfig

Re: Delete corrupted doc

2007-07-26 Thread Yonik Seeley
On 7/26/07, Rafael Rossini <[EMAIL PROTECTED]> wrote: Well... thanks for the help, this was really my last solution (rebuild) but I think I have no other choice... I really can't tell exactly if this corruption was caused by bad hardware or not, but do you guys have any idea about what mig

Re: Delete corrupted doc

2007-07-26 Thread Rafael Rossini
Well... thanks for the help, this was really my last solution (rebuild) but I think I have no other choice... I really can't tell exactly if this corruption was caused by bad hardware or not, but do you guys have any idea about what might have happened here? Could I have generated this corruption

Re: Delete corrupted doc

2007-07-26 Thread Yonik Seeley
On 7/26/07, Mark Miller <[EMAIL PROTECTED]> wrote: Anyway, what this says to me (and I should have realized this before) is that there is no document with your corrupt id, rather there is a term that thinks it is in that invalid doc id. The corruption must be in the term:docids inverted ind
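The diagnosis quoted above (a term whose postings point at a doc id that does not exist) can be illustrated with a toy check. This is only a sketch under stated assumptions: the `Map<String, int[]>` is a hypothetical in-memory stand-in for the term:docids index, not Lucene's on-disk format or API, and the `maxDoc` value is invented because the thread does not give the real one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PostingsRangeCheck {

    /**
     * Returns the terms whose postings contain a doc id outside [0, maxDoc).
     * The map is a toy inverted index (assumption), not a Lucene structure.
     */
    public static List<String> termsWithInvalidDocIds(Map<String, int[]> postings, int maxDoc) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int docId : entry.getValue()) {
                if (docId < 0 || docId >= maxDoc) {
                    bad.add(entry.getKey());
                    break; // one out-of-range id is enough to flag the term
                }
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new LinkedHashMap<>();
        postings.put("lucene", new int[] {0, 2, 5});
        // 106577 is the index reported in the exception from this thread.
        postings.put("solr", new int[] {1, 106577});

        int maxDoc = 6; // hypothetical value for illustration only
        System.out.println(termsWithInvalidDocIds(postings, maxDoc)); // prints [solr]
    }
}
```

Any per-document array sized to `maxDoc` and indexed with such an id would throw an ArrayIndexOutOfBoundsException, which is consistent with the exception reported at the start of the thread.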

Re: Delete corrupted doc

2007-07-26 Thread Mark Miller
From what I can tell, you shouldn't need to even try my first suggestion (what happened to the experts on this question by the way?). Returning true from isDeleted for the corrupt id should not matter. It appears to me that deletes are handled by keeping a simple list of the ids that are delet

Re: Delete corrupted doc

2007-07-26 Thread Rafael Rossini
Yes, I optimized, but with SOLR. I don't know why, but when I optimize an index with SOLR, it leaves you with about 15 files, instead of the 3... I'll try to optimize directly on lucene, and see what happens, if nothing happens I'll try your suggestion. Thanks a lot Mark!! On 7/26/07, Mark M

Re: Delete corrupted doc

2007-07-26 Thread Mark Miller
You know, on second thought, a merge shouldn't even try to access a doc > maxdoc (I think). Have you just tried an optimize? On 7/25/07, Rafael Rossini <[EMAIL PROTECTED]> wrote: Hi guys, Is there a way of deleting a document that, because of some corruption, got a docID larger than the m

Re: Delete corrupted doc

2007-07-26 Thread Mark Miller
This may not be very elegant, but if you are really in a jam, here is what I would try: Check out a copy of Lucene. Modify the isDeleted method on both MultiReader and SegmentReader so that it returns true if the docid passed in is the id in question (if it is not the id, then just have the metho
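The workaround described above can be sketched in plain Java. This is a minimal illustration, not the actual Lucene patch: `DeletionCheck`, `BitSetDeletions`, and `MaskCorruptId` are hypothetical names invented here, and the sketch wraps a deletion check rather than editing MultiReader and SegmentReader as the message suggests.

```java
import java.util.BitSet;

public class IsDeletedPatchSketch {

    /** Hypothetical stand-in for a reader's deletion check; not a Lucene type. */
    public interface DeletionCheck {
        boolean isDeleted(int docId);
    }

    /** Stand-in for the unmodified behaviour: a bit set of deleted doc ids. */
    public static final class BitSetDeletions implements DeletionCheck {
        private final BitSet deleted = new BitSet();

        public void delete(int docId) {
            deleted.set(docId);
        }

        @Override
        public boolean isDeleted(int docId) {
            return deleted.get(docId);
        }
    }

    /** Reports the corrupt id as deleted and defers to the original check otherwise. */
    public static final class MaskCorruptId implements DeletionCheck {
        private final int corruptId;
        private final DeletionCheck original;

        public MaskCorruptId(int corruptId, DeletionCheck original) {
            this.corruptId = corruptId;
            this.original = original;
        }

        @Override
        public boolean isDeleted(int docId) {
            if (docId == corruptId) {
                return true; // pretend the corrupt document is already deleted
            }
            return original.isDeleted(docId); // normal behaviour for every other id
        }
    }

    public static void main(String[] args) {
        BitSetDeletions base = new BitSetDeletions();
        base.delete(3);

        // 106577 is the index reported in the exception from this thread.
        DeletionCheck patched = new MaskCorruptId(106577, base);

        System.out.println(patched.isDeleted(106577)); // true
        System.out.println(patched.isDeleted(3));      // true
        System.out.println(patched.isDeleted(4));      // false
    }
}
```

Whether masking the id this way is enough to let a merge or optimize skip the bad entry is exactly what the later replies in this thread question, so treat it as an experiment on a copy of the index.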

Delete corrupted doc

2007-07-25 Thread Rafael Rossini
Hi guys, Is there a way of deleting a document that, because of some corruption, got a docID larger than the maxDoc()? I'm trying to do this but I get this Exception: Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 106577 at org.apache.lucen