1) not only does ConstantScoreRangeQuery use a RangeFilter, but TestConstantScoreRangeQuery and TestRangeFilter share a base class that creates the index.
2) perhaps the issue is that corruption is happening when segments are merged -- and most tests don't surface the problem because they tend to operate on small, simple indexes of one segment?

One thing i remember about the base class for those RangeFilter tests is that it makes an index with several thousand docs -- enough that the default indexer options are probably making/merging more than a few segments. i don't know anything about TestIndexModifier, but if i remember correctly IndexModifier manages a reader and a writer and opens/closes them as needed to do whatever operation you want -- so i'm guessing its test would open/close a writer several times while adding docs, which may make multiple segments .. and if it does an optimize, that would definitely merge those segments.

i would start by tweaking the RangeFilter test base class to index a much smaller number of documents .. if that fixes the problem, then try changing the merge factor and min merge docs to be really low, and if that causes the problem again, you'll be on to something.

you could probably make a simple test case where you add some docs to an IndexWriter (with a couple of fields that have multibyte characters), reopen the writer, add some more docs (ditto), then open a TermEnum and record every term in the index, then optimize the index, and then open a new TermEnum and assert that every term matches ... I'm guessing that would fail for you at the moment (but work against the trunk).

: Date: Tue, 11 Apr 2006 16:49:18 -0700
: From: Marvin Humphrey <[EMAIL PROTECTED]>
: Reply-To: java-dev@lucene.apache.org
: To: java-dev@lucene.apache.org
: Subject: Re: bytecount as prefix
:
:
: On Apr 11, 2006, at 12:05 PM, Marvin Humphrey wrote:
:
: > TestRangeFilter.
:
: A phantom blank Term shows up out of nowhere in the middle of the
: merge process.
:
: When you stick a System.err.println into TermInfosWriter's writeTerm,
: you ordinarily see it adding Terms in proper sort order:
:
:     [junit] TINFO: :
:     [junit] TINFO: body:body
:     [junit] TINFO: id:000000000000
:     [junit] TINFO: rand:-00953139433
:     [junit] TINFO: :
:     [junit] TINFO: body:body
:     [junit] TINFO: id:000000000001
:     [junit] TINFO: rand:000015869780
:
: Here's several docs being merged together:
:
:     [junit] TINFO: :
:     [junit] TINFO: body:body
:     [junit] TINFO: id:000000000009
:     [junit] TINFO: rand:-00563669564
:     [junit] TINFO: :
:     [junit] TINFO: body:body
:     [junit] TINFO: id:000000000000
:     [junit] TINFO: id:000000000001
:     [junit] TINFO: id:000000000002
:     [junit] TINFO: id:000000000003
:     [junit] TINFO: id:000000000004
:     [junit] TINFO: id:000000000005
:     [junit] TINFO: id:000000000006
:     [junit] TINFO: id:000000000007
:     [junit] TINFO: id:000000000008
:     [junit] TINFO: id:000000000009
:     [junit] TINFO: rand:-00072576061
:     [junit] TINFO: rand:-00260794310
:     [junit] TINFO: rand:-00563669564
:     [junit] TINFO: rand:-00953139433
:     [junit] TINFO: rand:-01094000683
:     [junit] TINFO: rand:-01481464619
:     [junit] TINFO: rand:-02099458317
:     [junit] TINFO: rand:000015869780
:     [junit] TINFO: rand:001019870061
:     [junit] TINFO: rand:001565603387
:     [junit] TINFO: :
:     [junit] TINFO: body:body
:     [junit] TINFO: id:000000000010
:     [junit] TINFO: rand:001271292228
:
: At some point, late in the merge process, this happens:
:
:     [junit] TermInfosWriter: rand:-00449774276
:     [junit] TermInfosWriter: rand:-00467363681
:     [junit] TermInfosWriter: rand:-00479945420
:     [junit] TermInfosWriter: rand:-00506239929
:     [junit] TermInfosWriter: :                    // Huh????
:     [junit] TermInfosWriter: rand:-00512006124
:     [junit] TermInfosWriter: rand:-00526876979    // <- look at this number
:     [junit] TermInfosWriter: rand:-00531589361
:     [junit] TermInfosWriter: rand:-00563669564
:     [junit] TermInfosWriter: rand:-00638261924
:
: Here's the first few terms coming off of a Term Enum, later.
: As you can see, the sort order is messed up. That's because the .tis stream
: has gotten out of sync somehow.
:
:     [junit] TERMS:
:     [junit] rand:26876979    // <- the last few digits of that number from earlier
:     [junit] rand:31589361
:     [junit] rand:63669564
:     [junit] rand:638261924
:     [junit] rand:733778983
:     [junit] rand:770310547
:     [junit] rand:806409190
:     [junit] rand:849606785
:     [junit] rand:869935672
:     [junit] rand:927974448
:     [junit] rand:953139433
:     [junit] rand:954514004
:     [junit] rand:961290394
:     [junit] rand:1067018129
:     [junit] rand:1081398108
:     [junit] rand:1094000683
:     [junit] rand:1139978555
:     [junit] rand:1231799109
:
: I'm stumped for now.
:
: Marvin Humphrey
: Rectangular Research
: http://www.rectangular.com/
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
:

-Hoss
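
A note on the quoted TERMS listing, offered as an illustration and not a diagnosis: the term dictionary stores each term as a shared-prefix length plus a suffix relative to the previous term, and the garbled terms in that listing happen to equal exactly the suffixes the writer would have emitted for the terms logged after the blank one. The sketch below is a simplified, Lucene-free model of that delta coding. The class PrefixCodecDemo, its Entry type, and the loseSync flag are all hypothetical names invented for the example; loseSync simply makes the reader forget the previous term, which is one way (assumed here, not confirmed by the thread) that a reader could end up with bare suffixes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of prefix-compressed term storage.
// This is NOT Lucene code; it only mimics the "prefix length + suffix" idea.
class PrefixCodecDemo {

    /** One delta-coded entry: how many leading chars to reuse, plus the new tail. */
    static final class Entry {
        final int prefixLength;
        final String suffix;

        Entry(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    /** Encode each term relative to the term written just before it. */
    static List<Entry> encode(List<String> terms) {
        List<Entry> out = new ArrayList<>();
        String previous = "";
        for (String term : terms) {
            int shared = 0;
            int limit = Math.min(previous.length(), term.length());
            while (shared < limit && previous.charAt(shared) == term.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, term.substring(shared)));
            previous = term;
        }
        return out;
    }

    /**
     * Decode the entries. With loseSync == true the reader behaves as if it
     * had no previous term (an assumption made only for this illustration),
     * so every term collapses to its stored suffix.
     */
    static List<String> decode(List<Entry> entries, boolean loseSync) {
        List<String> out = new ArrayList<>();
        String previous = "";
        for (Entry e : entries) {
            String base = loseSync ? "" : previous;
            int keep = Math.min(e.prefixLength, base.length());
            String term = base.substring(0, keep) + e.suffix;
            out.add(term);
            previous = term;
        }
        return out;
    }

    public static void main(String[] args) {
        // Term texts of the "rand" field as logged after the phantom blank term.
        List<String> written = Arrays.asList(
                "-00512006124", "-00526876979", "-00531589361",
                "-00563669564", "-00638261924");
        List<Entry> entries = encode(written);

        // Prints [-00512006124, -00526876979, -00531589361, -00563669564, -00638261924]
        System.out.println(decode(entries, false));
        // Prints [-00512006124, 26876979, 31589361, 63669564, 638261924]
        System.out.println(decode(entries, true));
    }
}
```

The second list lines up with the first four entries of the TERMS listing (rand:26876979, rand:31589361, rand:63669564, rand:638261924), which is at least consistent with the reader dropping the shared prefix. Comparing the recorded terms before and after a merge, as suggested in the reply, is the same equality check the first call performs here.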