Got it.
This was the problem, in TermInfosWriter.writeTerm():
-lastTerm = term;
+lastBytes = bytes;
}
Without lastTerm being updated, the auxiliary term dictionary got
screwed up. This problem only manifested on large tests because small
tests never moved past the first entry, which
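
A minimal model of the failure described above, offered as a sketch and not as Lucene's actual TermInfosWriter code. It assumes the writer remembers the last term it wrote and, every few terms, copies that remembered term into an auxiliary dictionary. The class name, the fields, and the interval of 2 are all hypothetical and chosen only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a term dictionary with a sampled auxiliary
// dictionary. This is NOT Lucene's TermInfosWriter.
class SampledDictionarySketch {
    private final int interval;        // assumed sampling interval
    private final boolean updateLast;  // false reproduces the missing update
    private String last = "";          // last term written; starts out blank
    private int size = 0;              // number of terms written so far

    final List<String> mainEntries = new ArrayList<>();
    final List<String> auxEntries = new ArrayList<>();

    SampledDictionarySketch(int interval, boolean updateLast) {
        this.interval = interval;
        this.updateLast = updateLast;
    }

    void add(String term) {
        // Every interval-th call, sample the remembered term into the
        // auxiliary dictionary before writing the new one.
        if (size % interval == 0) {
            auxEntries.add(last);
        }
        mainEntries.add(term);
        if (updateLast) {
            last = term;  // the step whose omission breaks the samples
        }
        size++;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "b", "c", "d", "e"};
        SampledDictionarySketch good = new SampledDictionarySketch(2, true);
        SampledDictionarySketch bad = new SampledDictionarySketch(2, false);
        for (String t : terms) {
            good.add(t);
            bad.add(t);
        }
        System.out.println("with update:    " + good.auxEntries);
        System.out.println("without update: " + bad.auxEntries);
    }
}
```

In this model the first sampled entry is blank either way, so an input shorter than the interval looks the same with or without the update. Only longer inputs reach later samples, and without the update every one of them stays blank.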
No progress yet.
I think my next move is to do what I did when trying to get KinoSearch
to write Lucene-compatible indexes:
1) Generate an optimized split-file format Lucene index from a
pathological test corpus.
2) Hack KinoSearch so that it ought to produce an index which is
identical
On Sat, May 06, 2006 at 05:11:02PM +0900, David Balmain wrote:
> Hi Marvin,
>
> Where are you with this? I also have a vested interest in seeing
> Lucene move to using byte counts. I was wondering if I could help out.
> Is the patch you pasted here the latest you have?
All I've added since then i
Hi Marvin,
Where are you with this? I also have a vested interest in seeing
Lucene move to using byte counts. I was wondering if I could help out.
Is the patch you pasted here the latest you have?
Cheers,
Dave
On 4/12/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
Greets,
I'm back working on
Marvin Humphrey wrote:
A phantom blank Term shows up out of nowhere in the middle of the merge
process.
When you stick a System.err.println into TermInfosWriter's writeTerm...
Did you try putting a print statement in SegmentMergeInfo.next(), to see
where this blank term comes from?
Doug
: To: java-dev@lucene.apache.org
: Subject: Re: bytecount as prefix
:
:
: On Apr 11, 2006, at 12:05 PM, Marvin Humphrey wrote:
:
: > TestRangeFilter.
:
: A phantom blank Term shows up out of nowhere in the middle of the
: merge process.
:
: When you stick a System.err.println into TermInfosW
On Apr 11, 2006, at 12:05 PM, Marvin Humphrey wrote:
TestRangeFilter.
A phantom blank Term shows up out of nowhere in the middle of the
merge process.
When you stick a System.err.println into TermInfosWriter's writeTerm,
you ordinarily see it adding Terms in proper sort order:
[j
On Apr 11, 2006, at 2:27 PM, Marvin Humphrey wrote:
"all but last", "all but first" and "all but ends" pass!
Scratch that, it's totally untrue. I'd forgotten that these compound
test cases bail as soon as there's a single failure. "all but last"
also fails to return any docs at all.
M
On Apr 11, 2006, at 2:08 PM, Yonik Seeley wrote:
On 4/11/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
What do the failing tests have in common?
On TestIndexModifier, only a small portion of the deletions fail, and
they're all for fairly high values of delId -- sometimes the highest,
but not
On 4/11/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
> What do the failing tests have in common?
>
> On TestIndexModifier, only a small portion of the deletions fail, and
> they're all for fairly high values of delId -- sometimes the highest,
> but not always. For RangeFilter and ConstantScoreRa
On Apr 11, 2006, at 12:18 PM, Doug Cutting wrote:
Marvin Humphrey wrote:
I'm back working on converting Lucene to using a byte count
instead of a char count as a prefix at the head of each
String. Three tests are failing: TestIndexModifier,
TestConstantScoreRangeQuery, and TestRang
Marvin Humphrey wrote:
I'm back working on converting Lucene to using a byte count instead of
a char count as a prefix at the head of each String. Three tests
are failing: TestIndexModifier, TestConstantScoreRangeQuery, and
TestRangeFilter.
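
For readers unfamiliar with the distinction, here is a small sketch of how a char-count prefix and a byte-count prefix differ for the same String. It assumes standard UTF-8 via the JDK; the encoding Lucene actually writes to disk may differ in detail. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: compares the two candidate length prefixes.
class PrefixCountSketch {
    // Prefix value if the String is preceded by its number of Java chars.
    static int charCount(String s) {
        return s.length();
    }

    // Prefix value if the String is preceded by its encoded length in
    // bytes (standard UTF-8 assumed here).
    static int byteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII text, one accented letter, and three CJK characters.
        String[] samples = {"range", "caf\u00e9", "\u65e5\u672c\u8a9e"};
        for (String s : samples) {
            System.out.println(s + ": chars=" + charCount(s)
                    + " bytes=" + byteCount(s));
        }
    }
}
```

The two counts agree for pure ASCII and diverge as soon as a character needs more than one byte, so any code that still treats the prefix as a char count will misread the length of non-ASCII terms.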
Why those and not others?
- private static f