Re: Omit positions but not TF

2009-11-09 Thread Simon Willnauer
On Mon, Nov 9, 2009 at 6:03 PM, Michael McCandless wrote:
> How about opening an issue?  This way someone else can come along and
> pick up the torch...

+1

> Mike
>
> On Mon, Nov 9, 2009 at 11:26 AM, Andrzej Bialecki wrote:
>> Andrzej Bialecki wrote:
>>>
>>> Michael McCandless wrote:

Re: Omit positions but not TF

2009-11-09 Thread Michael McCandless
How about opening an issue? This way someone else can come along and pick up the torch...

Mike

On Mon, Nov 9, 2009 at 11:26 AM, Andrzej Bialecki wrote:
> Andrzej Bialecki wrote:
>>
>> Michael McCandless wrote:
>>>
>>> +1
>>>
>>> I guess we'd add a Fieldable.setOmitPositions?  And then save that

Re: Omit positions but not TF

2009-11-09 Thread Andrzej Bialecki
Andrzej Bialecki wrote:
> Michael McCandless wrote:
>> +1
>>
>> I guess we'd add a Fieldable.setOmitPositions? And then save that in FieldInfos, and fix the postings writing/reading to respect it? Ie, we can just change the index format.
>>
>> Encoding as negative numbers
>
> Yes, that's what I had in mind. I

Re: Omit positions but not TF

2009-11-08 Thread Andrzej Bialecki
Michael McCandless wrote:
> +1
>
> I guess we'd add a Fieldable.setOmitPositions? And then save that in FieldInfos, and fix the postings writing/reading to respect it? Ie, we can just change the index format.
>
> Encoding as negative numbers

Yes, that's what I had in mind. I was a bit shy of bumping

Re: Omit positions but not TF

2009-11-08 Thread Michael McCandless
+1

I guess we'd add a Fieldable.setOmitPositions? And then save that in FieldInfos, and fix the postings writing/reading to respect it? Ie, we can just change the index format.

Encoding as negative numbers isn't great because the termFreq is written as a vInt, which consumes 5 bytes to encode a
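
[Archive note] The size concern about vInt can be checked with a small sketch. It assumes the usual 7-bits-per-byte variable-length scheme, where the high bit of each byte marks a continuation; the class and method names (VIntSize, writeVInt, encodedLength) are hypothetical and are not Lucene's own code. Under that assumption, a small non-negative term frequency fits in one byte, while any negative 32-bit value has its top bit set and so needs all five bytes.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, for illustration only: measures how many bytes a
// 32-bit int occupies in a 7-bits-per-byte variable-length encoding.
final class VIntSize {

    // Emits the low 7 bits first; the high bit of each byte says "more follows".
    static byte[] writeVInt(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7; // unsigned shift, so negative values terminate
        }
        out.write(i);
        return out.toByteArray();
    }

    static int encodedLength(int i) {
        return writeVInt(i).length;
    }

    public static void main(String[] args) {
        // A typical small term frequency versus the same value negated.
        System.out.println("tf=5  -> " + encodedLength(5) + " byte(s)");
        System.out.println("tf=-5 -> " + encodedLength(-5) + " byte(s)");
    }
}
```

Since most term frequencies are small, flagging a field by negating the value would turn a one-byte entry into a five-byte one for nearly every posting, which is consistent with the preference above for recording the option per field instead.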

Omit positions but not TF

2009-11-07 Thread Andrzej Bialecki
Hi,

During one of the discussions at ApacheCon it occurred to me that it would be useful to have an option to discard positional information but still keep the term frequency. Even though position-dependent queries wouldn't work then, any other queries would still work fine and we would get the r