Re: reset versus setReader on TokenStream

2012-08-29 Thread Robert Muir
On Wed, Aug 29, 2012 at 4:18 PM, Benson Margulies wrote:
> If I'm following, you've created a division of labor between setReader and
> reset.

That's not true. setReader shouldn't be doing any labor. It's really only a setter! One possibility here is to make it final (though it's not obvious to me t

RE: reset versus setReader on TokenStream

2012-08-29 Thread Uwe Schindler
Hi,

> To me, it seemed natural to overload #setReader so that our tokenizer was in a
> consistent state once it was called. It occurs to me to wonder about
> order: if #reset is called before #setReader, I'm up creek unless I copy my
> reset implementation into a local override of #setReader.

Re: reset versus setReader on TokenStream

2012-08-29 Thread Benson Margulies
If I'm following, you've created a division of labor between setReader and reset. We have a tokenizer that has a good deal of state, since it has to split the input into chunks. If I'm following here, you'd recommend that we do nothing special in setReader, but have #reset fix up all the state on
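The division of labor described in this message can be sketched roughly as follows. This is a minimal, self-contained illustration and not Lucene code: `SketchTokenizer` and `SketchConsumer` are hypothetical names, the whitespace splitting is only a placeholder for real tokenization, and the sketch assumes the consumer calls setReader first and reset afterwards, before pulling any tokens.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stateful tokenizer; NOT Lucene's Tokenizer class.
class SketchTokenizer {
    private Reader input;       // replaced by setReader, nothing else happens there
    private int tokensEmitted;  // per-stream state, cleared in reset()
    private boolean exhausted;  // per-stream state, cleared in reset()

    // Only a setter: no state is fixed up here.
    final void setReader(Reader reader) {
        this.input = reader;
    }

    // All per-stream state is re-initialized here, before consumption starts.
    void reset() {
        tokensEmitted = 0;
        exhausted = false;
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() throws IOException {
        if (exhausted) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (sb.length() > 0) {
                    break; // token finished
                }
            } else {
                sb.append((char) c);
            }
        }
        if (c == -1) {
            exhausted = true;
        }
        if (sb.length() == 0) {
            return null;
        }
        tokensEmitted++;
        return sb.toString();
    }

    int tokensEmitted() {
        return tokensEmitted;
    }
}

// Hypothetical consumer showing the assumed call order: setReader, reset, then consume.
class SketchConsumer {
    static List<String> consume(SketchTokenizer tokenizer, String text) {
        tokenizer.setReader(new StringReader(text));
        tokenizer.reset();
        List<String> tokens = new ArrayList<>();
        try {
            for (String tok = tokenizer.next(); tok != null; tok = tokenizer.next()) {
                tokens.add(tok);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }
}
```

Reusing one `SketchTokenizer` instance for a second input works only because reset() clears the counters left over from the first pass. If that work lived in setReader instead, a caller that invoked reset() afterwards would have no way to know the state had already been prepared.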

RE: reset versus setReader on TokenStream

2012-08-29 Thread Uwe Schindler
.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Wednesday, August 29, 2012 10:08 PM
> To: java-user@lucene.apache.org
> Subject: Re: reset versus setReader on TokenStream &

Re: reset versus setReader on TokenStream

2012-08-29 Thread Robert Muir
On Wed, Aug 29, 2012 at 3:58 PM, Benson Margulies wrote:
> I think I'm beginning to get the idea. Is the following plausible?
>
> At the bottom of the stack, there's an actual source of data -- like a
> tokenizer. For one of those, reset() is a bit silly, and something like
> setReader is the brai

Re: reset versus setReader on TokenStream

2012-08-29 Thread Robert Muir
On Wed, Aug 29, 2012 at 3:54 PM, Benson Margulies wrote:
> Some interlinear commentary on the doc.
>
> * Resets this stream to the beginning.
>
> To me this implies a rewind. As previously noted, I don't see how this
> works for the existing implementations.

It's not a rewind. The javadocs here

Re: reset versus setReader on TokenStream

2012-08-29 Thread Benson Margulies
I think I'm beginning to get the idea. Is the following plausible?

At the bottom of the stack, there's an actual source of data -- like a tokenizer. For one of those, reset() is a bit silly, and something like setReader is the brains of the operation. Some number of other components may be stacke

Re: reset versus setReader on TokenStream

2012-08-29 Thread Benson Margulies
Some interlinear commentary on the doc.

 * Resets this stream to the beginning.

To me this implies a rewind. As previously noted, I don't see how this works for the existing implementations.

 * As all TokenStreams must be reusable,
 * any implementations which have state that needs to be re

Re: reset versus setReader on TokenStream

2012-08-29 Thread Robert Muir
On Wed, Aug 29, 2012 at 3:45 PM, Benson Margulies wrote:
> On Wed, Aug 29, 2012 at 3:37 PM, Robert Muir wrote:
>
>> OK, let's help improve it: I think these have likely always been confusing.
>>
>> Before, they were both reset: reset() and reset(Reader), even though
>> they are unrelated. I thought

Re: reset versus setReader on TokenStream

2012-08-29 Thread Benson Margulies
On Wed, Aug 29, 2012 at 3:37 PM, Robert Muir wrote:
> OK, let's help improve it: I think these have likely always been confusing.
>
> Before, they were both reset: reset() and reset(Reader), even though
> they are unrelated. I thought the rename would help this :)
>
> Does the TokenStream workflow

Re: reset versus setReader on TokenStream

2012-08-29 Thread Robert Muir
OK, let's help improve it: I think these have likely always been confusing.

Before, they were both reset: reset() and reset(Reader), even though they are unrelated. I thought the rename would help this :)

Does the TokenStream workflow here help? http://lucene.apache.org/core/4_0_0-BETA/core/org/ap

reset versus setReader on TokenStream

2012-08-29 Thread Benson Margulies
I've read the javadoc through a few times, but I confess that I'm still feeling dense. Are all tokenizers responsible for implementing some way of retaining the contents of their reader, so that a call to reset without a call to setReader rewinds? I note that CharTokenizer doesn't implement #reset