This is not coupled because: termLength() is the number of chars in the term buffer, where the offsets give the offsets in the orginal char stream. If you use a CharFilter to e.g. remove chars, the termLength will get shorter, but the offset are still the original ones. Also both things are indexed in different ways, the termLength and offsets have no relation and must (as said before) not even follow a contract like end-start=length.
----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: [email protected] > -----Original Message----- > From: Babak Farhang [mailto:[email protected]] > Sent: Friday, November 13, 2009 11:50 PM > To: [email protected] > Subject: Redundant fields Token class? > > I'm writing a TokenFilter and am confused about why class Token has > both an *endOffset* and a *termLength* field. It would appear that > the following invariant should always hold for a Token instance: > > termLength() == endOffset() - startOffset() > > If so, then > > 1) Why 2 fields, instead of 1? > 2) Why isn't the invariant enforced in the class? > > -Babak > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
