On Nov 19, 2007 7:02 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> Yonik Seeley wrote:
> > 1) If we are deprecating some methods like String termText(), how
> > about at the same time deprecating "String type"? If we want
> > lightweight per-token metadata for communication between filters, an
> > int or a long used as a bitvector (32 or 64 independent boolean vars
> > per token) would be much more useful than a single String.
>
> There are tokenizers that use the type string, e.g., StandardFilter &
> similar things in Nutch. How would you replace such uses? Add a bit
> for each token type? Is that really that much more useful?

It is, given that it enables a token to have more than one type at
once. The benefit is probably relatively minor (judging by the number
of people who would use it), and I wouldn't have brought it up except
that it could piggy-back on the other recent changes to Token.

-Yonik
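
For illustration, the bitvector idea could look roughly like the sketch
below. This is only a sketch under assumptions: FlaggedToken, TokenFlags,
and the constant names are hypothetical and are not part of the existing
Token API. Each token type gets one bit of an int, so a single token can
carry several types at once, which a single String type cannot express.

```java
// Hypothetical sketch: none of these names come from the actual Token class.
// One bit per token type, so up to 32 independent flags fit in an int.
final class TokenFlags {
    static final int ALPHANUM = 1 << 0;
    static final int ACRONYM  = 1 << 1;
    static final int COMPANY  = 1 << 2;

    private TokenFlags() {}
}

// Minimal stand-in for a token that carries a bitvector instead of a String type.
final class FlaggedToken {
    private final String text;
    private int flags; // starts at 0: no types set

    FlaggedToken(String text) {
        this.text = text;
    }

    String text() {
        return text;
    }

    int flags() {
        return flags;
    }

    // Turn on every bit in mask, leaving the other bits untouched.
    void set(int mask) {
        flags |= mask;
    }

    // Turn off every bit in mask, leaving the other bits untouched.
    void clear(int mask) {
        flags &= ~mask;
    }

    // True only if all bits in mask are set on this token.
    boolean has(int mask) {
        return (flags & mask) == mask;
    }
}
```

A downstream filter would then test has(TokenFlags.ACRONYM) rather than
comparing type strings, and a token flagged as both ACRONYM and COMPANY
would pass either check.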