[ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722324#action_12722324 ]
Michael McCandless commented on LUCENE-1701:
--------------------------------------------

bq. In my opinion: Storing this info in the segments is not doable without pitfalls:

The proposal was not to store it in the segments file (which, I agree, has serious problems, since it's global). I had considered FieldInfos (which is "roughly" Lucene's "schema", per segment), but that too has clear problems.

My proposal was to store the flags per field in the fdt file. In that file, we already write one byte's worth of flags (only 3 of the bits are used now) for every stored field instance. This is in FieldsWriter.java ~ line 181. The flags now record whether each specific field instance was tokenized, compressed, or binary. FieldsReader then uses these flags to reconstruct the Field instances when building the document. These bits are never merged; they are copied (because they apply to that one field instance, in that one document).

My proposal was to add another flag bit (numeric) and use it to return a NumericField instance when you get your document back. It would have no impact on the index size, since we still have 5 free bits to use. But it is technically a (one-bit) change to the index format, which people seriously objected to.

So net/net I'm OK going forward without it.

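The flag-byte scheme described above can be sketched roughly as follows. This is only an illustration, not the actual FieldsWriter/FieldsReader code: the class name StoredFieldFlags, the helper methods, and the specific bit values are assumptions made for the example, and FIELD_IS_NUMERIC stands in for the hypothetical fourth flag from the proposal.

```java
// Sketch of a per-field-instance flag byte; names and bit values are assumed.
final class StoredFieldFlags {
    // The three flags described as already in use (bit positions assumed).
    static final int FIELD_IS_TOKENIZED  = 0x1;
    static final int FIELD_IS_BINARY     = 0x2;
    static final int FIELD_IS_COMPRESSED = 0x4;
    // Hypothetical fourth flag for the proposed numeric marker.
    static final int FIELD_IS_NUMERIC    = 0x8;

    private StoredFieldFlags() {}

    // Pack the per-instance properties into a single byte.
    static byte encode(boolean tokenized, boolean binary,
                       boolean compressed, boolean numeric) {
        int bits = 0;
        if (tokenized)  bits |= FIELD_IS_TOKENIZED;
        if (binary)     bits |= FIELD_IS_BINARY;
        if (compressed) bits |= FIELD_IS_COMPRESSED;
        if (numeric)    bits |= FIELD_IS_NUMERIC;
        return (byte) bits;
    }

    static boolean isTokenized(byte bits)  { return (bits & FIELD_IS_TOKENIZED) != 0; }
    static boolean isBinary(byte bits)     { return (bits & FIELD_IS_BINARY) != 0; }
    static boolean isCompressed(byte bits) { return (bits & FIELD_IS_COMPRESSED) != 0; }

    // A reader could branch on this bit to rebuild a numeric field
    // instead of a plain one.
    static boolean isNumeric(byte bits)    { return (bits & FIELD_IS_NUMERIC) != 0; }

    public static void main(String[] args) {
        byte flags = encode(true, false, false, true);
        System.out.println("flags    = " + Integer.toBinaryString(flags));
        System.out.println("numeric? = " + isNumeric(flags));
    }
}
```

Because the extra marker occupies a bit of a byte that is already written for every stored field instance, the sketch adds no bytes per field, which matches the point that index size would be unaffected even though the format technically changes.
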
> Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1701
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1701
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index, Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1701-test-tag-special.patch, LUCENE-1701.patch, LUCENE-1701.patch, LUCENE-1701.patch, NumericField.java
>
>
> In discussions about LUCENE-1673, Mike and I wanted to add a new NumericField
> to o.a.l.document specifically for easy indexing. An alternative would be to add
> a NumericUtils.newXxxField() factory that creates a preconfigured Field
> instance with norms and tf off, optionally a stored text (LUCENE-1699), and
> the TokenStream already initialized. On the other hand,
> NumericUtils.newXxxSortField could be moved to NumericSortField.
> Yonik and I tend to use the factory for both; Mike tends to create the new
> classes.
> Also, the parsers for string-formatted numerics are not public in FieldCache.
> As the new SortField API (LUCENE-1478) makes it possible to supply a parser
> at SortField instantiation, it would be good to have the static parsers in
> FieldCache publicly available. SortField would init its member variable to them
> (instead of NULL), making the code a lot easier (FieldComparator has these ugly
> null checks when retrieving values from the cache).
> Moving the Trie parsers into FieldCache as static instances would also make
> the code cleaner, and we would be able to hide the "hack"
> StopFillCacheException by making it private to FieldCache (currently it's
> public because NumericUtils is in o.a.l.util).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.