Re: Lucene 2.9

Michael McCandless Mon, 09 Mar 2009 14:00:29 -0700


markharw00d wrote:

>>(a "write once" schema)
I like this idea. Enforcing consistent field-typing on instances of fields with the same name does not seem like an unreasonable restriction - especially given the upsides to this.

And also when it's "opt-in", ie, you can continue to use untyped/unrestricted fields.

One note of caution is that users may be tempted to store primary keys as ints or longs and incur the overhead of trie encoding when there is no use case for range queries on these types of fields.
I've often thought of field types as belonging to these broad categories (rather than ints/strings/longs etc):
1) Quantifiers - used to express numeric quantities that are often queried in ranges e.g. price, datetime, longitude
2) Identifiers - designed to be unique e.g. urls, primary keys
3) Controlled vocabularies - enums e.g. male/female or public/private
4) Uncontrolled vocabularies - e.g. free text
"Ints" can be used to represent types 1 to 3 but the practical uses of them differ (range queries vs straight look-ups vs faceted groups). It seems like the likely use cases are more important than the raw data format (int vs long etc)

I like working by usage instead of underlying type, and I like this breakdown. It would allow us to do a better job defaulting the settings for these field types.
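
To make the idea concrete, here is a rough sketch of what defaulting by usage category might look like. Everything in it is hypothetical: FieldUsage, Settings, and the three flags are names invented for illustration, not existing or proposed Lucene APIs, and the particular defaults chosen are just one plausible reading of the four categories above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch only: none of these names are Lucene APIs.
public class FieldUsageDefaults {

    /** The four usage categories from the breakdown above. */
    enum FieldUsage {
        QUANTIFIER,              // price, datetime, longitude
        IDENTIFIER,              // urls, primary keys
        CONTROLLED_VOCABULARY,   // enums such as public/private
        UNCONTROLLED_VOCABULARY  // free text
    }

    /** Illustrative per-field settings; the flag names are assumptions. */
    static final class Settings {
        final boolean trieEncoded; // pay the trie overhead to speed up range queries
        final boolean tokenized;   // run the value through an analyzer
        final boolean faceted;     // expected to be grouped and counted

        Settings(boolean trieEncoded, boolean tokenized, boolean faceted) {
            this.trieEncoded = trieEncoded;
            this.tokenized = tokenized;
            this.faceted = faceted;
        }
    }

    private static final Map<FieldUsage, Settings> DEFAULTS =
            new EnumMap<>(FieldUsage.class);

    static {
        // Only quantifiers get trie encoding, so an int primary key
        // declared as an IDENTIFIER does not incur that overhead.
        DEFAULTS.put(FieldUsage.QUANTIFIER, new Settings(true, false, false));
        DEFAULTS.put(FieldUsage.IDENTIFIER, new Settings(false, false, false));
        DEFAULTS.put(FieldUsage.CONTROLLED_VOCABULARY, new Settings(false, false, true));
        DEFAULTS.put(FieldUsage.UNCONTROLLED_VOCABULARY, new Settings(false, true, false));
    }

    /** Looks up the default settings for a usage category. */
    static Settings defaultsFor(FieldUsage usage) {
        return DEFAULTS.get(usage);
    }

    public static void main(String[] args) {
        for (FieldUsage usage : FieldUsage.values()) {
            Settings s = defaultsFor(usage);
            System.out.println(usage + ": trieEncoded=" + s.trieEncoded
                    + ", tokenized=" + s.tokenized + ", faceted=" + s.faceted);
        }
    }
}
```

The point of keying on usage rather than on int/long/string is that the same raw type can land in different rows of the table: an int price and an int primary key would get different defaults.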


Mike
