>>(a "write once" schema)

I like this idea. Requiring that every instance of a field with a given name uses the same type seems like a reasonable restriction, especially given the upsides.

It doesn't remove the need for the full schema logic in Solr, but it seems like a useful baseline for supporting basic numeric field types well in Lucene.

One note of caution: users may be tempted to store primary keys as ints or longs and so incur the overhead of trie encoding, even though there is no use case for range queries on these types of fields.

I've often thought of field types as belonging to these broad categories (rather than ints/strings/longs etc.):
1) Quantifiers - used to express numeric quantities that are often queried in ranges, e.g. price, datetime, longitude
2) Identifiers - designed to be unique, e.g. URLs, primary keys
3) Controlled vocabularies - enums, e.g. male/female or public/private
4) Uncontrolled vocabularies - e.g. free text

"Ints" can be used to represent types 1 to 3, but their practical uses differ (range queries vs. straight lookups vs. faceted groups). It seems like the likely use cases are more important than the raw data format (int vs. long etc.).
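
To make the distinction concrete, here is a rough sketch in plain Java of how the categories might drive the encoding decision. All names here (FieldRoles, Role, Access, wantsTrieEncoding) are hypothetical and are not part of Lucene or Solr; the access pattern given for free text is my assumption, since only types 1 to 3 are discussed above.

```java
public class FieldRoles {
    /** Hypothetical categories mirroring the four listed above; not a Lucene API. */
    enum Role { QUANTIFIER, IDENTIFIER, CONTROLLED_VOCABULARY, UNCONTROLLED_VOCABULARY }

    /** The typical access pattern for each category. */
    enum Access { RANGE_QUERY, EXACT_LOOKUP, FACETED_GROUPING, FREE_TEXT_SEARCH }

    static Access typicalAccess(Role role) {
        switch (role) {
            case QUANTIFIER:            return Access.RANGE_QUERY;
            case IDENTIFIER:            return Access.EXACT_LOOKUP;
            case CONTROLLED_VOCABULARY: return Access.FACETED_GROUPING;
            default:                    return Access.FREE_TEXT_SEARCH; // assumption for type 4
        }
    }

    /** Only fields that are queried in ranges justify the extra cost of trie encoding. */
    static boolean wantsTrieEncoding(Role role) {
        return typicalAccess(role) == Access.RANGE_QUERY;
    }

    public static void main(String[] args) {
        // Print the suggested treatment for each category.
        for (Role r : Role.values()) {
            System.out.println(r + " -> " + typicalAccess(r)
                    + ", trie=" + wantsTrieEncoding(r));
        }
    }
}
```

The point of the sketch is that an int primary key would be declared as an IDENTIFIER and so skip trie encoding, even though its raw format is the same as that of an int price.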

Cheers
Mark

