After more searching and some manual consolidation, I've summarized below the ideas that have already been suggested for this. The last item in the summary looks like an interesting, low-technical-cost way of doing it.
Basically, it treats the index like a 'BigTable', a la "NoSQL". Erick Erickson pointed out: "...but there's absolutely no requirement that all documents in SOLR have the same fields..."

I guess I don't have the right understanding of what goes into a Document in Solr. Is it just a set of fields, each with its own independent field type declaration/id, its name, and its content? So even though there's a schema for an index, one could ignore it and just throw any other named fields, types, and content at it at document addition time? So if I wanted to search on a base set of fields that all documents have, could I then additionally filter on the (this might be the wrong use of the term) dynamic fields?

Original thread that I started:
----------------------------------------
http://lucene.472066.n3.nabble.com/A-schema-inside-a-Solr-Schema-Schema-in-a-can-tt2103260.html

-----------------------------------------------------------------------------------------------------
Repeat of the problem (not actual ratios or numbers, i.e. it could be WORSE!):
-----------------------------------------------------------------------------------------------------
1/ Base object of some kind, x number of fields
2/ Derived objects representing divisions in the company, different customer bases, etc., each having 2 additional, unique fields.
3/ Assume 1000 such derived object types
4/ A 'flattened' index would have the x base object fields, ****and 2000**** additional fields

================================================
Solutions Posited
-----------------------
A/ First thought: multi-valued columns as key pairs.
   1/ Difficult to access individual items of more than one 'word' length for querying in multivalued fields.
   2/ All sorts of statistical stuff probably wouldn't apply?
   3/ (James Dayer said:) There's also one "gotcha" we've experienced when searching across multi-valued fields: SOLR will match across field occurrences.
      In the example below, if you were to search q=contrib_name:(james AND smith), you will get this record back. It matches one name from one contributor and another name from a different contributor. This is not what our users want. As a work-around, I am converting these to phrase queries with slop: "james smith"~50 ... Just use a slop # smaller than your positionIncrementGap and bigger than the # of terms entered. This will prevent the cross-field matches yet allow the words to occur in any order. The problem with this approach is that Lucene doesn't support wildcards in phrases.
B/ Dynamic fields were suggested, but I am not sure exactly how they work, and the person who suggested them was not sure they would work, either.
C/ Different field naming conventions were suggested if field types were similar. I can't predict that.
D/ Found this old thread, and it had other suggestions:
   1/ Use multiple cores, one for each record type/schema, and aggregate them during the query.
   2/ Use a fixed number of additional fields X 2. Each additional field is actually a pair of fields. The first of the pair gives the column name, the second gives the data.
      a) Although I like this, I wonder how many extra fields to use.
      b) It was pointed out that relevancy and other statistical criteria for queries might suffer.
   3/ Index the different objects exactly as they are, i.e. as Erick Erickson said: "I'm not entirely sure this is germane, but there's absolutely no requirement that all documents in SOLR have the same fields. So it's possible for you to index the "wildly different content" in "wildly different fields" <G>. Then searching for screen:LCD would be straightforward."...

Dennis Gearon

Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
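
P.S. To make the slop work-around quoted under A/3 concrete for myself, here is a rough Python sketch. It only builds the query string and checks the rule of thumb from that quote (slop bigger than the number of terms entered, smaller than positionIncrementGap). The function name slop_phrase_query and the gap value of 100 are my own assumptions for illustration; the real gap is whatever the field type in your schema declares, and none of this talks to Solr itself.

```python
def slop_phrase_query(field, terms, slop, position_increment_gap):
    """Build a query string of the form field:"t1 t2 ..."~slop.

    Hypothetical helper, not part of Solr or any client library.
    It enforces the rule of thumb quoted in A/3:
        number of terms < slop < positionIncrementGap
    """
    # Drop empty terms and surrounding whitespace.
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        raise ValueError("need at least one term")

    # This sketch does no query escaping, so refuse embedded quotes.
    if any('"' in t for t in cleaned):
        raise ValueError("terms must not contain double quotes")

    # Slop must stay below the gap (to avoid matching across values of
    # a multi-valued field) and above the term count (so the words can
    # occur in any order).
    if not (len(cleaned) < slop < position_increment_gap):
        raise ValueError(
            "slop must be bigger than the number of terms and "
            "smaller than positionIncrementGap"
        )

    phrase = " ".join(cleaned)
    return f'{field}:"{phrase}"~{slop}'


if __name__ == "__main__":
    # Assumed gap of 100; check the field type in your own schema.
    print(slop_phrase_query("contrib_name", ["james", "smith"],
                            slop=50, position_increment_gap=100))
```

With the assumed values this prints contrib_name:"james smith"~50, the same form as in the quote. The wildcard limitation mentioned there still applies, since the result is a phrase query.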