: In my example the city was parent -- I raised this example to explain
: that index-time joining is more general than just nested docs (ie, I
: think we should keep the name "join" for this module... also because
: we should factor out more general search-time-only join capabilities
: into it).

i think that may be the wrong approach to take when discussing "examples". while it's great to say "there are dozens of usecases that these features can all support in dozens of different ways", we should really focus on naming/demoing these use cases in the ways where they really make the most sense.

In other words, i don't think we should say "All of these types of problems are different types of nails, and all of these modules are specialty hammers that are slightly distinct from each other in how they work, but you can use any of these hammers on any of these nails". instead we should say "here are some specialty hammers, you can use them for lots of types of nails, but for each hammer here is the type of nail where it really shines".

"block-index-join" as i understand it requires all the docs you want to join up to be in one contiguous range of docids in the index, so if you want to re-index one doc in a block you have to re-index the entire block -- so the city/doctor example doesn't sound like a good generic example of when/why to use this (because a doctor might change his office hours, or address -- maybe even leaving the city completely -- while a city might change its population w/o the doctor being affected at all).

The "book and pages" example seems much more appropriate, since in the real world these things change in lock step -- pages aren't added/removed to a book; pages don't change w/o the book itself being fundamentally changed. the fields of a page document are the text of that page, and that is inherently data about the book -- the fields of a doctor document are metadata about the doctor, and that is not inherently data about the city the doctor lives in.

as for the name ... i understand why it's called "modules/join" and i understand why the classes are called "BlockJoinQuery" and "BlockJoinCollector", but i don't think those names really stand out and convey to end users what they do and how/why they are useful.

Personally i think better names would be "modules/subdocuments", "ParentDocumentQuery" and "ChildDocumentsCollector".

I know McCandless isn't a fan of the name "Nested Documents" because this functionality *can* be used for use cases where the data being modeled is not strictly organized in a nested relationship, but that doesn't mean it's *optimal* or easy for a user to apply to other usecases, because they have to design their model (and their indexing strategy) in such a way that they think of them as nested or hierarchical documents.

Naming it "modules/subdocuments" would not only emphasize the usecase where it really shines, it would more importantly draw attention to how users have to model their data in order to take advantage of it -- and using "ParentDocument" and "ChildDocuments" in the names of the Query/Collector would make it clear what they "match" on relative to the underlying query that they wrap/collect.

it would also help distinguish from more general joins like what solr does today -- it seems like that should eventually take the name "modules/join".

At a minimum we should rename what we have now to "modules/block-join" or "modules/index-join" (but the latter is confusing) and eventually add "modules/query-join" (yes, yes, block joins provide a query, but the difference is when you have to make a decision about how you want to join your model: at index time or at query time).

-Hoss
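
To make the re-indexing constraint above concrete, here is a toy sketch in plain Python. It does not use Lucene at all: `ToyBlockIndex`, `add_block`, `update_block` and `parents_matching` are hypothetical names invented for illustration, and the "children first, parent last" layout is an assumption about how a block is arranged. It only models the idea that a block is one contiguous range of docids, so changing one page means rewriting the whole book.

```python
class ToyBlockIndex:
    """Toy append-only index; NOT Lucene. Docids are list positions."""

    def __init__(self):
        self.docs = []  # docid -> dict, or None once deleted

    def add_block(self, children, parent):
        # Assumption: children are written first, the parent last,
        # so the whole block occupies one contiguous docid range.
        start = len(self.docs)
        self.docs.extend({**child, "type": "child"} for child in children)
        self.docs.append({**parent, "type": "parent"})
        return range(start, len(self.docs))

    def update_block(self, old_block, children, parent):
        # A single doc cannot be swapped in place: the old range is
        # deleted and the entire block is appended again.
        for docid in old_block:
            self.docs[docid] = None
        return self.add_block(children, parent)

    def parents_matching(self, child_predicate):
        # Return the title of every parent whose block contains at
        # least one child accepted by child_predicate.
        hits = []
        child_matched = False
        for doc in self.docs:
            if doc is None:
                continue  # deleted doc
            if doc["type"] == "child":
                child_matched = child_matched or child_predicate(doc)
            else:
                if child_matched:
                    hits.append(doc["title"])
                child_matched = False  # next block starts here
        return hits


index = ToyBlockIndex()
pages = [{"text": "call me ishmael"}, {"text": "the white whale"}]
book = {"title": "Moby-Dick"}
block = index.add_block(pages, book)

# Changing one page still rewrites all three docs of the block.
pages[1] = {"text": "the great white whale"}
new_block = index.update_block(block, pages, book)
```

In this toy model the book/pages case costs nothing extra, because pages only change when the book does. In the city/doctor case, every change to one doctor's office hours would go through `update_block` for the whole city.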
