Hi,

This is more of a Solr question, but anyway: I'd strongly suggest not indexing multiple entities in a single document. Your example queries become very simple if you index only one person per document.
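To make the one-person-per-document idea concrete, here is a rough sketch in Python. It is not Nutch plugin code, just the shape of the data. The function names, the `id` and `url` fields, and the idea of deriving an id from the page URL plus the person's position on the page are all my own assumptions, added because you mention having no unique id.

```python
import hashlib


def split_page_into_person_docs(page_url, persons):
    """Turn one parsed page into one flat document per person.

    `persons` is assumed to be a list of dicts produced by the parser,
    each with 'firstname', 'lastname' and a list under 'email'.
    """
    docs = []
    for position, person in enumerate(persons):
        # Assumption: no natural key exists, so derive a synthetic id from
        # the page URL and the person's position on that page.
        raw_id = f"{page_url}#{position}"
        doc_id = hashlib.sha1(raw_id.encode("utf-8")).hexdigest()
        docs.append({
            "id": doc_id,
            "url": page_url,
            "firstname": person["firstname"],
            "lastname": person["lastname"],
            # email is the only field that stays multi-valued
            "email": list(person.get("email", [])),
        })
    return docs


def emails_for_lastname(docs, lastname):
    """Stand-in for the index query: all emails of persons with this lastname."""
    return [mail for doc in docs if doc["lastname"] == lastname
            for mail in doc["email"]]


# Hypothetical parser output for a single page listing two persons.
parsed_persons = [
    {"firstname": "Jane", "lastname": "Doe",
     "email": ["jane@example.org", "j.doe@example.org"]},
    {"firstname": "John", "lastname": "Smith",
     "email": ["john@example.org"]},
]

docs = split_page_into_person_docs("http://example.org/staff", parsed_persons)
doe_emails = emails_for_lastname(docs, "Doe")
```

With documents shaped like this, firstname and lastname are single-valued, so they always belong to the same person. In Solr the two example queries would then be roughly `q=lastname:XXX` for all fields, and the same query with `fl=email` for just the addresses.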
If you cannot do that, you can still hack your way around it, but that is really cumbersome.

Cheers,

> Hi,
>
> I need to index hierarchical data, but as far as I have seen, Nutch/Solr
> do not have a concept like hierarchy; the index seems to be flat.
>
> Now I have a problem I would solve using some sort of hierarchy, and I
> would like to know how you would solve it.
>
> Let's assume I have a set of pages to index that contain information
> about persons, several persons per page. Each person has some properties
> I can parse in my plugin, as the information has a certain structure.
> Therefore my index contains fields like firstname, lastname, email, ...,
> each of them multiValued because there are many persons on a page. As an
> example, say that each person has one or more email addresses associated
> with it.
>
> Now I would like to formulate queries like: return all fields of all
> persons that have lastname XXX, or return all email addresses of persons
> XXX. Since the fields are multiValued, how can I solve this problem? I
> see no possibility to associate an entry within firstname with the
> corresponding lastname, as both fields are multiValued.
> Note that I have no unique id or anything similar that I could use.
>
> Or would the trick here be to treat the persons as separate documents
> and index them separately? Meaning that when parsing I split them, so
> that each person has a separate entry within the index?
>
> Any source code / plugin I could have a look at?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/indexing-hierarchical-data-schema-design-tp3052894p3052894.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

