Hi,

This is more of a Solr question, but anyway: I'd strongly suggest not indexing
multiple entities into a single document. Your example queries become very
simple if you index only one person per document.
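
To make that concrete, here is a minimal sketch in plain Python (no Solr
client involved) of what one person per document could look like. The
function names, the sample data and the synthetic "id" built from the page
URL plus the person's position are all assumptions for illustration; they
are not part of Nutch or Solr.

```python
from typing import Dict, List


def split_page_into_person_docs(page_url: str, persons: List[Dict]) -> List[Dict]:
    """Turn the persons parsed from one page into one flat document each."""
    docs = []
    for position, person in enumerate(persons):
        docs.append({
            # Assumption: no natural key exists, so derive one from URL + position.
            "id": f"{page_url}#person-{position}",
            "url": page_url,
            "firstname": person["firstname"],
            "lastname": person["lastname"],
            # email stays multi-valued, but now it belongs to exactly one person.
            "email": list(person.get("email", [])),
        })
    return docs


def find_by_lastname(docs: List[Dict], lastname: str) -> List[Dict]:
    """Local stand-in for a query such as q=lastname:XXX."""
    return [d for d in docs if d["lastname"] == lastname]


def emails_for_lastname(docs: List[Dict], lastname: str) -> List[str]:
    """Local stand-in for q=lastname:XXX with fl=email."""
    return [e for d in find_by_lastname(docs, lastname) for e in d["email"]]


# Hypothetical parse result for a single page.
page_persons = [
    {"firstname": "Ann", "lastname": "Smith",
     "email": ["ann@example.org", "a.smith@example.org"]},
    {"firstname": "Bob", "lastname": "Jones", "email": ["bob@example.org"]},
]
docs = split_page_into_person_docs("http://example.org/staff.html", page_persons)
```

With this layout both example queries reduce to a single field match, since
every field value in a document belongs to the same person.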

If you cannot do that, you can still hack your way around it, but it is
really cumbersome.
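
One possible workaround of that kind, sketched in plain Python, is to add an
extra multi-valued field that packs each person into a single delimited
string and to unpack it on the client. The "person" field, the "|" and ","
separators and the helper names are assumptions made up for this example,
and it only works if those characters never occur in the data.

```python
from typing import Dict, List

SEP = "|"  # assumed never to appear in names or email addresses


def encode_person(firstname: str, lastname: str, emails: List[str]) -> str:
    """Pack one person into a single value of a multi-valued field."""
    return SEP.join([lastname, firstname, ",".join(emails)])


def decode_person(value: str) -> Dict:
    """Reverse encode_person on the client side."""
    lastname, firstname, emails = value.split(SEP, 2)
    return {
        "lastname": lastname,
        "firstname": firstname,
        "email": emails.split(",") if emails else [],
    }


# One document per page: the separate multi-valued fields lose the grouping
# (three emails, two persons), so the packed "person" field carries it.
page_doc = {
    "url": "http://example.org/staff.html",
    "firstname": ["Ann", "Bob"],
    "lastname": ["Smith", "Jones"],
    "email": ["ann@example.org", "a.smith@example.org", "bob@example.org"],
    "person": [
        encode_person("Ann", "Smith", ["ann@example.org", "a.smith@example.org"]),
        encode_person("Bob", "Jones", ["bob@example.org"]),
    ],
}


def emails_for_lastname_in_page(doc: Dict, lastname: str) -> List[str]:
    """Post-filter a matching page document to the requested person."""
    result = []
    for value in doc["person"]:
        person = decode_person(value)
        if person["lastname"] == lastname:
            result.extend(person["email"])
    return result
```

The search itself still only finds pages, so every hit has to be decoded
and filtered again on the client, which is where it gets cumbersome.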

Cheers,

> Hi,
> 
> I need to index hierarchical data, but as far as I have seen, Nutch/Solr do
> not have a concept of hierarchy; the index seems to be flat.
> 
> Now I have a problem I would solve using some sort of hierarchy and would
> like to know how you would solve it.
> 
> Let's assume I have a set of pages I index that contain information about
> persons, several persons per page.
> Each person has some properties I can parse in my plugin as the information
> has a certain structure.
> Therefore my index contains fields like firstname, lastname, email, ...,
> each of them multiValued because there are many persons on a page. As an
> example, say that each person has one or more email addresses associated
> with it.
> 
> Now I would like to formulate queries like: return all fields of all
> persons that have lastname XXX, or return all email addresses of persons
> XXX. Since the fields are multiValued, how can I solve this problem? I see
> no way to associate an entry within firstname with the corresponding
> lastname, as both fields are multiValued.
> Note that I have no unique id or anything similar that I could use.
> 
> Or would the trick here be that the persons are treated as separate
> documents and indexed separately? Meaning that when parsing I split them
> and index them, so that each person has a separate entry within the index?
> 
> Any source code / plugin I could have a look at?
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/indexing-hierarchical-data-schema-design-tp3052894p3052894.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
