I just started using Solr, and I am trying to figure out how to set up my schema. I know that Solr doesn’t have JOINs, so I am having some difficulty figuring out how I would set up a schema for the following fictional situation. For example, let us say that:
- I have 10,000+ customers, each with some specific info (StoreId, Name, Phone, Address, City, State, Zip, etc.)
- Each customer has a subset of the 100+ products I am looking to track, each product having some specific info (ProductId, Name, Width, Height, Depth, Weight, Density, etc.)
- I want to be able to search by the product info but have facets return the number of customers, rather than the number of products, that meet my criteria
- I want to display (and sort) customers based on my product search

In a relational database, I would simply create two tables (customer and product) and JOIN them. I could then craft a SQL query to count the number of distinct StoreId values in the result (something like facets). In Solr, however, there are no joins. As far as I can tell, my options are to:

- create two Solr instances, one with customer info and one with product info; I would search the product Solr instance and identify the StoreId values returned, and then use that info to search the customer Solr instance to get the customer info. The problem with this is that the second query could have ten thousand clauses (one for each StoreId returned by the first query)
- create a single Solr instance that contains a denormalized version of the data, where each doc would contain both the customer info and the product info for a given product. The problem with this is that my facets would return the number of products, not the number of customers
- create a single Solr instance that contains a denormalized version of the data, where each doc contains the customer info and the info for ALL products that the customer might have (likely done via dynamic fields). The problem with this is that my schema would be a bit messy and that my queries could have hundreds of ANDs and ORs (one AND for each product field, and one OR for each product); for example, q=((Width1:50 AND Density1:7) OR (Width2:50 AND Density2:7) OR …)

Does anyone have any advice on this?
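To make the relational version concrete, here is a rough sketch using Python's built-in sqlite3 module. The table layout (a product table that carries a StoreId column) and the sample rows are made up for illustration and are not my real data. It shows the difference I care about: counting matching product rows per state (what the one-doc-per-product option would give me) versus counting distinct customers per state (what I actually want).

```python
import sqlite3

# Hypothetical miniature dataset: the column names follow the fields listed
# above, but the table layout and the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (StoreId INTEGER PRIMARY KEY, Name TEXT, State TEXT);
    CREATE TABLE product (
        StoreId   INTEGER REFERENCES customer(StoreId),
        ProductId INTEGER,
        Width     INTEGER,
        Density   INTEGER
    );
    INSERT INTO customer VALUES
        (1, 'Store A', 'NY'), (2, 'Store B', 'NY'), (3, 'Store C', 'CA');
    INSERT INTO product VALUES
        (1, 101, 50, 7),   -- Store A, matches the search
        (1, 102, 50, 7),   -- Store A again, second matching product
        (2, 101, 50, 7),   -- Store B, matches the search
        (3, 103, 40, 7);   -- Store C, does not match (Width is 40)
""")

JOIN_AND_FILTER = """
    FROM customer c JOIN product p ON p.StoreId = c.StoreId
    WHERE p.Width = 50 AND p.Density = 7
    GROUP BY c.State ORDER BY c.State
"""

# Counts one per matching product row, so Store A is counted twice.
rows_per_state = conn.execute(
    "SELECT c.State, COUNT(*)" + JOIN_AND_FILTER
).fetchall()

# Counts each customer once, no matter how many of its products match.
customers_per_state = conn.execute(
    "SELECT c.State, COUNT(DISTINCT c.StoreId)" + JOIN_AND_FILTER
).fetchall()

print(rows_per_state)       # [('NY', 3)]
print(customers_per_state)  # [('NY', 2)]
```

The second result is the facet-like count I am after; the first is what I would expect to get if every product were its own document.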
Are there other schemas that might work? Hopefully the example makes sense.

--
View this message in context: http://old.nabble.com/question-about-schemas-tp26600956p26600956.html
Sent from the Solr - User mailing list archive at Nabble.com.