Hello everyone,

I was wondering what best practices people follow for indexing nested XML into Solr. Please don't feel limited by my examples; feel free to share your own experiences.
Given an XML structure such as the following:

    <categoryPath>
      <category>
        <id>cat001</id>
        <name>Everything</name>
      </category>
      <category>
        <id>cat002</id>
        <name>Music</name>
      </category>
      <category>
        <id>cat003</id>
        <name>Pop</name>
      </category>
    </categoryPath>

How do you make the best use of the data when indexing?

1) Do you use Scenario A?

    categoryPath_category_id = cat001 cat002 cat003 (flattened)
    categoryPath_category_name = Everything Music Pop (flattened)

If so, then how do you manage to find the corresponding categoryPath_category_id if someone's search matches a value in the categoryPath_category_name field? I understand that Solr is not about lookups, but this may be important information for you to display right away as part of the search results page rendering.

2) Do you use Scenario B?

    categoryPath_category_id = [cat001 cat002 cat003] (the [] signifies a multi-value field)
    categoryPath_category_name = [Everything Music Pop] (the [] signifies a multi-value field)

And once again, how do you find associated data sets once something matches?

Side question: How can one configure DIH to store the data this way for Scenario B?

Thanks!

- Pulkit
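
P.S. To make Scenario B concrete, here is a rough sketch of what I mean, written in plain Python rather than against Solr itself. It parses the sample XML into two parallel lists named after the fields above, then looks up an id by the position of a matching name. The helper names (flatten_category_path, id_for_name) are made up for illustration, and the lookup assumes the two multi-value fields come back in the same order they were indexed, which is exactly the part I am unsure how people handle in practice.

```python
import xml.etree.ElementTree as ET

# Sample document from the question above.
SAMPLE_XML = """
<categoryPath>
  <category><id>cat001</id><name>Everything</name></category>
  <category><id>cat002</id><name>Music</name></category>
  <category><id>cat003</id><name>Pop</name></category>
</categoryPath>
"""


def flatten_category_path(xml_text):
    """Build Scenario B style parallel multi-value fields (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    categories = root.findall("category")
    return {
        "categoryPath_category_id": [c.findtext("id") for c in categories],
        "categoryPath_category_name": [c.findtext("name") for c in categories],
    }


def id_for_name(doc, name):
    """Return the id at the same position as the matching name, or None.

    Assumption: both lists preserve the original element order, so
    position i in one field corresponds to position i in the other.
    """
    names = doc["categoryPath_category_name"]
    ids = doc["categoryPath_category_id"]
    if name in names:
        return ids[names.index(name)]
    return None


doc = flatten_category_path(SAMPLE_XML)
matched_id = id_for_name(doc, "Music")
```

Scenario A would be the same two lists joined with spaces, at which point the positional link only survives if no name contains a space.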