I am using the DataImportHandler to query a SQL Server database and populate Solr. Unfortunately, SQL has no built-in understanding of hierarchical relationships, so I use table joins. The following is an outline of my table structure:

PROD_TABLE
  -> SKU (Primary Key)
  -> Title (varchar)
  -> Descr (varchar)

CAT_TABLE
  -> SKU (Foreign Key)
  -> CategoryLevel (int, i.e. 1, 2, 3 …)
  -> CategoryName (varchar)

I specify the SQL query in the db-data-config.xml file, a snippet of which looks like:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost\...."/>
  <document>
    <entity name="Product" query="SELECT SKU, Title, Descr FROM PROD_TABLE">
      <field column="SKU" name="SKU" />
      <field column="Title" name="Title" />
      <field column="Descr" name="Descr" />

      <entity name="Cat1" query="SELECT CategoryName from CAT_TABLE where SKU='${Product.SKU}' AND CategoryLevel=1">
        <field column="CategoryName" name="Category1" />
      </entity>
      <entity name="Cat2" query="SELECT CategoryName from CAT_TABLE where SKU='${Product.SKU}' AND CategoryLevel=2">
        <field column="CategoryName" name="Category2" />
      </entity>
      <entity name="Cat3" query="SELECT CategoryName from CAT_TABLE where SKU='${Product.SKU}' AND CategoryLevel=3">
        <field column="CategoryName" name="Category3" />
      </entity>
    </entity>
  </document>
</dataConfig>

It seems that the DataImportHandler sends out three or four queries for each Product. This results in a very slow import. Is there any way to speed this up? I would not mind an intermediate step of first extracting the data with SQL and then putting it into Solr.

Thank you for all your help.
O. O.

--
View this message in context: http://lucene.472066.n3.nabble.com/Speed-up-import-of-Hierarchical-Data-tp4063924.html
Sent from the Solr - User mailing list archive at Nabble.com.
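
For illustration only, one way to avoid the per-product sub-queries described above is to let SQL Server do the join and return one row per product. The fragment below is an untested sketch rather than a confirmed fix. It assumes each SKU has at most one CategoryName per CategoryLevel (if there are several, MAX() keeps only one of them), that only levels 1 to 3 are needed, and that the aliases p and c, which are not in the original config, are acceptable.

```xml
<!-- Sketch: a single joined query replaces the Cat1/Cat2/Cat3 sub-entities.
     Assumption: at most one CategoryName per SKU and CategoryLevel. -->
<entity name="Product"
        query="SELECT p.SKU, p.Title, p.Descr,
                      MAX(CASE WHEN c.CategoryLevel = 1 THEN c.CategoryName END) AS Category1,
                      MAX(CASE WHEN c.CategoryLevel = 2 THEN c.CategoryName END) AS Category2,
                      MAX(CASE WHEN c.CategoryLevel = 3 THEN c.CategoryName END) AS Category3
               FROM PROD_TABLE p
               LEFT JOIN CAT_TABLE c ON c.SKU = p.SKU
               GROUP BY p.SKU, p.Title, p.Descr">
  <field column="SKU" name="SKU" />
  <field column="Title" name="Title" />
  <field column="Descr" name="Descr" />
  <!-- The pivoted columns map directly onto the existing Solr fields. -->
  <field column="Category1" name="Category1" />
  <field column="Category2" name="Category2" />
  <field column="Category3" name="Category3" />
</entity>
```

This entity would take the place of the nested Product entity inside the existing <document> element, with the <dataSource> left unchanged. The LEFT JOIN keeps products that have no category rows, and the whole import then runs as one query instead of several per product.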