I am using the DataImportHandler to query a SQL Server database and populate
Solr. Unfortunately, SQL has no native notion of hierarchical relationships,
so I model them with table joins. The following is an outline of my table
structure:


PROD_TABLE
-> SKU (Primary Key)
-> Title (varchar)
-> Descr (varchar)

CAT_TABLE
-> SKU (Foreign Key)
-> CategoryLevel (int, i.e. 1, 2, 3 …)
-> CategoryName (varchar)

I specify the SQL queries in the db-data-config.xml file, a snippet of which
looks like:

<dataConfig>
    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://localhost\...."/>
    <document>
        <entity name="Product"
                query="SELECT SKU, Title, Descr FROM PROD_TABLE">
            <field column="SKU" name="SKU" />
            <field column="Title" name="Title" />
            <field column="Descr" name="Descr" />

            <entity name="Cat1"
                    query="SELECT CategoryName FROM CAT_TABLE WHERE SKU='${Product.SKU}' AND CategoryLevel=1">
                <field column="CategoryName" name="Category1" />
            </entity>
            <entity name="Cat2"
                    query="SELECT CategoryName FROM CAT_TABLE WHERE SKU='${Product.SKU}' AND CategoryLevel=2">
                <field column="CategoryName" name="Category2" />
            </entity>
            <entity name="Cat3"
                    query="SELECT CategoryName FROM CAT_TABLE WHERE SKU='${Product.SKU}' AND CategoryLevel=3">
                <field column="CategoryName" name="Category3" />
            </entity>
        </entity>
    </document>
</dataConfig>

It seems like the DataImportHandler sends out a separate query per category
entity for each Product (three in the snippet above), on top of the main
Product query. This results in a very slow import. Is there any way to speed
this up? I would not mind an intermediate step of first extracting the data
from SQL Server and then putting it into Solr.
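
For illustration, a single-query version of the config above might look
roughly like the sketch below. It is untested and rests on two assumptions
that are not stated above: that each SKU has at most one CategoryName per
CategoryLevel (MAX would otherwise keep only one of them), and that only
levels 1 to 3 need to be indexed. The aliases p and c and the column aliases
Category1 to Category3 are names chosen for the example.

<dataConfig>
    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://localhost\...."/>
    <document>
        <!-- One query for all products: the category rows are joined in and
             pivoted into one column per level, so no sub-entity queries run.
             Assumes at most one category name per SKU and level. -->
        <entity name="Product"
                query="SELECT p.SKU, p.Title, p.Descr,
                              MAX(CASE WHEN c.CategoryLevel = 1 THEN c.CategoryName END) AS Category1,
                              MAX(CASE WHEN c.CategoryLevel = 2 THEN c.CategoryName END) AS Category2,
                              MAX(CASE WHEN c.CategoryLevel = 3 THEN c.CategoryName END) AS Category3
                       FROM PROD_TABLE p
                       LEFT JOIN CAT_TABLE c ON c.SKU = p.SKU
                       GROUP BY p.SKU, p.Title, p.Descr">
            <field column="SKU" name="SKU" />
            <field column="Title" name="Title" />
            <field column="Descr" name="Descr" />
            <field column="Category1" name="Category1" />
            <field column="Category2" name="Category2" />
            <field column="Category3" name="Category3" />
        </entity>
    </document>
</dataConfig>

The LEFT JOIN keeps products that have no category rows at all; their
Category columns would simply come back as NULL.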

Thank you for all your help. 
O. O.