Specifying multiple documents in DataImportHandler dataConfig

Rupert Fiasco Tue, 08 Sep 2009 16:06:04 -0700

I am using the DataImportHandler with a JDBC datasource. From my
understanding of DIH, for each of my "content types" (e.g. blog posts,
mesh categories, etc.) I would construct a separate document/entity
set, like this:


<dataConfig>
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />

    <!-- BLOG ENTRIES -->
    <document name="blog_entries">
      <entity name="blog_entries"
              query="select id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type from blog_entries">
        <field column="id" name="pk_i" />
        <field column="id" name="id" />
        <field column="title" name="text_t" />
        <field column="data" name="text_t" />
      </entity>
    </document>

    <!-- MESH CATEGORIES -->
    <document name="mesh_category">
      <entity name="mesh_categories"
              query="select id,name,node_key,name as name_fc,'MeshCategory' as type from mesh_categories">
        <field column="id" name="pk_i" />
        <field column="id" name="id" />
        <field column="name" name="text_t" />
        <field column="node_key" name="string" />
        <field column="name_fc" name="facet_value" />
        <field column="type" name="type_t" />
      </entity>
    </document>
</dataConfig>


Solr parses this just fine and lets me issue a
/dataimport?command=full-import, which runs, but only against
the first document (blog_entries). It doesn't run against the second
document (mesh_category).

If I remove the two document elements and wrap both entities in a
single document tag, then both sets get indexed, which seemingly
achieves my goal. But this doesn't make sense given my understanding of
how DIH works: my two content types are indeed separate, so they
logically represent two document types, not one.
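
For reference, the single-document variant that does index both sets
looks roughly like the sketch below. It reuses the entity names, queries
and field mappings from the config above; the JDBC URL is left elided as
before, and dropping the name attribute on the document element is my
own simplification.

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />

  <!-- One document element wrapping both entities -->
  <document>
    <!-- BLOG ENTRIES -->
    <entity name="blog_entries"
            query="select id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type from blog_entries">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="title" name="text_t" />
      <field column="data" name="text_t" />
    </entity>

    <!-- MESH CATEGORIES -->
    <entity name="mesh_categories"
            query="select id,name,node_key,name as name_fc,'MeshCategory' as type from mesh_categories">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="name" name="text_t" />
      <field column="node_key" name="string" />
      <field column="name_fc" name="facet_value" />
      <field column="type" name="type_t" />
    </entity>
  </document>
</dataConfig>
```

With this layout, the constant type column selected in each query
('BlogEntry' or 'MeshCategory') is the only thing that distinguishes the
two content types.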

Is this correct? What am I missing here?

Thanks
-Rupert
