Maybe I should be clearer: I have multiple tables in my DB that I
need to save to my Solr index. In my app code I have logic to persist
each table, which maps to an application model, to Solr. This works fine.
I am just trying to speed up indexing by using DIH instead of
going through my application. From what I understand of DIH, I can
specify one dataSource element and then a series of document/entity
sets, one for each of my models. But like I said before, DIH only appears
to want to index the first document declared under the dataSource tag.
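
For reference, here is a sketch of the single-document variant that does
index both tables for me. It uses the same queries and field mappings as
the config in my original message below; the only change is that both
entities sit inside one document element. The document name "all_content"
is just a placeholder I made up, and the JDBC url is elided as before.

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />

  <!-- One document element; each top-level entity below emits its own
       Solr documents, one per row returned by its query. -->
  <document name="all_content">

    <!-- BLOG ENTRIES -->
    <entity name="blog_entries"
            query="select id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type from blog_entries">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="title" name="text_t" />
      <field column="data" name="text_t" />
    </entity>

    <!-- MESH CATEGORIES -->
    <entity name="mesh_categories"
            query="select id,name,node_key,name as name_fc,'MeshCategory' as type from mesh_categories">
      <field column="id" name="pk_i" />
      <field column="id" name="id" />
      <field column="name" name="text_t" />
      <field column="node_key" name="string" />
      <field column="name_fc" name="facet_value" />
      <field column="type" name="type_t" />
    </entity>

  </document>
</dataConfig>
```

With this layout, /dataimport?command=full-import picks up both entities.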

-Rupert

On Tue, Sep 8, 2009 at 4:05 PM, Rupert Fiasco<rufia...@gmail.com> wrote:
> I am using the DataImportHandler with a JDBC datasource. From my
> understanding of DIH, for each of my "content types" e.g. Blog posts,
> Mesh Categories, etc I would construct a series of document/entity
> sets, like
>
> <dataConfig>
> <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://...." />
>
>    <!-- BLOG ENTRIES -->
>    <document name="blog_entries">
>      <entity name="blog_entries" query="select
> id,title,keywords,summary,data,title as name_fc,'BlogEntry' as type
> from blog_entries">
>        <field column="id" name="pk_i" />
>        <field column="id" name="id" />
>        <field column="title" name="text_t" />
>        <field column="data" name="text_t" />
>      </entity>
>    </document>
>
>    <!-- MESH CATEGORIES -->
>    <document name="mesh_category">
>      <entity name="mesh_categories" query="select
> id,name,node_key,name as name_fc,'MeshCategory' as type from
> mesh_categories">
>        <field column="id" name="pk_i" />
>        <field column="id" name="id" />
>        <field column="name" name="text_t" />
>        <field column="node_key" name="string" />
>        <field column="name_fc" name="facet_value" />
>        <field column="type" name="type_t" />
>      </entity>
>    </document>
> </dataConfig>
>
>
> Solr parses this just fine and allows me to issue a
> /dataimport?command=full-import, and it runs, but it only runs against
> the first document (blog_entries). It doesn't run against the second
> document (mesh_category).
>
> If I remove the two document elements and wrap both entity sets in just
> one document tag, then both sets get indexed, which seemingly achieves
> my goal. This just doesn't make sense from my understanding of how DIH
> works. My two content types are indeed separate, so they logically
> represent two document types, not one.
>
> Is this correct? What am I missing here?
>
> Thanks
> -Rupert
>
