I am still struggling with nested DIH myself, but I notice that your
correlation condition is on the field level (@StoreId='${store.id}').
Were you planning to repeat it for each field definition?

Have you tried putting it instead in the forEach section?
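
Something like the following is what I have in mind. This is an untested
sketch: I am assuming that forEach accepts an attribute predicate and that
${store.id} gets resolved there for each parent store, and I have not
verified either. The entity and field names are taken from your config.

    <entity name="storearticle"
            processor="XPathEntityProcessor"
            forEach="/StoreArticles/Store[@StoreId='${store.id}']"
            url="../../../data/StoreArticlesTest.xml">
      <!-- the correlation now lives in forEach, so the field xpath stays plain -->
      <field column="store_articles_txt"
             xpath="/StoreArticles/Store/ArticleId" />
    </entity>

If that works, each field definition would no longer need to repeat the
condition.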

Alternatively, maybe you need to use $skipDoc as in the Wikipedia
import example?
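
From memory, the Wikipedia example flags the rows to drop with a
RegexTransformer field roughly like the one below. Treat it as a
recollection of the wiki page, not a tested config; the entity would also
need transformer="RegexTransformer", and how to adapt the regex to a
store-id match is something I have not worked out.

    <!-- sets $skipDoc to "true" when the "text" column matches the regex -->
    <field column="$skipDoc" regex="^#REDIRECT .*"
           replaceWith="true" sourceColName="text" />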

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Sat, Jul 21, 2012 at 1:34 PM, Tobias Berg <tobias.h...@gmail.com> wrote:
> Hi,
>
> I'm trying to index a set of stores and their articles. I have two
> XML-files, one that contains the data of the stores and one that contains
> articles for each store. I'm using DIH with XPathEntityProcessor to process
> the file containing the store, and using a nested entity I try to get all
> articles that belong to the specific store. The problem I encounter is
> that every store gets the same articles.
>
> For testing purposes I've stripped down the XML files to include only the
> IDs. The store file (StoresTest.xml) looks like this:
>
> <?xml version="1.0" encoding="utf-8"?>
> <Stores><Store><Id>0102</Id></Store><Store><Id>0104</Id></Store></Stores>
>
> The Store-Articles relations file (StoreArticlesTest.xml) looks like this:
> <?xml version="1.0" encoding="utf-8"?>
> <StoreArticles>
>   <Store StoreId="0102"><ArticleId>18004</ArticleId></Store>
>   <Store StoreId="0104"><ArticleId>17004</ArticleId><ArticleId>10004</ArticleId></Store>
> </StoreArticles>
>
> And my dih-config file looks like this:
>
> <dataConfig>
>         <dataSource type="FileDataSource" encoding="UTF-8" />
>         <document>
>    <entity name="store"
> processor="XPathEntityProcessor"
> stream="true"
> forEach="/Stores/Store"
> url="../../../data/StoresTest.xml"
> transformer="TemplateTransformer"
>>
> <field column="id"  xpath="/Stores/Store/Id" />
> <entity name="storearticle"
> processor="XPathEntityProcessor"
> stream="true"
> forEach="/StoreArticles"
> url="../../../data/StoreArticlesTest.xml"
> transformer="LogTransformer"
> logTemplate="Processing ${store.id}" logLevel="info"
> rootEntity="true">
>  <field column="store_articles_txt" xpath="/StoreArticles/Store[@StoreId='${store.id}']/ArticleId" />
> </entity>
>    </entity>
> </document>
> </dataConfig>
>
> The result I get in Solr is this:
>
> <response>
> <lst name="responseHeader">...</lst>
> <result name="response" numFound="2" start="0">
> <doc>
> <str name="id">0102</str>
> <arr name="store_articles_txt">
> <str>18004</str>
> </arr>
> </doc>
> <doc>
> <str name="id">0104</str>
> <arr name="store_articles_txt">
> <str>18004</str>
> </arr>
> </doc>
> </result>
> </response>
>
> As you see, both stores get the article for the first store. I would have
> expected the second store to have two articles: 17004 and 10004.
>
> In the log messages printed using LogTransformer I see that each
> store.id is processed, but somehow it only picks up the articles for the
> first store.
>
> Any ideas?
>
> /Tobias Berg
