Try this.

Add two XPaths to your forEach, separated by "|":

forEach="/document/category/item | /document/category/name"

and add a field as follows

<field column="categoryname" xpath="/document/category/name" commonField="true"/>
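
Put together with the config you posted, the entity might look roughly like the sketch below. It is untested. I have kept your url, dataSource and field xpaths as they were, and I am assuming the column should be called "category" so that it matches the field in your schema.

```xml
<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <!-- forEach now lists both the item path and the category name path -->
    <entity name="archive" pk="id"
            url="http://localhost:9080/data/20090817070752.xml"
            processor="XPathEntityProcessor"
            forEach="/document/category/item | /document/category/name"
            transformer="DateFormatTransformer" stream="true"
            dataSource="dataSource">
      <!-- commonField="true" is intended to carry the category name
           over to the item records that follow it -->
      <!-- assumption: column named "category" to match the schema field -->
      <field column="category" xpath="/document/category/name"
             commonField="true" />
      <field column="id" xpath="/document/category/item/id" />
      <field column="author" xpath="/document/category/item/author" />
    </entity>
  </document>
</dataConfig>
```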

Please try it out and let me know.

On Thu, Sep 10, 2009 at 7:30 AM, venn hardy <venn.ha...@hotmail.com> wrote:
>
> Hello,
>
>
>
> I am using SOLR 1.4 (from nightly build) and its URLDataSource in conjunction 
> with the XPathEntityProcessor. I have successfully imported XML content, but 
> I think I may have found a limitation when it comes to the commonField 
> attribute in the DataImportHandler.
>
>
>
> Before writing my own parser to read in a whole XML document, I thought I'd 
> post the question here (since I got some great advice last time).
>
>
>
> The bulk of my content is contained within each <item> tag. However, each 
> item has a parent called <category> and each category has a name which I 
> would like to import. In my forEach loop I specify the 
> /document/category/item as the collection of items I am interested in. Is 
> there any way to extract an element from underneath a parent node? To be 
> more specific (see the example XML below), I would like to index the following:
>
> - category: Category 1; id: 1; author: Author 1
>
> - category: Category 1; id: 2; author: Author 2
>
> - category: Category 2; id: 3; author: Author 3
>
> - category: Category 2; id: 4; author: Author 4
>
>
>
> Any ideas on how I can get to a parent node from within a child during data 
> import? If it can't be done, what do you suggest would be the best way so I 
> can keep using the DataImportHandler... would XSLT be a good idea to 'flatten 
> out' the structure a bit?
>
>
>
> Thanks
>
>
>
> This is what my XML document looks like:
>
> <document>
>  <category>
>  <name>Category 1</name>
>  <item>
>   <id>1</id>
>   <author>Author 1</author>
>  </item>
>  <item>
>   <id>2</id>
>   <author>Author 2</author>
>  </item>
>  </category>
>  <category>
>  <name>Category 2</name>
>  <item>
>   <id>3</id>
>   <author>Author 3</author>
>  </item>
>  <item>
>   <id>4</id>
>   <author>Author 4</author>
>  </item>
>  </category>
> </document>
>
>
>
> And this is what my dataConfig looks like:
> <dataConfig>
>  <dataSource type="URLDataSource" />
>  <document>
>   <entity name="archive" pk="id" 
> url="http://localhost:9080/data/20090817070752.xml" 
> processor="XPathEntityProcessor" forEach="/document/category/item" 
> transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
>    <field column="category" xpath="/document/category/name" 
> commonField="true" />
>    <field column="id" xpath="/document/category/item/id" />
>    <field column="author" xpath="/document/category/item/author" />
>   </entity>
>  </document>
> </dataConfig>
>
>
>
> This is how I have specified my schema
> <fields>
>   <field name="id" type="string" indexed="true" stored="true" required="true" 
> />
>   <field name="author" type="string" indexed="true" stored="true"/>
>   <field name="category" type="string" indexed="true" stored="true"/>
> </fields>
>
> <uniqueKey>id</uniqueKey>
> <defaultSearchField>id</defaultSearchField>
>
>
>
>
>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
