Hi all, I'm trying to set up DataImportHandler to index some XML documents available over web services. The XML includes both content and metadata, so for the indexable content, I'm trying to just index everything under the content tag:
<entity dataSource="kbws" name="kbxml" pk="title" url="resturl"
        processor="XPathEntityProcessor" forEach="/document"
        transformer="HTMLStripTransformer" flatten="true">
  <field column="content" name="content" xpath="/document/kbml/body"
         flatten="true" stripHTML="true" />
  <field column="title" name="title" xpath="/document/kbml/kbq" />
</entity>

The result of this is that the title field gets populated and indexed (there are no child nodes of /document/kbml/kbq), but content does not get indexed at all. Since /document/kbml/body has many children, I expected that flatten="true" would store all of the body text in the field. Instead, it stores nothing at all. I've tried this with many combinations of transformers and flatten options, and the result is the same each time.

Here are the relevant field declarations from the schema (the type="text" is just the one from the example's schema.xml). I have tried combinations here as well of stored= and multiValued=, with the same result each time.

<field name="title" type="text" indexed="true" stored="true" multiValued="true" />
<field name="content" type="text" indexed="true" stored="true" multiValued="true" />

If it would help troubleshooting, I could send along some sample XML. I don't want to spam the list with an attachment unless it's necessary, though :)

Thanks in advance for your help,
Adam Foltzer
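
P.S. To make the expectation concrete, here is a minimal sketch of what I assumed flatten="true" would do: collect all the text under /document/kbml/body, ignoring the child markup. It uses Python's standard ElementTree rather than DIH, and the sample document is made up for illustration (only the document/kbml/body and document/kbml/kbq paths come from my config; the child elements and text are hypothetical). It says nothing about how XPathEntityProcessor actually implements flattening.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Only the /document/kbml/body and
# /document/kbml/kbq paths match the config above; the <p> and <b>
# children and all text are invented for this illustration.
SAMPLE = """<document>
  <kbml>
    <kbq>How do I reset my password?</kbq>
    <body><p>Open the <b>Settings</b> page.</p><p>Click Reset.</p></body>
  </kbml>
</document>"""


def flatten_text(xml_string, path):
    """Return all text under the element at `path`, with markup dropped.

    `path` is relative to the root element, e.g. "kbml/body".
    Returns an empty string if the element does not exist.
    """
    root = ET.fromstring(xml_string)
    node = root.find(path)
    if node is None:
        return ""
    # itertext() walks the element and every descendant in document order,
    # yielding each text and tail fragment.
    pieces = " ".join(node.itertext())
    # Collapse runs of whitespace left over from joining and indentation.
    return " ".join(pieces.split())


content = flatten_text(SAMPLE, "kbml/body")
title = flatten_text(SAMPLE, "kbml/kbq")
print("title:  ", title)
print("content:", content)
```

In other words, I expected the content field to end up with one string holding the text of every descendant of body, in the same way that title ends up with the text of kbq.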