Send a small sample XML snippet of what you are trying to index; it may help.

On Tue, Oct 6, 2009 at 9:29 PM, Adam Foltzer <acfolt...@gmail.com> wrote:
> Hi all,
>
> I'm trying to set up DataImportHandler to index some XML documents available
> over web services. The XML includes both content and metadata, so for the
> indexable content, I'm trying to just index everything under the content
> tag:
>
> <entity dataSource="kbws" name="kbxml" pk="title"
>         url="resturl" processor="XPathEntityProcessor"
>         forEach="/document" transformer="HTMLStripTransformer"
>         flatten="true">
>   <field column="content" name="content" xpath="/document/kbml/body"
>          flatten="true" stripHTML="true" />
>   <field column="title" name="title" xpath="/document/kbml/kbq" />
> </entity>
>
> The result of this is that the title field gets populated and indexed (there
> are no child nodes of /document/kbml/kbq), but content does not get indexed
> at all. Since /document/kbml/body has many children, I expected that
> flatten="true" would store all of the body text in the field. Instead, it
> stores nothing at all. I've tried this with many combinations of
> transformers and flatten options, and the result is the same each time.
>
> Here are the relevant field declarations from the schema (the type="text" is
> just the one from the example's schema.xml). I have tried combinations here
> as well of stored= and multiValued=, with the same result each time.
>
> <field name="title" type="text" indexed="true" stored="true"
>        multiValued="true" />
> <field name="content" type="text" indexed="true" stored="true"
>        multiValued="true" />
>
> If it would help troubleshooting, I could send along some sample XML. I
> don't want to spam the list with an attachment unless it's necessary, though
> :)
>
> Thanks in advance for your help,
>
> Adam Foltzer
>
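
For illustration, a snippet of roughly the following shape would be enough to reproduce the problem. This is only a sketch: the `document`, `kbml`, `kbq`, and `body` elements are inferred from the XPaths in the quoted configuration, while the `p` and `b` children and all text values are hypothetical placeholders, so the real documents may look different.

```xml
<!-- Hypothetical sample document. Only the paths /document/kbml/kbq and
     /document/kbml/body come from the quoted config; everything else is
     an assumption made for illustration. -->
<document>
  <kbml>
    <!-- No child nodes here, matching the description of the title field -->
    <kbq>Example title text</kbq>
    <!-- Several child elements, which flatten="true" is expected to
         collapse into a single text value -->
    <body>
      <p>First paragraph with <b>inline markup</b>.</p>
      <p>Second paragraph.</p>
    </body>
  </kbml>
</document>
```

Pasting something this small inline, rather than as an attachment, is usually enough for the list to see how the body element is nested.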
--
-----------------------------------------------------
Noble Paul | Principal Engineer | AOL | http://aol.com