There is an option somewhere to use a full XML DOM implementation
for evaluating xpaths. The XPathEntityProcessor is meant to be as
simple and dumb as possible and to handle the common cases: RSS feeds
and other open standards.

Search for "xsl (optional)" on this page:

http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1
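
As a rough illustration of what a full DOM evaluation gives you for the
two fields in the question, here is a small sketch in plain Python. It is
not Solr or DIH code. The sample document and the extract_fields helper
are made up for the example, and the whitespace joining is my own choice,
not what flatten="true" does internally.

```python
import xml.etree.ElementTree as ET

# Made-up sample, shaped like the /html/body/h1 stream from the question.
SAMPLE = (
    "<html><body>"
    "<h1>My Title</h1>"
    "<p>First paragraph.</p>"
    "<p>Second one.</p>"
    "</body></html>"
)


def extract_fields(xml_text):
    """Hypothetical helper: read a nested node and its parent from one DOM."""
    root = ET.fromstring(xml_text)
    body = root.find(".//body")
    if body is None:
        return {"title": "", "alltext": ""}

    # The nested path (body/h1) and the enclosing path (body) are both
    # answered from the same in-memory tree, so neither one starves the other.
    h1 = body.find("h1")
    title = "".join(h1.itertext()) if h1 is not None else ""

    # itertext() walks text nodes in document order, so the original
    # token order is preserved in the flattened field.
    alltext = " ".join(t.strip() for t in body.itertext() if t.strip())
    return {"title": title, "alltext": alltext}


if __name__ == "__main__":
    fields = extract_fields(SAMPLE)
    print(fields["title"])
    print(fields["alltext"])
```

With a whole tree in memory both fields get filled, at the cost of
holding the entire document, which is what the streaming reader avoids.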

On Sat, Apr 9, 2011 at 5:32 AM,  <karsten-s...@gmx.de> wrote:
> Hi Folks,
>
> is anyone working on improving the DIH XPathRecordReader to deal with nested xpaths?
> e.g.
> data-config.xml with
>  <entity .. processor="XPathEntityProcessor" ..
>  <field column="title" xpath="//body/h1"/>
>  <field column="alltext" xpath="//body" flatten="true"/>
> and the XML stream contains
>  /html/body/h1...
> will only fill field “alltext” but field “title” will be empty.
>
> This is a known issue from 2009
> https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose
>
> So three questions:
> 1. How to fill a “search over all”-Field without nested xpaths?
>   (schema.xml  <copyField source="*" dest="alltext"/> will not help, because 
> we lose the original token order)
> 2. Is anyone trying to improve XPathRecordReader to deal with nested xpaths?
> 3. Does anyone else need this feature?
>
>
> Best regards
>  Karsten
>



-- 
Lance Norskog
goks...@gmail.com
