Re: DIH - importing XML with nested elements that have the same name

2012-09-26 Thread Gora Mohanty
On 26 September 2012 23:47, Billy Newman wrote:
> To be a little more specific, the error message I get is:
> "forEach cannot start with '//'"
>
> Cannot really find anything on this except for
> https://issues.apache.org/jira/browse/SOLR-1437, which only talks
> about using '//' for the xpath attribute in the field tag, nothing
> about using '//' in a forEach attribute.

You are probably running into a limitation on what XPath
syntax Solr can handle.

I have never tried this, but one possibility might be to use
XSLT to get all the <item> elements. The DIH "entity"
element can be given an "xsl" attribute for this. Please
see http://wiki.apache.org/solr/DataImportHandler , and
search Google for examples of using XSLT with Solr DIH.
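
As a rough, untested-against-DIH sketch of the XSLT idea: the stylesheet
below rewrites the nested document into a flat list of <item> elements
under <test>, so that a plain forEach="/test/item" could then see every
item. It is run here through the JDK's own javax.xml.transform API only
to show what the stylesheet produces. The class name ItemFlattener and
the <id> and <name> child elements are assumptions made for illustration;
they are not taken from the original document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ItemFlattener {

    // Stylesheet that emits every <item>, at any depth, as a direct child of <test>.
    // Each output item keeps its own child elements except nested <item> children.
    static final String FLATTEN_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>"
      + "<test><xsl:apply-templates select='//item'/></test>"
      + "</xsl:template>"
      + "<xsl:template match='item'>"
      + "<item><xsl:copy-of select='*[not(self::item)]'/></item>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the flattened XML.
    static String flatten(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(FLATTEN_XSL)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not transform input", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input: <id> and <name> are assumed field names.
        String xml = "<test><item><id>1</id><name>Item 1</name></item>"
                   + "<item><id>2</id><name>Item 2</name>"
                   + "<item><id>3</id><name>Item 3</name></item></item></test>";
        System.out.println(flatten(xml));
    }
}
```

The same stylesheet text, saved to a file, is what one would point the
"xsl" attribute at; whether DIH applies it exactly this way should be
checked against the wiki page above.";
String flat = ItemFlattener.flatten(xml);
if (flat.split("<item>", -1).length - 1 != 4) throw new AssertionError("expected 4 items: " + flat);
if (!flat.contains("<item><id>2</id><name>Item 2</name></item>")) throw new AssertionError("item 2 not flattened: " + flat);
if (!flat.contains("<item><id>4</id><name>Item 4</name></item>")) throw new AssertionError("item 4 missing: " + flat);
</test>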

Your other alternative is to extract the <item> elements
before indexing into Solr. You could do that inside SolrJ,
for example.
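
A minimal sketch of that pre-extraction step, using only the XPath
support that ships with the JDK (javax.xml.xpath), which does accept
"//item". It collects one value per item at any nesting depth; sending
the results to Solr through SolrJ is left out. The class name
ItemExtractor and the <id> child element are assumptions made for
illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ItemExtractor {

    // Returns the text of the <id> child of every <item>, at any depth.
    // The <id> element name is an assumption made for this sketch.
    static List<String> itemIds(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // "//item" matches item elements at every nesting level.
            NodeList items = (NodeList) xpath.evaluate("//item", doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // "id" is evaluated relative to this item and selects only its
                // direct <id> child, not the ids of items nested inside it.
                ids.add(xpath.evaluate("id", item));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input: <id> and <name> are assumed field names.
        String xml = "<test><item><id>1</id><name>Item 1</name></item>"
                   + "<item><id>2</id><name>Item 2</name>"
                   + "<item><id>3</id><name>Item 3</name></item></item></test>";
        System.out.println(itemIds(xml));
    }
}
```

Each extracted item could then be turned into its own Solr document on
the client side, so the nesting never has to be expressed in forEach.";
java.util.List<String> ids = ItemExtractor.itemIds(xml);
if (!ids.equals(java.util.Arrays.asList("1", "2", "3", "4"))) throw new AssertionError("unexpected ids: " + ids);
</test>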

Regards,
Gora


Re: DIH - importing XML with nested elements that have the same name

2012-09-26 Thread Billy Newman
To be a little more specific, the error message I get is:
"forEach cannot start with '//'"

Cannot really find anything on this except for
https://issues.apache.org/jira/browse/SOLR-1437, which only talks
about using '//' for the xpath attribute in the field tag, nothing
about using '//' in a forEach attribute.

Any suggestions?

Thanks,
Billy

On Wed, Sep 26, 2012 at 10:59 AM, Billy Newman wrote:
> Hello all,
>
> I am running Solr 4.0.0-BETA and I am running into an issue when
> trying to import an XML document in which I want forEach to pull from
> nested elements with the same element name.
>
> doc example:
> <test>
>   <item>
>     1
>     Item 1
>   </item>
>   <item>
>     2
>     Item 2
>     <item>
>       3
>       Item 3
>       <item>
>         4
>         Item 4
>       </item>
>     </item>
>   </item>
> </test>
>
> Where each item can contain 'n' number of items.
>
> forEach="/test/item" will get items 1 and 2 but not 3 or 4.  I
> cannot really use a "|" in this case, as I cannot define this for
> infinite levels.
>
> In XPath I think you would typically use something like "//item" to
> get all item elements.  But I think I am hitting limitations with
> Solr's XPath implementation, as from my reading it seems that only
> full paths are implemented, with no wildcards.
>
> Has anyone else found a good solution to this issue?
>
> Is it possible to use Java's implementation of XPath to do things like 
> "//item"?
>
> Thanks in advance!
> Billy