Re: [DIH] Multiple repeat XPath stmts
TNX. A lifesaver...
Re: [DIH] Multiple repeat XPath stmts
As I said, copying is not an option. That will break everything else.

On Sep 14, 2009, at 1:07 AM, Noble Paul നോബിള് नोब्ळ् wrote:
> The XPathRecordReader has a limit of one mapping per XPath, so copying
> is the best solution.

--
Grant Ingersoll
http://www.lucidimagination.com/
Re: [DIH] Multiple repeat XPath stmts
If you wish to use a conditional copy, you can use a RegexTransformer:

  <field column="guid" xpath="/rss/channel/guid"/>
  <field column="id" regex=".*" sourceColName="guid" replaceWith="${entityname.guid}"/>

This means that if guid != null, 'id' will be set to guid.

On Mon, Sep 14, 2009 at 4:16 PM, Grant Ingersoll gsing...@apache.org wrote:
> As I said, copying is not an option. That will break everything else.

--
Noble Paul | Principal Engineer | AOL | http://aol.com
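To make the conditional copy described in this message concrete, here is a minimal Python model of it. It is only a sketch of the stated semantics (set 'id' from guid only when guid is present), not DIH code: the names regex_copy and set_id, the row dictionaries, and the sample values are all made up for illustration. The pattern is anchored as ^.*$ rather than the bare .* from the message, because in some regex engines an unanchored .* also matches the empty string at the end of the value, so the replacement is applied twice. Whether that matters for RegexTransformer is not something this sketch verifies.

```python
import re

def regex_copy(row, column, source_col, regex, replace_with):
    """Model of a regex field: write `column` only if `source_col` has a value."""
    out = dict(row)
    value = out.get(source_col)
    if value is None:
        return out  # no source value: `column` keeps whatever it already had
    # A function replacement avoids backslash/group processing of replace_with.
    out[column] = re.sub(regex, lambda m: replace_with, value)
    return out

def set_id(row):
    # Hypothetical stand-in for replaceWith="${entityname.guid}": the
    # replacement text is simply the row's own guid value.
    return regex_copy(row, "id", "guid", r"^.*$", row.get("guid"))

with_guid = set_id({"id": "http://example.com/a", "guid": "abc-123"})
without_guid = set_id({"id": "http://example.com/b"})
print(with_guid["id"])     # abc-123
print(without_guid["id"])  # http://example.com/b
```

In this model the row that has no guid keeps the id it was given earlier (here, the link), which is the "only copy if defined" behaviour asked for in the thread.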
Re: [DIH] Multiple repeat XPath stmts
> I'm trying to import several RSS feeds using DIH and running into a bit
> of a problem. Some feeds define a GUID value that I map to my Solr ID,
> while others don't. I also have a link field which I fill in with the
> RSS link field. For the feeds that don't have the GUID value set, I want
> to use the link field as the id. However, if I define the same XPath
> twice, but map it to two different columns, I don't get the id value set.
>
> For instance, I want to do:
>
> schema.xml:
>   <field name="id" type="string" indexed="true" stored="true" required="true"/>
>   <field name="link" type="string" indexed="true" stored="false"/>
>
> DIH config:
>   <field column="id" xpath="/rss/channel/item/link" />
>   <field column="link" xpath="/rss/channel/item/link" />
>
> Because I am consolidating multiple fields, I'm not able to do
> copyFields, unless of course, I wanted to implement conditional copy
> fields (only copy if the field is not defined), which I would rather not.
>
> How do I solve this?

How about:

  <entity name="x" ... transformer="TemplateTransformer">
    <field column="link" xpath="/rss/channel/item/link" />
    <field column="GUID" xpath="/rss/channel/GUID" />
    <field column="id" template="${x.link}" />
    <field column="id" template="${x.GUID}" />
  </entity>

The TemplateTransformer does nothing if its source expression is null. So the first template assigns the fallback value (the link) to id, and this is overwritten by the GUID if it is defined.

You can now sort of do if-then-else using a combination of template and regex transformers. Add a bit of maths to the transformers and I think we will have a Turing-complete language :-)

fergus.

> Thanks,
> Grant

--
===
Fergus McMenemie        Email:fer...@twig.me.uk
Techmore Ltd            Phone:(UK) 07721 376021
Unix/Mac/Intranets      Analyst Programmer
===
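The fallback-then-overwrite ordering suggested above can be modelled in a few lines of Python. This is a sketch of the behaviour as described in this message (a template is skipped when its source is null, and later templates overwrite earlier ones), not the TemplateTransformer implementation. The function apply_templates, the VAR pattern, and the sample rows are hypothetical names and data chosen for illustration.

```python
import re

# Matches ${entity.column} style variables inside a template string.
VAR = re.compile(r"\$\{([^}]+)\}")

def apply_templates(entity, row, templates):
    """Apply (column, template) pairs in order, skipping unresolved ones."""
    out = dict(row)
    for column, template in templates:
        values = {}
        for name in VAR.findall(template):
            prefix, _, key = name.partition(".")
            values[name] = out.get(key) if prefix == entity else None
        if any(v is None for v in values.values()):
            continue  # source is null: leave the column untouched
        out[column] = VAR.sub(lambda m: values[m.group(1)], template)
    return out

# Same order as in the suggestion: fallback first, preferred value second.
TEMPLATES = [("id", "${x.link}"), ("id", "${x.GUID}")]

with_guid = apply_templates(
    "x", {"link": "http://example.com/a", "GUID": "abc-123"}, TEMPLATES)
without_guid = apply_templates(
    "x", {"link": "http://example.com/b"}, TEMPLATES)
print(with_guid["id"])     # abc-123
print(without_guid["id"])  # http://example.com/b
```

The order of the two id entries matters in this model: swapping them would make the link win whenever it is present, which is the opposite of what the question asks for.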
Re: [DIH] Multiple repeat XPath stmts
The XPathRecordReader has a limit of one mapping per XPath, so copying is the best solution.

On Mon, Sep 14, 2009 at 2:54 AM, Fergus McMenemie fer...@twig.me.uk wrote:
>> However, if I define the same XPath twice, but map it to two different
>> columns, I don't get the id value set.

--
Noble Paul | Principal Engineer | AOL | http://aol.com