[ https://issues.apache.org/jira/browse/SOLR-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul updated SOLR-7383:
-----------------------------
    Description:
The DIH example (solr/example/example-DIH/solr/rss/conf/rss-data-config.xml) is broken again. See associated issues.

Below is a config that should work.

This is caused by Slashdot seemingly oscillating between RDF/RSS and pure RSS. Perhaps we should depend upon something more static, rather than an external service that is free to change as it desires.

{code:xml}
<dataConfig>
    <dataSource type="URLDataSource" />
    <document>
        <entity name="slashdot"
                pk="link"
                url="http://rss.slashdot.org/Slashdot/slashdot"
                processor="XPathEntityProcessor"
                forEach="/RDF/item"
                transformer="DateFormatTransformer">

            <field column="source" xpath="/RDF/channel/title" commonField="true" />
            <field column="source-link" xpath="/RDF/channel/link" commonField="true" />
            <field column="subject" xpath="/RDF/channel/subject" commonField="true" />

            <field column="title" xpath="/RDF/item/title" />
            <field column="link" xpath="/RDF/item/link" />
            <field column="description" xpath="/RDF/item/description" />
            <field column="creator" xpath="/RDF/item/creator" />
            <field column="item-subject" xpath="/RDF/item/subject" />
            <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
            <field column="slash-department" xpath="/RDF/item/department" />
            <field column="slash-section" xpath="/RDF/item/section" />
            <field column="slash-comments" xpath="/RDF/item/comments" />
        </entity>
    </document>
</dataConfig>
{code}

  was:
The DIH example (solr/example/example-DIH/solr/rss/conf/rss-data-config.xml) is broken again. See associated issues.

Below is a config that should work.

This is caused by Slashdot seemingly oscillating between RDF/RSS and pure RSS. Perhaps we should depend upon something more static, rather than an external service that is free to change as it desires.
<dataConfig>
    <dataSource type="URLDataSource" />
    <document>
        <entity name="slashdot"
                pk="link"
                url="http://rss.slashdot.org/Slashdot/slashdot"
                processor="XPathEntityProcessor"
                forEach="/RDF/item"
                transformer="DateFormatTransformer">

            <field column="source" xpath="/RDF/channel/title" commonField="true" />
            <field column="source-link" xpath="/RDF/channel/link" commonField="true" />
            <field column="subject" xpath="/RDF/channel/subject" commonField="true" />

            <field column="title" xpath="/RDF/item/title" />
            <field column="link" xpath="/RDF/item/link" />
            <field column="description" xpath="/RDF/item/description" />
            <field column="creator" xpath="/RDF/item/creator" />
            <field column="item-subject" xpath="/RDF/item/subject" />
            <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
            <field column="slash-department" xpath="/RDF/item/department" />
            <field column="slash-section" xpath="/RDF/item/section" />
            <field column="slash-comments" xpath="/RDF/item/comments" />
        </entity>
    </document>
</dataConfig>


> DIH: rewrite XPathEntityProcessor/RSS example as the smallest good demo possible
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-7383
>                 URL: https://issues.apache.org/jira/browse/SOLR-7383
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 5.0, 6.0
>            Reporter: Upayavira
>            Assignee: Alexandre Rafalovitch
>            Priority: Minor
>         Attachments: atom_20170315.tgz, rss-data-config.xml
>
>
> The DIH example (solr/example/example-DIH/solr/rss/conf/rss-data-config.xml) is broken again. See associated issues.
> Below is a config that should work.
> This is caused by Slashdot seemingly oscillating between RDF/RSS and pure RSS. Perhaps we should depend upon something more static, rather than an external service that is free to change as it desires.
> {code:xml}
> <dataConfig>
>     <dataSource type="URLDataSource" />
>     <document>
>         <entity name="slashdot"
>                 pk="link"
>                 url="http://rss.slashdot.org/Slashdot/slashdot"
>                 processor="XPathEntityProcessor"
>                 forEach="/RDF/item"
>                 transformer="DateFormatTransformer">
>
>             <field column="source" xpath="/RDF/channel/title" commonField="true" />
>             <field column="source-link" xpath="/RDF/channel/link" commonField="true" />
>             <field column="subject" xpath="/RDF/channel/subject" commonField="true" />
>
>             <field column="title" xpath="/RDF/item/title" />
>             <field column="link" xpath="/RDF/item/link" />
>             <field column="description" xpath="/RDF/item/description" />
>             <field column="creator" xpath="/RDF/item/creator" />
>             <field column="item-subject" xpath="/RDF/item/subject" />
>             <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
>             <field column="slash-department" xpath="/RDF/item/department" />
>             <field column="slash-section" xpath="/RDF/item/section" />
>             <field column="slash-comments" xpath="/RDF/item/comments" />
>         </entity>
>     </document>
> </dataConfig>
> {code}
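
The description suggests depending on "something more static" than a live feed. As a rough sketch of one way that could look (not a tested or proposed fix), the same entity can read a saved copy of the feed from local disk through DIH's FileDataSource. The snapshot path below is hypothetical, and the XPaths assume the saved file is still in the RDF layout shown above; only a subset of the fields is repeated.

{code:xml}
<dataConfig>
    <!-- Assumption: FileDataSource replaces URLDataSource so the import reads a local file -->
    <dataSource type="FileDataSource" encoding="UTF-8" />
    <document>
        <!-- Hypothetical path: a feed snapshot saved ahead of time, not shipped with Solr -->
        <entity name="slashdot"
                pk="link"
                url="/path/to/slashdot-snapshot.rdf"
                processor="XPathEntityProcessor"
                forEach="/RDF/item"
                transformer="DateFormatTransformer">

            <field column="source" xpath="/RDF/channel/title" commonField="true" />

            <field column="title" xpath="/RDF/item/title" />
            <field column="link" xpath="/RDF/item/link" />
            <field column="description" xpath="/RDF/item/description" />
            <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
        </entity>
    </document>
</dataConfig>
{code}

Because the file never changes, the forEach and xpath expressions cannot drift out of sync with the feed format, at the cost of the example no longer showing a live remote fetch.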