[ 
https://issues.apache.org/jira/browse/SOLR-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandre Rafalovitch updated SOLR-7383:
----------------------------------------
    Attachment: atom_20170315.tgz

Attached is a replacement example that uses the StackOverflow Atom feed and 
demonstrates all of the features of the original RSS example, and more (as far 
as I can tell). Some features (e.g. commonField) now actually work.
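
For readers without the attachment, here is a minimal sketch of what an 
Atom-based DIH definition could look like. It is not the file from 
atom_20170315.tgz: the entity name, the feed URL and the column names are 
assumptions chosen for illustration. Only the /feed/entry paths follow from 
the Atom format itself.

```xml
<dataConfig>
    <dataSource type="URLDataSource" />
    <document>
        <!-- Hypothetical entity: the name and url are illustrative assumptions -->
        <entity name="stackoverflow"
                pk="id"
                url="https://stackoverflow.com/feeds/tag/solr"
                processor="XPathEntityProcessor"
                forEach="/feed/entry">

            <!-- One Solr document per Atom <entry>; column names are assumed -->
            <field column="id" xpath="/feed/entry/id" />
            <field column="title" xpath="/feed/entry/title" />
            <!-- Atom keeps the link target in the href attribute -->
            <field column="link" xpath="/feed/entry/link/@href" />
            <field column="summary" xpath="/feed/entry/summary" />
        </entity>
    </document>
</dataConfig>
```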

It has a different directory name, so it can be decompressed alongside the 
other DIH examples.

It is not cleaned up yet: I still need to double-check camelCase vs. dashes 
vs. underscores and spaces vs. tabs, maybe add another comment or two, and 
remove the checklist comment at the top of the DIH definition file.

But it should work and serve as a nice example. The solrconfig.xml file is 
super-minimal, similar to the work in SOLR-9601. It also uses the new 
updateProcessors syntax.
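
As an illustration of the updateProcessors syntax mentioned above, the 
fragment below declares an update processor as a top-level element and 
selects it by name through the processor parameter. This is a sketch, not 
the attached solrconfig.xml: the processor name trim_all, the config file 
name atom-data-config.xml and the choice of TrimFieldUpdateProcessorFactory 
are assumptions.

```xml
<config>
    <!-- Fragment only: luceneMatchVersion, lib and other handlers are omitted -->

    <!-- Top-level processor declaration; the name "trim_all" is hypothetical -->
    <updateProcessor class="solr.TrimFieldUpdateProcessorFactory" name="trim_all" />

    <requestHandler name="/dataimport" class="solr.DataImportHandler">
        <lst name="defaults">
            <!-- Assumed file name for the DIH definition -->
            <str name="config">atom-data-config.xml</str>
            <!-- Refer to the processor by name instead of defining a full chain -->
            <str name="processor">trim_all</str>
        </lst>
    </requestHandler>
</config>
```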

If this looks good, the RSS example will simply be deleted and this will 
become the new one.

I would appreciate reviews and comments, as this example is 15 (!) times 
smaller than the RSS one.

> DIH rss example is broken again
> -------------------------------
>
>                 Key: SOLR-7383
>                 URL: https://issues.apache.org/jira/browse/SOLR-7383
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 5.0, 6.0
>            Reporter: Upayavira
>            Assignee: Alexandre Rafalovitch
>            Priority: Minor
>         Attachments: atom_20170315.tgz, rss-data-config.xml
>
>
> The DIH example (solr/example/example-DIH/solr/rss/conf/rss-data-config.xml) 
> is broken again. See associated issues.
> Below is a config that should work.
> This is caused by Slashdot seemingly oscillating between RDF/RSS and pure 
> RSS. Perhaps we should depend upon something more static, rather than an 
> external service that is free to change as it desires.
> <dataConfig>
>     <dataSource type="URLDataSource" />
>     <document>
>         <entity name="slashdot"
>                 pk="link"
>                 url="http://rss.slashdot.org/Slashdot/slashdot"
>                 processor="XPathEntityProcessor"
>                 forEach="/RDF/item"
>                 transformer="DateFormatTransformer">
>                               
>             <field column="source" xpath="/RDF/channel/title" commonField="true" />
>             <field column="source-link" xpath="/RDF/channel/link" commonField="true" />
>             <field column="subject" xpath="/RDF/channel/subject" commonField="true" />
>                       
>             <field column="title" xpath="/RDF/item/title" />
>             <field column="link" xpath="/RDF/item/link" />
>             <field column="description" xpath="/RDF/item/description" />
>             <field column="creator" xpath="/RDF/item/creator" />
>             <field column="item-subject" xpath="/RDF/item/subject" />
>             <field column="date" xpath="/RDF/item/date" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
>             <field column="slash-department" xpath="/RDF/item/department" />
>             <field column="slash-section" xpath="/RDF/item/section" />
>             <field column="slash-comments" xpath="/RDF/item/comments" />
>         </entity>
>     </document>
> </dataConfig>


