Re: Error when using URLDataSource to index RSS items

2014-06-07 Thread Alexandre Rafalovitch
It sounds like when you run this from code, you may be getting an
error page instead of the RSS feed, and that error page is malformed
HTML. The head and meta tags named in the exception are HTML elements,
not RSS ones.
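
To illustrate why an HTML error page would trip an XML parser, here is a minimal sketch in Python. It uses Python's standard-library parser rather than Woodstox, so the exact exception text differs, and both payloads are made-up examples, not the real responses from the feed URL.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(payload: str) -> bool:
    """Return True if the payload parses as well-formed XML."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False


# Hypothetical payloads for illustration only.
rss_feed = (
    '<rss version="2.0"><channel><item>'
    "<title>t</title><link>http://example.com/1</link>"
    "</item></channel></rss>"
)

# Typical HTML: <meta> is never closed, which is legal HTML but not legal XML,
# so the parser hits </head> while it is still waiting for </meta>.
html_error_page = (
    '<html><head><meta charset="utf-8"></head>'
    "<body>Access denied</body></html>"
)

print(is_well_formed_xml(rss_feed))
print(is_well_formed_xml(html_error_page))
```

If the second check fails the same way for whatever your code actually downloads, the feed itself is probably fine and the problem is in what the server or an intermediary returns.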

Do you have a proxy where you run the code? If so, your browser may be
using the proxy while your DIH code is not. You could try running
something like Wireshark, Fiddler, or a similar tool to inspect the
request/response you are actually getting.
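
If a packet sniffer is not an option, a rough alternative is to fetch the URL from a small script with proxies explicitly disabled and look at what comes back. The sketch below is in Python and is only an approximation of what DIH does; the helper names `describe_payload` and `fetch_without_proxy` are hypothetical, and the 512-byte sniffing window is an arbitrary choice.

```python
import urllib.error
import urllib.request


def describe_payload(body: bytes) -> str:
    """Guess whether a response body is a feed or an HTML page."""
    head = body.lstrip()[:512].lower()
    if b"<rss" in head or b"<feed" in head:
        return "feed"
    if b"<html" in head or b"<!doctype html" in head:
        return "html"
    return "unknown"


def fetch_without_proxy(url: str, timeout: float = 10.0):
    """Fetch url while ignoring any system or environment proxy settings.

    Returns (status_code, content_type, body). HTTP error responses are
    returned rather than raised, so the error page can be inspected.
    """
    # An empty ProxyHandler mapping disables proxy auto-detection.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", ""), resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Content-Type", ""), err.read()
```

Calling `fetch_without_proxy("http://www.alarabiya.net/.mrss/ar.xml")` on the machine that runs Solr and passing the returned body to `describe_payload` should show whether the server answers with the feed or with an HTML page when no proxy is involved.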

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Jun 7, 2014 at 10:52 AM, ienjreny ismaeel.enjr...@gmail.com wrote:
 Hello,

 I am using the following script to index RSS items

 <dataSource type="URLDataSource" encoding="UTF-8" />
 <document>
   <entity name="slashdot"
           pk="link"
           url="http://www.alarabiya.net/.mrss/ar.xml"
           processor="XPathEntityProcessor"
           forEach="/rss/channel/item">

     <field column="category_name" name="category_name"
            xpath="/rss/channel/item/title" />
     <field column="link" name="url" xpath="/rss/channel/item/link" />

   </entity>
 </document>

 But I am facing the following error

 Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag
 </head>; expected </meta>.

 Can anybody help?



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Error-when-using-URLDataSource-to-index-RSS-items-tp4140548.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Error when using URLDataSource to index RSS items

2014-06-06 Thread ienjreny
Hello,

I am using the following script to index RSS items

<dataSource type="URLDataSource" encoding="UTF-8" />
<document>
  <entity name="slashdot"
          pk="link"
          url="http://www.alarabiya.net/.mrss/ar.xml"
          processor="XPathEntityProcessor"
          forEach="/rss/channel/item">

    <field column="category_name" name="category_name"
           xpath="/rss/channel/item/title" />
    <field column="link" name="url" xpath="/rss/channel/item/link" />

  </entity>
</document>

But I am facing the following error

Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag
</head>; expected </meta>.

Can anybody help?