I modified the indexing code so that it stores the HTML after fetching. But when I call getContent() to retrieve the HTML, it returns HTML entities such as &lt; instead of "<". Is there a reason the HTML is stored as HTML entities? Can I store it as regular HTML instead? Thanks.

--
View this message in context: http://www.nabble.com/How-come-getContent-returns-HTML-Entities--tp23008229p23008229.html
Sent from the Nutch - User mailing list archive at Nabble.com.
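
To illustrate the difference described in the question, here is a minimal sketch of turning entity-escaped markup back into regular HTML. It assumes the stored text was escaped exactly once and contains only the five basic entities. HtmlEntityDecoder and decode are hypothetical names used for illustration and are not part of Nutch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Nutch.
public class HtmlEntityDecoder {
    // Insertion order matters, so a LinkedHashMap is used.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&lt;", "<");
        ENTITIES.put("&gt;", ">");
        ENTITIES.put("&quot;", "\"");
        ENTITIES.put("&#39;", "'");
        // "&amp;" is replaced last so that "&amp;lt;" becomes "&lt;" and not "<".
        ENTITIES.put("&amp;", "&");
    }

    // Replaces each of the basic entities with its literal character.
    public static String decode(String escaped) {
        String result = escaped;
        for (Map.Entry<String, String> entry : ENTITIES.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints: <p>Hello & welcome</p>
        System.out.println(decode("&lt;p&gt;Hello &amp; welcome&lt;/p&gt;"));
    }
}
```

This only covers the basic entities. Numeric and named entities beyond these would need a fuller decoder, and it is worth checking first whether the escaping happens at storage time or only when the content is displayed.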
