Hey Jamshaid,

We cannot see any screenshot attached to your message. Could you upload it somewhere and share the URL?
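
Separately from the debugger problem, a side note on the line you are breaking on. ByteBuffer.array() returns the whole backing array and ignores the buffer's position and limit, and new String(byte[]) uses the platform default charset. Below is a minimal sketch of a more defensive conversion using only java.nio, so it runs without Nutch on the classpath. The class name RawContent and the helper toText are hypothetical names of mine, not part of the Nutch API, and UTF-8 is an assumption: the page's real encoding may differ. I have not checked whether the buffers Nutch hands to a ParseFilter are ever padded, so treat this as a precaution rather than a confirmed fix.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Nutch.
class RawContent {

    // Decodes only the bytes between the buffer's position and limit.
    // Assumption: the content is UTF-8 encoded.
    static String toText(ByteBuffer content) {
        if (content == null) {
            return "";
        }
        // duplicate() shares the bytes but has its own position and limit,
        // so reading from it leaves the caller's buffer untouched.
        ByteBuffer view = content.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate a buffer whose backing array is larger than its content.
        ByteBuffer padded = ByteBuffer.allocate(16);
        padded.put("<p>hi</p>".getBytes(StandardCharsets.UTF_8));
        padded.flip();

        // array() exposes all 16 bytes of the backing array, padding included.
        System.out.println("backing array length: " + padded.array().length);
        // toText() decodes only the 9 bytes that were actually written.
        System.out.println("decoded length: " + toText(padded).length());
        System.out.println(toText(padded));
    }
}
```

In the plugin you would call it as RawContent.toText(page.getContent()) inside filter(), and could then hand the resulting string to JSoup.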

On Thu, Jun 13, 2013 at 11:25 PM, Jamshaid Ashraf <jamshaid...@gmail.com> wrote:

> Hi,
>
> Thanks for the prompt reply!
>
> I have set a breakpoint on the following line of my plugin code in Eclipse,
> but I get a "source not found" screen when debugging it. Please see the
> attached screenshot.
>
> String content = new String(page.getContent().array());
>
> What might cause this to happen and how can I fix it?
>
> Regards,
> Jamshaid
>
>
> On Thu, Jun 13, 2013 at 8:34 PM, feng lu <amuseme...@gmail.com> wrote:
>
>> Hi
>>
>> I checked the ParseFilter interface in Nutch 2.x; it looks like this:
>>
>> Parse filter(String url, WebPage page, Parse parse, HTMLMetaTags metaTags,
>> DocumentFragment doc);
>>
>> Inside this method you can get the raw content of the HTML page with
>>
>> String content = new String(page.getContent().array());
>>
>> and get the parsed text through the parse.getText() method.
>>
>>
>> On Thu, Jun 13, 2013 at 11:10 PM, Jamshaid Ashraf <jamshaid...@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I'm using a Nutch 2.2 ParseFilter plugin and I need to extract custom
>> > information from the parsed raw HTML (preferably using JSoup), but I
>> > still couldn't find out how to get the raw HTML in the @Override
>> > filter() method, as all the examples I have found use the Nutch 1.x API
>> > and don't work with the new Nutch 2.x API.
>> >
>> > Thanks in advance!
>> >
>> > Regards,
>> > Jamshaid
>> >
>>
>> --
>> Don't Grow Old, Grow Up... :-)
>