Hi all, I tried to get the outlinks during HTML parsing.
My first idea was:

    public Parse filter(Content content, Parse parse,
                        HTMLMetaTags metaTags, DocumentFragment doc) {
        ParseData parseData = parse.getData();
        // outlinks.length is always 0 here
        Outlink[] outlinks = parseData.getOutlinks();
    }

This always gives me an outlinks array of length 0 :confused:

After that I used the OutlinkExtractor, and this works fine:

    public Parse filter(Content content, Parse parse,
                        HTMLMetaTags metaTags, DocumentFragment doc) {
        OutlinkExtractor outlinkExtractor = new OutlinkExtractor();
        Outlink[] outlinks = outlinkExtractor.getOutlinks(text, title, this.conf);
    }

So why does my first attempt fail every time, and why is it not possible to get the outlinks via the ParseData object?

Thanks in advance for your help.

Regards,
MyD

--
View this message in context: http://www.nabble.com/Outlinks-during-parse-%28ParseData-getOutlinks-vs.-OutlinkExtractor-getOutlinks%29-tp22479136p22479136.html
Sent from the Nutch - User mailing list archive at Nabble.com.