Hi all,

I tried to get the outlinks during HTML parsing.

My first idea was:

public Parse filter(Content content, Parse parse, HTMLMetaTags metaTags,
                    DocumentFragment doc) {
    // Read the outlinks already stored on the parse result.
    ParseData parseData = parse.getData();
    Outlink[] outlinks = parseData.getOutlinks();
    return parse;
}

This always gives me an outlinks array of length 0.

After that I used the OutlinkExtractor, and this works fine:

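public Parse filter(Content content, Parse parse, HTMLMetaTags metaTags,
                    DocumentFragment doc) {
    // Assumption: text and title are taken from the current parse.
    String text = parse.getText();
    String title = parse.getData().getTitle();
    OutlinkExtractor outlinkExtractor = new OutlinkExtractor();
    Outlink[] outlinks = outlinkExtractor.getOutlinks(text, title, this.conf);
    return parse;
}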
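
As far as I understand it, OutlinkExtractor finds its links by scanning plain
text for URL-like strings with a regular expression, rather than reading them
from the parsed document. The stand-alone sketch below only illustrates that
idea. The class name, method name and pattern are my own assumptions, not the
Nutch code, and the pattern is deliberately much simpler than a real one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; this is NOT the Nutch OutlinkExtractor code.
public class PlainTextLinkSketch {

    // Assumed, deliberately simple pattern: an http(s) URL runs until the
    // next whitespace, quote or angle bracket.
    private static final Pattern URL_PATTERN =
            Pattern.compile("https?://[^\\s\"'<>]+");

    // Returns every URL-like substring of plainText, in order of appearance.
    public static List<String> extractUrls(String plainText) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(plainText);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "Docs at http://example.org/a and https://example.org/b here";
        // Prints the two URLs found in the text above.
        System.out.println(extractUrls(text));
    }
}
```

If that is right, it would explain why the extractor returns links whenever
the text contains URLs, independently of what is stored in ParseData.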

So why does my first attempt fail every time, and why is it not possible
to get the outlinks from the ParseData object? Thanks in advance for your
help.

Regards,
MyD

-- 
View this message in context: 
http://www.nabble.com/Outlinks-during-parse-%28ParseData-getOutlinks-vs.-OutlinkExtractor-getOutlinks%29-tp22479136p22479136.html
Sent from the Nutch - User mailing list archive at Nabble.com.
