Hi Yann,

This doesn't really answer your question, but where did you get this config from? Some of its elements (query-*, response-*, summary-*) have long been deprecated.
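For illustration only, here is a minimal Python sketch that strips those three deprecated families from the plugin list quoted below. It is not part of Nutch, and the helper names (`split_top_level`, `drop_deprecated`) are hypothetical. It assumes the list is the value of the `plugin.includes` property and that alternatives are separated by `|` at the top level, with parentheses used only for grouping. It only tidies the config and is not claimed to resolve the missing outlinks.

```python
# Hypothetical helper, not a Nutch tool: removes deprecated plugin
# families from a plugin.includes-style value.
DEPRECATED_PREFIXES = ("query-", "response-", "summary-")


def split_top_level(pattern):
    """Split on '|' characters that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in pattern:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "|" and depth == 0:
            # Top-level separator: close the current alternative.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def drop_deprecated(pattern):
    """Keep only alternatives that do not start with a deprecated prefix."""
    kept = [p for p in split_top_level(pattern)
            if not p.startswith(DEPRECATED_PREFIXES)]
    return "|".join(kept)


# The plugin list exactly as quoted in the original message.
original = (
    "protocol-http|parse-html|index-(basic|anchor)|query-(basic|site|url)"
    "|response-(json|xml)|summary-basic|scoring-opic"
    "|urlnormalizer-(pass|regex|basic)"
)
cleaned = drop_deprecated(original)
print(cleaned)
```

The printed value could then be pasted into the `plugin.includes` property in nutch-site.xml, after checking it against the defaults shipped with your Nutch version.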
Julien

On 15 June 2014 10:20, Yann Levreau <yann.levr...@gmail.com> wrote:

> Hi everyone!
>
> I'm sorry to disturb you, but I need some assistance getting the
> outlinks of http://elpais.com.
> I use Nutch 2.2.1.
>
> The web page is parsed correctly: in debug I have all the outlinks in the
> Parse object.
> I use these basic plugins:
>
> protocol-http|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)
>
> But the outlinks are never injected into HBase (with http://elpais.com or
> http://www.elpais.com).
> If I try to parse www.nytimes.com, the outlinks are injected normally and
> added to the fetch list.
>
> Any ideas?
> Thanks
> Yann
>
> ==> I have the same issue with http://www.lemonde.fr

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble