Hi Yann,

Not really answering your question, but where did you get this config from?
Some of its elements have long been deprecated (query-*, response-*,
summary-*).

Julien


On 15 June 2014 10:20, Yann Levreau <yann.levr...@gmail.com> wrote:

> Hi everyone!
>
> I'm sorry to disturb you, but I need some help getting the
> outlinks of http://elpais.com.
> I use Nutch 2.2.1.
>
> The web page is parsed correctly; in debug I can see all the outlinks in the
> Parse object.
> I use these basic plugins:
>
> protocol-http|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)
>
> But the outlinks are never injected into HBase (with either http://elpais.com
> or http://www.elpais.com).
> If I try to parse www.nytimes.com, the outlinks are injected as expected and
> added to the fetch list.
>
> Any ideas?
> Thanks
> Yann
>
> ==> I have the same issue with http://www.lemonde.fr
>
>
>


-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
