Hi Sebastian,

Your suggestion of adding the plugin solved the problem. Thank you for your
help.

Regards, Anton


________________________________
From: Sebastian Nagel <[email protected]>
Sent: Monday, November 11, 2019 3:08 PM
To: [email protected] <[email protected]>
Subject: Re: Metadata not indexed after migrating to Nutch 2.4

Hi Anton,

after a short look into MetadataIndexer:
- it does not request any fields from the webpage,
  see getFields() method
- this is a bug (but it was already present in 2.3.1)
- it can be worked around by activating another
  plugin which requests the METADATA field/column,
  e.g. language-identifier/LanguageIndexingFilter

That's one possible explanation.
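
To illustrate the mechanism described above, here is a minimal sketch in
plain Java. It is a simplified model, not the actual Nutch API: the class
FieldRequestSketch, the Field enum, the Filter interface and the three
filter constants are all hypothetical stand-ins. The assumption, as the
explanation above implies, is that the indexing job loads only those
fields/columns which at least one active filter requests via getFields().

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class FieldRequestSketch {

    // Hypothetical stand-in for the fields/columns of a stored web page.
    public enum Field { BASE_URL, TEXT, TITLE, METADATA }

    // Hypothetical stand-in for an indexing filter; only the field
    // request matters for this sketch.
    public interface Filter {
        Collection<Field> getFields();
    }

    // Models a filter that requests some basic fields but not METADATA.
    public static final Filter BASIC_INDEXER =
            () -> EnumSet.of(Field.BASE_URL, Field.TEXT, Field.TITLE);

    // Models the reported bug: the filter reads metadata but requests nothing.
    public static final Filter METADATA_INDEXER =
            () -> EnumSet.noneOf(Field.class);

    // Models the workaround: another active filter that requests METADATA.
    public static final Filter LANGUAGE_FILTER =
            () -> EnumSet.of(Field.METADATA);

    // Assumption: the job loads the union of all requested fields and
    // nothing else.
    public static Set<Field> fieldsToLoad(List<Filter> activeFilters) {
        Set<Field> requested = EnumSet.noneOf(Field.class);
        for (Filter filter : activeFilters) {
            requested.addAll(filter.getFields());
        }
        return requested;
    }

    public static void main(String[] args) {
        Set<Field> withoutWorkaround =
                fieldsToLoad(List.of(BASIC_INDEXER, METADATA_INDEXER));
        Set<Field> withWorkaround =
                fieldsToLoad(List.of(BASIC_INDEXER, METADATA_INDEXER, LANGUAGE_FILTER));

        System.out.println("METADATA loaded without workaround: "
                + withoutWorkaround.contains(Field.METADATA));
        System.out.println("METADATA loaded with workaround: "
                + withWorkaround.contains(Field.METADATA));
    }
}
```

In this model the metadata filter only sees the METADATA field once some
other active filter has requested it, which would match the symptom of a
page object that arrives without metadata and no error being reported.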

Please note that it is unlikely that there will be further
releases on the 2.x series of Nutch, see the release announcement
for more details.

Best,
Sebastian


On 11/11/19 12:44 PM, Anton Skarp wrote:
> Hi,
>
> After migrating from Nutch 2.3.1 to 2.4 I have not been able to configure Nutch to
> index metadata to Elasticsearch. Indexchecker gets the metadata correctly
> though.
> I have tried both HBase version 0.9.8-hadoop2 and MongoDB. Both
> contained the wanted metadata.
>
> I have done some debugging and the problem seems to be that the page parameter
> of the MetadataIndexer filter method does not even contain the metadata.
>
> There are no exceptions or errors output by Nutch or Elasticsearch.
>
> Any ideas on what the problem is and how I should approach fixing it?
>
>
> Regards. Anton
>
