[ https://issues.apache.org/jira/browse/NUTCH-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144357#comment-16144357 ]
Jorge Luis Betancourt Gonzalez commented on NUTCH-2414:
-------------------------------------------------------

[~yossi] I think that [~markus.jel...@openindex.io] is suggesting implementing a generic {{IndexingFilter}} that supports JEXL expressions. That way we don't need to modify every possible {{IndexingFilter}}, which will be easier to maintain in the long run and provides a better separation.

> Allow LanguageIndexingFilter to actually filter documents by language.
> ----------------------------------------------------------------------
>
>                 Key: NUTCH-2414
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2414
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>    Affects Versions: 1.13
>            Reporter: Yossi Tamari
>            Priority: Minor
>
> It is often useful to only index pages in select languages (e.g. only those
> languages that we intend to search in). At first glance it seems that this is
> done by LanguageIndexingFilter, but currently all the filter does is add the
> language as a field to the index.
> We can add a configuration property to LanguageIndexingFilter that will allow
> it to only index languages specified in this property.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)