[ http://issues.apache.org/jira/browse/NUTCH-422?page=all ]
Alan Tanaman updated NUTCH-422:
-------------------------------
Attachment: index-extra-v1.0-bin-java1.5.zip
index-extra-v1.0-source.zip
> index-extra plugin creates additional fields in the index, based on
> configurable logic
> --------------------------------------------------------------------------------------
>
> Key: NUTCH-422
> URL: http://issues.apache.org/jira/browse/NUTCH-422
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Affects Versions: 0.8.1
> Environment: All environments
> Reporter: Alan Tanaman
> Attachments: index-extra-v1.0-bin-java1.5.zip,
> index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
> A. Introduction
> The index-extra plugin allows you to configure additional fields that you
> wish to be added to the index, based on one of the following sources:
> - The parsed text
> - Metadata fields
> - Previously created document-to-be-indexed fields
> - A plain constant string
> - A Java expression that combines one or more of the above and
> resolves to a string
> A regex can also be applied to any of the above, allowing fields to be
> created based on patterns extracted from the source.
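
To illustrate the regex option described above, here is a minimal sketch in
plain Java. It is not taken from the plugin source: the class name
RegexFieldSketch, the method extractField, the sample text, and the pattern
are all hypothetical, and the plugin's real configuration format lives in
index-extra-conf.xml, which is not shown in this message. The sketch only
shows the general idea of deriving a field value from a pattern found in a
source string.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the index-extra plugin.
final class RegexFieldSketch {

    // Returns the first capture group of the first match in the source text,
    // or an empty Optional when the pattern does not match.
    static Optional<String> extractField(String source, String regex) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (m.find() && m.groupCount() >= 1 && m.group(1) != null) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample input standing in for a document's parsed text.
        String parsedText = "Invoice INV-2041 issued to ACME Corp";
        Optional<String> invoiceId = extractField(parsedText, "(INV-\\d+)");
        System.out.println(invoiceId.orElse("<no match>"));
    }
}
```

A value extracted this way would then become the content of an additional
index field; a source that does not match the pattern would simply yield no
field.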
> B. Installation
> 1) Binaries only:
>    - Copy the 'index-extra' folder from
>      index-extra-v1.0-bin-java1.5.zip to NUTCHDIR/build.
>    - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf and
>      configure it.
>    - Enable the plugin by updating the nutch-site.xml file.
> 2) Source code: Always refer to the Nutch wiki for detailed
>    instructions on building Nutch. In short:
>    - Copy the 'index-extra' folder from
>      index-extra-v1.0-source.zip to NUTCHDIR/src/plugin.
>    - Update the build.xml in NUTCHDIR/src/plugin to include the
>      plugin.
>    - Update the NUTCHDIR/default.properties file to include the
>      plugin.
>    - Run ant to build.
>    - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf and
>      configure it.
>    - Enable the plugin by updating the nutch-site.xml file.
> C. Known Issues
> 1) For this plugin to work correctly on any document field, the other
> index filters must run first, so that all basic document fields have
> already been generated. To do this, configure the indexingfilter.order
> property. (Please see patch NUTCH-421, which enables the
> indexingfilter.order property. If this patch is not applied, the
> plugin will still work, but will not be able to use document fields
> created by other index filter plugins.)
> 2) At this stage, field boost cannot be used, because Nutch scoring
> overrides the field boost with its own document-level boost
> calculation. This occurs at the end of the reduce method of
> org.apache.nutch.indexer.Indexer.
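
The ordering requirement in known issue 1 can be sketched with a small,
self-contained Java model. This is an assumption-laden illustration, not
Nutch code: the Filter interface, the BASIC and EXTRA filters, and the field
names "title" and "titleLower" are all hypothetical stand-ins, and the map
stands in for the document being indexed. It only shows why a filter that
derives fields from other fields has to run after the filters that create
them.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of ordered index filters; not the Nutch IndexingFilter API.
final class FilterOrderSketch {

    // Stand-in for an index filter: it may read existing fields and add new ones.
    interface Filter {
        void apply(Map<String, String> doc);
    }

    // Adds a basic "title" field, as a core index filter might.
    static final Filter BASIC = doc -> doc.put("title", "Nutch Plugin Guide");

    // Derives "titleLower" from "title"; adds nothing if "title" is absent.
    static final Filter EXTRA = doc -> {
        String title = doc.get("title");
        if (title != null) {
            doc.put("titleLower", title.toLowerCase(Locale.ROOT));
        }
    };

    // Applies the filters to an empty document in the given order.
    static Map<String, String> run(List<Filter> order) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (Filter f : order) {
            f.apply(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Basic filter first: the derived field can be created.
        System.out.println(run(Arrays.asList(BASIC, EXTRA)).keySet());
        // Derived filter first: "title" does not exist yet, so nothing is derived.
        System.out.println(run(Arrays.asList(EXTRA, BASIC)).keySet());
    }
}
```

In this model, running EXTRA before BASIC silently produces a document
without the derived field, which mirrors the behaviour the Readme describes
when the filter order cannot be controlled.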