Hi guys,
I want more fields from the HTML header to be indexed or stored.
Take the example below: 'breadcrumb' should be stored without being indexed,
while 'keywords' should be indexed.
.
..
I also read the index-more plugin, but it seems I cannot get these fields
at all. That information is stored in "content".
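
A minimal sketch of the per-field policy asked about in this message, in plain Java with no Nutch or Lucene dependency. All names here (MetaFieldSketch, Policy, extractMeta) are hypothetical and are not part of the index-more plugin or any Nutch API. The regex assumes well-formed, double-quoted <meta name="..." content="..."> tags with name before content, and treating 'keywords' as indexed but not stored is an assumption, since the message only says it should be indexed. In a real plugin, the stored/indexed flags would be mapped onto whatever field types the indexing library provides.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaFieldSketch {

    // Hypothetical policy: should the raw value be stored, and should it be searchable?
    enum Policy {
        STORE_ONLY(true, false),   // e.g. 'breadcrumb': retrievable, not searchable
        INDEX_ONLY(false, true);   // e.g. 'keywords': searchable (storage is an assumption)

        final boolean stored;
        final boolean indexed;

        Policy(boolean stored, boolean indexed) {
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    // Which header fields to pick up, and how to treat each one.
    static final Map<String, Policy> POLICIES = new LinkedHashMap<>();
    static {
        POLICIES.put("breadcrumb", Policy.STORE_ONLY);
        POLICIES.put("keywords", Policy.INDEX_ONLY);
    }

    // Simplistic pattern: name="..." followed by content="...", double quotes only.
    private static final Pattern META = Pattern.compile(
            "<meta\\s+name=\"([^\"]+)\"\\s+content=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns only the meta fields that have a configured policy, in document order.
    static Map<String, String> extractMeta(String html) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = META.matcher(html);
        while (m.find()) {
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (POLICIES.containsKey(name)) {
                out.put(name, m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<meta name=\"breadcrumb\" content=\"Home > Docs\">"
                + "<meta name=\"keywords\" content=\"nutch, lucene\">"
                + "</head><body>text</body></html>";
        for (Map.Entry<String, String> e : extractMeta(html).entrySet()) {
            Policy p = POLICIES.get(e.getKey());
            System.out.println(e.getKey() + " stored=" + p.stored
                    + " indexed=" + p.indexed + " value=" + e.getValue());
        }
    }
}
```

The point of the sketch is only that the header fields are pulled out as separate named values instead of being left inside "content", so each one can carry its own stored/indexed decision.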
Andrzej Bialecki (JIRA) wrote:
* improve status reporting throughout all plugins.
Please note that this is an incompatible change between the
ProtocolStatus patchset implemented in NUTCH-54 and the one here, so if you
created some segments in between, you will need to refetch them.
I didn't feel t
[ http://issues.apache.org/jira/browse/NUTCH-61?page=all ]
Andrzej Bialecki updated NUTCH-61:
---
Attachment: 20050606.diff
The first round:
* change Page to use a 1-byte float, representing fetchInterval in seconds.
* implement a pluggable FetchSchedu
Adaptive re-fetch interval. Detecting unmodified content
---
Key: NUTCH-61
URL: http://issues.apache.org/jira/browse/NUTCH-61
Project: Nutch
Type: New Feature
Components: fetcher
Reporter: Andrzej Bialecki