Have you reindexed your segments? It's important, because that is what makes
Nutch recognize your common terms. I've tried it, and the only thing I noticed
was that the index is larger than the original (before using the common
terms).
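For what it's worth, here is a rough sketch of why the index could grow. It assumes (I have not checked this against the Nutch source) that the common-terms handling adds one extra combined token for a common word and the word that follows it, on top of the normal single-word tokens. The class name, the sample word list and the "-" separator are all made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only, not Nutch code. ASSUMPTION: a common word also gets
// indexed together with its right-hand neighbour as one combined token.
class CommonTermTokensSketch {

    // Hypothetical sample of "common terms" for the example.
    static final Set<String> COMMON = Set.of("a", "of", "the");

    // Returns the tokens that would be indexed under the assumption above.
    static List<String> tokens(String text) {
        String[] words = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (words[i].isEmpty()) {
                continue; // happens only for empty input
            }
            out.add(words[i]);
            // Extra combined token when a common word is followed by another word.
            if (COMMON.contains(words[i]) && i + 1 < words.length) {
                out.add(words[i] + "-" + words[i + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Three words in, four tokens out: [the, the-quick, quick, fox]
        System.out.println(tokens("the quick fox"));
    }
}
```

If something like this is what happens, every common word adds a second entry to the index, which would fit the size increase I saw.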
On 9/25/06, carmmello [EMAIL PROTECTED] wrote:
I'm using Nutch
This issue happens even when I start a new crawl, so I'm not reindexing
the segments. The indexing is done by Nutch itself, using the intranet
method.
Do you mean that after this is done I have to reindex the segments once
again? But if so, why are the English common terms recognized the first
time?
Thanks
I've added some code to query-basic to log the query after it
has run both addTerms and addPhrases. This helps me to better
understand what's going on. I've noticed that when my search contains
words like "the" or "a", those don't appear in the actual query.
It looks to me like the
There is a list of stop words in the NutchAnalysis class
(org.apache.nutch.analysis). I guess that's where the common terms are removed
during analysis.
--Rajesh Munavalli
Blog: http://mathsearch.blogspot.com
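To make the point above concrete, here is a minimal sketch of stop-word removal from a query string. It is not the NutchAnalysis code: the class name, the method and the stop list are hypothetical, and it only illustrates why words like "the" or "a" would be missing from the logged query if they are dropped during analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: this is NOT Nutch's NutchAnalysis class, and the stop
// list below is a made-up sample, not the list Nutch ships with.
class StopWordFilterSketch {

    // Hypothetical stop list for the example.
    static final Set<String> STOP_WORDS =
            Set.of("a", "an", "and", "of", "or", "the", "to");

    // Split the query on whitespace, lower-case it, and drop the stop words.
    static List<String> filter(String query) {
        List<String> kept = new ArrayList<>();
        for (String word : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints [history, search, engine]
        System.out.println(filter("the history of a search engine"));
    }
}
```

A query made up only of stop words comes back empty in this sketch, so whatever runs afterwards never sees those words.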
On 3/30/06, Vanderdray, Jacob [EMAIL PROTECTED] wrote:
I've added some code
, March 30, 2006 5:24 PM
To: nutch-user@lucene.apache.org
Subject: Re: Common Terms
system as it's prone to abuse.
Stephen Ensor wrote:
Hi, I am using Nutch to create a vertical search site and wish to create a
directory-type menu for my front page with all the most common terms in my
index.
For example, say my vertical search is pets and my index is full of pet sites
and pages; the common terms would be (cat, dog, fish, food, vet, etc.). Would
this be possible to generate using Nutch and some plugin?
Any help is much appreciated. Thanks,
Steve
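As a rough idea of what such a menu generator has to compute, here is a small sketch that counts words across a few page texts, skips stop words, and returns the most frequent ones. It deliberately uses no Nutch or Lucene calls: in a real setup the counts would presumably be read from the index rather than from raw strings, and every name here (the class, topTerms, the stop list, the sample pages) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only: plain Java over in-memory strings, no Nutch/Lucene API.
class TopTermsSketch {

    // Hypothetical stop list so that filler words do not end up in the menu.
    static final Set<String> STOP_WORDS = Set.of("a", "and", "for", "is", "my", "the");

    // Returns up to 'limit' terms, most frequent first.
    static List<String> topTerms(List<String> pages, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String page : pages) {
            // Lower-case, then split on anything that is not a letter a-z.
            for (String word : page.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        // Highest count first; ties broken alphabetically so the order is stable.
        terms.sort((x, y) -> {
            int byCount = Integer.compare(counts.get(y), counts.get(x));
            return byCount != 0 ? byCount : x.compareTo(y);
        });
        return terms.subList(0, Math.min(limit, terms.size()));
    }

    public static void main(String[] args) {
        List<String> pages = List.of(
                "Dog food and dog toys",
                "Cat food for my cat",
                "The vet is for my dog");
        // prints [dog, cat, food]
        System.out.println(topTerms(pages, 3));
    }
}
```

The stop list matters here: without it, the top of the menu would likely be filled with words like "the" and "for" instead of pet-related terms.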