the following lines:
Configuration c = NutchConfiguration.create();
/* Some code removed here */
c.set("analysis.common.terms.file", "common-terms.utf8");
and including in the root of the nutch-1.0.jar the file common-terms.utf8
obtained from the $NUTCH_HOME/conf/ folder
Hope that helps (if you still
Any ideas !!
kazam wrote:
Hi there,
Nutch is giving an error to me saying that
org.apache.hadoop.conf.Configuration common-terms.utf8 not found
I have tried to specify paths in java using the configuration object.
ServletContext application = session.getServletContext
Hi there,
Nutch is giving an error to me saying that
org.apache.hadoop.conf.Configuration common-terms.utf8 not found
I have tried to specify paths in java using the configuration object.
ServletContext application = session.getServletContext();
Configuration nutchConf = NutchConfiguration.get
Hi there,
For some reason nutch can't seem to find my common-terms.utf8 file. I have
placed it under WEB-INF, WEB-INF/classes and even under WEB-INF/lib.
In my nutch-default.xml the path to the file is as follows
<property>
  <name>analysis.common.terms.file</name>
  <value>common-terms.utf8</value>
</property>
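A quick way to narrow down a "common-terms.utf8 not found" error is to check whether the file is visible as a classpath resource at all. The sketch below assumes the file is looked up by name through the class loader, which is consistent with the fix reported elsewhere in these threads (placing the file at the root of the Nutch jar). The class name ClasspathCheck and the method locate are made up for illustration and are not part of Nutch or Hadoop. In a servlet container, WEB-INF/classes is on the webapp classpath, whereas a loose file dropped into WEB-INF or WEB-INF/lib is not.

```java
import java.net.URL;

public class ClasspathCheck {

    /**
     * Returns the URL of the named resource if the class loader can see it,
     * or null if it is not on the classpath.
     */
    static URL locate(String resourceName) {
        // Prefer the context class loader, which is the webapp loader inside a servlet container.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // The default name is the file discussed in this thread; pass another name to check it instead.
        String name = args.length > 0 ? args[0] : "common-terms.utf8";
        URL url = locate(name);
        if (url == null) {
            System.out.println(name + " is NOT visible on the classpath");
        } else {
            System.out.println(name + " found at " + url);
        }
    }
}
```

Running the same lookup from inside the web application (for example from a JSP or servlet) shows what the webapp class loader sees, which can differ from the command-line classpath.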
Hi all,
I am trying to run Nutch 0.8 on a Linux server and am coming up with some
errors that did not appear when I ran Nutch on a Windows machine. I get
this error message: common-terms.utf8 not found, which is throwing a
java.lang.NullPointerException. The line that is giving the error message
To: nutch-user@lucene.apache.org
Subject: Understanding common-terms.utf8
Oops. I finally did my homework and found my way through the mail archives
to the answers to my FAQ questions.
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg05635.html this
entire thread answers them.
Lucene uses stopwords.
So, how can I inject a stopword list into Nutch? How is
common-terms.utf8 used? If it isn't a stopword file, what is it?
Ignacio J. Ortega
Dpto. soporte y desarrollo
http://www.derecho.com
http://www.elabogado.com
)
at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:83)
... 65 more
This is caused by the common-terms.utf8 file not being found at line
152 of org.apache.nutch.analysis.CommonGrams. However, this file is
located at the root level of the nutch.jar in the lib directory that
also contains
[EMAIL PROTECTED] wrote:
This is because Nutch turns those common terms into n-grams (not sure of what
size), and that increases the size of the index.
For example, if you have a phrase like:
vacation time
Normally, Nutch will index this phrase as 2 terms, a total of 12 characters
(probably
Hi,
Could anyone explain to me what exactly the common-terms.utf8 file does? I
don't understand the real purpose of this file...
During indexing (and also during searching), the common terms are used to form
n-grams to make search faster for common words (articles, for example)
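To make the n-gram idea concrete, here is a simplified sketch of what forming grams around common terms can look like. It is not Nutch's actual CommonGrams code: the class name CommonGramsSketch, the method withBigrams, the "-" separator, and the rule "emit a bigram whenever either neighbour is a common term" are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CommonGramsSketch {

    /**
     * Returns the original tokens plus an extra bigram for every adjacent pair
     * in which at least one token is listed as a common term.
     */
    static List<String> withBigrams(List<String> tokens, Set<String> commonTerms) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current); // the single term is always kept
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (commonTerms.contains(current) || commonTerms.contains(next)) {
                    // Hypothetical separator; the real one may differ.
                    out.add(current + "-" + next);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "quick", "brown", "fox");
        Set<String> common = Set.of("the");
        // prints [the, the-quick, quick, brown, fox]
        System.out.println(withBigrams(tokens, common));
    }
}
```

A phrase query containing "the quick" can then match the much rarer combined term instead of scanning every posting for "the". The price is extra terms in the index, which matches the index growth reported in this thread.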
Hi Timo!
Thanks a lot! Now I have a clear understanding of this file. This article
helps a lot too: http://searchenginewatch.com/showPage.html?page=2156061
Thanks again!
On 8/11/06, Timo Scheuer [EMAIL PROTECTED] wrote:
Hi,
Could anyone explain me what does exactly the common-terms.utf8
To: nutch-user@lucene.apache.org
Sent: Friday, August 11, 2006 8:19:41 AM
Subject: Re: [Nutch-general] common-terms.utf8
Hi Timo!
I analyzed the index before and after correctly using the
common-terms.utf8 file. Before adding the common terms in my language,
my index was about 3 MB.
After adding the common
Hi,
Could anyone explain to me what exactly the common-terms.utf8 file does? I
don't understand the real purpose of this file...
Regards,
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]