[EMAIL PROTECTED] wrote:
Hi Sami,
- Original Message [EMAIL PROTECTED] wrote:
I assume the idea is that the JVM knows about hadoop.log.dir system
property, and then log4j knows about it, too. However, it doesn't
_always_ work.
That is, when invoking various bin/nutch commands as de
Hi Sami,
- Original Message
[EMAIL PROTECTED] wrote:
>
> I assume the idea is that the JVM knows about hadoop.log.dir system
> property, and then log4j knows about it, too. However, it doesn't
> _always_ work.
>
> That is, when invoking various bin/nutch commands as described in
> http:
Hello,
Several people reported issues with slow fetcher in 0.8...
I run Nutch on a dual CPU (+HT) box, and have noticed that the fetch speed
didn't increase when I went from 100 threads to 200 threads. Has anyone
else observed the same?
I was using 2 map tasks (mapred.map.tasks propert
[EMAIL PROTECTED] wrote:
Hi,
In bin/nutch I saw this:
if [ "$NUTCH_LOGFILE" = "" ]; then
NUTCH_LOGFILE='hadoop.log'
fi
Wouldn't it make more sense to name the file nutch.log? Everything
there is Nutch-specific - Injector, Generator and I see some
mapred things. But as a Nutch user I'd e
[EMAIL PROTECTED] wrote:
I assume the idea is that the JVM knows about hadoop.log.dir system
property, and then log4j knows about it, too. However, it doesn't
_always_ work.
That is, when invoking various bin/nutch commands as described in
http://lucene.apache.org/nutch/tutorial8.html , this fa
Hi,
While building Nutch, I noticed several places where various Jars from plugins'
lib directories could not be found, for example:
$ ant package
...
deploy:
[copy] Warning: Could not find file
/home/otis/dev/repos/lucene/nutch/trunk/build/lib-log4j/lib-log4j.jar to copy.
init:
init-plugi
I don't understand... are you saying Google is offering to search using the query
"Search" when you originally entered the query "Explore"?
Otis
- Original Message
From: Florian Fricker <[EMAIL PROTECTED]>
To: nutch-user@lucene.apache.org
Sent: Friday, August 11, 2006 4:46:25 AM
Subject: [Nut
This is because Nutch turns those common terms into ngrams (not sure of what
size), and that increases the size of the index.
For example, if you have a phrase like:
vacation time
Normally, Nutch will index this phrase as 2 terms, a total of 12 characters
(probably less, if these words are st
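To make the index-growth effect concrete, here is a rough Python sketch. It is not Nutch's actual implementation: the function name `index_terms`, the sample phrase, the choice of common terms, and the use of two-word grams are all assumptions made for illustration.

```python
def index_terms(tokens, common_terms):
    """Return the single terms plus one joined gram for every adjacent
    pair of tokens in which at least one token is a common term."""
    terms = list(tokens)
    for left, right in zip(tokens, tokens[1:]):
        if left in common_terms or right in common_terms:
            terms.append(f"{left}-{right}")
    return terms


# Hypothetical input: the phrase and the common-term list are made up.
tokens = "the vacation time of the year".split()

plain = index_terms(tokens, common_terms=set())
with_grams = index_terms(tokens, common_terms={"the", "of"})

print(len(plain), plain)
print(len(with_grams), with_grams)
```

With no common terms configured, only the single words are emitted; once "the" and "of" are treated as common, every pair touching them adds an extra term, which is the kind of growth described above.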
Hi,
In bin/nutch I saw this:
if [ "$NUTCH_LOGFILE" = "" ]; then
NUTCH_LOGFILE='hadoop.log'
fi
Wouldn't it make more sense to name the file nutch.log? Everything there is
Nutch-specific - Injector, Generator and I see some mapred things. But as
a Nutch user I'd expect to see nutch.log,
Hello,
I noticed the following line in conf/log4j.properties:
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
I noticed that the ${hadoop.log.dir}/${hadoop.log.file} sometimes gets
interpreted as "/", indicating that the 2 hadoop properties there are undefined.
I also noticed t
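The "/" result is what you would expect if undefined properties simply expand to an empty string. The Python sketch below emulates that style of `${...}` substitution; it is not log4j code, the `substitute` helper is hypothetical, and the directory `/tmp/nutch-logs` is a made-up value.

```python
import re


def substitute(template, props):
    """Replace each ${name} with props[name], or with '' if name is undefined."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: props.get(m.group(1), ""), template)


pattern = "${hadoop.log.dir}/${hadoop.log.file}"

# Neither property is defined: only the literal "/" survives.
undefined = substitute(pattern, {})

# Both properties are defined (hypothetical values).
defined = substitute(
    pattern,
    {"hadoop.log.dir": "/tmp/nutch-logs", "hadoop.log.file": "hadoop.log"},
)

print(undefined)  # /
print(defined)    # /tmp/nutch-logs/hadoop.log
```

If that assumption holds, seeing "/" would mean the JVM was started without both system properties set.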
- Original Message
From: Murat Ali Bayir <[EMAIL PROTECTED]>
To: nutch-user@lucene.apache.org
Sent: Friday, August 11, 2006 6:56:41 AM
Subject: [Nutch-general] log4j.properties
Hello everybody. Here are the first few lines of my log4j.properties file
log4j.rootLogger=INFO,DRFA,stdout
#
Hi,
Change conf/log4j.properties to DEBUG level. File nutch-default.xml is not
related to logging.
Also, please post questions like this to nutch-user.
Otis
- Original Message
From: Feng Ji <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
Sent: Friday, August 11, 2006 6:41
The thing I don't like about commercial products like Google Mini or similar is
that they charge you based on the number of documents allowed for
indexing, while at its core the software is probably the same, with just
some switches turned on and off.
I know that you can use httpclient and java to d
See /conf/log4j.properties;
just set the debug level of nutch or hadoop to "DEBUG".
By default, debugging output is written to /log/hadoop.log.
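If you would rather script that change than edit the file by hand, here is a minimal Python sketch. The helper `set_logger_level` is hypothetical, and the sample text only assumes that the file contains a `log4j.logger.org.apache.nutch` line like the ones quoted on this list.

```python
import re


def set_logger_level(properties_text, logger, level):
    """Return properties_text with log4j.logger.<logger> set to level.

    An existing line for that logger is rewritten; otherwise a new
    line is appended at the end.
    """
    key = f"log4j.logger.{logger}"
    line = f"{key}={level}"
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", re.MULTILINE)
    if pattern.search(properties_text):
        return pattern.sub(lambda _m: line, properties_text)
    separator = "" if properties_text.endswith("\n") or not properties_text else "\n"
    return properties_text + separator + line + "\n"


# Hypothetical file contents, not a copy of the shipped configuration.
sample = "log4j.rootLogger=INFO,DRFA,stdout\nlog4j.logger.org.apache.nutch=ALL\n"

updated = set_logger_level(sample, "org.apache.nutch", "DEBUG")
print(updated)
```

The same call with "org.apache.hadoop" would append a line for that logger, since the sample text has none.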
Feng Ji wrote:
Hi there,
I found that nutch-0.8 uses Apache's Commons Logging system
http://jakarta.apache.org/commons/logging/apidocs/index.html
Hi there,
I found that nutch-0.8 uses Apache's Commons Logging system
http://jakarta.apache.org/commons/logging/apidocs/index.html
During development, I'd like to turn on debug mode
"if (log.isDebugEnabled()) {
...
"
I checked nutch-default.xml, but can't find a place to turn it on.
Doe
Matthew,
Looking over your recrawl script, it seems like you are merging *all*
segments together, including any old segments. It seems to me that
you could just be merging only the new segments together. Maybe you
could explain a little of the reasoning behind this.
Thanks,
Jacob Brunson
On 8/8/
Stevenson, Kerry wrote:
Hello all - I have been taking a look at Nutch for purposes of indexing
a large pile of internal LAN files at our company, and so far it looks
quite impressive. I believe it could substitute for the Google Mini
appliance. However, the bigger Google boxes add more features
Thanks, that did the trick.
-Original Message-
From: Raphael Hoffmann [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 10, 2006 5:13 PM
To: nutch-user@lucene.apache.org
Subject: Re: More Fetcher NullPointerException
I had the same problem before. Just read
http://www.mail-archive.com/nu
Hi Timo!
I analyzed the index before and after correctly using the
common-terms.utf8 file. Before adding the common terms in my language,
my index was about 3 MB.
After adding the common terms, it is now 5 MB! Why does this happen?
Regards!
On 8/11/06, Lourival Júnior <[EMAIL PROTECTED]> wrote:
Hi Timo!
Th
Yes, I tested the index-more and query-more plugins. They allow
searching these fields easily. However, if I could have found documentation about
them, I would not have spent time thinking up a solution.
Thanks a lot!
On 8/11/06, Lukas Vlcek <[EMAIL PROTECTED]> wrote:
Hi,
You need to look into sourc
Hi Timo!
Thanks a lot! Now I have a clear understanding of this file. This article
helps a lot too: http://searchenginewatch.com/showPage.html?page=2156061
Thanks again!
On 8/11/06, Timo Scheuer < [EMAIL PROTECTED]> wrote:
Hi,
> Could anyone explain to me what exactly the common-terms.utf
Hello everybody. Here are the first few lines of my log4j.properties file
log4j.rootLogger=INFO,DRFA,stdout
# Logging Threshold
log4j.threshhold=ALL
log4j.logger.org.apache.nutch=ALL
log4j.logger.org.apache.hadoop=ALL
I want to reduce the size of the log files and keep only error logs,
which pa
Hi,
> Could anyone explain to me what exactly the common-terms.utf8 file does? I
> don't understand the real functionality of this file...
During indexing (and also during searching) the common terms are used to form
n-grams to make search faster for common words like articles for example. It
is a
Hi,
You need to look into the source to find out what exactly it does. As far
as I know, it does not add any new field to the index (that should be done
via the index-more plugin), but it allows you to query using type:, date:
and site:, I think.
Lukas
On 8/9/06, Lourival Júnior <[EMAIL PROTECTED]> wrote:
Wha
Hello
When I start a search query in Google, e.g. “Explore”, Google shows
me the first few results. After these results, Google displays more
results, but for a different search term, for example "Search". The meaning is
more or less the same, but with the query “Search” you will get more
results th
Benjamin Higgins wrote:
Further details:
If I run strace on the process, it looks like this, over and over and
over:
This doesn't really say anything, do a thread dump instead (Ctrl-Break
in Windows, kill -SIGQUIT in Unix).
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ _