Thank you for your confirmation Andrzej!
Yes, next time I will report it to the dev list :-p
Regards
On 1/9/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Nutch Newbie wrote:
> Hi:
>
> Could someone please be kind enough to confirm if the 0.9-dev trunk is
> broken? I did
help.
On 1/8/07, Nutch Newbie <[EMAIL PROTECTED]> wrote:
Hi:
I am getting the following error after updating to revision 494024. In my
hadoop-site.xml, mapred.speculative is set to false. I am not sure
what I am doing wrong; everything worked before the update. Any
help?
Regards
La
Hi:
I am getting the following error after updating to revision 494024. In my
hadoop-site.xml, mapred.speculative is set to false. I am not sure
what I am doing wrong; everything worked before the update. Any
help?
Regards
Language identifier configuration [1-4/2048]
map 100% reduce 0%
Language
Hi:
I would like to take this opportunity to propose another idea. Nutch
should have patch-committing guidelines; after all, they affect how we
code and how we should submit a patch so that it gets committed. It is
easier and more encouraging when I know exactly what I need to do
to get my code i
s and not giving them
the chance to develop and contribute.
I completely understand your view and I am aware of Hadoop work in progress.
Regards,
On 11/14/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
(Sorry for the long post, but I felt this issue needs to be made very
clear ...)
Nutch N
Here are some general comments:
The problem is in Hadoop, i.e. map-reduce, i.e. processing. HADOOP-206
is not solved. Have a look:
http://www.mail-archive.com/hadoop-user%40lucene.apache.org/msg00521.html
Well, again, it's wishful thinking to ask for many developers, patches,
and bug reporting and b
Well, I would like to agree with Piotr here, but in current development
(version 0.8 and onwards) a single-machine Nutch install is not optimal.
There are various Hadoop-related issues, for example
http://issues.apache.org/jira/browse/HADOOP-206
that are important for a single-machine install. I don't think "o
Can you post your "xmlparser-conf.xml" from the nutch/conf dir?
Also, what kind of error message do you get when you index?
You can use Luke to see the index...
Regards,
On 11/4/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
Hello Everyone,
I have just installed nutch-0.8.1 on my dev machine
You need to enable the "index-more" and "query-more" plugins to support
queries based on type, date range, etc.
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic</value>
</property>
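If it helps to check what that value actually enables, here is a small Python sketch (not Nutch code). It assumes the plugin.includes value is a regular expression that has to match the whole plugin id, and the helper name is_included is made up for the illustration.

```python
import re

# Value copied from the plugin.includes setting above.
PLUGIN_INCLUDES = (
    "protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|"
    "index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic"
)


def is_included(plugin_id: str, pattern: str = PLUGIN_INCLUDES) -> bool:
    """Return True if the whole plugin id matches the pattern.

    Assumption: the pattern is applied to the complete id, not to a substring.
    """
    return re.fullmatch(pattern, plugin_id) is not None


if __name__ == "__main__":
    # parse-pdf is only a sample id that the pattern above does not list.
    for plugin_id in ("index-more", "query-more", "index-basic", "parse-pdf"):
        print(plugin_id, is_included(plugin_id))
```

With the value above, index-more and query-more are matched, while an id that is not listed (parse-pdf in the sample loop) is not.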
On 10/28/06, [EMAIL PROTECTED] <[EMAIL PROTEC
interest.
On 8/15/06, thegallier <[EMAIL PROTECTED]> wrote:
I would love to try it.
Cheers
Nutch Newbie wrote:
>
> Hi:
>
> I am in the process of finishing up an installer for Nutch 0.8 (one
> machine/local install). I am using an open-source installer which complies
> with Ap
Good work!
On 7/17/06, Sudhi Seshachala <[EMAIL PROTECTED]> wrote:
In addition for crawling, I have customized the process of crawling.
Just curious: what do you mean by a customized process of crawling?
Best of luck with your site.
Hi
Have a look at the Java heap size setting in the bin/nutch script and
adjust it to your needs. You should see a line like
JAVA_HEAP_MAX=-Xmx1000m
in the bin/nutch script.
rgds
On 6/15/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
Hi,
I installed Tomcat using cPanel/WHM as root. It downlo
On 5/16/06, Alexander E Genaud <[EMAIL PROTECTED]> wrote:
Hello,
As far as I understand, /robots.txt designates which files may and may
not be indexed by Nutch and other crawlers. However, is there a
method by which a site may exclude only sections of a document?
The benefit is most evident i
AJ
Did you update the script to reflect the new changes in 0.8? If not, I can
update it. However, I am getting a class-not-found error when I try to
run nutch crawl or nutch inject. Yes, I did point it to the current
class in 0.8. Any suggestions?
Thanks
On 4/30/06, ArentJan Banck <[EMAIL PROTECTED]
Hi:
I am in the process of finishing up an installer for Nutch 0.8 (one
machine/local install). I am using an open-source installer which complies
with the Apache 2.0 license, so my plan is to make it open source if there
is enough community interest. The installer is in Java and you can
integrate it with Ant.
Hi Philippe:
Any progress? Do you need any help?
On 3/6/06, Ivan Sekulovic <[EMAIL PROTECTED]> wrote:
> I think that licence is OK.
>
> Using that library for a plugin is really simple. I did some tests some
> time ago.
>
> All you have to do is something like this (content is byte[])
>
> Metadata
Hmmm.. How about this... The photographer who takes a photo holds the
copyright over the photo, not the subject of the picture: you, me
or any other photographed object. So caching is nothing but taking a
picture using another sort of camera, called a robot :-) Nothing more
really. If a browser maker decide
Ravi:
Just wondering, did you submit your modification to JIRA? I can't
seem to find it.
Thanks
On 3/6/06, Ravi Chintakunta <[EMAIL PROTECTED]> wrote:
> Hi Frank,
>
> Have a look at this thread.
>
> http://www.mail-archive.com/nutch-user@lucene.apache.org/msg03014.html
>
> - Ravi
>
> On 3/6/06,
> based on the 0.7.1 base.
>
> The second error seems to indicate that you don't have a filter
> method in your indexer plugin. Check to make sure there isn't a typo in
> the name of the method.
>
> Good luck,
> Jake.
>
> -Original Message-
>
Hi Jacob:
I have been trying to compile the recommended plugin example but am having
no luck. I did "ant tar" and I added deploy and clean in the
plugins/build.xml, but I keep getting the following error. As I am just
getting started, any hint will be
greatly appreciat
Hi:
Google mini internals... check it out -
http://www.anandtech.com/IT/showdoc.aspx?i=2523&p=3
Pentium 3 and old dell memory?
Regards
ppreciate any response on this.
>
> Thanks In Advance
> Pushpesh
>
>
> On 12/28/05, Nutch Newbie <[EMAIL PROTECTED]> wrote:
> >
> > Have you tried the following:
> >
> > http://wiki.apache.org/nutch/HardwareRequirements
> >
> > and
>
I had a very similar problem with JDK 1.5. Also, when I worked with
only one data node, the problem did not occur.
Thanks
On 12/28/05, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Interesting!
> That is not a feature, that is a bug; maybe you can open a minor bug
> report.
> Thanks.
> Stefan
> On 28.12
Have you tried the following:
http://wiki.apache.org/nutch/HardwareRequirements
and
http://wiki.apache.org/nutch/
There is no quick answer if one is planning to crawl a million
pages. Read, try, read again.
On 12/28/05, Pushpesh Kr. Rajwanshi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to know if
Can you please try the following in your nutch-site.xml?
I have added 5 after "local". You can also try 127.0.0.1.
> <property>
>   <name>fs.default.name</name>
>   <value>local:5</value>
>   <description>The name of the default file system. Either the
>   literal string "local" or a host:port for NDFS.</description>
> </property>
Please make sure those ndfs and map
Stefan:
Your docs are good as they are. But if you want to be the best, then
you gotta do :-)
You could improve your tutorial by adding to or modifying the following places.
1. You mention only vaguely that all 3 machines should have the same
user/pass. I think it would be a good idea to put some
Yes, it's possible. I am guessing: if you have unpacked the tar file to, let's say,
nutch-0.8.dev/
then go into the
src/ directory,
find the directory
src/webapps
and copy it so that the directory
ends up at nutch-0.8.dev/webapps
When you start the jobtracker, it looks for that directory.
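The same copy can be scripted. This is only a rough Python sketch of the step described above: the function name copy_webapps is made up, and the install directory (nutch-0.8.dev in my example) is whatever you unpacked the tar file to.

```python
import shutil
from pathlib import Path


def copy_webapps(nutch_home: str) -> Path:
    """Copy <nutch_home>/src/webapps to <nutch_home>/webapps and return the target.

    Hypothetical helper for illustration; nutch_home is the directory the
    tar file was unpacked to.
    """
    home = Path(nutch_home)
    source = home / "src" / "webapps"
    target = home / "webapps"
    if not source.is_dir():
        raise FileNotFoundError(f"expected a directory at {source}")
    # dirs_exist_ok (Python 3.8+) lets the copy be repeated without failing.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

You would call it as copy_webapps("nutch-0.8.dev") from the directory that contains the unpacked install.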
Can you also try
Hi:
The command bin/nutch admin is not supported in version 0.8.
I don't recommend running the crawl command, as it is designed for "one run".
Instead, I suggest you follow the tutorial below:
http://wiki.media-style.com/display/nutchDocu/setup+a+map+reduce+multi+box+system
It worked for me. Howeve
Hi:
I agree. It would be nice if one could do this: some sort of mapping
based on pre-defined values. There is a plugin that might be of value.
You could start by looking at the "Creative-Commons" plugin. Maybe one
could modify the file-protocol plugin to implement such an option.
Just some thoughts.
Re
29 matches