Re: Error after SVN update

2007-01-09 Thread Nutch Newbie
Thank you for your confirmation, Andrzej! Yes, next time I will report it to the dev list :-p Regards On 1/9/07, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: Nutch Newbie wrote: > Hi: > > Could someone please be kind enough to confirm whether the 0.9-dev trunk is > broken. I did

Re: Error after SVN update

2007-01-08 Thread Nutch Newbie
help. On 1/8/07, Nutch Newbie <[EMAIL PROTECTED]> wrote: Hi: I am getting the following error after updating to revision 494024. In my hadoop-site.xml, (mapred.speculative) is set to false. I am not sure what I am doing wrong; everything worked before the update. Any help? Regards La

Error after SVN update

2007-01-08 Thread Nutch Newbie
Hi: I am getting the following error after updating to revision 494024. In my hadoop-site.xml, (mapred.speculative) is set to false. I am not sure what I am doing wrong; everything worked before the update. Any help? Regards Language identifier configuration [1-4/2048] map 100% reduce 0% Language

Re: 0.7.3 version

2006-11-16 Thread Nutch Newbie
Hi: I would like to take this opportunity to propose another idea. Nutch should have patch-committing guidelines; after all, they affect how we code and how we should submit a patch so that it gets committed. It is easier and more encouraging when I know exactly what I need to do to get my code i

Re: Strategic Direction of Nutch

2006-11-13 Thread Nutch Newbie
s and not giving them the chance to develop and contribute. I completely understand your view and I am aware of Hadoop work in progress. Regards, On 11/14/06, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: (Sorry for the long post, but I felt this issue needs to be made very clear ...) Nutch N

Re: Strategic Direction of Nutch

2006-11-13 Thread Nutch Newbie
Here are some general comments: The problem is in Hadoop, i.e. map-reduce, i.e. processing. HADOOP-206 is not solved. Have a look: http://www.mail-archive.com/hadoop-user%40lucene.apache.org/msg00521.html Well, again, it's wishful thinking to ask for many developers, patches and bug reporting and b

Re: Strategic Direction of Nutch

2006-11-13 Thread Nutch Newbie
Well, I would like to agree with Piotr here, but in current development, i.e. version 0.8 and onwards, a single-machine Nutch install is not optimal. There are various Hadoop-related issues, for example http://issues.apache.org/jira/browse/HADOOP-206, that are important for a single-machine install. I don't think "o

Re: XMLParser for Nutch

2006-11-04 Thread Nutch Newbie
Can you post your "xmlparser-conf.xml" from the nutch/conf dir? Also, what kind of error message do you get when you index? You can use Luke to inspect the index. Regards, On 11/4/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote: Hello Everyone, I just installed nutch-0.8.1 on my dev machine

Re: How can I setup an mp3 search engine?

2006-10-28 Thread Nutch Newbie
You need to enable the "index-more" and "query-more" plugins to enable queries based on type, date range, etc. plugin.includes protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic On 10/28/06, [EMAIL PROTECTED] <[EMAIL PROTEC
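As a rough illustration of how the plugin.includes value quoted above selects plugins, here is a minimal Python sketch. It assumes the value is applied as a regular expression that must match a plugin's directory name in full; that matching rule and the helper names is_included and included are assumptions made for illustration, not Nutch code.

```python
import re

# The plugin.includes value quoted in the message above.
PLUGIN_INCLUDES = (
    r"protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|"
    r"index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic"
)


def is_included(plugin_id, pattern=PLUGIN_INCLUDES):
    """Return True if plugin_id matches the pattern in full (assumed rule)."""
    return re.fullmatch(pattern, plugin_id) is not None


def included(plugin_ids, pattern=PLUGIN_INCLUDES):
    """Filter a list of plugin directory names down to the enabled ones."""
    return [p for p in plugin_ids if is_included(p, pattern)]


if __name__ == "__main__":
    candidates = ["index-basic", "index-more", "query-more", "parse-pdf"]
    print(included(candidates))
```

Under these assumptions, index-more and query-more are selected, while a plugin not named in the expression (parse-pdf in this hypothetical list) is left out.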

Re: nutch installer

2006-08-16 Thread Nutch Newbie
interest. On 8/15/06, thegallier <[EMAIL PROTECTED]> wrote: I would love to try it. Cheers Nutch Newbie wrote: > > Hi: > > I am in the process of finishing up an Installer for Nutch 0.8 (one > machine/local install), I am using opensource installer which complies > with Ap

Re: Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-17 Thread Nutch Newbie
Good work! On 7/17/06, Sudhi Seshachala <[EMAIL PROTECTED]> wrote: In addition for crawling, I have customized the process of crawling. Just curious what do you mean by customized process of crawling? Best of luck with your site.

Re: Memory problem while running Nutch

2006-06-15 Thread Nutch Newbie
Hi, have a look at the Java heap size setting in the bin/nutch script and adjust it to your settings. You should see a line like JAVA_HEAP_MAX=-Xmx1000m in the bin/nutch script. Regards On 6/15/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote: Hi, I installed Tomcat using cPanel/WHM as root. It downlo

Re: robot exclusion portional of a document

2006-05-17 Thread Nutch Newbie
On 5/16/06, Alexander E Genaud <[EMAIL PROTECTED]> wrote: Hello, As far as I understand, /robots.txt designates which files may and may not be indexed by Nutch and other crawlers. However, is there a method by which a site may exclude only sections of a document? The benefit is most evident i

Re: Startscript in windows

2006-05-03 Thread Nutch Newbie
AJ, did you update the script to reflect the new changes in 0.8? No? I can update it. However, I am getting a "Class not found" error when I try to run nutch crawl or nutch inject. Yes, I did point it to the current class in 0.8. Any suggestions? Thanks On 4/30/06, ArentJan Banck <[EMAIL PROTECTED]

nutch installer

2006-04-20 Thread Nutch Newbie
Hi: I am in the process of finishing up an installer for Nutch 0.8 (one machine/local install). I am using an open source installer that complies with the Apache 2.0 license, so my plan is to make it open source if there is enough community interest. The installer is in Java and you can integrate it with Ant.

Re: Jpeg and Exif Plugin

2006-04-10 Thread Nutch Newbie
Hi Philippe: Any progress? Do you need any help? On 3/6/06, Ivan Sekulovic <[EMAIL PROTECTED]> wrote: > I think that licence is OK. > > Using that library for a plugin is really simple. I did some tests some > time ago. > > All you have to do is something like this (content is byte[]) > > Metadata

Re: Legal issues

2006-03-30 Thread Nutch Newbie
Hmmm.. How about this... The photographer who takes a photo has the copyright over the photo, not the owner of the picture's subject: you, me or any other photo object. So caching is nothing but taking a picture using another sort of camera called a robot :-) Nothing more really. If a browser maker decide

Re: Multi-applications?

2006-03-06 Thread Nutch Newbie
Ravi: Just wondering, did you submit your modification in JIRA? I can't seem to find it. Thanks On 3/6/06, Ravi Chintakunta <[EMAIL PROTECTED]> wrote: > Hi Frank, > > Have a look at this thread. > > http://www.mail-archive.com/nutch-user@lucene.apache.org/msg03014.html > > - Ravi > > On 3/6/06,

Re: recommended plugin example

2006-02-24 Thread Nutch Newbie
> based on the 0.7.1 base. > > The second error seems to indicate that you don't have a filter > method in your indexer plugin. Check to make sure there isn't a typo in > the name of the method. > > Good luck, > Jake. > > -Original Message- >

recommended plugin example

2006-02-24 Thread Nutch Newbie
Hi Jacob: I have been trying to compile the recommended plugin example but am having no luck. I did "ant tar" and I added deploy and clean in plugins/build.xml, but I keep getting the following error. As I am just getting started, any hint will be greatly appreciat

offtopic - disecting google mini

2006-02-14 Thread Nutch Newbie
Hi: Google mini internals... check it out - http://www.anandtech.com/IT/showdoc.aspx?i=2523&p=3 Pentium 3 and old Dell memory? Regards

Re: Is any one able to successfully run Distributed Crawl?

2005-12-28 Thread Nutch Newbie
ppreciate any response on this. > > Thanks In Advance > Pushpesh > > > On 12/28/05, Nutch Newbie <[EMAIL PROTECTED]> wrote: > > > > Have you tried the following: > > > > http://wiki.apache.org/nutch/HardwareRequirements > > > > and >

Re: Trouble setting NDFS on multiple machines

2005-12-28 Thread Nutch Newbie
I had a very similar problem with JDK 1.5. Also, when I worked with only one data node, the problem didn't occur. Thanks On 12/28/05, Stefan Groschupf <[EMAIL PROTECTED]> wrote: > Interesting! > That is not a feature, that is a bug; maybe you can open a minor bug > report. > Thanks. > Stefan > Am 28.12

Re: Is any one able to successfully run Distributed Crawl?

2005-12-27 Thread Nutch Newbie
Have you tried the following: http://wiki.apache.org/nutch/HardwareRequirements and http://wiki.apache.org/nutch/ There is no quick answer if one is planning to crawl a million pages. Read, try, read again. On 12/28/05, Pushpesh Kr. Rajwanshi <[EMAIL PROTECTED]> wrote: > Hi, > > I want to know if

Re: How to run Nutch?

2005-12-27 Thread Nutch Newbie
Can you please try the following in your nutch-site.xml? I have added 5 after "local". You can also try 127.0.0.1. <property> <name>fs.default.name</name> <value>local:5</value> <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description> </property> Please make sure those ndfs and map

Re: How to run Nutch?

2005-12-27 Thread Nutch Newbie
Stefan: Your docs are good as they are, but if you want them to be the best then you've got to do more :-). You could improve your tutorial by adding to or modifying the following places. 1. You mention only vaguely having the same user/pass for all 3 machines. I think it would be a good idea to put some

Re: How to run Nutch?

2005-12-26 Thread Nutch Newbie
Yes, it's possible. I am guessing: if you have unpacked the tar file into, let's say, nutch-0.8.dev/, then go under the src/ directory, find the directory src/webapps, and copy it so that it sits under nutch-0.8.dev/webapps. When you start the jobtracker, it looks for that directory. Can you also try

Re: How to run Nutch?

2005-12-26 Thread Nutch Newbie
Hi: The command - bin/nutch admin is not supported in version 0.8. I don't recommend running the crawl command, as it is designed for "one run". Instead, I suggest you follow the tutorial below: http://wiki.media-style.com/display/nutchDocu/setup+a+map+reduce+multi+box+system It worked for me. Howeve

Re: file to http mapping

2005-12-26 Thread Nutch Newbie
Hi: I agree. It would be nice if one could do this. Some sort of mapping based on pre-defined value. There is a plugin that might be of value. You could start by looking at "Creative-Commons" plugin. Maybe one could modify the file-protocol plugin to implement such option. Just some thoughts. Re