Re: no nutch script file under bin directory
From: Kai_testing Middleton

The nightly builds are all cataloged here:

http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/

The current nightly build is #153 from July 18. For instance, you could do:

wget http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/153/artifact/trunk/build/nutch-2007-07-18_04-01-20.tar.gz

--Kai
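The download step described above can be sketched as a small shell helper. This is only an illustration: the function name `nightly_url` and its two parameters are hypothetical, the URL layout is simply copied from the wget example in the message, and build #153 may no longer be available on the server.

```shell
#!/bin/sh
# Hypothetical helper: print the artifact URL for one numbered nightly build.
# The URL layout is copied from the message above; it is not a documented API.
nightly_url() {
    build="$1"   # Hudson build number, e.g. 153
    stamp="$2"   # timestamp embedded in the tarball name, e.g. 2007-07-18_04-01-20
    printf '%s/%s/artifact/trunk/build/nutch-%s.tar.gz\n' \
        "http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly" \
        "$build" "$stamp"
}

# Example (needs network access, so it is left commented out):
# wget "$(nightly_url 153 2007-07-18_04-01-20)"
# tar xzf nutch-2007-07-18_04-01-20.tar.gz
```

Pinning an explicit build number, as in the message, keeps the build number and the timestamp in the file name in step with each other.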
RE: no nutch script file under bin directory
From: Tsengtan A Shuy
Sent: Wednesday, July 18, 2007 11:59:52 AM

Where do you get the nightly build? I followed the web page you referred me to and used

wget http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/lastStableBuild /artifact/trunk/build/nutch-2007-06-27_06-52-44.tar.gz

to get it. Then I got a "file not found" error message.

Adam Shuy, President
ePacific Web Design & Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com
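The thread never says why this wget call failed. Two guesses, both assumptions: the quoted command has a space before /artifact (possibly introduced by line wrapping), which wget would treat as two separate arguments; and lastStableBuild presumably tracks the newest stable build, so a tarball name carrying the older 2007-06-27 timestamp may no longer exist there. The hypothetical helper `join_wrapped_url` below addresses only the first guess, by stripping whitespace from a pasted URL.

```shell
#!/bin/sh
# Hypothetical helper: remove spaces, tabs, and line breaks that mail
# clients may insert when a long URL is wrapped, and print the result.
join_wrapped_url() {
    printf '%s' "$1" | tr -d ' \t\r\n'
    echo
}

# Example (needs network access, so it is left commented out):
# wget "$(join_wrapped_url 'http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/lastStableBuild /artifact/trunk/build/nutch-2007-06-27_06-52-44.tar.gz')"
```

Even with the space removed, the request can still fail if the file name does not match the build that lastStableBuild currently points to.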
Re: no nutch script file under bin directory
From: Kai_testing Middleton
Sent: Wednesday, July 18, 2007 11:35 AM

I'm not actually sure ... I think I downloaded and unzipped a nightly build in my /usr/local directory, thus creating this directory:

/usr/local/nutch-2007-06-27_06-52-44

Then, from within that directory, I ran the svn command ... if I remember correctly.

You can always try just making a 'nutch' directory or a 'nutch0.9' directory, running svn, and seeing whether it creates another subdirectory under that, then moving things to where you want.
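On the question of whether svn creates another subdirectory: `svn checkout URL` with no target creates a directory named after the last component of the URL, while `svn checkout URL PATH` places the working copy directly in PATH. The sketch below only computes which directory would be populated. The helper name `checkout_dir` is hypothetical, and the repository URL is an assumption about where the Nutch trunk lived at the time, since the thread does not quote it.

```shell
#!/bin/sh
# Assumed repository URL for the 2007-era Nutch trunk; verify before use.
NUTCH_TRUNK_URL="http://svn.apache.org/repos/asf/lucene/nutch/trunk"

# Hypothetical helper: print the directory that `svn checkout URL [PATH]`
# populates: PATH when given, otherwise the last component of the URL.
checkout_dir() {
    url="${1%/}"                 # drop one trailing slash, if any
    if [ -n "${2:-}" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "${url##*/}"
    fi
}

# Example (needs network access and svn, so it is left commented out):
# svn checkout "$NUTCH_TRUNK_URL" nutch-trunk
# ls "$(checkout_dir "$NUTCH_TRUNK_URL" nutch-trunk)/bin"
```

Giving an explicit target directory avoids having to move things afterwards, and it keeps the checkout separate from an unpacked release or nightly build.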
RE: no nutch script file under bin directory
From: Tsengtan A Shuy

How do I apply Nutch to a website without using the Tomcat root directory or a remote search engine like Mozdex.com?

Adam Shuy
Re: no nutch script file under bin directory
From: Kai_testing Middleton
Sent: Wednesday, July 18, 2007 9:42 AM

Hi: sorry, here's the original discussion that led to the link I accidentally sent twice; I had meant to include it too.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg08621.html
RE: no nutch script file under bin directory
From: Tsengtan A Shuy
Sent: Tuesday, July 17, 2007 5:30:18 PM

This may seem like a silly question, but I need to ask it anyway. When I check out the trunk, I should put it in the nutch directory, which should be the directory of the latest release, e.g. nutch-0.9. Am I right?

Adam Shuy
RE: no nutch script file under bin directory
From: Tsengtan A Shuy
Sent: Tuesday, July 17, 2007 12:32:49 PM

BTW, I just found out there is only one web page reference in your last email, so I do not understand what you meant by "two discussions".

Adam Shuy

-----Original Message-----
From: Tsengtan A Shuy [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 17, 2007 12:23 PM
To: 'nutch-dev@lucene.apache.org'
Subject: no nutch script file under bin directory

I followed msg06571.html to check out the trunk. Then I found there is no nutch script file under the bin directory. How do you crawl multiple websites without this nutch script file?

Adam Shuy

-----Original Message-----
From: Kai_testing Middleton [mailto:[EMAIL PROTECTED]
Sent: Monday, July 16, 2007 8:43 AM
To: nutch-dev@lucene.apache.org
Subject: Re: OOM error during parsing with nekohtml

You could try looking at these two discussions:

http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html
http://www.mail-archive.com/nutch-dev@lucene.apache.org/msg06571.html

--Kai
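For the original problem (no nutch script under bin), a quick check of a checkout or an unpacked build might look like the sketch below. The helper name `has_nutch_script` is hypothetical, and the crawl command in the comment is the one-step form from the Nutch tutorial of that era as best remembered, so treat its options as an assumption rather than a verified invocation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR contains the bin/nutch launcher script.
has_nutch_script() {
    [ -f "$1/bin/nutch" ]
}

# Example usage (the path is only an illustration):
# if has_nutch_script /usr/local/nutch-2007-06-27_06-52-44; then
#     cd /usr/local/nutch-2007-06-27_06-52-44
#     # Assumed one-step crawl; "urls" is a directory of seed URL files.
#     bin/nutch crawl urls -dir crawl -depth 3 -topN 50
# else
#     echo "bin/nutch is missing; the checkout or unpack may be incomplete" >&2
# fi
```

If the check fails, it is worth confirming that the checkout or download actually completed and that the directory being inspected is the top level of the source tree.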