I forgot about the parsechecker and indexchecker command line options.
When I run parsechecker with the default Nutch and the standard job
file, it works.
14/08/13 11:35:28 INFO http.Http: http.proxy.host = null
14/08/13 11:35:28 INFO http.Http: http.proxy.port = 8080
14/08/13 11:35:28 INFO ht
Great, passes all tests.
+1 for release.
On Wed, Aug 13, 2014 at 1:31 PM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi user@ & dev@,
> This thread is a VOTE for releasing Apache Nutch 1.9.
> The release candidate comprises the following components.
> * A staging repository [0] conta
Hi,
Yes, that should be fine. The only thing I would do differently is:
> 2. Change the list in regex-urlfilter.txt (add +^http://www.rlp.de/ i.e.
> for every URL)
allow any URLs instead of specifying all the hostnames one by one, but set
the following property to true in nutch-site.xml:
Hi,
+1 to release. Compilation and tests run fine. Signatures look good.
Thanks Lewis!
Julien
On 13 August 2014 06:32, Lewis John Mcgibbney
wrote:
> VOTE'ing will be open for at least 72 hours to allow people enough time
> to cast their VOTEs.
> Thanks
> Lewis
>
>
> On Tue, Aug 12, 2014 a
Hi Steve,
I tried with Nutch 1.9 RC1 and am not getting this exception.
=> ./nutch parsechecker -D http.agent.name=tralala
http://www.my-ebenefits.com/PenguinRandomHouse/
Probably something that we have fixed since 1.5.1, which is rather outdated.
Why don't you give 1.9 a try instead?
Julien
On 12