I think this setting in my nutch-site.xml may be what is killing re-directs.
I just remembered I have this in there:
<property>
  <name>db.ignore.external.links</name>
  <value>true</value>
  <description>
    Don't go to external links, just stay in the domain
    that I passed into you
  </description>
</property>
I only want to crawl within the domain I requested, unless that URL
immediately re-directs me to a different URL, in which case I want to
crawl only the re-direct target. Any thoughts?
Am I understanding this correctly?
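One thing that might be worth trying, as a sketch only: nutch-default.xml
has an http.redirect.max property, and as far as I can tell its default of 0
means the fetcher does not follow a re-direct right away but records the
target for a later fetch. Overriding it in nutch-site.xml would look
something like the fragment below. The value 3 is an arbitrary pick, and I
have not verified how this interacts with db.ignore.external.links when the
re-direct target is on a different host.

```xml
<!-- Sketch for nutch-site.xml, not a verified fix. -->
<!-- Assumption: a positive value makes the fetcher follow up to that many
     re-directs immediately; 3 is an illustrative value only. -->
<property>
  <name>http.redirect.max</name>
  <value>3</value>
  <description>
    Maximum number of re-directs the fetcher follows when fetching a page
  </description>
</property>
```

If the target still gets dropped with this set, that would suggest the
external-links setting is filtering it out.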
Ray
-----Original Message-----
From: Lukas, Ray [mailto:[email protected]]
Sent: Monday, May 04, 2009 1:56 PM
To: [email protected]
Subject: Re-direct in Nutch does not seem to work
Re-direct in Nutch 1.0 does not seem to work.
If I point Nutch at a URL that is the target of a re-direction (the page
being "re-directed to"), everything works great. If I point it at the page
that re-directs to that working one, I get a corrupted index.
Can Nutch handle re-direction, and if so, what magic is required?
ray