Are you crawling JSPs?
If so, put this in your regex-normalize.xml so the session ID is stripped from each URL:
<regex>
  <pattern>(.*)(;jsessionid=[a-zA-Z0-9]{32})(.*)</pattern>
  <substitution>$1$3</substitution>
</regex> 
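
To see what that rule does to a URL, here is a minimal sketch that applies the same pattern and substitution with plain java.util.regex. It only emulates the rule and is not Nutch's own normalizer code; the class name, method name, and example.com URLs are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustrative only: emulates the <regex> rule with java.util.regex.
// JsessionidStripDemo and normalize() are hypothetical names, not Nutch APIs.
class JsessionidStripDemo {
    // Same pattern and substitution as in the regex-normalize.xml rule.
    private static final Pattern RULE =
        Pattern.compile("(.*)(;jsessionid=[a-zA-Z0-9]{32})(.*)");
    private static final String SUBSTITUTION = "$1$3";

    // Drops group 2 (the ;jsessionid=... part) and keeps what surrounds it.
    static String normalize(String url) {
        return RULE.matcher(url).replaceAll(SUBSTITUTION);
    }

    public static void main(String[] args) {
        // Made-up URL carrying a 32-character session ID.
        String withSession =
            "http://example.com/app/page.jsp;jsessionid=0123456789ABCDEF0123456789ABCDEF?id=5";
        // URL without a session ID: the rule does not match, so it is unchanged.
        String withoutSession = "http://example.com/app/page.jsp?id=5";

        System.out.println(normalize(withSession));
        System.out.println(normalize(withoutSession));
    }
}
```

Both calls print http://example.com/app/page.jsp?id=5, so the same page fetched under different sessions ends up as one URL. Note the rule only matches IDs of exactly 32 letters or digits.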

***********
Then change this setting in your nutch-default.xml:

<property>
  <name>urlnormalizer.class</name>
  <value>org.apache.nutch.net.RegexUrlNormalizer</value>
  <description>Name of the class used to normalize URLs.</description>
</property>

-----Original Message-----
From: Deepa Devanathan [mailto:[EMAIL PROTECTED] 
Sent: Friday, July 21, 2006 9:21 AM
To: nutch-user@lucene.apache.org
Subject: Nutch with Domino web server

Hi guys,

I tried crawling my site, which runs on a Domino web server talking to
Tomcat, using the crawl command (with all the config for URLs,
file types, etc.), but the crawl log doesn't show any URLs being
fetched.

Is there something different I need to do to run a crawl for a site
running on a Domino web server?

I had earlier run the crawl successfully with an Apache web server, but
I am not able to do so with Domino.

Any ideas or suggestions?
Any help would be highly appreciated.

Thanks in advance,
Deepa
