Hi guys,
Just had a quick question: can Nutch 0.7.1 also crawl text content in JSPs?
If so, is it built in, or is there a plugin available for this?
If my site content is entirely in JSPs, does that mean it can't be crawled
and hence searched?
Please let me know your thoughts on this .. based on w
Hi guys,
I have a site I need to crawl, but the very first page asks for a username
and password.
Is there a way I can supply these through some config file in Nutch so I can
crawl the underlying content?
If anyone has ideas, please let me know.
Alex and Sudhi - thanks for your responses to my prev
Hi guys,
Can Nutch parse Lotus Notes databases (.nsf files) yet?
My site uses NSFs extensively, and I need to crawl the content, which
includes HTML, JSPs, PDFs, etc.
Will the normal crawl work? If anybody has any ideas, please let me know.
Any help is greatly appreciated!
Thanks,
Dee
Hi guys,
I tried crawling my site, which runs on a Domino web server talking to
Tomcat, using the crawl command (with all the config for URLs, file types,
etc.), but the crawl log doesn't show any URLs being fetched.
Is there something different I need to do to run a crawl for a site run
Hi,
I have a setup where a non-Apache server serves content on a port other
than 80, alongside a Tomcat for JSP content. I have installed Nutch and run
the crawl program.
The indexes are not getting created properly: I was unable to see the URLs
of the pages being indexed in the log
Hi,
I am a newbie to Nutch and need some help with my search results.
I have a common index for some English as well as French HTML pages. I read
in the mail archives that
1. by activating the language identifier plugin in nutch-default.xml and
2. adding the advisory attribute lang:fr to the que