Hi guys,
Can Nutch parse Lotus Notes databases (.nsf files) yet?
My site uses .nsf files extensively and I need to crawl the content, which
includes HTML, JSPs, PDFs, etc.
Will the normal crawl work? If anybody has any ideas, please let me know.
Any help is greatly appreciated!
Thanks,
I do not believe we have an nsf plugin.
You will have to write a plugin to be able to handle .nsf files. There is a plugin
tutorial on the Nutch wiki. Please refer to it.
Thanks
Sudhi
Deepa Devanathan [EMAIL PROTECTED] wrote:
hi guys,
Can Nutch parse thru Lotus notes databases - .nsf files
Hello, we want to install Nutch on an intranet. We found an installation guide (we sent it with the email), but we could not get past the first step; in fact, we did not understand this: Intranet: Configuration To configure things for intranet crawling you must: Create a
What do you mean by the results page? The search.jsp?
Thanks
- Original Message -
From: Matt Timion [EMAIL PROTECTED]
To: nutch-user@lucene.apache.org
Sent: Monday, July 24, 2006 9:46 AM
Subject: Re: Search with sponsored ads?
Should be as easy as dropping the code in the results
Hello everybody,
I am a new Nutch user and am trying to install it on a Godaddy.com Linux server.
It works on my local machine, but unfortunately I am getting the following
error message on the Godaddy.com Linux server (test JSPs and servlets work
correctly) when I run Nutch. Can anyone help me?
Hi
I am trying to run Nutch by following the instructions
given in the tutorial.
The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2
And of course Tomcat 5
I get the following errors:
[EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth
3 -topN50
run java in
Try deleting the crawl directory in /root/nutch-0.7.2/, then run the command
again.
On 7/26/06, kawther khazri [EMAIL PROTECTED] wrote:
Hi
I am trying to run Nutch by following the instructions
given in the tutorial.
The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2
And of course Tomcat
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website - but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but _all_ matches
from the summary-website (would be okay in this
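The behaviour asked for here (a cap of 2 matches per website, with one site exempt from the cap) can be sketched as a post-processing step over a ranked list of hit URLs. This is only an illustration of the idea, not Nutch code: the function name `limit_hits_per_site`, its parameters, and the example hosts are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def limit_hits_per_site(hits, max_per_site=2, unlimited_sites=()):
    """Keep at most max_per_site hits per host, except for exempt hosts.

    hits is a ranked list of URLs; the original order is preserved.
    """
    seen = defaultdict(int)  # number of hits encountered so far per host
    kept = []
    for url in hits:
        host = urlparse(url).hostname or ""
        seen[host] += 1
        if host in unlimited_sites or seen[host] <= max_per_site:
            kept.append(url)
    return kept


hits = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://summary.example/1",
    "http://summary.example/2",
    "http://summary.example/3",
]
for url in limit_hits_per_site(hits, 2, {"summary.example"}):
    print(url)
```

With these inputs the third hit from a.example is dropped, while every hit from the exempt host is kept.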
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website - but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but _all_ matches
from the summary-website
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website - but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but _all_ matches
from
Sami Siren wrote:
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website -
but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but
I have already deleted the crawl directory, but now I get another error
message:
[EMAIL PROTECTED] ~]# rm -f -r crawl
[EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth
3 -topN50
run java in /usr/lib/jvm/jre
060726 152304 parsing
Stefan Neufeind wrote:
Sami Siren wrote:
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website -
but
that also from the summary-website only 2 matches are shown. Either I'd
need to
Andrzej Bialecki wrote:
Stefan Neufeind wrote:
Sami Siren wrote:
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website -
but
that also from the summary-website only 2 matches are
I don't know where to create the urls directory (for the intranet
configuration), or how to edit the file urls/nutch.
Please, I need your help.
Thanks
Hello,
I'm wondering if anyone can help. We injected 1000 seed URLs into Nutch
0.7.2 (basic configuration + 1000 URLs in regexp filter) and it
processed them in just a few hours. We just switched to 0.8 with the same
configuration, same URLs, but it seems everything slowed down
significantly.
A results page displays the search results. I believe Nutch's results page
is search.jsp.
- Original Message -
From: wmelo_qualidade [EMAIL PROTECTED]
To: nutch-user@lucene.apache.org
Sent: Monday, July 24, 2006 9:56 AM
Subject: Re: Search with sponsored ads?
What do you mean by
Wow... that's a cool site Sudhi.
As for the PHP Java bridge, my site's front end is a custom PHP script. It
just parses the XML from
http://localhost:8080/opensearch?query=urlencoded+search+string+here and
then outputs it according to my php functions.
It's secure (8080 is only available
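The approach described in this message (fetch XML from the opensearch URL, parse it, render it yourself) could look roughly like the sketch below, written in Python rather than PHP. It assumes the response is an RSS 2.0 document whose `item` elements carry `title`, `link` and `description` children; the function name `parse_results` and the sample document are hypothetical, and the sample stands in for a real response so the block runs without a server.

```python
import xml.etree.ElementTree as ET


def parse_results(xml_text):
    """Return a list of (title, link, description) tuples from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        results.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return results


# Hypothetical sample standing in for the XML returned by the search servlet.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Sample results</title>
    <item>
      <title>Example page</title>
      <link>http://www.example.com/</link>
      <description>An example hit.</description>
    </item>
  </channel>
</rss>"""

for title, link, description in parse_results(SAMPLE):
    print(title, link, description)
```

Against a live server, the text passed to `parse_results` would instead be the body read from the opensearch URL, for example with `urllib.request.urlopen`.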
Does anyone know how to get the keywords from the meta tags of a page?
I have been looking around, but it wasn't immediately apparent how to do
this.
Dennis
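As a language-neutral illustration of what extracting keywords from meta tags involves, here is a minimal standalone sketch using Python's standard `html.parser`. It is not Nutch code and does not show how Nutch exposes parsed metadata; the class name `MetaKeywordsParser` and the helper `extract_keywords` are hypothetical, and the sketch assumes keywords are comma-separated in the `content` attribute of `<meta name="keywords">`.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the comma-separated values of <meta name="keywords" content="...">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "keywords":
            content = attr_map.get("content", "")
            self.keywords.extend(k.strip() for k in content.split(",") if k.strip())


def extract_keywords(html_text):
    """Return the list of meta keywords found in html_text."""
    parser = MetaKeywordsParser()
    parser.feed(html_text)
    parser.close()
    return parser.keywords


page = '<html><head><meta name="Keywords" content="nutch, search , crawler"></head></html>'
print(extract_keywords(page))
```

The same logic (find the meta element, read its content attribute, split on commas) would apply in whatever parsing hook the crawler provides.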
OK. I did this for Nutch 0.8 (had to edit the listed code somewhat to make
up for changes from 0.7.2 to 0.8, mostly having to do with the
Configuration type being needed).
It partially works.
If the page I'm trying to index contains the word "interviews" and I
type "interview" into the search engine,
It sounds like the query-stemmer is not being called.
The query string "interviews" needs to be processed
into "interview". Are you sure that your nutch-default.xml
is including the query-stemmer correctly? Put print statements
in to see if it's getting there.
By the way, someone recently told me
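The mismatch discussed in this thread can be reproduced in miniature: if terms are stemmed at index time but the query is not, "interviews" never matches the stored "interview". The sketch below is a toy model, not the Nutch stemmer plugin; `toy_stem` is a hypothetical stand-in that only strips a trailing plural "s", and `build_index` and `search` are illustrative names.

```python
def toy_stem(term):
    """Hypothetical stand-in for a real stemmer: lowercases and strips a plural 's'."""
    term = term.lower()
    if len(term) > 3 and term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term


def build_index(documents):
    """Map each stemmed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.split():
            index.setdefault(toy_stem(word), set()).add(doc_id)
    return index


def search(index, query, stem_query=True):
    """Look up a single-word query, optionally stemming it first."""
    term = toy_stem(query) if stem_query else query.lower()
    return index.get(term, set())


docs = {1: "Recent interviews with developers"}
index = build_index(docs)
print(search(index, "interviews"))                    # query stemmed like the index
print(search(index, "interviews", stem_query=False))  # query left unstemmed
```

The point of the sketch is that the index side and the query side must apply the same transformation; a print statement in the query path, as suggested above, is one way to confirm whether that step runs.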