Nutch with nsf files

2006-07-26 Thread Deepa Devanathan
Hi guys, can Nutch parse through Lotus Notes databases (.nsf files) yet? My site uses NSFs extensively and I need to crawl the content, which includes HTML, JSPs, PDFs, etc. Will the normal crawl work? If anybody has any ideas, please let me know. Any help is greatly appreciated! Thanks,

Re: Nutch with nsf files

2006-07-26 Thread Sudhi Seshachala
I do not believe we have an nsf plugin. You will have to write a plugin to handle nsf. There is a plugin tutorial on the Nutch wiki; please refer to it. Thanks Sudhi Deepa Devanathan [EMAIL PROTECTED] wrote: hi guys, Can Nutch parse thru Lotus notes databases - .nsf files

installation de Nutch

2006-07-26 Thread kawther khazri
Hello, we want to install Nutch on an intranet network. We found an installation guide (sent along with this email), but we could not get past the first step; in fact, we did not understand this: Intranet: Configuration To configure things for intranet crawling you must: Create a

Re: Search with sponsored ads?

2006-07-26 Thread wmelo_qualidade
What do you mean by the results page? The search.jsp? Thanks - Original Message - From: Matt Timion [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Monday, July 24, 2006 9:46 AM Subject: Re: Search with sponsored ads? Should be as easy as dropping the code in the results

Nutch Problem on Godaddy.com server: Can't find bundle for base name org.nutch.jsp.search

2006-07-26 Thread WAJ
Hello everybody, I am a new Nutch user and am trying to install it on a Godaddy.com Linux server. It works on my local machine, but unfortunately I get the following error message on the Godaddy.com Linux server (test JSPs and servlets work correctly) when I run Nutch. Can anyone help me?

installation de nutch

2006-07-26 Thread kawther khazri
Hi I am trying to run Nutch by following the instructions given in the tutorial. The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2 And of course Tomcat 5 I get the following errors: [EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth 3 -topN50 run java in

Re: installation de nutch

2006-07-26 Thread Lourival Júnior
Try deleting the directory crawl in /root/nutch-0.7.2/, then run the command again. On 7/26/06, kawther khazri [EMAIL PROTECTED] wrote: Hi I am trying to run Nutch by following the instructions given in the tutorial. The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2 And of course Tomcat

[Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Sami Siren
redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but _all_ matches from the summary-website (would be okay in this

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but _all_ matches from the summary-website

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Sami Siren
Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but _all_ matches from

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Sami Siren wrote: Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but

installation de nutch

2006-07-26 Thread kawther khazri
I have already deleted the crawl directory, but I now get another error message: [EMAIL PROTECTED] ~]# rm -f -r crawl [EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth 3 -topN50 run java in /usr/lib/jvm/jre 060726 152304 parsing

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Andrzej Bialecki
Stefan Neufeind wrote: Sami Siren wrote: Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Andrzej Bialecki wrote: Stefan Neufeind wrote: Sami Siren wrote: Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are

installation de nutch

2006-07-26 Thread kawther khazri
I don't know where to create the directory urls (for the Intranet configuration), or how to edit the file urls/nutch. Please, I need your help. Thanks

0.8 much slower than 0.7

2006-07-26 Thread Vasja Ocvirk
Hello, I'm wondering if anyone can help. We injected 1000 seed URLs into Nutch 0.7.2 (basic configuration + 1000 URLs in regexp filter) and it processed them in just a few hours. We just switched to 0.8 with the same configuration and the same URLs, but it seems everything slowed down significantly.

Re: Search with sponsored ads?

2006-07-26 Thread Matt Timion
A results page displays the search results. I believe nutch's results page is search.jsp. - Original Message - From: wmelo_qualidade [EMAIL PROTECTED] To: nutch-user@lucene.apache.org Sent: Monday, July 24, 2006 9:56 AM Subject: Re: Search with sponsored ads? What do you mean by

Re: any success with php-java-bridge and Nutch?

2006-07-26 Thread Matt Timion
Wow... that's a cool site Sudhi. As for the php java bridge, My site's front end is a custom php script. It just parses the XML from http://localhost:8080/opensearch?query=urlencoded+search+string+here and then outputs it according to my php functions. It's secure (8080 is only available
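
A minimal sketch of the approach described in the message above, written in Python rather than PHP. It builds the query URL for the opensearch endpoint quoted in the message and parses a result feed into a list of dictionaries. The assumption here (not confirmed by the thread) is that the endpoint returns an RSS-style document with `item` elements holding `title`, `link`, and `description`; the function names and the sample XML are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint quoted in the message; adjust the host and port for your own setup.
BASE_URL = "http://localhost:8080/opensearch"


def build_search_url(query, base_url=BASE_URL):
    """Return the search URL with the query string URL-encoded."""
    return base_url + "?" + urlencode({"query": query})


def parse_results(xml_text):
    """Extract title, link and description from each <item> element.

    Assumes an RSS-style feed; missing child elements become empty strings.
    """
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return results


# Made-up sample feed standing in for a real response from the server.
SAMPLE = """<rss version="2.0"><channel>
<title>sample</title>
<item><title>First hit</title><link>http://example.com/a</link><description>Snippet A</description></item>
<item><title>Second hit</title><link>http://example.com/b</link></item>
</channel></rss>"""

if __name__ == "__main__":
    print(build_search_url("lotus notes"))
    for hit in parse_results(SAMPLE):
        print(hit["title"], hit["link"])
```

In a real front end the XML would be fetched from the URL returned by `build_search_url` (for example with `urllib.request.urlopen`) instead of using the sample string.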

Getting Keywords from Metatags

2006-07-26 Thread Dennis Kubes
Does anyone know how to get the keywords from the meta tags of a page? I have been looking around but it wasn't immediately apparent how to do this. Dennis

Re: stemming

2006-07-26 Thread Matthew Holt
Ok. I did this for Nutch 0.8 (had to edit the listed code some to make up for changes from .7.2 to .8 - mostly having to do with the Configuration type being needed). It partially works. If the page I'm trying to index contains the word interviews and I type in the search engine interview,

Re: stemming

2006-07-26 Thread Howie Wang
It sounds like the query-stemmer is not being called. The query string interviews needs to be processed into interview. Are you sure that your nutch-default.xml is including the query-stemmer correctly? Put print statements in to see if it's getting there. By the way, someone recently told me
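
The point made in the reply above, that the query term has to go through the same stemming as the indexed text, can be illustrated with a toy sketch. This is not the Nutch query-stemmer plugin and not a real stemming algorithm: `toy_stem` is a hypothetical rule that only strips a trailing plural "s", and the in-memory index is an assumption made purely for the example.

```python
def toy_stem(word):
    """Lower-case the word and strip a trailing plural 's' (toy rule only)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs):
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(toy_stem(token), set()).add(doc_id)
    return index


def search(index, query, stem_query=True):
    """Return ids of documents matching every query term.

    With stem_query=False the raw terms are looked up, which mimics a
    query-side stemmer that is never called.
    """
    terms = [toy_stem(t) if stem_query else t.lower() for t in query.split()]
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


if __name__ == "__main__":
    index = build_index({"page1": "job interviews today"})
    print(search(index, "interviews"))                    # query is stemmed
    print(search(index, "interviews", stem_query=False))  # query is not stemmed
```

Because the index stores only stemmed tokens, an unstemmed query term has nothing to match against, which is consistent with the symptom of the query side not being processed.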