Re: stemming

2006-07-26 Thread Howie Wang
It sounds like the query-stemmer is not being called. The query string "interviews" needs to be processed into "interview". Are you sure that your nutch-default.xml is including the query-stemmer correctly? Put print statements in to see if it's getting there. By the way, someone recently told me

Re: stemming

2006-07-26 Thread Matthew Holt
Ok. I did this for Nutch 0.8 (I had to edit the listed code somewhat to account for changes from 0.7.2 to 0.8, mostly because the Configuration type is now needed). It partially works. If the page I'm trying to index contains the word "interviews" and I type in the search engine "interview"
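The symptom discussed in this thread (a stemmed index combined with an unstemmed query) can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_stem`, `build_index` and `search` are illustration-only names, and the stemmer is a toy that strips one plural "s". It is not the Nutch query-stemmer plugin or any Lucene analyzer.

```python
def toy_stem(word):
    """Hypothetical stemmer: lowercases and strips a single plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs, analyze):
    """Map each analyzed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(analyze(token), set()).add(doc_id)
    return index


def search(index, query, analyze):
    """Look up a single-word query after applying the query-side analyzer."""
    return index.get(analyze(query), set())


# Assumed sample documents, for illustration only.
docs = {1: "Interviews with the developers", 2: "An interview about search"}
stemmed_index = build_index(docs, toy_stem)

# Query analyzed the same way as the index: both documents match.
both_stemmed = search(stemmed_index, "interviews", toy_stem)

# Query only lowercased (stemmer not applied): the term "interviews"
# does not exist in the stemmed index, so nothing matches.
query_unstemmed = search(stemmed_index, "interviews", str.lower)
```

The point of the sketch is only that the index side and the query side have to apply the same transformation; which of the two is misconfigured in a real installation has to be checked there.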

Getting Keywords from Metatags

2006-07-26 Thread Dennis Kubes
Does anyone know how to get the keywords from the meta tags of a page? I have been looking around but it wasn't immediately apparent how to do this. Dennis
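As a general illustration of the task (not Nutch's parser API, which this listing does not show), here is a minimal sketch that pulls the `keywords` meta tag out of an HTML page with Python's standard `html.parser`. The class name `MetaKeywordsParser`, the helper `extract_keywords` and the sample page are assumptions made for the example.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the comma-separated values of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            self.keywords.extend(
                k.strip() for k in content.split(",") if k.strip()
            )


def extract_keywords(html):
    """Return the list of keywords found in the page's meta tags."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    return parser.keywords


# Assumed sample page, for illustration only.
sample = (
    '<html><head><meta name="Keywords" content="nutch, search, crawler">'
    "</head><body></body></html>"
)
keywords = extract_keywords(sample)
```

Inside Nutch itself the equivalent logic would presumably live in a parse or indexing plugin; that part is not sketched here.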

Re: any success with php-java-bridge and Nutch?

2006-07-26 Thread Matt Timion
Wow... that's a cool site, Sudhi. As for the PHP-Java bridge, my site's front end is a custom PHP script. It just parses the XML from http://localhost:8080/opensearch?query=urlencoded+search+string+here and then outputs it according to my PHP functions. It's secure (8080 is only available to
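The approach described above (URL-encode the query, request the XML, parse it, render it yourself) can be sketched as follows. The original front end is PHP; this sketch uses Python for illustration. The helper names and the sample response are assumptions: the response is taken to be an RSS-style document with `item`, `title` and `link` elements, so check the real output of your own server before relying on this layout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_search_url(query, base="http://localhost:8080/opensearch"):
    """Build the search URL with the query string URL-encoded."""
    return base + "?" + urlencode({"query": query})


def parse_results(xml_text):
    """Extract title and link from every <item> in the response."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


# Assumed sample response, for illustration only.
SAMPLE = """<rss version="2.0"><channel><title>sample</title>
<item><title>First hit</title><link>http://example.com/a</link></item>
<item><title>Second hit</title><link>http://example.com/b</link></item>
</channel></rss>"""

url = build_search_url("open source jobs")
results = parse_results(SAMPLE)
```

In a live setup the XML text would come from fetching `url` (for example with `urllib.request.urlopen`) instead of the hard-coded sample.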

Re: Search with sponsored ads?

2006-07-26 Thread Matt Timion
A results page displays the search results. I believe nutch's results page is search.jsp. - Original Message - From: "wmelo_qualidade" <[EMAIL PROTECTED]> To: Sent: Monday, July 24, 2006 9:56 AM Subject: Re: Search with sponsored ads? What do you mean by the "results page"? The "s

0.8 much slower than 0.7

2006-07-26 Thread Vasja Ocvirk
Hello, I'm wondering if anyone can help. We injected 1000 seed URLs into Nutch 0.7.2 (basic configuration + 1000 URLs in the regexp filter) and it processed them in just a few hours. We just switched to 0.8 with the same configuration and the same URLs, but it seems everything slowed down significantly. Crawl

installation de nutch

2006-07-26 Thread kawther khazri
I don't know where to create the urls directory (for the intranet configuration) or how to edit the file urls/nutch. Please, I need your help. Thanks

Howto deploy a ROOT.war (if needed)

2006-07-26 Thread NG-Marketing, M.Schneider
Hello List, after moving the ROOT.war file into the Tomcat5 webapps directory and restarting Tomcat, the ROOT.war will be extracted into the ROOT directory automatically. Then, after changing, for example, my search.jsp and restarting the service again, it all works fine with my new search.jsp.

Re: any success with php-java-bridge and Nutch?

2006-07-26 Thread Sudhi Seshachala
For http://www.myopensourcejobs.com, we are using something similar to OpensourceXML. That works like a champ. I am not sure if the PHP-Java bridge would make any difference in terms of perf. Thanks Sudhi Stefan Neufeind <[EMAIL PROTECTED]> wrote: Chris Stephens wrote: > Has anyone had succes

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Andrzej Bialecki wrote: > Stefan Neufeind wrote: >> Sami Siren wrote: >> >>> Stefan Neufeind wrote: >>> Sami Siren wrote: > redirecting to nutch-user... > >> What I currently have is that max. 2 matches are shown per website - >> but >> that also fr

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Andrzej Bialecki
Stefan Neufeind wrote: Sami Siren wrote: Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be

installation de nutch

2006-07-26 Thread kawther khazri
I have already deleted the crawl directory, but I get another error message: [EMAIL PROTECTED] ~]# rm -f -r crawl [EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth 3 -topN50 run java in /usr/lib/jvm/jre 060726 152304 parsing file:/root/nutch-0.7.2/conf/nutch

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Sami Siren wrote: > Stefan Neufeind wrote: >> Sami Siren wrote: >> >>> redirecting to nutch-user... >>> >>> What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show on

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Sami Siren
Stefan Neufeind wrote: Sami Siren wrote: redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but _all_ matches from the

Re: [Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Stefan Neufeind
Sami Siren wrote: > redirecting to nutch-user... > >> What I currently have is that max. 2 matches are shown per website - but >> that also from the summary-website only 2 matches are shown. Either I'd >> need to be able to show only 2 matches per website but _all_ matches >> from the summary-webs

[Fwd: [Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section]]

2006-07-26 Thread Sami Siren
redirecting to nutch-user... What I currently have is that max. 2 matches are shown per website - but that also from the summary-website only 2 matches are shown. Either I'd need to be able to show only 2 matches per website but _all_ matches from the summary-website (would be okay in this case)

Re: installation de nutch

2006-07-26 Thread Lourival Júnior
Try deleting the crawl directory in /root/nutch-0.7.2/. Then run the command again. On 7/26/06, kawther khazri <[EMAIL PROTECTED]> wrote: Hi I am trying to run Nutch by following the instructions given in the tutorial. The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2 And of course Tomcat

installation de nutch

2006-07-26 Thread kawther khazri
Hi, I am trying to run Nutch by following the instructions given in the tutorial. The environment is Fedora 5, JDK 1.4.2, Nutch 0.7.2 and, of course, Tomcat 5. I get the following errors: [EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth 3 -topN50 run java in /

Nutch Problem on Godaddy.com server: Can't find bundle for base name org.nutch.jsp.search

2006-07-26 Thread WAJ
Hello everybody, I am a new Nutch user and am trying to install it on a Godaddy.com Linux server. It works on my local machine, but unfortunately I am getting the following error message on the Godaddy.com Linux server (test JSPs and servlets work correctly) when I run Nutch. Can anyone help me?

Re: Search with sponsored ads?

2006-07-26 Thread wmelo_qualidade
What do you mean by the "results page"? The "search.jsp"? Thanks - Original Message - From: "Matt Timion" <[EMAIL PROTECTED]> To: Sent: Monday, July 24, 2006 9:46 AM Subject: Re: Search with sponsored ads? Should be as easy as dropping the code in the results pages. Check out my

installation de Nutch

2006-07-26 Thread kawther khazri
Hello, we want to install Nutch on an intranet network. We found an installation guide (sent with this email), but we did not get past the first step. In fact, we did not understand this: Intranet: Configuration To configure things for intranet crawling you must: Create a direct

Re: Nutch with nsf files

2006-07-26 Thread Sudhi Seshachala
I do not believe we have an nsf plugin. You will have to write a plugin to be able to handle nsf. There is a plugin tutorial on the Nutch wiki. Please refer to it. Thanks Sudhi Deepa Devanathan <[EMAIL PROTECTED]> wrote: hi guys, Can Nutch parse thru Lotus notes databases - .nsf files

Nutch with nsf files

2006-07-26 Thread Deepa Devanathan
Hi guys, can Nutch parse through Lotus Notes databases (.nsf files) yet? My site uses .nsf files extensively and I need to crawl the content, which includes HTML, JSPs, PDFs, etc. Will the normal crawl work? If anybody has any ideas, please let me know. Any help is greatly appreciated! Thanks, Dee