It sounds like the query-stemmer is not being called.
The query string "interviews" needs to be processed
into "interview". Are you sure that your nutch-default.xml
is including the query-stemmer correctly? Put print statements
in to see if it's getting there.
By the way, someone recently told me
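The mismatch described above (a page indexed under "interview" while the query stays "interviews") can be illustrated with a small sketch. This is not the query-stemmer plugin's code: `naive_stem`, `build_index`, and `search` are hypothetical helpers, and the plural-stripping rule is a toy stand-in for a real stemming algorithm. It only shows why the query must pass through the same stemmer as the indexed text.

```python
def naive_stem(word: str) -> str:
    """Toy stemmer (assumption, not a real algorithm): strip one trailing plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs):
    """Map each stemmed token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(naive_stem(token), set()).add(doc_id)
    return index


def search(index, query, stem_query=True):
    """Look up a single-term query, optionally stemming it first."""
    term = naive_stem(query) if stem_query else query.lower()
    return sorted(index.get(term, set()))


# Hypothetical one-page corpus; the index stores "interview", not "interviews".
docs = {"page1": "Recent interviews with developers"}
index = build_index(docs)
```

With the stemmer applied to the query, `search(index, "interviews")` finds `page1`; with `stem_query=False` the raw term "interviews" has no entry in the stemmed index, which is the symptom to expect when the query-side stemmer is never called.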
OK. I did this for Nutch 0.8 (I had to edit the listed code somewhat to make
up for changes from 0.7.2 to 0.8, mostly having to do with the
Configuration type being needed).
It partially works.
If the page I'm trying to index contains the word "interviews" and I
type in the search engine "interview"
Does anyone know how to get the keywords from the meta tags of a page?
I have been looking around but it wasn't immediately apparent how to do
this.
Dennis
Wow... that's a cool site Sudhi.
As for the PHP-Java bridge, my site's front end is a custom PHP script. It
just parses the XML from
http://localhost:8080/opensearch?query=urlencoded+search+string+here and
then outputs it according to my PHP functions.
It's secure (8080 is only available to
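The approach described above (fetch the XML from the opensearch URL, then render it with your own functions) can be sketched in Python as follows. This is a minimal sketch, not the poster's PHP script. It assumes the endpoint returns an RSS 2.0 document with `item` elements carrying `title`, `link`, and `description`; the `SAMPLE` document, `parse_results`, and `fetch_results` are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the URL quoted in the message; adjust for your setup.
OPENSEARCH_URL = "http://localhost:8080/opensearch"


def parse_results(xml_text):
    """Extract title, link and description from every <item> (RSS 2.0 assumed)."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return results


def fetch_results(query):
    """Fetch and parse results for a query; needs a running search webapp."""
    url = OPENSEARCH_URL + "?" + urlencode({"query": query})
    with urlopen(url) as response:
        return parse_results(response.read())


# Made-up response used only to exercise the parser offline.
SAMPLE = """<rss version="2.0"><channel><title>Nutch: example</title>
<item><title>Example page</title><description>A snippet</description>
<link>http://example.com/</link></item>
</channel></rss>"""
```

`parse_results(SAMPLE)` returns one dictionary per item, which a front end can then format however it likes. `urlencode` produces the `query=urlencoded+search+string` form shown in the URL above.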
A results page displays the search results. I believe nutch's results page
is search.jsp.
- Original Message -
From: "wmelo_qualidade" <[EMAIL PROTECTED]>
To:
Sent: Monday, July 24, 2006 9:56 AM
Subject: Re: Search with sponsored ads?
What do you mean by the "results page"? The "s
Hello,
I'm wondering if anyone can help. We injected 1000 seed URLs into Nutch
0.7.2 (basic configuration + 1000 URLs in regexp filter) and it
processed them in just a few hours. We just switched to 0.8 with the same
configuration and the same URLs, but it seems everything slowed down
significantly. Crawl
I don't know where to create the urls directory (for the intranet
configuration) or how to edit the file urls/nutch.
Please, I need your help.
Thanks
Hello List,
after moving the ROOT.war file in the Tomcat5 webapps directory and
restarting Tomcat, the ROOT.war will be extracted into the ROOT directory
automatically.
Then, after changing, for example, my search.jsp and restarting the service
again, everything works fine with my new search.jsp.
For http://www.myopensourcejobs.com, we are using something similar to OpensourceXML.
That works like a champ.
I am not sure whether the PHP-Java bridge would make any difference in terms of performance.
Thanks
Sudhi
Stefan Neufeind <[EMAIL PROTECTED]> wrote:
Chris Stephens wrote:
> Has anyone had succes
Andrzej Bialecki wrote:
> Stefan Neufeind wrote:
>> Sami Siren wrote:
>>
>>> Stefan Neufeind wrote:
>>>
Sami Siren wrote:
> redirecting to nutch-user...
>
>> What I currently have is that max. 2 matches are shown per website -
>> but
>> that also fr
Stefan Neufeind wrote:
Sami Siren wrote:
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website -
but
that also from the summary-website only 2 matches are shown. Either I'd
need to be
I have already deleted the crawl directory, but I get another error
message:
[EMAIL PROTECTED] ~]# rm -f -r crawl
[EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth
3 -topN50
run java in /usr/lib/jvm/jre
060726 152304 parsing file:/root/nutch-0.7.2/conf/nutch
Sami Siren wrote:
> Stefan Neufeind wrote:
>> Sami Siren wrote:
>>
>>> redirecting to nutch-user...
>>>
>>>
What I currently have is that max. 2 matches are shown per website -
but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show on
Stefan Neufeind wrote:
Sami Siren wrote:
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website - but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but _all_ matches
from the
Sami Siren wrote:
> redirecting to nutch-user...
>
>> What I currently have is that max. 2 matches are shown per website - but
>> that also from the summary-website only 2 matches are shown. Either I'd
>> need to be able to show only 2 matches per website but _all_ matches
>> from the summary-webs
redirecting to nutch-user...
What I currently have is that max. 2 matches are shown per website - but
that also from the summary-website only 2 matches are shown. Either I'd
need to be able to show only 2 matches per website but _all_ matches
from the summary-website (would be okay in this case)
Try deleting the crawl directory in /root/nutch-0.7.2/, then run the command
again.
On 7/26/06, kawther khazri <[EMAIL PROTECTED]> wrote:
Hi
I am trying to run Nutch by following the instructions
given in the tutorial.
The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2
And of course Tomcat
Hi
I am trying to run Nutch by following the instructions
given in the tutorial.
The environment is FEDORA 5, JDK 1.4.2 and Nutch 0.7.2
And of course Tomcat 5
I get the following errors:
[EMAIL PROTECTED] ~]# /root/nutch-0.7.2/bin/nutch crawl urls -dir crawl -depth
3 -topN50
run java in /
Hello everybody,
I am a new Nutch user and am trying to install it on a Godaddy.com Linux server.
It works on my local machine, but unfortunately I am getting the following
error message on the Godaddy.com Linux server (test JSPs and servlets work
correctly) when I run Nutch. Can anyone help me?
What do you mean by the "results page"? The "search.jsp"?
Thanks
- Original Message -
From: "Matt Timion" <[EMAIL PROTECTED]>
To:
Sent: Monday, July 24, 2006 9:46 AM
Subject: Re: Search with sponsored ads?
Should be as easy as dropping the code in the results pages.
Check out my
Hello, we want to install Nutch on an intranet network. We found an installation guide (we sent it with the email), but we did not manage to complete the first step; in fact, we did not understand this: Intranet: Configuration To configure things for intranet crawling you must: Create a direct
I do not believe we have an nsf plugin.
You will have to write a plugin to handle .nsf files. There is a plugin
tutorial on the Nutch wiki; please refer to it.
Thanks
Sudhi
Deepa Devanathan <[EMAIL PROTECTED]> wrote:
hi guys,
Can Nutch parse thru Lotus notes databases - .nsf files
Hi guys,
Can Nutch parse through Lotus Notes databases (.nsf files) yet?
My site uses .nsf files extensively, and I need to crawl the content, which
includes HTML, JSPs, PDFs, etc.
Will the normal crawl work? If anybody has any ideas, please let me know.
Any help is greatly appreciated!
Thanks,
Dee
23 matches