Thanks for your quick response.

This brings me to another question. As far as I know, Nutch can take
care of crawling as well as indexing. So why go through the hassle of
crawling with Nutch and then integrating it with Solr?

Another question: Solr returns search results in XML format. Are there
any ready-made tools to convert them directly into web pages for
visitors to see?
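
To make the question concrete, here is the kind of thing I have in mind,
as a rough Python sketch (assuming the example server at
localhost:8983 and single-valued id and title fields). Is there a
ready-made tool that does this better?

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

def search_as_html(query):
    # Query Solr's standard /select handler; the response comes back as XML.
    params = urllib.parse.urlencode({"q": query, "rows": 10})
    url = "http://localhost:8983/solr/select?" + params
    with urllib.request.urlopen(url) as resp:
        tree = ET.parse(resp)
    # Pull single-valued fields out of each <doc>; multi-valued <arr>
    # fields and HTML escaping are left out to keep the sketch short.
    items = []
    for doc in tree.findall(".//result/doc"):
        fields = {f.get("name"): (f.text or "") for f in doc}
        items.append("<li>%s (id: %s)</li>"
                     % (fields.get("title", ""), fields.get("id", "")))
    return "<ul>\n%s\n</ul>" % "\n".join(items)

print(search_as_html("intranet"))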

On 6/7/07, Ian Holsman <[EMAIL PROTECTED]> wrote:
Hi Manoharam.

We use Nutch to do the crawl, and have used Sami's patch of Nutch
(http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html)
to integrate it with Solr. It works quite well for our needs.

If you are concerned about speed, Solr also has a CSV upload facility
which you might be able to use to load the data instead, but we haven't
found the HTTP POST speed to be an issue for us.
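
To give you a rough idea of the CSV route, something like the Python
sketch below is all it takes (assuming the example server at
localhost:8983 and that the CSV handler is mapped at /update/csv in
your solrconfig.xml; adjust the column names to your schema):

import urllib.request

# A couple of rows shaped like the target schema (id, title, text fields).
csv_data = (
    "id,title,text\n"
    "1,Intranet home,Welcome to the intranet\n"
    "2,IT handbook,How to reset your password\n"
)

# POST the CSV body to the CSV update handler; commit=true makes the
# new documents searchable right away.
req = urllib.request.Request(
    "http://localhost:8983/solr/update/csv?commit=true",
    data=csv_data.encode("utf-8"),
    headers={"Content-Type": "text/plain; charset=utf-8"},
)
print(urllib.request.urlopen(req).read().decode("utf-8"))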

Regards
Ian


Manoharam Reddy wrote:
> I have just begun using Solr. I see that we have to insert documents
> by posting XML to solr/update.
>
> I would like to know how Solr is used as a search engine in
> enterprises. How do you crawl your intranet and pass the information
> as XML to solr/update? Isn't this going to be slow, putting all the
> content into the index via HTTP POST requests that each require
> network sockets to be opened?
>
> Isn't there any direct way to do the same thing without resorting
> to HTTP?
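
For reference, the XML update format being discussed is small. A
minimal sketch of adding one document and committing, assuming the
example server at localhost:8983 and a schema with id and text fields:

import urllib.request

def post_xml(xml):
    # POST a raw XML message to Solr's update handler.
    req = urllib.request.Request(
        "http://localhost:8983/solr/update",
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    return urllib.request.urlopen(req).read().decode("utf-8")

add = (
    "<add>"
    "<doc>"
    "<field name='id'>doc-1</field>"
    "<field name='text'>Page text extracted by the crawler</field>"
    "</doc>"
    "</add>"
)
print(post_xml(add))           # index the document
print(post_xml("<commit/>"))   # make it visible to searches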
>

