Re: Arguments for Solr implementation at public web site

Jon Baer Fri, 13 Nov 2009 03:14:26 -0800

For this list I usually end up @ http://solr.markmail.org (which I believe also 
uses Lucene under the hood)

Google is such a black box ... 

Pros:
+ 1 Open Source (enough said :-)

There also always seems to be the notion that "crawling" lends itself to 
producing the best results, but that is rarely the case.  And unless you are a 
"special" type of site, Google will not overlay your results w/ some type of 
context in the search (e.g. news or sports).

What I think really needs to happen in Solr (and is a bit missing @ the moment) 
is a common interface for "reindexing" another index (if that makes sense) ... 
something akin to OpenSearch 
(http://www.opensearch.org/Community/OpenSearch_software)

For example, what I would like to do is have my site, have my search index, and 
point Google at just my search index for indexing (and not have it crawl the 
site) ... the only current option for something like that is sitemaps, which I 
think Solr (templates) should have a contrib project for (but you would have to 
generate these offline for sure).
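
To make the sitemap idea a bit more concrete, here is a rough sketch of the 
offline generation step. It is not an existing Solr contrib, just an 
illustration: it assumes you have already pulled the documents out of your 
index as plain dicts, and the field names "url" and "last_modified" are 
hypothetical and would depend on your own schema. The output follows the 
sitemaps.org 0.9 format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(docs):
    """Build a sitemap XML string from documents exported from a search index.

    `docs` is an iterable of dicts. The keys "url" and "last_modified" are
    hypothetical field names; substitute whatever your schema stores.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for doc in docs:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" on serialization.
        ET.SubElement(url, "loc").text = doc["url"]
        # <lastmod> is optional in the protocol, so emit it only when known.
        if doc.get("last_modified"):
            ET.SubElement(url, "lastmod").text = doc["last_modified"]
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    sample_docs = [
        {"url": "http://example.com/item/1", "last_modified": "2009-11-13"},
        {"url": "http://example.com/item/2"},
    ]
    print(build_sitemap(sample_docs))
```

You would run something like this from a scheduled job after each reindex and 
write the result to sitemap.xml, so the crawler sees the same set of documents 
the search index does.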

- Jon  

On Nov 13, 2009, at 6:00 AM, Lukáš Vlček wrote:

> Hi,
> 
> thanks for inputs so far... however, let's put it this way:
> 
> When you need to search for something Lucene or Solr related, which one do
> you use:
> - generic Google
> - go to a particular mailing list web site and search from there (if there is
> any search form at all)
> - go to LucidImagination.com and use its search capability
> 
> Regards,
> Lukas
> 
> 
> On Fri, Nov 13, 2009 at 11:50 AM, Andrew Clegg <andrew.cl...@gmail.com> wrote:
> 
>> 
>> 
>> Lukáš Vlček wrote:
>>> 
>>> I am looking for good arguments to justify implementing a search for sites
>>> which are available on the public internet. There are many sites in the
>>> "powered by Solr" section which are indexed by Google and other search
>>> engines, but they still decided to invest resources into building and
>>> maintaining their own search functionality rather than going with a
>>> [user_query site:my_site.com] Google search. Why?
>>> 
>> 
>> You're assuming that Solr is just used in these cases to index discrete web
>> pages which Google etc. would be able to access via following navigational
>> links.
>> 
>> I would imagine that in a lot of cases, Solr is used to index database
>> entities which are used to build [parts of] pages dynamically, and which
>> might be viewable in different forms in various different pages.
>> 
>> Plus, with stored fields, you have the option of actually driving a website
>> off Solr instead of directly off a database, which might make sense from a
>> speed perspective in some cases.
>> 
>> And further, going back to page-only indexing -- you have no guarantee when
>> Google will decide to recrawl your site, so there may be a delay before
>> changes show up in their index. With an in-house search engine you can
>> reindex as often as you like.
>> 
>> Andrew.
>> 
>> --
>> View this message in context:
>> http://old.nabble.com/Arguments-for-Solr-implementation-at-public-web-site-tp26333987p26334734.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
>>
