Re: I was at a search vendor round table today...

2010-09-22 Thread Alexander Kanarsky
>  He said some other things about a huge petabyte hosted search collection 
> they have used by banks..

In context of your discussion this reference sounds really, really funny... :)

-Alexander

On Wed, Sep 22, 2010 at 1:17 PM, Grant Ingersoll wrote:
>
> On Sep 22, 2010, at 2:04 PM, Smiley, David W. wrote:
>
>> (I don't twitter or blog so I thought I'd send this message here)
>>
>> Today at work (at MITRE outside DC) there was (is) a day of technical 
>> presentations about topics related to information dissemination and 
>> discovery (broad squishy words there, but mostly covered "search") at which 
>> I spoke about the value of faceting, and gave a quick Solr pitch.  There was 
>> an hour vendor panel in which a representative from Autonomy, Microsoft 
>> (i.e. FAST), Google, Vivisimo, and Endeca had the opportunity to espouse the 
>> virtues of their product, and fit in an occasional jab at their competitors 
>> next to them.  In the absence of a suitable representative for Solr (e.g. 
>> Lucid) I pointed out how open-source Solr has "democratized" (i.e. made 
>> free) search and faceting when it used to require paying lots of money.  And 
>> I asked them how their products have reacted to this new reality.  Autonomy 
>> acknowledged they used to make millions on simple engagements in the distant 
>> past but that isn't the case these days.  He said some other things about a 
>> huge petabyte hosted search collection they have used by banks... I forget 
>> what else he said.  I forgot what Google said.  Vivisimo quoted Steve 
>> Ballmer, saying "open source is as free as a free puppy" (not a bad point 
>> IMO).
>
> Too funny.  Hadn't heard that one before.  Presumably meaning you have to 
> care and feed it, despite the fact that you really do love it and it is cute 
> as hell?  The care and feeding is true of the commercial ones, too, 
> especially in terms of cost for supporting features you never use, but love 
> (as in we love using this tool) is usually not a word I hear associated in 
> those respects too often, but of course that is likely self selecting.
>
>> Endeca claimed to be happy Solr exists because it raises the awareness of 
>> faceted search, but then claimed it would not scale and they should then 
>> upgrade to Endeca.  (!)  I found that claim ridiculous, of course.
>
> Having replaced all the above on a number of occasions w/ Solr at both a 
> significant cost savings on licensing, dev time, and hardware, I would agree 
> that claim is quite ridiculous.  Besides, in my experience, the scale claim 
> is silly.  Everyone (customers) says they need scale, but few of them really 
> know what scale is, so it is all relative.   For some, scale is 1M docs, for 
> others it's 1B+ docs;  for others it's 100K queries per day, for others it's 
> 100M per day.  (BTW, I've seen Lucene/Solr do both, just fine.  Not that it 
> is a free lunch, but neither are the other ones despite what they say.)
>
>>
>> Speaking of performance, on a large scale search project where we're using 
>> Solr in place of a MarkLogic prototype (because ML is so friggin expensive, 
>> for one reason), the search results were so fast (~150ms) vs. the ML's 
>> results of 2-3 seconds, that the UI engineers building the interface on top 
>> of the XML output thought Solr was broken because it was so fast.  The quote 
>> was "It's so fast, it's broken".    In other words, they were used to 2-3 
>> second response times and so if the results came back as fast as what Solr 
>> has been doing, then surely there's a bug.  There's no bug.  :)  Admittedly, 
>> I think it was a bit of an apples and oranges comparison but I love that 
>> quote nonetheless.
>
>
> I love it.  I have had the same experience where people think it's broken b/c 
> it's so fast.  Large vendor named above took 24 hours to index 4M records 
> (they weren't even doing anything fancy on the indexing side) and search was 
> slow too.  Solr took about 40 minutes to index all the content and search was 
> blazing.  Same content, faster indexing, better search results, a lot less 
> time.
>
> At any rate, enough of tooting our own horn.  Thanks for sharing!
>
> -Grant
>
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
>


Re: I was at a search vendor round table today...

2010-09-22 Thread Walter Underwood
On Sep 22, 2010, at 11:04 AM, Smiley, David W. wrote:

> Speaking of performance, on a large scale search project where we're using 
> Solr in place of a MarkLogic prototype (because ML is so friggin expensive, 
> for one reason), the search results were so fast (~150ms) vs. the ML's 
> results of 2-3 seconds, that the UI engineers building the interface on top 
> of the XML output thought Solr was broken because it was so fast.  The quote 
> was "It's so fast, it's broken".  In other words, they were used to 2-3 
> second response times and so if the results came back as fast as what Solr 
> has been doing, then surely there's a bug.  There's no bug.  :) Admittedly, I 
> think it was a bit of an apples and oranges comparison but I love that quote 
> nonetheless.

I implemented Solr at Netflix and now I work at MarkLogic, and I strongly agree 
that the comparison is apples and oranges. MarkLogic does run very fast on very 
large datasets, so maybe that prototype was built to show functionality instead 
of speed. Also, MarkLogic already has a lot of stuff that is still in the 
future for Solr, like true real-time search, updating fields, and geospatial 
search.

Next time, invite the MarkLogic people, too. :-)

wunder
--
Walter Underwood
Lead Engineer
MarkLogic



Re: I was at a search vendor round table today...

2010-09-22 Thread Grant Ingersoll

On Sep 22, 2010, at 2:04 PM, Smiley, David W. wrote:

> (I don't twitter or blog so I thought I'd send this message here)
> 
> Today at work (at MITRE outside DC) there was (is) a day of technical 
> presentations about topics related to information dissemination and discovery 
> (broad squishy words there, but mostly covered "search") at which I spoke 
> about the value of faceting, and gave a quick Solr pitch.  There was an hour 
> vendor panel in which a representative from Autonomy, Microsoft (i.e. FAST), 
> Google, Vivisimo, and Endeca had the opportunity to espouse the virtues of 
> their product, and fit in an occasional jab at their competitors next to 
> them.  In the absence of a suitable representative for Solr (e.g. Lucid) I 
> pointed out how open-source Solr has "democratized" (i.e. made free) search 
> and faceting when it used to require paying lots of money.  And I asked them 
> how their products have reacted to this new reality.  Autonomy acknowledged 
> they used to make millions on simple engagements in the distant past but that 
> isn't the case these days.  He said some other things about a huge petabyte 
> hosted search collection they have used by banks... I forget what else he 
> said.  I forgot what Google said.  Vivisimo quoted Steve Ballmer, saying 
> "open source is as free as a free puppy" (not a bad point IMO).  

Too funny.  Hadn't heard that one before.  Presumably meaning you have to care 
and feed it, despite the fact that you really do love it and it is cute as 
hell?  The care and feeding is true of the commercial ones, too, especially in 
terms of cost for supporting features you never use, but love (as in we love 
using this tool) is usually not a word I hear associated in those respects too 
often, but of course that is likely self selecting.  

> Endeca claimed to be happy Solr exists because it raises the awareness of 
> faceted search, but then claimed it would not scale and they should then 
> upgrade to Endeca.  (!)  I found that claim ridiculous, of course.

Having replaced all the above on a number of occasions w/ Solr at both a 
significant cost savings on licensing, dev time, and hardware, I would agree 
that claim is quite ridiculous.  Besides, in my experience, the scale claim is 
silly.  Everyone (customers) says they need scale, but few of them really know 
what scale is, so it is all relative.   For some, scale is 1M docs, for others 
it's 1B+ docs;  for others it's 100K queries per day, for others it's 100M per 
day.  (BTW, I've seen Lucene/Solr do both, just fine.  Not that it is a free 
lunch, but neither are the other ones despite what they say.)

> 
> Speaking of performance, on a large scale search project where we're using 
> Solr in place of a MarkLogic prototype (because ML is so friggin expensive, 
> for one reason), the search results were so fast (~150ms) vs. the ML's 
> results of 2-3 seconds, that the UI engineers building the interface on top 
> of the XML output thought Solr was broken because it was so fast.  The quote 
> was "It's so fast, it's broken".  In other words, they were used to 2-3 
> second response times and so if the results came back as fast as what Solr 
> has been doing, then surely there's a bug.  There's no bug.  :)  Admittedly, 
> I think it was a bit of an apples and oranges comparison but I love that 
> quote nonetheless.


I love it.  I have had the same experience where people think it's broken b/c 
it's so fast.  Large vendor named above took 24 hours to index 4M records (they 
weren't even doing anything fancy on the indexing side) and search was slow 
too.  Solr took about 40 minutes to index all the content and search was 
blazing.  Same content, faster indexing, better search results, a lot less 
time. 

At any rate, enough of tooting our own horn.  Thanks for sharing!

-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/