Nagendra

You wrote,

Naveen:

*NRT with Apache Solr 3.3 and RankingAlgorithm does need a commit for a
document to become searchable*. Any document that you add through update
becomes  immediately searchable. So no need to commit from within your
update client code.  Since there is no commit, the cache does not have to be
cleared or the old searchers closed or  new searchers opened, and warmed
(error that you are facing).


The link you mentioned is clearly what we wanted. But you wrote "RA does
need a commit for a document to become searchable" (please take a look at
the bold sentence above).

In the future, under heavier load, can it support master/slave replication
and similar setups in order to scale and perform better? If yes, we would
like to go with NRT; the performance described in the article is acceptable.
We were expecting the same real-time performance for a single user.

What about multiple users? Should we wait 1-2 secs before issuing the curl
request so that SOLR performs better, or will it handle multiple concurrent
requests internally (multithreading, etc.)?
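
A minimal sketch of the "wait 1-2 secs" idea on the client side: commit requests from many users are coalesced so that at most one commit is forwarded per interval. The names CommitCoalescer, send_commit, and min_interval are hypothetical and not part of Solr; send_commit stands in for whatever issues the curl commit request.

```python
import threading
import time


class CommitCoalescer:
    """Client-side throttle: forwards at most one commit per min_interval seconds.

    Hypothetical helper, not part of Solr. send_commit is any callable that
    issues the real commit request (for example, the curl call).
    """

    def __init__(self, send_commit, min_interval=2.0, clock=time.monotonic):
        self._send_commit = send_commit
        self._min_interval = min_interval
        self._clock = clock
        self._lock = threading.Lock()
        self._last_commit = None  # clock value of the last forwarded commit
        self._pending = False     # True if a request was skipped since then

    def _due(self, now):
        # A commit is allowed if none was sent yet or the interval has elapsed.
        return self._last_commit is None or now - self._last_commit >= self._min_interval

    def request_commit(self):
        """Call after a user's batch is posted. Returns True if a commit was sent."""
        with self._lock:
            now = self._clock()
            if not self._due(now):
                self._pending = True  # too soon; remember that data is waiting
                return False
            self._last_commit = now
            self._pending = False
        self._send_commit()
        return True

    def flush(self):
        """Call periodically. Sends one commit if a request was skipped and the
        interval has elapsed. Returns True if a commit was sent."""
        with self._lock:
            now = self._clock()
            if not self._pending or not self._due(now):
                return False
            self._last_commit = now
            self._pending = False
        self._send_commit()
        return True
```

Each indexing client would call request_commit() after posting its batch, and a periodic timer would call flush() so that skipped requests still become visible. Whether this helps at all depends on how the server handles overlapping commits.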

What doc size (say 10,000 docs) would allow the JVM to perform better? Have
you done any benchmarking of NRT for multithreaded, multi-user use, and any
JVM tuning for SOLR server performance? Any kind of performance analysis
would help us decide quickly whether to switch over to NRT.

Questions about switching over to NRT:


1. Should we upgrade to SOLR 4.x?

2. Any benchmarking (10,000 docs/sec)? The question here is more specific
to the details of an individual doc (fields, number of fields, field sizes,
parameters affecting performance with or without faceting).

3. What about multiple users?

A single user might, in real time, have a large doc count of 0.1 million.
How should we break this down and analyze which approach is better (though
that is our task to do)? Still, any kind of breakdown will help us. Imagine
a user inbox.

4. JVM tuning and performance result based on Multithreaded environment.

5. Machine Details (RAM, CPU, and settings from SOLR perspective).
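
To make points 2 to 4 concrete, this is roughly the kind of measurement in question: docs/sec and per-batch latency with several users indexing at once. It is only a sketch. index_batch is a hypothetical placeholder for whatever posts one batch of documents to SOLR, and the user and batch counts are made-up parameters.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(index_batch, users, batches_per_user, docs_per_batch):
    """Simulate `users` concurrent clients and report overall throughput.

    index_batch(user_id, batch_no, docs_per_batch) is a hypothetical callable
    that posts one batch to the server; plug the real client in there.
    """
    if users < 1 or batches_per_user < 1 or docs_per_batch < 1:
        raise ValueError("users, batches_per_user and docs_per_batch must be >= 1")

    def one_user(user_id):
        # Each simulated user posts its batches one after another and
        # records how long every batch took.
        latencies = []
        for batch_no in range(batches_per_user):
            start = time.perf_counter()
            index_batch(user_id, batch_no, docs_per_batch)
            latencies.append(time.perf_counter() - start)
        return latencies

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    wall = time.perf_counter() - wall_start

    latencies = sorted(t for user in per_user for t in user)
    total_docs = users * batches_per_user * docs_per_batch
    return {
        "total_docs": total_docs,
        "wall_seconds": wall,
        "docs_per_second": total_docs / wall if wall > 0 else float("inf"),
        "median_batch_seconds": latencies[len(latencies) // 2],
        "max_batch_seconds": latencies[-1],
    }
```

Running it with different values of users and docs_per_batch, with and without faceting queries in parallel, would give the kind of breakdown asked for above.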

I hope you get my point: we want to benchmark the performance. If you can
involve me in your group, that would be great.

Thanks
Naveen



2011/8/15 Nagendra Nagarajayya <nnagaraja...@transaxtions.com>

> Bill:
>
> I did look at Mark's performance tests. Looks very interesting.
>
> Here is the Apache Solr 3.3 with RankingAlgorithm NRT performance:
> http://solr-ra.tgels.com/wiki/en/Near_Real_Time_Search_ver_3.x
>
>
> Regards
>
> - Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
>
>
> On 8/14/2011 7:47 PM, Bill Bell wrote:
>
>> I understand.
>>
>> Have you looked at Mark's patch? From his performance tests, it looks
>> pretty good.
>>
>> When would RA work better?
>>
>> Bill
>>
>>
>> On 8/14/11 8:40 PM, "Nagendra Nagarajayya" <nnagaraja...@transaxtions.com>
>> wrote:
>>
>>> Bill:
>>>
>>> The technical details of the NRT implementation in Apache Solr with
>>> RankingAlgorithm (SOLR-RA) are available here:
>>>
>>> http://solr-ra.tgels.com/papers/NRT_Solr_RankingAlgorithm.pdf
>>>
>>> (Some changes for Solr 3.x, but for the most part it is as above)
>>>
>>> Support for 4.0 trunk should happen sometime soon.
>>>
>>> Regards
>>>
>>> - Nagendra Nagarajayya
>>> http://solr-ra.tgels.org
>>> http://rankingalgorithm.tgels.org
>>>
>>>
>>>
>>>
>>>
>>> On 8/14/2011 7:11 PM, Bill Bell wrote:
>>>
>>>> OK,
>>>>
>>>> I'll ask the elephant in the room...
>>>>
>>>> What is the difference between the new UpdateHandler from Mark and the
>>>> SOLR-RA?
>>>>
>>>> The UpdateHandler works with 4.0; does SOLR-RA work with 4.0 trunk?
>>>>
>>>> Pros/Cons?
>>>>
>>>>
>>>> On 8/14/11 8:10 PM, "Nagendra Nagarajayya" <nnagaraja...@transaxtions.com>
>>>> wrote:
>>>>
>>>>> Naveen:
>>>>>
>>>>> NRT with Apache Solr 3.3 and RankingAlgorithm does need a commit for a
>>>>> document to become searchable. Any document that you add through update
>>>>> becomes  immediately searchable. So no need to commit from within your
>>>>> update client code.  Since there is no commit, the cache does not have
>>>>> to be cleared or the old searchers closed or  new searchers opened, and
>>>>> warmed (error that you are facing).
>>>>>
>>>>> Regards
>>>>>
>>>>> - Nagendra Nagarajayya
>>>>> http://solr-ra.tgels.org
>>>>> http://rankingalgorithm.tgels.org
>>>>>
>>>>>
>>>>>
>>>>> On 8/14/2011 10:37 AM, Naveen Gupta wrote:
>>>>>
>>>>>> Hi Mark/Erick/Nagendra,
>>>>>>
>>>>>> I was not very confident about NRT when we started the project almost
>>>>>> a year ago, but I will definitely try NRT and see how it performs.
>>>>>>
>>>>>> The current requirement was being met as long as we used a commitWithin
>>>>>> of 10 millisecs in the XML document which we were posting to SOLR.
>>>>>>
>>>>>> But because of that, we were getting very poor performance (almost
>>>>>> 3 mins for 15,000 docs) per user. There are many parallel users
>>>>>> committing to our SOLR.
>>>>>>
>>>>>> So we removed the commitWithin, and performance became much better.
>>>>>>
>>>>>> But now we are getting the maxWarmingSearchers error, because we
>>>>>> commit separately, as a curl request, once the entire doc has been
>>>>>> submitted for indexing.
>>>>>>
>>>>>> The question here is: what is the difference between commitWithin and
>>>>>> commit (apart from the fact that commit takes memory, processing, and
>>>>>> additional hardware resources)?
>>>>>>
>>>>>> The reason we want it to be visible as soon as possible is that we
>>>>>> apply many business rules on top of the results (older indexes as well
>>>>>> as new ones) and apply different filters.
>>>>>>
>>>>>> Up to 5 mins is fine for us, but beyond that we need to think about
>>>>>> other optimizations.
>>>>>>
>>>>>> We will definitely try NRT. But please tell me what other options we
>>>>>> can apply in order to optimize.
>>>>>>
>>>>>> Thanks
>>>>>> Naveen
>>>>>>
>>>>>>
>>>>>> On Sun, Aug 14, 2011 at 9:42 PM, Erick Erickson
>>>>>> <erickerick...@gmail.com> wrote:
>>>>>>
>>>>>>> Ah, thanks, Mark... I must have been looking at the wrong JIRAs.
>>>>>>>
>>>>>>> Erick
>>>>>>>
>>>>>>> On Sun, Aug 14, 2011 at 10:02 AM, Mark Miller<markrmil...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Aug 14, 2011, at 9:03 AM, Erick Erickson wrote:
>>>>>>>>
>>>>>>>>> You either have to go to near real time (NRT), which is under
>>>>>>>>> development, but not committed to trunk yet
>>>>>>>>>
>>>>>>>> NRT support is committed to trunk.
>>>>>>>>
>>>>>>>> - Mark Miller
>>>>>>>> lucidimagination.com
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>>
>>
>>
>>
>
