That's an interesting scaling scheme you mention.
I have been trying to devise a good scheme for indexing at our scale.

I will try to see how this works out for us.
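
As a note to self, here is a rough SolrJ sketch of the two-client-threads-per-CPU
idea from the quoted mail below. The Solr URL, collection name, batch size, and
document fields are placeholders of my own, not anything from this thread:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors(); // CPUs on the indexing client machine
        int clientThreads = cpus * 2;   // two sender threads per CPU, per the quoted advice
        int batchSize = 1000;           // placeholder batch size, tune for your documents

        ExecutorService pool = Executors.newFixedThreadPool(clientThreads);
        for (int t = 0; t < clientThreads; t++) {
            pool.submit(() -> {
                // Each thread has its own client and sends batches, so while Solr is
                // busy processing one batch another batch is already on the way.
                try (HttpSolrClient solr = new HttpSolrClient.Builder(
                        "http://localhost:8983/solr/mycollection").build()) {   // placeholder URL
                    List<SolrInputDocument> batch = new ArrayList<>(batchSize);
                    for (int i = 0; i < batchSize; i++) {
                        SolrInputDocument doc = new SolrInputDocument();
                        doc.addField("id", Thread.currentThread().getName() + "-" + i); // placeholder field
                        batch.add(doc);
                    }
                    solr.add(batch);
                    solr.commit();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
    }
}

If the cluster is also serving queries at the same time, the same sketch would just
drop clientThreads to cpus / 2, per the advice below.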

> On Apr 2, 2019, at 9:15 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> 
> Yeah, that would overload it. To get good indexing speed, I configure two 
> clients per CPU on the indexing machine. With one shard on a 16-processor 
> machine, that would be 32 threads. With four shards on four 16-processor 
> machines, 128 clients. Basically, one thread is waiting while the CPU 
> processes a batch and the other is sending the next batch.
> 
> That should get the cluster to about 80% CPU. If the cluster is handling 
> queries at the same time, I cut that way back, like one client thread for 
> every two CPUs.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Apr 2, 2019, at 8:13 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
>> 
>> Multiple threads to the same index? And how many concurrent threads?
>> 
>> Our case is not merely multiple threads but large-scale Spark indexer jobs 
>> that index 1B records at a time with a concurrency of 400.
>> In this case multiple such jobs were indexing into the same index. 
>> 
>> 
>>> On Apr 2, 2019, at 7:25 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>>> 
>>> We run multiple threads indexing to Solr all the time and have been doing 
>>> so for years.
>>> 
>>> How big are your documents and how big are your batches?
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>>> On Apr 1, 2019, at 10:51 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
>>>> 
>>>> It turns out the cause was multiple indexing jobs writing into the index 
>>>> simultaneously, which, as one can imagine, can certainly put JVM load on 
>>>> certain replicas.
>>>> Once this was found and only one job ran at a time, things were back to 
>>>> normal.
>>>> 
>>>> Your comment about there being no correlation to the stack trace seems right! 
>>>> 
>>>>> On Apr 1, 2019, at 5:32 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>> 
>>>>> On 4/1/2019 5:40 PM, Aroop Ganguly wrote:
>>>>>> Thanks Shawn, for the initial response.
>>>>>> Digging into it a bit, I was wondering if we should read the innermost 
>>>>>> stack.
>>>>>> The innermost stack seems to be telling us something about what 
>>>>>> triggered it?
>>>>>> Of course, the system could have been overloaded as well, but is the 
>>>>>> exception telling us something, or is it of no use to consider this stack?
>>>>> 
>>>>> The stacktrace on OOME is rarely useful.  The memory allocation where the 
>>>>> error is thrown probably has absolutely no connection to the part of the 
>>>>> program where major amounts of memory are being used.  It could be ANY 
>>>>> memory allocation that actually causes the error.
>>>>> 
>>>>> Thanks,
>>>>> Shawn
>>>> 
>>> 
>> 
> 
