Re: Inserts from multiple threads

Amit Pandey Tue, 28 Nov 2017 12:04:17 -0800

Okay, thanks. What are the parameters for tuning the socket buffers?

On Wed, Nov 29, 2017 at 1:06 AM, Charlie Black <[email protected]> wrote:


> Socket Buffers - good catch.
>
> On Tue, Nov 28, 2017 at 11:33 AM Udo Kohlmeyer <[email protected]>
> wrote:
>
>> Another thing to keep in mind: putAll does have a notion of
>> micro-batching, which means that in a partitioned region the client will
>> try to use single-hop semantics and only send the entries relevant to a
>> server to that server.
>> But as everybody else has already stated, you'll have to test what your
>> "optimal" batch size is, and maybe tune your buffers to match.
>>
>> --Udo
>>
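The idea of sending each server only the entries relevant to it can be pictured with a toy model. This is a sketch under loud assumptions: the routing rule (key hash modulo server count), the class name and the method names are all hypothetical and are not how Geode maps buckets to servers. It only shows one putAll batch being split into per-server sub-batches.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class SingleHopGroupingSketch {

    // Hypothetical routing rule for illustration only; Geode's real
    // bucket-to-server mapping is decided by the cluster, not by this formula.
    public static int serverFor(Object key, int serverCount) {
        return Math.floorMod(key.hashCode(), serverCount);
    }

    // Splits one batch into sub-batches, one per (hypothetical) target server.
    public static <K, V> Map<Integer, Map<K, V>> groupByServer(Map<K, V> entries, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        Map<Integer, Map<K, V>> perServer = new TreeMap<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            int server = serverFor(e.getKey(), serverCount);
            perServer.computeIfAbsent(server, s -> new LinkedHashMap<>())
                     .put(e.getKey(), e.getValue());
        }
        return perServer;
    }

    public static void main(String[] args) {
        Map<Integer, String> batch = new LinkedHashMap<>();
        for (int i = 0; i < 9; i++) {
            batch.put(i, "value-" + i);
        }
        // Each sub-batch would travel to exactly one server in this toy model.
        groupByServer(batch, 3).forEach((server, sub) ->
            System.out.println("server " + server + " receives keys " + sub.keySet()));
    }
}
```

In this model a large batch spread over many servers turns into several smaller per-server sends, which is one reason the "optimal" batch size has to be measured rather than guessed.
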
>> On Tue, Nov 28, 2017 at 11:21 AM, Charlie Black <[email protected]>
>> wrote:
>>
>>> Sure, 50 to 1000 key/values in a putAll. Just add metrics and see what
>>> works best for your environment. When trying to achieve the best
>>> performance, think about amortizing network overhead and parallelizing
>>> the storage requests (putAll).
>>>
>>> I would like to point out that more threads aren't necessarily better.
>>> Geode does a great job of making sure it's kind to the network and of
>>> shuffling the data to the right nodes. So we have to think about whether
>>> there are enough cores/horsepower to perform the unit of work from the
>>> client to the servers.
>>>
>>> Regards,
>>>
>>> Charlie
>>>
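A minimal sketch of the "add metrics and try several batch sizes" experiment, assuming plain Java with a ConcurrentHashMap standing in for the region (a Geode Region implements java.util.Map, so the same batches could be handed to its putAll). The class and method names are hypothetical, and timings against an in-memory map say nothing about a real cluster; only the shape of the measurement loop is the point.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PutAllBatcher {

    // Splits the entries into consecutive batches of at most batchSize entries.
    public static <K, V> List<Map<K, V>> split(Map<K, V> entries, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Map<K, V>> batches = new ArrayList<>();
        Map<K, V> current = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            current.put(e.getKey(), e.getValue());
            if (current.size() == batchSize) {
                batches.add(current);
                current = new LinkedHashMap<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    // Writes the entries batch by batch and returns the elapsed time in nanoseconds.
    public static <K, V> long timedPutAll(Map<K, V> target, Map<K, V> entries, int batchSize) {
        long start = System.nanoTime();
        for (Map<K, V> batch : split(entries, batchSize)) {
            target.putAll(batch);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<Integer, String> data = new LinkedHashMap<>();
        for (int i = 0; i < 5000; i++) {
            data.put(i, "value-" + i);
        }
        for (int size : new int[] {50, 100, 500, 1000, 5000}) {
            // Stand-in for the region; swap in the real region to measure a cluster.
            Map<Integer, String> target = new ConcurrentHashMap<>();
            long nanos = timedPutAll(target, data, size);
            System.out.printf("batch=%d elapsed=%d us%n", size, nanos / 1000);
        }
    }
}
```

Running the loop several times and discarding the first pass would reduce JIT warm-up noise before comparing batch sizes.
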
>>> On Tue, Nov 28, 2017 at 10:06 AM Amit Pandey <[email protected]>
>>> wrote:
>>>
>>>> Thanks guys. Much appreciated.
>>>>
>>>> Charlie, do you mean batches of, say, 50-100 for putAlls?
>>>>
>>>> Regards
>>>>
>>>> On Tue, Nov 28, 2017 at 11:15 PM, Charlie Black <[email protected]>
>>>> wrote:
>>>>
>>>>> Both are correct and incorrect at the same time - it depends on
>>>>> your application, domain model, workload and physical environment.   I
>>>>> would recommend adding some metrics, following what Akihiro mentioned,
>>>>> and using what works for your environment.
>>>>>
>>>>> As a side note: I would also recommend trying smaller batches in
>>>>> your testing.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Charlie
>>>>>
>>>>> On Tue, Nov 28, 2017 at 8:32 AM Amit Pandey <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey, thanks for the answer. I guess I didn't explain it correctly. I
>>>>>> am not trying to do single puts from threads.
>>>>>>
>>>>>> So my situation is:
>>>>>>
>>>>>> I can do 500 inserts from each of 10 threads via putAll,
>>>>>>
>>>>>> or I can just collect them all (5000) and do a single putAll.
>>>>>>
>>>>>> Which one is the correct approach?
>>>>>>
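The two options in this question can be put side by side in a small harness. This is only a sketch: it assumes plain Java, uses a ConcurrentHashMap as a thread-safe stand-in for the region, and the class and method names are hypothetical. It shows the structure of each approach so both can be timed against a real region; it does not predict which one wins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelPutAllSketch {

    // Builds one chunk per worker, each with perWorker distinct keys.
    public static List<Map<Integer, String>> makeChunks(int workers, int perWorker) {
        List<Map<Integer, String>> chunks = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Map<Integer, String> chunk = new HashMap<>();
            for (int i = 0; i < perWorker; i++) {
                chunk.put(w * perWorker + i, "value-" + w + "-" + i);
            }
            chunks.add(chunk);
        }
        return chunks;
    }

    // Option A: every worker thread calls putAll with its own chunk.
    public static long putPerThread(Map<Integer, String> target, List<Map<Integer, String>> chunks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (Map<Integer, String> chunk : chunks) {
                tasks.add(() -> { target.putAll(chunk); return null; });
            }
            long start = System.nanoTime();
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any failure from a worker
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while writing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a putAll task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Option B: collect everything first, then issue a single putAll.
    public static long putCollected(Map<Integer, String> target, List<Map<Integer, String>> chunks) {
        long start = System.nanoTime();
        Map<Integer, String> all = new HashMap<>();
        for (Map<Integer, String> chunk : chunks) {
            all.putAll(chunk);
        }
        target.putAll(all);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<Map<Integer, String>> chunks = makeChunks(10, 500);
        long a = putPerThread(new ConcurrentHashMap<>(), chunks);
        long b = putCollected(new ConcurrentHashMap<>(), chunks);
        System.out.printf("10 x putAll(500): %d us, 1 x putAll(5000): %d us%n", a / 1000, b / 1000);
    }
}
```
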
>>>>>> On Mon, Nov 27, 2017 at 8:07 AM, Akihiro Kitada <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Amit,
>>>>>>>
>>>>>>> >Now my question is: will it be faster to do it on the individual
>>>>>>> threads and just return that they have completed the task so that they
>>>>>>> can be sent back to the caller, or is the way we do it now, i.e. collect
>>>>>>> all data and insert, better?
>>>>>>>
>>>>>>> It depends on the workload and cluster configuration (data size, number
>>>>>>> of entries, number of threads, number of members, region type and so on),
>>>>>>> although putAll could be more efficient in terms of throughput per thread.
>>>>>>>
>>>>>>> I recommend trying both ways based on the possible workload and
>>>>>>> configuration.
>>>>>>>
>>>>>>> Thanks, regards.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Akihiro Kitada  |  Staff Customer Engineer  |  +81 80 3716 3736
>>>>>>> Support.Pivotal.io <http://support.pivotal.io/>  |  Mon-Fri  9:00am
>>>>>>> to 5:30pm JST  |  1-877-477-2269
>>>>>>>
>>>>>>>
>>>>>>> 2017-11-26 0:33 GMT+09:00 Amit Pandey <[email protected]>:
>>>>>>>
>>>>>>>> Hey Guys,
>>>>>>>>
>>>>>>>> I have a question. I have a function which calls some threads to
>>>>>>>> get data to be inserted into a region. It collects all the data and
>>>>>>>> then puts it into a region with putAll.
>>>>>>>>
>>>>>>>> Now my question is: will it be faster to do it on the individual
>>>>>>>> threads and just return that they have completed the task so that they
>>>>>>>> can be sent back to the caller, or is the way we do it now, i.e. collect
>>>>>>>> all data and insert, better?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>> [email protected] | +1.858.480.9722
>>>>>
>>>>
>>>> --
>>> [email protected] | +1.858.480.9722
>>>
>>
>>
>>
>> --
>> Kindest Regards
>> -----------------------------
>> *Udo Kohlmeyer* | *Pivotal*
>> [email protected]
>> <http://www.gopivotal.com/>
>> www.pivotal.io
>>
> --
> [email protected] | +1.858.480.9722
>
