Hi there Sushil,

The scope setting "distributed-no-ack" only applies to replicated regions. Partitioned regions always replicate synchronously to the redundant copy.
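If you do want no-ack distribution, it has to be on a replicated region. A minimal Java sketch of that, assuming the org.apache.geode package names (older GemFire-era builds use com.gemstone.gemfire instead):

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.Scope;

public class NoAckReplicateExample {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().create();

    // Scope only applies to replicated regions; partitioned regions ignore it
    // and always update their redundant copies synchronously.
    Region<String, String> region =
        cache.<String, String>createRegionFactory(RegionShortcut.REPLICATE)
            .setScope(Scope.DISTRIBUTED_NO_ACK) // distribute without waiting for acks
            .create("exampleRegion");
  }
}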

Also, I would not use such a high "total-num-buckets". 5003 is a lot. A "good" rule of thumb is to have about 8-12 buckets per host; on a 10-host cluster, for example, that works out to roughly 80-120 buckets, so the default of 113 is already in the right range. This is by no means a definitive answer, as data distribution over buckets might be a little "lumpy", which you would address by increasing or decreasing the number of buckets. Maybe for the first round stick with the default of 113.

You would not really see a significant improvement in ingest rate by increasing total-num-buckets. The bucket count primarily influences how data is distributed across the nodes; there is no direct correlation between the number of buckets and performance. What I have seen is that with too many buckets, query (OQL) performance can degrade.

Could you also indicate how many nodes you intend to use in your test?
And perhaps the sizes of your keys and values?

I would also consider using PDX serialization instead of standard Java serialization, and I would configure the PDX serializer with "read-serialized=true", which means the servers do not deserialize the data objects into POJOs. You can still access the data by invoking the PdxInstance API, e.g. pdxInstance.getField("fieldName"), to get at the individual properties.
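Roughly, a server-side sketch of that setup (the domain package pattern, region name, and field name here are just placeholders):

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.pdx.PdxInstance;
import org.apache.geode.pdx.ReflectionBasedAutoSerializer;

public class PdxReadSerializedExample {
  public static void main(String[] args) {
    // Keep values as PdxInstance on the server instead of deserializing
    // them back into POJOs.
    Cache cache = new CacheFactory()
        .setPdxReadSerialized(true)
        .setPdxSerializer(new ReflectionBasedAutoSerializer("com.example.model.*"))
        .create();

    Region<String, Object> region = cache.getRegion("regionB");

    Object value = region.get("some-key");
    if (value instanceof PdxInstance) {
      // Read a single property without materializing the whole object.
      Object field = ((PdxInstance) value).getField("fieldName");
      System.out.println(field);
    }
  }
}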

Not only does keeping the data in serialized form reduce serialization and GC overhead, it can also reduce the memory required for each object. The OQL engine can query PdxInstance objects directly.

Another tip would be to use the "putAll" method to insert data into the cluster in batches.
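Batching matters because each putAll call ships many entries in one network round trip instead of one entry per trip. A sketch, where the batch size of 1000 is only an illustrative starting point to tune:

import java.util.HashMap;
import java.util.Map;

import org.apache.geode.cache.Region;

public class BulkLoad {
  static final int BATCH_SIZE = 1000; // tune for your entry sizes

  static void load(Region<String, String> region, Map<String, String> source) {
    Map<String, String> batch = new HashMap<>();
    for (Map.Entry<String, String> e : source.entrySet()) {
      batch.put(e.getKey(), e.getValue());
      if (batch.size() >= BATCH_SIZE) {
        region.putAll(batch); // one round trip for the whole batch
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      region.putAll(batch); // flush the remainder
    }
  }
}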

I hope some of these pointers help.

--Udo



On 16/07/2016 12:18 AM, Chaudhary, Sushil (CONT) wrote:


Anthony,
Thanks for the reply.

I have regions defined as either PARTITION or REPLICATED. Can I use scope="distributed-no-ack"
with the region type PARTITION_HEAP_LRU?

create region --name=regionB --type=PARTITION_HEAP_LRU --redundant-copies=1 --total-num-buckets=5003



My aim is to get the best performance from the IMDB; I am fine having an async copy of buckets across nodes to get better performance. Also, we have 500 million key/values to put into the IMDB. What do you think the best number of buckets would be for top performance? I tried increasing it to a high number (a prime number) but did not see any improvement in the get/put rate.

We are evaluating Geode against other IMDBs like Hazelcast and Ignite and have to choose one for the enterprise version. Best performance is the key criterion. Please let me know.

*Sushil Chaudhary*
*Email*: [email protected]

From: Anthony Baker <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, July 14, 2016 at 12:57 PM
To: <[email protected]>
Subject: Re: Sync copy of buckets across Replication | Geode


Here are some links to Pivotal-hosted documentation (note that this is an interim hosting solution until the docs are donated to the geode project):

http://geode.docs.pivotal.io/docs/developing/distributed_regions/choosing_level_of_dist.html
http://geode.docs.pivotal.io/docs/developing/partitioned_regions/how_partitioning_works.html

Anthony

On Jul 14, 2016, at 6:21 AM, Chaudhary, Sushil (CONT) <[email protected]> wrote:

Anthony,
Thanks for the reply. Do you have any docs/link on the same?

*Sushil Chaudhary*
*Email*: [email protected]

From: Anthony Baker <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, July 14, 2016 at 12:32 AM
To: <[email protected]>
Subject: Re: Sync copy of buckets across Replication | Geode

Geode does synchronous replication of updates by default. Changes are replicated prior to sending the client response.