Hi there Sushil,

The scope setting "distributed-no-ack" only applies to replicated regions. Partitioned regions always replicate synchronously to the redundant copy.
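If you do want no-ack distribution, it has to be on a replicated region. A minimal Java sketch of that, assuming the org.apache.geode package names (older GemFire-era builds use com.gemstone.gemfire instead):

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.Scope;

public class NoAckReplicateExample {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().create();

    // Scope only applies to replicated regions; partitioned regions ignore it
    // and always update their redundant copies synchronously.
    Region<String, String> region =
        cache.<String, String>createRegionFactory(RegionShortcut.REPLICATE)
            .setScope(Scope.DISTRIBUTED_NO_ACK) // distribute without waiting for acks
            .create("exampleRegion");
  }
}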

Also, I would not use such a high "total-num-buckets". 5003 is a lot. A "good" rule of thumb is to have about 8-12 buckets per host; on a 10-host cluster, for example, that works out to roughly 80-120 buckets, so the default of 113 is already in the right range. This is by no means a definitive answer, as data distribution over buckets might be a little "lumpy", which you would address by increasing or decreasing the number of buckets. Maybe for the first round stick with the default of 113.

You would not really see a significant improvement in ingest rate by increasing total-num-buckets. The bucket count primarily influences how data is distributed across the nodes; there is no direct correlation between the number of buckets and performance. What I have seen is that with too many buckets, query (OQL) performance can degrade.

Could you also indicate how many nodes you intend to use in your test?
And perhaps the sizes of your keys and values?

I would also consider using PDX serialization instead of standard Java serialization, and I would configure the PDX serializer with "read-serialized=true", which means the servers do not deserialize the data objects into POJOs. You can still access the data by invoking the PdxInstance API, e.g. pdxInstance.getField("fieldName"), to get at the individual properties.
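Roughly, a server-side sketch of that setup (the domain package pattern, region name, and field name here are just placeholders):

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.pdx.PdxInstance;
import org.apache.geode.pdx.ReflectionBasedAutoSerializer;

public class PdxReadSerializedExample {
  public static void main(String[] args) {
    // Keep values as PdxInstance on the server instead of deserializing
    // them back into POJOs.
    Cache cache = new CacheFactory()
        .setPdxReadSerialized(true)
        .setPdxSerializer(new ReflectionBasedAutoSerializer("com.example.model.*"))
        .create();

    Region<String, Object> region = cache.getRegion("regionB");

    Object value = region.get("some-key");
    if (value instanceof PdxInstance) {
      // Read a single property without materializing the whole object.
      Object field = ((PdxInstance) value).getField("fieldName");
      System.out.println(field);
    }
  }
}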

Not only does keeping the data in serialized form reduce serialization and GC overhead, it can also reduce the memory required for each object. The OQL engine can query PdxInstance objects directly.

Another tip would be to use the "putAll" method to insert data into the cluster in batches.
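Batching matters because each putAll call ships many entries in one network round trip instead of one entry per trip. A sketch, where the batch size of 1000 is only an illustrative starting point to tune:

import java.util.HashMap;
import java.util.Map;

import org.apache.geode.cache.Region;

public class BulkLoad {
  static final int BATCH_SIZE = 1000; // tune for your entry sizes

  static void load(Region<String, String> region, Map<String, String> source) {
    Map<String, String> batch = new HashMap<>();
    for (Map.Entry<String, String> e : source.entrySet()) {
      batch.put(e.getKey(), e.getValue());
      if (batch.size() >= BATCH_SIZE) {
        region.putAll(batch); // one round trip for the whole batch
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      region.putAll(batch); // flush the remainder
    }
  }
}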

I hope some of these pointers help.

--Udo



On 16/07/2016 12:18 AM, Chaudhary, Sushil (CONT) wrote:


Anthony,
Thanks for the reply.

I have regions defined as either PARTITION or REPLICATED. Can I use scope="distributed-no-ack"
with the region type PARTITION_HEAP_LRU?

create region --name=regionB --type=PARTITION_HEAP_LRU --redundant-copies=1 --total-num-buckets=5003



My aim is to get the best performance from the IMDB; I am fine having an async copy of buckets across nodes to get better performance. Also, we have 500 million key/values to put into the IMDB. What do you think the best number of buckets would be for top performance? I tried increasing it to a high number (a prime number) but did not see any improvement in the get/put rate.

We are evaluating Geode against other IMDBs like Hazelcast and Ignite and have to choose one for the enterprise version. Best performance is the key criterion. Please let me know.

*Sushil Chaudhary*
*Email*: [email protected]

From: Anthony Baker <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, July 14, 2016 at 12:57 PM
To: <[email protected]>
Subject: Re: Sync copy of buckets across Replication | Geode


Here are some links to Pivotal-hosted documentation (note that this is an interim hosting solution until the docs are donated to the geode project):

http://geode.docs.pivotal.io/docs/developing/distributed_regions/choosing_level_of_dist.html
http://geode.docs.pivotal.io/docs/developing/partitioned_regions/how_partitioning_works.html

Anthony

On Jul 14, 2016, at 6:21 AM, Chaudhary, Sushil (CONT) <[email protected]> wrote:

Anthony,
Thanks for the reply. Do you have any docs/link on the same?

*Sushil Chaudhary*
*Email*: [email protected]

From: Anthony Baker <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, July 14, 2016 at 12:32 AM
To: <[email protected]>
Subject: Re: Sync copy of buckets across Replication | Geode

Geode does synchronous replication of updates by default. Changes are replicated prior to sending the client response.