Hi Trung,

Here are some high-level observations and guidelines:

*Only use cache.xml for those constructs which you cannot create using gfsh*. We've added the majority of frequently used options to gfsh, but there are still some things that cannot be configured with it. If you have specific needs, please file a Jira ticket so that we can get an idea of where the gaps still are.

In your particular example, if you were now to use gfsh to create a region referencing the disk store defined in cache.xml, any new servers joining, or servers restarting, would complain about a missing disk store. This is because the cluster configuration (CC) is applied first (including the region you may have created with gfsh) and only then is the cache.xml applied, so the disk store does not exist yet when the region is created.

If, for whatever reason, you must or prefer to use cache.xml for more than just those constructs that you cannot configure with gfsh, I would suggest disabling the cluster configuration feature so that you don't end up with unexpected conflicts. You can still make ad hoc changes with gfsh; however, they will not be persisted.

Regarding the startup sequence you described: it doesn't matter at which point you create your regions and disk stores. There is no difference whether you do it after one server has started or after all of them have.

--Jens

On Sun, Jul 29, 2018 at 8:53 PM trung kien wrote:

> Hi Akihiro,
>
> No, I didn't specify anything for cluster configuration.
> I just read about cluster configuration again; in my case, cluster
> configuration should be enabled by default, according to these docs
> (http://geode.apache.org/docs/guide/16/configuring/cluster_config/gfsh_persist.html).
>
> My cache.xml is supposed to have /transactions_overflow defined, so it
> should be in the cluster configuration.
>
> The docs recommend the following method:
> * Start locators
> * Start server 1 only
> * Create disk stores and regions (now the cluster configuration has all
>   definitions)
> * Start remaining servers
>
> Is that the recommended method everyone uses when deploying in
> production?
>
>
> On Sun, Jul 29, 2018 at 6:06 PM, Akihiro Kitada wrote:
>
>> Hello Kien,
>>
>> I may be able to offer some ideas for question 2.
>>
>> >2/ When I try to restart a server, it fails with error missing region
>> >/transactions_overflow
>> >Although it has been defined in cache.xml file
>>
>> Have you referenced a disk store defined in cache.xml from a region
>> defined with gfsh via Cluster Configuration?
>>
>> When a cache server initializes, it first populates the cache configuration
>> from Cluster Configuration (which is defined with gfsh) and then populates
>> it from cache.xml.
>>
>> If you specify your disk store /transactions_overflow only in cache.xml
>> and your gfsh-defined region specifies /transactions_overflow as its
>> disk store, then it fails to populate the region.
>>
>> Thanks, regards.
>>
>> --
>> Akihiro Kitada | Staff Customer Engineer | +81 80 3716 3736
>> Support.Pivotal.io <https://pivotal.io/support> | Mon-Fri 9:00am to
>> 5:30pm JST | 1-877-477-2269
>>
>> On Sun, Jul 29, 2018 at 14:21, trung kien wrote:
>>
>>> Hi all,
>>>
>>> I am deploying Geode on Kubernetes with 2 locators and 5 servers
>>> and observed very weird inconsistencies while running the application.
>>>
>>> Locators are started with this command:
>>> gfsh start locator --name="${HOSTNAME}" --connect=false
>>> --locators="${LOCATORS}" --port=10334
>>> Servers are started with this command:
>>> gfsh start server --bind-address=$ip --name="${HOSTNAME}"
>>> --cache-xml-file=/geode/config/cache.xml --groups memory --initial-heap=20g
>>> --max-heap=20g --eviction-heap-percentage=60
>>> --critical-heap-percentage=80
>>> --J=-XX:+UseConcMarkSweepGC --J=-XX:CMSInitiatingOccupancyFraction=60
>>> --locators="${LOCATORS}"
>>>
>>> The cache.xml looks like the following:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <cache
>>>     xmlns="http://geode.apache.org/schema/cache"
>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>     xsi:schemaLocation="http://geode.apache.org/schema/cache
>>>         http://geode.apache.org/schema/cache/cache-1.0.xsd"
>>>     version="1.0">
>>>   <disk-store name="transactions_overflow">
>>>     <disk-dirs>
>>>       <disk-dir>/data/transactions_overflow</disk-dir>
>>>     </disk-dirs>
>>>   </disk-store>
>>>   <region name="transactions">
>>>     <region-attributes refid="PARTITION_OVERFLOW">
>>>       <partition-attributes total-num-buckets="23"
>>>           redundant-copies="3"/>
>>>     </region-attributes>
>>>     <region-attributes disk-store-name="transactions_overflow"
>>>     ></region-attributes>
>>>     <region-attributes>
>>>       <eviction-attributes>
>>>         <lru-entry-count action="overflow-to-disk"/>
>>>       </eviction-attributes>
>>>     </region-attributes>
>>>   </region>
>>> </cache>
>>>
>>> Here's how I start the cluster:
>>> 1/ Start the first locator and wait some time for it to fully start up
>>> 2/ Start the second locator and wait some time for it to fully start up
>>> 3/ Start all 5 servers at the same time
>>>
>>> The cluster came up nicely; I can see 2 locators and 5 servers when
>>> connecting to any locator via gfsh.
>>> However, when I start populating data into my region, my client gets
>>> inconsistent data back.
>>> I run the following code:
>>> while (true) {
>>>     long start = System.currentTimeMillis();
>>>     ClientCache cache = new
>>>         ClientCacheFactory().set("cache-xml-file", "cache.xml").create();
>>>     TransactionGeoDAO geoDAO = new TransactionGeoDAO(cache);
>>>     HashMap<String, PositionStore> records =
>>>         geoDAO.getTransactions(date);
>>>     LOGGER.info(String.format("Timetaken %s, Number of records %s",
>>>         System.currentTimeMillis() - start, records.size()));
>>>     cache.close();
>>> }
>>>
>>> The query in getTransactions is: select * from /transactions
>>> where date=%s
>>> And the results returned are very inconsistent (even after I stop
>>> publishing data):
>>> 716676 INFO com.tata.mo.Reconcile - Timetaken 483, Number of records 2446
>>> 717879 INFO com.tata.mo.Reconcile - Timetaken 1203, Number of records 2593
>>> 718290 INFO com.tata.mo.Reconcile - Timetaken 411, Number of records 2057
>>> 718810 INFO com.tata.mo.Reconcile - Timetaken 520, Number of records 2593
>>> 719180 INFO com.tata.mo.Reconcile - Timetaken 370, Number of records 2446
>>> 719834 INFO com.tata.mo.Reconcile - Timetaken 654, Number of records 2057
>>> 720374 INFO com.tata.mo.Reconcile - Timetaken 540, Number of records 2446
>>> 721579 INFO com.tata.mo.Reconcile - Timetaken 1205, Number of records 2593
>>> 722255 INFO com.tata.mo.Reconcile - Timetaken 676, Number of records 2057
>>> 722733 INFO com.tata.mo.Reconcile - Timetaken 478, Number of records 2057
>>>
>>> Here's my cache.xml for the client:
>>> <!DOCTYPE client-cache PUBLIC
>>>     "-//GemStone Systems, Inc.//GemFire Declarative Caching 6.5//EN"
>>>     "http://www.gemstone.com/dtd/cache8_0.dtd">
>>> <client-cache>
>>>   <pool name="myPool">
>>>     <locator host="locator1" port="10334"/>
>>>     <locator host="locator2" port="10334"/>
>>>   </pool>
>>>   <region name="transactions" refid="PROXY"/>
>>> </client-cache>
>>>
>>> But without the --cache-xml-file=/geode/config/cache.xml option, if the
>>> region is created with gfsh once all servers are up, the results are
>>> consistent.
>>>
>>> Besides the above errors, I sometimes get the following errors:
>>> 1/ NoAvailableServersException
>>> All locators and servers are still running, and "list members" still shows
>>> the correct members
>>>
>>> ERROR StatusLogger Unrecognized conversion specifier [n] starting at
>>> position 56 in conversion pattern.
>>> Exception in thread "main"
>>> org.apache.geode.cache.client.NoAvailableServersException
>>>
>>> 2/ When I try to restart a server, it fails with the error missing region
>>> /transactions_overflow
>>> although it has been defined in the cache.xml file
>>>
>>>
>>> Could anyone please help check whether my deployment method is
>>> right?
>>>
>>> --
>>> Thanks
>>> Kien
>>>
>>
>
>
> --
> Thanks
> Kien
>
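
For reference, the gfsh-only setup recommended at the top of this thread could look roughly like the sketch below. It writes a gfsh script that creates the disk store first and then the region, reusing the names and attributes from the quoted cache.xml. The script file name is hypothetical, the locator address is borrowed from the client cache.xml, and the commands have not been run against this cluster, so treat it as a starting point rather than a verified procedure.

```shell
#!/bin/sh
# Hypothetical file name; any path works.
SCRIPT=create-transactions.gfsh

# Write the gfsh commands to a script file. The disk store is created
# before the region that references it. PARTITION_OVERFLOW already implies
# overflow-to-disk eviction, so the lru-entry-count element from the
# cache.xml is not reproduced here.
cat > "$SCRIPT" <<'EOF'
connect --locator=locator1[10334]
create disk-store --name=transactions_overflow --dir=/data/transactions_overflow
create region --name=transactions --type=PARTITION_OVERFLOW --disk-store=transactions_overflow --total-num-buckets=23 --redundant-copies=3
EOF

# Run it once at least one server is up (uncomment when gfsh is on the PATH):
# gfsh run --file="$SCRIPT"
```

With the definitions held in the cluster configuration, the servers would be started without --cache-xml-file. If cache.xml is kept instead, starting each server with --use-cluster-configuration=false is one way to disable the cluster configuration for that server.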
