Hi all,

I am deploying Geode on Kubernetes with 2 locators and 5 servers,
and I observe very weird inconsistencies while running the application.

Locators are started by command:
  gfsh start locator --name="${HOSTNAME}" --connect=false \
    --locators="${LOCATORS}" --port=10334
Servers are started by command:
  gfsh start server --bind-address=$ip --name="${HOSTNAME}" \
    --cache-xml-file=/geode/config/cache.xml --groups memory --initial-heap=20g \
    --max-heap=20g --eviction-heap-percentage=60 --critical-heap-percentage=80 \
    --J=-XX:+UseConcMarkSweepGC --J=-XX:CMSInitiatingOccupancyFraction=60 \
    --locators="${LOCATORS}"

The server cache.xml looks like the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <cache
        xmlns="http://geode.apache.org/schema/cache"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://geode.apache.org/schema/cache http://geode.apache.org/schema/cache/cache-1.0.xsd"
        version="1.0">
      <disk-store name="transactions_overflow">
         <disk-dirs>
                <disk-dir>/data/transactions_overflow</disk-dir>
         </disk-dirs>
      </disk-store>
      <region name="transactions">
        <region-attributes refid="PARTITION_OVERFLOW">
          <partition-attributes total-num-buckets="23" redundant-copies="3"/>
        </region-attributes>
        <region-attributes disk-store-name="transactions_overflow"></region-attributes>
        <region-attributes>
          <eviction-attributes>
            <lru-entry-count action="overflow-to-disk"/>
          </eviction-attributes>
        </region-attributes>
      </region>
    </cache>

Here's how I start the cluster:
1/ start the first locator and wait some time for it to fully start up
2/ start the second locator and wait some time for it to fully start up
3/ start all 5 servers at the same time

The cluster comes up nicely; I can see 2 locators and 5 servers when
connecting to either locator via gfsh.
However, when I start populating data into my region, my client gets
inconsistent results.
I run the following code:
      while (true) {
            long start = System.currentTimeMillis();
            ClientCache cache = new ClientCacheFactory().set("cache-xml-file", "cache.xml").create();
            TransactionGeoDAO geoDAO = new TransactionGeoDAO(cache);
            HashMap<String, PositionStore> records = geoDAO.getTransactions(date);
            LOGGER.info(String.format("Timetaken %s, Number of records %s",
                    System.currentTimeMillis() - start, records.size()));
            cache.close();
      }

      The query in getTransactions is: select * from /transactions where date=%s
      And the results are very inconsistent (even after I stop publishing data):
      716676 INFO  com.tata.mo.Reconcile - Timetaken 483, Number of records 2446
      717879 INFO  com.tata.mo.Reconcile - Timetaken 1203, Number of records 2593
      718290 INFO  com.tata.mo.Reconcile - Timetaken 411, Number of records 2057
      718810 INFO  com.tata.mo.Reconcile - Timetaken 520, Number of records 2593
      719180 INFO  com.tata.mo.Reconcile - Timetaken 370, Number of records 2446
      719834 INFO  com.tata.mo.Reconcile - Timetaken 654, Number of records 2057
      720374 INFO  com.tata.mo.Reconcile - Timetaken 540, Number of records 2446
      721579 INFO  com.tata.mo.Reconcile - Timetaken 1205, Number of records 2593
      722255 INFO  com.tata.mo.Reconcile - Timetaken 676, Number of records 2057
      722733 INFO  com.tata.mo.Reconcile - Timetaken 478, Number of records 2057

    Here's my cache.xml for the client:
    <!DOCTYPE client-cache PUBLIC
        "-//GemStone Systems, Inc.//GemFire Declarative Caching 6.5//EN"
        "http://www.gemstone.com/dtd/cache8_0.dtd">
    <client-cache>
        <pool name="myPool">
            <locator host="locator1" port="10334"/>
            <locator host="locator2" port="10334"/>
        </pool>
        <region name="transactions" refid="PROXY"/>
    </client-cache>

However, if I do not use the --cache-xml-file=/geode/config/cache.xml option and
instead create the region with gfsh after all servers have come up, the results
are consistent.
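
For reference, the gfsh route would look roughly like the sketch below. The
exact options are an assumption chosen to mirror the server cache.xml above
(same disk store directory, bucket count and redundancy), and the locator
address is taken from the client cache.xml; adjust both as needed.

```shell
# Sketch only: option values are assumed to mirror the server cache.xml.
# Run once, after all locators and servers have joined the cluster.
gfsh \
  -e "connect --locator=locator1[10334]" \
  -e "create disk-store --name=transactions_overflow --dir=/data/transactions_overflow" \
  -e "create region --name=transactions --type=PARTITION_OVERFLOW --disk-store=transactions_overflow --total-num-buckets=23 --redundant-copies=3"
```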

Besides the above problem, I sometimes get the following errors:
1/ NoAvailableServersException
  All locators and servers are still running, and "list members" still shows
the correct members.

  ERROR StatusLogger Unrecognized conversion specifier [n] starting at
position 56 in conversion pattern.
  Exception in thread "main"
org.apache.geode.cache.client.NoAvailableServersException

2/ When I try to restart a server, it fails with an error about a missing region
/transactions_overflow,
although it has been defined in the cache.xml file.


Could anyone please help check whether my deployment method is correct?

-- 
Thanks
Kien
