Ankit, Here is how you can query your JSON object.
String queryStr = "SELECT d.col1 FROM /JsonRegion v, v.data d where d.col1.k11 = 'aaa'"; As replied earlier; the data is stored as PdxInstance type in the cache. In the PdxInstance, the data is stored as top level or nested collection of objects/values based on input JSON object structure. The query engine queries on the PdxInstance type and returns the value. To see, how the PdxInstance data looks like in the cache, you can print the returned value from querying the region values: E.g.: String queryStr = "SELECT v FROM /JsonRegion v"; SelectResults results = (SelectResults) QueryService().newQuery(queryStr).execute(); Object[] value = results.asList().toArray(); System.out.println("#### Projected value: " + value[0]); You can find sample queries on different type of objects (collections, etc) at: https://geode.apache.org/docs/guide/18/getting_started/querying_quick_reference.html Also in order to determine where the time is getting spent, can you separate out object creation through JSONFormatter from put operation. E.g.: PdxInstance pdxInstance = JSONFormatter.fromJSON(jsonDoc_2); // Time taken to format: region.put("1", pdxInstance); // Time taken to add to cache: And measure the time separately. It will help to see if the time is spent in getting the PdxInstance or in doing puts. Also, can you measure the time in avg. E.g. Say time measured for puts from 1000 to 2000 and avg time for those puts. -Anil. On 11/23/20, 11:27 AM, "ankit Soni" <ankit.soni.ge...@gmail.com> wrote: Hello geode-dev, I am *evaluating usage of Geode (1.12) with storing JSON documents and querying the same*. I am able to store the json records successfully in geode but seeking guidance on how to query them. More details on code and sample json is, *Sample client-code* import org.apache.geode.cache.client.ClientCache; import org.apache.geode.cache.client.ClientCacheFactory; import org.apache.geode.cache.client.ClientRegionShortcut; import org.apache.geode.pdx.JSONFormatter; import org.apache.geode.pdx.PdxInstance; public class MyTest { *//NOTE: Below is truncated json, single json document can max contain an array of col1...col30 (30 diff attributes) within data. * public final static String jsonDoc_2 = "{" + "\"data\":[{" + "\"col1\": {" + "\"k11\": \"aaa\"," + "\"k12\":true," + "\"k13\": 1111," + "\"k14\": \"2020-12-31:00:00:00\"" + "}," + "\"col2\":[{" + "\"k21\": \"222222\"," + "\"k22\": true" + "}]" + "}]" + "}"; * //NOTE: Col1....col30 are mix of JSONObject ({}) and JSONArray ([]) as shown above in jsonDoc_2;* public static void main(String[] args){ //create client-cache ClientCache cache = new ClientCacheFactory().addPoolLocator(LOCATOR_HOST, PORT).create(); Region<String, PdxInstance> region = cache.<String, PdxInstance>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY) .create(REGION_NAME); //store json document region.put("key", JSONFormatter.fromJSON(jsonDoc_2)); //How to query json document like, // 1. select col2.k21, col1, col20 from /REGION_NAME where data.col2.k21 = '222222' OR data.col2.k21 = '333333' // 2. select col2.k21, col1.k11, col1 from /REGION_NAME where data.col1.k11 in ('aaa', 'xxx', 'yyy') } } *Server: Region-creation* gfsh> create region --name=REGION_NAME --type=PARTITION --redundant-copies=1 --total-num-buckets=61 *Setup: Distributed cluster of 3 nodes * *My Observations/Problems* - Put operation takes excessive time: region.put("key", JSONFormatter.fromJSON(jsonDoc_2)); - Fetching a single record from () a file and Storing in geode approx. takes . 3 secs Is there any suggestions/configuration related to JSONFormatter API or other to optimize this...? *Looking forward to guidance on querying this JOSN for above sample queries.* *Thanks* *Ankit.*