Sure Mike, I will get back to you on this once I have some updates. I am not sure if I mentioned this earlier, but not all getAll requests take 10-30 seconds. If I send, say, 10 requests from a client, one or two of them take a long time; the rest complete in under 1 second.
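To pin down which of the requests are the slow ones, each call could be timed individually and the outliers listed. What follows is only a minimal sketch under assumptions: the class name GetAllLatencyProbe, the helpers timeMillis and slowCalls, and the 1000 ms threshold are all hypothetical, and the real getAll call is replaced by a stub Supplier so the snippet runs without a Geode cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Geode: times calls and flags the slow ones.
class GetAllLatencyProbe {

    // Runs the call once and returns its wall-clock duration in milliseconds.
    static <T> long timeMillis(Supplier<T> call) {
        long start = System.nanoTime();
        call.get();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Returns the indices of the samples strictly above the threshold.
    static List<Integer> slowCalls(long[] latenciesMillis, long thresholdMillis) {
        List<Integer> slow = new ArrayList<>();
        for (int i = 0; i < latenciesMillis.length; i++) {
            if (latenciesMillis[i] > thresholdMillis) {
                slow.add(i);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        // Stub standing in for the real call, e.g. () -> region.getAll(keys).
        Supplier<Object> getAll = () -> "stub result";

        // Time 10 requests, matching the test described above.
        long[] latencies = new long[10];
        for (int i = 0; i < latencies.length; i++) {
            latencies[i] = timeMillis(getAll);
        }

        // Assumed cutoff: anything over 1 second counts as an outlier.
        for (int index : slowCalls(latencies, 1000)) {
            System.out.println("request " + index + " took " + latencies[index] + " ms");
        }
    }
}
```

With the stub in place nothing should cross the threshold; the point is only the shape of the measurement. Recording a timestamp next to each slow request would make it easier to line the outliers up against server logs and disk activity.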
With best regards,
Ashish

On Wed, Oct 31, 2018, 12:03 AM Michael Stolz <[email protected]> wrote:

> Thanks for that additional detail.
> The workload you are describing should be taking more like 30 milliseconds
> rather than 30 seconds. There is something VERY wrong somewhere in this
> environment.
> One area I would look at in this case is maybe the disk you are logging to
> is having I/O errors and retries?
> I've seen that cause multiple orders of magnitude delays.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Lead
> Mobile: +1-631-835-4771
>
> On Tue, Oct 30, 2018 at 1:16 PM aashish choudhary <
> [email protected]> wrote:
>
>> Hi Udo,
>>
>> To answer your questions:
>>
>> - How many keys are part of the getAll() request? Keys are very small
>> in size. For some requests it was 10, for some it was 30 or so. The
>> getAll time was above 10 seconds most of the time for these cases, and
>> 30 seconds was the max.
>> - How large are your value objects? Not very large. All a value has
>> is a collection of some objects. Each object has just two attributes.
>> - Is this request made from a client? Yes, from a client.
>> - Is there any memory pressure on the client? Are there any gc's
>> taking place at that time? Not that we have observed so far in our
>> investigation. We have gc logs enabled for the client application and
>> we have not seen any 'stop the world' scenario.
>> - Have you configured a CacheLoader? No.
>> - What about eviction and eviction thresholds? No.
>>
>> Do we know how much time a client application can take because of
>> this bug, if it is at all related to our case?
>>
>> We will be enabling stats for our client application to see if something
>> comes up there.
>>
>> I agree that, single-hop enabled or not, 30 seconds is a lot for any
>> system to give back a response. Also, as Charlie and Mike suggested, we
>> will do profiling and network monitoring for the client application.
>>
>> On the network question: a normal ping to the Gemfire servers from the
>> machine where the client apps are running is very fast, if that was the
>> question.
>>
>> With best regards,
>> Ashish
>>
>> On Sat, Oct 27, 2018, 12:29 AM Udo Kohlmeyer <[email protected]> wrote:
>>
>>> Hi there Aashish,
>>>
>>> Could you possibly provide a little more information about the getAll()
>>> operation?
>>>
>>> Just for interest's sake:
>>>
>>> - How many keys are part of the getAll() request?
>>> - How large are your value objects?
>>> - Is this request made from a client?
>>> - Is there any memory pressure on the client? Are there any gc's
>>> taking place at that time?
>>> - Have you configured a CacheLoader?
>>> - What about eviction and eviction thresholds?
>>>
>>> The reason I'm asking is that when you initiate a getAll() the server(s)
>>> might be able to respond in a reasonable time frame, BUT, if the
>>> keys/values are really large, you might be seeing that the servers spend
>>> some time serializing the data. When that data is returned to the client,
>>> the client needs to provision all that space for the deserialized data.
>>> Given that Geode operations are not yet streaming enabled, you hit
>>> "all-or-nothing" semantics. This means that retrieving the result from
>>> the getAll requires all the data to have been delivered and deserialized
>>> before it is made available as a response object.
>>>
>>> Even with single hop not enabled, I do not see any real reason for the
>>> 30s response times. So there must be other factors here...
>>>
>>> Maybe, to help us try and help you, some more information about the
>>> operation, key/value sizes, number of keys, gc state, and logs would be
>>> helpful.
>>>
>>> --Udo
>>>
>>> On 10/25/18 11:36, aashish choudhary wrote:
>>>
>>> Hi,
>>>
>>> We have an issue wherein our getAll operations are taking too long to
>>> respond (in some cases as high as 30 seconds), which is not acceptable
>>> from an IMDG like Geode. Also, this seems to be an existing issue as per
>>> the link given below, and it is being fixed in version 1.8.
>>> https://issues.apache.org/jira/browse/GEODE-5649
>>> We are on version 1.2 and wanted to confirm if this issue exists in that
>>> version too. If yes, is there any workaround for it? Not sure if 1.8 is
>>> available for download.
>>>
>>> With best regards,
>>> Ashish
