You can try increasing the subscription queue size.
The following page describes steps for managing the subscription queue:
http://gemfire.docs.pivotal.io/geode/developing/events/limit_server_subscription_queue_size.html
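
For a rough sense of how much headroom the queue would need, here is a back-of-the-envelope sketch in Java. It uses the 10 million entries and the one-hour expiry window from your description. The client drain rate of 2,000 events per second is purely an assumed figure, and the class and method names are made up for illustration; nothing here is a Geode API.

```java
final class QueueSizingSketch {

    /**
     * Events still queued when the burst ends, assuming events arrive evenly
     * over burstSeconds and the client drains at a constant rate.
     */
    static long backlogAtEndOfBurst(long events, long burstSeconds, double drainPerSecond) {
        double drained = drainPerSecond * burstSeconds;
        // If the client keeps up, nothing is left over.
        return (long) Math.ceil(Math.max(0.0, events - drained));
    }

    public static void main(String[] args) {
        long events = 10_000_000L;      // from the thread: upper end of 5-10 million entries
        long burstSeconds = 3_600L;     // from the thread: destroyed within about an hour
        double drainPerSecond = 2_000;  // ASSUMPTION: hypothetical client drain rate
        System.out.println(backlogAtEndOfBurst(events, burstSeconds, drainPerSecond));
    }
}
```

If the measured drain rate of your clients is lower than the rate at which destroy events are produced, the difference accumulates for the whole expiry window, so the queue size has to cover that accumulated backlog rather than a single moment's load.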

-Anil.


On Wed, Sep 27, 2017 at 2:58 PM, Mangesh Deshmukh <[email protected]>
wrote:

> Hi,
>
> FYI: I have filed a JIRA ticket on this, but I thought maybe someone might
> be aware of a solution or workaround for this problem, so I am posting it
> here as well.
>
> We are using Geode in one of our projects. Here is a summary of how we use
> it.
>
> - Geode servers (Release 1.1.1) have multiple regions.
> - Clients subscribe to the data from these regions.
> - Clients register interest in all the entries, so they get updates about
> every entry, from creation through modification to deletion.
> - One of the regions usually has 5-10 million entries with a TTL of 24
> hours. Most entries are added within an hour, one after another. So, when
> the TTL kicks in, they are usually destroyed within an hour as well.
>
> Problem:
>
> Every now and then we observe the following message:
>
>     Client queue for _gfe_non_durable_client_with_id_x.x.x.x(14229:loner):42754:e4266fc4_2_queue client is full.
>
> This seems to happen when the TTL kicks in. As entries expire and are
> destroyed, the corresponding events must be sent to the clients. We see
> that the updates are delivered for a while, but then they suddenly stop and
> the queue size starts growing. This is becoming a major issue for the
> smooth functioning of our production setup. Any help will be much
> appreciated.
>
> I did some groundwork by downloading and looking at the code. I see
> references to two issues, #37581 and #51400, but I am unable to view the
> actual JIRA tickets (they need login credentials). Hopefully this helps
> someone looking at the issue.
>
> Here is the pertinent code:
>
>     @Override
>     @edu.umd.cs.findbugs.annotations.SuppressWarnings("TLW_TWO_LOCK_WAIT")
>     void checkQueueSizeConstraint() throws InterruptedException {
>       if (this.haContainer instanceof HAContainerMap && isPrimary()) { // Fix for bug 39413
>         if (Thread.interrupted())
>           throw new InterruptedException();
>         synchronized (this.putGuard) {
>           if (putPermits <= 0) {
>             synchronized (this.permitMon) {
>               if (reconcilePutPermits() <= 0) {
>                 if (region.getSystem().getConfig().getRemoveUnresponsiveClient()) {
>                   isClientSlowReciever = true;
>                 } else {
>                   try {
>                     long logFrequency = CacheClientNotifier.DEFAULT_LOG_FREQUENCY;
>                     CacheClientNotifier ccn = CacheClientNotifier.getInstance();
>                     if (ccn != null) { // check needed for junit tests
>                       logFrequency = ccn.getLogFrequency();
>                     }
>                     if ((this.maxQueueSizeHitCount % logFrequency) == 0) {
>                       logger.warn(LocalizedMessage.create(
>                           LocalizedStrings.HARegionQueue_CLIENT_QUEUE_FOR_0_IS_FULL,
>                           new Object[] {region.getName()}));
>                       this.maxQueueSizeHitCount = 0;
>                     }
>                     ++this.maxQueueSizeHitCount;
>                     this.region.checkReadiness(); // fix for bug 37581
>                     // TODO: wait called while holding two locks
>                     this.permitMon.wait(CacheClientNotifier.eventEnqueueWaitTime);
>                     this.region.checkReadiness(); // fix for bug 37581
>                     // Fix for #51400. Allow the queue to grow beyond its
>                     // capacity/maxQueueSize, if it is taking a long time to
>                     // drain the queue, either due to a slower client or the
>                     // deadlock scenario mentioned in the ticket.
>                     reconcilePutPermits();
>                     if ((this.maxQueueSizeHitCount % logFrequency) == 1) {
>                       logger.info(LocalizedMessage
>                           .create(LocalizedStrings.HARegionQueue_RESUMING_WITH_PROCESSING_PUTS));
>                     }
>                   } catch (InterruptedException ex) {
>                     // TODO: The line below is meaningless. Comment it out later
>                     this.permitMon.notifyAll();
>                     throw ex;
>                   }
>                 }
>               }
>             } // synchronized (this.permitMon)
>           } // if (putPermits <= 0)
>           --putPermits;
>         } // synchronized (this.putGuard)
>       }
>     }
>
> Thanks
>
> Mangesh
>

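To make the quoted checkQueueSizeConstraint() logic easier to reason about, here is a much-simplified, JDK-only sketch of the behaviour its comments describe: when the queue is at capacity, the producer waits once for a bounded time and then enqueues anyway, so the queue can grow past its nominal capacity if the consumer is not draining it. This is not Geode code. The class BoundedQueueSketch, its fields, and the single-monitor locking are assumptions made for illustration; the real method uses put permits and two locks, and propagates InterruptedException instead of wrapping it.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class BoundedQueueSketch<T> {
    private final Queue<T> queue = new ArrayDeque<>();
    private final int capacity;            // nominal limit, can be exceeded
    private final long enqueueWaitMillis;  // stand-in for eventEnqueueWaitTime
    private int fullHits = 0;              // how often a put found the queue full

    BoundedQueueSketch(int capacity, long enqueueWaitMillis) {
        if (capacity <= 0 || enqueueWaitMillis <= 0) {
            // wait(0) would block indefinitely, so a positive timeout is required
            throw new IllegalArgumentException("capacity and wait time must be positive");
        }
        this.capacity = capacity;
        this.enqueueWaitMillis = enqueueWaitMillis;
    }

    /** Adds an event; if the queue is full, waits once and then adds it anyway. */
    synchronized void put(T event) {
        if (queue.size() >= capacity) {
            fullHits++;
            try {
                // Give the consumer a bounded chance to drain, then carry on
                // regardless of whether any space was freed.
                wait(enqueueWaitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to enqueue", e);
            }
        }
        queue.add(event);
    }

    /** Removes the oldest event, or returns null if the queue is empty. */
    synchronized T take() {
        T event = queue.poll();
        if (event != null) {
            notifyAll(); // wake a producer that is waiting for space
        }
        return event;
    }

    synchronized int size() {
        return queue.size();
    }

    synchronized int fullHits() {
        return fullHits;
    }
}
```

In this model a stalled consumer does not stop the producer; each put is merely delayed by the wait time and the queue keeps growing, which is at least consistent with the "client queue is full" warnings followed by a steadily increasing queue size described in the original message.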