Thanks, Udara and Isuru, for your replies. I like the approach of retrying with an incremental delay.
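
To make the idea concrete, here is a rough sketch of a retry loop whose delay starts small and doubles after each consecutive failure, up to a cap. This is not the actual TopicPublisher code: the class name BackoffRetry, the Action and Sleeper interfaces, and the doubling schedule are all assumptions made for illustration, and the real fix may choose different values or a different growth rule.

```java
/** Hypothetical helper for illustration; not the actual Stratos TopicPublisher code. */
final class BackoffRetry {

    /** Hypothetical stand-in for the real publish call, which may throw. */
    interface Action {
        void run() throws Exception;
    }

    /** Abstracts the wait so the delay schedule can be checked without really sleeping. */
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    private BackoffRetry() {
    }

    /**
     * Delay to wait after the given number of consecutive failures:
     * initialMillis doubled (failures - 1) times, capped at maxMillis.
     */
    static long delayMillis(int failures, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        for (int i = 1; i < failures && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * Returns the number of attempts used; rethrows the last failure
     * if every attempt fails.
     */
    static int run(Action action, int maxAttempts, long initialMillis,
                   long maxMillis, Sleeper sleeper) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and report the last failure to the caller
                }
                // Wait longer after each consecutive failure.
                sleeper.sleep(delayMillis(attempt, initialMillis, maxMillis));
            }
        }
    }
}
```

In production the sleeper would simply be Thread::sleep. With an initial delay of one second and a cap of 60 seconds, a publish that fails once costs the caller about one second instead of 60, while repeated failures still back off to the current interval. Bounding the number of attempts also keeps a blocked caller from waiting indefinitely.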

Hi Imesh,

Please go ahead and implement this fix, and also take care of the mqtt-client upgrade.

Thanks,
-Jeffrey

From: Imesh Gunaratne <im...@apache.org>
Reply-To: "dev@stratos.apache.org" <dev@stratos.apache.org>
Date: Friday, November 14, 2014 11:18 AM
To: dev <dev@stratos.apache.org>
Subject: Re: Un-subscribe cartridge takes 60+ sec

Hi Jeffrey,

Thanks for bringing this up. I was thinking about this issue some time back and thought that maybe we could start the retry interval with a small value and then increase it step by step, depending on the number of consecutive failures. I noticed this approach in Gmail. I can do a quick fix for this if you don't mind.

Of course, we could upgrade the mqtt-client version. Thanks for pointing this out! I will do this.

Thanks

On Fri, Nov 14, 2014 at 6:19 AM, Jeffrey Nguyen (jeffrngu) <jeffr...@cisco.com> wrote:

Hi,

While testing Stratos 4.1 M3, I noticed that about half the time, unsubscribing a cartridge takes 60+ seconds to execute. During that time, the caller is blocked.

I poked around the Stratos code and found that the TopicPublisher class is responsible for the 60-second retry delay. The code sleeps for 60 seconds whenever it encounters an exception while publishing a message to its topic. It looks like this code has been around since at least release 4.0.0.

It seems to me that a 60-second delay is way too long. If you have a lot of subscriptions, the total delay can be a multiple of 60 seconds. Also, if you do this via Stratos' REST API, the request might time out before the response comes back.

I reduced this delay from 60 seconds to one second, tested it, and found that it worked just fine. I'm planning to push this change upstream unless there are any objections.

As for why the exception occurs while attempting to update the topic, I did some googling and found [1]. It seems this is an issue with mqtt-client. Somehow we're getting a stale connection to the MB when we update the topic. We're currently using version 0.4.0 of mqtt-client. From [2], it looks like the latest version is 1.0.0. Maybe it's time to upgrade?

Another question: when I make a REST call to get the list of subscriptions and it returns nothing, is it safe to assume all spawned instances have been killed?

Regards,
-Jeffrey

[1] https://github.com/openhab/openhab/issues/980
[2] http://git.eclipse.org/c/paho/org.eclipse.paho.mqtt.java.git/

--
Imesh Gunaratne
Technical Lead, WSO2
Committer & PMC Member, Apache Stratos