Re: [asterisk-dev] invasive fixes to chan_iax2 in 1.4
Hello,

I have now compared SIP against IAX performance on Russell's branch. I flooded the server with registrations (900), 1 ms apart. While the flood is running I try to place a call, over SIP or IAX depending on which protocol I am flooding. Asterisk answers the call and plays a sound file.

With SIP things are great: the call is connected really fast. With IAX it is a whole other story. It takes a hell of a lot of time to get connected. The IAX protocol is just not responsive. The number of channels in sip show channels or iax show channels is similar, which I checked just to make sure both flood tools do approximately the same thing.

Now, I guess on IAX it is related to how many threads I configure in iax.conf. For now my settings are:

iaxthreadcount = 100
iaxmaxthreadcount = 150

I think the highest value possible is 512, which I tried, but it did not really improve the situation. The rtcachefriends setting in the config has no impact on my test results.

Can anyone confirm those results? Compared to SIP it is at the moment too easy to DoS IAX because of the poor performance. No fast internet connection is needed: just a few UDP packets and IAX is down. I am not saying that this is different on some older IAX implementations. So I think the latest developments are big improvements and should be merged quite soon, depending on more test results.

Best regards,
Loic Didelot.

On Fri, 2007-08-17 at 16:20 -0400, Russell Bryant wrote:

Greetings,

I have a branch where I have fixed some bugs in chan_iax2. The code is intended to go in both 1.4 and trunk. The changes are rather invasive, so I wanted to describe the changes and justify them before merging them in.

svn diff http://svn.digium.com/svn/asterisk/branches/1.4/channels/chan_iax2.c http://svn.digium.com/svn/asterisk/team/russell/iax_refcount/channels/chan_iax2.c

This branch is targeted at fixing a set of crashes.
They were brought to my immediate attention when I was given access to a system that could be easily crashed by issuing a reload while running a registration load test. The problems turned out to be related to the handling of iax2_peer objects. However, the handling of iax2_user objects suffers from the same problems, so the same changes have been made to them as well.

There are various situations where peers could get destroyed while other threads still hold references to them. To fix this problem, I made it so the objects are reference counted. To do this, I had 3 choices.

1) Don't use any object model. Just write it in manually. This felt like the worst choice. It's best to use a common implementation when possible.

2) Use astobj.h. This hasn't made it very far throughout the code base. The API is a set of macros which are cumbersome to use, and especially to debug, in my opinion. However, it's certainly better than option #1, as it is a common implementation already in use in some places. However, it is likely that this is going to be deprecated in favor of astobj2.

3) Use astobj2. Luigi Rizzo and one of his students, Marta Carbone, have written a new reference counted object model and have had it in a branch for a long time now.

After reviewing astobj versus astobj2, I preferred astobj2, and decided to go with that. It doesn't use macros, and is still more efficient.

Another reason that I chose to go with astobj2 was that I *know* this is not the end of problems of this type. In fact, there is a nice set of crashes on the bug tracker for which the fix will be converting the ast_channel struct over to a reference counted object model, which is going to be a big job. Since we are likely going to have to do more conversions of object handling to this method, I figured it was worth bringing in astobj2 to 1.4 so that we can use it as needed, and not do all of this work using a model that we know will be deprecated in the near future.
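The problem described above (a peer destroyed on reload while another thread still holds a reference to it) and the reference-counting fix can be sketched roughly as follows. This is only an illustration of the general technique in Python, not the astobj2 API and not the code in the branch; the names RefCounted, ref, unref and the destructor callback are hypothetical.

```python
import threading


class RefCounted:
    """Hypothetical reference-counted wrapper (illustration only, not astobj2)."""

    def __init__(self, payload, destructor):
        self._lock = threading.Lock()
        self._refs = 1              # the creator holds the first reference
        self.payload = payload
        self._destructor = destructor

    def ref(self):
        """Take an extra reference before handing the object to another thread."""
        with self._lock:
            if self._refs <= 0:
                raise RuntimeError("object already destroyed")
            self._refs += 1
        return self

    def unref(self):
        """Drop one reference; run the destructor when the last one is gone."""
        with self._lock:
            if self._refs <= 0:
                raise RuntimeError("unref on destroyed object")
            self._refs -= 1
            last = self._refs == 0
        if last:
            self._destructor(self.payload)
        return last


destroyed = []
peer = RefCounted({"name": "peer1"}, destroyed.append)  # the peer list owns 1 ref

worker_ref = peer.ref()       # a worker thread takes its own reference
peer.unref()                  # a "reload" drops the peer list's reference
still_alive = not destroyed   # the worker's object has not been destroyed
worker_ref.unref()            # the worker finishes; the destructor runs now
```

The point of the design is that the reload path no longer frees the object directly; it only gives up its own reference, and the object goes away when the last holder is done with it.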
If you have any comments or objections, I would like to hear them. Also, feel free to ask for any clarification or more detailed explanations of the things I have mentioned here. If there are no objections, I will be merging these changes into 1.4 next week.

--
Loic DIDELOT (CTO)
voipGATE S.A.
Tel: +352 20 200 223
Fax: +352 20 200 923
E-mail: [EMAIL PROTECTED]
Web: http://www.voipgate.com

___
--Bandwidth and Colocation Provided by http://www.api-digital.com--
asterisk-dev mailing list
To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev
Re: [asterisk-dev] IAX2 changes - thank you!
Yep, that is the tool I am using. For SIP I have a similar tool and will release the sources in 2 weeks; it needs some more code polish. I will post the backtraces on Monday. In the meantime, anyone who would like access to my test box should contact me via Jabber at [EMAIL PROTECTED]

Best regards,
Loic Didelot.

On Fri, 2007-08-10 at 15:47 +0300, Zoa wrote:

I sent that test tool to Digium about 2 years ago already :p
http://www.astertest.com/downloads/iax_mass_authorization.tgz

Zoa

Russell Bryant wrote:

Loic Didelot wrote:

If anyone is interested in core dumps just tell me. I can also give root access to a test box with the flood tool etc... This might speed up things as the test environment is ready.

Posting backtraces of these crashes to the bug tracker would be great. Getting access to this test tool to recreate the problem is even better.

Now, I know that those tests are not really an everyday situation. This is why I will try to put it in production on Monday and provide further test results.

This kind of testing is very valuable. You shouldn't be able to make it crash. It's a lot easier to fix the problem on a system where the crash can be recreated using a test tool as opposed to a production system where those 800 registrations are customers. :)
Re: [asterisk-dev] Re: Realtime and call states in SIP
Exactly. Writing config files on the fly and reloading at each change would be overkill. The realtime engine solves that problem. The realtime caching features are there to keep some load off the DB. At least that is how I have used it until now, and I guess many others have too.

Best regards,
Loic Didelot.

On Wed, 2007-05-30 at 09:10 -0400, Sergey Okhapkin wrote:

To me (and I believe to many others), realtime is a way to provide realtime provisioning of new clients, nothing else.

On Wednesday 30 May 2007 08:01, Kevin P. Fleming wrote:

I just don't understand why Realtime was supposed to 'save memory' at all;
Re: [asterisk-dev] IAX2 still broken
Hi guys,

I was in contact with Digium last week and provided access to our servers etc. The latest commits to 1.4 really do remove the deadlock. Now there is another bug: after some IAX registrations, Asterisk will crash. It does not matter if you use odbc or mysql. The solution is to set rtcachefriends to no in iax.conf. With this setting my box runs totally stable. Most of our traffic is IAX. OK, I admit that disabling the local cache brings up some other problems which are far more complicated to explain and which do not result in a nice core dump.

This tool is really good for testing IAX. Most of the problems we noticed are linked to registrations.
http://www.astertest.com/downloads/iax_mass_authorization.tgz

Best regards,
Loic Didelot.

On Thu, 2006-11-02 at 21:07 +0500, Anton wrote:

Thanks to Tim Panton, the following was found in 1.4svn IAX and submitted to the bugtracker:
http://bugs.digium.com/view.php?id=8273

--FROM TIM-
IAX2 calls go silent in 1.4. It looks like a meta (trunk) frame, but it doesn't have Meta Command or Cmd data set to valid values. The length is plausible for a Meta frame carrying a single (G.711) packet. Details in the bugtracker.

On 2 November 2006 14:10, Tim Panton wrote:

On 2 Nov 2006, at 04:31, Anton wrote:

Again, the OLD issue: after a while, IAX becomes 1way-or-no-audio operation. Any suggestions, or does anyone want to take a look?

I'd like to work with you on this. Have you ever managed to capture a packet trace with ethereal whilst it is silent? Are there voice mini-frames flowing? iax2 debug doesn't cut it as it ignores mini-frames, and I suspect it is mini-frames that are the problem.
Tim Panton
www.mexuar.com
Re: [asterisk-dev] Proposal to seperate qualify keep alive
Hi,

I actually like the idea of separating keep-alives from qualify. Being able to define the frequency of the packets is very important for adapting Asterisk's behaviour to the customers one has. This would solve many of our problems. Are there more people who need this? Is there a way to get this developed and included in Asterisk 1.4? voipGATE would be interested in cosponsoring this feature, but only if it will be included in the 1.4 stable release and if it is available for IAX as well.

Best regards,
Loic Didelot.

On Mon, 2006-06-26 at 13:06 -0700, John Todd wrote:

At 8:32 PM +0200 6/26/06, Johansson Olle E wrote:

On 26 Jun 2006, at 19:23, John Lange wrote:

In the current implementation, qualify sends out a SIP request at the specified interval and if it doesn't receive a reply within that same interval Asterisk flags the peer as unreachable.

This also acts as a sort of keep-alive for devices behind NAT when combined with the nat=yes parameter. The regular flow of SIP packets keeps the NAT connection alive for the device behind the firewall.

The problem is, these are two very different concepts and at times it would be nice if we could separate the two.

Specifically, we have some clients with devices behind NAT and satellite. Their NAT and satellite require a more-or-less constant flow of packets to keep the connection alive. However, due to the quirky nature of satellite combined with long round-trip times, the qualify option needs to be set high (5000ms) or Asterisk won't send calls to the client.

In fact we would like to set qualify=no, because often the client appears to be very lagged when the satellite perceives the connection to be idle (apparently it queues packets until it has a bunch and sends them in groups), but if you initiate a call the lag drops immediately to an acceptable level (800ms). But if we set qualify=no then the firewall closes the connection and they can't receive any calls.
So, the question is: is it reasonable to undertake the implementation of a keep-alive for SIP clients? Any thoughts on how this should be done? SIP NOTIFY, or would something else make more sense?

I don't see a reason for changing method. We should probably find a way to override and be able to dial out regardless of the monitoring status. That seems like a simple fix.

/O

I would actually agree that the two functions should be separated. I find myself often in the same position, where qualify= is used as a NAT mapping tool only, and I don't particularly care about the actual milliseconds of response time to the request.

I also think we would be well-served to make these timers a bit more flexible, since right now everyone is in the same bucket as far as timing goes for how frequently OPTIONS requests are sent. I'd like to be more aggressive for foolish people who have poorly-configured firewalls that close NAT UDP sessions after 30 (or fewer) seconds, and currently the only way to do this is to change the code to send ALL of my OPTIONS requests much more frequently, which eventually leads to a huge amount of nonsense noise on my network to solve for a few poorly behaved clients. SER sends bogus packets fairly frequently as part of its NAT module, and this seems to work well.

The current method in Asterisk has a few downsides:

1) OPTIONS packets are larger than just simple UDP keepalives (but not by much).

2) OPTIONS requests require stateful storage of status, so if I have 6000 SIP peers each using qualify=, then Asterisk needs to set a fairly large amount of memory aside to track each one of those transmitted OPTIONS requests, and if at any time 10% of those peers are slow to respond (say, two cycles) then I have a huge backlog of stateful requests in queue. If a UDP packet that did not require return receipt was sent just for NAT keepalives, this would be much lighter weight, and we could move the heavier OPTIONS request interval to a larger time value.

3) The current OPTIONS request is bursty, and all of the OPTIONS are sent in 60 second intervals using the same interval timer. This is really ugly, with big spikes of data every 60 seconds. This should probably be distributed so that each entry has its own timer.

I propose a different way to do this, with an example out of sip.conf listed below. I know that this will require the creation of memory space for each of these timers (and a whole slew of timer-related issues internally to Asterisk) but it does seem like it would be more flexible to do it this way and may reduce the amount of processing for the OPTIONS requests if just lightweight UDP can be sent for NAT translations. With this method, I could possibly crank up the OPTIONS qualifiers to something like 5 minutes, but leave the NAT translation keepalives down at 20 seconds and hopefully see less load on my Asterisk
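The idea in point 3 and in the proposal (a separate, cheap keepalive timer and a slower qualify timer, each spread out per peer instead of firing for everyone at once) can be sketched roughly as follows. This is a minimal Python illustration, not Asterisk code and not the sip.conf example from the message; the function and parameter names (build_schedule, keepalive_s, qualify_s, horizon_s) are hypothetical, and only the 20 second and 5 minute values come from the text.

```python
def event_times(offset, interval, horizon):
    """Times at which a timer with this offset and interval fires before horizon."""
    times = []
    k = 0
    while offset + k * interval < horizon:
        times.append(offset + k * interval)
        k += 1
    return times


def build_schedule(peers, keepalive_s=20, qualify_s=300, horizon_s=600):
    """Return sorted (time, peer, kind) events with per-peer staggered timers.

    Each peer gets its own offset inside the interval, so the packets are
    spread evenly instead of being sent in one burst per interval.
    """
    events = []
    n = len(peers)
    for i, peer in enumerate(peers):
        # Stagger peer i by i/n of the interval (hypothetical policy).
        for t in event_times(keepalive_s * i / n, keepalive_s, horizon_s):
            events.append((t, peer, "keepalive"))   # lightweight NAT packet
        for t in event_times(qualify_s * i / n, qualify_s, horizon_s):
            events.append((t, peer, "qualify"))     # heavier OPTIONS request
    return sorted(events)


schedule = build_schedule(["a", "b", "c", "d"])
keepalives = [e for e in schedule if e[2] == "keepalive"]
qualifies = [e for e in schedule if e[2] == "qualify"]
```

With four peers the keepalives land 5 seconds apart rather than all together, and the stateful qualify requests happen fifteen times less often than the keepalives.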
Re: [Asterisk-Dev] res_config_mysql.c connection problems = bug
Hello,

For every registration I have about 3-5 selects, and the same for every call. But I only have one update per registration, so one every 5 minutes. I have been able to do 5000 updates in 3 seconds on my MySQL server.

Best regards,
Loic Didelot.

On Wed, 2005-11-30 at 01:37 +1300, Matt Riddell wrote:

Loic DIDELOT wrote:

ps: I am working on a patch for res_config_mysql.c to send SELECT queries and UPDATE queries to 2 different servers (great for master-slave database setups). I got it working, except when Asterisk has connection problems. Just give me a sign if anyone is interested in reviewing my lousy C code.

Very very interested in this. I have been pulling my hair out (and I don't have much left) trying to figure out how to do it without two-way replication, so select and update splitting would be great.

What do you think the relationship would be in terms of quantity? Are there a lot more selects than updates?
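The select/update splitting discussed above can be sketched roughly as follows. This is only a Python illustration of the routing idea, not the res_config_mysql.c patch; the class SplitRouter and its methods are hypothetical, and the two "connections" are stand-in callables rather than real MySQL handles.

```python
class SplitRouter:
    """Send SELECTs to a read server and everything else to a write server.

    Hypothetical sketch: read_conn and write_conn are plain callables that
    take an SQL string, standing in for real database connections.
    """

    def __init__(self, read_conn, write_conn):
        self.read_conn = read_conn
        self.write_conn = write_conn

    def pick(self, sql):
        """Choose a connection from the first keyword of the statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        return self.read_conn if verb == "SELECT" else self.write_conn

    def execute(self, sql):
        return self.pick(sql)(sql)


# Stand-in servers that just record what they were asked to run.
slave_log, master_log = [], []
router = SplitRouter(slave_log.append, master_log.append)

router.execute("SELECT * FROM sippeers WHERE name = 'alice'")
router.execute("  select secret FROM sippeers WHERE name = 'bob'")
router.execute("UPDATE sippeers SET regseconds = 0 WHERE name = 'alice'")
```

With the ratio mentioned above (3-5 selects per registration and per call, one update per registration), most of the traffic would go to the read server in this scheme. Handling a lost connection, which is where the patch is said to have problems, is left out of the sketch.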
Re: [Asterisk-Dev] res_config_mysql.c connection problems = bug
Hello,

Not really. As this is the first thing I have ever submitted to a project, I really have no clue how I should submit... But I tried this:
http://bugs.digium.com/view.php?id=5881

Loic

On Wed, 2005-11-30 at 16:03 +1300, Matt Riddell wrote:

Loic DIDELOT wrote:

Hello, for every registration I have about 3-5 selects, and the same for every call. But I only have one update per registration, so one every 5 minutes. I have been able to do 5000 updates in 3 seconds on my MySQL server.

Cool, have you got a bug tracker id for the patches?