Hi all,

Please forgive the slightly long post, but if you have anything to contribute on this topic, please consider giving it a read as I could really use your input. :-)

Like, I'm sure, many of you who run proxy-based service delivery platforms of some description, I am faced with the problem of accounting for calls with missing BYEs in a realistic way. There is no shortage of mailing list posts on this topic over the years. Inevitably, on a platform with sufficient call volume, NAT'd endpoints, endpoint diversity, and other technical causes, there will be some calls that are never officially terminated from the point of view of a proxy.

The ability of the 'dialog' module to spoof bidirectional BYEs on timeout[1] goes a long way toward addressing this problem theoretically. However, there are practical obstacles to relying on it as the sole solution, mainly because no single timeout value makes an acceptable trade-off. If the timeout period is set very low, users will obviously complain about calls being cut off, and in any case, depending on the destination, the worst-case billed amount for a call may still be far too high. If the timeout period is set high--perhaps something like 5-8 hours--then every call that fails to end in the normal way will be billed for the full timeout period, an excessively large amount that certainly will not sit well with users either.
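To put rough numbers on that trade-off, here is a small Python sketch. The per-minute rates are made up purely for illustration (they are not anyone's actual tariffs), and the function name is hypothetical. It simply assumes a call with a missing BYE is billed for the whole dialog timeout.

```python
from decimal import Decimal


def worst_case_charge(timeout_hours: int, rate_per_minute: Decimal) -> Decimal:
    """Charge for a call that is only ever ended by the dialog timeout."""
    return rate_per_minute * (timeout_hours * 60)


# Hypothetical rates, chosen only to show the scale of the problem.
for hours in (1, 5, 8):
    for rate in (Decimal("0.01"), Decimal("0.50")):
        charge = worst_case_charge(hours, rate)
        print(f"{hours} h timeout at {rate}/min -> {charge}")
```

Even the short timeout hurts on an expensive destination, and the long one hurts everywhere, which is why no single value works.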

If either the core delivery element of the platform or the user agent is tightly controlled, administratively, by the operator of the proxy, it is probably possible to rely on RTP timeouts or SIP Session Timers (SSTs) on one of the endpoints.

That doesn't create a satisfying resolution for those of us dealing with indeterminate call completion scenarios with a great deal of user and vendor diversity, though. For instance, I route to about 15 ITSPs and carriers; I think maybe one of them does 15-minute SSTs, and the rest are certainly not going to turn them on just for me, even if their SBCs/switches/things have the capability. The user endpoints are mostly Asterisk and do RTP timeout, of course, and in most cases I do get the resulting BYE. However, this discussion is about the minute but nontrivial percentage of cases in which I do not get the BYE, whether because of NAT statekeeping problems or network reachability or whatever underlying causes--in truth, I cannot accurately characterise these.
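For reference, this is roughly the bound an SST gives you when it is available. The sketch below (Python, hypothetical function name) follows the recommendations in RFC 4028 as I understand them: the refresher re-INVITEs or UPDATEs at half the session interval, and the other side sends a BYE slightly before expiry, by the smaller of 32 seconds and one third of the interval.

```python
from typing import Tuple


def session_timer_deadlines(session_expires: int) -> Tuple[int, int]:
    """Return (refresh_at, bye_at) in seconds since the last successful refresh.

    refresh_at: when the refresher is recommended to send its refresh.
    bye_at:     when the non-refreshing side is recommended to give up and BYE.
    """
    refresh_at = session_expires // 2
    bye_at = session_expires - min(32, session_expires // 3)
    return refresh_at, bye_at


# A 15-minute session timer, as in the one carrier mentioned above.
print(session_timer_deadlines(900))
```

So with a 15-minute timer, a dead call is overbilled by at most about a quarter of an hour rather than by the full dialog timeout.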

So, it seems to me that from a theoretical point of view, there are basically two directions someone in this position can go from here:

1) Inline B2BUA in the signaling path of all calls;

1a) Make it do SSTs; or
1b) Make it relay media, too, and hang up the call (bidirectional BYE) on RTP receive timeout;

2) Couple the proxy to an RTP relay and provide some mechanism by which the proxy can be made aware, in an asynchronous fashion, that an RTP timeout was detected by the relay.

From a brief and informal survey of prior mailing list literature, it seems to me that #1 is the option usually recommended here.

If #1 is pursued, what is the best tool to use in the Kamailio/SIP-Router-oriented ecosystem? My default instinct would say SEMS; I really like SEMS, and use it a lot for various related chores.

The problem is that the pre-built modules and examples for SEMS mostly center on application-level functionality, while low-level documentation of its powerful C++ API is a bit impoverished, so this would take a lot of work.

Needless to say, I am interested in the option that requires the least work but still solves the problem in an elegant way from a technical and--dare I say--aesthetic perspective.

For instance, it seems clear from looking at the SEMS-1.1.1 sources that SSTs are supported in principle in core/plug-in/session_timer. But unless I am missing something, I cannot find anywhere in the sources or examples where it is actually used.

So, I suppose one option is to figure out how to make this stuff work in SEMS, and make it work. But for someone who is not attuned to the universe of its C++ API, it is a rather formidable chore. I think the same would hold true of making it observe bidirectional RTP timeout.

Turning attention to option #2, I have looked at rtpproxy (my preferred default), iptrtpproxy, and mediaproxy modules but have not found any evidence that the control protocols Kamailio/SR uses to engage them support any notion of backward asynchronous feedback in case of RTP timeout.

It would be really nice if one of these stream control protocols were augmented to kick back a packet to Kamailio that could be caught in a special event_route, like event_route[nathelper:rtp-stream-timeout], but that is clearly not the case today.

To be honest, I would not use MediaProxy even if it had this feature, because, well, let's be bluntly honest and acknowledge what the more politically aware presumably already conjecture: in light of AG Projects' zealous OpenSIPS partnership, it's difficult to muster confidence in future compatibility of MediaProxy with Kamailio. The module is there, it works, and I'm sure its maintainers are dedicated to doing whatever it takes to reverse engineer and keep it working, lift patches from OpenSIPS as necessary, etc., but who wants to be on the wrong side of the project ecosystem fence? Not I.

That leaves iptrtpproxy, whose 'switchboard' concept I do not fully comprehend due to lack of experience with it, but which holds a potentially viable, if slightly kludgy/Rube Goldbergian, answer. Of the three RTP proxies, it is the only one that provides a ready means of exporting a list of the media streams it is currently tracking, together with statistics on how many packets have been received, etc. It is not inconceivable to cook up an external process that will frequently check this 'switchboard', as it were, and incite Kamailio/SR to do the equivalent of dlg_bye() via MI if it appears that the media stream has disappeared from either side; the dialog module helpfully exports the MI command dlg_end_dlg for this purpose.
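A minimal sketch of the decision logic such an external process might use, in Python. Everything here is an assumption for illustration: the snapshot format (call ID mapped to per-leg packet counters), and the fetch_snapshot and end_dialog callbacks, which stand in for "read the switchboard statistics" and "issue dlg_end_dlg via MI" respectively. I have not written either of those; in particular, mapping a call ID to whatever dialog identifiers dlg_end_dlg expects is left to the callback.

```python
from typing import Callable, Dict, Set, Tuple

# Packet counters per call: (packets seen from caller, packets seen from callee).
Snapshot = Dict[str, Tuple[int, int]]


def find_stalled(prev: Snapshot, curr: Snapshot) -> Set[str]:
    """Return call IDs whose media stopped advancing on at least one leg."""
    stalled = set()
    for call_id, (caller_now, callee_now) in curr.items():
        if call_id not in prev:
            continue  # first sighting: no baseline to compare against yet
        caller_before, callee_before = prev[call_id]
        if caller_now <= caller_before or callee_now <= callee_before:
            stalled.add(call_id)
    return stalled


def poll_once(prev: Snapshot,
              fetch_snapshot: Callable[[], Snapshot],
              end_dialog: Callable[[str], None]) -> Snapshot:
    """Take one snapshot, end every stalled call, return the new baseline."""
    curr = fetch_snapshot()
    for call_id in sorted(find_stalled(prev, curr)):
        end_dialog(call_id)
    return curr
```

In practice one would call poll_once on a timer and probably require several consecutive stalled polls before hanging up, since calls on hold or endpoints using silence suppression can legitimately stop sending RTP for a while.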

Still, this does not seem nearly as parsimonious and reliable a solution as simply building some kind of RTP stream leg timeout notification into the control socket. After all, the control socket is open persistently, right, not on-demand? The various RTP proxies all seem to have some kind of dead peer detection internally in order to have some means of gracefully expiring resources allocated to media streams that have gone away, so it would just be a matter of passing a control frame up the socket to Kamailio/SR and wiring that to a custom event_route or a more static callback in the code.

By the way, I should mention that I am aware of and historically very sympathetic to the perspective that this kind of call control is alien to the nature of a proxy, and an appropriate job for UAs and not proxies at all. However, we all have to make pragmatic concessions to the realities of real-world operation, which I assume is the motivation for dialog timeouts, dlg_bye(), and other perversions from the point of view of a purist. :-)

I welcome your thoughts and suggestions about the easiest and most technically meritorious approach.

Thanks,

-- Alex

[1] Enabled via $dlg_ctx(timeout_bye) = 1

--
Alex Balashov - Principal
Evariste Systems LLC
1170 Peachtree Street
12th Floor, Suite 1200
Atlanta, GA 30309
Tel: +1-678-954-0670
Fax: +1-404-961-1892
Web: http://www.evaristesys.com/

_______________________________________________
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
sr-users@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
