Hi Taisto,

Your problem is not timer-related, nor related to how serial forking is done
in OpenSIPS (I will comment on these in a later reply).
Right now, the quick answer to fix your problem: the failure route must be
re-armed after each branch; this is why your failure route does not catch the
end of the second branch. Adding a t_on_failure("1") before the t_relay() in
your failure route will fix your problem.

Regards,
Bogdan

Taisto Qvist wrote:
> Hi Bogdan,
>
> I've now been trying with some tests, and I can't really get it to work,
> since the transaction layer on the server transaction returns a 408 back
> to the UAC before serial forking has ended.
> This seems a little bit related to something I commented on a long time
> ago regarding the handling of Timer C, and the fact that Timer C seems
> to be quite "tied" to Timer B.
>
> When the fr_timer pops (causing the CANCEL to be sent so that we can
> move on to the next serial-fork target), the tm layer seems to store
> this timer pop as a 408 response:
>
> 20:41:44 osips[4686]: DBG:tm:utimer_routine: timer routine:4,tl=0xb5b6770c next=(nil), timeout=649300000
> 20:41:55 osips[4686]: DBG:tm:timer_routine: timer routine:1,tl=0xb5b67728 next=(nil), timeout=660
> 20:41:55 osips[4686]: DBG:tm:final_response_handler: stop retr. and send CANCEL (0xb5b675c0)
> 20:41:55 osips[4686]: DBG:tm:t_should_relay_response: T_code=180, new_code=408
> 20:41:55 osips[4686]: DBG:tm:t_pick_branch: picked branch 0, code 408 (prio=800)
>
> As the capture and log I've attached indicate, I am not able to perform
> a three-step serial fork. I have three UASes registered with q-values
> of 1.0, 0.9, and 0.8.
>
> The first timer pop causes a CANCEL and a new INVITE towards the UAS
> with q=0.9, but when it pops the second time, TM still cancels the
> second target; instead of continuing with the third, it sends a 408
> towards the UAC.
>
> It might be something with my script handling in the failure_route, so
> here it is:
>
>     failure_route[1]
>     {
>         if ( t_was_cancelled() )
>         {
>             log(2, "transaction was cancelled by UAC\n");
>         }
>         xlog("(lab1) - In FailureRoute: branches=$(branch(uri)[*])\n");
>         if ( isflagset(1) )
>         {
>             log(2, "(lab1) - 3++ Received, attempting serial fork!\n");
>             next_branches();
>             switch ( $retcode )
>             {
>                 case 1:
>                     log(2, "(lab1) - More branches left, rollOver timer set.");
>                     $avp(s:timerC) = 12;
>                     setflag(1);  # Do I need this? Should I use branch flags instead?
>                     break;
>                 case 2:
>                     log(2, "(lab1) - Last branch, timerC set to 60 sec");
>                     $avp(s:timerC) = 60;
>                     break;
>                 default:
>                     log(2, "(lab1) - No more serial fork targets.");
>                     exit;
>             }
>             if ( !t_relay() )
>             {
>                 log(2, "(lab1) - Error during relay for serial fork!\n");
>             }
>         }
>         else
>         {
>             log(2, "(lab1) - 3++ result. Serial forking not available.\n");
>         }
>     }
>
> When I say that it seems related to another issue I commented on a long
> time ago, I am referring to the general handling of Timer C, which
> doesn't seem to be a separate timer, but reuses Timer B.
>
> When the timer pops after the normal 180 seconds, the TM layer will
> *instantly* generate a 408 response on the server txn, while at the
> same time generating the CANCEL attempting to terminate the client txn.
> To me, this is wrong, but maybe I am supposed to handle this in the
> failure_route?
>
> What I would expect is that the CANCEL causes a 487 response from the
> UAS, and that this becomes the final response sent to the UAC.
> Also, by behaving this way, we may cause a protocol violation, even
> though the risk is small.
>
> Once Timer C pops, we send the CANCEL hoping that it will cause a 487.
> BUT, it is quite possible that before the CANCEL is received by the
> UAS, it sends a 200 to the INVITE! Even IF the CANCEL receives a 2xx
> response, we may still get a 2xx response to the INVITE.
> But with the current behavior of OpenSIPS, this would cause OpenSIPS to
> proxy TWO final responses on the server txn: first the initial 408 sent
> by the txn on Timer C timeout, and then the real/actual 2xx sent by the
> UAS.
>
> I've also seen a similar problem with 6xx responses received on a
> branch during forking.
> OpenSIPS forwards the 6xx *before* the remaining client txns have
> completed, and there is no guarantee that these client txns will all
> terminate with a 487, even if OpenSIPS tries to CANCEL all of them
> ASAP.
> They may still return a 2xx to the INVITE, which would cause the
> forwarding of both a 6xx and a 2xx on the server txn. This scenario is
> even mentioned in RFC 3261.
>
> So all three of these problems have in common that the server txn seems
> to be terminating a bit early, before the client side has fully
> completed; but as I said, it might at least partially be something I
> should handle in my failure_routes...?
>
> Thanks for all your help.
> Regards,
> Taisto Qvist
>
>
> Bogdan-Andrei Iancu wrote 2010-10-06 17:04:
>> Hi Taisto,
>>
>> could you test rev 7248 on trunk for solution 2)? If OK, I will
>> backport it to 1.6.
>>
>> Regards,
>> Bogdan

--
Bogdan-Andrei Iancu
OpenSIPS Bootcamp
15 - 19 November 2010, Edison, New Jersey, USA
www.voice-system.ro

_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
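[Editor's note] Putting Bogdan's suggested fix together with the failure route
quoted above, the relay section would re-arm the failure route before relaying
the next branch. This is a hedged sketch in OpenSIPS routing-script syntax, not
a tested configuration: it reuses the route index "1", flag 1, and the
$avp(s:timerC) variable exactly as they appear in Taisto's script, and assumes
a version of OpenSIPS in which t_on_failure() may be called from inside a
failure route, as Bogdan's reply indicates.

```
failure_route[1]
{
    if ( t_was_cancelled() )
    {
        log(2, "transaction was cancelled by UAC\n");
    }

    if ( isflagset(1) )
    {
        next_branches();
        switch ( $retcode )
        {
            case 1:
                # more branches remain after this one: short roll-over timer
                $avp(s:timerC) = 12;
                setflag(1);
                break;
            case 2:
                # last branch: allow the full 60 seconds
                $avp(s:timerC) = 60;
                break;
            default:
                # no more serial-fork targets
                exit;
        }

        # Re-arm the failure route BEFORE relaying, so that the end of
        # this next branch is caught here as well (Bogdan's fix).
        t_on_failure("1");

        if ( !t_relay() )
        {
            log(2, "error during relay for serial fork\n");
        }
    }
}
```

Without the t_on_failure("1") call, the failure route fires only for the
branch it was armed on, which matches the symptom described above: the second
timeout produces a 408 towards the UAC instead of trying the third target.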