Hi Ben,

First, if you use topology hiding (TH), it makes no sense to do Record-Routing (RR): these are two SIP concepts that overlap. You act either as an end-point (by doing TH) or as a proxy (by doing RR).

If you do TH, it also makes no sense to use validate + fix, as these functions check and repair the routing information in the request (like the Route and Contact headers). With TH, this routing info is actually hidden and added by OpenSIPS, so there is nothing to check or repair.
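
As a rough illustration of the point above, the sequential-request block from the quoted script could be reduced as sketched below. This is only a sketch built from the functions already present in that script: the validate_dialog() / fix_route_dialog() step is dropped and everything else is left unchanged. The flag, route, and constant names (SEQ_REQUEST, ACC_FLAG, rlog, reply_error, LV_ERROR, EXIT) are assumed to be defined elsewhere in that config, and whether loose_route() is still needed in this setup has not been checked here.

```
# Sketch only, not a tested configuration.
# All names below come from the quoted script and are assumed to exist there.
if (t_check_trans())
  setflag(SEQ_REQUEST);

if (has_totag())
{
  # kept as in the quoted script (assumption: still wanted here)
  loose_route();

  if (match_dialog())
  {
    # no validate_dialog() / fix_route_dialog() here: with TH the
    # routing info is hidden and added by OpenSIPS itself
    if (is_method("BYE"))
      setflag(ACC_FLAG);
    setflag(SEQ_REQUEST);
  }
  else if (!isflagset(SEQ_REQUEST))
  {
    if (!is_method("ACK")) {
      route(rlog, LV_ERROR, "check_sequential", "Sequential request not matched");
      route(reply_error, "481", "Call Does Not Exist");
    }
    return(EXIT);
  }
}
```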

Nevertheless, this should not crash or corrupt OpenSIPS. Have you managed to get a corefile?

Also, if you suspect memory corruption, you can compile in the memory debugger; see http://www.opensips.org/Documentation/TroubleShooting-OutOfMem

Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com

On 26.07.2016 23:20, Newlin, Ben wrote:

I have had 3 OpenSIPS server crashes in the last week. All were due to segmentation faults. I was not able to capture core dumps; I am configuring that now to catch the next crash.

My logs leading up to the crash are full of errors from fix_route_dialog() complaining about invalid URIs for sequential requests:

Jul 26 19:34:02 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri

Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ip:1> (4) / <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)

Jul 26 19:11:19 [218] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri

Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri, state 0 parsed: <b0i2> (4) / <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on> (65)

Jul 26 17:43:19 [220] ERROR:dialog:fix_route_dialog: Failed to parse SIP uri

Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri, state 0 parsed: <ervi> (4) / <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)

Many times the “URI” displayed in the error message is actually internal OpenSIPS variables, as in the last error above. When they are from the SIP message, I have verified that the messages themselves are correctly formatted. This leads me to believe there is memory corruption occurring.

This all started when I updated my load-balancer servers to use Record-Routing, specifically the “double_rr” mechanism for when multiple interfaces exist. The Record-Routing is occurring on different servers which have not crashed. Only the servers receiving the Record-Routed messages are experiencing the errors.

Here is a piece of the code processing sequential requests. I am using the topology_hiding() functionality of the Dialog module. Are validate_dialog() and fix_route_dialog() still valid in a topology_hiding scenario?

if (t_check_trans())
  setflag(SEQ_REQUEST);

if (has_totag())
{
  loose_route();

  if (match_dialog())
  {
    if (!validate_dialog())
      fix_route_dialog();

    if (is_method("BYE"))
      setflag(ACC_FLAG);

    setflag(SEQ_REQUEST);
  }
  else if (!isflagset(SEQ_REQUEST))
  {
    if (!is_method("ACK")) {
      route(rlog, LV_ERROR, "check_sequential", "Sequential request not matched");
      route(reply_error, "481", "Call Does Not Exist");
    }

    return(EXIT);
  }
}

I will attempt to get core dumps of future crashes.

Thanks,

Ben Newlin



_______________________________________________
Users mailing list
Users@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
