Hi Daniel, 

On my initial test in prod, I can see already discrepancies:

```
root@kamailio2:~# kamctl rpc dlg.stats_active
{
  "jsonrpc":  "2.0",
  "result": {
    "starting": 4,
    "connecting": 55,
    "answering":  2,
    "ongoing":  253,
    "all":  314
  },
  "id": 23360
}
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551604",
    "dialog:early_dialogs = 15",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 120",
    "dialog:processed_dialogs = 340",
root@kamailio2:~#
```

Now we can go past 9223372036854776000 ??? I don't understand that.

That was right after restarting (green active dialogs, yellow early dialogs):

![image](https://user-images.githubusercontent.com/16212586/42706316-2926ed4e-868b-11e8-8f62-08bddc849d58.png)


Here is multiple runs of the stats.get_statistics:

```
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551614",
    "dialog:early_dialogs = 8",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 315",
    "dialog:processed_dialogs = 863",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551614",
    "dialog:early_dialogs = 8",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 315",
    "dialog:processed_dialogs = 864",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551614",
    "dialog:early_dialogs = 8",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 315",
    "dialog:processed_dialogs = 864",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551613",
    "dialog:early_dialogs = 10",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 320",
    "dialog:processed_dialogs = 866",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 0",
    "dialog:early_dialogs = 8",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 322",
    "dialog:processed_dialogs = 870",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 1",
    "dialog:early_dialogs = 9",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 323",
    "dialog:processed_dialogs = 873",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551615",
    "dialog:early_dialogs = 9",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 324",
    "dialog:processed_dialogs = 875",
root@kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
    "dialog:active_dialogs = 18446744073709551615",
    "dialog:early_dialogs = 10",
    "dialog:expired_dialogs = 0",
    "dialog:failed_dialogs = 325",
    "dialog:processed_dialogs = 876",
root@kamailio2:~#
```

I tried to have a script gathering metrics:

```
root@kamailio2:~# cat dialogs.sh
#!/bin/bash

while true; do
    DIALOGS=`kamctl rpc dlg.list | grep h_entry | sort  | wc -l`
    METRICS=`kamctl rpc stats.get_statistics all | grep -e "dialog:active" -e 
"dialog:early" | tr -d " " | tr -d "\n"`
    NEWMETRICS=`kamctl rpc dlg.stats_active | jq -c .result`
    echo "`date +'%Y-%m-%d %H:%M:%S.%N'` - dlg.list:$DIALOGS - 
stats.get_statistics:$METRICS - dlg.stats_active:$NEWMETRICS"
    sleep 1
done
root@kamailio2:~#
```

But, after a little kamailio "starts slowing down" until it dies (had to 
completely reboot the server).

```
root@kamailio2:~# ./dialogs.sh
2018-07-13 10:57:14.204158716 - dlg.list:419 - 
stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=12", - 
dlg.stats_active:{"starting":5,"connecting":62,"answering":7,"ongoing":288,"all":362}
2018-07-13 10:57:15.437165579 - dlg.list:422 - 
stats.get_statistics:"dialog:active_dialogs=15","dialog:early_dialogs=11", - 
dlg.stats_active:{"starting":4,"connecting":59,"answering":10,"ongoing":289,"all":362}
2018-07-13 10:57:16.678432892 - dlg.list:423 - 
stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=9", - 
dlg.stats_active:{"starting":2,"connecting":58,"answering":9,"ongoing":289,"all":358}
2018-07-13 10:57:17.915458776 - dlg.list:422 - 
stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=9", - 
dlg.stats_active:{"starting":4,"connecting":60,"answering":7,"ongoing":289,"all":360}
2018-07-13 10:57:19.149467641 - dlg.list:420 - 
stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=8", - 
dlg.stats_active:{"starting":3,"connecting":60,"answering":5,"ongoing":290,"all":358}
2018-07-13 10:57:20.386457513 - dlg.list:415 - 
stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=7", - 
dlg.stats_active:{"starting":4,"connecting":59,"answering":5,"ongoing":290,"all":358}
2018-07-13 10:57:21.627241266 - dlg.list:412 - 
stats.get_statistics:"dialog:active_dialogs=15","dialog:early_dialogs=7", - 
dlg.stats_active:{"starting":3,"connecting":60,"answering":6,"ongoing":292,"all":361}
2018-07-13 10:57:22.867957903 - dlg.list:415 - 
stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=9", - 
dlg.stats_active:{"starting":2,"connecting":64,"answering":5,"ongoing":291,"all":362}
2018-07-13 10:57:24.099604068 - dlg.list:418 - 
stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=10", - 
dlg.stats_active:{"starting":2,"connecting":65,"answering":3,"ongoing":294,"all":364}
2018-07-13 10:57:33.049706118 - dlg.list:411 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":4,"connecting":62,"answering":5,"ongoing":287,"all":358}
2018-07-13 10:57:34.290398926 - dlg.list:418 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:35.531092078 - dlg.list:419 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:36.762143019 - dlg.list:419 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:38.006519396 - dlg.list:420 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":10,"connecting":62,"answering":5,"ongoing":287,"all":364}
2018-07-13 10:57:39.254513350 - dlg.list:420 - 
stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":10,"connecting":62,"answering":5,"ongoing":287,"all":364}
2018-07-13 10:57:55.779414400 - dlg.list:431 - 
stats.get_statistics:"dialog:active_dialogs=2","dialog:early_dialogs=6", - 
dlg.stats_active:{"starting":8,"connecting":72,"answering":10,"ongoing":272,"all":362}
^C
root@kamailio2:~#
```

As you can see it starts taking >10s to reply some times, which I believe means 
it is dying, leaving a core and restarting:

```
root@kamailio2:/var/tmp# ls -lrt
total 47781984
-rw------- 1 root kamailio 8607031296 Jul 13 10:59 
core.kamailio.24376.13cn4.1531504766
-rw------- 1 root kamailio 2128039936 Jul 13 11:01 
core.kamailio.28422.13cn4.1531504860
-rw------- 1 root kamailio 2081087488 Jul 13 11:01 
core.kamailio.28420.13cn4.1531504860
-rw------- 1 root kamailio 2136645632 Jul 13 11:01 
core.kamailio.28421.13cn4.1531504860
-rw------- 1 root kamailio 2174545920 Jul 13 11:01 
core.kamailio.28423.13cn4.1531504860
-rw------- 1 root kamailio 5265956864 Jul 13 11:07 
core.kamailio.1335.13cn4.1531505030
-rw------- 1 root kamailio 5309550592 Jul 13 11:07 
core.kamailio.1339.13cn4.1531505030
-rw------- 1 root kamailio 5294923776 Jul 13 11:07 
core.kamailio.1336.13cn4.1531505030
-rw------- 1 root kamailio 5329113088 Jul 13 11:07 
core.kamailio.1334.13cn4.1531505030
-rw------- 1 root kamailio 5338894336 Jul 13 11:07 
core.kamailio.1338.13cn4.1531505030
-rw------- 1 root kamailio 5262925824 Jul 13 11:07 
core.kamailio.1337.13cn4.1531505030
root@kamailio2:/var/tmp#
```

I'm grabbing the `bt full` and `info locals` from all and I'll add them here 
shortly. I will have to rollback until you have time to check all this.



-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/1591#issuecomment-404911844
_______________________________________________
Kamailio (SER) - Development Mailing List
sr-dev@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev

Reply via email to