I also hit the pkg memory out-of-memory issue on a 1.3 server, just to confirm that this problem really exists. I can't confirm after how many calls it occurs ...
Regards,
Ovidiu Sas

On Mon, Oct 6, 2008 at 11:08 AM, Daniel-Constantin Mierla <[EMAIL PROTECTED]> wrote:
> Hello,
>
> On 10/06/08 13:22, mayamatakeshi wrote:
>> [...]
>>
>> Hello,
>> we have openser 1.3.3 running in production (current rev.: 4943).
>> Three times in 50 days we had to restart openser to correct a pkg memory problem.
>>
>> openser 1.3.3 was released 3 weeks ago, so I guess you were running a
>> previous version before, but it happened again since you upgraded to
>> 1.3.3, right?
>>
>> After some time logging messages like this:
>> /openser.log:Aug 19 10:39:18 ipx022 /usr/local/sbin/openser[16991]:
>> ERROR:core:new_credentials: no pkg memory left
>> openser will eventually run out of pkg memory and refuse all subsequent
>> requests.
>>
>> We are trying to recreate this in our lab so that we can follow the
>> memory troubleshooting instructions at
>> http://kamailio.net/dokuwiki/doku.php/troubleshooting:memory,
>> but so far we have been unable to, even when generating millions of calls
>> and registration transactions (we are using SIPp to generate normal call
>> flows and even abnormal call flows detected while reading openser.log,
>> like 'invalid cseq for aor', malformed SIP messages, etc.).
>>
>> We can spot memory leaks even if the "out of memory" message is not
>> printed. Just archive the logs (the most important part is the shutdown
>> time) and make them available for download so they can be investigated.
>>
>> There could be two reasons:
>> - there is a memory leak, but it happens in some cases that you don't
>>   reproduce in the lab, yet they occur in the production environment
>> - you get memory fragmentation
>>
>> Let's see the debug messages first...
>>
>> Hello,
>> here are the links for openser.log and the cfg files:
>> http://www.yousendit.com/download/bVlEV0o4R3NoeWJIRGc9PQ
>>
>> After compilation with debug flags for the memory manager, I left openser
>> running in production for 24 hours.
>> Then, I moved all traffic to another host and waited for more than 30
>> minutes before stopping openser.
>> In openser.cfg, I set debug=2. If you need, I can run it again with a
>> higher value (but I hope it doesn't have to be too high, due to overhead
>> concerns).
>>
>> Sorry, I forgot to mention one thing: the last revision that showed this
>> problem was 4809, so we reverted back to that revision before performing
>> the above.
>>
>> Am I to understand that you couldn't reproduce it with the latest svn
>> version? So you had to go back to a previous version?
>>
>> Hi,
>> no, the reason for the reversion is that the latest version running in
>> production will not show the problem, because we adopted preventive
>> restarts to minimize the impact on customer calls. So I don't know yet
>> whether it shows this problem or not.
>> So I collected the logs using a revision that I was sure could recreate
>> the problem.
> OK, I understand now. I was looking at the logs and there seems to be a
> leak with db operations - something does not free a db result. I will go
> over the modules that you are using and try to spot any issue -- I will
> check the change log to see if something happened recently regarding such
> an issue.
>>
>> But here are some developments in my investigation:
>> Up to now, I was trying to recreate the problem using virtual machines
>> running the same OS (Fedora 5) as in production. It never happened there,
>> even after 30 million calls.
>> But we were eventually able to test openser 1.3 on a production machine
>> with the same spec as the ones showing the problem, and we were able to
>> trigger the pkg memory problem with a simple outgoing SIPp scenario. The
>> problem always happens after we reach around 28,000 calls, and we
>> confirmed that the number of calls needed to cause the problem grows
>> linearly with the amount of pkg memory (after increasing the pkg memory
>> pool by 4x, the problem started to happen only after around 128,000
>> calls).
>> However, we also tried the same tests with kamailio 1.4 (rev. 5017) on
>> that machine, and we could not recreate the problem after 1.5 million
>> calls, so we are thinking of just upgrading to 1.4 once other scenarios
>> show that everything else is working.
> OK, 1.4 is recommended; it has lots of new features and many fixes.
>>
>> But I don't know why the problem cannot be recreated using the VMs: the
>> only significant difference is that the production machines have 4 NICs
>> that are bonded in 2 pairs (1 for the private IP and another for the
>> public IP), while the VMs have just one NIC.
> I see no relation with the NICs.
>>
>> I hope upgrading to 1.4 will solve everything; however, since nobody else
>> is complaining about openser stopping after 28,000 calls, I still believe
>> we have some problem in the openser.cfg itself. I'll check it after we
>> put kamailio 1.4 in production.
> OK, I will dig in further; I might be a bit slow these days, however.
>
> Cheers,
> Daniel
>
> --
> Daniel-Constantin Mierla
> http://www.asipto.com

_______________________________________________
Users mailing list
[email protected]
http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
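P.S. For anyone following the troubleshooting:memory procedure above: once the memory manager is compiled with debugging, the shutdown log contains one status line per chunk still allocated, and ranking those lines by allocation site points at the leak suspect (Daniel's db-result hint, for instance, would show up as one site dominating the list). A minimal sketch, assuming a qm_status-style line format; the sample file, the db_res.c location, and the line numbers below are invented for illustration:

```shell
# Hypothetical excerpt of a shutdown memory-status dump (the exact line
# format and the allocation sites are assumptions for illustration only).
cat > /tmp/openser-memdump.log <<'EOF'
qm_status: alloc'd from db_res.c: db_new_result(81)
qm_status: alloc'd from db_res.c: db_new_result(81)
qm_status: alloc'd from parser/msg_parser.c: sip_msg_cloner(560)
EOF

# Rank allocation sites by how many chunks are still held at shutdown;
# the top entries are the leak suspects.
grep "alloc'd from" /tmp/openser-memdump.log | sort | uniq -c | sort -rn | head
```

The same one-liner works directly on the archived production log in place of the sample file.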
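P.P.S. The linear scaling reported above (~28,000 calls on the default pool, ~128,000 after growing the pool 4x) is consistent with a fixed-size leak per call rather than fragmentation. A back-of-the-envelope estimate, assuming the default pkg pool in that openser generation was 4 MB (an assumption; check PKG_MEM_POOL_SIZE in your own build's config.h):

```shell
# Back-of-the-envelope leak estimate from the numbers in the thread.
# ASSUMPTION: the default pkg pool is 4 MB -- verify against your build.
POOL_BYTES=$((4 * 1024 * 1024))   # assumed default pkg pool size
CALLS_TO_OOM=28000                # calls observed before pkg memory ran out
echo "approx. leak per call: $((POOL_BYTES / CALLS_TO_OOM)) bytes"
# prints: approx. leak per call: 149 bytes
```

A leak of roughly 150 bytes per call would fit a small unfreed structure such as a db result header, which matches the db-operations lead mentioned earlier in the thread.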
