Re: XULRunner 10.0.6.esr cause CPU spike to 100%+ and network completely hangs while the UI still responsive.
The root cause of this CPU spinning issue appears to be that the socket transport service thread lacks an error-handling mechanism for system errors that occur in the nsSocketTransportService::DoPollIteration() - PR_Poll() - poll/select path. I have opened a new bug in Mozilla Bugzilla; can anyone help take a look at it? https://bugzilla.mozilla.org/show_bug.cgi?id=1119160
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
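For illustration only, here is a minimal sketch of the kind of error handling described in the message above. It uses plain POSIX poll() as a stand-in for PR_Poll(). It is not the patch attached to the bug, and the names PollOutcome, pollOnce, backoffMs and runPollLoop are hypothetical and do not exist in the Mozilla tree. The idea, under those assumptions, is that a failed poll is classified and followed by a bounded delay instead of an immediate retry, so a persistent system error cannot turn the loop into a busy spin.

```cpp
#include <poll.h>

#include <cassert>
#include <cerrno>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical classification of one poll() call (not a Mozilla/NSPR type).
enum class PollOutcome { Ready, Timeout, Interrupted, Failed };

// Call poll() once and classify the result instead of ignoring errors.
PollOutcome pollOnce(std::vector<pollfd>& fds, int timeout_ms) {
    const int rv =
        ::poll(fds.data(), static_cast<nfds_t>(fds.size()), timeout_ms);
    if (rv > 0) return PollOutcome::Ready;
    if (rv == 0) return PollOutcome::Timeout;
    // rv < 0: EINTR is harmless and can be retried; anything else is an error.
    return errno == EINTR ? PollOutcome::Interrupted : PollOutcome::Failed;
}

// Delay (in ms) to wait after `consecutive_failures` failed polls in a row:
// 1, 2, 4, ... doubling each time and capped at `max_backoff_ms`.
int backoffMs(int consecutive_failures, int max_backoff_ms) {
    assert(max_backoff_ms > 0);
    if (consecutive_failures <= 0) return 0;
    int delay = 1;
    for (int i = 1; i < consecutive_failures && delay < max_backoff_ms; ++i) {
        delay *= 2;
    }
    return delay < max_backoff_ms ? delay : max_backoff_ms;
}

// Run `iterations` poll iterations; returns how many of them failed.
// After a failure the loop sleeps before retrying, so a persistent error
// costs at most one wakeup per backoff interval rather than 100% CPU.
int runPollLoop(std::vector<pollfd>& fds, int iterations, int timeout_ms,
                int max_backoff_ms) {
    int consecutive_failures = 0;
    int total_failures = 0;
    for (int i = 0; i < iterations; ++i) {
        switch (pollOnce(fds, timeout_ms)) {
            case PollOutcome::Failed:
                ++consecutive_failures;
                ++total_failures;
                std::this_thread::sleep_for(std::chrono::milliseconds(
                    backoffMs(consecutive_failures, max_backoff_ms)));
                break;
            case PollOutcome::Interrupted:
                break;  // Retry on the next iteration without counting it.
            case PollOutcome::Ready:
            case PollOutcome::Timeout:
                consecutive_failures = 0;  // A good poll resets the backoff.
                break;
        }
    }
    return total_failures;
}
```

A real fix would also need to decide what to do with the socket that triggers the error (for example detaching it from the poll list); that part depends on the XULRunner code and is deliberately left out of this sketch.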
Re: XULRunner 10.0.6.esr cause CPU spike to 100%+ and network completely hangs while the UI still responsive.
On Wednesday, November 19, 2014 10:29:35 PM UTC+8, Josh Matthews wrote:
Maybe you can try the versions which landed on mozilla-beta (which was version 11 at the time); they might be closer to the ESR code you're using: https://bugzilla.mozilla.org/show_bug.cgi?id=710176#c73

Hi Josh,
Thanks for your reply. I noticed that Mozilla version 11 had already removed nsSSLThread (https://bugzilla.mozilla.org/show_bug.cgi?id=674147), and the patch given in Bug 710176 is based on the new infrastructure without nsSSLThread. However, XULRunner 10.0.6.esr also has the CPU spike issue, so I am wondering whether there is any fix for XULRunner 10.0.6.esr that is based on the old nsSSLThread implementation.
XULRunner 10.0.6.esr cause CPU spike to 100%+ and network completely hangs while the UI still responsive.
=== The problem we have: ===
Our product uses XULRunner 10.0.6.esr as its embedded browser core. We appear to be hitting the problem described in Bug 710176 - "Socket transport service thread pegs the CPU spinning to send data on a SSL socket that is blocking waiting for certificate validation to finish". This gives our Mac users a poor experience: once the CPU spikes to 100%+, it keeps spinning and the embedded browser can no longer open any web pages (no content is rendered). The network appears to hang completely while the UI stays responsive.

We captured a process sample of our product while the CPU was spiking, and in it we found many XULRunner threads (such as nsSocketTransportService, nsSSLThread, nsSSLIOLayerPoll...) active. We also used Xcode's Instruments tool to see which thread is spinning the CPU; the results show that XULRunner's nsSSLThread::requestPoll() - nsSocketTransportService::Poll() takes much of the CPU time.

The symptom suggests that XULRunner enters a loop between nsSSLIOLayerPoll() and nsSSLThread::requestPoll() that spins on the CPU and starves everything else. This can lead to a complete hang of all networking while the UI remains responsive: when the CPU spikes you can still open another browser tab, but you cannot load any pages from then on unless you restart the client.

=== The help that we need: ===
We have done a great deal of research on this problem but have not found a reliably reproducible case, which also seems to be the situation with Bug 710176. We are using the 10.0.6esr version of XULRunner and determined that the problem was addressed in version 12. However, that version also made many changes to the network service code, so it is not easy for us to backport the Bug 710176 patch to XULRunner 10.0.6.esr, since we are not XULRunner experts.

Specifically, we are looking for the following help:
(1) Identification of the specific changes made to address Bug 710176, plus guidance on how to backport them to 10.0.6esr.
(2) Is there a specific test case we could use to confirm that the backport was successful?
(3) Any suggestions on further regression testing that should be done to ensure that applying the patch to the earlier build has no unexpected side effects.

Many thanks in advance for your kind help.
Re: XULRunner 10.0.6.esr cause CPU spike to 100%+ and network completely hangs while the UI still responsive.
On 2014-11-19 3:38 AM, xingzhou...@gmail.com wrote:
[...] it is not easy for us to backport the Bug710176 patch to XULRunner10.0.6.esr since we are not the XULRunner experts. [...]

Maybe you can try the versions which landed on mozilla-beta (which was version 11 at the time); they might be closer to the ESR code you're using: https://bugzilla.mozilla.org/show_bug.cgi?id=710176#c73