I am not sure where I should report this so that it gets proper attention. Please suggest a better newsgroup if there is one. I am posting this to mozilla.dev.apps.thunderbird, mozilla.dev.platform, and mozilla.tools.
While testing C-C TB with its test suite (|make mozmill|), I often encounter random errors. The symptoms vary, but one of them is a strange timeout error. (Tryserver runs also hit some of these, though not as badly as mine, I think.)

To run the tests, I create a virtual X desktop with the Xephyr program and run the tests inside it, so that I can work on other things on the real X desktop without fear of disrupting the running TB during the |make mozmill| tests.

E.g., the code snippet that sets up the virtual desktop:

    # Use a separate window as Xserver screen.
    Xephyr -ac -br -noreset -screen 1280x768 :1 &
    DISPLAY=localhost:1.0
    sleep 2
    # oclock &
    xfwm4 &

TB runs in the window specified by $DISPLAY under the xfwm4 window manager. This setup works just fine. I am explaining the use of Xephyr because it may be relevant to the following discussion. (I wonder what Tryserver does for displaying the application screen.)

About the timeout and how I discovered a strange cure:

By sheer coincidence, I found that some timeouts disappear if I perform the following operation while the test is running. [The explanation that follows is hard to believe, so I intend to capture a video clip of the screen in a day or two, if I can do so successfully.]

Symptom:

Many random errors crop up without any discernible reason. The connection from the test harness to thunderbird is dropped due to a timeout, or some popup does not appear in time, etc.

When such a timeout occurs while I run a particular test manually using the SOLO_TEST feature of |make mozmill|, the screen seems to get stuck without any action for a long time. Then the timeout kicks in, and the harness proceeds to the next test, or quits if it was at the end of the series of tests. I can tell that the "screen gets stuck" because, until that point, TB shows its menus being opened, selected, and closed, or some other actions taking place in its 3-pane window, etc.
Suddenly, there is no visible action at all, so it is relatively easy to see that TB has got stuck.

Strange Cure:

The other day, I was staring at the screen during |make mozmill| when such a stuck situation occurred. I noticed that the cursor (the mouse cursor, that is) had been left stationary, and for no particular reason I simply MOVED the mouse WITHOUT pushing or clicking any buttons. Lo and behold, the stuck test suddenly started to proceed again [I could see activity in the TB window]. After a while TB got stuck again, with a pull-down menu showing and nothing happening. [It was as if the menu selection that the mozmill test program had instructed TB to perform was not done at all.] I got curious, and cautiously moved the mouse a little, again WITHOUT touching the buttons at all. To my UTTER SURPRISE, the test AWOKE from the hang, so to speak, proceeded with the rest of the operation, and finished. The timeout error that had been reported earlier in the |make mozmill| test did not occur(!!!).

I retried similar tests, and I conclude that about 95% of the time the strange errors disappear with mouse movement (without any button operation). Some random errors seem to persist no matter what.

This makes me think that there is a subtle feature/interaction/bug in the test harness code and event handling.

I stress that I did not push or click any buttons because I did not want to disrupt TB by sending button-down or button-up events, etc.

But one test requires caution: when I tried this "Strange Cure" of moving the mouse to keep a stuck test from causing random timeouts, one test in composition/test-attachment-reminder.js seemed to "GRAB" the mouse and tried to pin it down on an "Alert" button that it shows in the upper right corner. If this happens, moving the mouse actually moves the whole TB window around, too. I had to slowly push and release the mouse button to clear this PINNING. This may or may not result in a test failure: sometimes the test fails and sometimes it succeeds.
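For anyone who wants to try the mouse-movement workaround described above without sitting at the screen, here is a minimal sketch that nudges the pointer on the nested X display at intervals. It rests on several assumptions that are not from my setup: that xdotool is installed, that the Xephyr display is :1, and that synthetic pointer motion has the same effect as moving the physical mouse (I have not verified this). The names jiggle_once, jiggle_loop, JIGGLE_DISPLAY, JIGGLE_INTERVAL and XDOTOOL are hypothetical and only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: periodically nudge the pointer on the nested X display.
# Assumptions (not from the original setup): xdotool is installed and the
# tests run on display :1. Whether synthetic motion unsticks a test the way
# physical mouse motion does is unverified.

JIGGLE_DISPLAY="${JIGGLE_DISPLAY:-:1}"    # display where TB is running
JIGGLE_INTERVAL="${JIGGLE_INTERVAL:-5}"   # seconds between nudges
XDOTOOL="${XDOTOOL:-xdotool}"             # overridable, e.g. for a dry run

# Move the pointer one pixel right and then back again.
# Only motion is generated; no button press or release is sent.
jiggle_once() {
    DISPLAY="$JIGGLE_DISPLAY" "$XDOTOOL" mousemove_relative -- 1 0 &&
    DISPLAY="$JIGGLE_DISPLAY" "$XDOTOOL" mousemove_relative -- -1 0
}

# Keep nudging until xdotool fails (for example, when Xephyr has exited).
jiggle_loop() {
    while jiggle_once; do
        sleep "$JIGGLE_INTERVAL"
    done
}
```

One way to use it would be to source the file, start "jiggle_loop &" in the background, remember its PID from $!, run |make mozmill|, and kill the PID afterwards. Given the pinning behaviour of composition/test-attachment-reminder.js, pointer motion during that test may drag the TB window around, so it may be wiser to leave the loop off for that file.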
Anyway, in case readers doubt that this "Strange Cure" works [I would doubt it myself if I had only heard the story without seeing it], here are a few tests that hit a random "TEST-UNEXPECTED-ERROR" during a full |make mozmill| run, but succeeded when I invoked the individual file manually using SOLO_TEST and used the "Strange Cure" to release TB from its "stuck state" on the screen. Those who use a linux PC for development might want to try the cure themselves.

Note that the stuck state does not always happen. This is why these are termed "random" failures. So invoking a test manually may or may not show the stuck symptom. The "Strange Cure" helps very often (though not 100% of the time) when a test somehow gets stuck.

TEST-START | /REF-COMM-CENTRAL/comm-central/mail/test/mozmill/composition/test-signature-updating.js | testHTMLComposeWindowSwitchSignaturesWithSuppressedSeparator

The above once caused a timeout, but with manual invocation and the "Strange Cure" it worked just fine.

The next one, too. It often reports errors on my 64-bit test PC, and I did not know what was wrong. But when I ran this test manually and used the "Strange Cure", it ran successfully (!) a few times in a row!

TEST-UNEXPECTED-FAIL | /REF-COMM-CENTRAL/comm-central/mail/test/mozmill/content-tabs/test-install-xpi.js | test-install-xpi.js::test_install_xpi_offer

This is the test with which I first discovered the "Strange Cure". The timeout occurs during |make mozmill|, but moving the mouse clears the issue!

TEST-UNEXPECTED-FAIL | /REF-COMM-CENTRAL/comm-central/mail/test/mozmill/folder-tree-modes/test-mode-switching.js | test-mode-switching.js::test_toggling_modes

Strange, isn't it?

TIA
CI

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform