I have no idea where I should report this for proper attention.
Please suggest a better newsgroup if there is one.
I am posting this to
mozilla.dev.apps.thunderbird
mozilla.dev.platform
mozilla.tools

While testing C-C TB with its test suite, |make mozmill|,
I often encounter random errors. The symptoms vary, but
one of them is a strange timeout error.
(Tryserver runs also encounter some of them, but not as badly as mine, I think.)

To run the tests,
I create a virtual X desktop using the Xephyr program and
run the tests inside this virtual desktop so that
I can work on other things on the real X desktop
without fear of disrupting the running TB during |make mozmill| tests.

E.g., the code snippet to set up the virtual desktop:
# Use a separate window as Xserver screen.
Xephyr -ac -br -noreset -screen 1280x768 :1 &
DISPLAY=localhost:1.0
sleep 2
# oclock &
xfwm4 &

TB runs in the window specified by $DISPLAY under the xfwm4 window manager.
This setup works just fine.
I am explaining the use of Xephyr because it
may be relevant to the following discussion.
(I wonder what Tryserver does for displaying the application screen.)

About the timeout and how I discovered a strange cure:

Just by sheer coincidence, I have found out that some timeouts disappear
with the following operation while the test is running.

[The explanation that follows is hard to believe, so I intend to capture
a screen video clip in a day or two if I can do so successfully.]

Symptom:

There are many random errors that can crop up without any discernible reason.
We see the connection to thunderbird from the test harness dropped due
to a timeout, or some popup does not appear in time, etc.

Now, I noticed that when such a timeout occurs while I am manually running a
particular test using the SOLO_TEST feature of |make mozmill|,
the screen seems to get stuck without any action for a long time; then the
timeout kicks in and the run proceeds to the next test, or quits if it was
the last test in the series.

I can tell the "screen gets stuck" because, until the stuck state appears,
TB shows its menus being opened, selected, closed, or some other actions
taking place in its 3-pane window, etc. Suddenly, there is no visible action
at all. So it is relatively easy to see that TB seems to get stuck.

Strange Cure:

The other day, I was staring at the screen during |make mozmill| when such
a stuck situation occurred.
I noticed that the cursor (the mouse cursor, that is) had been left
stationary and, for no reason at all, simply MOVED the mouse WITHOUT pushing
or clicking any buttons.

Lo and behold, suddenly the stuck test started to proceed again [I could see
activity in the TB window], and after a while TB got stuck again with a
pull-down menu showing and nothing happening.
[It is as if the menu selection which the mozmill test program instructed
TB to perform was not done at all.]
I got curious, and cautiously moved the mouse a little WITHOUT touching the
buttons at all.
Again, to my UTTER SURPRISE, the test AWOKE from the hang, so to speak,
proceeded with the rest of the operation, and finished, and the timeout
error which had been reported earlier in the |make mozmill| test was gone(!!!).

I retried similar tests, and I conclude that 95% of the time the strange
errors disappear with the mouse movement (without any button operation).
Some random errors seem to persist no matter what.

This makes me think that there is a subtle feature/interaction/bug
in the test harness code and event handling.

I said that I did not push or click any buttons because I did not want to
disrupt TB by sending button-down or button-up events, etc.

But one test requires caution: when I tried this "Strange Cure", moving the
mouse to keep a stuck test from causing random timeouts,
one test in composition/test-attachment-reminder.js seemed to "GRAB" the mouse
and tried to pin it down on an "Alert" button it shows in the upper right
corner.
If this happens, moving the mouse actually moves the whole TB window around, too.
I had to slowly push and release the mouse button to clear this PINNING.
This may or may not result in a test failure: sometimes the test fails and
sometimes it succeeds.

Anyway, in case readers doubt that this "Strange Cure" works
[I would doubt it myself if I just heard this story without seeing it],
here are a few tests that hit random "TEST-UNEXPECTED-ERROR" during the
entire |make mozmill| run,
but succeeded when I invoked the individual file manually using SOLO_TEST
and used the "Strange Cure" to release TB from the "stuck state" on the
screen.

Those who use a Linux PC for development might want to try the cure themselves.
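
For anyone who wants to try this without sitting at the mouse, here is a
minimal sketch that nudges the pointer inside the nested X display at
intervals. It is not something I have verified against the stuck tests.
It assumes that xdotool is installed, that the Xephyr server is display :1
as in the snippet earlier, and that synthetic pointer motion has the same
effect as moving the physical mouse, which is untested. The names
nudge_pointer, jiggle, XDOTOOL and JIGGLE_DISPLAY are made up for this
example.

```shell
#!/bin/sh
# Sketch only: periodically nudge the pointer inside the Xephyr display.
# Assumptions (not verified): xdotool is installed; the nested server is :1.

# Command used to move the pointer; override it (e.g. with echo) for a dry run.
XDOTOOL=${XDOTOOL:-xdotool}

# nudge_pointer DX: move the pointer DX pixels horizontally, with no clicks.
# The "--" lets xdotool accept a negative offset.
nudge_pointer() {
    DISPLAY=${JIGGLE_DISPLAY:-:1} "$XDOTOOL" mousemove_relative -- "$1" 0
}

# jiggle COUNT INTERVAL: nudge COUNT times, alternating +1 and -1 pixel so
# the pointer stays roughly in place, sleeping INTERVAL seconds in between.
jiggle() {
    count=$1
    interval=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ $((i % 2)) -eq 0 ]; then dx=1; else dx=-1; fi
        nudge_pointer "$dx"
        sleep "$interval"
        i=$((i + 1))
    done
}
```

One way to use it would be to source the file and run "jiggle 600 1 &"
just before starting the test, which nudges the pointer once a second for
about ten minutes. Only motion is sent, never a button press, so the
caution about composition/test-attachment-reminder.js still applies.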

Note that the stuck state does not always happen. This is why they are termed
"random" failures.
So manually invoking a test may or may not show the stuck symptom.
The "Strange Cure" helps very often (though not 100% of the time) when the
test has somehow got stuck.

TEST-START |
/REF-COMM-CENTRAL/comm-central/mail/test/mozmill/composition/test-signature-updating.js
| testHTMLComposeWindowSwitchSignaturesWithSuppressedSeparator

The above once caused a timeout, but with manual invocation and the
"Strange Cure", it worked just fine.

The next one, too. This one often reports errors on my 64-bit test PC,
and I did not know what was wrong.
But when I ran this test manually and used the "Strange Cure", it ran
successfully (!) a few times in a row!
TEST-UNEXPECTED-FAIL |
/REF-COMM-CENTRAL/comm-central/mail/test/mozmill/content-tabs/test-install-xpi.js
| test-install-xpi.js::test_install_xpi_offer

This is the test with which I discovered the "Strange Cure" for the first time.
A timeout occurs during |make mozmill|, but moving the mouse clears the issue!
TEST-UNEXPECTED-FAIL |
/REF-COMM-CENTRAL/comm-central/mail/test/mozmill/folder-tree-modes/test-mode-switching.js
| test-mode-switching.js::test_toggling_modes

Strange, isn't it?

TIA

CI


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
