On (2014-04-08 15:20), Gabriele Svelto wrote:
> On 07/04/2014 23:13, Dave Hylands wrote:
>> Personally, I think that the more ways we can test for threading issues the
>> better.
>> It seems to me that we should do some amount of testing on single core and
>> multi-core.
>>
>> Then I suppose the question becomes how many cores? 2? 4? 8?
>>
>> Maybe we can cycle through some different number of cores so that we get
>> coverage without duplicating everything?
>
> One configuration that is particularly good at catching threading errors
> (especially narrow races) is constraining the software to run on two
> hardware threads on the same SMT-enabled core. This effectively forces
> the threads to share the L1 D$ which in turn can reveal some otherwise
> very-hard-to-find data synchronization issues.
>
> I don't know if we have that level of control on our testing hardware
> but if we do then that's a scenario we might want to include.
>
> Gabriele
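
For what it is worth, the SMT-sibling configuration Gabriele describes can probably be approximated on a Linux test machine without special hardware control. The following is only a rough sketch under that assumption: it reads the sibling list that Linux exposes in sysfs and restricts the caller to two hardware threads of one core with os.sched_setaffinity. The helper names (parse_cpu_list, smt_siblings, pin_to_smt_pair) are made up for illustration, and I do not know whether our test slaves allow this.

```python
import os

def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,4" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def smt_siblings(cpu=0):
    """Return the hardware threads that share a physical core with `cpu`.

    Assumes the Linux sysfs topology files; returns None if unavailable.
    """
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return None

def pin_to_smt_pair(cpu=0):
    """Restrict the caller to two sibling hardware threads of one core.

    Returns the CPU set that was applied, or None when the machine has no
    SMT pair for `cpu`, the platform lacks os.sched_setaffinity, or the
    kernel rejects the mask (for example inside a restricted container).
    """
    siblings = smt_siblings(cpu)
    if not siblings or len(siblings) < 2 or not hasattr(os, "sched_setaffinity"):
        return None
    pair = set(sorted(siblings)[:2])
    try:
        # 0 refers to the caller; threads started afterwards inherit the mask.
        os.sched_setaffinity(0, pair)
    except OSError:
        return None
    return pair
```

A test harness would call pin_to_smt_pair() before launching the program under test, so that the child process inherits the affinity mask. On machines without SMT the function returns None and the run proceeds unpinned.
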
I run Thunderbird under valgrind from time to time.
Valgrind slows down execution by a very large factor, and this seems to open many windows for thread races. (A window that is normally very short is sometimes stretched enough that events caused by, say, I/O can fall inside it.)
During valgrind runs, I have seen errors that were not reported anywhere, and many of them happened only once :-(

If a VM (such as VirtualBox or VMware Player) can artificially change the execution speed of the CPU, or even of individual cores, slightly (maybe to 1/2, 1/3, or 1/4), I am sure many thread-race issues will be caught.

I agree that this is a brute-force approach, but please recall that the first space shuttle launch had to be aborted due to a software glitch. It was a timing issue and, according to the analysis at the time, it could happen once in 72 (or was it 74?) cases. Even NASA, with its deep pockets, and its subcontractors could not catch it before launch.

I am afraid that the situation has not changed much (unless we use a programming language well suited to avoiding these thread-race issues).

We need all the help we can get to track down both visible and dormant thread races. If artificially tweaking CPU execution (by changing the number of cores, or with more advanced methods if available) can help, it is worth a try. Maybe not all the time if such work costs extra money, but a prolonged test run (say, a week) from time to time (each quarter or half year, or maybe just before beta testing of a major release?).

TIA

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform