As of the release of 5.11.1, I was not particularly happy with the speed of our new wxwidgets device driver on the Linux platform: it was often significantly slower (by a factor of 2 to 20) than any of our other interactive devices, and sometimes (especially during tests) it would slow down by two orders of magnitude!

So here are some interesting measurements I have made of the speed of wxwidgets as of master^ = 65e7b3c "Fix bug with plotting in wxPLplotDemo" (i.e., the commit that is working for Pedro and me), which show a substantial improvement over 5.11.1 thanks to Phil's work throughout this release cycle on speed issues for wxwidgets.

Here are the two critical timing lines that, before Phil's recent "/dev/random ==> /dev/urandom" fix, were often separated by a long pause of 5 to 15 seconds due to the blocking nature of /dev/random on Linux.

15:48:32: Debug: nanosecs since epoch = 2142186322453622: SetupMemoryMap(): enter
15:48:32: Debug: nanosecs since epoch = 2142186323455041: SetupMemoryMap(): mapName start

That "pause" is now reduced to 10^6 nanosec ~ 1 ms, an improvement of 4 orders of magnitude!

To collect more timing results I ran a bash "for" loop that compared real times for all the examples from 00 to 30 (excluding 08, 17, 20, and 25, because 17 and 20 are interactive in nature and I want to discuss 08 and 25 separately below). For each loop iteration I displayed time results for our 3 highest quality (in terms of the antialiased look of graphics and text, processing of unicode, etc.) Linux interactive devices: qtwidget, xcairo, and wxwidgets.

To reduce the bash time result output from the normal 3 lines to just one for each device, I changed the TIMEFORMAT environment variable that controls the format of the time command (see the bash man page) from the default

    export TIMEFORMAT=$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'

which gives the "real", "user", and "sys" 3 lines of output you normally see from the "time" command, to

    export TIMEFORMAT=$'real\t%3R'

which expresses the real result on one line in pure seconds with 3 digits of precision past the decimal point, and drops the user and system results.

So here are those real time comparison results (N.B. in groups of three where the first is from -dev qtwidget, the second from -dev xcairo, and the third from -dev wxwidgets) generated by the following bash for loop command:

software@raven> for N in $(seq --format='%02.0f' 0 30 |grep -vE '08|17|20|25'); do
    echo $N;
    (time examples/c/x${N}c -dev qtwidget -np >&/dev/null);
    (time examples/c/x${N}c -dev xcairo -np >&/dev/null);
    (time examples/c/x${N}c -dev wxwidgets -np >&/dev/null);
    sleep 5;
done

00  real 0.184 *  real 0.292    real 0.344
01  real 0.207 *  real 0.376    real 0.307
02  real 0.279 *  real 0.386    real 0.288
03  real 0.191 *  real 0.385    real 0.275
04  real 0.293    real 0.400    real 0.279 *
05  real 0.188 *  real 0.395    real 0.275
06  real 0.598    real 0.458    real 0.393 *
07  real 1.331    real 0.846    real 0.619 *
09  real 0.570    real 0.450    real 0.370 *
10  real 0.178 *  real 0.351    real 0.296
11  real 0.882    real 0.750    real 0.407 *
12  real 0.504    real 0.498 *  real 0.807
13  real 0.184 *  real 0.334    real 0.301
14  real 0.813    real 0.712    real 0.669 *
15  real 0.310 *  real 0.417    real 0.376
16  real 0.578    real 0.998    real 0.402 *
18  real 1.281    real 0.813 *  real 1.161
19  real 1.015    real 0.844    real 0.668 *
21  real 0.806 *  real 0.887    real 1.124
22  real 0.923    real 0.824 *  real 0.925
23  real 2.860    real 1.463 *  real 1.685
24  real 0.630    real 0.533 *  real 0.845
26  real 0.628    real 0.524 *  real 0.854
27  real 1.396    real 1.077 *  real 1.232
28  real 0.834    real 0.567 *  real 0.883
29  real 0.781    real 0.643    real 0.466 *
30  real 0.238 *  real 0.365    real 0.288

Obviously, a caveat for these results is that they are going to be distorted in the -dev wxwidgets case because they only count the real time spent by that device and completely ignore the real time spent by wxPLViewer. However, a countervailing argument if you have multiple CPUs (like my case where I have two of them) is that wxPLViewer can be run on a separate CPU because it is a separate application, so in a sense its real time does not count on multiple CPU machines.
Nevertheless, in a few cases wxPLViewer took so much longer to finish than -dev wxwidgets that it was distorting subsequent wxPLViewer instances, which were automatically created smaller to reduce (I presume) how much my screen got filled up by wxPLViewer GUIs. To counteract that crowding effect I implemented a 5 second wait (not counted in the above real time outputs) at the end of each loop iteration above, and the result was that most wxPLViewer GUIs finished during the loop, so the next launch of the wxPLViewer GUI on the next loop iteration ended up being full sized.

Despite the above caveat these time results are still quite interesting. I have indicated with a trailing asterisk in the above table which of the three times was the best, and if you count those results above you obtain the following summary:

qtwidget   10
xcairo     08
wxwidgets  09

So it is clear that wxwidgets is now pretty much holding its own (ignoring the caveat) with the others for our standard set of examples. Since my previous evaluation done near the time when 5.11.1 was released was subject to the same caveat, it is also clear there has been quite a substantial improvement, and we can generally (with a few exceptions to be discussed) be proud of the efficiency of this device now on Linux.

There are two known exceptions to the above remarks. Whereas in all the other examples above wxwidgets is at most two times slower than the fastest result from either qtwidget or xcairo, we have the following results for examples 08 and 25 generated by a similar loop to the above (starting with for N in 08 25; .... ).

08  real 0.860 *  real 1.971    real 13.520
25  real 1.102    real 0.655 *  real 5.311

For these two cases, wxwidgets is slower (subject to the same caveat) than the best of the other two devices by respective factors of 16 and 8!

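The factors just quoted are simply the wxwidgets real time divided by the smaller of the qtwidget and xcairo real times. A minimal sketch of that arithmetic follows; the "slowdown" function name and the one-row-per-example input layout (example number, then the qtwidget, xcairo, and wxwidgets real times in seconds) are assumptions made purely for illustration, with the numbers copied from the 08 and 25 results above.

```shell
#!/usr/bin/env bash
# Sketch only: "slowdown" is a hypothetical helper, not part of PLplot.
# Input rows: example qtwidget xcairo wxwidgets (real times in seconds).
# Output rows: example, then wxwidgets time / best of the other two devices.
slowdown() {
    LC_ALL=C awk '{
        best = ($2 < $3) ? $2 : $3          # faster of qtwidget and xcairo
        printf "%s %.1f\n", $1, $4 / best   # ratio rounded to one decimal
    }'
}

# Numbers copied from the results for examples 08 and 25.
slowdown <<'EOF'
08 0.860 1.971 13.520
25 1.102 0.655 5.311
EOF
# prints:
# 08 15.7
# 25 8.1
```

Rounded to the nearest integer, those ratios are the factors of 16 and 8 mentioned above. Feeding the rows of the main table through the same function would be one way to check the "at most two times slower" statement for the remaining examples.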
After the release we should look at examples 08 and 25 to see what the issue is, but both these examples have large numbers of graphical elements (i.e., example 08 uses a large number of small triangles to represent those varying 3D surfaces in a smooth way, and example 25 uses a large number of thin rectangles to represent the gradients in this plot in a smooth way). So my working hypothesis to explain these large slowdowns compared to the rest of the examples is that there is some bottleneck transmitting positional data for graphical elements between -dev wxwidgets and the associated wxPLViewer application via shared memory. If this hypothesis is correct, the cure for this issue would likely be to move to the efficient "unnamed semaphores" algorithm which I demonstrated (using the proof-of-concept project at cmake/test_linux_ipc) could move 25MB of data from one executable to another in 0.3 seconds! (I assume example 33 will show the same slowdown issue as 25 because there are lots of plgradient invocations in example 33, but I did not investigate 33 because it takes so long even when it is efficient.)

Also note that whatever the cure is for this slowdown issue with examples that have large numbers of graphical elements, it might be a noticeable benefit to some of the rest of the examples with moderate numbers of graphical elements. Thus, with any luck at all, when this particular inefficiency is solved we might see a big increase in the number of examples where wxwidgets is faster than both the qtwidget and xcairo results.

So from the above numbers there is still one obvious Linux efficiency issue left for -dev wxwidgets that we need to work on after the release, and we can look forward to a potential big payoff from that work not only for examples 08 and 25, but possibly several other examples as well. However, for this release we can still be proud of all the Linux wxwidgets efficiency progress made to this point. So, good work, Phil, on getting us to this stage!

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net); and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________