I was not particularly happy with the speed of our new wxwidgets
device driver as of the release of 5.11.1 on the Linux platform
because it was often significantly (factor of two to a factor of 20)
slower than any of our other interactive devices, and sometimes
(especially during tests) it would slow down by two orders of
magnitude!

So here are some interesting measurements I have made of the speed of
wxwidgets as of master^ =

65e7b3c Fix bug with plotting in wxPLplotDemo

(i.e., the commit that is working for Pedro and me), which show a
substantial improvement over 5.11.1 thanks to Phil's work throughout
this release cycle on speed issues for wxwidgets.

Here are the two critical timing lines that before Phil's recent
"/dev/random ==> /dev/urandom" fix were often separated by a long
pause of 5 to 15 seconds due to the blocking nature of /dev/random on
Linux.

15:48:32: Debug: nanosecs since epoch = 2142186322453622: SetupMemoryMap(): enter
15:48:32: Debug: nanosecs since epoch = 2142186323455041: SetupMemoryMap(): mapName start

That "pause" is now reduced to 10^6 nanosec ~ 1 ms, an improvement of
4 orders of magnitude!
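
For anyone who wants to check that arithmetic, here is a minimal shell
sketch that subtracts the two "nanosecs since epoch" stamps quoted in
the debug lines above.  The function name elapsed_ns is just an
illustrative helper of mine, not anything in PLplot.

```shell
# Hypothetical helper (not part of PLplot): elapsed nanoseconds between
# two "nanosecs since epoch" debug stamps, start first and end second.
elapsed_ns() {
    echo $(( $2 - $1 ))
}

# Stamps copied from the two SetupMemoryMap() debug lines above.
enter=2142186322453622
map_name_start=2142186323455041

delta=$(elapsed_ns "$enter" "$map_name_start")
echo "$delta ns"                                            # 1001419 ns
awk -v d="$delta" 'BEGIN { printf "%.3f ms\n", d / 1e6 }'   # 1.001 ms
```

Compared with the former 5 to 15 second pause, about 1 ms is a
reduction by a factor of roughly 5000 to 15000.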

To collect more timing results I ran a bash "for" loop that compared
real times for all the examples from 00 to 30 (excluding 08, 17, 20,
and 25 because 17 and 20 are interactive in nature and I want to
discuss 08 and 25 separately below).  For each loop iteration I
displayed timing results for our 3 highest quality (in terms of the
antialiased look of graphics and text, processing of unicode, etc.)
Linux interactive devices: qtwidget, xcairo, and wxwidgets.

To reduce the bash time result output from the normal 3 lines to just
one for each device, I changed the TIMEFORMAT environment variable that
controls the format of the time command (see the bash man page) from
the default

export TIMEFORMAT=$'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS'

which gives the "real", "user", and "sys" 3 lines of output you
normally see from the "time" command to

export TIMEFORMAT=$'real\t%3R'

which expresses the real result on one line in pure seconds with 3
digits of precision past the decimal point, and drops the user and
system results.

So here are those real time comparison results (N.B. in groups of three
where the first is from -dev qtwidget, the second from xcairo, and the
3rd from wxwidgets) generated by the following bash for loop command:

software@raven> for N in $(seq --format='%02.0f' 0 30 |grep -vE '08|17|20|25'); do echo $N; (time examples/c/x${N}c -dev qtwidget -np >&/dev/null); (time examples/c/x${N}c -dev xcairo -np >&/dev/null); (time examples/c/x${N}c -dev wxwidgets -np >&/dev/null); sleep 5; done
00
real    0.184 *
real    0.292
real    0.344
01
real    0.207 *
real    0.376
real    0.307
02
real    0.279 *
real    0.386
real    0.288
03
real    0.191 *
real    0.385
real    0.275
04
real    0.293
real    0.400
real    0.279 *
05
real    0.188 *
real    0.395
real    0.275
06
real    0.598
real    0.458
real    0.393 *
07
real    1.331
real    0.846
real    0.619 *
09
real    0.570
real    0.450
real    0.370 *
10
real    0.178 *
real    0.351
real    0.296
11
real    0.882
real    0.750
real    0.407 *
12
real    0.504
real    0.498 *
real    0.807
13
real    0.184 *
real    0.334
real    0.301
14
real    0.813
real    0.712
real    0.669 *
15
real    0.310 *
real    0.417
real    0.376
16
real    0.578
real    0.998
real    0.402 *
18
real    1.281
real    0.813 *
real    1.161 
19
real    1.015
real    0.844
real    0.668 *
21
real    0.806 *
real    0.887
real    1.124
22
real    0.923 
real    0.824 *
real    0.925
23
real    2.860
real    1.463 *
real    1.685
24
real    0.630
real    0.533 *
real    0.845
26
real    0.628
real    0.524 *
real    0.854
27
real    1.396
real    1.077 *
real    1.232
28
real    0.834
real    0.567 *
real    0.883
29
real    0.781
real    0.643
real    0.466 *
30
real    0.238 *
real    0.365
real    0.288

Obviously, a caveat for these results is that they are distorted in
the -dev wxwidgets case because they only count the real time spent by
that device and completely ignore the real time spent by wxPLViewer.
However, a countervailing argument if you have multiple CPUs (as in my
case, where I have two of them) is that wxPLViewer can be run on a
separate CPU because it is a separate application, so in a sense its
real time does not count on multiple-CPU machines.  Nevertheless, in a
few cases wxPLViewer took so much longer to finish than -dev wxwidgets
that it was distorting subsequent wxPLViewer instances, which were
automatically created smaller to reduce (I presume) how much of my
screen got filled up by wxPLViewer GUIs.  To counteract that crowding
effect I implemented a 5 second wait (not counted in the above real
time outputs) at the end of each loop iteration above.  The result was
that most wxPLViewer GUIs finished within their own loop iteration, and
therefore the next launch of the wxPLViewer GUI on the next iteration
ended up being full sized.

Despite the above caveat these time results are still quite
interesting.  I have indicated with a trailing asterisk in the above
table which of the three times was the best, and if you count those
results above you obtain the following summary:

qtwidget  10
xcairo    08
wxwidgets 09
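
If you would rather not count asterisks by hand, the tally can be
reproduced with a small awk filter.  This is only a sketch under two
assumptions: the input is the loop output in the format shown above
(an example number followed by exactly three "real" lines), and the
three lines are always in the order qtwidget, xcairo, wxwidgets.  The
function name tally_fastest is mine, and a tie would be credited to
the earlier device.

```shell
# Hypothetical helper: read the loop output on stdin and count, per
# device, how often it had the smallest "real" time in its group of three.
tally_fastest() {
    awk '
        BEGIN { name[1] = "qtwidget"; name[2] = "xcairo"; name[3] = "wxwidgets" }
        $1 == "real" {
            i++                                   # position within the group
            if (i == 1 || $2 + 0 < best) { best = $2 + 0; winner = i }
            if (i == 3) { wins[winner]++; i = 0 } # group complete
        }
        END { for (k = 1; k <= 3; k++) printf "%-9s %d\n", name[k], wins[k] }
    '
}

# Usage with three groups copied from the table above (00, 04, 12).
tally_fastest <<'EOF'
00
real    0.184 *
real    0.292
real    0.344
04
real    0.293
real    0.400
real    0.279 *
12
real    0.504
real    0.498 *
real    0.807
EOF
# qtwidget  1
# xcairo    1
# wxwidgets 1
```

Any trailing asterisk is ignored because only the second field of each
"real" line is compared.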

So it is clear that wxwidgets is now pretty much holding its own
(ignoring the caveat) with the others for our standard set of
examples.  Since my previous evaluation, done near the time when
5.11.1 was released, was subject to the same caveat, it is also clear
there has been quite a substantial improvement, and we can generally
(with a few exceptions to be discussed) be proud of the efficiency of
this device now on Linux.

There are two known exceptions to the above remarks.  Whereas in all
the other examples above wxwidgets is at most two times slower than
the fastest result from either qtwidget or xcairo, we have the
following results for examples 08 and 25 generated by a loop similar
to the above (starting with

for N in 08 25; ....

).

08
real    0.860 *
real    1.971
real    13.520
25
real    1.102
real    0.655 *
real    5.311

For these two cases, wxwidgets is slower (subject to the same caveat)
than the best of the other two devices by respective factors of 16 and
8!
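
Those factors come straight from dividing the wxwidgets real time by
the best of the other two.  A minimal sketch of that division (the
slowdown function is a hypothetical helper, and the numbers are the
ones in the small table above):

```shell
# Hypothetical helper: ratio of a slow real time to the best real time,
# printed to one decimal place.
slowdown() {
    awk -v slow="$1" -v best="$2" 'BEGIN { printf "%.1f\n", slow / best }'
}

slowdown 13.520 0.860   # example 08, wxwidgets vs qtwidget: 15.7
slowdown 5.311 0.655    # example 25, wxwidgets vs xcairo:   8.1
```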

After the release we should look at examples 08 and 25 to see what the
issue is, but both these examples have large numbers of graphical
elements (i.e., example 08 uses a large number of small triangles
to represent those varying 3D surfaces in a smooth way, and
example 25 uses a large number of thin rectangles to represent the
gradients in this plot in a smooth way).  So my working hypothesis
to explain these large slowdowns compared to the rest of the examples
is that there is some bottleneck transmitting positional data for
graphical elements between -dev wxwidgets and the associated
wxPLViewer application via shared memory.  If this hypothesis is
correct, the cure for this issue would likely be to move to the
efficient "unnamed semaphores" algorithm which I demonstrated (using
the proof-of-concept project at cmake/test_linux_ipc) could move 25MB
of data from one executable to another in 0.3 seconds!

(I assume example 33 will show the same slowdown issue as 25 because
there are lots of plgradient invocations in example 33, but I did not
investigate 33 because it takes so long even when it is efficient.)

Also note whatever the cure is for this slowdown issue with examples
that have large numbers of graphical elements, it might be a noticeable
benefit to some of the rest of the examples with moderate numbers of
graphical elements.  Thus, with any luck at all when this particular
inefficiency is solved we might see a big increase in the number of
examples where wxwidgets is faster than both the qtwidget and xcairo
results.

So from the above numbers there is still one obvious Linux efficiency
issue left for -dev wxwidgets that we need to work on after the
release, and we can look forward to a potential big payoff from that
work not only for examples 08 and 25, but possibly several other
examples as well.  However, for this release we can still be proud of
all the Linux wxwidgets efficiency progress made to this point.

So, good work, Phil, on getting us to this stage!

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________
