Re: [Plplot-devel] Recent progress with wxwidgets IPC

2017-02-07 Thread p.d.rosenberg
Hi Alan
It'll definitely be interesting to see the differences. I wasn't trying to put 
you off, just hoping you can get a boost from the things that I learnt as I 
went along. That was the first time I'd tried IPC locally, although I did some 
one-way network stuff years ago. It did turn out to be more difficult than I 
expected, especially once I realised there needed to be some two-way comms.

One thing about the amount being sent: yes, you are correct about the
amountToCopy variable, but that is for exactly one transfer. However, I don't
think there is a reliable end-of-page command, so you can't store up commands
and send them all in one go when the plot is complete. Also, the viewer never
knows whether the page is complete, so it must always keep rechecking for new
data.

Phil

Re: [Plplot-devel] Recent progress with wxwidgets IPC

2017-02-07 Thread Alan W. Irwin
Hi Phil:

You make a lot of points about some uncertainties in what I propose to
do.  And I do agree there are such uncertainties.  So this is
definitely a "show them the code" moment.  At worst, I will strip it
all out again because it will turn out to be complex and slow.  But it
could be significantly less complex (no polling!) and just as
efficient or better.  So we will see.

Further discussion below about that "no polling" point.

On 2017-02-07 13:51- Phil Rosenberg wrote:

[...]
> In a simple
> restart-from-the-beginning buffer like you seem to be proposing the
> sender must wait until the reader has read all the data from the
> buffer before it can send more data.

True, but nevertheless the mini-project demonstrated this was an
efficient method of moving many MBytes at one go since effectively the
only costs (assuming overheads are small) are a memcpy of those bytes
on the sending end and a memcpy of those bytes on the receiving end,
and (this is the important point) without any polling at all!  Note,
in principle for GigaHertz machines a 1 GByte memcpy should only
require 1 second or so.  So we are discussing really small
inefficiencies here so long as the size of the buffer is large enough
to make the overheads of the method negligible (the overhead of
setting up the transfer of control to the memcpy routine for each
chunk, and the overhead of checking semaphores as each chunk is
passed).  So I would
expect the method would be inefficient for really small buffer sizes
(such as 100 bytes or so).  But, for example, I was already getting
quite efficient results with a buffer size of just 1K (!) so I don't
think the overheads of this two-semaphore method are that big a deal.
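
To put rough numbers on the per-chunk overhead discussed above, here is
a small sketch. It is written in Python purely for illustration and is
not taken from cmake/test_linux_ipc; the names chunk_count and
copy_through_buffer are hypothetical. It counts how many chunks a
transfer needs for a given buffer size and moves the data through a
fixed-size staging buffer with one copy in and one copy out per chunk,
which is the cost model described in the paragraph above.

```python
def chunk_count(total_bytes, buffer_bytes):
    """Number of buffer-sized chunks needed (ceiling division)."""
    return (total_bytes + buffer_bytes - 1) // buffer_bytes


def copy_through_buffer(data, buffer_bytes):
    """Move data via a fixed-size staging buffer.

    Each chunk costs one copy into the buffer (the sending end) and one
    copy out of it (the receiving end).  Returns the reassembled bytes
    and the number of chunks used.
    """
    staging = bytearray(buffer_bytes)   # stand-in for the shared buffer
    out = bytearray()
    chunks = 0
    for offset in range(0, len(data), buffer_bytes):
        chunk = data[offset:offset + buffer_bytes]
        staging[:len(chunk)] = chunk    # copy on the sending end
        out += staging[:len(chunk)]     # copy on the receiving end
        chunks += 1
    return bytes(out), chunks
```

Under the assumption that "25MB" means 25 MiB, chunk_count(25 * 1024 *
1024, 1024) gives 25600 chunks for a 1K buffer, so the fixed cost per
chunk (semaphore operations plus the memcpy call setup) is paid 25600
times for that transfer.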

Of course, if you add polling to the mix, then that would introduce a
lot of efficiency concerns. But there is no polling needed or used in
this mini-project so that is why it is efficient, and I believe I can
avoid polling as well for the wxwidgets IPC method, see below.

[...]
> Don't forget also that while waiting for new data
> the semaphore cannot block indefinitely. To do so would hang
> wxPLViewer or the sender software. I think also there is no way to
> tell if a page is finished or whether there is more data to come.
> Therefore you must use non-blocking semaphores and poll them on
> regular intervals.

My assumption is the sending side knows exactly how many bytes it
needs to send.  And my preliminary analysis is that is exactly what
the present code calculates with the amountToCopy variable
in wxPLDevice::TransmitBuffer.  So with that assumption (and as
demonstrated by the mini-project) there is no need for non-blocking
semaphores or polling if you use the two-semaphore method.
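
A minimal sketch of the two-semaphore idea follows. It is an
illustration only, written in Python with threads standing in for the
two processes and a bytearray standing in for the shared memory; it is
neither the code in cmake/test_linux_ipc nor the wxwidgets
implementation. SharedArea, send, receive and transfer are hypothetical
names, and the sketch assumes (as the paragraph above does) that the
sender knows the total byte count before it starts and that the buffer
size is greater than zero.

```python
import threading


class SharedArea:
    """Stand-in for a small shared-memory segment (hypothetical layout)."""

    def __init__(self, size):
        self.buf = bytearray(size)            # fixed-size transfer buffer
        self.chunk_len = 0                    # valid bytes in buf right now
        self.total = 0                        # size of the whole message
        self.free = threading.Semaphore(1)    # buffer may be overwritten
        self.ready = threading.Semaphore(0)   # buffer holds an unread chunk


def send(area, data):
    """Blocking send: the total size is known up front."""
    offset = 0
    while True:
        area.free.acquire()                   # wait until the reader is done
        chunk = data[offset:offset + len(area.buf)]
        area.buf[:len(chunk)] = chunk
        area.chunk_len = len(chunk)
        area.total = len(data)
        area.ready.release()                  # hand the chunk over
        offset += len(chunk)
        if offset >= len(data):
            return


def receive(area):
    """Blocking receive: stops once `total` bytes have arrived."""
    out = bytearray()
    while True:
        area.ready.acquire()                  # sleep until a chunk is ready
        out += area.buf[:area.chunk_len]
        total = area.total
        area.free.release()                   # let the sender refill
        if len(out) >= total:
            return bytes(out)


def transfer(data, buffer_size):
    """Run send() in a second thread and return what receive() assembled."""
    area = SharedArea(buffer_size)
    sender = threading.Thread(target=send, args=(area, data))
    sender.start()
    result = receive(area)
    sender.join()
    return result
```

Neither side polls: each blocks in acquire() until the other side
releases. The sketch does not cover what should happen if the peer
exits while the other side is blocked, which is the concern raised in
the quoted text.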

Of course, this line of reasoning completely falls apart if
amountToCopy does not do what its name implies so please let
me know if there is some case where that calculation is unreliable.

My efficiency test results for the case where the -np option is not
used show the inefficiencies of the present wxwidgets IPC are
negligible compared to wxPLViewer taking a long time (30 seconds for
example 8) to render the plot while other interactive devices take the
order of a second for this same task.  Most of this large time
interval occurs after -dev wxwidgets is completely finished so IPC
inefficiency cannot be the source of this wxPLViewer inefficiency for
cases like example 8 where large numbers of graphical elements are
being plotted.  Therefore, from this evidence you do have the polling
interval used in the present one-semaphore method tuned up fairly well
(at least for typical PC hardware). So my fundamental goal here is
making our wxwidgets IPC a lot simpler for POSIX systems by
eliminating the polling and the rest of the circular buffer logic for
that case.  I am hoping for some noticeable improvement in efficiency
due to this, but I am not counting on anything showing at all in that
regard until at least the non-IPC inefficiency of wxPLViewer is
addressed.

In sum, it is "show the code" time.  That is, it is pretty clear what
I have said above has speculative elements and similarly for any
further replies you make (unless you know of some cases where
amountToCopy is definitely unreliable). So my focus from now on will
be to continue my project of implementing the two-semaphore method for
wxwidgets IPC. Once I have completed that implementation, we should
evaluate that code and obviously if it is simpler and there is at
least no drop in efficiency we should adopt it but otherwise not.

Alan
__
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project 

Re: [Plplot-devel] Recent progress with wxwidgets IPC

2017-02-07 Thread Phil Rosenberg
Hi Alan
That seems reasonable. I'm not sure what the benefits are though.
Should the new way be quicker?

In terms of the overall complexity here are some of the things I found
while setting up the current system that each made the system more
complex than I initially envisaged. I imagine you will come across
many of these too.

The reason that I went for a circular buffer is that data can be sent
continuously even while the reader is reading, at least until the
writer catches up with the reader. In a simple
restart-from-the-beginning buffer like you seem to be proposing, the
sender must wait until the reader has read all the data from the
buffer before it can send more data.
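
For readers unfamiliar with the idea, a circular buffer of this kind
can be sketched as below. This is a generic illustration in Python, not
the layout used in the wxwidgets driver; RingBuffer and its fields are
hypothetical names. The writer can keep adding data while earlier data
is still unread, and it is only refused space once it has caught up
with the reader.

```python
class RingBuffer:
    """Fixed-capacity byte queue with wrap-around read and write positions."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.read_pos = 0     # index of the next byte to read
        self.write_pos = 0    # index of the next byte to write
        self.count = 0        # bytes currently stored and unread

    def write(self, data):
        """Copy as much of data as fits; return the number of bytes accepted."""
        n = min(len(data), len(self.buf) - self.count)
        for i in range(n):
            self.buf[(self.write_pos + i) % len(self.buf)] = data[i]
        self.write_pos = (self.write_pos + n) % len(self.buf)
        self.count += n
        return n

    def read(self, max_bytes):
        """Remove and return up to max_bytes of unread data."""
        n = min(max_bytes, self.count)
        out = bytes(self.buf[(self.read_pos + i) % len(self.buf)]
                    for i in range(n))
        self.read_pos = (self.read_pos + n) % len(self.buf)
        self.count -= n
        return out
```

A write that returns fewer bytes than requested corresponds to the
writer having caught up with the reader; the caller has to retry the
remainder later.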

In fact I have a feeling that the system as I have it now makes little
or no use of the actual semaphores and just uses flags in specified
locations in the shared memory. I would have to look through the code
properly to check. Don't forget also that while waiting for new data
the semaphore cannot block indefinitely. To do so would hang
wxPLViewer or the sender software. I think also there is no way to
tell if a page is finished or whether there is more data to come.
Therefore you must use non-blocking semaphores and poll them at
regular intervals.
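
The polling pattern described here can be sketched as follows. This is
an illustration in Python rather than the actual wxPLViewer code;
Mailbox and poll are hypothetical names. The idea is that a periodic
timer calls poll(), which takes whatever has arrived using a
non-blocking acquire and returns immediately, so the caller never
hangs waiting for data.

```python
import threading
from collections import deque


class Mailbox:
    """Queue of pending items guarded by a counting semaphore."""

    def __init__(self):
        self.items = deque()
        self.available = threading.Semaphore(0)   # counts unread items

    def post(self, item):
        self.items.append(item)
        self.available.release()


def poll(mailbox):
    """Drain everything that has arrived so far without ever blocking.

    Intended to be called from a periodic timer; returns an empty list
    when there is nothing new.
    """
    got = []
    while mailbox.available.acquire(blocking=False):
        got.append(mailbox.items.popleft())
    return got
```

The cost of this approach is the trade-off in choosing the timer
interval: a short interval wastes CPU on empty polls, a long one adds
latency before new data is noticed.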


If you want to do two way communications you will need two areas of
shared memory, one for each direction, otherwise you will likely have
a race condition for which process will write next. As it happens,
like you said the transfer is almost all one way, with just a small
amount being returned. So I just allocated a small portion of the
shared memory to represent a struct to hold some specific returned
information. I'm doing this from memory, but I think it would work
like this for the case of getting text size.

1. Sender zeros the flag which indicates a text size is available.
2. Sender sends a message to the Viewer saying it wants a text size,
   along with the string it wants sizing.
3. Sender starts checking the flag to say that the text size is available.
4. Viewer receives the message requesting text size.
5. Viewer determines the text size and writes it to the location in
   shared memory reserved for text size.
6. Viewer sets the flag which indicates text size is available to 1.
7. Sender sees the text size flag is one.
8. Sender reads the text size from the location in shared memory that is
   reserved for text size.
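
The sequence above can be sketched roughly as follows. This is a
Python illustration with threads in place of processes, not the actual
driver code: ReplyArea, viewer and get_text_size are hypothetical
names, a queue stands in for the one-way command stream, and the
"8 pixels per character, 16 high" metric is a placeholder for whatever
the Viewer really measures.

```python
import queue
import threading
import time


class ReplyArea:
    """Stand-in for the small reserved struct in shared memory."""

    def __init__(self):
        self.size_ready = 0   # flag: 1 when width and height are valid
        self.width = 0
        self.height = 0


def viewer(requests, reply):
    """Viewer side: answer text-size requests until None is received."""
    while True:
        text = requests.get()
        if text is None:
            return
        # Placeholder metric, purely for the sketch.
        reply.width, reply.height = 8 * len(text), 16
        reply.size_ready = 1              # set the flag last


def get_text_size(requests, reply, text, interval=0.001):
    """Sender side: zero the flag, send the request, then poll the flag."""
    reply.size_ready = 0
    requests.put(text)
    while reply.size_ready != 1:
        time.sleep(interval)
    return reply.width, reply.height
```

The ordering matters in two places: the Sender clears the flag before
sending the request, and the Viewer writes the size before setting the
flag, so the Sender never reads a stale or half-written result.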

The same is basically true for getting position information for
pllocate calls. But there is a pause on the Viewer side while it waits
for user input.

Of course the alternative is to set up a totally generic symmetric
system; I'm not sure which one is easier or faster.

I guess all these things contribute to the complexity of the code that
is there, but I'm not sure that it is more complex than necessary.
I'll be interested to see how your setup differs :-)

Phil



On 7 February 2017 at 02:01, Alan W. Irwin  wrote:
> On 2017-02-06 23:52- p.d.rosenb...@gmail.com wrote:
>
>> Hi Alan
>
>
>> Not exactly sure what you mean by complex? It is not always possible
>> to send all data, as the shared memory is finite size and therefore
>> the data to be sent may be bigger than the shared memory.
>
> Hi Phil:
>
> To get a preview of what I mean by implementing an approach that is
> simpler than the current one, I suggest you take a look at the code in
> cmake/test_linux_ipc.  There, the shared memory buffer size is
> relatively small, that buffer is _not_ a circular buffer, and
> typically the amount of data to be transmitted >> shared memory buffer
> size.  The data are split up into chunks that fit into the buffer on
> the sending side and those chunks are assembled on the receiving side
> under the control of two semaphores.  The result is an efficient
> transfer of the entirety of what can be an extremely large amount of
> data (25MB in one test) between sender and receiver with relatively
> simple code and relatively small shared memory buffer.
>
> My current plan is to have a generic function "send" for sending a
> generic array of bytes and a generic function "receive" for receiving
> that array where those functions contain all the details of the
> two-semaphore method for transmitting and receiving a generic data
> array.  Then higher level code would create an array to be sent or
> received and then they would use this send/receive API to transfer
> those arrays.  Of course, the usual case is that -dev wxwidgets
> normally would call the "send" API and wxPLViewer normally would
> call the "receive" API, but when those roles are reversed
> (e.g., when transmitting back the physical size of displayed text
> strings), then wxPLViewer will be calling the "send" API and -dev
> wxwidgets will be using the "receive" API.
>
>> I presume it's this named semaphore and/or memory flags that you intend to
>> remove?
>
>
> I am definitely going to keep named semaphores (see step 3 in my
> original plan where a small change to the two-semaphores approach
> should change that