Re: [Plplot-devel] Recent progress with wxwidgets IPC
Hi Alan

It'll definitely be interesting to see the differences. I wasn't trying to put you off, just hoping you can get a boost from the things that I learnt as I went along. That was the first time I'd tried IPC locally, although I did some one-way network stuff years ago. It did turn out to be more difficult than I expected, especially once I realised there needed to be some two-way comms.

One point about the amount to send: yes, you are correct about the amountToCopy variable, but that is for exactly one transfer. However, I don't think there is a reliable end-of-page command, so you can't store up commands and send them all in one go when the plot is complete. Also, the viewer never knows if the page is complete, so it must always keep rechecking for new data.

Phil

Sent from my Windows 10 phone

From: Alan W. Irwin
Sent: 07 February 2017 20:51
To: Phil Rosenberg
Cc: PLplot development list
Subject: Re: [Plplot-devel] Recent progress with wxwidgets IPC

Hi Phil:

You make a lot of points about some uncertainties in what I propose to do, and I agree there are such uncertainties. So this is definitely a "show them the code" moment. At worst, I will strip it all out again because it will turn out to be complex and slow. But it could be significantly less complex (no polling!) and just as efficient or better. So we will see. Further discussion of that "no polling" point is below.

On 2017-02-07 13:51- Phil Rosenberg wrote:

[...]

> In a simple restart-from-the-beginning buffer like you seem to be proposing the sender must wait until the reader has read all the data from the buffer before it can send more data.

True, but nevertheless the mini-project demonstrated this was an efficient method of moving many MBytes at one go, since effectively the only costs (assuming overheads are small) are a memcpy of those bytes on the sending end and a memcpy of those bytes on the receiving end, and (this is the important point) without any polling at all!
Note, in principle for GigaHertz machines a 1 GByte memcpy should only require 1 second or so. So we are discussing really small inefficiencies here, so long as the size of the buffer is large enough to make the overheads of the method negligible (the overhead of setting up the transfer of control to the memcpy routine for each chunk, and the overhead of checking semaphores as each chunk is passed). So I would expect the method to be inefficient for really small buffer sizes (such as 100 bytes or so). But, for example, I was already getting quite efficient results with a buffer size of just 1K (!), so I don't think the overheads of this two-semaphore method are that big a deal. Of course, if you add polling to the mix, then that would introduce a lot of efficiency concerns. But there is no polling needed or used in this mini-project, so that is why it is efficient, and I believe I can avoid polling as well for the wxwidgets IPC method, see below.

[...]

> Don't forget also that while waiting for new data the semaphore cannot block indefinitely. To do so would hang wxPLViewer or the sender software. I think also there is no way to tell if a page is finished or whether there is more data to come. Therefore you must use non-blocking semaphores and poll them on regular intervals.

My assumption is that the sending side knows exactly how many bytes it needs to send. And my preliminary analysis is that this is exactly what the present code calculates with the amountToCopy variable in wxPLDevice::TransmitBuffer. So with that assumption (and as demonstrated by the mini-project) there is no need for non-blocking semaphores or polling if you use the two-semaphore method. Of course, this line of reasoning completely falls apart if amountToCopy does not do what its name implies, so please let me know if there is some case where that calculation is unreliable.

My efficiency test results for the case where the -np option is not used show that the inefficiencies of the present wxwidgets IPC are negligible compared to wxPLViewer taking a long time (30 seconds for example 8) to render the plot, while other interactive devices take of the order of a second for this same task. Most of this large time interval occurs after -dev wxwidgets is completely finished, so IPC inefficiency cannot be the source of this wxPLViewer inefficiency for cases like example 8 where large numbers of graphical elements are being plotted. Therefore, from this evidence you do have the polling interval used in the present one-semaphore method tuned up fairly well (at least for typical PC hardware).

So my fundamental goal here is making our wxwidgets IPC a lot simpler for POSIX systems by eliminating the polling and the rest of the circular buffer logic for that case. I am hoping for some noticeable improvement in efficiency due to this, but I am not counting on anything showing at all in that regard until at least the non-IPC inefficiency of wxPLViewer is addressed.

In sum, it is "show the code" time.
Re: [Plplot-devel] Recent progress with wxwidgets IPC
Hi Phil:

You make a lot of points about some uncertainties in what I propose to do, and I agree there are such uncertainties. So this is definitely a "show them the code" moment. At worst, I will strip it all out again because it will turn out to be complex and slow. But it could be significantly less complex (no polling!) and just as efficient or better. So we will see. Further discussion of that "no polling" point is below.

On 2017-02-07 13:51- Phil Rosenberg wrote:

[...]

> In a simple restart-from-the-beginning buffer like you seem to be proposing the sender must wait until the reader has read all the data from the buffer before it can send more data.

True, but nevertheless the mini-project demonstrated this was an efficient method of moving many MBytes at one go, since effectively the only costs (assuming overheads are small) are a memcpy of those bytes on the sending end and a memcpy of those bytes on the receiving end, and (this is the important point) without any polling at all! Note, in principle for GigaHertz machines a 1 GByte memcpy should only require 1 second or so. So we are discussing really small inefficiencies here, so long as the size of the buffer is large enough to make the overheads of the method negligible (the overhead of setting up the transfer of control to the memcpy routine for each chunk, and the overhead of checking semaphores as each chunk is passed). So I would expect the method to be inefficient for really small buffer sizes (such as 100 bytes or so). But, for example, I was already getting quite efficient results with a buffer size of just 1K (!), so I don't think the overheads of this two-semaphore method are that big a deal. Of course, if you add polling to the mix, then that would introduce a lot of efficiency concerns. But there is no polling needed or used in this mini-project, so that is why it is efficient, and I believe I can avoid polling as well for the wxwidgets IPC method, see below.

[...]

> Don't forget also that while waiting for new data the semaphore cannot block indefinitely. To do so would hang wxPLViewer or the sender software. I think also there is no way to tell if a page is finished or whether there is more data to come. Therefore you must use non-blocking semaphores and poll them on regular intervals.

My assumption is that the sending side knows exactly how many bytes it needs to send. And my preliminary analysis is that this is exactly what the present code calculates with the amountToCopy variable in wxPLDevice::TransmitBuffer. So with that assumption (and as demonstrated by the mini-project) there is no need for non-blocking semaphores or polling if you use the two-semaphore method. Of course, this line of reasoning completely falls apart if amountToCopy does not do what its name implies, so please let me know if there is some case where that calculation is unreliable.

My efficiency test results for the case where the -np option is not used show that the inefficiencies of the present wxwidgets IPC are negligible compared to wxPLViewer taking a long time (30 seconds for example 8) to render the plot, while other interactive devices take of the order of a second for this same task. Most of this large time interval occurs after -dev wxwidgets is completely finished, so IPC inefficiency cannot be the source of this wxPLViewer inefficiency for cases like example 8 where large numbers of graphical elements are being plotted. Therefore, from this evidence you do have the polling interval used in the present one-semaphore method tuned up fairly well (at least for typical PC hardware).

So my fundamental goal here is making our wxwidgets IPC a lot simpler for POSIX systems by eliminating the polling and the rest of the circular buffer logic for that case. I am hoping for some noticeable improvement in efficiency due to this, but I am not counting on anything showing at all in that regard until at least the non-IPC inefficiency of wxPLViewer is addressed.

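To make the two-semaphore idea discussed above concrete, here is a minimal sketch. It is not the code in cmake/test_linux_ipc or in wxPLDevice::TransmitBuffer. It assumes two threads standing in for the two processes, a plain struct standing in for the shared memory area, and a small mutex/condition-variable class standing in for POSIX named semaphores. The names Semaphore, SharedArea, kBufferSize, sendBytes, receiveBytes and roundTrip are all hypothetical, as is the choice to publish the total byte count alongside each chunk so the receiver knows when it is done.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <string>
#include <thread>

// Blocking counting semaphore (stand-in for a POSIX named semaphore).
class Semaphore {
public:
    explicit Semaphore(unsigned initial) : count_(initial) {}
    void post() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;
        cv_.notify_one();
    }
    void wait() {  // sleeps until posted; there is no polling loop
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    unsigned count_;
};

const std::size_t kBufferSize = 1024;  // deliberately small, not circular

// Stand-in for the shared memory area (hypothetical layout).
struct SharedArea {
    std::size_t total;       // total bytes in the whole transfer
    std::size_t used;        // valid bytes in data for the current chunk
    char data[kBufferSize];
    Semaphore writable;      // posted when the buffer may be overwritten
    Semaphore readable;      // posted when a chunk is ready to be read
    SharedArea() : total(0), used(0), writable(1), readable(0) {}
};

// Sender: split n bytes into chunks that fit the buffer.
void sendBytes(SharedArea& shm, const char* bytes, std::size_t n) {
    std::size_t offset = 0;
    do {
        const std::size_t chunk = std::min(kBufferSize, n - offset);
        assert(chunk <= kBufferSize);
        shm.writable.wait();                      // block until buffer is free
        shm.total = n;
        shm.used = chunk;
        std::memcpy(shm.data, bytes + offset, chunk);
        shm.readable.post();                      // hand chunk to the receiver
        offset += chunk;
    } while (offset < n);
}

// Receiver: reassemble chunks until the announced total has arrived.
std::string receiveBytes(SharedArea& shm) {
    std::string out;
    std::size_t total = 0;
    do {
        shm.readable.wait();                      // block until a chunk arrives
        total = shm.total;
        out.append(shm.data, shm.used);
        shm.writable.post();                      // buffer may be reused
    } while (out.size() < total);
    return out;
}

// Helper: run sender and receiver concurrently and return what arrived.
std::string roundTrip(const std::string& payload) {
    SharedArea shm;
    std::string received;
    std::thread reader([&] { received = receiveBytes(shm); });
    sendBytes(shm, payload.data(), payload.size());
    reader.join();
    return received;
}
```

Calling roundTrip with a payload larger than kBufferSize exercises the chunking path. The sketch relies on the sender knowing the total size up front, which is the assumption about amountToCopy raised above; it says nothing about whether a page is complete.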
In sum, it is "show the code" time. That is, it is pretty clear that what I have said above has speculative elements, and similarly for any further replies you make (unless you know of some cases where amountToCopy is definitely unreliable). So my focus from now on will be to continue my project of implementing the two-semaphore method for wxwidgets IPC. Once I have completed that implementation, we should evaluate that code, and obviously if it is simpler and there is at least no drop in efficiency we should adopt it, but otherwise not.

Alan
__
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state implementation for stellar interiors (freeeos.sf.net); the Time Ephemerides project (timeephem.sf.net); PLplot scientific plotting software package (plplot.sf.net); the libLASi project (unifont.org/lasi); the Loads of Linux Links project
Re: [Plplot-devel] Recent progress with wxwidgets IPC
Hi Alan

That seems reasonable. I'm not sure what the benefits are though. Should the new way be quicker?

In terms of the overall complexity, here are some of the things I found while setting up the current system that each made the system more complex than I initially envisaged. I imagine you will come across many of these too.

The reason that I went for a circular buffer is that data can be sent continuously even while the reader is reading, that is, until the writer catches up with the reader. In a simple restart-from-the-beginning buffer like you seem to be proposing the sender must wait until the reader has read all the data from the buffer before it can send more data. In fact I have a feeling that the system as I have it now makes little or no use of the actual semaphores and just uses flags in specified locations in the shared memory. I would have to look through the code properly to check.

Don't forget also that while waiting for new data the semaphore cannot block indefinitely. To do so would hang wxPLViewer or the sender software. I think also there is no way to tell if a page is finished or whether there is more data to come. Therefore you must use non-blocking semaphores and poll them on regular intervals.

If you want to do two-way communications you will need two areas of shared memory, one for each direction, otherwise you will likely have a race condition for which process will write next. As it happens, like you said, the transfer is almost all one way, with just a small amount being returned. So I just allocated a small portion of the shared memory to represent a struct to hold some specific returned information. I'm doing this from memory, but I think it would work like this for the case of getting text size.

Sender zeros the flag which indicates a text size is available.
Sender sends a message to the Viewer saying it wants a text size, together with the string it wants sizing.
Sender starts checking the flag to see whether the text size is available.
Viewer receives the message requesting text size.
Viewer determines the text size and writes it to the location in shared memory reserved for text size.
Viewer sets the flag which indicates text size is available to 1.
Sender sees the text size flag is one.
Sender reads the text size from the location in shared memory that is reserved for text size.

The same is basically true for getting position information for pllocate calls. But there is a pause on the Viewer side while it waits for user input.

Of course the alternative is to set up a totally generic symmetric system. I'm not sure if one is easier or faster.

I guess all these things contribute to the complexity of the code that is there, but I'm not sure that it is more complex than necessary. I'll be interested to see how your setup differs :-)

Phil

On 7 February 2017 at 02:01, Alan W. Irwin wrote:
> On 2017-02-06 23:52- p.d.rosenb...@gmail.com wrote:
>
>> Hi Alan
>
>> Not exactly sure what you mean by complex? It is not always possible
>> to send all data, as the shared memory is finite size and therefore
>> the data to be sent may be bigger than the shared memory.
>
> Hi Phil:
>
> To get a preview of what I mean by implementing an approach that is
> simpler than the current one, I suggest you take a look at the code in
> cmake/test_linux_ipc. There, the shared memory buffer size is
> relatively small, that buffer is _not_ a circular buffer, and
> typically the amount of data to be transmitted >> shared memory buffer
> size. The data are split up into chunks that fit into the buffer on
> the sending side and those chunks are assembled on the receiving side
> under the control of two semaphores. The result is an efficient
> transfer of the entirety of what can be an extremely large amount of
> data (25MB in one test) between sender and receiver with relatively
> simple code and relatively small shared memory buffer.
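For reference, the text-size exchange that Phil lists step by step earlier in this message can be sketched as below. This is an illustration only, written from the description in the email rather than from the PLplot sources. A thread stands in for the Viewer process, a struct stands in for the reserved region of shared memory, and the "message" to the Viewer is simply the thread start. The names TextSizeReply, viewerHandleRequest and requestTextSize are hypothetical, and the 8 x 16 pixels-per-character metric is a made-up placeholder for the real text measurement.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <utility>

// Hypothetical layout of the small region reserved for returned data.
struct TextSizeReply {
    std::atomic<int> available;  // 0 = not ready, 1 = width/height are valid
    int width;
    int height;
    TextSizeReply() : available(0), width(0), height(0) {}
};

// Viewer side: measure the string, write the result, then set the flag.
void viewerHandleRequest(TextSizeReply& reply, const std::string& text) {
    // The sender must have zeroed the flag before making the request.
    assert(reply.available.load() == 0);
    // Placeholder metric (assumption): 8 x 16 pixels per character.
    reply.width = static_cast<int>(text.size()) * 8;
    reply.height = 16;
    // The flag is set last so the sender never reads a half-written size.
    reply.available.store(1, std::memory_order_release);
}

// Sender side: follows the sequence of steps in the message.
std::pair<int, int> requestTextSize(TextSizeReply& reply,
                                    const std::string& text) {
    // Sender zeros the flag which indicates a text size is available.
    reply.available.store(0, std::memory_order_release);
    // Sender "sends" the request; starting the thread stands in for the
    // message carrying the string to be sized.
    std::thread viewer(viewerHandleRequest, std::ref(reply), text);
    // Sender keeps checking the flag (this is the polling step).
    while (reply.available.load(std::memory_order_acquire) == 0) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // Sender reads the text size from the reserved location.
    std::pair<int, int> size(reply.width, reply.height);
    viewer.join();
    return size;
}
```

Here the sender polls the flag at a fixed interval, as in the description. Replacing the polling loop with a blocking wait on a second semaphore is the kind of change the two-semaphore proposal is aiming at.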
>
> My current plan is to have a generic function "send" for sending a
> generic array of bytes and a generic function "receive" for receiving
> that array where those functions contain all the details of the
> two-semaphore method for transmitting and receiving a generic data
> array. Then higher level code would create an array to be sent or
> received and then they would use this send/receive API to transfer
> those arrays. Of course, the usual case is that -dev wxwidgets
> normally would call the "send" API and wxPlViewer normally would
> call the "receive" API, but when those roles are reversed
> (e.g., when transmitting back the physical size of displayed text
> strings), then wxPlViewer will be calling the "send" API and -dev
> wxwidgets will be using the "receive" API.
>
>> I presume it's this named semaphore and/or memory flags that you intend to
>> remove?
>
> I am definitely going to keep named semaphores (see step 3 in my
> original plan where a small change to the two-semaphores approach
> should change that