On Sun, 23 Aug 2015 14:51:04 -0700, you wrote:

>OK so with all that in mind, I'm back to square 1. These processes can not
>share the same memory space. libmongoose seems to love stomping all over
>the stack, and I'm fairly sure it is not thread safe. Which is why I'm
>using two separate executables.

Linux (someone please correct me if I am wrong) *has* to be thread
safe.  

The problem I see is that you are not using the parts of the OS that
are designed to keep you from messing things up.
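
A minimal sketch of what I mean, assuming Linux and a C toolchain. None
of this comes from the code under discussion: the struct layout and the
names shared_block, publish, snapshot and demo_no_torn_reads are made up
for illustration, and it uses an anonymous shared mapping plus fork()
rather than two executables mapping a /dev/shm file, only to stay
self-contained. The part that matters is the process-shared semaphore,
which lives in the same mapping as the data it protects.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS */
#include <sched.h>
#include <semaphore.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record; the three values mirror the example in this thread. */
struct shared_block {
    sem_t    lock;                 /* process-shared binary semaphore */
    unsigned seq;                  /* bumped once per complete update */
    double   voltage, current, frequency;
};

/* Map one block visible to parent and child, and arm the semaphore. */
static struct shared_block *shared_block_create(void)
{
    struct shared_block *b = mmap(NULL, sizeof *b, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return NULL;
    memset(b, 0, sizeof *b);
    if (sem_init(&b->lock, 1 /* shared between processes */, 1) != 0) {
        munmap(b, sizeof *b);
        return NULL;
    }
    return b;
}

/* Writer: the whole record changes inside the critical section. */
static void publish(struct shared_block *b, double v, double i, double f)
{
    sem_wait(&b->lock);            /* error handling trimmed for brevity */
    b->voltage = v;
    b->current = i;
    b->frequency = f;
    b->seq++;
    sem_post(&b->lock);
}

/* Reader: copy the record out under the same lock, then work on the copy. */
static unsigned snapshot(struct shared_block *b, double out[3])
{
    sem_wait(&b->lock);
    out[0] = b->voltage;
    out[1] = b->current;
    out[2] = b->frequency;
    unsigned seq = b->seq;
    sem_post(&b->lock);
    return seq;
}

/* Child publishes n records whose three fields are always equal; the parent
 * reads until it has seen the last one. Returns 0 if no half-written record
 * was observed, 1 if one was, -1 on setup failure. */
int demo_no_torn_reads(int n)
{
    struct shared_block *b = shared_block_create();
    if (b == NULL)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        munmap(b, sizeof *b);
        return -1;
    }
    if (pid == 0) {
        for (int k = 1; k <= n; k++)
            publish(b, k, k, k);
        _exit(0);
    }
    int torn = 0;
    double r[3];
    unsigned seq;
    do {
        seq = snapshot(b, r);
        if (r[0] != r[1] || r[1] != r[2])
            torn = 1;
        sched_yield();             /* do the waiting outside the lock */
    } while (seq < (unsigned)n);
    waitpid(pid, NULL, 0);
    sem_destroy(&b->lock);
    munmap(b, sizeof *b);
    return torn;
}
```

Calling demo_no_torn_reads(1000) from a main() exercises both sides. With
two separate executables the same struct would sit at the start of the
mapped file, and only one side would call sem_init(). Older glibc
versions need -pthread at link time.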


>
>*OR* maybe I could go crazy and malloc() everything ? heheh no way ;)

malloc, or perhaps a lower-level version that the OS uses underneath
it (one that takes the necessary locks and deals with the memory
manager in the chips), should only allocate memory within the
program's space; once that memory is allocated, it is automatically
assigned to that program.
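
A small sketch of that point, assuming Linux; the function name
heap_vs_shared is made up for illustration. After fork(), a write to
malloc'd memory in the child stays in the child, while a write to a
MAP_SHARED mapping is seen by the parent.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS */
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 42 into a malloc'd int and into a shared mapping.
 * Reports what the parent sees afterwards. Returns 0 on success. */
int heap_vs_shared(int *heap_seen, int *shared_seen)
{
    int *heap = malloc(sizeof *heap);
    if (heap == NULL)
        return -1;
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) {
        free(heap);
        return -1;
    }
    *heap = 1;
    *shared = 1;

    int rc = -1;
    pid_t pid = fork();
    if (pid == 0) {
        *heap = 42;                /* lands in the child's private copy */
        *shared = 42;              /* lands in memory both processes map */
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);
        *heap_seen = *heap;        /* still the parent's own value */
        *shared_seen = *shared;    /* the child's write is visible */
        rc = 0;
    }
    free(heap);
    munmap(shared, sizeof *shared);
    return rc;
}
```

So malloc() on its own never gives two processes a common buffer; the
sharing has to be asked for explicitly, which is also where the need for
synchronization comes from.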

An operating system is about managing resources, giving them to a
program, and making the whole thing graceful.

Harvey

>
>On Sun, Aug 23, 2015 at 2:41 PM, Harvey White <ma...@dragonworks.info>
>wrote:
>
>> On Sun, 23 Aug 2015 14:05:12 -0700, you wrote:
>>
>> >Walter,
>> >
>> >Thank you for your reply.
>> >
>> >I've examined pretty much all of SYSV and POSIX IPC mechanisms. I'm no
>> >expert here, as this is really my first go with anything IPC, and pretty
>> >much my first "major" application running on Linux.
>>
>> Which means, perhaps, the first application where the OS is a real
>> factor.
>> >
>> >Pipes may not be fast enough for what I'm trying to accomplish. To keep the
>> >explanation short: I'm only tracking one PGN. A PGN for fastpackets is a
>> >set of data items in this case. For this one PGN I'm dealing with 3 items
>> >in data ( voltage, current and frequency ), but program wide I have to keep
>> >track of much more. This PGN is also only one of roughly 20, with most
>> >PGNs issuing data sets of varying length 2 times a second . . .
>>
>> The problem may be more of "how much data and how long to process it"
>> rather than the frequency of the data itself.
>>
>> You are correct to consider context switching time.
>>
>> >
>> >
>> >It may be I'll have to somehow rate limit the data I'll be dealing with. I
>> >did consider POSIX message queues, but according to what I've read, POSIX
>> >shared memory is the fastest of all IPC mechanisms. While I do agree
>> >that it is not very easy, I personally think shared memory is easy now
>> >that I understand a lot of it. At minimum, it's not very hard to understand
>> >the idea and implement it in code. Semaphores, mutexes, and threads, however,
>> >I do find a bit intimidating. At minimum, I personally think they're overly
>> >complex.
>>
>> Hmmm, perhaps not quite that intimidating.
>>
>> A thread is a path of execution.  A single program consisting of a
>> loop and a single interrupt has two threads.
>>
>> Threads share common resources, data, address space.  It's up to you
>> to make them well behaved about what changes what and why.... That's
>> why microprocessors save the registers on the stack for an interrupt.
>>
>> Processes are threads with isolated resources.  Each process ideally
>> thinks that it is the only thing running in a processor, and data just
>> "magically" appears.  The OS's job is to keep the processes separate.
>>
>> Mutexes and semaphores are similar, and are synchronization mechanisms
>> between either threads or processes.  Please look up the definition
>> and explanation of "critical section" in programming.
>>
>> The idea is to have a flag that can be changed without interference
>> from another process, or for that matter, can be read without
>> interfering with another process.  This could be a complete message.
>>
>> The mutexes and semaphores serve to synchronize two processes which,
>> by the very nature of an operating system, *cannot* be guaranteed to
>> be synchronous.
>>
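
A minimal sketch of a critical section, assuming POSIX threads; the
names counter, worker and count_with_two_threads are made up for
illustration. Two threads bump the same counter, and the mutex makes
each read-modify-write happen without interference from the other.

```c
#include <pthread.h>

struct counter {
    pthread_mutex_t lock;
    long value;
    long rounds;
};

/* Each worker increments the shared value, one critical section per step. */
static void *worker(void *arg)
{
    struct counter *c = arg;
    for (long i = 0; i < c->rounds; i++) {
        pthread_mutex_lock(&c->lock);
        c->value++;                /* the critical section */
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Runs two workers and returns the final count, or -1 on failure. */
long count_with_two_threads(long rounds)
{
    struct counter c = { PTHREAD_MUTEX_INITIALIZER, 0, rounds };
    pthread_t a, b;

    if (pthread_create(&a, NULL, worker, &c) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, &c) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

With the lock in place the result is always 2 * rounds. Take the
lock/unlock pair out and the total will usually come up short, which is
the same kind of error as a reader catching a half-written message.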
>>
>> >
>> >I have thought about a lot of different approaches, and I'm not saying my
>> >approach won't change. This is just where I am right now. Stumbling about
>> >learning the various Linux APIs / libraries. Using, and understanding
>> >fork() is on my TODO list, I just have not made it there yet. These two
>> >processes are actually two separate executables. I am a bit worried about
>> >process context switching though. I mean I'm sure I am incurring some
>> >penalty right now running two separate executables, but I'm not sure it
>> >would be the same using threads.
>>
>> It actually would be the same with threads vs. processes.  The only
>> real difference is that the threads share the same address space as
>> each other, so they have access to variables without a special
>> mechanism (which would take time).
>>
>> Processes, as I mentioned, run in their own worlds, with the operating
>> system controlling what they see (resources, shared memory, etc). That
>> mechanism has overhead.
>>
>> So yes, threads are faster than processes, but more dangerous.
>>
>> Harvey
>>
>> >
>> >On Sun, Aug 23, 2015 at 1:42 PM, William Hermans <yyrk...@gmail.com>
>> >wrote:
>> >
>> >>> 1) what stops process A from writing to the shared buffer if process B
>> >>> is reading it?
>> >>
>> >>
>> >> Nothing. I assume that writes are slower, or at most as fast as reads.
>> >> Both reads, and writes are done using a mmap'd pointer.
>> >>
>> >>> 2) what keeps B from getting an incomplete or inaccurate value from
>> >>> process A for the byte position?  is it a byte variable or is it an
>> >>> integer?  Does the processor write this as an integer in one
>> >>> uninterruptible process?
>> >>>
>> >>
>> >> Aside from the fact that the byte position I'm testing here is a source
>> >> ID of two different devices: nothing. They do come in order, one after
>> >> the other, however. This is not permanent. When I start tracking
>> >> more data, this will still work for one set of data, but not for other
>> >> sets of data. Write / read type is char. No way really to get this wrong,
>> >> as with -Wall gcc will warn. I have no errors or warnings when compiling.
>> >>
>> >>> 3) if both A and B access Internet devices (over the same interface
>> >>> I'd guess), what stops the data collision between process A and
>> >>> process B?  What protects that Internet resource?  What is the result
>> >>> if both A and B read a status register at the same time (in the
>> >>> hardware)?
>> >>
>> >> No. I guess more correctly they are socket devices. Both using Linux
>> >> network sockets. socketcan for CANBus, and standard Linux sockets for
>> >> ethernet. The web libraries I did not write. It's libmongoose.
>> >>
>> >> On Sun, Aug 23, 2015 at 1:06 PM, Harvey White <ma...@dragonworks.info>
>> >> wrote:
>> >>
>> >>> On Sun, 23 Aug 2015 11:44:13 -0700, you wrote:
>> >>>
>> >>> >Ok. In my case however -
>> >>> >
>> >>> >Process A writes to shared memory only.
>> >>> >Process B Reads from shared memory only.
>> >>>
>> >>> Ok, so that eliminates one form of data corruption.
>> >>> >
>> >>> >As it stands Process B starts off with a variable set to 0x00, then
>> >>> >compares this to a byte position in the file. When Process B first
>> >>> >starts, this comparison will always fail. Process B then copies the
>> >>> >contents of the file, sets the variable to the value at the byte
>> >>> >position, then sends the data out over a websocket.
>> >>>
>> >>> Ok:
>> >>> 1) what stops process A from writing to the shared buffer if process B
>> >>> is reading it?
>> >>>
>> >>> 2) what keeps B from getting an incomplete or inaccurate value from
>> >>> process A for the byte position?  is it a byte variable or is it an
>> >>> integer?  Does the processor write this as an integer in one
>> >>> uninterruptible process?
>> >>>
>> >>> 3) if both A and B access Internet devices (over the same interface
>> >>> I'd guess), what stops the data collision between process A and
>> >>> process B?  What protects that Internet resource?  What is the result
>> >>> if both A and B read a status register at the same time (in the
>> >>> hardware)?
>> >>>
>> >>> Harvey
>> >>>
>> >>>
>> >>>
>> >>> >
>> >>> >On the next iteration of the loop cycle, Process B reads this value
>> >>> >again and makes the comparison, which will likely succeed. The loop
>> >>> >cycle then continues until this comparison fails again, at which point
>> >>> >the logic repeats. It's pretty simple - Or so I thought.
>> >>> >
>> >>> >The reasoning for this development model is simple: code segregation.
>> >>> >Code in process B does not play well with the code in process A. They're
>> >>> >both accessing network devices, and when it happens simultaneously, data
>> >>> >gets lost. Which happens more often than not.
>> >>> >
>> >>> >On Sun, Aug 23, 2015 at 9:39 AM, Harvey White <ma...@dragonworks.info>
>> >>> >wrote:
>> >>> >
>> >>> >> On Sun, 23 Aug 2015 08:52:53 -0700, you wrote:
>> >>> >>
>> >>> >> >Hi Harvey,
>> >>> >> >
>> >>> >> >Thanks for the response. I think the biggest question in my mind is:
>> >>> >> >OK, so perhaps I have a synchronization problem that rears its head
>> >>> >> >once in a while. But is this really that much of a problem, one which
>> >>> >> >may cause both processes to stop ?
>> >>> >> >
>> >>> >> >A sample here and there once in a while that does not display
>> >>> >> >because it is malformed does not bother me. The processes stopping
>> >>> >> >does. I can not see how this could be causing the processes to stop.
>> >>> >> >However . . . I honestly do not know one way or the other.
>> >>> >>
>> >>> >> Process A: while process B is busy, wait, then read from process B
>> >>> >>
>> >>> >> Process B: while process A is busy, wait, then read from process A
>> >>> >>
>> >>> >> Classic deadlock.
>> >>> >>
>> >>> >> Process A: wait for permission to read special area, read, then wait
>> >>> >> outside that permission area.  No restrictions on process B except
>> >>> >> when accessing special area (which happens infrequently) .
>> >>> >>
>> >>> >> Process B: wait for permission to read special area, read, then wait
>> >>> >> outside that permission area.  No restrictions on process A except
>> >>> >> when accessing special area (which happens infrequently) .
>> >>> >>
>> >>> >> Since the waiting is outside that special area, and the processes
>> >>> >> are not allowed to hog the special area (and block the other
>> >>> >> process), then neither process can block the other except for a
>> >>> >> very brief time.
>> >>> >>
>> >>> >> The implication is that the process check and access special area
>> >>> >> takes a very small time, and the wait/do something else part takes a
>> >>> >> longer time.
>> >>> >>
>> >>> >> Harvey
>> >>> >>
>> >>> >> >On Sun, Aug 23, 2015 at 8:43 AM, Harvey White <ma...@dragonworks.info>
>> >>> >> >wrote:
>> >>> >> >
>> >>> >> >> On Sun, 23 Aug 2015 08:25:02 -0700, you wrote:
>> >>> >> >>
>> >>> >> >> >HI Przemek,
>> >>> >> >> >
>> >>> >> >> >> Since this involves two processes that as you say stop
>> >>> >> >> >> simultaneously, I'd suspect a latent synchronization bug. You
>> >>> >> >> >> don't say how you interlock your shared memory, but one
>> >>> >> >> >> possibility is that your reader code gets stuck because you
>> >>> >> >> >> overwrite the data while it's reading it. Debugging this type
>> >>> >> >> >> of thing is tricky, but maybe write a state machine that
>> >>> >> >> >> lights some LEDs that show the phases of your synchronization
>> >>> >> >> >> process, and wait to see where it's stuck.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> >Currently, I have no synchronization. At one point I was using a
>> >>> >> >> >byte in shared memory as a binary stopgap, but after a while it
>> >>> >> >> >was not working predictably. Now, I'm re-reading documentation on
>> >>> >> >> >POSIX semaphores, and creating a semaphore in shared memory,
>> >>> >> >> >instead of using a system wide resource.
>> >>> >> >>
>> >>> >> >> Then you have two things that happen with no predictable time
>> >>> >> >> relationship to each other at all.
>> >>> >> >>
>> >>> >> >> You could be writing part of a multibyte message when trying to
>> >>> >> >> read that message with another process.
>> >>> >> >>
>> >>> >> >> A binary semaphore controls access to the shared (message)
>> >>> >> >> resource. Checking the binary semaphore generally involves turning
>> >>> >> >> off interrupts so that the other process can't grab control during
>> >>> >> >> the check code.  If you have two separate processors, you still
>> >>> >> >> need to deal with the same thing, not so much interrupts, but
>> >>> >> >> permission to access.  The semaphore read/write must be atomic, and
>> >>> >> >> the access must be negotiated between the two processors (generally
>> >>> >> >> happens in hardware for two processors, happens in software for two
>> >>> >> >> processes running on the same processor).
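
A rough sketch of what "atomic" means for that check, assuming a C11
compiler with <stdatomic.h> and using two threads in place of two
processes to keep it short. The names busy, acquire, release, adder and
total_with_flag are made up for illustration. The test-and-set reads the
flag and sets it in one indivisible step, so two contenders can never
both see it as free.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_flag busy = ATOMIC_FLAG_INIT;   /* clear means "free" */
static long shared_total;

/* Read the old state and set the flag in one atomic step; retry while
 * someone else already holds it. */
static void acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&busy, memory_order_acquire))
        sched_yield();
}

static void release(void)
{
    atomic_flag_clear_explicit(&busy, memory_order_release);
}

static void *adder(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        acquire();
        shared_total++;            /* protected by the flag */
        release();
    }
    return NULL;
}

/* Two threads each add n; returns the total, or -1 on failure. */
long total_with_flag(long n)
{
    pthread_t a, b;

    shared_total = 0;
    if (pthread_create(&a, NULL, adder, &n) != 0)
        return -1;
    if (pthread_create(&b, NULL, adder, &n) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```

A plain byte used as a flag, read in one step and written in another,
has a gap between the two where the other side can slip in. In practice
a real semaphore or mutex is the better tool, since it sleeps instead of
spinning.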
>> >>> >> >> >
>> >>> >> >> >> I'd definitely look at this malformation---it could be the
>> >>> >> >> >> smoke from the real fire. Or not. In any case, this one should
>> >>> >> >> >> be easier to find---just wait for the message, inspect the
>> >>> >> >> >> data in firebug, and write a checker routine, inspecting your
>> >>> >> >> >> outgoing data, that watches for this type of distortion.
>> >>> >> >> >
>> >>> >> >> >
>> >>> >> >> >The first thing that comes to mind here, which I forgot to add to
>> >>> >> >> >my post last night, is that I am not zeroing out the shared memory
>> >>> >> >> >file before usage. I know this is bad . . . but am not convinced
>> >>> >> >> >this is what the problem is. However, since it is / can be a one
>> >>> >> >> >line of code fix, I will do so. The odd thing here is that I get
>> >>> >> >> >maybe 1-2 notifications an hour - if that. Then it is inside the
>> >>> >> >> >actual json object ( string pointer - e.g. char *buffer ) - not
>> >>> >> >> >outside.
>> >>> >> >> >
>> >>> >> >> >What does all this mean to me? The first impression that I get
>> >>> >> >> >out of this is that it is a synchronization issue. I'm still not
>> >>> >> >> >convinced though . . .
>> >>> >> >> >
>> >>> >> >>
>> >>> >> >> Analyze the code to see what happens if one process is writing
>> >>> >> >> while the other is reading.
>> >>> >> >>
>> >>> >> >> The error rate may be just a measure of how frequently this
>> >>> >> >> happens.
>> >>> >> >>
>> >>> >> >> Harvey
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> >Also, for what it's worth, I'm using mmap() and not file open(),
>> >>> >> >> >read(), write(). So the code is very fast.
>> >>> >> >> >
>> >>> >> >> >On Sun, Aug 23, 2015 at 6:40 AM, Przemek Klosowski <
>> >>> >> >> >przemek.klosow...@gmail.com> wrote:
>> >>> >> >> >
>> >>> >> >> >> On Sun, Aug 23, 2015 at 1:31 AM, William Hermans <yyrk...@gmail.com>
>> >>> >> >> >> wrote:
>> >>> >> >> >> > So I have a problem with some code I've been working on for
>> >>> >> >> >> > the last few months. The code, which is compiled into two
>> >>> >> >> >> > separate processes, suddenly stops working. No error, nothing
>> >>> >> >> >> > in dmesg, nothing in any file in /var/log, period. It did
>> >>> >> >> >> > however occur to me that rsyslog is likely or possibly
>> >>> >> >> >> > disabled.
>> >>> >> >> >> >
>> >>> >> >> >> > What my code does is read from the CAN peripheral, form
>> >>> >> >> >> > extended packets out of the CAN frames ( NMEA 2000
>> >>> >> >> >> > fastpackets ), and then write the data into a POSIX shared
>> >>> >> >> >> > memory file ( /dev/shm/file ).
>> >>> >> >> >>
>> >>> >> >> >> Since this involves two processes that as you say stop
>> >>> >> >> >> simultaneously, I'd suspect a latent synchronization bug. You
>> >>> >> >> >> don't say how you interlock your shared memory, but one
>> >>> >> >> >> possibility is that your reader code gets stuck because you
>> >>> >> >> >> overwrite the data while it's reading it. Debugging this type
>> >>> >> >> >> of thing is tricky, but maybe write a state machine that
>> >>> >> >> >> lights some LEDs that show the phases of your synchronization
>> >>> >> >> >> process, and wait to see where it's stuck.
>> >>> >> >> >>
>> >>> >> >> >> > The second process simply reads from the file, and shuffles
>> >>> >> >> >> > the data out over a websocket in json / human readable form.
>> >>> >> >> >> > The data on the web side of things is tested accurate,
>> >>> >> >> >> > although I do occasionally get a malformed json object
>> >>> >> >> >> > warning from firefox firebug.
>> >>> >> >> >>
>> >>> >> >> >> I'd definitely look at this malformation---it could be the
>> >>> >> >> >> smoke from the real fire. Or not. In any case, this one should
>> >>> >> >> >> be easier to find---just wait for the message, inspect the
>> >>> >> >> >> data in firebug, and write a checker routine, inspecting your
>> >>> >> >> >> outgoing data, that watches for this type of distortion.
>> >>> >> >> >>
>> >>> >> >> >> --
>> >>> >> >> >> For more options, visit http://beagleboard.org/discuss
>> >>> >> >> >> ---
>> >>> >> >> >> You received this message because you are subscribed to the
>> >>> Google
>> >>> >> >> Groups
>> >>> >> >> >> "BeagleBoard" group.
>> >>> >> >> >> To unsubscribe from this group and stop receiving emails from
>> it,
>> >>> >> send
>> >>> >> >> an
>> >>> >> >> >> email to beagleboard+unsubscr...@googlegroups.com.
>> >>> >> >> >> For more options, visit https://groups.google.com/d/optout.
>> >>> >> >> >>
>> >>> >> >>
>> >>> >>
>> >>>
>> >>
>> >>
>>
