Re: [PATCH 2/4] score: SMP initialization and shutdown changes

Chris Johns Thu, 27 Feb 2014 00:17:58 -0800

On 26/02/2014 7:43 pm, Sebastian Huber wrote:

First some words to this patch set.  It is a gradual improvement of the
existing SMP low-level initialization and shutdown procedure.


The existing solution was able to start an SMP system.  Only atomic
reads/writes were used.  It is impossible to implement a controlled
shutdown only with atomic reads/writes.

I cannot comment as I have no considered it in detail. It is a bigstatement to make. Maybe this is a suitable GSoC project.

See also "The Art of
Multiprocessor Programming", chapter 5, "The Relative Power of Primitive
Synchronization Operations".


I will take a look but it might be a while.

This patch introduces a statically initialized global SMP lock to manage
the per-CPU state changes (_Per_CPU_State_lock).  It changes also the
states and documents the state machine.

It is now also possible to use fatal error handlers to control the
shutdown on the lowest-level.

Without this patch on NGMP the situation is like this:

***  SMP03 TEST ***
   CPU 3 running task Init
   CPU 2 running task TA1
   CPU 1 running task TA2
   CPU 0 running task TA3
   CPU 0 running task TA4
*** END OF TEST SMP03 ***
CPU 0:  Unknown watchpoint hit
         0x0000b1c4: 30800000  ba,a  0x0000B1C4
<rtems_smp_process_interrupt+172>
CPU 1:  Unknown watchpoint hit
         0x0000b1c4: 30800000  ba,a  0x0000B1C4
<rtems_smp_process_interrupt+172>
CPU 2:  Unknown watchpoint hit
         0x0000b1c4: 30800000  ba,a  0x0000B1C4
<rtems_smp_process_interrupt+172>
CPU 3:  Program exited normally.

Now I have this:

***  SMP03 TEST ***
   CPU 3 running task Init
   CPU 2 running task TA1
   CPU 1 running task TA2
   CPU 0 running task TA3
   CPU 0 running task TA4
*** END OF TEST SMP03 ***

   CPU 0:  Program exited normally.
   CPU 1:  Power down mode
   CPU 2:  Power down mode
   CPU 3:  Power down mode

I think we should try to discuss problems with a specific patch in one
thread and general problems in a separate thread in the future.

Great. Please start the discussion. I am not doing the work or creatingthe patches and you are so it is up to you and in your interests to getthis under way and discussed as soon as possible. Starting a discussionand general design and review process leaves this patch and any othersin a difficult position to approve and it also effects your plannedwork. I am aware of this and sensitive to it however I do not agree withsome parts of the presented design and I am concerned about the futureparts that are known to be wrong and not covered in your current scopeof work. We need an open discussion where the top level requirementsneed to be defined and the design path settled and without it I amlimited to commenting in patches about general design issues.


On 2014-02-25 00:02, Chris Johns wrote:

On 21/02/2014 6:48 pm, Sebastian Huber wrote:

On 2014-02-21 01:31, Chris Johns wrote:

On 21/02/2014 3:31 am, Sebastian Huber wrote:

Hello Chris,

On 2014-02-20 03:50, Chris Johns wrote:

On 20/02/2014 12:42 am, Sebastian Huber wrote:

+/**
+ * @brief State of a processor.
+ *
+ * @dot
+ * digraph states {
+ *   bi [label="PER_CPU_STATE_BEFORE_INITIALIZATION"];
+ *   rsm [label="PER_CPU_STATE_READY_TO_START_MULTITASKING"];
+ *   sm [label="PER_CPU_STATE_START_MULTITASKING"];
+ *   ds [label="PER_CPU_STATE_DO_SHUTDOWN"];
+ *   u [label="PER_CPU_STATE_UP"];
+ *   s [label="PER_CPU_STATE_SHUTDOWN"];
+ *   bi -> rsm [label="secondary processor\ncompleted
initialization"];
+ *   bi -> u [label="main processor\nstarts multitasking"];
+ *   rsm -> sm [label="main processor\ncompleted initialization"];
+ *   rsm -> ds [label="a fatal error occurred"];
+ *   ds -> s [label="do shutdown\nstate observed"];
+ *   sm -> u [label="secondary processor\nstarts multitasking"];
+ *   u -> s [label="shutdown initiated"];
+ * }
+ * @enddot


I do not see sm to ds if the main core goes to s after going from bi
to u ?

I also see u to s and not to ds; why as the state is called "do
shutdown" ?
Should this state be PER_CPU_STATE_SHUTTING_DOWN ?


I had to change the procedure considerably since I noticed unsolvable
problems with the previous approach based only on atomic read/write
operations. Attached is the new state diagram.


I do not follow this picture. I am assume there is one state machine
per CPU
and so I do not see where the state "starting multitasking" is. I see
a state
called "request start multitasking" then "up" and I am wondering what
the state
is after the request has happened and before up.


The application defines a maximum count of processors (NP).  The
configuration will then define the _Per_CPU_Information[NP] table.  Each
table entry has one per-CPU state entry.


Does the BSP define the max number of possible cores the hardware can
support ?


Yes, via the return value of _CPU_SMP_Initialize().


This is what should be used to define the Per CPU area.

This patch changes nothing in this area.


I have made my comments on this above.

I see later in this email you talk about AP being available processors
and
defined by the BSP. I think I am getting confused with the term
"maximum count
of processors". Does it mean the max count the application may require
or the
max count supported by the hardware ?


It is a mix of both.  The application configures a desired maximum count
and the BSP may reduce it to match the actual hardware.

This seems confused and wrong and why I feel a general lack of top levelrequirements is appearing in the design. We do not configure any otherpart of the configuration this way. We do not let applications set thememory size to 1T byte of RAM and the BSP lowers it to the actual amountnor do we have the application set the CPU clock to 1PetaHz and the BSPlower it to the hardware defined maximum rate. It seemscounter-intuitive and goes against how everything else is configured.


There is another issue here I am struggling with and while
specifically not on
topic for this patch it relates to the configuration of processor
count so
please indulge me. The application having to define the number of
cores means
there is no default position and that means we have this ...

http://git.rtems.org/rtems/tree/testsuites/smptests/smp01/system.h#n28

which is based on this ...

http://git.rtems.org/rtems/tree/cpukit/sapi/include/confdefs.h#n205

A user builds RTEMS with --enable-smp and then has another gate in
CONFIGURE_SMP_APPLICATION. This does not make sense. Why build for SMP
and then
never use it yet incur lots of overhead ? If I build RTEMS with SMP
enabled for
a BSP that supports SMP I would expect it to work with SMP and not
default to
some half way point. For example if you build the Zync A9 qemu BSP
with SMP
enabled and build all the tests then run them on a patched qemu with SMP
support you happily see about 10 maybe 15 failures for over 480 tests
however
these result are meaningless because CONFIGURE_SMP_APPLICATION is only
defined
for a few SMP specific tests and incorrectly for the Zync because it
only has 2
cores and not 4. I feel we should not go to a release with the tests
in this
state even with clear documentation. If the tests fail with SMP we
need to have
the test results show this and document what the failures are so users
are left
with a clear picture of the state of SMP support. For example with the
Zynq A9
qemu example none of the block (libbd) tests should pass as libbd uses
disable
pre-emption and this should generate an error if used.

The CONFIGURE_SMP_APPLICATION needs to go however there is no default
core
count defined by the BSP that can be used to allow this. IMO if the
application
does not define the core count it needs, ie less than the max
supported by the
hardware, the max possible should be used.


I didn't invent this CONFIGURE_SMP_APPLICATION.

I know. All I am trying to do is bring out into the open all the issueswe have with SMP in RTEMS so the community knows the state and so we canlook at the list of things that needs doing.

At the moment it is
useful since we lack important features like thread deletion.


If the code is broken and tests fails we should have them fail.

With the
introduction of clustered/partitioned scheduling the configuration will
completely change. So I would like to postpone this discussion a couple
of weeks.


I cannot comment as I have no idea what this means. I did not even know

"clustered/partitioned scheduling" was a requirement. We still cannotrun a single real application under SMP (see the thread deletion commentabove).

 Since this table is in the BSS
section, all states start with PER_CPU_STATE_INITIAL.


Who is clearing the BSS and how is this managed ? There seems to be some
implicit requirements here. The requirement being "RTEMS requires all
processors to run in an SMP environment must be started externally to
RTEMS".
Is this requirement what we really want ?


This is not the case.  The low-level start-up is highly BSP specific.
The leon3 BSP for example starts secondary processors on its own.  The
QorIQ relies on U-Boot.  Our Altera Cyclone-V (ARM Cortex-A9 MPCore
based) will also start secondary processors on its own.

Ok so we have a mix. I did not know this and when I raised this lastyear with you I was told it needs to be performed by the boot monitor.As a result I went down this path on the Zynq.


On the ARM (well the zynq I have used) starting processors is easy and
the code
needed is tiny and could be made available in the BSP shared area,
while I
understand on the PPC it is complex. I suppose the question is adding
this
support to RTEMS what we want to have and worth it verses adding this
code
gives us complete control over initialisation and makes the boot
monitor simpler.

Given I currently only use the ARM which is easy and I will not ship
uboot due
to licensing reason I favour adding support to RTEMS. :)


The boot processor calls boot_card() and all other processors do what
they want, but they must wait until the boot processor gives the go
(explained later).

The boot processor calls eventually:

   /**
    * @brief Performs CPU specific SMP initialization in the context of
the boot
    * processor.
    *
    * This function is invoked on the boot processor by RTEMS during
    * initialization.  All interrupt stacks are allocated at this point
in case
    * the CPU port allocates the interrupt stacks.
    *
    * The CPU port should start secondary processors now.


What does this mean ? I am not sure if you mean the boot monitor does
this or
this is handled in RTEMS in the BSP or related CPU code.

    *
    * @param[in] configured_cpu_count The count of processors requested
by the
    * application configuration.
    *
    * @return The count of processors available for the application in
the system.
    * This value is less than or equal to the configured count of
processors.
    */
   uint32_t _CPU_SMP_Initialize( uint32_t configured_cpu_count );

So here has the CPU port (or BSP in most cases) the chance to reduce the
configured count of processors to the actually available (AP).  We have
AP <= NP.  In case this function returns "you have three processors",
then these three processors MUST start properly or terminate the
system.  If some processors are not always available or otherwise
unreliable this must be dealt with in _CPU_SMP_Initialize().


This is confusing me. The application controls the max number of cores
(NP) a
system can have and the BSP (or CPU port which knows the max that can
exist)
can configure that down to the actual number. Does this mean I could
define 32
cores at the application and Zynq BSP would say, sorry only 2 here so
this is
what you get ?


Yes.  Its done in some tests, e.g.

http://git.rtems.org/rtems/tree/testsuites/smptests/smplock01/init.c#n28

Does this waste RAM ?

Yes, but its not much, e.g. the _Per_CPU_Information table is a bit
bigger than necessary.


Lets agree to just stick to yes.


What about an application defining 2 on a 4 core system ?


It gets two.

Ok.


System termination can happen at anytime on any processor.  So you can
change from every state into PER_CPU_STATE_SHUTDOWN.  This ability is
the core of this change set.  In the previous implementation this was
not guaranteed.


Did the state diagram show this ? I am sorry if I missed this.


Yes, you have arrows from all other states to PER_CPU_STATE_SHUTDOWN.

Now lets look at the normal start-up (no shutdown).  The next state
after PER_CPU_STATE_INITIAL is
PER_CPU_STATE_READY_TO_START_MULTITASKING.

   /**
    * @brief Processor is ready to start multitasking.
    *
    * The secondary processor performed its basic initialization and is
ready to
    * receive inter-processor interrupts.  Interrupt delivery must be
disabled
    * in this state, but requested inter-processor interrupts must be
recorded
    * and must be delivered once the secondary processor enables
interrupts for
    * the first time.  The boot processor will wait for all secondary
processors
    * to change into this state.  In case a secondary processor does not
reach
    * this state the system will not start.  The secondary processors
wait now
    * for a change into the PER_CPU_STATE_REQUEST_START_MULTITASKING
state set
    * by the boot processor once all secondary processors reached the
    * PER_CPU_STATE_READY_TO_START_MULTITASKING state.
    */
   PER_CPU_STATE_READY_TO_START_MULTITASKING,

The key point for a per-CPU state and not a global state is that every
processor must perform some initialization steps to set up the "I can
receive inter-processor interrupts (IPI)" state.  Before IPIs are
possible the only why are spin variables (e.g. this per-CPU state
variables).

This PER_CPU_STATE_READY_TO_START_MULTITASKING is a synchronization
barrier.

The boot processor will then set the state to
PER_CPU_STATE_REQUEST_START_MULTITASKING on all processors configured
and available (AP).  Once the secondary processor observes this state
change (it spins on its state variable), it will go into the
PER_CPU_STATE_UP state and perform a context switch to the first thread.


If this is about getting to a suitable IPI state on each core then why
not have
this in the states and functions rather then the global state type
names of
"start multitasking" etc.


I don't think we should be too specific about the IPI here.  The basic
statement is that a processor is ready to start multitasking.


I see the issue of needing a clear path from no IPI to having IPI
however I am
still wondering why the need to have the synch point and all cores to
be there
before moving on. A new core in the system should be able to see the
system
state and IPIs are active and then enable its support then add itself
to the
system.


Then you have to check each time which method you need to send a message
to another processor.  I don't see a benefit here.


It is just a check when booting and not used once using IPI.


If there is a global state variable with suitable spinning locks and
atomics
that direct the per cpu paths taken I fail to see why we need this sync
barrier. The first core to a given global state makes the transition
and the
other cores need to wait or move to a different state. This assumes
the BSS
init issue is resolved.


I fail to see why this barrier is a problem.

Because as I stated before this is a system design constraint and notsomething RTEMS should enforce. It could be optional but not mandatoryand being optional then using it comes with some risks.

The processors in an SMP
system MUST work reliable if now or later it doesn't matter.

When I used the existing system and the boot monitor did not start aprocessor the system locked up. There was no recovery with the currentdesign and I consider this a flaw. A watchdog just looped.

Do you
really want to work an a system that works under this condition:
"Sometimes scheduling actions take place, but you cannot assume that a
thread will execute as planned.".

Again a system issue. It is up to system designers on a specific projectto determine the fail states like this and not operating system kerneldesigners. Allowing a degraded state provides opportunity for recover orfault reporting. It is not for everyone or every system.


I do not see why we have main and secondary processors ?


I changed "main processor" into "boot processor" to highlight that
this
is only the case during system boot.


Is there any code checking the processor number ? I see it in the

This is symmetric
multiprocessing which means each core is the same therefore
capable of
completing any required task. I understand there are paths which
need to
complete so if we have states for these phases as gates then any
processor that
arrives should be able to enter the gate (spinning lock for those
that
need to
wait) and complete the work. This means a degraded state can exist
and
things
at least start. The application would need to detect and manage the
degraded
state and so RTEMS should not be concerned with this condition other
than doing
its best to run where possible.


In case the boot procedure of the system is unstable then it makes no
sense to run an application.


This depends on the application and its system requirements and how the
hardware is constructed. A boot monitor can see cores are not
present and
decide not to start RTEMS or it can decide to start a single core.
This relates
to the system's requirement and can vary and has little to do with
RTEMS. If a
boot monitor passes control to RTEMS then it should run and only fail
if it's
integrity it not correct. RTEMS does should not move into the area of
system
requirements or constraints. Adding constraints like this in the
operating
system does not seem a good idea to me.


I see no problem here.  See _CPU_SMP_Initialize() above.


I do not understand. Are you saying the design will run with a core
count that
does not match NP or AP ?


It is min { NP, AP }.

IMO an application core count higher than the available cores is aconfiguration error.


I assume if we have n cores where n is 1..cpus available we enter the
static
constructors once and 'main' [1] once and this independent of the
number of
defined and/or operating cores. If an application's static
constructor
starts
further threads it needs to manage the concurrency issues and main is
only
entered once.


This change is about the low-level boot procedure.  High level
concepts
like threads are not an issue here.


They are if you remove the all cores needing to sync plus using cpu
numbers to
control which cores go through to the static constructors and main and
which
cores go direct to the score thread code.


If you remove this synchronization barrier via
PER_CPU_STATE_READY_TO_START_MULTITASKING, then you mandate that there
is boot loader that does these steps.  With the current set-up this boot
loader is optional.


Or the boot loading is not involved.

It is this sort of code that concerns
me ...

http://git.rtems.org/rtems/tree/c/src/lib/libbsp/arm/shared/include/arm-a9mpcore-start.h#n88


Yes, this is a dirty hack.  It works at the moment due to our immature
SMP state, but this must definitely change.



http://git.rtems.org/rtems/tree/c/src/lib/libbsp/sparc/shared/start/start.S#n226


Yes, exactly the same issue.  This processor index 0 == boot processor
is a bad assumption.


I feel the single major issue here is a lack of defined requirement
for SMP.
The RTEMS Project and its community need to define the top level
requirements
and along with that tests that check those requirements. For me reviewing
patches it is not clear what is a hack or temporary and what is long
term and
here for good. Questioning each patch or change to find this out is time
consuming and not very productive.

[...]

Yes, we have the wiki page for this:

http://www.rtems.org/wiki/index.php?title=SMP


Yes and the section is empty.

Chris
_______________________________________________
rtems-devel mailing list
[email protected]
http://www.rtems.org/mailman/listinfo/rtems-devel

Re: [PATCH 2/4] score: SMP initialization and shutdown changes

Reply via email to