On Wed, 2004-07-28 at 16:13, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > 
> > So, basically, in your views, any attempt to virtualize the IRQ handling
> > is doomed for real-time? How does RTLinux, RTAI over RTHAL and a bunch
> > of other Windows-based stuff work, then? ;o)
> > 
> 
> For obvious reasons not for the non-real-time part, but the real-time 
> part should have the best, i.e. fastest, irq management which is 
> available on a specific platform.
> 

As Gilles has just proposed, we could for instance make the
stalling/unstalling stuff adaptive, so that it relies on hw masking
directly instead of virtualization when dealing with the highest
priority domain -- a first quick-and-dirty check did not come out as a
success (no performance improvement), but maybe some more effort is
needed to see a real difference.
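To make the idea concrete, here is a minimal userspace sketch of such an adaptive scheme; all names are invented for illustration and are not the actual Adeos/RTAI API. Stalling maps to hardware masking for the head (highest-priority) domain, and to the virtual stall bit for everyone else:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of an adaptive stall/unstall scheme. */

static bool hw_irqs_masked;   /* stands in for the CPU interrupt flag */

struct domain {
    bool is_head;             /* highest-priority domain?             */
    bool stalled;             /* virtual stall bit                    */
};

static void domain_stall(struct domain *d)
{
    if (d->is_head)
        hw_irqs_masked = true;    /* head domain: mask in hardware    */
    else
        d->stalled = true;        /* others: flip the virtual bit     */
}

static void domain_unstall(struct domain *d)
{
    if (d->is_head)
        hw_irqs_masked = false;
    else
        d->stalled = false;       /* a real pipeline would also replay
                                   * interrupts logged while stalled  */
}
```

The point of the adaptation is that the head domain pays for a real cli/sti instead of the virtual bookkeeping, which may or may not be cheaper depending on the arch.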

I'd also like to check if the I-cache saving strategy with inline
stall/unstall code is actually better than the indirect hooks for
manipulating the domain stall bit.
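As a toy model of the two alternatives (names again hypothetical): the inline variant expands the bit operation at every call site, while the indirect variant goes through a per-domain hook pointer, trading an extra indirect call for a smaller I-cache footprint:

```c
#include <assert.h>

/* Illustrative comparison of inlined vs. hook-based stall-bit updates.
 * Not the actual Adeos interface. */

#define STALL_BIT 0x1UL

static unsigned long domain_status;

/* Inlined variant: duplicated at every call site. */
static inline void stall_inline(void)
{
    domain_status |= STALL_BIT;
}

/* Indirect variant: one out-of-line helper reached via a hook. */
static void stall_helper(void)
{
    domain_status |= STALL_BIT;
}

struct domain_hooks {
    void (*stall)(void);
};

static struct domain_hooks hooks = { .stall = stall_helper };
```

Both end up setting the same bit; the question is purely which one behaves better cache- and branch-wise on a given CPU.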

As you can see, there are still things to experiment with, and no one is
pretending that the situation must be frozen. Things must be done in a
somewhat orderly manner, that's all.

> > The fact is that Stodolsky's proposal can be used the same way for
> > different purposes, that's all. If this optimizes the average case
> > without wrecking the worst case one, but additionally allows to defer
> > the interrupts for whatever purpose, that's fine. Indeed.
> > 
> 
> Agree - if your conditions are fulfilled in practice ;)
> 
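For the record, the deferral scheme we are discussing can be sketched as a userspace simulation (invented names, not kernel code): a critical section only sets a software flag; interrupts arriving meanwhile are logged and replayed when the flag is cleared, which is exactly where the "critical section length + deferred irq calls" worst case comes from:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS 16

static bool soft_disabled;          /* software interrupt-disable flag */
static bool irq_pending[NR_IRQS];   /* log of deferred interrupts      */
static int  handled[NR_IRQS];       /* per-IRQ handler run count       */

static void do_irq(int irq)
{
    handled[irq]++;                 /* stands in for the real handler  */
}

/* Entry point for an incoming interrupt. */
static void irq_arrival(int irq)
{
    if (soft_disabled)
        irq_pending[irq] = true;    /* defer: just log it              */
    else
        do_irq(irq);                /* deliver immediately             */
}

static void soft_cli(void)
{
    soft_disabled = true;           /* cheap: no hardware access       */
}

static void soft_sti(void)
{
    soft_disabled = false;
    for (int irq = 0; irq < NR_IRQS; irq++)   /* replay deferred IRQs  */
        if (irq_pending[irq]) {
            irq_pending[irq] = false;
            do_irq(irq);
        }
}
```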
> > Adeos is virtualizing the IRQ flow too, what's new with the Adeos model
> > is to use this feature to prioritize the incoming events among any
> > number of domains according to a pipeline abstraction, and not just to a
> > single most prioritary domain. So the additional cost compared to the
> > old-fashioned way is basically defined by the cost of transitioning
> > between multiple domains.
> > 
> 
> I'm aware that the irq virtualisation for high-priority (real-time) 
> domains comes from the basic adeos concept. But even if it is nice in 
> the model, we should also consider its effects in reality.
> 

The first effect is that you can work on ia64, x86, ppc, arm w/ and w/o
MMU using the very same model, with a minimal amount of arch-dependent
stuff. Another effect is that you can process IRQs and other system
events the same way, which brings a higher level of virtualization, e.g.
for intercepting syscalls and traps in a prioritized manner. Not to
mention the ability to write documentation (which already exists)
describing the behaviours/features of a stable interface that remains
the same across all archs and, as much as possible, over time.
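The pipeline abstraction can be sketched as a toy model (all names invented, far simpler than the real thing): an incoming event enters at the highest-priority domain and flows down the priority-ordered list until some domain consumes it:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of an event pipeline over prioritized domains.
 * Illustrative only; not the actual Adeos interface. */

struct domain {
    const char *name;
    int (*handle_event)(int event);  /* nonzero = event consumed       */
    struct domain *next;             /* next (lower-priority) domain   */
};

/* Walk the pipeline from the highest-priority domain downward,
 * offering the event to each stage until one accepts it. */
static const char *propagate_event(struct domain *head, int event)
{
    for (struct domain *d = head; d != NULL; d = d->next)
        if (d->handle_event(event))
            return d->name;          /* accepted at this stage         */
    return NULL;                     /* fell off the end of the pipe   */
}

static int rt_handler(int event)   { return event == 7; } /* rt IRQ only */
static int root_handler(int event) { return 1; }          /* takes all   */
```

Syscalls and traps ride the same propagation path as IRQs in this model, which is what makes the uniform, prioritized interception possible.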

> > 
> >>Ok, this approach is as deterministic as the classic cli/sti: The worst 
> >>case scenario is now critical section length + deferred irq call(s???). 
> >>But this variant is also in no way MORE deterministic than the classic one.
> >>
> > 
> > 
> > Who--said--that??? The purpose of Adeos has never been, is not and will
> > never be to pretend working faster than the hardware does! :o))
> > 
> 
> Is this a point where the adeos concept has a higher priority for you 
> than the performance? Please don't take it as a criticism, I'm only 
> trying to understand the motivation and goal.
> 

Don't worry, my skin is not that thin. The only priority I have wrt
anything I write is to make it work first, and to make it fast next. So
please consider the facts over a longer period of time.

> > Come on... what's important is that it brings a common low level
> > architecture for supporting event prioritization, that improves
> > portability, provides a uniform interface and _behaviour_ among
> > different archs, and provides performances that are comparable to the
> > ones of the old-fashioned stuff, where you are basically immediately fed
> > by the interrupt vector.
> > 
> > Everything has a cost: if it's acceptable performance-wise like you seem
> > to find it out by yourself with your test on a P1, then you will likely
> > accept this cost to get back a much larger benefit.
> > Keep in mind that you could not have Marc's stuff work on Xenomai
> > without Adeos; the 50us more you pay now should be reduced to something
> > around 20us compared to LXRT by a careful investigation and proper
> > optimization; but even if you would have to live with 20us more
> > _bounded_ latency, I don't think this would prevent you from having a
> > properly working application, unless your constraints are so tight that
> > this figure would not fit. But in the latter case, x86, and its
> > terminally ill architecture wrt to very high determinism, is definitely
> > not the arch you would have chosen in the first place, I guess. 
> > 
> 
> I can, indeed, live with 20 us more latency (but not with up to 100 us 
> as we measured on some other box). The point is also not those 
> additional 10 or maybe 20 us which are probably caused by the irq 
> virtualisation for the real-time core. It's more that I'm afraid that, 
> if we are too slack in this regard, we may have problems reducing the 
> total jitter to a more acceptable level (when thinking about replacing 
> RTAI with xenomai+skin in the future).
> 

Xenomai has been part of RTAI since the merger of both projects last
year. As far as the "replacement" is concerned, the only one who can
decide on it is Paolo in any case, so you should not be afraid of
anything regarding this, unless Paolo decides to promote overnight a
total wreckage as the core RTAI technology. Looking at how Adeos was
introduced back in 24.1.11, and kept until 3.1 for x86, I think that the
message is already quite clear. Looking at how fusion already behaves,
it looks like we already have a bit more consistency than a wreckage.

Concerning the potential ability of the fusion core to usefully replace
the current one once the bare latency figures are comparable (we are
talking about 50 us, not multi-ms jitters here), I have no doubt about
it. But you will need to wait for the second stage of the fusion
development to end in order to prove me either right or wrong on that.

> > 
> >>I just did a quick experiment on vesuvio with Marc's skin on a PI 
> >>266-MMX (text-only, no disk access, only ping -f and some user mode 
> >>load): replacing rtai_local_irq_save/rtai_local_irq_restore with 
> >>rtai_hw_lock/rtai_hw_unlock improved the situation a bit. The maximum 
> >>jitter decreased from about 85 to 75 us. So, it seems that this 
> >>mechanism has an effect, but it is also not the dominating one. As you 
> >>said, the cache locality of code and data is likely a bit worse with 
> >>xenomai+skin compared to lxrt. I hope this is not too much due to the 
> >>layer concept.
> >>
> > 
> > 
> > A layered approach is not bad because of the layering per se, but
> > because the abstraction levels are not properly defined. So the real
> > question is: does Xenomai have those right? I intuitively think so, but
> > the only thing that can settle the matter here is bringing facts, and I
> > intend to do so when I can pour more time into fusion, i.e. when I am
> > no longer able to help more on vesuvio.
> > 
> 
> I know your time is limited and I appreciate your work very much. It's 
> just that I wanted to hear: Yes we can reduce these numbers to a 
> well-known level - also on low-end machines. I need a perspective so 
> that I can sleep easily :)

> Jan
> 
> 
> _______________________________________________
> Rtai-dev mailing list
> [EMAIL PROTECTED]
> https://mail.gna.org/listinfo/rtai-dev
-- 

Philippe.

