Re: [Xenomai-core] [PATCH-STACK] Missing bits for -rc1
On 03/07/07, Jan Kiszka [EMAIL PROTECTED] wrote:

Modified version just went online, see
http://www.rts.uni-hannover.de/rtaddon/patches/xenomai/fix-intr-locking-part-ii-v3.patch
Should be ok now, I think. Please add your copyright notice to intr.c (I think Philippe has no objections here).

Jan

--
Best regards,
Dmitry Adamushko

___
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [PATCH-STACK] Missing bits for -rc1
On 02/07/07, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 01/07/07, Jan Kiszka [EMAIL PROTECTED] wrote:

Hi all, I've just uploaded my rebased patch stack, including specifically the last bits that should go in before -rc1.

A few comments:

(1) in xnintr_irq_handler()

+	xnlock_get(&xnirqs[irq].lock);
+
+#ifdef CONFIG_SMP
+	/* In SMP case, we have to reload the cookie under the per-IRQ lock
+	   to avoid racing with xnintr_detach. */
+	intr = rthal_irq_cookie(&rthal_domain, irq);
+	if (unlikely(!intr)) {
+		s = 0;
+		goto unlock_and_exit;
+	}
+#else
+	intr = cookie;
+#endif

I think it would be better to check 'intr' for NULL at this point so as to also cover 'cookie == NULL' (who knows... and it's better not to rely on the lower layer here), I guess.

(2) use cases of xnintr_sync_stat_references() in xnintr_irq_detach()

+	xnlock_get(&xnirqs[irq].lock);
+
+	err = xnarch_release_irq(intr->irq);
+	xnintr_sync_stat_references(intr);
+	xnlock_put(&xnirqs[irq].lock);

Why do you call xnintr_sync_stat_references() inside the locked section here? Does xnintr_sync_stat_references() really need to be protected by 'intr->lock' (it has already been detached)?

Jan

--
Best regards,
Dmitry Adamushko
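[Editorial aside: the locking pattern debated in this message -- reloading the cookie under the per-IRQ lock so a racing xnintr_detach() cannot leave the handler with a stale pointer, with a NULL check that also covers a never-set cookie -- can be modeled outside the nucleus. The sketch below is a simplified userspace analogue, not Xenomai code: the slot structure, the pthread mutex standing in for the per-IRQ xnlock, and all function names are invented for illustration.]

```c
#include <stddef.h>
#include <pthread.h>

/* Hypothetical per-IRQ slot: in the nucleus this would be xnirqs[irq]
 * with an xnlock_t; here a pthread mutex stands in for the lock. */
struct irq_slot {
	pthread_mutex_t lock;
	void *cookie;	/* set by attach, cleared by detach */
};

void fake_attach(struct irq_slot *s, void *cookie)
{
	pthread_mutex_lock(&s->lock);
	s->cookie = cookie;
	pthread_mutex_unlock(&s->lock);
}

void fake_detach(struct irq_slot *s)
{
	pthread_mutex_lock(&s->lock);
	s->cookie = NULL;	/* a concurrent handler cannot observe this half-done */
	pthread_mutex_unlock(&s->lock);
}

/* Returns 1 if the (elided) ISR would have run, 0 if the IRQ was found
 * detached.  The cookie is reloaded under the lock, and the NULL check
 * covers both the post-detach case and a never-attached line, as the
 * review suggests. */
int fake_irq_handler(struct irq_slot *s)
{
	void *intr;

	pthread_mutex_lock(&s->lock);
	intr = s->cookie;
	if (intr == NULL) {
		pthread_mutex_unlock(&s->lock);
		return 0;
	}
	/* ... intr->isr(intr) would be called here ... */
	pthread_mutex_unlock(&s->lock);
	return 1;
}
```

The sketch only shows the ordering guarantee, not real concurrency: because detach clears the cookie under the same lock the handler takes, the handler either sees the old cookie and runs to completion, or sees NULL and bails out.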
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
Hello Jan,

I apologize for the huge reply latency.

Yeah, that might explain why already trying to parse it manually failed: What is xnintr_sync_stat_references? :)

Yeah... it was supposed to be xnintr_sync_stat_refs().

The 'prev = xnstat_get_current()' reference is also tracked, as reference accounting becomes a part of the xnstat interface (not sure we do need it though).

Mind to elaborate on _why_ you think we need this, specifically if it adds new atomic counters?

Forget about it, it was a wrong approach. We do reschedule in xnintr_*_handler(), and if 'prev->refs' is non-zero and a newly scheduled thread calls xnstat_runtime_synch() (well, however that could happen in theory with this interface) before deleting the first thread... oops. So this 'referencing' scheme is bad anyway.

Note that if a real reschedule took place in xnpod_schedule(), we actually don't need to _restore_ 'prev' when we get control back... it must already be restored by xnpod_schedule() when the preempted thread ('prev' is normally the thread in whose context the interrupt occurs) gets the CPU back, if I'm not missing something. Hum?

...
if (--sched->inesting == 0 && xnsched_resched_p())
	xnpod_schedule();

(*) 'sched->current_account' should already be == 'prev' in case xnpod_schedule() took place

xnltt_log_event(xeno_ev_iexit, irq);
xnstat_runtime_switch(sched, prev);
...

The simpler scheme with xnstat_ accounting would be if we accounted only the time spent in intr->isr() to the corresponding intr->stat[cpu].account... This way, all accesses to the latter would be inside xnlock_{get,put}(&xnirqs[irq].lock) sections [*]. It's preciseness (although that's arguable to some extent) vs. simplicity (e.g. no need for any xnintr_sync_stat_references()). I would still prefer this approach :-)

Otherwise, so far I don't see any much nicer solution than the one illustrated by your first patch.

Uhh, be careful, I burned my fingers with similar things recently as well.
You have to make sure that all types are resolvable for _all_ includers of that header. Otherwise, I'm fine with cleanups like this. But I think there was once a reason for the #define.

Yeah... now I recall it as well :-)

Thanks, Jan

--
Best regards,
Dmitry Adamushko
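[Editorial aside: the "simpler scheme" argued for above -- charge only the interval spent inside intr->isr() to the per-IRQ statistics, so that every access to the stat slot stays inside the locked section and no reference synchronization is needed -- can be sketched as follows. This is a made-up standalone model: the fake clock, the stat structure, and all names are illustrative, not the xnstat API.]

```c
#include <stdint.h>

/* Fake monotonic clock so the sketch is deterministic; in the nucleus
 * this would be a tsc read. */
static uint64_t fake_tsc;
static uint64_t read_tsc(void) { return fake_tsc; }

/* Stand-in for intr->stat[cpu].account. */
struct irq_stat {
	uint64_t total;	/* cumulative time spent inside the ISR body */
	uint64_t hits;	/* number of invocations accounted */
};

/* The "simple model": only the ISR body interval is charged to the
 * per-IRQ account; any prologue/epilogue stays with the preempted
 * thread.  In the scheme discussed above this whole function would sit
 * inside the per-IRQ locked section, so no dangling stat references
 * can survive a detach. */
void account_isr(struct irq_stat *stat,
		 int (*isr)(void *), void *cookie, int *ret)
{
	uint64_t start = read_tsc();

	*ret = isr(cookie);
	stat->total += read_tsc() - start;
	stat->hits++;
}

/* Sample ISR for demonstration: pretends to run for 5 tsc ticks. */
static int demo_isr(void *cookie)
{
	(void)cookie;
	fake_tsc += 5;
	return 1;
}
```

The trade-off named in the thread is visible here: the number is easy to interpret (pure ISR load), at the price of not attributing dispatch overhead to the IRQ.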
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
On 22/06/07, Jan Kiszka [EMAIL PROTECTED] wrote: [ ... ] Only compile-tested under various .configs. Any comment welcome.

@@ -76,7 +102,7 @@ static inline void xnintr_stat_counter_d
 static void xnintr_irq_handler(unsigned irq, void *cookie)
 {
 	xnsched_t *sched = xnpod_current_sched();
-	xnintr_t *intr = (xnintr_t *)cookie;
+	xnintr_t *intr;
 	xnstat_runtime_t *prev;
 	xnticks_t start;
 	int s;
@@ -86,6 +112,16 @@ static void xnintr_irq_handler(unsigned
 	xnltt_log_event(xeno_ev_ienter, irq);

 	++sched->inesting;
+
+	xnlock_get(&xnirqs[irq].lock);
+
+#ifdef CONFIG_SMP
+	/* In SMP case, we have to reload the cookie under the per-IRQ lock
+	   to avoid racing with xnintr_detach. */
+	intr = rthal_irq_cookie(&rthal_domain, irq);
+#else
+	intr = cookie;
+#endif

 	s = intr->isr(intr);

I guess 'intr' can be NULL here.

Could you please send me, attached (non-inlined), a combo patch on top of the trunk version (as this one seems to be on top of your previous one)? I'll try to come up with some solution during this weekend.

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
On 20/06/07, Jan Kiszka [EMAIL PROTECTED] wrote: [ ... ]

xnintr_attach/detach())... your approach does increase the worst-case length of the lock(intrlock) -- unlock(intrlock) section... but that's unlikely to be critical. I think the patch I sent before addresses the earlier reported case with xnintr_edge_shirq_handler()... but your approach does make the code shorter (and likely simpler), right? I don't see any problems it would possibly cause (maybe I'm missing something yet :)

I must confess I didn't get your idea immediately. Later on (cough, after hacking my own patch, cough) I had a closer look, and at first it appeared fairly nice, despite the additional ifs. But then I realised that end == old_end may produce false positives in case we have the same (and only the same) IRQ several times in a row. So I'm afraid we have to live with only one candidate. :-

It's not clear, can you elaborate your (non-working) scenario in more detail? :-)

Note: I sent the second patch with the following correction:

 ...
 } else if (code == XN_ISR_NONE && end == NULL) {
 	end = intr;
+	old_end = NULL;
 }
 ...

I don't see why this idea can't work (even if I made yet another error). Under some circumstances the following code will be triggered to end the loop even when 'end' is still valid:

	if (end && old_end == end)
		intr = NULL;

_but_ that would be exactly the case when intr == end, and hence the loop would be finished anyway by the while (intr && intr != end) condition. So sometimes it acts as yet _another_ check to exit the loop; IOW, while (intr && intr != end) may be converted to just while (intr).

Jan

--
Best regards,
Dmitry Adamushko
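[Editorial aside: the loop whose termination is being debated above is the edge-triggered shared-IRQ handler, which keeps cycling through the sharers until a full pass produces no XN_ISR_HANDLED. The following is a simplified standalone model of that baseline loop (without the old_end refinement proposed in the message): the types, the safety bound, and the wrap-to-head linkage are made up for illustration, not the nucleus code.]

```c
#include <stddef.h>

#define XN_ISR_NONE		0
#define XN_ISR_HANDLED		1
#define MAX_EDGEIRQ_LOOPS	128	/* safety bound against livelock */

struct fake_intr {
	int (*isr)(struct fake_intr *);
	struct fake_intr *next;		/* NULL-terminated chain of sharers */
};

/* 'end' marks the first ISR that reported NONE since the last time
 * anyone handled the IRQ; reaching it again means one full fruitless
 * pass, so the loop stops.  Any HANDLED resets 'end', giving every
 * sharer another look (an edge may have arrived meanwhile).
 * Returns the number of ISR invocations, or -1 if the bound tripped. */
int edge_shirq_loop(struct fake_intr *head)
{
	struct fake_intr *intr = head, *end = NULL;
	int calls = 0;

	while (intr != end) {
		int code = intr->isr(intr);

		calls++;
		if (code == XN_ISR_HANDLED)
			end = NULL;		/* progress: restart the fruitless-pass watch */
		else if (end == NULL)
			end = intr;		/* first miss since last progress */
		if (calls > MAX_EDGEIRQ_LOOPS)
			return -1;
		intr = intr->next ? intr->next : head;	/* wrap around */
	}
	return calls;
}

/* Demo ISRs: A claims the IRQ a limited number of times, B never does. */
static int a_budget;
static int isr_a(struct fake_intr *i)
{
	(void)i;
	return a_budget-- > 0 ? XN_ISR_HANDLED : XN_ISR_NONE;
}
static int isr_b(struct fake_intr *i)
{
	(void)i;
	return XN_ISR_NONE;
}
```

With two sharers and A handling once, the trace is: A handles (end reset), B misses (end = B), A misses, loop stops on reaching B again -- three calls in total.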
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
On 21/06/07, Jan Kiszka [EMAIL PROTECTED] wrote: [ ... ]

Unfortunately, that whole thing made me meditate about the whole issue again. And now I wonder why we make such a fuss about locking for shared IRQs (where it should be correct with any of the new patches) while we do not care about the non-shared, standard scenario.

Surprise-surprise... sure, one way or another it must be fixed. Heh... too many bugs (and I don't even want to say who's responsible) :-/

Sigh, the never-ending IRQ story... should be reviewed.

Jan

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
On 21/06/07, Jan Kiszka [EMAIL PROTECTED] wrote: [ .. ]

Surprise-surprise... sure, one way or another it must be fixed. Heh... too many bugs (and I don't even want to say who's responsible) :-/

I wouldn't accept all the responsibility if I were you.

I have no problems in this respect. I was just being a bit sarcastic with the way of saying it's my fault.

It's a sign that the design might be too complex now.

Frankly speaking, I don't think it's really complex :)

Things get worse, at least with XENO_OPT_STATS: due to the runtime statistics collection, we may end up with dangling references to our xnintr object even after xnintr_shirq_unlock(). We would actually need some kind of IRQ_INPROGRESS + synchronize_irq() at i-pipe level. But we have the problem, in contrast to the kernel, that we reschedule from inside the handler (more precisely: at nucleus level), thus synchronize_irq() would not just wait on some simple handler to exit...

Yeah... we already had conversations on this topic (I think with Philippe) and, if I recall right, that was one of the reasons. That's why synchronize_irq() is in the nucleus layer.

Jan

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [RFC][PATCH] shirq locking rework
Hello Jan,

Well, I hate nested locks when it comes to real-time, but in this case I really found no simpler approach. There is the risk of deadlocks via IRQ: xnintr_shirq::lock -> handler -> nklock vs. Application: nklock -> xnintr_attach/detach -> xnintr_shirq::lock.

It's also relevant for the current code: xnintr_attach/detach() must not be called while holding the nklock.

I think your approach should work as well... provided we have only a single-reader vs. single-writer contention case, which seems to be the case here ('intrlock' is responsible for synchronization between xnintr_attach/detach())... Your approach does increase the worst-case length of the lock(intrlock) -- unlock(intrlock) section... but that's unlikely to be critical. I think the patch I sent before addresses the earlier reported case with xnintr_edge_shirq_handler()... but your approach does make the code shorter (and likely simpler), right? I don't see any problems it would possibly cause (maybe I'm missing something yet :)

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] Xenomai support for ARM926EJ
On 20/03/07, Gilles Chanteperdrix [EMAIL PROTECTED] wrote: Muruganandam Ganapathy wrote:

The board is based on a Fujitsu SOC with the ARM926EJ processor core. The board has SPI, I2C and 10/100 Ethernet interfaces, and it can support 16/32MB SDRAM and 4/8MB flash memory.

You still do not tell us the name of the board, but it is probably not supported.

In addition, I would like to know the interrupt response measured with Xenomai on ARM9 processor based platforms, if any. The interrupt response expected is around 40-50 microseconds in our case. By interrupt response I mean the time between the generation of the interrupt and the actual ISR invocation. Will the use of Xenomai enable us to meet this timing requirement?

The latencies I have observed so far on ARM are usually larger than 150 microseconds, but these are userspace dispatch latencies. So you could improve the situation if you stay in interrupt handlers. Another way to improve the situation a bit more is to use uClinux, if your platform is supported. Still, IMHO, 40-50 microseconds is too ambitious.

Gilles, as I understood it, the question was about the interrupt latency, not the scheduling one. I guess the virtually tagged cache is an additional component of the high scheduling latencies on ARM. The interrupt latency should be ok though.

--
Gilles Chanteperdrix.

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] Doubts about the Xenomai list
Hi,

I think the moderators may comment on your question. I just take the opportunity to make a minor comment on the papers' content (a part of it).

I briefly read only the part of your thesis regarding Xenomai and RTAI (page 4). Frankly speaking (and based on my humble opinion/knowledge of Xenomai), the information is badly presented, both historically- and technically-wise, at the very least. Starting from the very first sentence: "Xenomai: Forked from RTAI".

Have you ever (if not publicly then privately) asked somebody from the list to read/review your papers? I guess no. But that would definitely have been a good idea.

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: [patch] memory barriers in intr.c :: xnintr_lock/unlock()
I see two paths now: A) If we go for atomic_inc/dec with such a locking service, xnarch_memory_barrier__after_atomic_inc & friends will be needed anyway and could already be introduced now.

Yes, this would be better. Any thoughts?

I have already sent you this message once, but I'm doing it once more. Just to give some more stuff to think about (although nobody seems to be interested in memory barriers :) and maybe, if I'm wrong, someone will point it out.

**

I just noticed that the code is probably still, at least very theoretically, not perfectly safe. Let's consider the following scenario:

	op1; lock(); op2; unlock(); op3;

From Documentation/memory-barriers.txt it follows that the only guarantee here is that op2 is executed inside the lock-unlock section (of course, that's what locks are for). But the funny thing is that none of the ops are ordered with respect to each other! IOW, e.g. the following sequences are possible:

	lock(); op1; op2; op3; unlock();

or

	lock(); op3; op2; op1; unlock();

and moreover, a pure lock/unlock (without irq disable/enable) doesn't even imply a compiler barrier on UP.

[ read starting from line 1150 in the above mentioned doc ]

And now apply all of the above to xnintr_detach() or even Linux's synchronize_irq(). IOW, spin_unlock() doesn't guarantee we have a mb between element deletion and checking the active flag :)

Ok, maybe it's just theory; e.g. lock and unlock on x86 seem to imply a full memory barrier (and probably all the other architectures do the same). Just look at the definitions of mb() and the spinlocks for x86: asm("lock; <some write op>") does the trick in both cases.

Jan

--
Best regards,
Dmitry Adamushko
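[Editorial aside: the concern above -- that spin_unlock() alone is only a release and does not order the list deletion against a later read of an "in progress" flag -- can be made concrete with a small sketch. This is not the kernel's synchronize_irq(); the structure, the flag, and the barrier macro are invented, with GCC's __sync_synchronize() standing in for smp_mb().]

```c
#include <stddef.h>

/* Stand-in for a full memory barrier (smp_mb() in the kernel). */
#define fake_mb()	__sync_synchronize()

struct fake_irqline {
	void *handler;		/* published/unpublished handler pointer */
	volatile int active;	/* nonzero while a CPU runs the handler */
};

/* Unpublish the handler, then wait for any in-flight invocation to
 * drain.  The explicit full barrier is the point of the thread: it
 * orders the store to 'handler' before the subsequent reads of
 * 'active'; a bare unlock between the two would not guarantee that
 * ordering on all architectures. */
void fake_synchronize_irq(struct fake_irqline *line)
{
	line->handler = NULL;	/* element deletion */
	fake_mb();		/* store must be visible before we poll */
	while (line->active)
		;		/* spin until no CPU is inside the handler */
}
```

The test below only exercises the single-threaded path; the barrier's effect is of course not observable without a second CPU, which is exactly why such bugs stay theoretical until they don't.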
Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics
On 18/10/06, Jan Kiszka [EMAIL PROTECTED] wrote:

(1) time spent in the ISR; (2) some prologue for the very first shared handler and epilogue for the last one.

(2) is the rare case that multiple IRQs happen at the same time on the same line. As I said, the case when only one occurs and the others are needlessly tested is more frequent.

Indeed, it looks like prologue and epilogue are always accounted to the ISR that reported XN_ISR_HANDLED. And if it's only one, then its position in the chain doesn't matter.

The simple model accounts only (1) to the ISR time statistics, so it's always easy to say what this number means. It just describes the load caused by the corresponding ISR handler.

[sorry, Dmitry, for this replay ;)]

[off-topic] I somehow read it as "sorry for this _reply_" and thought you had expressed below all you think about my complaints and myself... but ok, you are more politically correct :)))

I designed this instrumentation with a fairly old question of mine in the head: what does raising this IRQ, say, at 1 kHz cost the system? And this question includes not just the ISR itself, but also the disturbance due to potential rescheduling.

Ok.

Shared interrupts are indeed an exception, and if we were only discussing non-shared instrumentation here, I guess it wouldn't be that tricky to find a common model. If you look at that part, I'm basically shifting some instrumentation points after the rescheduling, that's it.

Yes, per-IRQ accounting (as opposed to per-ISR) wouldn't give grounds for such objections.

Well, if we conclude that the enhanced scheme provides better informativeness and everyone benefits from it, then let it be so. I don't have any more objections. [chorus: ufff, hallelujah!]

Jan

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [PATCH 0/3] Reworked nucleus statistics
...Dmitry, are there other issues I'm missing that you think would be better solved using a simpler accounting model?

So according to you, the more complex model gives users a better understanding of the behavior of their drivers/applications? ...

0 0 0 201357 0 2.8  IRQ1: handler1
0 0 0  58110   2.4  IRQ1: handler2

What is % in this case? It accounts: (1) time spent in the ISR; (2) some prologue for the very first shared handler and epilogue for the last one. So in fact part (1) can be the same for both handlers (say, just the same code), N can be == M (IRQ frequency), but the % numbers are different. Is it fair?

The simple model accounts only (1) to the ISR time statistics, so it's always easy to say what this number means. It just describes the load caused by the corresponding ISR handler.

Ok, if somebody wants to tell me that shared interrupts are an exceptional and rare case, then I get it. But anyway, as long as we want to provide per-ISR (and not per-IRQ) statistics, it adds some unclearness (unfairness) to how the reported values should be interpreted. I also expect that the prologue and epilogue are likely to cause much lighter disturbance when added to a preempted thread (because such threads normally should have a higher % load wrt an ISR anyway).

Regarding accounting only XN_ISR_HANDLED cases: at least, I suppose, ISR[n+1] doesn't get accounted the interval of time spent by ISR[n] to report XN_ISR_NONE? Which brings us to the next point, which I didn't get from the 3rd patch after looking at it briefly (well, I didn't apply it yet). And in general, it's getting a bit more difficult to see the interrupt handling details behind the statistic-handling ones, esp. in the case of shared interrupts. Of course, that's really a minor issue as long as users may get better statistics information :)

Ok, that was kind of a summary; I hope I didn't forget anything else, as we already had quite a long discussion with Jan.

All in all (yep, I like repeating myself :), IMHO the simple model at least clearly answers the question of what's behind the numbers, while the complex one makes the answer more complex, trying to be fair in one place and at the same time adding unfairness in another place.

--
Philippe.

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: [Xenomai-help] Bad EIP kernel-Oops
On 15/09/06, Philippe Gerum [EMAIL PROTECTED] wrote: On Fri, 2006-09-15 at 00:12 +0200, Dmitry Adamushko wrote: [...]

And it's not a good idea to have ipipe_catch_event() busy-spinning, as (as I understand) ipipe_dispatch_event() may take, in general, an unbounded amount of time.

Actually, this is not an issue, since there is no requirement for bounded execution time in ipipe_catch_event(). This call is part of the initialization chores; it is not meant to be deterministic.

I meant it's not ok to keep a CPU spinning all this time. But your approach with schedule_timeout() is really better. And probably we don't need to try to be more general than is really required, as we only need to get synched with ipipe_catch_event(..., NULL) from the Linux domain.

--
Philippe.

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: [Xenomai-help] Bad EIP kernel-Oops
On 13/09/06, Jan Kiszka [EMAIL PROTECTED] wrote:

It's the indirect call to the event handler.

 8b3:	8b 55 e4             	mov    0xffe4(%ebp),%edx
 8b6:	50                   	push   %eax
-8b7:	ff 94 93 80 22 00 00 	call   *0x2280(%ebx,%edx,4)
 8be:	83 c4 0c             	add    $0xc,%esp
 8c1:	85 c0                	test   %eax,%eax

In my case the kernel tries to access the address 0xd09bc5e5, which seems like it used to be a valid one.

I suppose you have >= 256 MB on this machine? If so, 0xd09bc5e5 belonged to the vmalloc()'ed area (in the past). So that was likely some module (e.g. the nucleus).

So this looks like we really need some mechanism to make sure all CPUs use the updated pointers after unhooking some event handler and before proceeding with further cleanups.

It's more complicated, I'm afraid. We need to be sure the event handler function itself will not disappear in the meantime. Hence, module unloading should be delayed; IOW something akin to synchronize_rcu() that blocks a cleanup caller (which calls unregister_domain()) until all the readers are done with their activities. Maybe it's wrong. Some more code reading would be required.

Jan

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: [Xenomai-help] Bad EIP kernel-Oops
The proper fix is to synchronize ipipe_catch_event(..., NULL) with the event dispatcher, so that any caller could legitimately assume that no subsequent call to the former handler will happen on any CPU after this service has returned. Since ipipe_unregister_domain() may already legitimately assume that all event handlers have been cleared using ipipe_catch_event() by the caller before proceeding, the issue would be solved.

That I understand. I just said "a cleanup caller", which amongst other things does ipipe_unregister_domain(). Although, yep, the latter one is not necessarily involved.

The difficult part is to refrain from masking the hw interrupts when running the event handlers, for the sake of keeping short latencies (nothing would prevent those handlers from re-enabling interrupts anyway). IOW, using a big stick like the critical inter-CPU lock is not the preferred option.

And it's not a good idea to have ipipe_catch_event() busy-spinning, as (as I understand) ipipe_dispatch_event() may take, in general, an unbounded amount of time.

Ok, some more weird thoughts off the top of my mind. I hope the idea is clear, notwithstanding my description, which may not be that clear, but hey... it's a late hour :)

ipipe_dispatch_event()
{
	...
	ipipe_lock_cpu(flags);

	start_domain = this_domain = ipipe_percpu_domain[cpuid];

	list_for_each_safe(pos, npos, &__ipipe_pipeline) {
		next_domain = list_entry(pos, struct ipipe_domain, p_link);
+		event_handler = next_domain->evhand[event];

		if (next_domain->evhand[event] != NULL) {
			ipipe_percpu_domain[cpuid] = next_domain;
+			atomic_inc(&somewhere_stored->counter);
			ipipe_unlock_cpu(flags);
-			propagate = !next_domain->evhand[event](event, start_domain, data);
+			propagate = !event_handler(...);
			ipipe_lock_cpu(flags);
+			if (atomic_dec(&somewhere_stored->counter) == 0)
+				send_virtual_irq(virt_irq, EVENT_TYPE, arg); /* do it per interested domain */
	...
}

Then ipipe_catch_event(..., NULL) should do something along the following lines:

ipipe_catch_event()
{
	...
	lock();		/* not sure it's even necessary */
	set ipd->evhand[event] to NULL;
	unlock();

	ipipe_get_synched(EVENT_TYPE, arg);	/* gets blocked */
	...
}

ipipe_get_synched():
 - lock
 - if somewhere_stored->counter != 0, add the caller to some wait queue (impl. depends on the domain)
 - unlock
 - gets blocked

virtual_irq_handler():
 - lock
 - wake up all blocked callers for a given EVENT_TYPE and domain
 - unlock

--
Best regards,
Dmitry Adamushko
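[Editorial aside: the reference-counting scheme sketched in the message above -- each in-flight handler invocation holds a count, the teardown path clears the handler and then waits for the count to drain -- can be compressed into a small standalone model. All names here are invented; the real proposal blocks on a wait queue woken by a virtual IRQ rather than polling, and would need the ordering discussed elsewhere in this thread.]

```c
#include <stdatomic.h>
#include <stddef.h>

struct fake_event_slot {
	void (*evhand)(void);	/* published handler; NULL once torn down */
	atomic_int inflight;	/* invocations currently in progress */
};

/* Dispatcher side: take a stable copy of the handler, hold a
 * reference across the call (made with the lock dropped in the real
 * dispatcher), release it afterwards. */
void fake_dispatch(struct fake_event_slot *s)
{
	void (*h)(void) = s->evhand;

	if (h == NULL)
		return;
	atomic_fetch_add(&s->inflight, 1);
	h();					/* runs without the lock held */
	atomic_fetch_sub(&s->inflight, 1);
}

/* Teardown side: after this returns, no new invocation can start and
 * none is still running.  The polling loop stands in for the
 * wait-queue + virtual-IRQ wakeup of the proposal. */
void fake_catch_event_null(struct fake_event_slot *s)
{
	s->evhand = NULL;
	while (atomic_load(&s->inflight) != 0)
		;
}

/* Demo handler so the scheme can be exercised. */
static int demo_calls;
static void demo_handler(void) { demo_calls++; }
```

This is the same shape as synchronize_rcu()-style quiescence mentioned earlier in the thread: unpublish first, then wait for pre-existing readers, after which the module holding the handler can safely go away.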
[Xenomai-core] Re: [Xenomai-help] Bad EIP kernel-Oops
On 12/09/06, Philippe Gerum [EMAIL PROTECTED] wrote: On Tue, 2006-09-12 at 15:24 +0200, Nils Kemper wrote:

Hi, I want to use Xenomai, but I get (sometimes, but every time the same) kernel-Oops just by running xeno-test: [..] Xenomai: stopping native API services. I-pipe: Domain Xenomai unregistered. Xenomai: hal/x86 stopped. Xenomai: real-time nucleus unloaded.

Does the issue still pop up if you set the Xenomai nucleus as static (i.e. not as a module) in the kernel configuration?

Just a weird presupposition. In __ipipe_dispatch_event():

	ipipe_lock_cpu(flags);

	start_domain = this_domain = ipipe_percpu_domain[cpuid];

	list_for_each_safe(pos, npos, &__ipipe_pipeline) {
		next_domain = list_entry(pos, struct ipipe_domain, p_link);
		/* ... */
		if (next_domain->evhand[event] != NULL) {
			ipipe_percpu_domain[cpuid] = next_domain;
			ipipe_unlock_cpu(flags);
(1)			propagate = !next_domain->evhand[event](event, start_domain, data);

Does anything prevent another thread from preempting the current one at (1) and making next_domain invalid? Then, if next_domain == rthal_domain (aka Xenomai), e.g. because someone unloaded all the modules:

- static: rthal_domain is still kind of a valid object -- it's at least in a valid memory region, and evhand points to a valid function. It's even possible to jump to the next element if the rthal_domain fields were not cleared...

- non-static: the module image was unloaded; next_domain doesn't point to anything reasonable.

Jan or Nils, what instructions does objdump -d kernel/ipipe/core.o show for the given offset in __ipipe_dispatch_event()? 0xcd in the case of Nils:

[c013f158] __ipipe_dispatch_event+0xcd/0xeb ?

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: Move rtdm_irq_enable close to rtdm_irq_request
Sharing itself is no problem: if you request an IRQ with XN_ISR_SHARED set but the nucleus is not able to actually assign it to more than one driver, the second request will simply fail. I see no need to deny the first request or even break the driver build. Problematic is only the handling of edge-triggered shared IRQs with the level-triggered handler (can cause lost IRQs). Probably a runtime check is best, as we cannot control the configuration of, e.g., external RTDM drivers. What about the attached patch?

It's ok with me. I just thought maybe it's better to break the build and alert the user earlier (although it's kind of an indirect alert indeed) when the host environment (Xeno) doesn't support a requested mode (e.g. SHARED_EDGE). If such a driver (one that requires EDGE_SHARED) is part of the mainline, then we may use Kconfig features either (1) to make it selectable only when XENO_OPT_SHIRQ_EDGE is on, or (2) to select XENO_OPT_SHIRQ_EDGE automatically when the driver has been chosen.

Jan

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: Move rtdm_irq_enable close to rtdm_irq_request
On 06/09/06, Jan Kiszka [EMAIL PROTECTED] wrote:

If such a driver (that requires EDGE_SHARED) is part of the mainline, then we may use Kconfig features either (1) to make it selectable only when XENO_OPT_SHIRQ_EDGE is on, or (2) to select XENO_OPT_SHIRQ_EDGE automatically when the driver has been chosen.

Let's do both, the runtime check + some Kconfig magic for mainline drivers. For the latter we should reorganise the config options slightly. XENO_OPT_SHIRQ_* may better depend on a new switch XENO_OPT_SHIRQ. Thus, when the user enables IRQ sharing and some in-tree driver requires edge-triggering support, XENO_OPT_SHIRQ_EDGE will be selected by the driver's Kconfig: select XENO_OPT_SHIRQ_EDGE if XENO_OPT_SHIRQ. With the current layout it would look like this: select XENO_OPT_SHIRQ_EDGE if XENO_OPT_SHIRQ_LEVEL.

XENO_OPT_SHIRQ_LEVEL doesn't need to be on in order to use XENO_OPT_SHIRQ_EDGE. Or do you perhaps mean the following behavior:

(1) some driver with XN_ISR_EDGE is selected:

	if XENO_OPT_IRQ
		select XENO_OPT_SHIRQ_EDGE
	else
		the run-time check in xnintr_attach() will report -EBUSY in case the line is already busy

What I meant is

(2) some driver with XN_ISR_EDGE is selected:

	o select XENO_OPT_SHIRQ_EDGE

so that shared IRQ support (only for edge-triggered interrupts) gets unconditionally enabled. In the case of (1), XENO_OPT_SHIRQ can't be enabled on its own, i.e. without any of XENO_OPT_SHIRQ_*.

Well, your proposal is probably better. With XN_ISR_EDGE a driver only declares that it's ready to work in shared mode, but that doesn't mean it can't work in non-shared mode. If a user has a separate line for it, then the shared-IRQ infrastructure just adds unused overhead.

Yep, you are right (heh... you are asking who had doubts? :)

Jan

--
Best regards,
Dmitry Adamushko
[Xenomai-core] Re: Move rtdm_irq_enable close to rtdm_irq_request
On 06/09/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote:

-menu Shared interrupts
+menuconfig XENO_OPT_SHIRQ
+	bool "Shared interrupts"

 config XENO_OPT_SHIRQ_LEVEL
 	bool "Level-triggered interrupts"
-	default n
+	depends on XENO_OPT_SHIRQ
+	default y
 	help
-
+	  Enables support for shared level-triggered interrupts, so that
 	  multiple real-time interrupt handlers are allowed to control
 	  dedicated hardware devices which are configured to share
@@ -369,7 +371,8 @@ config XENO_OPT_SHIRQ_LEVEL

 config XENO_OPT_SHIRQ_EDGE
 	bool "Edge-triggered interrupts"
-	default n
+	depends on XENO_OPT_SHIRQ
+	default y
 	help

So a user may end up with XENO_OPT_SHIRQ being enabled while both LEVEL and EDGE are disabled? Maybe it's worth making LEVEL y by default, as it's likely to be a required option?

Do you see the default y above, no? :)

Arghhh, again... nope, I bet it was not there before! How did you manage to hack my gmail account? :)

I thought about making only XENO_OPT_SHIRQ_LEVEL default y, but at least for poor x86 users on legacy hardware (ISA), sharing takes place at least as often with edge-triggered sources.

I thought it was level-triggered indeed, at least judging by the fact that Linux provides generic support only for the level-triggered case. E.g. cross-domain IRQ sharing (with your approach) would require only the LEVEL option (I actually wanted to port/rework your patch, taking into account the improvements we have discussed recently, over some recent e.g. e1000 driver + Xeno so as to have an up-to-date example illustrating the approach).

Jan

--
Best regards,
Dmitry Adamushko
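[Editorial aside: the full reorganisation agreed on in this exchange -- a `menuconfig XENO_OPT_SHIRQ` umbrella, both sub-options depending on it, and a driver pulling in edge support via `select ... if ...` -- would look roughly like the sketch below. The option names are taken from the thread; the driver symbol `XENO_DRIVERS_FOO` is hypothetical.]

```kconfig
menuconfig XENO_OPT_SHIRQ
	bool "Shared interrupts"
	default n

config XENO_OPT_SHIRQ_LEVEL
	bool "Level-triggered interrupts"
	depends on XENO_OPT_SHIRQ
	default y

config XENO_OPT_SHIRQ_EDGE
	bool "Edge-triggered interrupts"
	depends on XENO_OPT_SHIRQ
	default y

# A hypothetical mainline driver needing edge-triggered sharing:
config XENO_DRIVERS_FOO
	bool "Foo driver"
	select XENO_OPT_SHIRQ_EDGE if XENO_OPT_SHIRQ
```

The `select ... if XENO_OPT_SHIRQ` form enables edge support only when the user has opted into IRQ sharing at all; otherwise the driver still builds and relies on the runtime -EBUSY check discussed above.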
[Xenomai-core] Re: [patch, RFC] detect unhandled interrupts
(cache line == 128 bits)

let's say cacheline[4]

int a = 1;	// e.g. &a == 0xabcd0004

This part of memory is currently not in the cache. So:

1) [0xabcd0000, 0xabcd0010] == 128 bits is loaded from memory into the cache line.
2) then 1 is stored into cacheline[1]
3) [write-through] --- sync with memory, or [write-back] --- delay syncing?

It seems to be correct indeed. IOW, any write op involves cache-line fetching (from the L2 cache or main memory) if the line is not in L1.

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] [patch, RFC] detect unhandled interrupts
On 31/08/06, Jan Kiszka [EMAIL PROTECTED] wrote:

Unless I'm currently doing something completely wrong, it looks like it doesn't work as it should. :( I attached a xeno_16550A to the Ethernet NIC's IRQ line and opened the serial port - lock-up. (1)

Ok, and now I enabled shared IRQ support:

kernel/xenomai/nucleus/intr.c: In function 'xnintr_irq_handler':
kernel/xenomai/nucleus/intr.c:399: error: 'xnintr_t' has no member named 'unhandled'
kernel/xenomai/nucleus/intr.c:404: error: 'xnintr_t' has no member named 'unhandled' (2)

Looks like our patch review failed... :-/

I failed and got my face in the soup (as they say in my country). But as far as I can see, Philippe's corrections fix only the compilation issues, i.e. (2). (1) still remains unclear. I can't see any explanation so far, unless xeno_16550A sees some of the interrupts as its own and returns HANDLED :)

Jan

--
Best regards,
Dmitry Adamushko
Re: [Xenomai-core] Re: [patch, RFC] detect unhandled interrupts
I had it also in mind and grepped the use cases of unlikely in the kernel/ directory. There are a number of unlikely(a op. b) but none of unlikely(a) op. unlikely(b). Out of curiosity, one may disassemble the code for both cases. My feeling, though, is that unlikely(a && b) is at least not worse (CPU- and compiler-wise), but I don't want to speculate as I'm quite an uneducated bozo here :) Since likely/unlikely are hints given to the compiler for optimizing branches, you might want to give it all the information you have at hand immediately, to augment your chances to have it do the right thing [just in case the optimizer has no more short-term memory than a red fish...] (1) if (unlikely(a) && unlikely(b)) (2) if (unlikely(a && b)) (1) results in one more je instruction on the CPU's path to a likely branch than in case of (2). And as someone more educated in this field has just told me (hopefully I got it correctly), any conditional jump leads to pipeline flushing (maybe recent CPUs can avoid it indeed in case a condition is false). So code that contains fewer conditional and unconditional jumps is more pipeline-friendly. And a compiler is not (always) smart enough (should it be, though?) to make the following transformation: unlikely(a) && unlikely(b) => unlikely(a && b) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
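The two forms discussed above can be compared side by side. The sketch below uses GCC's `__builtin_expect`; the macros are stand-ins mirroring the kernel's likely/unlikely, and it simply confirms that the two spellings are semantically identical while handing the compiler different hints.

```c
#include <assert.h>

/* Stand-ins for the kernel's likely/unlikely macros: they only pass a
 * branch-prediction hint to GCC; the value itself is unchanged. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Form (1): two separate hints, typically compiled to two conditional jumps. */
static int check_v1(int a, int b)
{
    if (unlikely(a) && unlikely(b))
        return 1;
    return 0;
}

/* Form (2): one combined hint; the && still short-circuits, but the
 * compiler sees a single unlikely condition. */
static int check_v2(int a, int b)
{
    if (unlikely(a && b))
        return 1;
    return 0;
}
```

Comparing the optimized assembly of both functions (e.g. with gcc -O2 -S) is the quickest way to spot the extra conditional jump that form (1) can produce on x86.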
[Xenomai-core] Re: [patch, RFC] detect unhandled interrupts
On 30/08/06, Dmitry Adamushko [EMAIL PROTECTED] wrote: (cache line == 128 bits) Let's say cacheline[4]; int a = 1; // e.g. &a == 0xabcd0004. This part of memory is currently not in the cache, so: 1) [0xabcd0000, 0xabcd0010] == 128 bits is loaded from memory into the cache line. Nope, it looks wrong. Ignore this part :) 2) then 1 is loaded into cacheline[1] 3) [write-through] --- sync with memory, or [write-back] --- delay syncing? Jan -- Best regards, Dmitry Adamushko -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [patch, RFC] detect unhandled interrupts
Hello, Jan raised this question initially and I was struggling last week to get his request eventually done :) The main idea is to prevent system lockups when cross-domain IRQ sharing isn't properly used (there were a number of reports recently). So here is an initial patch as well as some related critique (yep, I criticize my own patch). I tend to think now: 1) "unhandled" is not necessary indeed (do we need to chase spurious interrupts the way Linux does it? i.e. it disables the line only after a number of consecutively unhandled interrupts > SOME_LIMIT); 2) XN_ISR_NONE should _not_ imply that the line gets re-enabled. XN_ISR_HANDLED implies it, that's ok. But XN_ISR_NONE (from all ISRs in case of a shared RT line) should really mean something strange happened, and it can be: o either a spurious interrupt; o something is sitting on the same line in the Linux domain. The ISR should have returned XN_ISR_PROPAGATE in this case. That said, if we are not interested in chasing spurious interrupts the Linux way, then this patch could be highly simplified to just adding the following check in all nucleus ISR handlers:

+	if (unlikely(s == XN_ISR_NONE)) {
+		xnlogerr("xnintr_check_status: %d consecutive unhandled interrupts. "
+			 "Disabling the IRQ line #%d\n",
+			 XNINTR_MAX_UNHANDLED, irq);
+		s |= XN_ISR_NOENABLE;
+	}

I tend to think this would be nicer? -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
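The detection logic proposed above can be modeled in a few lines of user-space C. The XN_ISR_* values and the threshold below are illustrative stand-ins, not the real nucleus definitions; note the check is nested so that only a handled IRQ resets the streak (a flat `s == XN_ISR_NONE && ++unhandled == MAX ... else reset` form would clear the counter on every miss below the limit).

```c
#include <assert.h>

/* Illustrative values only; the real XN_ISR_* flags live in nucleus/intr.h. */
#define XN_ISR_NONE          0x0
#define XN_ISR_HANDLED       0x1
#define XN_ISR_NOENABLE      0x100
#define XNINTR_MAX_UNHANDLED 1000

static unsigned unhandled; /* per-line counter of consecutive misses */

/* Amend the ISR status: after XNINTR_MAX_UNHANDLED consecutive unhandled
 * interrupts, ask the dispatcher to leave the line disabled. */
static int check_status(int s)
{
    if (s == XN_ISR_NONE) {
        if (++unhandled >= XNINTR_MAX_UNHANDLED)
            s |= XN_ISR_NOENABLE; /* line stays masked from now on */
    } else
        unhandled = 0; /* a handled IRQ resets the streak */
    return s;
}
```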
[Xenomai-core] Re: [patch, RFC] detect unhandled interrupts
-ENOATTACHMENT (also a common issue...) now fixed. Jan -- Best regards, Dmitry Adamushko

diff -upr xenomai-SVN/include/nucleus/intr.h xenomai/include/nucleus/intr.h
--- xenomai-SVN/include/nucleus/intr.h	2006-07-20 11:09:01.0 +0200
+++ xenomai/include/nucleus/intr.h	2006-08-29 21:20:19.0 +0200
@@ -43,6 +43,8 @@ typedef struct xnintr {
 #if defined(CONFIG_XENO_OPT_SHIRQ_LEVEL) || defined(CONFIG_XENO_OPT_SHIRQ_EDGE)
     struct xnintr *next; /*!< Next object in the IRQ-sharing chain. */
+#else
+    unsigned unhandled;	/*!< Number of consecutive unhandled interrupts */
 #endif /* CONFIG_XENO_OPT_SHIRQ_LEVEL || CONFIG_XENO_OPT_SHIRQ_EDGE */
 
     xnisr_t isr;	/*!< Interrupt service routine. */
diff -upr xenomai-SVN/ksrc/nucleus/intr.c xenomai/ksrc/nucleus/intr.c
--- xenomai-SVN/ksrc/nucleus/intr.c	2006-07-20 12:35:40.0 +0200
+++ xenomai/ksrc/nucleus/intr.c	2006-08-29 21:52:49.0 +0200
@@ -159,6 +159,8 @@ int xnintr_init(xnintr_t *intr,
     intr->flags = flags;
 #if defined(CONFIG_XENO_OPT_SHIRQ_LEVEL) || defined(CONFIG_XENO_OPT_SHIRQ_EDGE)
     intr->next = NULL;
+#else
+    intr->unhandled = 0;
 #endif /* CONFIG_XENO_OPT_SHIRQ_LEVEL || CONFIG_XENO_OPT_SHIRQ_EDGE */
 
     return 0;
@@ -374,6 +376,7 @@ void xnintr_clock_handler(void)
     xnintr_irq_handler(nkclock.irq, &nkclock);
 }
 
+#define XNINTR_MAX_UNHANDLED 1000
 /*
  * Low-level interrupt handler dispatching the ISRs -- Called with
  * interrupts off.
  */
@@ -393,6 +396,14 @@ static void xnintr_irq_handler(unsigned
     s = intr->isr(intr);
     ++intr->hits;
 
+    if (unlikely(s == XN_ISR_NONE && ++intr->unhandled == XNINTR_MAX_UNHANDLED)) {
+	xnlogerr("xnintr_check_status: %d consecutive unhandled interrupts. "
+		 "Disabling the IRQ line #%d\n",
+		 XNINTR_MAX_UNHANDLED, irq);
+	s |= XN_ISR_NOENABLE;
+    } else
+	intr->unhandled = 0;
+
     if (s & XN_ISR_PROPAGATE)
 	xnarch_chain_irq(irq);
     else if (!(s & XN_ISR_NOENABLE))
@@ -422,6 +433,7 @@ static void xnintr_irq_handler(unsigned
 
 typedef struct xnintr_shirq {
     xnintr_t *handlers;
+    int unhandled;
 #ifdef CONFIG_SMP
     atomic_counter_t active;
 #endif /* CONFIG_SMP */
@@ -482,12 +494,21 @@ static void xnintr_shirq_handler(unsigne
     intr = shirq->handlers;
 
     while (intr) {
-	s |= intr->isr(intr) & XN_ISR_BITMASK;
+	s |= intr->isr(intr);
 	++intr->hits;
 	intr = intr->next;
     }
 
     xnintr_shirq_unlock(shirq);
 
+    if (unlikely(s == XN_ISR_NONE && ++shirq->unhandled == XNINTR_MAX_UNHANDLED)) {
+	xnlogerr("xnintr_irq_handler: %d consecutive unhandled interrupts. "
+		 "Disabling the IRQ line #%d\n",
+		 XNINTR_MAX_UNHANDLED, irq);
+	s |= XN_ISR_NOENABLE;
+    } else
+	shirq->unhandled = 0;
+
     if (s & XN_ISR_PROPAGATE)
 	xnarch_chain_irq(irq);
     else if (!(s & XN_ISR_NOENABLE))
@@ -527,16 +548,15 @@ static void xnintr_edge_shirq_handler(un
     intr = shirq->handlers;
 
     while (intr != end) {
-	int ret, code, bits;
+	int ret, code;
 
 	ret = intr->isr(intr);
 	code = ret & ~XN_ISR_BITMASK;
-	bits = ret & XN_ISR_BITMASK;
+	s |= ret;
 
 	if (code == XN_ISR_HANDLED) {
 	    ++intr->hits;
 	    end = NULL;
-	    s |= bits;
 	} else if (code == XN_ISR_NONE && end == NULL)
 	    end = intr;
 
@@ -554,6 +574,14 @@ static void xnintr_edge_shirq_handler(un
 	    ("xnintr_edge_shirq_handler() : failed to get the IRQ%d line free.\n",
 	     irq);
 
+    if (unlikely(s == XN_ISR_NONE && ++shirq->unhandled == XNINTR_MAX_UNHANDLED)) {
+	xnlogerr("xnintr_irq_handler: %d consecutive unhandled interrupts. "
+		 "Disabling the IRQ line #%d\n",
+		 XNINTR_MAX_UNHANDLED, irq);
+	s |= XN_ISR_NOENABLE;
+    } else
+	shirq->unhandled = 0;
+
     if (s & XN_ISR_PROPAGATE)
 	xnarch_chain_irq(irq);
     else if (!(s & XN_ISR_NOENABLE))
@@ -613,6 +641,7 @@ static int xnintr_shirq_attach(xnintr_t
 	    handler = xnintr_edge_shirq_handler;
 #endif /* CONFIG_XENO_OPT_SHIRQ_EDGE */
     }
+    shirq->unhandled = 0;
 
     err = xnarch_hook_irq(intr->irq, handler, intr->iack, intr);
     if (err)
___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [patch, minor] irq proc output in intr.c
On 22/08/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote:

diff -urp xenomai-SVN/include/nucleus/intr.h xenomai-a/include/nucleus/intr.h
--- xenomai-SVN/include/nucleus/intr.h	2006-07-20 11:09:01.0 +0200
+++ xenomai-a/include/nucleus/intr.h	2006-08-22 09:32:24.0 +0200
@@ -71,7 +71,9 @@ int xnintr_mount(void);
 
 void xnintr_clock_handler(void);
 
+#if defined(CONFIG_PROC_FS) && defined(__KERNEL__)
 int xnintr_irq_proc(unsigned int irq, char *str);
+#endif /* CONFIG_PROC_FS && __KERNEL__ */
 
 /* Public interface. */
diff -urp xenomai-SVN/ksrc/nucleus/intr.c xenomai-a/ksrc/nucleus/intr.c
--- xenomai-SVN/ksrc/nucleus/intr.c	2006-07-20 12:35:40.0 +0200
+++ xenomai-a/ksrc/nucleus/intr.c	2006-08-22 09:34:28.0 +0200
@@ -691,6 +691,7 @@ int xnintr_mount(void)
     return 0;
 }
 
+#if defined(CONFIG_PROC_FS) && defined(__KERNEL__)
 int xnintr_irq_proc(unsigned int irq, char *str)
 {
     xnintr_shirq_t *shirq;
@@ -727,6 +728,7 @@ int xnintr_irq_proc(unsigned int irq, ch
     return p - str;
 }
+#endif /* CONFIG_PROC_FS && __KERNEL__ */
 
 #else /* !CONFIG_XENO_OPT_SHIRQ_LEVEL && !CONFIG_XENO_OPT_SHIRQ_EDGE */
 
@@ -735,10 +737,31 @@ int xnintr_mount(void)
     return 0;
 }
 
+#if defined(CONFIG_PROC_FS) && defined(__KERNEL__)
 int xnintr_irq_proc(unsigned int irq, char *str)
 {
-    return 0;
+    xnintr_t *intr;
+    char *p = str;
+    spl_t s;

My bad. I ran diff with an older version; the correct one was in another directory.

+
+    if (rthal_virtual_irq_p(irq)) {
+	p += sprintf(p, "[virtual]");
+	return p - str;
+    } else if (irq == XNARCH_TIMER_IRQ) {
+	p += sprintf(p, "%s", nkclock.name);
+	return p - str;
+    }
+
+    xnlock_get_irqsave(&nklock, s);

What's the idea of this lock? I'm asking as xnintr_attach/detach should not run under nklock, right? So how can it protect us then? They can, as rthal_critical_enter/exit() is no longer used in rthal_irq_request(). Now I actually recall why I have not used a cookie field to store a corresponding xnintr_shirq_t object. It's not safe wrt detach ops, and there is no way to implement correct synchronization involving only the nucleus layer. Ok, it's possible to use something similar to xnintr_shirq_lock/unlock(), but that would require the per-irq array also for the non-irq-shared case and that's something I wanted to avoid in the first instance. Hmm.. but maybe it's actually not that bad... So the problem is that the xnintr_t object we have got as a cookie can be deleted at any time while we are using it.

	intr = rthal_irq_cookie(&rthal_domain, irq);
	if (intr && *(intr->name))
	    p += sprintf(p, "%s", intr->name);

Hm.. if I am not wrong, the same problem is true for xnintr_irq_handler(). I have to think a bit more on this... Jan -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [patch, minor] xnpipe_recv and xntimer_get_timeout_periodic()
[ resent, subject was broken ] Hello,

[ pipe.c.patch ] updates the timeout variable so that the remaining timeout is used in case of consecutive xnsynch_sleep_on() calls. As I understand it, that may indeed happen when another thread steals the data while this one waits to be scheduled in.

[ timer.c.patch ] xnticks_t is unsigned while (as I understand) xntlholder_date(&timer->plink) - nkpod->jiffies can be negative. In this case, some big positive number is returned and any code that relies on xnthread_timeout() or xntimer_get_timeout() and runs over periodic mode won't work properly (e.g. xnsynch_sleep_on() and xnpipe_recv() now). Not sure 1 should be returned in this case (so far I just did it the same way as xntimer_get_timeout_aperiodic()); I guess 0 would be better in both cases. At least in theory, 1 may cause an (even infinite) loop in xnpipe_recv(), as I don't like a check for timeout > 1 to be placed there. It's something that should be decided at the timer layer - I mean, whether it's too late to sleep or not. Hope I haven't overlooked anything this time :o) -- Best regards, Dmitry Adamushko

--- pipe-SVN.c	2006-08-20 15:04:25.0 +0200
+++ pipe.c	2006-08-20 21:29:55.0 +0200
@@ -450,6 +450,9 @@ ssize_t xnpipe_recv(int minor, struct xn
 	    ret = -EIDRM;
 	    goto unlock_and_exit;
 	}
+
+	/* remaining timeout */
+	timeout = xnthread_timeout(thread);
     }
 
     *pmh = link2mh(holder);
--- timer-SVN.c	2006-08-20 15:50:18.0 +0200
+++ timer.c	2006-08-20 15:54:30.0 +0200
@@ -313,7 +313,12 @@ static xnticks_t xntimer_get_date_period
 static xnticks_t xntimer_get_timeout_periodic(xntimer_t *timer)
 {
-    return xntlholder_date(&timer->plink) - nkpod->jiffies;
+    xnticks_t now = nkpod->jiffies;
+
+    if (xntlholder_date(&timer->plink) < now)
+	return 1; /* Will elapse shortly. */
+
+    return xntlholder_date(&timer->plink) - now;
 }
 
 static xnticks_t xntimer_get_raw_expiry_periodic(xntimer_t *timer)
___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
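The unsigned-wraparound pitfall described above is easy to demonstrate in isolation. Below, `xnticks_t` and `get_timeout()` are simplified stand-ins for the nucleus types, not the real implementation; the guarded variant mirrors the idea of the timer.c patch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t xnticks_t; /* stand-in: the real xnticks_t is unsigned too */

/* Guarded variant: report an already-elapsed timer as "1 tick left"
 * instead of letting the unsigned subtraction wrap to a huge value. */
static xnticks_t get_timeout(xnticks_t date, xnticks_t now)
{
    if (date < now)
        return 1; /* will elapse shortly */
    return date - now;
}
```

Without the guard, `date - now` for an elapsed timer wraps around to a value near 2^64, which is exactly the bogus "timeout" the mail describes.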
Re: [Xenomai-core] -EINTR using rt_pipe_read with TM_INFINITE
Hello, take a look at the 4th parameter of rt_pipe_create():

@param poolsize Specifies the size of a dedicated buffer pool for the
 * pipe. Passing 0 means that all message allocations for this pipe are
 * performed on the system heap.

The system heap is also used for other allocations and not only to serve a given pipe. Its default size is 128 Kb (configurable through the config, though). Passing a non-zero parameter causes a private heap of the given size to be created for this pipe. Note, it's a size in bytes, not a flag (in your example you use 1 for the second pipe). This value is rounded up to a page size.

rt_pipe_create()
{
    ...
    if (poolsize < 2 * PAGE_SIZE)
	poolsize = 2 * PAGE_SIZE;

    poolsize = xnheap_rounded_size(poolsize, PAGE_SIZE);

-- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
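The poolsize policy quoted above can be modeled in a few lines. `PAGE_SIZE` and the rounding expression below are simplified assumptions (the real code delegates to `xnheap_rounded_size()`), but the shape of the policy is the same.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL /* illustrative; the real value is arch-dependent */

/* Model of the rt_pipe_create() poolsize handling: 0 selects the system
 * heap; anything else becomes a private heap of at least two pages,
 * rounded up to a page multiple. */
static unsigned long effective_poolsize(unsigned long poolsize)
{
    if (poolsize == 0)
        return 0; /* allocations go to the system heap */
    if (poolsize < 2 * PAGE_SIZE)
        poolsize = 2 * PAGE_SIZE;
    /* round up to the next multiple of PAGE_SIZE */
    return (poolsize + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}
```

So passing 1, as in the example discussed, silently requests an 8 KB private heap rather than acting as a boolean flag.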
Re: [Xenomai-core] Re: [Xenomai-help] -110 error on rt_task_send... bug?
The patch works here too. Thanks for looking into this so quickly. I'll switch back to 2.1 until this is resolved. It's also broken in 2.1. There are 2 problems there not related to the recently introduced ownership-stealing part. That said, you may safely (well, you remember, this comes WITHOUT ANY WARRANTY... but indeed :) apply this patch so far if you really need the synch. message passing mechanism. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [Xenomai-help] -110 error on rt_task_send... bug?
So could anyone test this patch and let me know if it works? Note: I haven't even compiled it as I don't have a proper environment. But the changes are pretty simple so it should be ok. There was actually yet another problem mmm... who cares to delete a sender from the msendq? Now it should be ok, or maybe I'm completely wrong and see things which do not exist at all :) p.s. I cc'ed xenomai-core as it may involve further discussions and it's the right place indeed. -- Best regards, Dmitry Adamushko

diff -urp xenomai-a/include/nucleus/synch.h xenomai-b/include/nucleus/synch.h
--- xenomai-a/include/nucleus/synch.h	2006-07-20 11:09:01.0 +0200
+++ xenomai-b/include/nucleus/synch.h	2006-08-09 12:53:37.044217000 +0200
@@ -28,6 +28,7 @@
 #define XNSYNCH_NOPIP 0x0
 #define XNSYNCH_PIP 0x2
 #define XNSYNCH_DREORD 0x4
+#define XNSYNCH_FOWNER 0x20	/* Fixed owner */
 
 #if defined(__KERNEL__) || defined(__XENO_SIM__)
diff -urp xenomai-a/ksrc/nucleus/synch.c xenomai-b/ksrc/nucleus/synch.c
--- xenomai-a/ksrc/nucleus/synch.c	2006-06-15 14:15:25.0 +0200
+++ xenomai-b/ksrc/nucleus/synch.c	2006-08-09 13:28:55.199081000 +0200
@@ -37,6 +37,14 @@
 #include <nucleus/module.h>
 #include <nucleus/ltt.h>
 
+/* temporarily here */
+static inline void xnsynch_set_owner_internal(xnsynch_t *synch, xnthread_t *thread)
+{
+    if (!testbits(synch->status, XNSYNCH_FOWNER))
+	synch->owner = thread;
+}
+
 /*!
  * \fn void xnsynch_init(xnsynch_t *synch, xnflags_t flags);
  * \brief Initialize a synchronization object.
@@ -181,7 +189,7 @@ void xnsynch_sleep_on(xnsynch_t *synch,
 	if (testbits(synch->status, XNSYNCH_PENDING)) {
 	    /* Ownership is still pending, steal the resource. */
-	    synch->owner = thread;
+	    xnsynch_set_owner_internal(synch, thread);
 
 	    __clrbits(thread->status, XNRMID | XNTIMEO | XNBREAK);
 	    goto grab_ownership;
@@ -209,7 +217,7 @@ void xnsynch_sleep_on(xnsynch_t *synch,
 	xnpod_suspend_thread(thread, XNPEND, timeout, synch);
 
-	if (unlikely(synch->owner != thread)) {
+	if (!testbits(synch->status, XNSYNCH_FOWNER) && unlikely(synch->owner != thread)) {
 	    /* Somebody stole us the ownership while we were ready
 	       to run, waiting for the CPU: we need to wait again
 	       for the resource. */
@@ -362,12 +370,12 @@ xnthread_t *xnsynch_wakeup_one_sleeper(x
     if (holder) {
 	thread = link2thread(holder, plink);
 	thread->wchan = NULL;
-	synch->owner = thread;
+	xnsynch_set_owner_internal(synch, thread);
 	__setbits(synch->status, XNSYNCH_PENDING);
 	xnltt_log_event(xeno_ev_wakeup1, thread->name, synch);
 	xnpod_resume_thread(thread, XNPEND);
     } else {
-	synch->owner = NULL;
+	xnsynch_set_owner_internal(synch, NULL);
 	__clrbits(synch->status, XNSYNCH_PENDING);
     }
@@ -435,7 +443,7 @@ xnpholder_t *xnsynch_wakeup_this_sleeper
     nholder = poppq(&synch->pendq, holder);
 
     thread = link2thread(holder, plink);
     thread->wchan = NULL;
-    synch->owner = thread;
+    xnsynch_set_owner_internal(synch, thread);
     __setbits(synch->status, XNSYNCH_PENDING);
     xnltt_log_event(xeno_ev_wakeupx, thread->name, synch);
     xnpod_resume_thread(thread, XNPEND);
@@ -523,7 +531,7 @@ int xnsynch_flush(xnsynch_t *synch, xnfl
 	status = XNSYNCH_RESCHED;
     }
 
-    synch->owner = NULL;
+    xnsynch_set_owner_internal(synch, NULL);
 
     __clrbits(synch->status, XNSYNCH_PENDING);
 
     xnlock_put_irqrestore(&nklock, s);
diff -urp xenomai-a/ksrc/skins/native/task.c xenomai-b/ksrc/skins/native/task.c
--- xenomai-a/ksrc/skins/native/task.c	2006-07-30 10:50:49.0 +0200
+++ xenomai-b/ksrc/skins/native/task.c	2006-08-09 13:32:21.91777 +0200
@@ -262,7 +262,7 @@ int rt_task_create(RT_TASK *task,
 
 #ifdef CONFIG_XENO_OPT_NATIVE_MPS
     xnsynch_init(&task->mrecv, XNSYNCH_FIFO);
-    xnsynch_init(&task->msendq, XNSYNCH_PRIO | XNSYNCH_PIP);
+    xnsynch_init(&task->msendq, XNSYNCH_PRIO | XNSYNCH_PIP | XNSYNCH_FOWNER);
     xnsynch_set_owner(&task->msendq, &task->thread_base);
     task->flowgen = 0;
 #endif /* CONFIG_XENO_OPT_NATIVE_MPS */
@@ -2057,10 +2057,7 @@ int rt_task_reply(int flowid, RT_TASK_MC
 	   identifier from other senders wrt to a given receiver. */
 
 	if (receiver->wait_args.mps.mcb_s.flowid == flowid) {
-	    /* Note that the following will cause the receiver to be
-	       unblocked without transferring the ownership of the
-	       msendq object, since we want the sender to keep it. */
-	    xnpod_resume_thread(&receiver->thread_base, XNPEND);
+	    xnsynch_wakeup_this_sleeper(&sender->msendq, holder);
 	    break;
 	}
     }
___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Re: [Xenomai-help] -110 error on rt_task_send... bug?
So could anyone test this patch and let me know if it works? Note : I haven't compiled it even as I don't have a proper environment. But the changes are pretty simple so it should be ok.I just did run the test program supplied by Vincent after applying Your patch(against 2.2.0)It did work ! That's good. Thanks for your assistance. So synch. message passing interface (IOW, rt_task_send/receive/reply) is mgmglmgmm... slightly broken at the moment for all versions. The ones who need it may use this patch so far. I'll think a bit more on it in the mean time but I tend to think that Fixed Ownership (expressed by XNSYNCH_FOWNER) should really be a property of the synch. object indeed (as in the current patch). Only one small typo was in the patch, it is fixed in the attached patch version. Ok, now I need to run diff -u old-patch new-patch to find out this typo :) -- Best regards,Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] xeno task debugging
Hi, do you mean that your rt tasks work fine until you press Ctrl-Alt-FN to switch virtual terminals? well, it's pretty enough information at least to point out the code that caused a problem in question :) it seems to be sysexit in entry.S, namely the following block /* if something modifies registers it must also disable sysexit */ EMULATE_ROOT_IRET(sysenter_exit) movl EIP(%esp), %edx movl OLDESP(%esp), %ecx xorl %ebp,%ebp sti - sysexit Looking at dumped register values : edx == e410 that stands for some code inside vsyscall page - [fffe000, ]. I suppose the next instruction after sysenter. ecx - a pointer to user-space stack of the task that issued a syscall through vsyscall page. it's 0xb7d1223c and at least we may say it belongs to user-space (but can point to something invalid). this is exactly what sysexit expects to find in edx and ecx, namely EIP and ESP. Then let's look at the code: Code: 44 24 18 e8 81 ca 03 00 fb 8b 4d 08 66 f7 c1 ff fe 0f 85 4e 01 00 00 e8 ed ea 00 00 8b 44 24 18 8b 54 24 28 8b 4c 24 34 31 ed fb 0f 35 50 fc 06 1e 50 55 57 56 52 51 53 ba 7b 00 00 00 8e da 8e 0f 35 seems to be the current instruction and this is an opcode for sysexit on e.g. AMD aa64 (google helps :) The following also confirms that: EIP is at sysenter_exit+0xf/0x11 EIP points to last 2 bytes of sysenter_exit and that's sysexit. I tried to find what exactly sysexit does and when/why it may fault but failed so far. e.g. if ecx is wrong (points to non-existent vma). It's also interesting that 2 tasks get GPF, both returning from some syscall. Maybe Philippe who (I suppose) introduced sysenter/exit support for Adeos/ipipe can shed some light. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [PATCH 6/6] Introduce IRQ latency benchmark
On 28/06/06, Gilles Chanteperdrix [EMAIL PROTECTED] wrote: Jan Kiszka wrote: Ok, then we also need a fix for the latency test (this is where I grabbed that pattern from). Is there a high risk that this locks up? I wonder why I never observed or heard of problems with latency. The latency test used to deadlock, that is why the summary printed on signal is printed on stderr. Unfortunately, there seems to be one display left on stdout, so it should still deadlock. The reason why we never see the deadlock is (perhaps) that the continuous intermediate results are printed in the context of the display thread, whereas the termination signals are preferably delivered by the kernel to the main thread, which only calls pause(). Which makes the use of stderr in signal handlers pointless. It's very likely so. The main thread would instead use something like:

...
while (!sig_term_received)
    pause();

do_cleanup_chores();

return 0;

cleanup_upon_sig() should only set the sig_term_received flag. Then all other threads must block signal delivery with sigprocmask() so that the main thread is the only one which accepts signals. btw, according to POSIX 1003.1-2003, the write() call is amongst the safe ones. http://www.die.net/doc/linux/man/man2/signal.2.html So write(1, ...); heh... not that nice? :) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [PATCH 6/6] Introduce IRQ latency benchmark
Then all other threads must block signal delivering with sigprocmask() so that the main thread is the only one which accepts signals. Is that required, i.e. does pause() only wake up if the signal handler executed in the main thread's context? Then cyclictest contains a bug as well... If strictly speaking, then yes, it's required. But I expect your fix to be working perfectly, just because the current linux implementation always favors the main thread when it comes to choosing a thread to handle the signal. look at signal.c :: __group_complete_signal() routine where the actual magic happens. from the comments /* * Now find a thread we can wake up to take the signal off the queue. * * If the main thread wants the signal, it gets first crack. * Probably the least surprising to the average bear. */ ... As I understand, the only case when another thread may handle the signal indeed is when the main thread has already another pending signal. But even then nothing bad happens as pause() (in the main thread) will be interrupted by this pending signal. Just finished will be set up by some other thread but it doesn't really matter. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [RFC][patch] per-process data.
On 08/05/06, Gilles Chanteperdrix [EMAIL PROTECTED] wrote: Jan Kiszka wrote: Likely I did not yet get the full picture: What prevents using another adeos per-task key for this? We would need a per-task key for every skin that needs a per-process data, but more importantly, we would need to track the clone syscalls (that would be another adeos event) in order to propagate the per-task data to each thread of the process. Or maybe ptds are inherited upon clone ? All of them are cleared in kernel/fork.c :: copy_process(). At least, the one pointed by nkgkptd (keeps shadow's xnthread_t *) must be cleared. Another logic could be applied for the rest, I guess. p.s. I haven't read you patch yet so all said above is out of current context :) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [REQUEST] eliminate the rthal_critical_enter/exit() from rthal_irq_request()
Hello, this fix prevents possible deadlocks in rt_intr_delete() vs. ISR handlers that I have mentioned earlier. -- Best regards, Dmitry Adamushko native-intr.c-delete-rework.patch Description: Binary data posix-intr.c-delete-rework.patch Description: Binary data Changelog-delete-rework.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [REQUEST] eliminate the rthal_critical_enter/exit() from rthal_irq_request()
Another approach would be to introduce a service like xnintr_synchronize() and enforce the upper interfaces (e.g. skins) to make use of it in their _delete() methods. That would be useful too for solving the "concurrent ISR while deleting" issue, but would not enforce single deletion in the SMP case, I guess. I actually want to introduce this call anyway. The nucleus then provides a single xnintr_disable (nosync) interface + xnintr_synchronize(), and a skin is free to introduce both sync and nosync versions on its own. Otherwise, there will be the same problem as with xnintr_detach() - i.e. xnintr_disable (sync) can't be called from a locked section, as it's actually done in rt_intr_delete() (of course, if we really do need the atomicity in the latter one). -- Philippe. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [REQUEST] eliminate the rthal_critical_enter/exit() from rthal_irq_request()
Hi, I haven't had access to my laptop during this week so the patches follow only now. Yet another issue remains that may lead to a deadlock under some circumstances: - rt_intr_delete() calls xnintr_detach() while holding the nklock. - xnintr_detach() (with CONFIG_SMP) spins in xnintr_shirq_spin() when a corresponding ISR is currently running. - now this ISR calls any function that uses the nklock (everything makes use of it) ... Bum! I first thought about moving xnintr_detach() out of the locked section in rt_intr_delete(), but it somewhat breaks interface consistency --- e.g. xeno_mark_delete() is called before the object is completely destroyed. Another approach would be to introduce a service like xnintr_synchronize() and enforce the upper interfaces (e.g. skins) to make use of it in their _delete() methods. The problem here is that the xnintr_shirq interface is not complete and safe without xnintr_shirq_spin() on the detaching step, and it becomes somewhat blurred with an enforcement like "on SMP systems one should always call xnintr_synchronize() right after calling xnintr_detach()". So I'm still thinking how to fix it better... -- Best regards, Dmitry Adamushko hal.c-irqrequest-noglock.patch Description: Binary data core.c-virtualizeirq.patch Description: Binary data intr.c-noglock.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [REQUEST] eliminate the rthal_critical_enter/exit() from rthal_irq_request()
Hi, the following question/suggestion: it could be good to eliminate the use of rthal_critical_enter/exit() from rthal_irq_request() if it's not strictly necessary. The proposal:

int rthal_irq_request (unsigned irq,
		       rthal_irq_handler_t handler,
		       rthal_irq_ackfn_t ackfn,
		       void *cookie)
{
-    unsigned long flags;
     int err = 0;

     if (handler == NULL || irq >= IPIPE_NR_IRQS)
	 return -EINVAL;

-    flags = rthal_critical_enter(NULL);
-
-    if (rthal_irq_handler(&rthal_domain, irq) != NULL)
-	{
-	err = -EBUSY;
-	goto unlock_and_exit;
-	}
-
     err = rthal_virtualize_irq(&rthal_domain, irq, handler, cookie, ackfn,
-			       IPIPE_DYNAMIC_MASK);
+			       IPIPE_DYNAMIC_MASK|IPIPE_EXCLUSIVE_MASK);

-  unlock_and_exit:
-
-    rthal_critical_exit(flags);
-
     return err;
}

IPIPE_EXCLUSIVE_MASK causes a -EBUSY error to be returned by ipipe_virtualize_irq() when handler != NULL and the current ipd->irqs[irq].handler != NULL. (IPIPE_EXCLUSIVE_MASK is actually not used anywhere at the moment, though ipipe_catch_event() is mentioned in comments). Another variant: ipipe_virtualize_irq() should always return -EBUSY when handler != NULL and the current ipd->irqs[irq].handler != NULL, not taking into account the IPIPE_EXCLUSIVE_FLAG. This should work if: o all the ipipe_domain structs are zeroed upon initialization (ok, in case of static or global); o ipipe_virtualize_irq(..., handler=NULL, ...) is always called between possible consecutive ipipe_virtualize_irq(..., handler!=NULL, ...) calls. But, yep, this way we enforce a policy for ipipe_virtualize_irq(), so the use of IPIPE_EXCLUSIVE_FLAG is likely better, esp. for the nucleus where every rthal_irq_request() has a conforming rthal_irq_release() call. Why do I want to eliminate it? o any function that makes use of critical_enter/exit() must not be called when a lock (e.g. nklock) is held. Ok, xnintr_attach() was always the case and it's used properly e.g. in native::rt_intr_create(), but xnintr_detach() is now the case too (heh.. 
I have overlooked it in the first instance) because xnintr_shirq_detach() is synchronized wrt xnintr_shirq_attach() using critical_enter/exit(). This is only because xnintr_shirq_attach() make use of rthal_irq_request() --- rthal_critical_enter/exit(), hence the nklock can't be used in xnintr_shirq_*. Yep, another approach is to enforce the policy that both xnintr_attach() and xnintr_detach() must never be called when the nklock is held (say, rt_intr_delete() should be rewritten)... but I guess, the better solution is to eliminate the critical_enter/exit() from rthal_irq_request(). o there is no need to use cirtical_enter/exit() in xnintr_shirq_* anymore, the nklock can be used instead. this would solve one synch. problem in xnintr_* code, though there is yet another and more complex one I'm banging my head on now :o) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [draft PATCH] nested enable/disable irq calls
Hi, yep, this is yet another draft patch which aims at supporting nested IRQ disable/enable calls for the primary domain. o no changes on the adeos-ipipe layer, hence it's much cleaner and smaller than the one I posted last time; o eliminates the need for the XN_ISR_NOENABLE flag; o but is based on the presupposition (otherwise it's wrong) that for all architectures that support Xenomai the following is true: pic_handler->ack: * mask * eoi; pic_handler->end: * unmask. Philippe told me some time ago that this is a _must_ now for any arch to be compatible with adeos-ipipe. If so, with some minor cleanups (XN_ISR_NOENABLE should be removed all over the map, docs fixes, ...) and testing, the patch may hopefully find its way into the codebase. Any feedback? TIA, -- Best regards, Dmitry Adamushko depth.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
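The depth-counting idea behind such a patch can be sketched as follows. `hw_mask()`/`hw_unmask()` are hypothetical stand-ins for the PIC's mask/unmask operations, not real ipipe calls; the point is that only the outermost disable/enable pair touches the hardware.

```c
#include <assert.h>

/* Illustrative model of nested IRQ disable/enable with a depth counter. */
struct irq_line {
    int depth;  /* nesting count: >0 means the line is masked */
    int masked; /* models the hardware mask bit */
};

static void hw_mask(struct irq_line *l)   { l->masked = 1; }
static void hw_unmask(struct irq_line *l) { l->masked = 0; }

static void irq_disable_nested(struct irq_line *l)
{
    if (l->depth++ == 0)
        hw_mask(l); /* only the first disable touches the PIC */
}

static void irq_enable_nested(struct irq_line *l)
{
    if (l->depth > 0 && --l->depth == 0)
        hw_unmask(l); /* only the last enable unmasks the line */
}
```

With this bookkeeping, an ISR that wants to defer re-enabling simply leaves its disable outstanding until a bottom half calls the matching enable, so no XN_ISR_NOENABLE-style flag is needed.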
[Xenomai-core] [PATCH] shirq correction
Hello, I overlooked that the logic of decrementing the interrupt nesting count has been changed recently and both xnintr_shirq_handler() and xnintr_edge_shirq_handler() behave wrong in this respect. The attached patch fixes this misbehavior. -- Best regards,Dmitry Adamushko intr.c.patch Description: Binary data ChangeLog Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [RFC, Experimental Patch] nested irq disable calls
BEFORE static void openpic_end_irq(unsigned int irq_nr) { if (!(irq_desc[irq_nr].status & (IRQ_DISABLED|IRQ_INPROGRESS)) && irq_desc[irq_nr].action) openpic_enable_irq(irq_nr); } AFTER static void openpic_end_irq(unsigned int irq_nr) { if (!ipipe_root_domain_p() && !test_bit(IPIPE_DISABLED_FLAG, &ipipe_current_domain->irqs[irq_nr].control)) return; - !test_bit(IPIPE_DISABLED_FLAG, &ipipe_current_domain->irqs[irq_nr].control)) + test_bit(IPIPE_DISABLED_FLAG, &ipipe_current_domain->irqs[irq_nr].control)) ? Yep. Additionally, there is another issue we discussed once with Anders, which is related to not sending EOI twice after the shared IRQ already ended by a RT domain has been fully propagated down the pipeline to Linux; some kind of test_and_clear_temporary_disable flag would do, I guess. The other way would be to test_and_set some ended flag for the outstanding IRQ when the ->end() routine is entered, clearing this flag before pipelining the IRQ in __ipipe_walk_pipeline(). Actually, I'm now starting to wonder why we would want to permanently disable an IRQ line from a RT domain which is known to be used by Linux. Is this what IPIPE_DISABLED_FLAG is expected to be used for, or is it only there to handle the transient disabled state discussed above? Why permanently? I would see the following scenario - an ISR wants to _temporarily_ defer an IRQ line enabling until some later stage (e.g. a rt_task which is a bottom half). This is the only reason why xnarch_end_irq() or some later step in it (in this case ->end()) must be aware of IPIPE_DISABLED_FLAG. Why is the currently used approach (NOENABLE) not that good for it? 1) it actually defers (for some PICs) not only enabling but sending an EOI too; As a consequence : 2) rthal_end_irq() (on PPC, and not just xnintr_enable() or rthal_enable_irq()) must be called in the bottom half to re-enable the IRQ line; 3) it does not co-exist well with the shared interrupts support (I don't mean sharing between RT and non-RT domains here).
Although it's not a common case if a few ISRs on the same shared line want to defer enabling, esp. for the real-time domain; 4) it's a bit flag, and if we would like to use only scalar values one day then something like HANDLED, HANDLED_NOENABLE would be needed; The support for nested irq enable/disable calls would resolve all the restrictions above, but the question is whether we really need to resolve them. In the same vein, I'd like to know your vision of the nested irq enable/disable calls support. Any use cases? if (!ipipe_root_domain_p() || (!(irq_desc[irq_nr].status & (IRQ_DISABLED|IRQ_INPROGRESS)) && irq_desc[irq_nr].action)) openpic_enable_irq(irq_nr); } There is another way for most archs, which is to add such code to the ->end() routine override in ipipe-root.c; this would be simpler and safer than fixing such a routine for each and every kind of interrupt controller. x86 directly pokes into the PIC code and does not override IRQ control routines, though. I didn't know about them as I mostly looked at the x86 implementation. This gives us control over per-domain irq locking/unlocking (ipipe_irq_lock/unlock()); __ipipe_std_irq_dtype[irq].end() is always called unconditionally (and as a result, .enable() for some PICs). That said, the IRQ line is actually on, the interrupts are just not handled but accumulated in the pipeline log. Actually, why is ipipe_irq_unlock(irq) necessary in __ipipe_override_irq_end()? ipipe_irq_lock() is not called in __ipipe_ack_irq(). Is it locked somewhere else? At least, I haven't found explicit ipipe_irq_lock() or __ipipe_lock_irq() calls anywhere else. [skip-skip-skip] From another point of view, this new feature seems not to be too intrusive and not something really affecting the fast path, so it could be used by default, I guess. Or maybe we don't need it at all? The disable nesting count at Xenomai level is needed to let ISRs act independently from each other wrt interrupt masking - or at the very least, to let them think they do.
This is strictly a Xenomai issue, not an Adeos one. If we keep it at this level, then using the xnintr struct to store the nesting counter becomes an option, I guess. Let's start by defining possible use cases for nested irq enable/disable calls then. Maybe we just don't need them at all (at least not the same way Linux deals with them)? -- Philippe. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [RFC, Experimental Patch] nested irq disable calls
her Adeos domains, such as Linux. This is the regular * way to share interrupts between the nucleus and the host system. * - * - XN_ISR_NOENABLE causes the nucleus to ask the real-time control - * layer _not_ to re-enable the IRQ line (read the following section). - * xnarch_end_irq() must be called to re-enable the IRQ line later. - * * The nucleus re-enables the IRQ line by default. Over some real-time * control layers which mask and acknowledge IRQs, this operation is * necessary to revalidate the interrupt channel so that more interrupts * can be notified. + * The ISR may issue xnintr_disable_nosync() call in order to defer + * the IRQ line enabling until later. * * A count of interrupt receipts is tracked into the interrupt * descriptor, and reset to zero each time the interrupt object is @@ -345,12 +341,22 @@ int xnintr_enable (xnintr_t *intr) * Rescheduling: never. */ -int xnintr_disable (xnintr_t *intr) +int xnintr_disable_nosync (xnintr_t *intr) { return xnarch_disable_irq(intr->irq); } + +int xnintr_disable (xnintr_t *intr) + +{ + int ret = xnintr_disable_nosync(intr); + xnintr_synchronize(intr); + + return ret; +} + /*! * \fn xnarch_cpumask_t xnintr_affinity (xnintr_t *intr, xnarch_cpumask_t cpumask) * \brief Set interrupt's processor affinity. 
@@ -405,9 +411,9 @@ static void xnintr_irq_handler (unsigned s = intr->isr(intr); ++intr->hits; - if (s & XN_ISR_PROPAGATE) + if (s == XN_ISR_PROPAGATE) xnarch_chain_irq(irq); - else if (!(s & XN_ISR_NOENABLE)) + else xnarch_end_irq(irq); if (--sched->inesting == 0 && xnsched_resched_p()) @@ -487,7 +493,9 @@ static void xnintr_shirq_handler (unsign while (intr) { - s |= intr->isr(intr) & XN_ISR_BITMASK; + int temp = intr->isr(intr); + if (temp > s) + s = temp; ++intr->hits; intr = intr->next; } @@ -495,9 +503,9 @@ static void xnintr_shirq_handler (unsign --sched->inesting; - if (s & XN_ISR_PROPAGATE) + if (s == XN_ISR_PROPAGATE) xnarch_chain_irq(irq); - else if (!(s & XN_ISR_NOENABLE)) + else xnarch_end_irq(irq); if (sched->inesting == 0 && xnsched_resched_p()) @@ -535,19 +543,19 @@ static void xnintr_edge_shirq_handler (u while (intr != end) { - int ret = intr->isr(intr), - code = ret & ~XN_ISR_BITMASK, - bits = ret & XN_ISR_BITMASK; + int ret = intr->isr(intr); - if (code == XN_ISR_HANDLED) + if (ret == XN_ISR_HANDLED) { ++intr->hits; end = NULL; - s |= bits; } - else if (code == XN_ISR_NONE && end == NULL) + else if (ret == XN_ISR_NONE && end == NULL) end = intr; + if (ret > s) + s = ret; + if (counter++ > MAX_EDGEIRQ_COUNTER) break; @@ -562,9 +570,9 @@ static void xnintr_edge_shirq_handler (u if (counter > MAX_EDGEIRQ_COUNTER) xnlogerr("xnintr_edge_shirq_handler() : failed to get the IRQ%d line free.\n", irq); - if (s & XN_ISR_PROPAGATE) + if (s == XN_ISR_PROPAGATE) xnarch_chain_irq(irq); - else if (!(s & XN_ISR_NOENABLE)) + else xnarch_end_irq(irq); if (sched->inesting == 0 && xnsched_resched_p()) @@ -645,7 +653,7 @@ unlock_and_exit: return err; } -int xnintr_shirq_detach (xnintr_t *intr) +static int xnintr_shirq_detach (xnintr_t *intr) { xnintr_shirq_t *shirq = &xnshirqs[intr->irq]; xnintr_t *e, **p = &shirq->handlers; @@ -695,6 +703,11 @@ int xnintr_shirq_detach (xnintr_t *intr) return err; } +static void xnintr_synchronize (xnintr_t *intr) +{ + xnintr_shirq_spin(&xnshirqs[intr->irq]); +} + int xnintr_mount(void) { int 
i; --- -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] 16550 compile err: ?RTDM_IRQ_ENABLE? undeclared (first use in this function)
On 28/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Jim Cromie wrote: LD drivers/xenomai/16550A/built-in.o CC [M] drivers/xenomai/16550A/16550A.o /mnt/dilbert/jimc/dilbert/lxbuild/linux-2.6.15.1-ipipe-121/drivers/xenomai/16550A/16550A.c: In function 'rt_16550_interrupt': /mnt/dilbert/jimc/dilbert/lxbuild/linux-2.6.15.1-ipipe-121/drivers/xenomai/16550A/16550A.c:269: error: 'RTDM_IRQ_ENABLE' undeclared (first use in this function) /mnt/dilbert/jimc/dilbert/lxbuild/linux-2.6.15.1-ipipe-121/drivers/xenomai/16550A/16550A.c:269: error: (Each undeclared identifier is reported only once /mnt/dilbert/jimc/dilbert/lxbuild/linux-2.6.15.1-ipipe-121/drivers/xenomai/16550A/16550A.c:269: error: for each function it appears in.) make[4]: *** [drivers/xenomai/16550A/16550A.o] Error 1 make[3]: *** [drivers/xenomai/16550A] Error 2 make[2]: *** [drivers/xenomai] Error 2 make[1]: *** [drivers] Error 2 make: *** [_all] Error 2 I de-configured 16550, it built fine, so I suspect some recent change missed this item. that said, I haven't tried _NOENABLE, since I'm guessing blind. Yeah, I'm on it. Here is half of the patch I'm currently preparing: --- ../ksrc/drivers/16550A/16550A.c (Revision 624) +++ ../ksrc/drivers/16550A/16550A.c (Arbeitskopie) @@ -238,7 +238,7 @@ int rbytes = 0; int events = 0; int modem; - int ret = RTDM_IRQ_PROPAGATE; + int ret = RTDM_IRQ_NONE; ctx = rtdm_irq_get_arg(irq_context, struct rt_16550_context); @@ -266,7 +266,7 @@ events |= RTSER_EVENT_MODEMLO; } - ret = RTDM_IRQ_ENABLE | RTDM_IRQ_HANDLED; + ret = RTDM_IRQ_HANDLED; } if (ctx->in_nwait > 0) { My fault, sorry. I should have fixed that but overlooked it again. Jan -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [Xenomai-help] rt_task_wait_period() and overruns
On 01/03/06, Philippe Gerum [EMAIL PROTECTED] wrote: The other option I've described for dealing with overruns in rt_task_wait_period would be as follows: - save the count of overruns - clear the count of overruns /* i.e. purge */ - return both the saved count and -ETIMEDOUT to the user. This way, rt_task_wait_period would return only once with an error status, telling the user about the exact count of pending overruns at the same time. I definitely agree with you here. IMHO, there is no point in calling rt_task_wait_period() a few times in a row just to clean up the overrun counter (if there were a few overruns) when the whole count may be reported at once. The former way just adds unnecessary overhead. Actually, there is a kind of application that must not rely on the overrun counter: the klatency/latency utility and the like. They are normally run (at least the very first time) in an untrusted environment where SMI or something similar - that may prevent a CPU from handling normal interrupts for quite a long time - may happily occur. As the overrun counting is dependent on the timer interrupt, it becomes irrelevant. Something like overruns = (real_time_of_wakeup - desired_time_of_wakeup) / period (*) should rather be used there (of course, the timing source must not be interrupt-dependent). Other applications may likely rely on the overrun counter (returned by rt_task_wait_period(overrun)) as they normally run in a (more or less) studied and tweaked environment. Ok, some paranoid programmers would think differently even then, but there is always the (*) way :o) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [Xenomai-help] rt_task_wait_period() and overruns
On 01/03/06, Philippe Gerum [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 01/03/06, *Philippe Gerum* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: The other option I've described for dealing with overruns in rt_task_wait_period would be as follows: - save the count of overruns - clear the count of overruns /* i.e. purge */ - return both the saved count and -ETIMEDOUT to the user. This way, rt_task_wait_period would return only once with an error status, telling the user about the exact count of pending overruns at the same time. I definitely agree with you here. IMHO, there is no point in calling rt_task_wait_period() a few times in a row just to clean up the overrun counter (if there were a few overruns) when the whole count may be reported at once. The former way just adds unnecessary overhead. My concern is that some recovery procedure might require the exact number of pending overruns to operate properly in order to catch up with the missing expiries, and there is no way to get this information out of the current API (!). Even calling rt_task_wait_period in a loop and testing for -ETIMEDOUT is unusable, since, well, we would obviously get blocked when the overrun count drops to zero, which is not what we want in order to be able to run the recovery procedure asap. All in all, I would vote for changing the current rt_task_wait_period() interface. Actually, there is a kind of application that must not rely on the overrun counter: the klatency/latency utility and the like. They are normally run (at least the very first time) in an untrusted environment where SMI or something similar - that may prevent a CPU from handling normal interrupts for quite a long time - may happily occur. As the overrun counting is dependent on the timer interrupt, it becomes irrelevant. Something like overruns = (real_time_of_wakeup - desired_time_of_wakeup) / period (*) should rather be used there (of course, the timing source must not be interrupt-dependent). Ah! 
you know what, I'm pretty sure that one of your very first public posts on the RTAI/fusion mailing list at that time was exactly about this issue :o) Good memory indeed; so it's too early for you to retire :o) -- Philippe. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: [PATCH] Shared interrupts (yet another movie :)
On 28/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: ... Ok, are there any objections as to the current patch? If no, please apply. Oops, I found a minor problem: make[2]: Entering directory `/usr/src/xenomai/build/doc/doxygen' doxygen /usr/src/xenomai/ksrc/skins/native/intr.c:523: Warning: no matching file member found for int rt_intr_create(RT_INTR *intr, unsigned irq, int mode) Possible candidates: int rt_intr_create(RT_INTR *intr, const char *name, unsigned irq, int mode) ... int rt_intr_bind(rt_intr_placeholder *intr, unsigned irq, RTIME timeout) Possible candidates: int rt_intr_bind(RT_INTR *intr, const char *name, RTIME timeout) int rt_intr_bind(RT_INTR *intr, const char *name, RTIME timeout) Seems the doc is not yet up-to-date. Thanks. I have overlooked some parts. This patch fixes it up (should be applied after the main patch). Jan -- Best regards, Dmitry Adamushko shirq-doc-update.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [PATCH] Shared interrupts (yet another movie :)
Hi there, I have explicitly cc'ed Gilles as this patch affects the posix skin. In the light of the recent discussions, the AUTOENA flag has been converted to NOAUTOENA and the IRQ line is re-enabled on return from xnintr_irq_handler() and its shirq brothers by default. Also XN_ISR_CHAINED -> XN_ISR_PROPAGATE. I'm still not satisfied with the results, namely - the return values of the ISR. But, well, this is quite a separate question from the shirq support, so the latter should not remain in pending status only because of that. I still would like to see something along the lines of scalar values : NONE, HANDLED, PROPAGATE, and xnintr_disable() being called in the ISR to defer IRQ line enabling (not .end-ing - PROPAGATE does it). (*) Currently, there is a XN_ISR_NOENABLE bit which asks the real-time layer to defer the IRQ line - Warning! - .end-ing (and not just enabling) until later. In the common case, xnarch_end_irq() must be called by the rt_task that stands for a bottom half (and not just xnintr_enable() - this may not work on ppc). This adds a bit of confusion and we will avoid it with the (*) scheme. So this is subject to change in the future. As I pointed out in another message, the implementation for PPC is not yet clear at the moment. That's it... Ok, are there any objections as to the current patch? If no, please apply. CHANGELOG.patch is here https://mail.gna.org/public/xenomai-core/2006-02/msg00154.html -- Best regards, Dmitry Adamushko shirq-combo.patch Description: Binary data shirq-KConfig.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [arm/hal.c] possible problem (and a PIC-related observation in vanilla Linux)
Hello everybody, 1) arm/hal.c : rthal_irq_enable/disable/end() - the only arch that calls Linux's enable_irq() and disable_irq() directly. The latter ones use a spinlock with interrupts off only for the Linux domain, so that under some circumstances - the Linux domain gets preempted while in enable/disable/setup/free_irq(THIS_IRQ) by the rt-ISR (of THIS_IRQ) - a deadlock may happen. rthal_irq_end() -> enable_irq() -> spin_lock_irqsave(lock, flags) -> deadlock (*) wait.. can that only happen for inter-domain shared interrupts (THIS_IRQ must be used in both domains)? No indeed :) # cat /proc/interrupts may lead to the same deadlock! this routine is called for i = 0..NR_IRQS and used in fs/proc.c (arch dependent part, e.g. i386/kernel/irq.c) int show_interrupts(struct seq_file *p, void *v) { int i = *(loff_t *) v, j; struct irqaction * action; unsigned long flags; if (i == 0) { seq_printf(p, "           "); for_each_cpu(j) seq_printf(p, "CPU%d ", j); seq_putc(p, '\n'); } if (i < NR_IRQS) { spin_lock_irqsave(&irq_desc[i].lock, flags); --- LOCK [Oops!!!] We are preempted here by the rt-ISR that's registered for irq only in the primary domain action = irq_desc[i].action; if (!action) --- i.e. there is no ISR in the linux domain, thus - nothing to be printed out goto skip; ... skip: spin_unlock_irqrestore(&irq_desc[i].lock, flags); (*) takes place and BUM! :o) So, I guess, it's better to implement rthal_irq_enable/disable/end() the same way as the other archs do. 2) This one concerns Linux and not Xeno. [so one may skip-skip-skip it :] we know that any ISR is always called with a given local IRQ line disabled. e.g. __do_IRQ() { .ack - disables the IRQ line // all the ISRs for this IRQ line are called .end - enables it. } If an ISR wants to defer IRQ line enabling until later (e.g. a bottom half), it may call disable_irq_nosync() or even disable_irq() (with care, as specified in the comments). 
void disable_irq_nosync(unsigned int irq) { irq_desc_t *desc = irq_desc + irq; unsigned long flags; spin_lock_irqsave(&desc->lock, flags); if (!desc->depth++) { --- inc (depth) desc->status |= IRQ_DISABLED; --- DISABLED is set! desc->handler->disable(irq); } spin_unlock_irqrestore(&desc->lock, flags); } In this case, .end must _not_ re-enable the IRQ line! In order to do so, it checks for the IRQ_DISABLED flag. e.g. here is a correct implementation: i386/kernel/i8259.c static void end_8259A_irq (unsigned int irq) { if (!(irq_desc[irq].status & (IRQ_DISABLED|IRQ_INPROGRESS)) --- i.e. was disabled in the ISR && irq_desc[irq].action) enable_8259A_irq(irq); } end_8259A_irq() is almost the same as enable_8259A_irq(), with the only difference that the latter one always re-enables the IRQ line unconditionally. That's why any implementation of .end() (which re-enables the IRQ line) must always check for IRQ_DISABLED and never be the same routine as .enable! Otherwise, disable_irq_nosync() being called in an ISR works not as expected (IRQ line enabling is not deferred). e.g. arch/ppc/syslib/gt64260_pic.c struct hw_interrupt_type gt64260_pic = { .typename = "gt64260_pic", .enable = gt64260_unmask_irq, .disable = gt64260_mask_irq, .ack = gt64260_mask_irq, .end = gt64260_unmask_irq, --- the same as .enable (i.e. doesn't check for IRQ_DISABLED) }; There are a few other sub-archs that do the same. Error? yet another example : static struct hw_interrupt_type xilinx_intc = { .typename = "Xilinx Interrupt Controller", .enable = xilinx_intc_enable, .disable = xilinx_intc_disable, .ack = xilinx_intc_disable_and_ack, .end = xilinx_intc_end, --- OK! This is correct. }; but let's look at the implementation : static void xilinx_intc_end(unsigned int irq) { unsigned long mask = (0x0001 << (irq & 31)); pr_debug("end: %d\n", irq); if (!(irq_desc[irq].status & (IRQ_DISABLED | IRQ_INPROGRESS))) { intc_out_be32(intc + SIE, mask); /* ack level sensitive intr */ --- never gets executed if the ISR has called disable_irq()! 
if (irq_desc[irq].status & IRQ_LEVEL) intc_out_be32(intc + IAR, mask); } } so if the ISR has not called disable_irq() : enable + ack. -//- did call disable_irq() : only enable (e.g. enable_irq() -> handler->enable() in a bottom half) Could it be such an arch feature? It looks logically wrong to me anyway. Heh... if one is still reading these lines, thanks for your time indeed! :o) -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
For RTDM I'm now almost determined to rework the API in a way that only HANDLED/UNHANDLED (or whatever their names will be) get exported; any additional guru features will remain excluded as long as we have no clean usage policy for them. Good. Then let's go for HANDLED, UNHANDLED - we may consider them even as 2 scalar values + NOENABLE, CHAINED - additional bits. They are not encouraged to be used with shared interrupts (explained in docs + debug messages when XENO_OPT_DEBUG is on). Any ISR on a shared irq line should understand that it's just one among equals. That said, it should not do anything that may affect other ISRs and not require any special treatment (like CHAINED or NOENABLE). If it wants that indeed, then it should not declare itself as SHARED. We may come back to the topic about possible return values of ISRs a bit later, maybe having got more feedback (hm.. hopefully) on shared irq support. But the latter one is not only about enabling the line, but on some archs - about .end-ing it too (sending EOI). And to support HANDLED_NOENABLE properly, those 2 have to be decoupled, i.e. EOI should always be sent from xnintr_shirq_handler(). But the one returning HANDLED_NOENABLE is likely to leave the interrupt asserted, hence we can't EOI at this point (unless NO_ENABLE means DISABLE). I guess this is what Dmitry meant: explicitly call disable() if one or more ISRs returned NOENABLE - at least on archs where end != enable. Will this work? We could then keep on using the existing IRQ-enable API from bottom-half IRQ tasks. Almost. Let's consider the following only as another way of doing some things; I don't propose to implement it, it's just to illustrate my thoughts. So one may simply skip-skip-skip it :o) Let's keep in mind that what is behind .end-ing the IRQ line depends on the arch. Sometimes end == enable (EOI was sent at the .ack step), while in other cases end == send_EOI [+ re-enabling]. But anyway, all ISRs are running with the given IRQ line disabled. 
Supported values : HANDLED, UNHANDLED, PROPAGATE. nucleus :: xnintr_irq_handler() { ret = 0; ... for each isr in isr_list[ IRQ ] { temp = isr->handler(); if (temp > ret) ret = temp; } if (ret == PROPAGATE) { // quite dangerous with shared interrupts, be sure you understand // what you are doing! xnarch_chain_irq(irq); // will be .end-ed in the Linux domain } else { // HANDLED or UNHANDLED xnarch_end_irq(irq); } ... } ENABLE or NOENABLE is missing? Nope. Let's say we have 2 rt-ISRs : isr1 : HANDLED isr2 : HANDLED + WISH WISH == I want the IRQ line to remain disabled until later (e.g. a bottom half in a rt_task will enable it) How may isr2 express this WISH? Simply, xnarch_irq_disable/enable() should support an internal counter that allows them to be called in a nested way. So e.g. if there are 2 consecutive calls to disable_irq(), then 2 calls to enable_irq() are needed to really enable the IRQ line. This way, the WISH is only about directly calling xnarch_irq_disable() in isr2 and there is no need for ENABLE or NOENABLE flags. This way, PROPAGATE really means NOEND - the IRQ will be .end-ed in the Linux domain; while WISH==NOENABLE has nothing to do with sending EOI, but only with enabling/disabling the given IRQ line. Yes, if one ISR (or a few) defers the IRQ line enabling until later, it will affect all other ISRs = all interrupts are temporarily not accepted on this line. This scenario is possible under Linux, but should be used with even more care in a real-time system. At least, this way the HANDLED_NOENABLE case works and doesn't lead to lost interrupts on some archs. Moreover, it avoids the need for the ENABLE flag even in the non-shared interrupt case. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
On 21/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: N.B. Amongst other things, some thoughts about CHAINED with shared interrupts. On 20/02/06, Anders Blomdell [EMAIL PROTECTED] wrote: A number of questions arise: 1. What happens if one of the shared handlers leaves the interrupt asserted, returns NOENABLE|HANDLED and another returns only HANDLED? 2. What happens if one returns PROPAGATE and another returns HANDLED? Yep, each ISR may return a different value and all of them are accumulated in the s variable ( s |= intr->isr(intr); ). So the loop may end up with s which contains all of the possible bits: (e.g. isr1 - HANDLED | ENABLE isr2 - HANDLED (doesn't want the irq to be enabled) isr3 - CHAINED ) s = HANDLED | ENABLE | CHAINED; Then CHAINED will be ignored because of the following code : + if (s & XN_ISR_ENABLE) + xnarch_end_irq(irq); + else if (s & XN_ISR_CHAINED) (*) + xnarch_chain_irq(irq); Which may actually be fatal: consider one ISR intentionally not handling an IRQ but propagating it while another ISR thinks it has to re-enable it - bang! Anders' as well as my original suggestion was to ignore the ENABLE in this case. the current code in the CVS does not contain else in (*), so that ENABLE | CHAINED is possible, though it's a wrong combination. This said, we suppose that one knows what he is doing. In the case of a single ISR per line, it's not that difficult to achieve. But if there are a few ISRs, then one should analyze and take into account all possible return values of all the ISRs, as each of them may affect the others (e.g. if one returns CHAINED when another - HANDLED | ENABLE). So my feeling is that CHAINED should not be used by drivers which registered their ISRs as SHARED. That's true for standard drivers, but we should not prevent special corner-case designs which can be tuned to make even shared RT-IRQs + propagation possible. Sharing does not necessarily mean that different RT drivers are involved... Yep. 
But in this case the designer of the system should _care_ about all corners of the system, i.e. make sure that all ISRs return values which are not contradictory to each other. e.g. 1) isr1 : HANDLED (keep disabled) isr2 : CHAINED or 2) isr1 : HANDLED | NOENABLE (will be enabled by rt_task1) isr2 : HANDLED | NOENABLE (-//- by rt_task2) 1) and 2) are wrong! Both lead to the IRQ line being .end-ed (xnarch_end_irq()) twice! And on some archs, as we have seen before, that may lead to lost interrupts. Of course, when designing a real-time system, one should take charge of all corners of the system, so that analyzing the possible return values of all other ISRs that may share the same IRQ line is not anything extravagant, but a requirement. This said, I see no reason then why we should prevent users from some cases (like CHAINED | ENABLE) and let them do ones like 2). 2) is a much more common scenario for irq sharing and looks sane from the point of view of a separate ISR, when the global picture (and nucleus innards) is not taken into account. p.s. my 2) is equal to 2 ISRs returning HANDLED_NOENABLE in terms of Anders. This case is one of the most common with shared interrupts which I would imagine. And it nevertheless leads to problems when the whole picture is not taken into account by the designer of a given ISR. Moreover, I actually see only one scenario for CHAINED (I provided it before) : all ISRs in the primary domain have reported UNHANDLED = the nucleus propagates the interrupt down the pipeline with xnarch_chain_irq(). This call actually returns 1 upon successful propagation (some domain down the pipeline was interested in this irq) and 0 otherwise. But this only means that some handler is registered down there, it cannot predict if that handler will actually care about the event. Linux (not only vanilla, but ipipe-aware too) always .end-s (calls desc->handler->end(irq)) on return from __do_IRQ(), no matter whether this interrupt was handled by some ISR or not. 
It also re-enables the IRQ line if it was not disabled by some ISR explicitly. We don't care about anything else. Jan -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
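The accumulation and the ENABLE-over-CHAINED priority discussed above can be sketched in plain C. This is a toy model: the XN_ISR_* values and the ISRs below are hypothetical stand-ins, not the real nucleus constants, but the decision logic mirrors the quoted snippet.

```c
/* Hypothetical stand-ins for the XN_ISR_* flags under discussion. */
#define XN_ISR_HANDLED 0x1
#define XN_ISR_ENABLE  0x2
#define XN_ISR_CHAINED 0x4

typedef int (*isr_t)(void);

enum line_action { LINE_END, LINE_CHAIN, LINE_NOTHING };

/* Accumulate the return codes of all ISRs sharing one line, the way the
 * dispatch loop does with `s |= intr->isr(intr);`, then decide what the
 * nucleus would do with the line. Note how a single ENABLE bit from any
 * ISR makes the accumulated CHAINED bit silently ignored. */
static enum line_action dispatch(isr_t *isrs, int n)
{
    int s = 0, i;

    for (i = 0; i < n; i++)
        s |= isrs[i]();          /* each ISR may return different bits */

    if (s & XN_ISR_ENABLE)
        return LINE_END;         /* .end the line... */
    else if (s & XN_ISR_CHAINED)
        return LINE_CHAIN;       /* ...so CHAINED loses when both are set */
    return LINE_NOTHING;
}

/* Toy ISRs reproducing the example from the mail. */
static int isr1(void) { return XN_ISR_HANDLED | XN_ISR_ENABLE; }
static int isr2(void) { return XN_ISR_HANDLED; }
static int isr3(void) { return XN_ISR_CHAINED; }
```

With all three ISRs registered, `s` ends up as `HANDLED | ENABLE | CHAINED` and the line is ended rather than chained, exactly the conflict the thread is worried about.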
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
On 21/02/06, Anders Blomdell [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: N.B. Amongst other things, some thoughts about CHAINED with shared interrupts. On 20/02/06, *Anders Blomdell* [EMAIL PROTECTED] wrote: A number of questions arise: 1. What happens if one of the shared handlers leaves the interrupt asserted, returns NOENABLE|HANDLED and another returns only HANDLED? 2. What happens if one returns PROPAGATE and another returns HANDLED? Yep, each ISR may return a different value and all of them are accumulated in the s variable ( s |= intr->isr(intr); ). So the loop may end up with s which contains all of the possible bits: (e.g. isr1 -> HANDLED | ENABLE, isr2 -> HANDLED (doesn't want the irq to be enabled), isr3 -> CHAINED) => s = HANDLED | ENABLE | CHAINED; Then CHAINED will be ignored because of the following code:

+ if (s & XN_ISR_ENABLE)
+     xnarch_end_irq(irq);
+ else if (s & XN_ISR_CHAINED)    (*)
+     xnarch_chain_irq(irq);

Which is the worst way possible of prioritizing them: if a Linux interrupt is active when we get there with ENABLE|CHAINED, interrupts will be enabled with the Linux interrupt still asserted - the IRQ handlers will be called once more, probably returning ENABLE|CHAINED again - infinite loop... The current code in the CVS does not contain the else in (*), so that ENABLE | CHAINED is possible, though it's a wrong combination. This said, we suppose that one knows what he is doing. In the case of a single ISR per line, it's not that difficult to achieve. But if there are a few ISRs, then one should analyze and take into account all possible return values of all the ISRs, as each of them may affect the others (e.g. 
if one returns CHAINED when another returns HANDLED | ENABLE). Which is somewhat contrary to the concept of shared interrupts: if we have to take care of the global picture, why make them shared in the first place? (I like the concept of shared interrupts, but it is important that the framework gives a separation of concerns.) Unfortunately, it looks to me that the current picture (even with your scalar values) requires the user who develops a given ISR to take into account the possible return values of the other ISRs. As I pointed out, the situation when 2 ISRs return HANDLED_NOENABLE may lead to problems on some archs. --- ... I'll take a look at the rest of the message a bit later. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
Good point, leaves us with 2 possible return values for shared handlers: HANDLED, NOT_HANDLED; i.e. shared handlers should never defer the end'ing of the interrupt (which makes sense, since this would affect the other [shared] handlers). HANDLED_NOENABLE could be supported too. There would be no need to reinvent the wheel, just do it the way Linux does it. But it's about some additional re-designing of the current codebase (e.g. nested calling of irq_enable/disable()). I'm not sure we need it for anything other than the irq sharing code, but it affects the rest of the code. And we have a kind of wrong concept: XN_ISR_ENABLE (or NOENABLE) corresponds to xnarch_end_irq(). But the latter is not only about enabling the line; on some archs it is about .end-ing it too (sending EOI). And to support HANDLED_NOENABLE properly, those 2 have to be decoupled, i.e. EOI should always be sent from xnintr_shirq_handler(). Yes, should. And this should is best handled by a) documenting the potential conflict in the same place where the return values are described; b) placing some debug warning in the nucleus' IRQ trampoline function to bail out (once per line) when running into such a situation. But I'm against any further runtime restrictions, especially as most drivers will never return anything other than NOT_HANDLED or HANDLED. Actually, this was the reason why I tried to separate the NO_ENABLE and PROPAGATE features as *additional* bits from HANDLED and NOT_HANDLED/UNHANDLED/NOINT. But I acknowledge that having all valid bit combinations present as constants can be more helpful for the user. We just have to draw some line between the standard values and the additional guru return codes (documentation: don't use NO_ENABLE or PROPAGATE unless you understand their side-effects and pitfalls precisely). 
I agree with you on the PROPAGATE case, but NO_ENABLE that, as pointed out above, should (IMHO and at least, in theory) only mean keep the IRQ line disabled (and have nothing to do with .end-ing the IRQ line) would be better supported. But this is, again as was pointed out above, about some redesigning of the current code => some overhead that likely affects non-shared-aware code too. So on one hand, I'm ready to re-work the code with HANDLED and UNHANDLED (or NOINT) + 2 additional bits: NOENABLE and PROPAGATE, and document it like you suggested: don't use NO_ENABLE or PROPAGATE with shared interrupts unless you understand their side-effects and pitfalls precisely; on the other hand, I'd say that I'm almost ready to vote against merging the irq sharing code at all as it looks to be a rather partial solution. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
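The "nested calling of irq_enable/disable()" mentioned above is the depth-counting scheme Linux uses for disable_irq()/enable_irq(). A minimal sketch of the idea, using a hypothetical toy struct rather than the real xnintr_t:

```c
/* Depth-counted line masking: only the outermost transition touches the
 * hardware, so disable/enable pairs may nest freely. Toy model only. */
struct irq_line {
    int depth;       /* 0 == line enabled, >0 == disabled depth times */
    int hw_enabled;  /* models the actual state of the line */
};

static void line_disable(struct irq_line *l)
{
    if (l->depth++ == 0)
        l->hw_enabled = 0;   /* first disable really masks the line */
}

static void line_enable(struct irq_line *l)
{
    if (--l->depth == 0)
        l->hw_enabled = 1;   /* last enable really unmasks it */
}
```

With such a counter, an ISR and a task could each disable/enable the line independently without the inner enable prematurely unmasking it, which is the property the decoupling discussion is after.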
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
N.B. Amongst other things, some thoughts about CHAINED with shared interrupts. On 20/02/06, Anders Blomdell [EMAIL PROTECTED] wrote: A number of questions arise: 1. What happens if one of the shared handlers leaves the interrupt asserted, returns NOENABLE|HANDLED and another returns only HANDLED? 2. What happens if one returns PROPAGATE and another returns HANDLED? Yep, each ISR may return a different value and all of them are accumulated in the s variable ( s |= intr->isr(intr); ). So the loop may end up with s which contains all of the possible bits: (e.g. isr1 -> HANDLED | ENABLE, isr2 -> HANDLED (doesn't want the irq to be enabled), isr3 -> CHAINED) => s = HANDLED | ENABLE | CHAINED; Then CHAINED will be ignored because of the following code:

+ if (s & XN_ISR_ENABLE)
+     xnarch_end_irq(irq);
+ else if (s & XN_ISR_CHAINED)    (*)
+     xnarch_chain_irq(irq);

The current code in the CVS does not contain the else in (*), so that ENABLE | CHAINED is possible, though it's a wrong combination. This said, we suppose that one knows what he is doing. In the case of a single ISR per line, it's not that difficult to achieve. But if there are a few ISRs, then one should analyze and take into account all possible return values of all the ISRs, as each of them may affect the others (e.g. if one returns CHAINED when another returns HANDLED | ENABLE). So my feeling is that CHAINED should not be used by drivers which registered their ISRs as SHARED. Moreover, I actually see only one scenario for CHAINED (I provided it before): all ISRs in the primary domain have reported UNHANDLED => the nucleus propagates the interrupt down the pipeline with xnarch_chain_irq(). This call actually returns 1 upon successful propagation (some domain down the pipeline was interested in this irq) and 0 otherwise. Upon 0, this is a spurious irq (none of the domains was interested in its handling). Ok, let's suppose now: we have 2 ISRs on the same shared line: isr1 : HANDLED (will be enabled by an rt task. 
Note, the rt task must call xnarch_end_irq() and not just xnarch_enable_irq()! ), isr2 : CHAINED. So HANDLED | CHAINED is ok for a single ISR on the line, but it may lead to HANDLED | CHAINED | ENABLE in the case of a shared line. The rt task that works jointly with isr1 just calls xnarch_end_irq() at some moment in time and some ISR in the linux domain does the same later => the line is .end-ed 2 times. An ISR should never return CHAINED to indicate _only_ that it is not interested in this irq; it should return ~HANDLED or NOINT (if we'll support it) instead. If the ISR nevertheless wants to propagate the IRQ to the Linux domain _explicitly_, it _must not_ register itself as SHARED, i.e. it _must_ be the only ISR on this line, otherwise that may lead to the IRQ line being .end-ed twice (lost interrupts in some cases).

#define UNHANDLED          0
#define HANDLED_ENABLE     1
#define HANDLED_NOENABLE   2
#define PROPAGATE          3

Yep, I'd agree with you. Moreover, PROPAGATE should not be used for shared interrupts. -- Best regards, Dmitry Adamushko
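One way Anders' scalar scheme could avoid the "line .end-ed twice" problem is to let the trampoline alone combine the per-ISR results and act once. The combining policy below is a sketch of one possibility, not what any posted patch implements:

```c
/* Anders' scalar return values from the thread. */
#define UNHANDLED         0
#define HANDLED_ENABLE    1
#define HANDLED_NOENABLE  2
#define PROPAGATE         3

/* Combine the scalar results of all ISRs sharing a line into one verdict
 * for the trampoline, which then .end-s the line at most once. This is
 * an illustrative policy: propagation wins outright, one ENABLE request
 * is enough to end the line, and the line stays masked only if every
 * handler that claimed the irq asked to defer. */
static int combine(const int *ret, int n)
{
    int out = UNHANDLED, i;

    for (i = 0; i < n; i++) {
        if (ret[i] == PROPAGATE)
            return PROPAGATE;
        if (ret[i] == HANDLED_ENABLE)
            out = HANDLED_ENABLE;
        else if (ret[i] == HANDLED_NOENABLE && out == UNHANDLED)
            out = HANDLED_NOENABLE;
    }
    return out;
}
```

Under this policy the problematic case 2) from the mail (two ISRs both returning HANDLED_NOENABLE) yields a single HANDLED_NOENABLE verdict, so the trampoline never ends the line and the double .end cannot happen there; it would instead have to be coordinated between the two rt tasks.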
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
Hi Jan, let's make yet another revision of the bits: new XN_ISR_HANDLED == old XN_ISR_HANDLED + old XN_ISR_NO_ENABLE; ok. new XN_ISR_NOENABLE == ~ old XN_ISR_ENABLE; ok. new XN_ISR_PROPAGATE == XN_ISR_CHAINED; ok. new XN_ISR_NOINT == ? Does it suppose the interrupt line to be .end-ed (enabled) and the irq not to be propagated? Should be so, I guess, if it's different from 5). Then the nucleus ignores implicit IRQ enable for 5) as well as for 3). Do we really need that NOINT then, as it seems to be the same as ~HANDLED? Or NOINT == 0 and then it's a scalar value, not a bit. So one may consider HANDLED == 1 and NOINT == 0 as really scalar values, and NOENABLE and PROPAGATE as additional bits (used only if needed). -- Best regards, Dmitry Adamushko
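The closing proposal - HANDLED/NOINT as scalar values plus two optional modifier bits - could be laid out as below. The numeric values and helper names are illustrative assumptions; the real constants live in the nucleus headers:

```c
/* Scalar part: exactly one of these per ISR return. */
#define XN_ISR_NOINT     0x0   /* not my interrupt */
#define XN_ISR_HANDLED   0x1   /* handled; line .end-ed by the nucleus */

/* Optional modifier bits, for "guru" use only. */
#define XN_ISR_NOENABLE  0x2   /* don't .end the line, the caller will */
#define XN_ISR_PROPAGATE 0x4   /* chain the irq down the pipeline */

/* Hypothetical helpers a trampoline could use to decode a status. */
static int irq_handled(int s)   { return (s & XN_ISR_HANDLED) != 0; }
static int irq_end_line(int s)  { return (s & XN_ISR_HANDLED) && !(s & XN_ISR_NOENABLE); }
static int irq_propagate(int s) { return (s & XN_ISR_PROPAGATE) != 0; }
```

This encodes the observation from the mail directly: NOINT is simply the absence of the HANDLED bit (== 0), so no separate flag is needed, while NOENABLE and PROPAGATE remain orthogonal modifiers.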
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
On 16/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 16/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 16/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Hmm, I still find XN_ISR_NOINT in the patch. Shouldn't we solve this before merging (i.e. make XN_ISR_HANDLED non-zero)? Ok, XN_ISR_NOINT is removed in favor of XN_ISR_HANDLED which is now non-zero. Actually, at first I wanted to make it just the other way about. Hmm, thinking about this again, I would like to revert my suggestion in favour of a third approach (trying to keep you busy ;) ). Ok, you are welcome :) I didn't get it, at least while looking briefly. To make it a bit easier, at least for me, let's go another way. At the left is the list of allowed values as Philippe pointed out. At the right is another list which corresponds to the left one but when NOINT is used instead of HANDLED. Moreover, I have added another case - pure NOINT. The ISR says it's not mine and, well, it doesn't know whether it should be propagated or not (ok, so far CHAINED stood for NOINT).

1.) HANDLED           -> 0
2.) HANDLED | ENABLE  -> ENABLE
3.) HANDLED | CHAINED -> CHAINED
4.) CHAINED           -> CHAINED | NOINT
5.)                   -> NOINT

Could you provide the 3-d corresponding table using your new flags? Still not 3D, but hopefully clarifying: :)

1.) XN_ISR_HANDLED | XN_ISR_NO_ENABLE
2.) XN_ISR_HANDLED
3.) XN_ISR_HANDLED | XN_ISR_PROPAGATE (nucleus ignores implicit IRQ enable)
4.) XN_ISR_NOINT | XN_ISR_PROPAGATE
5.) XN_ISR_NOINT

Then why do we need XN_ISR_NOINT? Is it not just the same as ~HANDLED? Moreover, I tend to think now that PROPAGATE (CHAINED) should not normally be used by ISRs at all. The only exception would be irq sharing between RT and non-RT domains, but that's of no concern for most rt drivers and, as recent events revealed, it's only very partially supported at the moment and possible implementations are quite arguable. So HANDLED means the irq line is .end-ed by xnintr_irq_handler() (xnarch_end_irq(irq) is called). 
If one wants to do it on his own later, NOENABLE can be returned additionally. But then, one must use xnintr_end_irq() and not just xnintr_irq_enable(). If xnintr_end_irq() would always be the same as xnintr_irq_enable(), or we would decouple the ending and enabling operations on the line (don't have the sources at hand), we could go the Linux way and make use of a nesting counter in xnintr_t, so that xnintr_irq_disable() could be called in a nested way. This said, calling xnintr_irq_disable() in an ISR leads to the interrupt line being still disabled upon return from xnintr_irq_handler() and re-enabled again by the user later. ~HANDLED means the ISR was not concerned about this interrupt, i.e.: - there is an ISR in another domain (down the pipeline) which is interested in this irq; well, the designer of the system should be aware of such cases and probably avoid that on the hw layer or, at least, know what she is doing; - this is a spurious irq. So upon getting ~HANDLED, xnintr_irq_handler() does:

success = xnintr_irq_chained(irq);
if (success) {
        /* i.e. another domain is really interested and got this irq as pending */
        return without .end-ing;
} else {
        /* this is a spurious irq */
        make use of some accounting of spurious interrupts (if we need it),
        i.e. keep the irq line disabled if the % of unhandled requests exceeds some limit;
        otherwise xnintr_end_irq() and return;
}

Too long :) Ok, it's quite similar indeed to what you have suggested, with the exception that: - no need for NOINT (if it's really == ~HANDLED); - the nucleus provides some additional logic while handling the ~HANDLED case; - the user doesn't need to return ENABLE explicitly - in most cases, that's what he actually needs but sometimes just forgets (I recall a few questions on the mailing list about only one interrupt. Jan -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
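The ~HANDLED path outlined in that pseudocode can be written out with stubbed stand-ins for the chaining and ending calls. Everything here - the stub names, the counters, and the 50% threshold - is illustrative, not nucleus code:

```c
/* Toy model of the ~HANDLED path: first try to chain the irq to a lower
 * domain; if nobody takes it, count it as spurious and keep the line
 * masked once too many unhandled requests have accumulated. */
static int chained_taken;              /* does a lower domain want it? */
static int total_irqs, unhandled_irqs; /* illustrative accounting */
static int line_ended;                 /* models xnintr_end_irq() firing */

static int  xnintr_irq_chained_stub(void) { return chained_taken; }
static void xnintr_end_irq_stub(void)     { line_ended = 1; }

static void handle_unhandled(void)
{
    total_irqs++;
    if (xnintr_irq_chained_stub())
        return;                        /* propagated: no .end here */

    unhandled_irqs++;
    if (unhandled_irqs * 2 > total_irqs)
        return;                        /* >50% spurious: keep line masked */

    xnintr_end_irq_stub();             /* otherwise re-end the line */
}
```

The interesting property is visible in the two branches: a successfully chained irq is never .end-ed by the primary domain (the lower domain's flow does that), while a genuinely spurious irq is re-ended only until the spurious rate trips the threshold.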
Re: [Xenomai-core] Re: [PATCH] Shared interrupts (ready to merge)
On 16/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 16/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Hmm, I still find XN_ISR_NOINT in the patch. Shouldn't we solve this before merging (i.e. make XN_ISR_HANDLED non-zero)? Ok, XN_ISR_NOINT is removed in favor of XN_ISR_HANDLED which is now non-zero. Actually, at first I wanted to make it just the other way about. Hmm, thinking about this again, I would like to revert my suggestion in favour of a third approach (trying to keep you busy ;) ). Ok, you are welcome :) I didn't get it, at least while looking briefly. To make it a bit easier, at least for me, let's go another way. At the left is the list of allowed values as Philippe pointed out. At the right is another list which corresponds to the left one but when NOINT is used instead of HANDLED. Moreover, I have added another case - pure NOINT. The ISR says it's not mine and, well, it doesn't know whether it should be propagated or not (ok, so far CHAINED stood for NOINT).

HANDLED           -> 0
HANDLED | ENABLE  -> ENABLE
HANDLED | CHAINED -> CHAINED
CHAINED           -> CHAINED | NOINT
                  -> NOINT

Could you provide the 3-d corresponding table using your new flags? xnintr_t contains just the link char *name as you suggested, but RT_INTR contains the name[XNOBJECT_NAME_LEN] member (as it was before actually). If it's ok, then I'll send um... yet another final patch that addresses both issues :) I'm fine with this as well if you prefer it, no problem. Yep, let's go this way. Jan -- Best regards, Dmitry Adamushko
[Xenomai-core] [PATCH] Shared interrupts (ready to merge)
Hello everybody, being inspired by the successful results of tests conducted recently by Jan's team, I'm presenting the final (yep, yet another final :) combo-patch. The shirq support is now optional, so that CONFIG_XENO_OPT_SHIRQ_LEVEL enables shirq for level-triggered interrupts, and CONFIG_XENO_OPT_SHIRQ_EDGE does the same for edge-triggered ones. I addressed all the remarks and now, IMHO, it's (hopefully) ready for merge. -- Best regards, Dmitry Adamushko shirq-combo.patch Description: Binary data shirq-KConfig.patch Description: Binary data ChangeLog.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] Re: [shirq] first test results
On 13/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: On 11/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: And as an additional option, it could be interesting to print out to the log, if not all counter values, then min/max/average (the same as for the latency :) per second or per 1000 interrupts, so as to see whether tweaking MAX_EDGEIRQ_COUNTER may make things better. Yes, maybe it's too small. But this also depends on the absolute time required for so many loops, something we should see in the traces then. I'm afraid that we will finally have to move the UART's read-from-fifo to task context to reduce the time spent in IRQ context. Good. Keep me informed as before. We got it working. :) Great! Anyway, this works now. But we also found a bug, as usual, a clean-up bug: you released the IRQ line based on the wrong test, see attached patch. Ah... it worked for my test cases since it depended on the order of xnintr_detach() calls. Thanks. With this fix applied and the IRQ flags as well as the /proc/xenomai/irq output cleaned up, I would say your great work is ready for merge! So I'll create the final patch and submit it to Philippe for merging. One more thing is making all this stuff optional: XENO_OPT_SHIRQ and XENO_OPT_SHIRQ_EDGE or something like this. Thanks, Jan -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] Re: More on Shared interrupts
irq K | ---        | ---o |    // Linux only
irq L | ---o       |      |    // RT-only
irq M | ---o---    | ---o |    // Shared between domains
irq N | ---o---o--- |     |    // Shared inside a single domain
irq O | ---o---o--- | ---o |   // Shared between and inside a single domain

Xenomai currently handles the K & L cases, Dmitry's patch addresses the N case; with edge-triggered interrupts the M (and O, after Dmitry's patch) case(s) might be handled by returning RT_INTR_CHAINED | RT_INTR_ENABLE. As you pointed out recently, using this combo for M (and thus O) might also be unsafe, e.g. causing some implementations to send EOI twice or more (and the second time while hw IRQs are off and the second IRQ is still pending) if more than a single domain ends the current interrupt. This said, I've never tried that actually, but it does seem a bit optimistic to always expect proper behaviour in this case (basically, it all depends on what ending the interrupt means hw-wise). Just to be sure I've got things more or less clear. I feel there are some holes in my understanding of the irq handling stuff, mainly hw-wise. x86 + i8259A. In this case, the .end handler does nothing more than enabling the IRQ line. The EOI is sent in .ack right after disabling the IRQ line. Let's suppose we have the M case (O makes no difference here). When the ISR handler in the Linux domain gets control: - the IRQ line is ON (it was already .end-ed in the primary domain in the case of XN_ISR_ENABLE | XN_ISR_CHAINED). === Actually, it's possible to set the IPIPE_PASS flag for primary_domain.irqs[THIS_IRQ].control when THIS_IRQ is shared across domains. This way, there is no need to call propagate_irq() in xnintr_irq_handler(). Ok, it's not that important so far. === This is not how it happens when the IRQ is only handled in the Linux domain. But the ISR is still serialized (need not be reentrant) because do_IRQ() takes care of it (by means of the RUNNING and PENDING flags). 
So, I suppose, it's possible to use XN_ISR_ENABLE | XN_ISR_CHAINED in that case? In the case of pure Linux SMP: I/O APIC + local APICs. As far as I know, enabling/disabling the IRQ line occurs (normally?) on a per-local-APIC basis. This said, when the ISR is running on CPU_0 with the IRQ line disabled, nothing prevents (probably, provided the IRQ was also .ack-ed by that time?) the same IRQ from being signaled on another CPU (even on a few). In that case, it gets marked as PENDING, and do_IRQ() on CPU_0 will run handle_IRQ_event() - the ISR handlers - once more. Since PENDING is only a flag (not a counter), the fact that the IRQ was signaled is not lost, but Linux can't know how many times it was signaled. I guess this is not true for devices that e.g. do not raise a new interrupt until the ISR handler performs some special device-wise acknowledgement. Am I right? -- Best regards, Dmitry Adamushko
[Xenomai-core] Re: [shirq] first test results
On 11/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: And as an additional option, it could be interesting to print to the log, if not all counter values, then min/max/average (the same as for the latency :) per second or per 1000 interrupts, so as to see whether tweaking MAX_EDGEIRQ_COUNTER makes things better. Yes, maybe it's too small. But this also depends on the absolute time required for so many loops, something we should see in the traces then. I'm afraid that we will finally have to move the UART's read-from-fifo to task context to reduce the time spent in IRQ context. Good. Keep me informed as before. Maybe we'd better go for 1, just in case somebody would like to know the actual number of virq occurrences. Although, I can imagine that only for debugging reasons and of no avail to normal users, so maybe 2 :) I have no final opinion yet. Having some names behind the virtual IRQs would also be really nice for debugging, but it would break another bulk of API calls. Just marking those IRQs virtual would render the related output of /proc/xenomai/irq not very useful in my eyes - as long as you do not know what's behind it. Just dropping this information is then likely better. Yep. Let's just avoid printing it. Then I stumbled over the xnintr structure. Why do you keep a copy of the device name? A const char * should be enough; we just have to demand that it remain valid as long as the xnintr structure itself (i.e. while the IRQ is attached). Saves a few bytes. :) Could be, if all its users are initially kernel-mode entities. But e.g.
there is a version of rt_intr_create() that may be called by user-space apps, and I extended it to support the name parameter too: int rt_intr_create(RT_INTR *intr, const char *name, unsigned irq, int mode); In such a case, the kernel-side rt_intr_create() is called with the name allocated on the stack (in skins/native/syscall.c : __rt_intr_create()), so the initial copy of the name cannot remain valid. But you could create buffer space for a copy of the name in the same block already allocated for RT_INTR. Should be straightforward. RT_INTR does not have the name field any more. I removed it after adding the same field to xnintr_t, as it became redundant. To access the name, any code in the native skin does intr->intr_base.name (intr_base is of xnintr_t type). Jan -- Best regards, Dmitry Adamushko
[Xenomai-core] Benchmarks
Hi there, after a preliminary discussion with Philippe and, well, a few days later than I expected, I'm starting a new effort of writing some simple (i.e. not too complex :) yet, hopefully, useful benchmarking utilities. The idea of each utility is to emulate a certain use case, but at a level significant enough to prove that the system is (or is not) working properly latency-wise. This, hopefully, will help to determine some bottlenecks and the parts of code that need to be reworked/tweaked. Then we may use such tests on a release-by-release basis as indicators of either progress or regress we are making with a certain release. As an example, the first utility would implement the following use case (based on the latency program): - a given number of periodic threads are running; - configurable periods (so that e.g. a few threads can become active at the same moment of time). Actually, that's what we would need. - timer: periodic or aperiodic; ... The utility will likely not produce any screen output during its run, but rather comprehensive statistics in a handy form after finishing. --- Other utils could make use of some scenarios where synch. primitives/rt_queues/pipes are heavily used, etc. I guess Xenomai already provides a solid amount of functionality, and it's quite stable for the time being. So it's time to work on optimizing it! Everyone is welcome to come up with any scenarios on which such utilities could be based. Any comments on the one with a given number of threads are welcome too. -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [Combo-PATCH] Shared interrupts (final)
On 08/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: I still prefer configuration options as they also allow to reduce the overall code size (less cache refills and TLB misses). And shared interrupts are for x86 only (approximately), I think. Ok, that's a good argument. Then make the whole IRQ-sharing stuff compile-time configurable and see how much we can save. Anyway, I agree that code which is supposed to be used by only a fraction of users (is Jan the only one interested so far? and esp. in that brain-damaged edge-triggered stuff) and which is a bit too heavy to guarantee close-to-zero overhead should be made optional. Ok, let's go for it upon getting the test results. Enclosed a small optimization to reduce the code in the (optional in future) ISR. Jan -- Best regards, Dmitry Adamushko shirq-v9.patch Description: Binary data
[Xenomai-core] [Combo-PATCH] Shared interrupts (final)
Hi, this is the final set of patches against the SVN trunk of 2006-02-03. It addresses mostly remarks concerning naming (XN_ISR_ISA -> XN_ISR_EDGE), a few cleanups, and updated comments. Functionally, it adds the support for shared interrupts (a few flags) to the rtdm (Jan's patch) and native skins. In the latter case, rt_intr_create() now takes a sixth argument, namely int mode. Now I'm waiting for the test results from Jan (the previous patch set remains suitable for testing too, in case you are using it already). Upon success, the new code is ready for merging. The patches have to be applied as follows: - shirq-base - shirq-v8 - shirq-proc - shirq-edge - shirq-ext Happy testing ! :) -- Best regards, Dmitry Adamushko shirq-base.patch Description: Binary data shirq-v8.patch Description: Binary data shirq-proc.patch Description: Binary data shirq-edge.patch Description: Binary data shirq-ext.patch Description: Binary data
Re: [Xenomai-core] [Combo-PATCH] Shared interrupts (final)
On 07/02/06, Wolfgang Grandegger [EMAIL PROTECTED] wrote: Hello, [skip-skip-skip] Happy testing ! :) My concern is code size. I see that the patches add a substantial amount of code to the ISR. What about making this feature configurable? Yes, I thought about making the new stuff optionally configurable. But another option would be to leave xnintr_irq_handler() as it was before and to introduce a separate handler for the shared interrupts. Edge-triggered interrupts already have a distinct handler (xnintr_edgeirq_handler()). I guess the additional code in xnintr_attach/detach() and a few (~1 for UP and ~2 for SMP) KBs of data are not that critical, taking into account where and how they are used, both latency- and cache-wise. Wolfgang. -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [Combo-PATCH] Shared interrupts (base, /proc support, edge-triggered stuff)
On 02/02/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: Hi there, as I promised, here go the following patches (ordered as they have to be applied, one by one): 1) shirq-base.patch Adds the name field to the interrupt object of the nucleus layer. Reworks the related bits of the native skin (+ a few minor changes for posix and rtdm) to support the named interrupt objects. 2) shirq-v7.patch Generic support for shared interrupts. 3) shirq-proc.patch /proc support. Now /proc/xenomai/irq shows the names of handlers. The related code has been moved from the hal layer to the nucleus. - 4) shirq-isa.patch Trying to handle the edge-triggered stuff. This is a very preliminary version, so the only thing I promise so far is that it compiles successfully. Why not name it XN_ISR_EDGE or so (also the related functions etc.)? Although it mostly targets ISA hardware, it's about the edge-triggered characteristic of those devices after all. Ack. I forgot to apply your earlier fixes, so that will be done too. The functionality added by the first 3 patches seems to be working. -- Best regards, Dmitry Adamushko I'm now trying to organise a full test scenario with shared UART ISA-IRQs at our institute. It may take till the beginning of next week; some regressions in our own software need to be fixed first. That's good. Thanks. Keep me informed on progress. Jan -- Best regards, Dmitry Adamushko
[Xenomai-core] [Combo-PATCH] Shared interrupts (base, /proc support, edge-triggered stuff)
Hi there, as I promised, here go the following patches (ordered as they have to be applied, one by one): 1) shirq-base.patch Adds the name field to the interrupt object of the nucleus layer. Reworks the related bits of the native skin (+ a few minor changes for posix and rtdm) to support the named interrupt objects. 2) shirq-v7.patch Generic support for shared interrupts. 3) shirq-proc.patch /proc support. Now /proc/xenomai/irq shows the names of handlers. The related code has been moved from the hal layer to the nucleus. - 4) shirq-isa.patch Trying to handle the edge-triggered stuff. This is a very preliminary version, so the only thing I promise so far is that it compiles successfully. The functionality added by the first 3 patches seems to be working. -- Best regards, Dmitry Adamushko shirq-base.patch Description: Binary data shirq-v7.patch Description: Binary data shirq-proc.patch Description: Binary data shirq-isa.patch Description: Binary data
[Xenomai-core] [PATCH] Shared irqs v.6
Hi, the previous version contained an ugly bug, namely a misuse of the test_and_set/clear_bit interface that led to a broken system from time to time. -- Best regards, Dmitry Adamushko shirq-v6.patch Description: Binary data
Re: [Xenomai-core] [PATCH] Shared irqs v.6
On 31/01/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: Hi, the previous version contained an ugly bug, namely a misuse of the test_and_set/clear_bit interface that led to a broken system from time to time. What about the ISA/edge-triggered stuff? :D My problem is that we cannot test here on real hardware because all our shared IRQs reside on PC104 cards. Could you provide me with some real piece of such code as an example? Anyway, design-wise, it's about introducing a separate xnintr_handler() for such a special case, to avoid getting a mixture with the rest of the sane code. I mean that the support of shared interrupts for ISA boards (edge-triggered stuff) is a kind of emulation to overcome the shortcomings of the initial design at the hardware level. The hardware was just not designed to support shared interrupt channels. So let's keep it a bit aside from the other code :o) xnintr_attach() will check for a special flag like XN_ISR_ISASHARED and attach e.g. xnintr_isairq_handler() instead of xnintr_irq_handler(). I'll post an extension to rt_intr_create(..., const char *name, ...) tomorrow - made but not tested yet - as well as the /proc support. Jan -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
... I have not checked it yet, but my presupposition was that something as easy as: preempt_disable(); wake_up_interruptible_sync(); schedule(); preempt_enable(); It's a no-go: scheduling while atomic. One of my first attempts to solve it. My fault. I meant the way preempt_schedule() and preempt_schedule_irq() call schedule() while being non-preemptible. To this end, PREEMPT_ACTIVE is set up. The use of preempt_enable/disable() here is wrong. The only way to enter schedule() without being preemptible is via PREEMPT_ACTIVE. But the effect of that flag should be well-known now. Kind of a Gordian knot. :( Maybe I have missed something, so just for my curiosity: what does prevent the use of PREEMPT_ACTIVE here? We don't get a scheduling while atomic message, as it seems to be a legal way to call schedule() with that flag set up. It could work... err.. and don't blame me if not, it's someone else who has written that nonsense :o) -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [BUG] racy xnshadow_harden under CONFIG_PREEMPT
On 30/01/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: ... I have not checked it yet, but my presupposition was that something as easy as: preempt_disable(); wake_up_interruptible_sync(); schedule(); preempt_enable(); It's a no-go: scheduling while atomic. One of my first attempts to solve it. My fault. I meant the way preempt_schedule() and preempt_schedule_irq() call schedule() while being non-preemptible. To this end, PREEMPT_ACTIVE is set up. The use of preempt_enable/disable() here is wrong. The only way to enter schedule() without being preemptible is via PREEMPT_ACTIVE. But the effect of that flag should be well-known now. Kind of a Gordian knot. :( Maybe I have missed something, so just for my curiosity: what does prevent the use of PREEMPT_ACTIVE here? We don't get a scheduling while atomic message, as it seems to be a legal way to call schedule() with that flag set up. When PREEMPT_ACTIVE is set, the task gets /preempted/ but not removed from the run queue - independent of its current status. Err... that's exactly the reason I explained in my first mail in this thread :) Blah.. I wish I was smoking something special before, so I could point to that as the reason for my forgetfulness. Actually, we could indeed use PREEMPT_ACTIVE + something else (probably another flag) to distinguish between the case when PREEMPT_ACTIVE is set by Linux and the case when it's set by xnshadow_harden():

xnshadow_harden()
{
    struct task_struct *this_task = current;
    ...
    xnthread_t *thread = xnshadow_thread(this_task);

    if (!thread)
        return;
    ...
    gk->thread = thread;
+   add_preempt_count(PREEMPT_ACTIVE); /* should be checked in schedule() */
+   xnthread_set_flags(thread, XNATOMIC_TRANSIT);
    set_current_state(TASK_INTERRUPTIBLE);
    wake_up_interruptible_sync(&gk->waitq);
+   schedule();
+   sub_preempt_count(PREEMPT_ACTIVE);
    ...
}

Then, something like the following code should be called from schedule():

void ipipe_transit_cleanup(struct task_struct *task, runqueue_t *rq)
{
    xnthread_t *thread = xnshadow_thread(task);

    if (!thread)
        return;

    if (xnthread_test_flags(thread, XNATOMIC_TRANSIT)) {
        xnthread_clear_flags(thread, XNATOMIC_TRANSIT);
        deactivate_task(task, rq);
    }
}

- sched.c:

    ...
    switch_count = &prev->nivcsw;
    if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
        switch_count = &prev->nvcsw;
        if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
                     unlikely(signal_pending(prev))))
            prev->state = TASK_RUNNING;
        else {
            if (prev->state == TASK_UNINTERRUPTIBLE)
                rq->nr_uninterruptible++;
            deactivate_task(prev, rq);
        }
    }
    /* removes a task from the active queue in the PREEMPT_ACTIVE +
       XNATOMIC_TRANSIT case */
+   #ifdef CONFIG_IPIPE
+   ipipe_transit_cleanup(prev, rq);
+   #endif /* CONFIG_IPIPE */
    ...

Not very graceful maybe, but it could work - or am I missing something important? -- Best regards, Dmitry Adamushko
[Xenomai-core] Re: [PATCH] Shared irqs v.5
On 23/01/06, Jan Kiszka [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: Hello Jan, as I promised earlier today, here is the patch. I finally had a look at your patch (not yet a try), and it looks really nice and light-weight. I have another version here at hand. The only difference is that xnintr_irq_handler() handles all interrupts and distinguishes the timer interrupt via irq == XNARCH_TIMER_IRQ to handle it appropriately. This way, the i-cache is, hopefully, used a bit more effectively. But it doesn't make a big difference in other parts of the code, so you may start testing with the one I posted earlier. Now I only have two topics on my wish list: o Handling of edge-triggered IRQs (ISA cards...). As I tried to explain, in this case we have to run the IRQ handler list until the full list completes without any handler reporting XN_ISR_ENABLE back. Then and only then are we safe to not stall the IRQ line. See e.g. serial8250_interrupt() in linux/drivers/serial/8250.c for a per-driver solution and [1] for some discussion on sharing IRQs (be warned, it's from the evil side ;) ). Ok. E.g. we may introduce another flag to handle such a special case. Something along the lines of XN_ISR_EDGEIRQ and maybe a separate xnintr_edgeirq_handler() (xnintr_attach() will set it up properly), so as to avoid interfering with other code. No big overhead, I guess. serial8250_interrupt() defines a maximum number of iterations, so we should do the same (?) to avoid brain-damaged cases. On our systems we already have two of those use cases: the xeno_16550A handling up to 4 devices on the same IRQ on an ISA card (I don't want to know what worst-case latency can be caused here...) and our SJA-1000 CAN driver for a PC/104 ISA card with 2 controllers on the same interrupt line. So a common solution would reduce the code size and potential bug sources. o Store and display (/proc) the driver name(s) registered on an IRQ line somewhere (ipipe?). This is just a nice-to-have.
I introduced the RTDM API with the required argument in place; it would be great if we can use this some day. Yes, the proper /proc extension should be available. Actually, the native skin can't be extended to support the shared interrupts only by adding a new flag. The problem is the way /proc/xenomai/registry/interrupts is implemented there (and I assume any other skin follows the same way). The rt_registry object is created per RT_INTR structure and, hence, per xnintr_t. I'd see the following scheme: either /proc/xenomai/interrupts lists all interrupt objects registered at the nucleus layer (xnintr_t should have a name field), e.g. IRQ N - drivers: 3 - driver1 ... 5 - driver2, driver3, and the skin presents per-object information as ll /proc/xenomai/registry/interrupts: driver1 driver2 driver3; each of those files contains the same information as now. To achieve this, 1) xnintr_t should be extended with a name field; 2) rt_intr_create() should take a name argument and not use auto-generation (as irqN) any more. Or: ll /proc/xenomai/registry/interrupts: 3 5. Those are directories, and e.g. ll /proc/xenomai/registry/interrupts/5: driver2 driver3. Those are files and contain the same information as now. This is harder to implement since the registry interface would have to be extended (for each skin). ... Jan PS: Still at home? Yes. This week I'm going to Belgium to attend a few meetings with some customers of my potential employer. So my next step for the nearest future will be finally determined there :) How many degrees Centigrade? I guess our current -9°C here in Hannover must appear ridiculous, almost subtropically warm to you. ;) Hey, I'm not from Siberia :o) This is a kind of common delusion, I guess, as the whole former USSR is associated with cold winters, bears, eek.. the KGB etc. :o) From wikipedia.com (about Belarus): The climate ranges from harsh winters (average January temperatures are in the range −8°C to −2°C) to cool and moist summers (average temperature 15°C to 20°C).
Actually, in recent days it was very cold - even about -30°C. This happens from time to time but very rarely (once in a few years or so), and it's not considered something normal here. E.g. schools were closed the last few days when the temperature was below -25. Actually, the weather has been getting crazy in recent years, and not only here :) [1] http://www.microsoft.com/whdc/system/sysperf/apic.mspx -- Best regards, Dmitry Adamushko
Re: [Xenomai-core] [PATCH] shared irqs v.3
Hi Jan, As lighter may mean that reducing the structure size also reduces the number of used cache lines, it might be a good idea. The additional complexity for entry removal is negligible. My current working version is already lighter when it comes to the size of additional data structures. It's implemented via a one-way linked list instead of xnqueue_t. This way, it's 3 times lighter for UP and 2 times for SMP systems. I'll try to post it later today. The only problem remaining is the compilation issues, so I should fix them before posting; namely, it looks like some code in ksrc/nucleus (e.g. intr.c) is used for compiling both kernel-mode code (of course) and user-mode code (maybe for UVM, though I haven't looked at it thoroughly yet). The link to ksrc/nucleus is created in the src/ directory. Both the IPIPE_NR_IRQS macro and the rthal_critical_enter/exit() calls are undefined when intr.c is compiled for the user-mode side. That's why it so far contains those __IPIPE_NR_IRQS and external int rthal_critical_enter/exit() definitions. I hope that also answers your other question later in this mail. Believe it or not, I have considered different ways to guarantee that a passed cookie param is valid (xnintr_detach() has not deleted it) and remains so while xnintr_irq_handler() is running. And there are some obstacles there... I'll post them later if someone is interested, since I'm short of time now :) ... I'm interested... Ok. So I will have at least one reader :) Actually, I still hope to find some solution so as to make use of the recently extended ipipe interface as it was supposed to be used (then there is no need for any per-irq xnshirqs array in intr.c). Otherwise, I have to admit that my recent work on that ipipe extension (I can say that since I made it) is of no big avail. Maybe together we will find a solution. That code is compiled for the user-mode side also and the originals are not available.
So consider it a temp solution for test purposes; I guess it's easily fixable. test/shirq.c is a test module. SHIRQ_VECTOR must be one used by Linux; e.g. I have used 12, which is used by the trackball device. I haven't tried your code yet, but in the preparation of a real scenario I stumbled over a problem in my serial driver regarding IRQ sharing: in case you want to use xeno_16550A for ISA devices with shared IRQs, an iteration over the list of registered handlers would be required /until/ no device reported that it handled something. This is required so that the IRQ line gets released for a while and the system obtains a chance to detect a new /edge/-triggered IRQ - an ISA oddity. That's the way most serial drivers work, but they do it internally. So the question arose for me whether this edge-specific handling shouldn't be moved to the nucleus as well (so that I don't have to fix my 16550A ;)). Brrr... frankly speaking, I haven't understood it clearly, so I don't want to make pure speculations. Probably I have to take a look at the xeno_16550A driver keeping your words in mind. Another optimisation idea, which I once also realised in my own shared IRQ wrapper, is to use specialised trampolines at the nucleus level, i.e. to not apply the full sharing logic with its locking and list iterations for non-shared IRQs. What do you think? Worth it? Might be, when the ISA/edge handling adds further otherwise unneeded overhead. Yep, maybe. But let's get something working first.. Jan -- Best regards, Dmitry Adamushko
[Xenomai-core] [PATCH] Shared irqs v.5
Hello Jan, as I promised earlier today, here is the patch. hehe.. more comments later together with other explanations I promised. I have to go now since I have to make a trip and my bus is leaving in 45 minutes :) -- Best regards,Dmitry Adamushko shirq-v4.patch Description: Binary data
[Xenomai-core] Re: nucleus:shared irq, draft v.2
On 09/01/06, Gilles Chanteperdrix [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: Hi everybody, I'm presenting it one last time as a draft, so it's the right time to give all the remarks concerning design/implementation issues for all interested parties. Any feedback is welcome. Maybe I missed some recent changes in Xenomai HAL or Adeos, in which case, please forget my remark, but calling xnarch_hook_irq, i.e. rthal_irq_request, when nklock is locked, irqs off, may lead to deadlock on multi-processor machines. The situation is (or used to be) as follows: CPU1 holds the nklock, calls rthal_irq_request, which calls ipipe_critical_enter, which: triggers the critical IPI, spins, waiting for other CPUs to enter __ipipe_do_critical_sync. CPU2 spins on nklock, irqs off, and never handles the critical IPI, because irqs are off. Nope, your remark is actually right on the mark. I had some doubts regarding the use of nklock here but, frankly, I didn't even consider that possible deadlock. Thanks for the hint. I only skimmed over the discussion about ipipe_virtualize_irq_from. I do not know if it finally flushes all pending interrupts on all CPUs when called with a NULL handler. But if it does, the call to xnintr_shirq_spin seems redundant in xnintr_detach. Well, the problem is that the shirq->handlers list, i.e. the shirq object itself, may be in use at the moment when xnintr_detach() is called. That said, the shirq object as well as all the elements of shirq->handlers (those that have been removed by xnintr_detach() from the list) must remain valid as long as there is an ISR handler for the given irq running (i.e. xnintr_irq_handler()). To sum it up, xnintr_shirq_spin() ensures that: o shirq is deleted only when xnintr_irq_handler() for the given irq has completed; o even if there is no need to delete the shirq object (there are other intr objects in the list), xnintr_detach() waits until the handler has completed, thus keeping intr->link valid. 
Ok, maybe my explanation is a bit confusing, but the direct analogy is the approach used in Linux, namely the synchronize_irq() call (take a look at how it's used in free_irq()). This is a kind of RCU: the element is removed from the list but it's completely freed only when all the reading clients have completed (in our case, xnintr_irq_handler(); in Linux, handle_IRQ_event()). -- Gilles Chanteperdrix. -- Best regards, Dmitry Adamushko
[Xenomai-core] Re: nucleus:shared irq, draft v.2
Looking forward to seeing this in Xenomai - or at least letting some real tests run with it. :) So am I. Heh.. :) Jan -- Best regards, Dmitry Adamushko
[Xenomai-core] Re: [PATCH, RFC] nucleus:shared irq and possible ipipe problem
On 05/01/06, Philippe Gerum [EMAIL PROTECTED] wrote: Dmitry Adamushko wrote: err... I have to organize some issues over the next few days so I'll be away from my notebook. Then I'll finalize the ipipe-irq-related patch so as to be ready for the .01 version. Meanwhile, I have finished merging the shared IRQ infrastructure into the Adeos codebase for all supported archs. The Xenomai trunk/ has been upgraded accordingly too. Thanks, and sorry for additionally taking your time on some work I should have completed on my own. Actually this infrastructure is by no means a _must_ for introducing the shared irq feature on the nucleus layer. I doubt that we may gain any noticeable benefits, latency-wise, from having a one-layer-less irq processing chain. Just hope there is no regression. So, my hope is rather that the new interface has become a bit clearer and more useful, e.g. there is no need to implement any other per-irq tables to introduce the cookie concept no matter where it has to be used (even directly at the adeos layer). x86-wise, please make sure to rebase your development on 1.1-02. Sure. -- Philippe. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
Re: [Xenomai-core] [rt shared irqs] changes on the adeos-ipipe layer v.2
er.. I was confused by the fact that the sum of all apcs in /proc/xenomai/apc != virq from /proc/xenomai/irq, but ipipe_printk_virq is not listed since it's registered without the rthal::apc interface. Ok, I should take a closer look at it, i.e. maybe at least rthal_linux_irq can be reworked now. APCs are multiplexed on a single virq. Yep, I know. Don't know why, but I thought that printk_flush was multiplexed on the same virq; OTOH it's an APC too. Well, the easiest argument against that could have been that the APC interface is implemented in the hal but printk_virq is registered in the ipipe, i.e. when there is no APC interface yet, mea culpa. -- Philippe. -- Best regards, Dmitry Adamushko ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [PATCH, RFC] nucleus:shared irq and possible ipipe problem
Hi everybody, I have got a few synch-related problems while adding the code for supporting rt shared irqs to the nucleus layer. But first let's take a look at some adeos code that, well, possibly has the same problem. Let's say the primary domain is interested in irq = TEST_IRQ. CPU 0: - TEST_IRQ occurs. - ipipe_handle_irq() The local interrupts are off on entry. test_bit(IPIPE_HANDLE_FLAG, ipd->irqs[irq].control) shows whether a given domain is interested in handling the irq. Then cpu-local data is mainly used, e.g. cpudata->irq_hits[irq]++ and proper changes of pending_lo/hi ... CPU 1: - ... - rthal_irq_release() - ipipe_virtualize_irq(TEST_IRQ, ... handler = NULL, ...) - Here, __ipipe_pipe_lock + interrupts off. o ipipe_virtualize_irq() drops the IPIPE_HANDLE_FLAG and sets ipd->irqs[irq].handler to NULL. First observation: at the same time TEST_IRQ can still be marked as pending (i.e. some_cpudata->irq_pending_lo/hi and irq_hits)! CPU 0: - actually, ipipe_handle_irq() now (if not yet done before) may be calling __ipipe_set_irq_bit(ipd,cpuid,irq) although no one is interested in this irq == TEST_IRQ any more. But no matter, the fact is that cpudata->irq_* of a given cpu denotes a pending irq; let's go further. - later on, ipipe_sync_stage() is called for a given domain and cpu. It handles all irqs which are marked for the domain and cpu. So it's based only on the check of the cpudata->irq_(pending_(hi,lo), hits) fields. Let's recall that TEST_IRQ is marked here as well... In some way (ipipe_call_root_*irq_handler() or directly ipd->irqs[irq].handler()) the isr handler is called and boom! it's NULL. Have I missed something that prevents this? - In a sense, the synch. problem I mentioned at the beginning of this mail resembles this scenario. The draft patch is enclosed just to illustrate an idea. There are 2 problems: 1) we probably don't want to hold any lock while processing the handlers list (xnintr_intr_handler(): for all shirq->handlers). 
Here we may use the approach used by Linux in manage.c::free_irq() vs. handle.c::__do_IRQ(), which calls handle_IRQ_event() without the desc->lock being held. The magic is in free_irq(): it removes the action item from the list but then falls into busy waiting until the IRQ_INPROGRESS flag is removed. Only then does it delete the action item. At that time, the action item is already not on the list but still points to the valid tail of the list, so the iteration may proceed even if the current item == deleted one. Of course, I guess the presupposition here is that the operation of deletion (free_irq():: *pp = action->next) is atomic. 2) xnintr_irq_handler() gets a cookie which is == xnshared_irq_t* (see xnintr_attach) that may already be invalid at that time or, and that's a problem, become invalid during the execution of xnintr_irq_handler(). To prevent that, we could add a flag like IRQ_INPROGRESS, but either we have to set/remove it on the adeos layer before control is passed to xnintr_irq_handler() (to be sure that the cookie is not xnfree()'d; xnintr_detach() will do busy waiting), or we may set/remove the flag in xnintr_irq_handler() itself but then have to ignore the passed cookie and get it as cookie = rthal_irq_cookie(ipd,irq). Mmm, not very graceful, I'd say. Ok, it's enough for the New Year's Eve. Happy New Year to everybody! I wish you all the best for the New Year :o) -- Best regards, Dmitry Adamushko nucleus_intr.c.patch Description: Binary data nucleus_intr.h.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [rt shared irqs] changes on the adeos-ipipe layer v.2
Hi everybody, the recent patches are enclosed here. Mmm, I haven't ported it to the recent adeos-ipipe xeno though, but I guess that's not a big obstacle for possible design-related comments etc. Just to note some main issues... o the ipipe_virtualize_irq() interface now includes an additional param, namely void *cookie. o ipipe_domain::cpudata::irq_hits has been renamed to irq_pending_hits, while irq_hits now stands for the former rthal_realtime_irq::irq_hits. It's incremented in ipipe_handle_irq() and ipipe_schedule_irq(), IOW at the places where irqs (normal or virtual) first become visible on the adeos radar. Another approach would be to increment it in ipipe_sync_stage() right before calling the actual isr handlers. This way, accounting (well, it depends on what we are going to count though) would be wrong for virq and the ipipe_tick_irq handler. er.. I was confused by the fact that the sum of all apcs in /proc/xenomai/apc != virq from /proc/xenomai/irq, but ipipe_printk_virq is not listed since it's registered without the rthal::apc interface. Ok, I should take a closer look at it, i.e. maybe at least rthal_linux_irq can be reworked now. -- Best regards, Dmitry Adamushko ipipe-i386-1.0-10-ext-2.patch Description: Binary data xenomai-2.1-i386-ext-2.patch Description: Binary data ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [rt shared irqs] ipipe-related changes (draft)
Hi everybody, the enclosed patches (the first version, hence still raw) are supposed to build the basis needed on the ipipe layer for adding support for real-time shared interrupts. As a side effect, the irq_trampoline() layer has been eliminated and the irq processing chain has become shorter. Some points are likely to be changed in the next iteration (e.g. __ipipe_irq_cookie() vs. changing the ipipe_virtualize_irq() interface). The struct rthal_realtime_irq::hits[per-IRQ] is still missing. Anyway, comments are very welcome. -- Best regards, Dmitry Adamushko diff -urp linux-2.6.14.2-ipipe-1.0-10/arch/i386/kernel/ipipe-core.c linux-2.6.14.2-ipipe-1.0-10-ext/arch/i386/kernel/ipipe-core.c --- linux-2.6.14.2-ipipe-1.0-10/arch/i386/kernel/ipipe-core.c 2005-12-22 14:15:17.0 +0100 +++ linux-2.6.14.2-ipipe-1.0-10-ext/arch/i386/kernel/ipipe-core.c 2005-12-23 16:37:32.0 +0100 @@ -98,13 +98,14 @@ static void (*__ipipe_cpu_sync) (void); pushl %%edx\n\t \ pushl %%ecx\n\t \ pushl %%ebx\n\t \ +pushl %2\n\t \ pushl %%eax\n\t \ call *%1\n\t \ -addl $4,%%esp\n\t \ +addl $8,%%esp\n\t \ jmp ret_from_intr\n\t \ 1:\n \ : /* no output */ \ -: a (irq), m ((ipd)->irqs[irq].handler)) +: a (irq), m ((ipd)->irqs[irq].handler), r ((ipd)->irqs[irq].cookie)) static __inline__ unsigned long flnz(unsigned long word) { @@ -125,7 +126,7 @@ int __ipipe_ack_system_irq(unsigned irq) /* Always called with hw interrupts off. 
*/ -void __ipipe_do_critical_sync(unsigned irq) +void __ipipe_do_critical_sync(unsigned irq, void *cookie) { ipipe_declare_cpuid; @@ -301,7 +302,7 @@ void fastcall __ipipe_sync_stage(unsigne local_irq_disable_hw(); } else { __clear_bit(IPIPE_SYNC_FLAG, cpudata-status); - ipd-irqs[irq].handler(irq); + ipd-irqs[irq].handler(irq, ipd-irqs[irq].cookie); __set_bit(IPIPE_SYNC_FLAG, cpudata-status); } @@ -415,7 +416,7 @@ int fastcall __ipipe_send_ipi (unsigned int ipipe_virtualize_irq(struct ipipe_domain *ipd, unsigned irq, -void (*handler) (unsigned irq), +ipipe_irq_handler_t handler, int (*acknowledge) (unsigned irq), unsigned modemask) { diff -urp linux-2.6.14.2-ipipe-1.0-10/arch/i386/kernel/ipipe-root.c linux-2.6.14.2-ipipe-1.0-10-ext/arch/i386/kernel/ipipe-root.c --- linux-2.6.14.2-ipipe-1.0-10/arch/i386/kernel/ipipe-root.c 2005-12-22 14:15:17.0 +0100 +++ linux-2.6.14.2-ipipe-1.0-10-ext/arch/i386/kernel/ipipe-root.c 2005-12-23 16:57:25.0 +0100 @@ -78,7 +78,7 @@ static int __ipipe_ack_common_irq(unsign #ifdef CONFIG_X86_LOCAL_APIC -static void __ipipe_null_handler(unsigned irq) +static void __ipipe_null_handler(unsigned irq, void *cookie) { /* Nop. 
*/ } @@ -124,19 +124,19 @@ void __init __ipipe_enable_pipeline(void ipipe_virtualize_irq(ipipe_root_domain, LOCAL_TIMER_VECTOR - FIRST_EXTERNAL_VECTOR, -(void (*)(unsigned))smp_apic_timer_interrupt, +(ipipe_irq_handler_t)smp_apic_timer_interrupt, __ipipe_ack_system_irq, IPIPE_STDROOT_MASK); ipipe_virtualize_irq(ipipe_root_domain, SPURIOUS_APIC_VECTOR - FIRST_EXTERNAL_VECTOR, -(void (*)(unsigned))smp_spurious_interrupt, +(ipipe_irq_handler_t)smp_spurious_interrupt, __ipipe_noack_irq, IPIPE_STDROOT_MASK); ipipe_virtualize_irq(ipipe_root_domain, ERROR_APIC_VECTOR - FIRST_EXTERNAL_VECTOR, -(void (*)(unsigned))smp_error_interrupt, +(ipipe_irq_handler_t)smp_error_interrupt, __ipipe_ack_system_irq, IPIPE_STDROOT_MASK); @@ -167,7 +167,7 @@ void __init __ipipe_enable_pipeline(void #ifdef CONFIG_X86_MCE_P4THERMAL ipipe_virtualize_irq(ipipe_root_domain, THERMAL_APIC_VECTOR - FIRST_EXTERNAL_VECTOR, -(void (*)(unsigned))smp_thermal_interrupt, +(ipipe_irq_handler_t)smp_thermal_interrupt, __ipipe_ack_system_irq, IPIPE_STDROOT_MASK); #endif
Re: [Xenomai-core] [RFC] define your own pipe heap
[EMAIL PROTECTED] wrote on 22.11.2005 11:21:09: Jan Kiszka wrote: ... A patch says more than a thousand words. ;) As a first approach, I picked the second variant and implemented a new function called rt_pipe_setpool. I also had to extend rt_pipe_alloc and rt_pipe_free so that the right pool is used by them. I thought about this variant again, and it seems to me rather unsafe in case some buffer allocation takes place between rt_pipe_create and rt_pipe_setpool. So, here is a patch which extends rt_pipe_create with a new argument poolsize instead. I haven't read the patch thoroughly yet, so just a few common remarks. IMHO, the interface would be much clearer in case rt_pipe_create() is extended since: - we avoid misuse where rt_pipe_alloc() is called with an old pool but rt_pipe_free() (explicitly or implicitly internally) is called with a new one; - rt_pipe_setpool() can be successfully called only once, and a user must take care that: o either this is the next call after a pipe has been created; o or all messages allocated from the old pool have been freed by that moment (otherwise we need to extend the pipe::message interface so that every message knows the exact pool it has been allocated from and, moreover, implement reference counting so that a pool object exists as long as there is at least one reference to it - and we don't need such complications). So, yep, adding an additional parameter to rt_pipe_create() would be the better solution. My humble 2 cents :o) Once I thought about an interface that allows attaching/detaching an existing RT_HEAP object to any pipe/queue/maybe_smth_else, but there we would need to extend the underlying mechanisms as I mentioned above. And likely, such an interface is not of much avail for strict real-time environments. Keep it simple instead :o) --- Best regards, Dmitry
Re: [Xenomai-core] Re: [Xenomai-help] Blocking reads from pipes
[EMAIL PROTECTED] wrote on 18.11.2005 09:57:15: Exactly, I have just found that out and actually posted a long mail just before getting this mail from you :o) Yep, and before getting blocked, read() increments the counter as well; that's why we don't have xnpipe_release() called as a result of close(). So everything is correct. Fine. Though then my problem is not related to xenomai; any suggestions on how to force read() to return? (without writing anything to the other end of the pipe) ummm... send it a certain signal with pthread_kill() (and set up something like thread::flag == END_OF_WORK_BABY at the same time, but only in case there can be other reasons for this thread to get a signal)? In such a case, read() should return -EINTR. Note, the signal handler must be installed without the SA_RESTART flag so that the read() will not be restarted after signal processing. --- Dmitry ___ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core
[Xenomai-core] [pipe.c] hairy synchronization - flush the output queue upon closure
Hi, I failed to find the original message from Sebastian that led to the following change: - 2005-11-09 Philippe Gerum [EMAIL PROTECTED] * nucleus/pipe.c (xnpipe_disconnect): Flush the output queue upon closure. Issue spotted by Sebastian Smolorz. (xnpipe_release): Flush the input queue upon closure. - There is one more issue, as follows. The reason is that xnpipe_open/release() are not atomic. Briefly, a newly connected standard linux thread may get a message that was posted by the real-time side long before a connection was opened by that thread. That message had been placed on the queue at a time when there was another linux thread connected to the pipe. err... I'm not sure now that we should fight against that :confused a bit: --- details --- THREAD #1 on CPU #1 (normal linux thread): xnpipe_release() { // ok, we are locked here and no one may access the inq queue, let's free all the messages if any ... if (state->output_handler != NULL) { while ((holder = getq(state->outq)) != NULL) state->output_handler(minor,link2mh(holder),-EPIPE,state->cookie); } while ((holder = getq(state->inq)) != NULL) { if (state->input_handler != NULL) state->input_handler(minor,link2mh(holder),-EPIPE,state->cookie); else if (state->alloc_handler == NULL) xnfree(link2mh(holder)); } ... if (waitqueue_active(&state->readq)) wake_up_interruptible_all(&state->readq); if (state->asyncq) /* Clear the async queue */ { xnlock_get_irqsave(&nklock,s); removeq(&xnpipe_asyncq,&state->alink); clrbits(state->status,XNPIPE_USER_SIGIO); xnlock_put_irqrestore(&nklock,s); fasync_helper(-1,file,0,&state->asyncq); } // here the lock is not held. (*** EIP ***) --- THREAD #1 is about to drop the XNPIPE_USER_CONN flag. /* Free the state object. From that time on it can be opened by someone else */ clrbits(state->status,XNPIPE_USER_CONN); } THREAD #2 on CPU #2 (real-time thread) xnpipe_send() { ... 
// locked section xnlock_get_irqsave(&nklock,s); // actually, the following 2 checks should be re-ordered :) (*** EIP ***) - THREAD #1 hasn't dropped the XNPIPE_USER_CONN flag yet, but the pipe is almost invalid! if (!testbits(state->status,XNPIPE_USER_CONN)) { xnlock_put_irqrestore(&nklock,s); return -EPIPE; } if (!testbits(state->status,XNPIPE_KERN_CONN)) { xnlock_put_irqrestore(&nklock,s); return -EBADF; } inith(xnpipe_m_link(mh)); xnpipe_m_size(mh) = size - sizeof(*mh); state->ionrd += xnpipe_m_size(mh); // here the message is successfully added to the outq if (flags & XNPIPE_URGENT) prependq(&state->outq,xnpipe_m_link(mh)); else appendq(&state->outq,xnpipe_m_link(mh)); } ... In the meantime, THREAD #1 drops the XNPIPE_USER_CONN flag so another standard linux thread may open the pipe. When that happens, that thread will find a message that was posted while the old connection existed. err... so is it a problem regarding the desired behaviour of pipes? --- Dmitry
Re: [Xenomai-core] [pipe.c] hairy synchronization - flush the output queue upon closure
Philippe Gerum [EMAIL PROTECTED] wrote on 18.11.2005 11:14:26: ... it looks like we can't make the whole xnpipe_release() atomic because of PREEMPT_RT + wake_up_interruptible_all() things, right? Or no. You must _never_ _ever_ reschedule with the nucleus lock held; this is a major cause of jitter I recently stumbled upon, induced by xnpipe_read_wait() at that time. So indeed, xnpipe_release() cannot be made atomic this way under a fully preemptible kernel. Yep. Now, keeping in mind the observation I made yesterday, it looks like there is in fact no need for the wake_up_*(readers) call in file_operations::release(). There is nobody to be woken up at the time when release() is called: 1) The reference counter of the file object is 0, i.e. there are no readers, since read() increases the counter before getting blocked. 2) no one else can use that file object anew, since close() does the following: filp = files->fd[fd]; if (!filp) goto out_unlock; files->fd[fd] = NULL; --- it's invalid from now on, so it's not possible that new readers may appear when the counter == 0. Hm.. but we still have fasync_helper(-1,file,0,&state->asyncq); which is about sending a signal, and that's perfectly valid (the file::counter is not involved here). And that call may lead to re-scheduling (linux re-scheduling, of course), so we can't put it in a locked section. So the best way I see is to have something like: xnpipe_drop_user_conn() { xnlock_get_irqsave(&nklock,s); if (state->output_handler != NULL) { while ((holder = getq(state->outq)) != NULL) state->output_handler(minor,link2mh(holder),-EPIPE,state->cookie); } while ((holder = getq(state->inq)) != NULL) { if (state->input_handler != NULL) state->input_handler(minor,link2mh(holder),-EPIPE,state->cookie); else if (state->alloc_handler == NULL) xnfree(link2mh(holder)); } __clrbits(state->status,XNPIPE_USER_CONN); xnlock_put_irqrestore(&nklock,s); } and call it everywhere instead of clrbits(state->status,XNPIPE_USER_CONN); This way we may be sure there are no pending messages left. 
-- Philippe. --- Dmitry