On 10/10/2012 03:16 PM, Gilles Chanteperdrix wrote: > On 10/10/2012 03:09 PM, Philippe Gerum wrote: >> On 10/10/2012 03:01 PM, Gilles Chanteperdrix wrote: >>> On 10/10/2012 02:57 PM, Philippe Gerum wrote: >>>> On 10/10/2012 02:55 PM, Gilles Chanteperdrix wrote: >>>>> On 10/10/2012 02:41 PM, Philippe Gerum wrote: >>>>>> On 10/10/2012 02:33 PM, Gilles Chanteperdrix wrote: >>>>>>> On 10/10/2012 02:09 PM, Philippe Gerum wrote: >>>>>>>> On 10/10/2012 01:30 PM, Gilles Chanteperdrix wrote: >>>>>>>>> On 10/10/2012 12:54 PM, Philippe Gerum wrote: >>>>>>>>> >>>>>>>>>> On 10/10/2012 12:25 PM, Jan Kiszka wrote: >>>>>>>>>>> On 2012-10-10 12:07, Gilles Chanteperdrix wrote: >>>>>>>>>>>> On 10/10/2012 12:04 PM, Jan Kiszka wrote: >>>>>>>>>>>>> On 2012-10-10 11:23, Gilles Chanteperdrix wrote: >>>>>>>>>>>>>> On 10/10/2012 11:01 AM, Jan Kiszka wrote: >>>>>>>>>>>>>>> On 2012-10-10 10:58, Gilles Chanteperdrix wrote: >>>>>>>>>>>>>>>> On 10/10/2012 10:10 AM, Jan Kiszka wrote: >>>>>>>>>>>>>>>>> On 2012-10-10 10:04, Gilles Chanteperdrix wrote: >>>>>>>>>>>>>>>>>> On 10/10/2012 09:56 AM, Jan Kiszka wrote: >>>>>>>>>>>>>>>>>>> On 2012-10-10 09:51, Gilles Chanteperdrix wrote: >>>>>>>>>>>>>>>>>>>> On 10/10/2012 09:38 AM, Thierry Bultel wrote: >>>>>>>>>>>>>>>>>>>>> Hi Gilles, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Many thanks, >>>>>>>>>>>>>>>>>>>>> The first patch does not work, the second does. >>>>>>>>>>>>>>>>>>>>> I think the reason for 1st patch why is that in >>>>>>>>>>>>>>>>>>>>> rtcan_virt, we have >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> rtdm_lock_get_irqsave(&rtcan_recv_list_lock, lock_ctx); >>>>>>>>>>>>>>>>>>>>> rtdm_lock_get(&rtcan_socket_lock); >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>>>>>>> ---> rtcan_rcv(rx_dev, &skb); >>>>>>>>>>>>>>>>>>>>> .... >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> rtdm_lock_put(&rtcan_socket_lock); >>>>>>>>>>>>>>>>>>>>> rtdm_lock_put_irqrestore(&rtcan_recv_list_lock, lock_ctx); >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> rtcan_rcv->rtcan_rcv_deliver->rtdm_sem_up(&sock->recv_sem); >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> thus the same re-scheduling stuff with interrupts locked. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Are you not not afraid of side effects with the second >>>>>>>>>>>>>>>>>>>>> patch, >>>>>>>>>>>>>>>>>>>>> since you change the overall behaviour ? >>>>>>>>>>>>>>>>>>>>> Won't you prefer a only locally modified rtcan_virt ? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We should ask Jan's opinion. In any case, if we adopt the >>>>>>>>>>>>>>>>>>>> second patch, >>>>>>>>>>>>>>>>>>>> we might want to try and reduce the overhead of >>>>>>>>>>>>>>>>>>>> xnpod_unlock_sched. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> We were signaling the semaphore while holding a spin lock? >>>>>>>>>>>>>>>>>>> That's a >>>>>>>>>>>>>>>>>>> clear bug. Your patch is aligning rtcan to the pattern we >>>>>>>>>>>>>>>>>>> are also using >>>>>>>>>>>>>>>>>>> in RTnet. We just need to make sure (haven't looked at the >>>>>>>>>>>>>>>>>>> full context >>>>>>>>>>>>>>>>>>> yet) that sock remains valid even after dropping the >>>>>>>>>>>>>>>>>>> lock(s). >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The second patch idea was to lock the scheduler while >>>>>>>>>>>>>>>>>> spinlocks are >>>>>>>>>>>>>>>>>> held, so that posting a semaphore while holding a spin lock >>>>>>>>>>>>>>>>>> is no longer >>>>>>>>>>>>>>>>>> a bug. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sounds a bit hacky, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Well, that is what the linux kernel does. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> but I think we have this pattern >>>>>>>>>>>>>>>>> (RTDM_EXECUTE_ATOMICALLY) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY is a bit of a misnomer, if you do: >>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY(foo(); rtdm_sem_up(); bar()); >>>>>>>>>>>>>>>> foo() and bar() are not executed atomically if sem_up wakes up >>>>>>>>>>>>>>>> another >>>>>>>>>>>>>>>> thread. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So, I do not see how RTDM_EXECUTE_ATOMICALLY solves the issue >>>>>>>>>>>>>>>> we are >>>>>>>>>>>>>>>> talking about. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY holds the nucleus lock across the >>>>>>>>>>>>>>> encapsulated >>>>>>>>>>>>>>> code, executing it atomically as rescheduling is postponed >>>>>>>>>>>>>>> until the end >>>>>>>>>>>>>>> of the block. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Err... no. Absolutely not. >>>>>>>>>>>>> >>>>>>>>>>>>> Err... absolutely right. >>>>>>>>>>>>> >>>>>>>>>>>>> The good news is: we don't need to worry about such kind of >>>>>>>>>>>>> locking. In >>>>>>>>>>>>> rtcan_raw_recvmsg, the socket is locked via the RTDM context as >>>>>>>>>>>>> we are >>>>>>>>>>>>> in a handler. So it won't disappear when we drop the lock, and >>>>>>>>>>>>> your >>>>>>>>>>>>> first patch is fine. >>>>>>>>>>>> >>>>>>>>>>>> Which one? The first one does not seem to work because the rtdm >>>>>>>>>>>> locks >>>>>>>>>>>> seem to be nested. The second one would probably need to find a >>>>>>>>>>>> way to >>>>>>>>>>>> reduce the overhead of xnpod_unlock_sched(). What can be done, >>>>>>>>>>>> however, >>>>>>>>>>>> is adding a call to xnpod_lock_sched()/xnpod_unlock_sched() in >>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY. >>>>>>>>>>> >>>>>>>>>>> Oh, I'm seeing the locking forest in rtcan now. I suppose rtcan >>>>>>>>>>> wasn't >>>>>>>>>>> used much on SMP so far. That looks indeed unresolvable without a >>>>>>>>>>> semantical change to rtdm_lock/unlock. >>>>>>>>>>> >>>>>>>>>>> But then we really need something as light-weight as >>>>>>>>>>> preempt_enable/disable. >>>>>>>>>>> >>>>>>>>>> This is not as lightweight as it might be given that we pair a flag >>>>>>>>>> and >>>>>>>>>> a counter to achieve this (which saves one data reference in >>>>>>>>>> xnpod_schedule() though), but this is a start: >>>>>>>>>> >>>>>>>>>> http://git.xenomai.org/?p=xenomai-2.6.git;a=commit;h=aed4dfce9967e45ef7e8a8da4b6c90267ea81497 >>>>>>>>>> >>>>>>>>>> So, I'm setting __xnpod_lock_sched() and __xnpod_unlock_sched() in >>>>>>>>>> stone >>>>>>>>>> in the nucleus API to manipulate the sched locking counter from a >>>>>>>>>> context where the nucleus lock is already held, so that RTDM can >>>>>>>>>> rely on >>>>>>>>>> this for RTDM_EXECUTE_ATOMICALLY(). >>>>>>>>> >>>>>>>>> >>>>>>>>> The problem of xnpod_unlock_sched from my point of view is this >>>>>>>>> section >>>>>>>>> of code: >>>>>>>>> >>>>>>>>> xnsched_set_self_resched(curr->sched); >>>>>>>>> xnpod_schedule(); >>>>>>>>> >>>>>>>>> It means that we will go to the full blown __xnpod_schedule when >>>>>>>>> unlocking the top-most spinlock. >>>>>>>>> >>>>>>>>> I guess what I meant is that we should have a scheduler bit that is >>>>>>>>> simply tested in xnpod_schedule, but then we loose the ability for the >>>>>>>>> threads with the scheduler locked to suspend themselves. So, maybe we >>>>>>>>> should define another xnpod_lock/unlock pair, maybe something like >>>>>>>>> xnpod_preempt_disablle()/xnpod_preempt_enabled(). >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> xnpod_lock_sched() really means xnpod_preempt_disable() in RTOS >>>>>>>> parlance. >>>>>>> >>>>>>> Yes, but the reason why we have to go to __xnpod_schedule in >>>>>>> xnpod_unlock_sched() is to allow thread to be suspended with the >>>>>>> scheduler locked (you added that at some point for the vxworks emulator, >>>>>>> if I remember correctly). But without that need, we can put the XNLOCK >>>>>>> bit in the sched->status structure, and simply test that bit with all >>>>>>> the others in xnpod_schedule(), and forget about the need to call >>>>>>> xnsched_set_self_resched() in xnpod_unlock_sched(). That is all I meant. >>>>>>> >>>>>> >>>>>> You may have to reschedule after an IRQ has interrupted the lock owner >>>>>> inside the locked section, in which case you have to make sure to >>>>>> reschedule when dropping the sched lock. >>>>> >>>>> Well, in that case the resched bit will have been set already by the irq >>>>> handler calling xnpod_resume_thread, or other service, you do not have >>>>> to force it. >>>>> >>>> >>>> Yes, and this is why you have to call __xnpod_schedule() regardless. >>>> >>> no, xnpod_schedule is enough, it will only call __xnpod_schedule() if >>> the resched bit is set. >>> >> >> We are not discussing about the same issue, I'm afraid. We can't >> optimize the schedlock path via a scheduler flag since we have to care >> about lock nesting. Since the only sane option there is to hold such >> counter in the current thread context, the optimization is 100% in the >> way we access this information. >> > > Again, if we do not have to allow threads to suspend while holding the > scheduler lock, the preempt_count also can be in the sched structure. >
Sure, but this point is moot. We just _can't_ do that. -- Philippe. _______________________________________________ Xenomai mailing list [email protected] http://www.xenomai.org/mailman/listinfo/xenomai
