On 12.05.2010 13:32, Klaus Hitschler wrote:
> On 11.05.2010 21:42:31, Wolfgang Grandegger wrote:

>> For the time being, I tend to fix the problem just for the systems where
>> it shows up. It would be nice if we could reproduce it somehow. Klaus,
>> on what hardware did you notice the problem?
> 
> On multiple x86 multi-core machines. The scenario was always the same: the
> users received lots of data and sent data at a low frequency. In this case
> the initial write (triggered in the direct control path of an ioctl)
> sometimes got lost, and subsequent writes stalled because no transmit-ready
> interrupt was raised. I managed to reproduce the fault multiple times.
> 
> The write stall did not happen on single core machines.
> 
> It is not enough to add a delay, since you cannot determine when the register
> is accessed by another core as the result of an interrupt. Even softirq code
> can be interrupted by hardirqs.
> 
> OK. Most PCI systems cannot run the designed 33 nsec bus cycles. But the
> situation is no less serious with e.g. 66 nsec, 99 nsec ...

Yes. So far it looks like only SMP systems are affected. Since the command
register is unfortunately the only register used in the hot path for both
receiving AND sending CAN frames, I tend to adopt the strict locking
suggested by Klaus.

What about an updated patch with a dedicated function for writing the command
register, which uses different locking depending on CONFIG_SMP?
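
As a rough illustration of the idea, here is a minimal userspace sketch, not
driver code. The names fake_priv, fake_priv_init and write_cmdreg are
hypothetical, the register is a plain variable instead of memory-mapped I/O,
and a C11 atomic_flag stands in for the kernel spinlock (in a real driver the
SMP variant would more likely use spin_lock_irqsave() and
spin_unlock_irqrestore()).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's private data; not the real structure. */
struct fake_priv {
    atomic_flag cmdreg_lock;  /* plays the role of a kernel spinlock */
    volatile uint8_t cmd_reg; /* plays the role of the command register */
    unsigned int writes;      /* counts completed writes, for illustration only */
};

/* Put the fake device into a known state with the lock released. */
static void fake_priv_init(struct fake_priv *priv)
{
    atomic_flag_clear(&priv->cmdreg_lock);
    priv->cmd_reg = 0;
    priv->writes = 0;
}

#ifdef CONFIG_SMP
/* SMP build: serialize every command register write across cores. */
static void write_cmdreg(struct fake_priv *priv, uint8_t val)
{
    /* Spin until the lock is free. */
    while (atomic_flag_test_and_set_explicit(&priv->cmdreg_lock,
                                             memory_order_acquire))
        ;
    priv->cmd_reg = val;
    priv->writes++;
    atomic_flag_clear_explicit(&priv->cmdreg_lock, memory_order_release);
}
#else
/*
 * Uniprocessor build: the stall was not observed on single-core machines,
 * so this variant skips the lock (an assumption of this sketch).
 */
static void write_cmdreg(struct fake_priv *priv, uint8_t val)
{
    priv->cmd_reg = val;
    priv->writes++;
}
#endif
```

Both the receive and the transmit path would then call write_cmdreg()
instead of writing the register directly, so the locking policy lives in
one place.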

Would this be ok for you?

Regards,
Oliver
