Dear Christian,

On Tuesday 30 of April 2024 08:40:43 Christian MAUDERER wrote:
> > For others, code under review hosted in CTU university GitLab
> > server
> > https://gitlab.fel.cvut.cz/otrees/rtems/rtems-canfd
> > Documentation
> >
> > https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/can/can-html/can.html
> >
> > https://otrees.pages.fel.cvut.cz/rtems/rtems-canfd/doc/doxygen/html/index.html
> >
> > Main developer behind extension to CAN FD and switch to RTEMS
> > is Michal Lenc.
> >
> > The intention is to (hopefully) reach state when it meets criteria
> > to mainlining int RTEMS CPU kit under
> >
> > cpukit/dev/can
...
> > I agree, that it is compromise. But adding yet another file descriptor
> > like multiplexor for queues to each file descriptor seems to me as
> > too much complexity. But it can be added. even later as IOCTL to remove
> > individual queues based on CAN ID matches or queues IDs if create
> > is modified to return internal queue IDs...
>
> I somehow missed that you can open the device multiple times and get
> independent queues. With that, it's completely OK and should be flexible
> enough for most applications.
>
> It's great that you already have put some thought into how it could be
> extended later if some application needs more flexibility.
...
> >> Did you check with
> >> some other hardware controller, whether the whole structures / defines
> >> / flags close to the hardware do work well for other controllers too?
> >
> > The code/concept is based on my previous LinCAN and OrtCAN work
> >
> > https://ortcan.sourceforge.net/lincan/
...
> I didn't want to doubt your competence. Like I said it's some trap that
> I have fallen into often enough myself (like when guiding Prashanths
> GSoC project). But it's clear that you have put a lot of thought into
> that. So I would expect that there shouldn't be much trouble with most
> controllers. Maybe except for the ones where a semiconductor vendor
> thought it would be a good idea to create a completely different
> concept. But these are always difficult.

I agree with the discussion and with searching for hard arguments. The solution is a compromise. In general, the CAN bus concept is optimized for direct replacement of the wires going between distinct units in a car, and its use as a general communication solution has some difficulties and requires some compromises.

For small devices with a predefined purpose and for Autosar, it is ideal to allocate one communication object on the controller for each CAN ID (wire signal) to be sent. The same holds for each received signal value, or for a set of them in a single frame. Most controllers are equipped with filters and mechanisms to do so, including selection of the Tx message object for physical bus-link arbitration according to priority. The sending side then updates the signal value in the corresponding Tx object and the receiving side sees the most recent one, usually on a best-effort basis: older unread frames are overwritten by the updated value.

But even in a simple ECU, there are obstacles to using this principle for all kinds of communication. The CAN bus is also used for firmware updates and general configuration. In this case, reliable delivery of all messages with a given CAN ID is required, because the whole sequence has to be received and processed and the state evolution is tied to the sequence. If a single message is lost, all the data are unusable. Because the sequence requires exact ordering, typically only a single Tx object is used. On the Rx side, it can be a problem to capture all frames with a single Rx object without overwriting, so some controllers add a FIFO which can be attached to each object, or some mechanism to allocate more Rx objects and pass them to the user in FIFO order.

That works for small ECUs with single-purpose firmware. But on a general-purpose operating system, which should allow even complete monitoring of the CAN bus, dynamically started applications and even whole virtual CAN/CANopen nodes, allocating controller Tx/Rx message objects for each specific purpose is impossible.

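The difference between the two reception styles described above can be illustrated with a small model. This is only a sketch in Python, not driver code: the class names `Mailbox` and `RxFifo`, the FIFO depth and the frame contents are all hypothetical and do not come from the RTEMS CAN framework or from any controller.

```python
from collections import deque


class Mailbox:
    """Model of a single Rx object: a new frame overwrites an unread one."""

    def __init__(self):
        self.frame = None
        self.overruns = 0

    def receive(self, data):
        if self.frame is not None:
            self.overruns += 1  # the older unread frame is lost
        self.frame = data

    def read(self):
        data, self.frame = self.frame, None
        return data


class RxFifo:
    """Model of an order-preserving Rx FIFO with a limited depth."""

    def __init__(self, depth):
        self.frames = deque()
        self.depth = depth
        self.overruns = 0

    def receive(self, data):
        if len(self.frames) >= self.depth:
            self.overruns += 1  # FIFO full: this frame is dropped
            return
        self.frames.append(data)

    def read_all(self):
        out = list(self.frames)
        self.frames.clear()
        return out


# Three frames of one sequence arrive before the application runs.
chunks = [b"blk0", b"blk1", b"blk2"]
mailbox, fifo = Mailbox(), RxFifo(depth=8)
for chunk in chunks:
    mailbox.receive(chunk)
    fifo.receive(chunk)

mailbox_result = mailbox.read()   # only the most recent frame survives
fifo_result = fifo.read_all()     # the whole sequence, in order
```

The mailbox behaviour suits signal values, where only the latest one matters. The FIFO behaviour is what a firmware update or configuration sequence needs.
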
That is why all generic CAN subsystems which I know (CAN4Linux, LinCAN, SocketCAN, NuttX char device CAN, PEAK's Windows drivers etc.) define an API based on opening the driver and presenting received messages to the application in FIFO order. There are options for software filtering, but they are usually not propagated to the controller hardware. LinCAN has an option to merge the user FIFO filters into a mask and ID propagated to the HW, but you usually end up needing to receive everything anyway, so it has not been used in the end.

Tx FIFO order is required for messages with the same ID, and sometimes even within one stream of messages with changing IDs, for correct realization of some higher-level protocols. The result is that even on hardware equipped with multiple Tx objects, but without a special Tx cyclic queue which preserves FIFO order, only a single Tx object is used to transmit all messages, for example SocketCAN on the XCAN controller. So only part of the CAN bus bandwidth can be utilized by a single node. Maybe that is sometimes lucky, because CAN IDs are not correctly allocated according to priority even in critical subsystems of cars.

On the Rx side, the original buffers approach is hard to use in an order-preserving FIFO concept, but most of today's controllers add some option to keep the order and leave processing and distribution to the software side. See the evolution from CCAN to DCAN to overcome that problem. We even made LinCAN work on CCAN many years ago in a way which somehow kept the required properties, but it was a headache.

So back to generic OS CAN interfaces: all that I know are based on FIFO(s). Most of them keep strict FIFO order on the Tx side, which results in HoL (head-of-line) blocking and priority inversion when the bus is loaded by middle-priority traffic from another node. That is why SocketCAN adds the alloc_candev_mqs (multiple-queues) alternative for drivers

https://elixir.bootlin.com/linux/latest/source/drivers/net/can/dev/dev.c#L249

but as far as I know, no mainline kernel driver is using that.

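The HoL blocking and priority inversion mentioned above can be reproduced with a very small arbitration model. It is a sketch under simplifying assumptions: one frame per arbitration round, the lowest CAN ID always wins, and the node under test offers the head of each of its Tx FIFOs. The function `simulate` and the three CAN IDs are hypothetical and are not taken from SocketCAN or from the proposed RTEMS framework.

```python
from collections import deque

HIGH, MID, LOW = 0x100, 0x400, 0x700  # lower CAN ID = higher bus priority


def simulate(node_queues, other_node_ids):
    """Return the order in which CAN IDs appear on the bus.

    node_queues: Tx FIFOs of the node under test. The node offers the
        head of every non-empty FIFO, the lowest ID among them competes.
    other_node_ids: frames offered one at a time by another node.
    """
    queues = [deque(q) for q in node_queues]
    other = deque(other_node_ids)
    bus = []
    while other or any(queues):
        candidates = []
        heads = [q for q in queues if q]
        if heads:
            best = min(heads, key=lambda q: q[0])
            candidates.append((best[0], best))
        if other:
            candidates.append((other[0], other))
        # Bus arbitration: the lowest ID wins this round.
        can_id, source = min(candidates, key=lambda c: c[0])
        source.popleft()
        bus.append(can_id)
    return bus


# Strict single Tx FIFO: the LOW frame at the head blocks the HIGH one
# for as long as the other node keeps sending MID frames.
single_fifo = simulate([[LOW, HIGH]], [MID] * 3)

# One FIFO per priority class: the HIGH frame is offered immediately.
per_class = simulate([[HIGH], [LOW]], [MID] * 3)
```

In the single-FIFO case the high-priority frame leaves the node last, after all the middle-priority traffic from the other node, which is the priority inversion. With one queue per class it is the first frame on the bus.
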
Many years ago, at a time when multi-queue was not yet available on the Linux side, we did some research on the Linux networking QoS subsystem and even extended it a little. The aim was to solve buffer bloat by old messages for traffic which requires best effort for given IDs (the most up-to-date data for control), and to limit the bandwidth of other traffic or of virtual guests connected through QEMU to the physical bus etc. I have a long-term plan to extend the CTU CAN FD mainline Linux driver with this support, and it will probably be the first example of how to overcome HoL blocking and priority inversion in the Linux CAN subsystem.

This was planned in the original LinCAN before SocketCAN, and it is now implemented in the proposed RTEMS CAN/CAN FD framework. An application can set up multiple queues with different Tx priority classes, even for a single open instance, and when they are used and mapped correctly to CAN IDs, this can prevent priority inversion. It is not fully generic, because it is quite expensive for deeper FIFOs and because the mutual order of Tx messages has to be preserved for many protocols, as discussed earlier.

I have architected the CTU CAN FD IP core interface to software to allow maximal utilization of the Tx buffers and their reallocation when needed for a higher-priority message. Details will be in our international CAN Conference 2024 article once its DTP processing and publication are done, or come and meet us next week in Baden-Baden

https://www.can-cia.org/icc/

There are two branches of thought from this point.

1) How it maps to other controllers

For those equipped with only a single Tx object (e.g. SJA1000), it maps well, because the repeated Tx attempt and arbitration can be disabled when a higher-priority queue becomes ready, and our CAN infrastructure allows pushing back the lower-priority message and scheduling the higher one to be sent. For more complex ones, if they do not allow control over the order of Tx objects, then only a single Tx object can be used. That is bad, it underutilizes the link, but it is what is standard in SocketCAN and other CAN solutions for general-purpose operating systems today.

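The push-back behaviour for a controller with a single Tx object can be sketched as follows. This is a simplified model, not the framework code: the class `SingleSlotTx` and its methods are hypothetical names, class 0 is taken as the highest priority, and the abort of the pending frame is assumed to always succeed. Real hardware has to confirm whether the aborted frame was sent after all.

```python
from collections import deque


class SingleSlotTx:
    """Model of priority-class Tx queues in front of one Tx object."""

    def __init__(self, classes=3):
        # Index 0 is the highest priority class (an assumption here).
        self.queues = [deque() for _ in range(classes)]
        self.slot = None  # (class, frame) currently offered to the bus

    def submit(self, prio_class, frame):
        self.queues[prio_class].append(frame)
        self._reschedule()

    def _reschedule(self):
        best = next((c for c, q in enumerate(self.queues) if q), None)
        if best is None:
            return
        if self.slot is not None:
            if self.slot[0] <= best:
                return  # the pending frame is already the best choice
            # Abort the pending frame and push it back to the FRONT of
            # its queue so the order inside its class is preserved.
            old_class, old_frame = self.slot
            self.queues[old_class].appendleft(old_frame)
        self.slot = (best, self.queues[best].popleft())

    def tx_done(self):
        """The pending frame won arbitration and was sent."""
        sent = self.slot[1]
        self.slot = None
        self._reschedule()
        return sent


tx = SingleSlotTx()
tx.submit(2, "low-1")    # goes straight into the Tx object
tx.submit(2, "low-2")    # waits behind it in the same class
tx.submit(0, "high-1")   # pre-empts "low-1", which is pushed back
order = [tx.tx_done(), tx.tx_done(), tx.tx_done()]
```

The high-priority frame is sent first, and the two low-priority frames still leave in their original mutual order.
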
All controllers which I know allow stopping the repeated Tx attempt, and I hope that all of them also provide an option to check whether the latest attempt has been successful or not. So the new RTEMS CAN can use them the same way as the SJA1000. On the Rx side, most have an option to use multiple buffers while preserving FIFO order. It is sometimes partially broken, burdened by errata etc. (like on iMX RT, where we have overcome these problems in the NuttX drivers).

When the number of Tx priority classes is limited (3 by default for the proposed system, but compile-time configurable), we can allocate one Tx buffer for each class. That is easy and prevents HoL priority inversion even on simple controllers.

If there is an option to order Tx according to the buffer index in the controller, then there is room for a slightly more performant solution, where multiple Tx buffers are allocated for each class and they are filled sequentially until the highest allocated buffer index is filled. Then there is some gap until all these buffers of the given priority are sent, because cyclic refilling from the minimal index would result in reordering and could break some protocol requirements.

Some controllers allow attaching DMA-realized FIFOs to more Tx objects. In such a case it would map well to the proposed design too.

Some newer controllers add local priority bits above the CAN ID ones (e.g. the new NXP FlexCAN). This could allow cyclic use of some Tx objects/buffers, similar to CTU CAN FD. There will be problems, because the priorities of multiple Tx buffers cannot be changed by a single atomic operation as in the CTU CAN FD case. But I have some idea how to implement sequential updates to ensure the order within the class. A further problem is that most controllers do not allow updating this information on objects which actively participate in arbitration. So it would lead to much more delicate acrobatics, and some gap time, in which no message is offered for link arbitration even though there are pending user requests, will be inevitable in some scenarios after some number of messages has been sent.

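The reason for the gap in the index-ordered variant can be shown with a small model. It is only a sketch: `IndexOrderedTx` is a hypothetical controller which always sends the pending buffer with the lowest index, all frames belong to one priority class, and one frame is sent per loop iteration. None of these names come from a real driver.

```python
class IndexOrderedTx:
    """Hypothetical controller: the lowest pending buffer index is sent first."""

    def __init__(self, n_buffers):
        self.buf = [None] * n_buffers

    def busy(self):
        return any(f is not None for f in self.buf)

    def send_one(self):
        for i, frame in enumerate(self.buf):
            if frame is not None:
                self.buf[i] = None
                return frame
        return None


def run(frames, n_buffers, wait_for_drain):
    """Return the order on the bus for the two buffer refill strategies."""
    hw = IndexOrderedTx(n_buffers)
    pending = list(frames)
    next_idx = 0
    sent = []
    while pending or hw.busy():
        while pending:  # driver fills Tx buffers
            if wait_for_drain:
                if next_idx == n_buffers:
                    if hw.busy():
                        break  # gap: wait until all buffers are sent
                    next_idx = 0
                hw.buf[next_idx] = pending.pop(0)
                next_idx += 1
            else:
                free = [i for i, f in enumerate(hw.buf) if f is None]
                if not free:
                    break
                hw.buf[free[0]] = pending.pop(0)  # naive cyclic refill
        frame = hw.send_one()  # one frame goes onto the bus
        if frame is not None:
            sent.append(frame)
    return sent


in_order = run(list("ABCDE"), n_buffers=3, wait_for_drain=True)
reordered = run(list("ABCDE"), n_buffers=3, wait_for_drain=False)
```

With the naive refill, frame D is written into buffer 0 while B and C are still pending in buffers 1 and 2, so it overtakes them. Waiting for the drain keeps the order at the price of the gap.
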
On the bus side, such cyclic use with local priority bits cannot be worse than the fixed order according to buffer index. Maybe it will turn out that the overhead is not worth it. But we preserve the API in all these variants...

2) Use of the CAN bus in applications requiring maximal bus transparency with minimal latency and SW load

This is the total opposite of the general CAN bus subsystem for a general-purpose RTOS. The API in this case should allocate Tx and Rx controller objects for the individual purposes/CAN IDs. SW processing on the Rx side can be considered as an alternative, and the proposed framework allows setting up queues for it, but it has overhead and under extreme load it can lose some messages if the HW is not performant enough. On the Tx side it is even more problematic.

But if this type of use of RTEMS is considered, for example for Autosar or Simulink generated code, then it is possible to extend the currently proposed API by IOCTLs which allow reserving some controller objects for specific purposes. Such objects could then be accessed directly, for minimal overhead and under direct application control, or a separate controller-side "canque_ends_dev_t" could be attached to them and they could be propagated to some clients through the standard CAN read and write API.

So I think that the proposed framework provides what is expected by most users of a general-purpose CAN/CAN FD framework, tries to prepare a little even for the arrival of CAN XL, and solves problems which still remain practically unsolved by all other generic approaches. We also have some clue how to extend support to most/all other controllers, and there are even some open doors to offer an ECU-style API for applications which benefit from direct use/allocation of controller buffers. That is possible on controllers with an abundant number of buffers (not the case of SJA1000, and very limited on CTU CAN FD, where at most 8 can be configured into silicon under the current register map).

I understand that the text is long, but you have in fact asked for it, and so I provide a complete thought dump for you to analyze.

I would be happy if you and/or others find time to look into the actual code implementation to identify what could be an issue for mainlining as soon as possible, because changes made after May 24 will not propagate into Michal Lenc's thesis text, which can serve as an alternative and more in-depth documentation and analysis than what fits into the official RTEMS one. The full document already has 47 pages, 34 of them actual text without the table of contents and appendices. The document includes benchmarks under RTEMS load by HTTP traffic, confirmation of priority inversion prevention by measurements with performance data etc. It will be published at CTU in May or June

https://dspace.cvut.cz/

and links will be added to

https://canbus.pages.fel.cvut.cz/

same as for the much shorter iCC article and presentation.

Best wishes,

Pavel
--
Pavel Pisa
phone: +420 603531357
e-mail: p...@cmp.felk.cvut.cz
Department of Control Engineering FEE CVUT
Karlovo namesti 13, 121 35, Prague 2
university: http://control.fel.cvut.cz/
personal: http://cmp.felk.cvut.cz/~pisa
company: https://pikron.com/ PiKRON s.r.o.
Kankovskeho 1235, 182 00 Praha 8, Czech Republic
projects: https://www.openhub.net/accounts/ppisa
social: https://social.kernel.org/ppisa
CAN related: http://canbus.pages.fel.cvut.cz/
RISC-V education: https://comparch.edu.cvut.cz/
Open Technologies Research Education and Exchange Services
https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home
_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel