Just noting where we are on various ARM builds - ARM really because it is the only non-x86 platform I'm looking into, so I have nothing to contribute otherwise.
Getting a new board working with LinuxCNC has several aspects:

1. getting a realtime kernel going for the board
2. getting LinuxCNC to compile and run
3. getting drivers to work
4. making it work as fast as you can
5. determining if the result is actually usable

Let's look at these in turn.

1) is a bit of a hit-and-miss game. Historically, ARM support in Linux has suffered from the enormous range of offerings available, and only recently, with the Linaro effort, has the ARM ecosystem started to get its act together in terms of build support and integration with Linux mainline.

As for RT, the options really are only Xenomai and RT_PREEMPT at this point. Generally one cannot hope for a stock kernel of this genre; it means building one. The main difference is that Xenomai ports are pegged to very few kernel versions as starting points - realistically two, maybe three for the adventurous - whereas RT_PREEMPT has been available for many major kernel versions so far and will likely be the first RT option to be in Linux mainline; Xenomai might follow later when their 'Xenomai 3' strategy pans out. Still, the key flow remains 'find a working kernel for that board; find a matching RT patch for that kernel version; progress, or abandon otherwise'.

Xenomai can be built off the 2.6.38.8 and 3.2.21 kernel versions based on a patch, and some of that patch is hardware dependent, in particular high resolution timer support. Without that, one needs to look no further, because if the timer support is low resolution, any latency measurements - let alone usable results with LinuxCNC and a fast base thread - are useless. The porting and adaptation process is well documented and not that huge in lines of code, but it requires intimate knowledge of the hardware. This means reading processor/SoC datasheets and very low level work. There is a 3.6-based Xenomai patch but I don't think it has seen much exposure yet.
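As a first sanity check on the timer-resolution point, one can ask the kernel what resolution it advertises for its monotonic clock. The sketch below uses only the POSIX clock_getres() call; the function names and the 1 us threshold are my own assumptions for illustration. It only shows what the Linux side reports - it is not a latency test, and it says nothing about Xenomai's own timer handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Return the resolution the kernel advertises for CLOCK_MONOTONIC,
 * in nanoseconds, or -1 if the clock cannot be queried. */
long long monotonic_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Assumed threshold (not from any LinuxCNC source): anything coarser
 * than 1 us is treated as "low resolution" for base-thread purposes. */
int has_high_res_timer(long long res_ns)
{
    return res_ns > 0 && res_ns <= 1000;
}
```

Calling has_high_res_timer(monotonic_resolution_ns()) on the target board gives a quick go/no-go; a kernel without working high resolution timers typically reports a resolution in the millisecond range (one scheduler tick), in which case there is little point in measuring latency at all.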
Luckily I found a quite usable patch for the Raspberry board, and Xenomai works well on it, giving on the order of 40 us latency; that is in my repo. Currently I have no binary packages available, but it looks doable. I have not found an RT_PREEMPT patch which matches a usable Raspberry kernel version closely enough to give it a try. Usually kernel minor revisions vary mostly with respect to driver support, so there is a chance one can fast-forward over some minor kernel versions and still have something usable.

The other board I have and am dabbling with, the BeagleBone, has several options - there's an RT_PREEMPT patched kernel source readily available for 2.6.8 (building right now), and there are several reports and patches for a 3.2.21 based Xenomai kernel, which I'll try next. I have several starting points and 'just' need to determine which one works well. 'Just' should be read as 'a kernel build for such platforms should be started before you go to sleep, and checked in the morning'.

2) means build support - that's package availability and configure support. Configure support can be fixed, but massive special-purpose package builds are out of scope for me, so I try to pick a base which has a decent package stream; sometimes there are several options (sometimes there are too many).

Raspberry: has a very decent Ubuntu-like package stream, so most of the moving parts are in place. Adapting configure involved all the ARM dependencies, so it was more work initially, but it will be much less for other boards. While all of LinuxCNC builds, I have yet to see an Axis screen, but this is very likely a local setup problem. HAL/RTAPI/Gladevcp run fine.

Beaglebone: this comes with the Angstrom distribution installed, and that is useless for LinuxCNC purposes - too many packages are missing. I switched to an Ubuntu precise based setup, and that behaves pretty much like the x86 environment. There were minimal configure changes after the initial round of changes for the Raspberry.
No surprises, and Axis actually runs (see 5) below).

Building master: the current rtos-integration-preview1 code is based on 2.5. I have test-merged into master with minimal touchups. Due to the use of boost::python and its memory requirements during compilation, swap space is needed during a master build; a USB flash stick is fine (both of my boards sport 256M of memory).

3) Drivers: we are out in no-parport, no-PCI land. Candidate boards usually have GPIO pins, some of which can be overlaid with other functions like I2C or SPI. There is usually driver support through sysfs to configure and wiggle pins, and to drive I2C and SPI peripherals, but that is not a high speed option as it requires system calls, and those are generally not a good idea in realtime code. I have made a minimal attempt with the Raspberry hal_gpio module; minimal insofar as it works, but it isn't flexible in terms of configuration and is not optimized for speed. The good news is: once you have one, you have them all (more or less); all ARM I/O is memory mapped and very similar from platform to platform. So getting from A to B is quite straightforward: slightly different setup, different macros for memory locations, but that is about it. hal_gpio is really just a starting point, but once you get some pin to wiggle with a simple C test program, you're almost there.

4) Making it fast: that's really two questions and they are not the same - is the latency OK, and does the overall system have enough oomph to run the whole of LinuxCNC? My answer right now would be 'yes and no' to that. That however doesn't make it a moot effort for me - my primary reason is _not_ to find a $50 PC replacement but to arrive at a realtime outboard solution where the rest of LinuxCNC runs on some other non-RT platform. Maybe an iPad, or an Android tablet, who knows.

Latency with x86/RTAI is still best; Xenomai second, RT_PREEMPT third. Is it good enough?
Well, for servo it is, but who would run servo-only on such a low-end board, which is likely hooked up to steppers to start with? This isn't making any impression on the RepRap crowd either, which still thinks Arduino. So that turns attention to base thread performance. Can you move steppers with Pi/Beagleboard/Xenomai/hal_gpio? Yes you can, but not very fast. Similar to RTAI/x86, just slower. So the standard solution to that would be 'glue on some extra hardware for higher speeds' - an FPGA-based board for example - and then we are leaving low-cost, single-board land, at least for now.

That however is not the end of the story, since we are not the first to have discovered the issue. There is more than one option to deal with it, and some of them promise to yield significantly better results than we currently have with a LinuxCNC-optimized, hand-massaged, RTAI-driven junkyard PC with a parport. Sergey has already shown how to use a low-overhead hardware feature to improve stepping performance with miniemc2.

I find the features of the TI AM335x Sitara OMAP processors used in the Beaglebone board particularly promising. Let me give a minimal rundown of what is special here:

- besides the main ARM CPU, there are two processors called 'Programmable Realtime Units' on-chip.
- these run at 200 MHz, are 32-bit integer CPUs and can drive the relevant peripherals like GPIO at speeds exceeding 50 MHz while the main CPU is doing nothing (this isn't marketing baloney, I have scoped it myself)
- programming these PRUs is done in assembly (not C; limited fun, but fairly straightforward), and the assembler is provided as open source by TI
- integrating these PRUs can be done with reasonably low effort, see hal/components/hal_pru.c here: http://git.mah.priv.at/gitweb/emc2-dev.git/blob/refs/heads/arm335x-hal-pru-module:/src/hal/components/hal_pru.c
- the whole interaction is through shared memory, including stepping, halt, run, inspect registers etc., which makes debugging straightforward (determined low-level hackers can stop and inspect the PRUs by fiddling bits in /dev/mem)
- there is a time stamp counter running at 200 MHz

I find it entirely feasible to recode the RT thread functions of stepgen, encoder, freqgen etc. for these PRUs, and it is not rocket science. OK, it means reading manuals and trying this and that, but it is definitely for mortals.

What I _think_ could be the result of such an effort is a LinuxCNC HAL/RTAPI/component system which runs a base-thread lookalike in maybe 5-10 us cycles, and quite deterministically at that.

Is it portable? No. Fast? Yes. Cheap? Yes. That is about as much as one can expect for an $85 board. And we will _never_ have a solution with all three boxes ticked. Not going to happen.

For anybody who is researching that option, I would suggest studying Bas' work, which is the most advanced use of the PRU scheme I could find: https://github.com/modmaker/BeBoPr . Since Bas started early on that platform, some of the PRU handling is much easier nowadays because support code from TI has become available.
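To put the 5-10 us cycle figure into perspective, here is a small C sketch of the arithmetic and of what a base-thread lookalike does per cycle. It assumes a plain software step generator where a step needs at least one cycle high and one cycle low, and it uses a generic phase-accumulator scheme. All names are hypothetical; this is not the code in stepgen or hal_pru.c, and a real PRU version would be written in PRU assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the step rate when every step needs one cycle high
 * and one cycle low (assumption: no hardware pulse stretching). */
double max_step_hz(double period_ns)
{
    assert(period_ns > 0.0);
    return 1e9 / (2.0 * period_ns);
}

/* Hypothetical state of a phase-accumulator step generator. */
typedef struct {
    uint32_t accum;   /* phase accumulator, wraps at 2^32 */
    uint32_t addval;  /* phase increment added every cycle */
    int32_t  count;   /* steps emitted so far */
} stepgen_sketch_t;

/* Phase increment for a wanted step rate: steps-per-cycle scaled to 2^32. */
uint32_t addval_for(double step_hz, double period_ns)
{
    double frac = step_hz * period_ns / 1e9;   /* steps per cycle */

    assert(frac >= 0.0 && frac <= 0.5);        /* stay within max_step_hz() */
    return (uint32_t)(frac * 4294967296.0);
}

/* One cycle of the loop: returns 1 if a step pulse starts in this cycle. */
int stepgen_cycle(stepgen_sketch_t *s)
{
    uint32_t before = s->accum;

    s->accum += s->addval;      /* unsigned wraparound is well defined in C */
    if (s->accum < before) {    /* accumulator overflowed: emit a step */
        s->count++;
        return 1;
    }
    return 0;
}

/* Run n cycles and return how many steps were emitted in them. */
int run_cycles(stepgen_sketch_t *s, int n)
{
    int steps = 0;

    for (int i = 0; i < n; i++)
        steps += stepgen_cycle(s);
    return steps;
}
```

Under these assumptions a 10 us cycle caps out at 50 kHz step rate and a 5 us cycle at 100 kHz. The per-cycle work is one add and one compare per channel, which is why this kind of loop looks like a reasonable fit for a small integer-only core.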
On the cultural/community side it seems TI 'gets it' - they have made great strides to appeal to, and actively support, the open source and hacker communities, and have committed manpower and marketing money to the cause.

There are other options in the pipe - processors with a bit of FPGA inside or glued on, and a Beaglebone FPGA 'cape' plugin is in the works - so we're going to see more here.

As for the Raspberry: that has very limited potential - the platform is about half as fast as the Beaglebone, has fairly minimal GPIO, and is wed to the Broadcom chipset, and I would rate Broadcom as a company which still doesn't get it. A mean voice said about the chipset used on the Pi: 'half of it isn't documented, and the other half doesn't work'. That's not entirely true, but there is something to that comment.

5) Is the result usable? If you're pinning your hopes on connecting screen and keyboard to any of the current boards, firing up Axis and finding it as fast as the current PC/RTAI option: you are likely in for a bad surprise. Both Sergey and I found that the current code maxes out the main CPU to the extent that maybe 10-20% CPU is left, and that isn't good news. Note that neither of us has seriously looked into profiling and removing any glaring bottlenecks; this is still to be verified. The hardware offerings will improve, but not within the lifetime of the boards I reviewed.

If one follows the idea of spinning out RTAPI+HAL+drivers onto such a board, I think the prospect is excellent, and potentially as good as or better than the current best-of-breed soft stepping solution (NB: I am explicitly excluding Mesanet/Pico type solutions here). That is 'potential only' - we still have such inflexible interfacing in LinuxCNC that such a setup isn't possible with today's code structure.

- Michael
_______________________________________________
Emc-developers mailing list
https://lists.sourceforge.net/lists/listinfo/emc-developers
