An update: it turns out that the default Linux client compiled for ARM does work with a few modifications <https://gist.github.com/waTeim/0dd3ce0e0d3a3cd1c734>. JIT compilation is indeed slow (on the order of seconds), but execution is fast after that. See randomMatrix.txt in the gist for exact measurements; for example, rand(100,100) takes 1 second the first time and 0.0005 seconds subsequently, which is quite a speedup. Likewise, the svd of that 100x100 random matrix takes 3 seconds the first time and 0.05 seconds afterwards.
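If you want to reproduce the first-call versus later-call comparison yourself, something like the following sketch should do. It assumes a Julia 1.x install, where svd lives in the LinearAlgebra standard library (on older releases it was in Base, so drop the using line there). The variable names are mine and are not taken from the gist, and the numbers will differ from machine to machine.

```julia
using LinearAlgebra  # assumption: Julia 1.x; on old releases svd is in Base

# The first call to a function includes JIT compilation time;
# the second call measures execution only.
t_rand_first  = @elapsed rand(100, 100)
t_rand_second = @elapsed rand(100, 100)

A = rand(100, 100)
t_svd_first  = @elapsed svd(A)
t_svd_second = @elapsed svd(A)

println("rand(100,100): first = ", t_rand_first, " s, second = ", t_rand_second, " s")
println("svd(A):        first = ", t_svd_first,  " s, second = ", t_svd_second,  " s")
```

Run it in a fresh session, since anything already compiled earlier in the same session will not show the first-call cost.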
Also, of course, I tried using node-julia. That was difficult, but I managed to get it working too. Firstly, I had to add -march=armv7-a to the compile step for C++ std::thread to work; apparently this is a known bug of openembedded <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65255>, and not an issue with Julia. However, I also had to add -lgfortran and -lopenblas to the link step to satisfy the loader, which is reminiscent of the error I had <https://groups.google.com/forum/#!searchin/julia-users/could$20not$20load$20library$20%22libopenblas/julia-users/A8HwlmldVTM/vuU9eNH6ZpAJ> when I first tried to get things working on Linux in 2014. This is not the same thing, but it feels possibly related. I also had to create a symbolic link for gfortran, as only the versioned library (libgfortran.so.3) existed: ln -s libgfortran.so.3 libgfortran.so; that should probably be fixed. All but 3 of the node-julia regression tests passed once I increased the default timeout.
You can see the relative speed differences between an OS X laptop and the drone in the gist as well. Some of the timings are comparable, some are not. I believe the large differences in time are due not only to the slower processor, but to the (flash) filesystem as well. Exec-ing processes on this drone takes a very long time (relative to normal laptops/desktops). The memory is limited too (512 MB), so you really have to be careful about resources.