On Monday, November 23, 2015 at 5:08:16 PM UTC, Yichao Yu wrote:
> On Mon, Nov 23, 2015 at 11:43 AM, Ian Watson <ianiw...@gmail.com> wrote:
> > As far as I know, the underlying O/S and software on both machines is the
> > same, Red Hat Enterprise Linux Server release 6.6 (Santiago), but I cannot
> > be 100% certain about that. I am compiling with gcc-5.2.0
> >
> > I tried the suggestion in DISTRIBUTING.md, but setting MARCH to either
> > core2 or x86-64 failed on the E5-2680 machine - openblas usually fails
>
> Yes, you should build with an architecture that is compatible with all
> the ones you want to run on.
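(For reference, the DISTRIBUTING.md approach amounts to something like this in Make.user - a sketch from memory, so check the variable names against your tree; OPENBLAS_DYNAMIC_ARCH in particular is the Julia build knob as I recall it:)

```make
# Make.user -- sketch: build for the oldest ISA in the cluster
MARCH = x86-64              # or core2; every node must support this baseline
OPENBLAS_DYNAMIC_ARCH = 1   # let OpenBLAS pick its kernels at runtime instead
```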
I do not see a reason that the architectures "should" be the same (except, as he says, "but that can wait for another day" - of course start with the low-hanging fruit). It's not clear to me from the link whether installing on a heterogeneous cluster is simply no longer supported (because it is not worth much?), or whether it easily could be (it seems it could, since it was supported before - or was it broken, and is that why it is no longer supported?). Is it just not worth the effort? Not a large payoff?

No need to read further than my speculations... I would think there are a lot of such clusters (maybe a large fraction, or a majority?). If some parts of a cluster are older/slower, is that a big drawback (apart from the hassle of getting different ISAs to work)? [You might need some load balancing? Wouldn't the code at least work, with the slower nodes dragging down performance, but never giving wrong results, more race conditions, or the like?]

I expect the bitness needs to be the same (not really much of a problem), and the endianness (or not?). In theory x86 and ARM etc. could be combined (not a good idea?), and even the OSes could differ (again, maybe not a good idea, and no good reason?). I just ask out of curiosity; Julia seems ideal for this kind of thing, while for the "competition" (e.g. C++) mixing, say, x86 and ARM is not really possible, or at least a hassle. An exception might be Java/JVM and CLR etc., but at least the JVM (both?) does not have multi-dimensional arrays. (I'm not sure JVMs are much of a competition, but IBM did some heroic optimizations to get multidimensional arrays to work fast(er) with their JVM and compiler - I'm not sure it is used much.) The alternative for established languages is a differently compiled executable on each node - which might not be out of the question in a source-code environment (is it already done that way?)
And/or the executables/libraries could dynamically choose different machine code at runtime, as in: "openblas defaults to detecting the runtime system and picking the best code to run for that computer." [See above - why did openblas then fail? I guess because of "binutils too old", and it shouldn't have otherwise.] I wonder how commonly code is compiled that way, even just for x86 variants (it has a runtime cost, and I assume it is not done as fat binaries for x86/ARM - that wouldn't scale to more architectures).

-- Palli.