On 12 Jan 2016, at 10:49, Reynold Xin <r...@databricks.com> wrote:

How big a deal is this use case in a heterogeneous endianness environment? 
If we do want to fix it, we should do it right before Spark shuffles data 
to minimize the performance penalty, i.e. turn big-endian encoded data into 
little-endian encoded data before it goes on the wire. This is a pretty 
involved change and, given other things that might break across heterogeneous 
endianness environments, I am not sure it is high priority enough to even 
warrant review bandwidth right now.



This is a classic problem in distributed computing, for which there are two 
common strategies:


The SunOS RPC strategy: a fixed wire order. For Sun, and hence NFS, the order 
was that of the Motorola 68K, so cost-free on Sun workstations; as SPARC used 
the same byte ordering, again free. For x86 parts wanting to play, it's 
inefficient at both sending and receiving. Protobuf also has a fixed order, 
but there it's little-endian: 
https://developers.google.com/protocol-buffers/docs/encoding
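
As a minimal sketch of that fixed-wire-order idea (illustrative names only, 
not Spark, Hadoop or protobuf code): every sender marshals into the agreed 
order and every receiver unmarshals from it, regardless of host endianness.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Fixed wire order, sketched for a single long value.
    public final class FixedOrderCodec {
        // XDR/Sun RPC chose big-endian; protobuf chose little-endian.
        private static final ByteOrder WIRE_ORDER = ByteOrder.BIG_ENDIAN;

        public static byte[] encode(long value) {
            // On a big-endian host this is a straight copy; on x86 there is
            // a byteswap on the way out.
            return ByteBuffer.allocate(Long.BYTES).order(WIRE_ORDER)
                    .putLong(value).array();
        }

        public static long decode(byte[] wire) {
            // The receiver pays a second swap if its host order differs
            // from the wire format.
            return ByteBuffer.wrap(wire).order(WIRE_ORDER).getLong();
        }
    }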

The Apollo RPC DCE strategy: packets declare their byte order, and the 
recipient gets to deal with it. This is efficient in a homogeneous cluster of 
either endianness, as x86-to-x86 would involve zero byteswapping. The Apollo 
design ended up in DCE, which is what Windows Distributed COM uses 
( http://pubs.opengroup.org/onlinepubs/9629399/chap14.htm ). If you look at 
that spec, you can see it's the floating-point marshalling that's the most 
trouble.
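
A minimal sketch of that receiver-makes-right idea (again illustrative, not 
the actual NDR wire format): the sender stamps its own byte order into the 
packet and marshals natively; the receiver only swaps on a mismatch.

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // Sender-declares-order: marshal in the sender's native order, prefix a
    // one-byte flag saying which order that was, let the receiver decide.
    public final class SenderOrderCodec {
        private static final byte BIG = 0, LITTLE = 1;

        public static byte[] encode(long value) {
            ByteOrder host = ByteOrder.nativeOrder();
            ByteBuffer buf = ByteBuffer.allocate(1 + Long.BYTES).order(host);
            buf.put(host == ByteOrder.BIG_ENDIAN ? BIG : LITTLE); // declare our order
            buf.putLong(value);                                   // zero-cost native write
            return buf.array();
        }

        public static long decode(byte[] wire) {
            ByteOrder sender = wire[0] == BIG
                    ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
            // Homogeneous cluster: sender == receiver, no swap ever happens.
            // Mixed cluster: exactly one swap, here on the receiving side.
            return ByteBuffer.wrap(wire, 1, Long.BYTES).order(sender).getLong();
        }
    }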

Recipient-makes-good is ideal for clusters where the systems all share the 
same endianness: the amount of marshalling is guaranteed to be zero if all the 
CPU parts are the same. That's clearly the de facto strategy in Spark. In 
contrast, the one-network-format strategy is guaranteed to have zero byteswaps 
on CPUs whose endianness matches the wire format, and guaranteed to have two 
for the other kind (one at each end). For mixed-endian RPC there'll be one 
byteswap, so the cost is the same as for the Apollo DCE strategy.

Bits of Hadoop core do byteswap work; for performance this is in native code, 
code which has to use assembly and compiler builtins for maximum efficiency.
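
For a JVM-level illustration of what that boils down to (not Hadoop's actual 
API), Java's Integer.reverseBytes/Long.reverseBytes are HotSpot intrinsics 
that compile down to a single byteswap instruction:

    // Illustrative only: the JVM-side equivalent of a native bswap.
    // HotSpot treats Integer.reverseBytes/Long.reverseBytes as intrinsics,
    // emitting a single bswap (x86) or rev (ARM) instruction.
    public final class ByteSwap {
        public static int swap32(int v)   { return Integer.reverseBytes(v); }
        public static long swap64(long v) { return Long.reverseBytes(v); }

        public static void main(String[] args) {
            // prints 01020304 -> 04030201
            System.out.printf("%08x -> %08x%n", 0x01020304, swap32(0x01020304));
        }
    }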

It's a big patch, one that's designed for effective big-endian support, 
*ignoring heterogeneous clusters*:

https://issues.apache.org/jira/secure/attachment/12776247/HADOOP-11505.007.patch

All that stuff cropped up while Alan Burlinson was sitting down to get Hadoop 
working properly on SPARC; that's a big enough project on its own that 
worrying about heterogeneous systems isn't on his roadmap, and nobody else 
appears to care.

I'd suggest the same to IBM: focus effort & testing on Power + AIX rather than 
worrying about heterogeneous systems.

-Steve
