On 8/6/2011 5:38 PM, jdrewsen wrote:
AFAIK David Nadlinger is handling serialization in his GSOC Thrift
project that he is working on currently.

/Jonas

Good to know, but what flavor? As I see it there is a three-way tradeoff in serialization. In order of importance for distributed parallelism, the qualities are:

1. Efficiency: How much does it cost to serialize/deserialize something, and how much space overhead is there?

2. Flexibility w.r.t. types: How many types can be serialized? How faithfully are they reproduced on the other end w.r.t. things like pointer/reference/slice aliasing?

3. Standardization: How universally understood is the format? Can it be used to send data across different CPU architectures? Across languages? Is it human readable? Is it based on some meta-format like XML?

For enterprisey use cases, I think this ordering would probably be completely reversed. In a typical MPI cluster, on the other hand, all nodes are of the same architecture, so it's usually perfectly reasonable to send arrays of primitives as just raw bits. I imagine this is a terrible idea in other contexts that I know less about.
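
To make the raw-bits vs. standardized tradeoff concrete, here is a rough sketch in Python (not D, and not anything from the Thrift project). The function names and the length-prefixed little-endian layout are made up for illustration; the only assumption is that the payload is an array of 64-bit floats.

```python
import struct
from array import array


def to_raw_bytes(values):
    """Dump doubles in this machine's native layout (hypothetical helper).

    Cheap and compact, but the receiver must share our byte order and
    float representation, as on a homogeneous cluster.
    """
    return array("d", values).tobytes()


def from_raw_bytes(buf):
    """Reinterpret a native-layout buffer as a list of doubles."""
    out = array("d")
    out.frombytes(buf)
    return list(out)


def to_portable_bytes(values):
    """Encode with an explicit byte order and a length prefix (made-up format).

    Costs four extra bytes plus a possible byte swap, but any
    architecture can decode it.
    """
    header = struct.pack("<I", len(values))           # element count, little-endian
    body = struct.pack(f"<{len(values)}d", *values)   # doubles, little-endian
    return header + body


def from_portable_bytes(buf):
    """Decode the length-prefixed little-endian format above."""
    (count,) = struct.unpack_from("<I", buf, 0)
    return list(struct.unpack_from(f"<{count}d", buf, 4))
```

Both pairs round-trip on a single machine. The difference only shows up when sender and receiver disagree on byte order, which is exactly the case a homogeneous cluster gets to ignore.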
