
Re: std.concurrency wrapper over MPI?

2011-08-11 Thread Jacob Carlborg


On 2011-08-11 13:07, dsimcha wrote:

On 8/11/2011 4:14 AM, Jacob Carlborg wrote:

On 2011-08-07 21:28, dsimcha wrote:

In addition to the bug reports I filed, why is it necessary to write any
serialization code to serialize through the base class? What's wrong
with just doing something like:

class Base {}
class Derived : Base {}

void main() {
auto serializer = new Serializer(new XMLArchive!());

// Introspect Derived and figure out all the details automatically.
serializer.register!(Derived);
}



I've been thinking about this and currently I don't see how this would
be possible. When serializing through a base class reference the static
type would be of the base class. But what I need is the static type of
the subclass, to be able to loop through the tuple returned by tupleof.
The only information I can get about the subclass is basically the fully
qualified name.

What I would need is some kind of associative array that maps strings to
types, but as far as I know that's not possible, especially since the
strings would be runtime values.



You have classinfo as a key, as you point out. You also already have a
template that's capable of serializing a class given that its static
type is exactly its dynamic type.

I was thinking something like:

class Serializer {
string delegate(Object)[TypeInfo_Class] registered;

void register(T)() {
registered[T.classinfo] = &downcastSerialize!(T);
}

void serialize(T)(T value) if(is(T : Object)) {
if(value.classinfo is T.classinfo) {
// Then the static type is exactly the runtime type.
// Serialize it the same way you do now.
} else {
enforce(value.classinfo in registered,
"Cannot serialize a " ~ value.classinfo.name ~
" because it has not been registered.");

return registered[value.classinfo](value);
}
}

string downcastSerialize(T)(Object value) if(is(T : Object)) {
auto casted = cast(T) value;
assert(casted);
assert(value.classinfo is T.classinfo);

return serialize(casted);
}
}



Cool, very clever. I didn't think of delegates, and especially not of 
creating a delegate out of a template method. Thanks.


--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-11 Thread dsimcha


On 8/11/2011 7:07 AM, dsimcha wrote:

On 8/11/2011 4:14 AM, Jacob Carlborg wrote:

On 2011-08-07 21:28, dsimcha wrote:

In addition to the bug reports I filed, why is it necessary to write any
serialization code to serialize through the base class? What's wrong
with just doing something like:

class Base {}
class Derived : Base {}

void main() {
auto serializer = new Serializer(new XMLArchive!());

// Introspect Derived and figure out all the details automatically.
serializer.register!(Derived);
}



I've been thinking about this and currently I don't see how this would
be possible. When serializing through a base class reference the static
type would be of the base class. But what I need is the static type of
the subclass, to be able to loop through the tuple returned by tupleof.
The only information I can get about the subclass is basically the fully
qualified name.

What I would need is some kind of associative array that maps strings to
types, but as far as I know that's not possible, especially since the
strings would be runtime values.



You have classinfo as a key, as you point out. You also already have a
template that's capable of serializing a class given that its static
type is exactly its dynamic type.

I was thinking something like:

class Serializer {
string delegate(Object)[TypeInfo_Class] registered;

void register(T)() {
registered[T.classinfo] = &downcastSerialize!(T);
}

void serialize(T)(T value) if(is(T : Object)) {
if(value.classinfo is T.classinfo) {
// Then the static type is exactly the runtime type.
// Serialize it the same way you do now.
} else {
enforce(value.classinfo in registered,
"Cannot serialize a " ~ value.classinfo.name ~
" because it has not been registered.");

return registered[value.classinfo](value);
}
}

string downcastSerialize(T)(Object value) if(is(T : Object)) {
auto casted = cast(T) value;
assert(casted);
assert(value.classinfo is T.classinfo);

return serialize(casted);
}
}



One small correction:


string downcastSerialize(T)(Object value) if(is(T : Object)) {
    auto casted = cast(T) value;
    assert(casted);
    assert(casted.classinfo is T.classinfo);

    return serialize(casted);
}

Re: std.concurrency wrapper over MPI?

2011-08-11 Thread dsimcha


On 8/11/2011 4:14 AM, Jacob Carlborg wrote:

On 2011-08-07 21:28, dsimcha wrote:

In addition to the bug reports I filed, why is it necessary to write any
serialization code to serialize through the base class? What's wrong
with just doing something like:

class Base {}
class Derived : Base {}

void main() {
auto serializer = new Serializer(new XMLArchive!());

// Introspect Derived and figure out all the details automatically.
serializer.register!(Derived);
}



I've been thinking about this and currently I don't see how this would
be possible. When serializing through a base class reference the static
type would be of the base class. But what I need is the static type of
the subclass, to be able to loop through the tuple returned by tupleof.
The only information I can get about the subclass is basically the fully
qualified name.

What I would need is some kind of associative array that maps strings to
types, but as far as I know that's not possible, especially since the
strings would be runtime values.



You have classinfo as a key, as you point out.  You also already have a 
template that's capable of serializing a class given that its static 
type is exactly its dynamic type.


I was thinking something like:

class Serializer {
    // Maps a runtime class to a delegate that knows its static type.
    string delegate(Object)[TypeInfo_Class] registered;

    void register(T)() {
        registered[T.classinfo] = &downcastSerialize!(T);
    }

    // Returns a string so that it matches the registered delegates.
    string serialize(T)(T value) if(is(T : Object)) {
        if(value.classinfo is T.classinfo) {
            // Then the static type is exactly the runtime type.
            // Serialize it the same way you do now.
        } else {
            // enforce comes from std.exception.
            enforce(value.classinfo in registered,
                "Cannot serialize a " ~ value.classinfo.name ~
                " because it has not been registered.");

            return registered[value.classinfo](value);
        }
    }

    string downcastSerialize(T)(Object value) if(is(T : Object)) {
        auto casted = cast(T) value;
        assert(casted);
        assert(value.classinfo is T.classinfo);

        return serialize(casted);
    }
}
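
The idea above can be sketched as a complete program. This is only an illustration under stated assumptions, not Orange's actual code: `serializeExact` and its `Name{field=value}` output format are hypothetical stand-ins for "serialize it the same way you do now", and `tupleof` here only visits the fields declared directly in the class, not those of its base classes.

```d
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

class Base { int a = 1; }
class Derived : Base { int b = 2; }

class Serializer {
    // Maps a runtime class to a delegate that knows its static type.
    string delegate(Object)[TypeInfo_Class] registered;

    void register(T)() if (is(T : Object)) {
        registered[typeid(T)] = &downcastSerialize!T;
    }

    string serialize(T)(T value) if (is(T : Object)) {
        // typeid on a class reference yields the dynamic type.
        if (typeid(value) is typeid(T)) {
            return serializeExact(value);
        }
        auto handler = typeid(value) in registered;
        enforce(handler !is null,
            "Cannot serialize a " ~ typeid(value).name ~
            " because it has not been registered.");
        return (*handler)(value);
    }

    private string downcastSerialize(T)(Object value) {
        auto casted = cast(T) value;
        assert(casted !is null);
        assert(typeid(casted) is typeid(T));
        return serializeExact(casted);
    }

    // Hypothetical stand-in for the real field-by-field serialization.
    private string serializeExact(T)(T value) {
        string result = T.stringof ~ "{";
        foreach (i, field; value.tupleof) {
            if (i > 0) result ~= ",";
            result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ to!string(field);
        }
        return result ~ "}";
    }
}

void main() {
    auto serializer = new Serializer;
    serializer.register!Derived();

    Base obj = new Derived;           // static type Base, dynamic type Derived
    writeln(serializer.serialize(obj));
}
```

The delegate stored in `registered` is what carries the static type across the runtime lookup: each `downcastSerialize!T` instance is compiled with full knowledge of `T`, so it can use `tupleof` even though the caller only holds a `Base` reference.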

Re: std.concurrency wrapper over MPI?

2011-08-11 Thread Jacob Carlborg


On 2011-08-07 21:28, dsimcha wrote:

In addition to the bug reports I filed, why is it necessary to write any
serialization code to serialize through the base class? What's wrong
with just doing something like:

class Base {}
class Derived : Base {}

void main() {
auto serializer = new Serializer(new XMLArchive!());

// Introspect Derived and figure out all the details automatically.
serializer.register!(Derived);
}



I've been thinking about this and currently I don't see how this would 
be possible. When serializing through a base class reference the static 
type would be of the base class. But what I need is the static type of 
the subclass, to be able to loop through the tuple returned by tupleof. 
The only information I can get about the subclass is basically the fully 
qualified name.


What I would need is some kind of associative array that maps strings to 
types, but as far as I know that's not possible, especially since the 
strings would be runtime values.


--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-09 Thread David Nadlinger


On 8/7/11 12:09 AM, dsimcha wrote:

On 8/6/2011 5:38 PM, jdrewsen wrote:

AFAIK David Nadlinger is handling serialization in his GSOC Thrift
project that he is working on currently.

Good to know, but what flavor?


The most important thing to note, and the reason it might not be 
appropriate for what you want to do, is that the intended main use case 
for Thrift is to define an interface that can easily be used from 
several programming languages, feeling »native« for each of them 
(similar to what protobuf does). As a consequence, Thrift by design only 
supports value types, so it is not possible to e.g. serialize a tree or 
a DAG without »flattening« it first.


Another important feature of Thrift is protocol versioning – you can 
have required and optional fields, and the order of struct fields on the 
wire is not defined. While the schemata themselves are never serialized, 
the serialized data includes type tags and field ids for this purpose.


For the actual serialization format, there are several choices 
available. The most popular ones are currently implemented for D: 
Binary, which basically just dumps the raw bytes to the stream (all 
numbers are written in network byte order, though); Compact, which is a 
space-optimized binary protocol (zigzag varints, merging of some bytes 
where you know you don't need all bits, …); and a »rich« JSON format.
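
As a rough illustration of the "zigzag varints" mentioned for the Compact protocol, here is a minimal sketch of the general technique. It is not taken from the Thrift D code; the function names are made up for this example.

```d
import std.stdio : writeln;

// ZigZag maps signed integers to unsigned ones so that values of small
// magnitude (positive or negative) become small unsigned numbers.
ulong zigZagEncode(long n) {
    return (cast(ulong) n << 1) ^ cast(ulong)(n >> 63);
}

long zigZagDecode(ulong z) {
    return cast(long)(z >> 1) ^ -cast(long)(z & 1);
}

// A varint stores 7 payload bits per byte; the high bit marks "more follows".
ubyte[] writeVarint(ulong v) {
    ubyte[] bytes;
    while (v >= 0x80) {
        bytes ~= cast(ubyte)((v & 0x7F) | 0x80);
        v >>= 7;
    }
    bytes ~= cast(ubyte) v;
    return bytes;
}

ulong readVarint(const(ubyte)[] bytes) {
    ulong result = 0;
    uint shift = 0;
    foreach (b; bytes) {
        result |= cast(ulong)(b & 0x7F) << shift;
        if ((b & 0x80) == 0) break;
        shift += 7;
    }
    return result;
}

void main() {
    foreach (long n; [0L, -1L, 1L, 150L, -150L]) {
        auto encoded = writeVarint(zigZagEncode(n));
        writeln(n, " -> ", encoded.length, " byte(s)");
        assert(zigZagDecode(readVarint(encoded)) == n);
    }
}
```

With this scheme a value such as -1 takes a single byte on the wire, where a plain 64-bit two's complement dump would take eight.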


These features obviously come at a (manageable) performance cost, but 
except for that, the code is quite heavily optimized for reading/writing 
performance. For example, while the protocols and transports (serialized 
data sources/sinks) are pluggable at runtime, it is possible to 
specialize all the serialization/RPC code for the actual implementations 
used, thus eliminating all virtual calls and allowing e.g. the 
serialization code for a struct to be inlined into a single function 
without any control flow, and the reading code into a single switch 
statement (for the field ids) inside a loop.


But as said above, the second item from your list, flexibility with 
regard to the types serialized, is a non-goal for Thrift, so it probably 
isn't the best fit for your application.


David

Re: std.concurrency wrapper over MPI?

2011-08-08 Thread Masahiro Nakagawa


On Mon, 08 Aug 2011 01:08:39 +0900, dsimcha  wrote:


On 8/7/2011 12:01 PM, Lutger Blijdestijn wrote:

dsimcha wrote:


On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU
architecture on the deserializing end is the same as that on the
serializing end.  If it works, maybe Orange is a good choice.


Just in case you missed it, the messagepack protocol has a D implementation
which seems to be what you're looking for: http://msgpack.org/ The last
commit on bitbucket reveals it should be compatible with 2.054. Perhaps it
can be adapted as an archiver for Orange.


Ok, this sounds great.  Again, though, it would be great to get  
serialization into Phobos.  (I don't know whether messagepack is  
suitable in its current form, because I haven't looked in detail.)  I  
was vaguely aware of a messagepack implementation for D, but I didn't  
realize it was still maintained and didn't know where it was hosted.


I maintain MessagePack for D and use this library as an internal tool at 
my job.


I will move it from Bitbucket to GitHub. D programmers mainly use git, 
and GitHub is more useful than Bitbucket.


Masahiro

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 21:28, dsimcha wrote:

Yeah, I was trying to wrap my head around the whole "key" concept. I
wasn't very successful. I also tried out Orange and filed a few bug
reports. It may be that Orange isn't the right tool for the job for MPI,
though modulo some bug fixing and polishing it could be extremely useful
in different cases with different sets of tradeoffs.


Every serialized value has an associated key. The key should be unique in 
its context but doesn't have to be unique in the whole document. A key 
can be explicitly chosen, in which case that key will be used, or the 
serializer can create a key (just a number that is incremented). Example:


class Foo
{
int bar;
}

auto foo = new Foo;

When serializing "foo", it will get the key "0", chosen by the 
serializer. When "bar" is serialized it will use the explicit key "bar". 
This way the serialization process won't depend on the order of instance 
variables or struct members.


In addition to keys, all values have an associated id which is unique 
across the whole document. This is used for pointers and similar constructs 
that reference other variables.
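
The key and id scheme described above could be modelled roughly as follows. This is a simplified sketch, not Orange's implementation: `Entry` and `KeyedWriter` are hypothetical names, and a single counter stands in for the per-context key numbering.

```d
import std.conv : to;
import std.stdio : writeln;

// One archived value: its key, its document-wide id and its payload.
struct Entry {
    string key;
    size_t id;
    string value;
}

class KeyedWriter {
    Entry[] entries;
    private size_t nextKey; // used when no explicit key is given
    private size_t nextId;  // unique across the whole document

    void write(string value, string key = null) {
        if (key is null) {
            key = to!string(nextKey++); // serializer-chosen key: "0", "1", ...
        }
        entries ~= Entry(key, nextId++, value);
    }
}

void main() {
    auto writer = new KeyedWriter;
    writer.write("Foo");      // top-level object, gets the key "0"
    writer.write("3", "bar"); // field, uses the explicit key "bar"
    foreach (e; writer.entries) {
        writeln("key=", e.key, " id=", e.id, " value=", e.value);
    }
}
```

Because fields are looked up by name rather than by position, reordering the members of `Foo` would not change what a reader of the archive finds under the key "bar".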



In addition to the bug reports I filed, why is it necessary to write any
serialization code to serialize through the base class? What's wrong
with just doing something like:

class Base {}
class Derived : Base {}

void main() {
auto serializer = new Serializer(new XMLArchive!());

// Introspect Derived and figure out all the details automatically.
serializer.register!(Derived);
}



I haven't thought about that, seems it would work. That will shorten the 
code a lot. This is a part that has not gone through the rewrite.


Note that all documentation on the wiki pages is outdated; it only 
refers to the first version, 0.0.1. The unit tests can be used as 
documentation to see how to use the new version and how it behaves.


--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Sean Kelly

This would probably work with the protobuf format. 

Sent from my iPhone

On Aug 7, 2011, at 12:28 PM, dsimcha  wrote:

> On 8/7/2011 2:27 PM, Jacob Carlborg wrote:
>> On 2011-08-07 17:45, dsimcha wrote:
>>> On 8/7/2011 11:36 AM, Jacob Carlborg wrote:
>>>> Currently, the only available format is XML.
>>> 
>>> Ok, I'll look into writing a binary archiver that assumes that the CPU
>>> architecture on the deserializing end is the same as that on the
>>> serializing end. If it works, maybe Orange is a good choice.
>> 
>> Sounds good. I just hope that the current design allows for a binary
>> archive. Currently the serializer in Orange assumes that an archive can
>> deserialize a value based on a key which could be basically anywhere in
>> the serialized data. This allows at least to implement archives which
>> store the serialized data in a structured format, e.g. XML, JSON, YAML.
>> I don't know if that's possible with a binary format, I'm not familiar
>> with how to implement a binary format.
>> 
> 
> Yeah, I was trying to wrap my head around the whole "key" concept.  I wasn't 
> very successful.  I also tried out Orange and filed a few bug reports.  It 
> may be that Orange isn't the right tool for the job for MPI, though modulo 
> some bug fixing and polishing it could be extremely useful in different cases 
> with different sets of tradeoffs.
> 
> In addition to the bug reports I filed, why is it necessary to write any 
> serialization code to serialize through the base class?  What's wrong with 
> just doing something like:
> 
> class Base {}
> class Derived : Base {}
> 
> void main() {
>     auto serializer = new Serializer(new XMLArchive!());
> 
>     // Introspect Derived and figure out all the details automatically.
>     serializer.register!(Derived);
> }
>

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread dsimcha


On 8/7/2011 2:27 PM, Jacob Carlborg wrote:

On 2011-08-07 17:45, dsimcha wrote:

On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU
architecture on the deserializing end is the same as that on the
serializing end. If it works, maybe Orange is a good choice.


Sounds good. I just hope that the current design allows for a binary
archive. Currently the serializer in Orange assumes that an archive can
deserialize a value based on a key which could be basically anywhere in
the serialized data. This allows at least to implement archives which
store the serialized data in a structured format, e.g. XML, JSON, YAML.
I don't know if that's possible with a binary format, I'm not familiar
with how to implement a binary format.



Yeah, I was trying to wrap my head around the whole "key" concept.  I 
wasn't very successful.  I also tried out Orange and filed a few bug 
reports.  It may be that Orange isn't the right tool for the job for 
MPI, though modulo some bug fixing and polishing it could be extremely 
useful in different cases with different sets of tradeoffs.


In addition to the bug reports I filed, why is it necessary to write any 
serialization code to serialize through the base class?  What's wrong 
with just doing something like:


class Base {}
class Derived : Base {}

void main() {
    auto serializer = new Serializer(new XMLArchive!());

    // Introspect Derived and figure out all the details automatically.
    serializer.register!(Derived);
}

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 18:01, Lutger Blijdestijn wrote:

dsimcha wrote:


On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU
architecture on the deserializing end is the same as that on the
serializing end.  If it works, maybe Orange is a good choice.


Just in case you missed it, the messagepack protocol has a D implementation
which seems to be what you're looking for: http://msgpack.org/ The last
commit on bitbucket reveals it should be compatible with 2.054. Perhaps it
can be adapted as an archiver for Orange.


I think it should be possible.

--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 17:58, dsimcha wrote:

On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Good to know, but what flavor? As I see it there is a three-way tradeoff
in serialization. In order of importance for distributed parallelism,
the qualities are:


I can address these tradeoffs for the Orange serialization library,
http://dsource.org/projects/orange/.



BTW, I know this has been discussed in the past, but I'll bring it up
again. Since serialization is pretty fundamental to a lot of things and
I want to avoid dependency hell, what are the prospects for getting
Orange into Phobos?


To get Orange into Phobos, at least the following must be done:

* Actually finish the rewrite (I'm almost done, the basic stuff works)
* Add more unit tests
* Add documentation
* Rip out all D1 and Tango related code
* Make some minor changes to follow the Phobos style guide; I have not 
followed the 80-120 column limit
* The XML module in Phobos needs some minor updates
* I've used my own kind of mini unit test framework; I don't know if 
people like that, but it should be easy to remove


I think that's all.

--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 17:45, dsimcha wrote:

On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU
architecture on the deserializing end is the same as that on the
serializing end. If it works, maybe Orange is a good choice.


Sounds good. I just hope that the current design allows for a binary 
archive. Currently the serializer in Orange assumes that an archive can 
deserialize a value based on a key which could be basically anywhere in 
the serialized data. This allows at least to implement archives which 
store the serialized data in a structured format, e.g. XML, JSON, YAML. 
I don't know if that's possible with a binary format, I'm not familiar 
with how to implement a binary format.


--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 18:15, Sean Kelly wrote:

I was mostly wondering if the serializer was all template code or if the 
archive portion used some form of polymorphism. Sounds like it's the latter.


The serializer uses template methods, the archive uses interfaces and 
virtual methods.
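
A rough sketch of that split, with template methods on the serializer side and an interface with virtual methods on the archive side, might look like this. The names `Archive`, `KeyValueArchive`, `archiveInt` and `archiveString` are invented for the example and are not Orange's API.

```d
import std.conv : to;
import std.stdio : writeln;

// Archive side: an interface with virtual methods, swappable at runtime.
interface Archive {
    void archiveInt(string key, long value);
    void archiveString(string key, string value);
    string data();
}

// A toy archive that writes key=value pairs.
class KeyValueArchive : Archive {
    private string buffer;

    void archiveInt(string key, long value) {
        buffer ~= key ~ "=" ~ to!string(value) ~ ";";
    }

    void archiveString(string key, string value) {
        buffer ~= key ~ "=\"" ~ value ~ "\";";
    }

    string data() { return buffer; }
}

// Serializer side: template methods that inspect the type at compile time.
class Serializer {
    private Archive archive;

    this(Archive archive) { this.archive = archive; }

    void serialize(T)(T value, string key) {
        static if (is(T : long)) {
            archive.archiveInt(key, value);
        } else static if (is(T : string)) {
            archive.archiveString(key, value);
        } else static if (is(T == struct) || is(T == class)) {
            // Field names become the explicit keys.
            foreach (i, field; value.tupleof) {
                this.serialize(field, __traits(identifier, T.tupleof[i]));
            }
        } else {
            static assert(false, "Unsupported type: " ~ T.stringof);
        }
    }
}

struct Point { int x; int y; string label; }

void main() {
    auto archive = new KeyValueArchive;
    auto serializer = new Serializer(archive);
    serializer.serialize(Point(1, 2, "p"), "point");
    writeln(archive.data());
}
```

In this arrangement a new output format only needs a new `Archive` implementation; the type introspection in `Serializer` stays untouched.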



Sent from my iPhone

On Aug 7, 2011, at 8:19 AM, Jacob Carlborg  wrote:


On 2011-08-07 02:24, Sean Kelly wrote:

Is the archive formatter dynamically pluggable?


I'm not exactly sure what you mean but you can create new archive types and use 
them with the existing serializer. When creating a new serializer it takes an 
archive (as an interface) as a parameter.


Sent from my iPhone

On Aug 6, 2011, at 11:51 AM, Jacob Carlborg   wrote:


On 2011-08-06 18:32, Sean Kelly wrote:

I'd love to be able to send classes between processes, but first we need a good 
serialization/deserialization mechanism.


Have a look at Orange, I don't know if it's considered good but it works for 
almost all types available in D, the only available archive is currently XML. 
http://dsource.org/projects/orange/

--
/Jacob Carlborg



--
/Jacob Carlborg



--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread dsimcha


On 8/6/2011 12:32 PM, Sean Kelly wrote:

I'd love to be able to send classes between processes, but first we need a good 
serialization/deserialization mechanism.


The more I think about it, the more I think that std.concurrency isn't 
quite the right interface for cluster parallelism.  I'm thinking instead 
of doing something loosely based on, but not a translation of, 
boost::mpi.  The following differences between std.concurrency and what 
makes sense for MPI bother me:


1.  shared/immutable isn't needed when you're copying the data anyhow.

2.  spawn() is taken care of by the MPI runtime.

3.  std.concurrency doesn't support broadcasting.
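
To make the first point concrete: the restriction on mutable indirection can be observed at compile time with `std.traits.hasUnsharedAliasing`. This is only a sketch of the kind of check involved; how std.concurrency applies it internally is not shown here, and the struct names are made up.

```d
import std.stdio : writeln;
import std.traits : hasUnsharedAliasing;

struct Plain { int x; double y; }    // no indirection at all
struct WithSlice { int[] payload; }  // mutable, thread-local indirection

// Types with mutable thread-local indirection are the ones that cannot be
// passed between threads as-is.
static assert(hasUnsharedAliasing!(int[]));
static assert(hasUnsharedAliasing!WithSlice);

// Immutable or shared data, and plain values, carry no such aliasing.
static assert(!hasUnsharedAliasing!(immutable(int)[]));
static assert(!hasUnsharedAliasing!(shared(int)[]));
static assert(!hasUnsharedAliasing!Plain);

void main() {
    writeln("int[] has unshared aliasing: ", hasUnsharedAliasing!(int[]));
    writeln("immutable(int)[] has unshared aliasing: ",
        hasUnsharedAliasing!(immutable(int)[]));
}
```

Across MPI processes an `int[]` would be copied into the receiver's address space anyway, so the aliasing that this check guards against cannot occur there.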

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Sean Kelly

I was mostly wondering if the serializer was all template code or if the 
archive portion used some form of polymorphism. Sounds like it's the latter. 

Sent from my iPhone

On Aug 7, 2011, at 8:19 AM, Jacob Carlborg  wrote:

> On 2011-08-07 02:24, Sean Kelly wrote:
>> Is the archive formatter dynamically pluggable?
> 
> I'm not exactly sure what you mean but you can create new archive types and 
> use them with the existing serializer. When creating a new serializer it 
> takes an archive (as an interface) as a parameter.
> 
>> Sent from my iPhone
>> 
>> On Aug 6, 2011, at 11:51 AM, Jacob Carlborg  wrote:
>> 
>>> On 2011-08-06 18:32, Sean Kelly wrote:
>>>> I'd love to be able to send classes between processes, but first we need a 
>>>> good serialization/deserialization mechanism.
>>> 
>>> Have a look at Orange, I don't know if it's considered good but it works 
>>> for almost all types available in D, the only available archive is 
>>> currently XML. http://dsource.org/projects/orange/
>>> 
>>> --
>>> /Jacob Carlborg
> 
> 
> -- 
> /Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Sean Kelly

Nope. It would represent an external destination and defines the protocol. 

Sent from my iPhone

On Aug 6, 2011, at 6:57 PM, dsimcha  wrote:

> On 8/6/2011 8:26 PM, Sean Kelly wrote:
>> I'm hoping to simply extend the existing API. The crucial portion will be 
>> the addition of a Node (base) type.
> 
> So Node would be the equivalent of Tid in the current API?

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread dsimcha


On 8/7/2011 12:01 PM, Lutger Blijdestijn wrote:

dsimcha wrote:


On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU
architecture on the deserializing end is the same as that on the
serializing end.  If it works, maybe Orange is a good choice.


Just in case you missed it, the messagepack protocol has a D implementation
which seems to be what you're looking for: http://msgpack.org/ The last
commit on bitbucket reveals it should be compatible with 2.054. Perhaps it
can be adapted as an archiver for Orange.


Ok, this sounds great.  Again, though, it would be great to get 
serialization into Phobos.  (I don't know whether messagepack is 
suitable in its current form, because I haven't looked in detail.)  I 
was vaguely aware of a messagepack implementation for D, but I didn't 
realize it was still maintained and didn't know where it was hosted.

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Lutger Blijdestijn

dsimcha wrote:

> On 8/7/2011 11:36 AM, Jacob Carlborg wrote:
>> Currently, the only available format is XML.
> 
> Ok, I'll look into writing a binary archiver that assumes that the CPU
> architecture on the deserializing end is the same as that on the
> serializing end.  If it works, maybe Orange is a good choice.

Just in case you missed it, the messagepack protocol has a D implementation  
which seems to be what you're looking for: http://msgpack.org/ The last 
commit on bitbucket reveals it should be compatible with 2.054. Perhaps it 
can be adapted as an archiver for Orange.

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Lutger Blijdestijn

link for the D implementation: https://bitbucket.org/repeatedly/msgpack4d/

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread dsimcha


On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Good to know, but what flavor? As I see it there is a three-way tradeoff
in serialization. In order of importance for distributed parallelism,
the qualities are:


I can address these tradeoffs for the Orange serialization library,
http://dsource.org/projects/orange/.



BTW, I know this has been discussed in the past, but I'll bring it up 
again.  Since serialization is pretty fundamental to a lot of things and 
I want to avoid dependency hell, what are the prospects for getting 
Orange into Phobos?

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread dsimcha


On 8/7/2011 11:36 AM, Jacob Carlborg wrote:

Currently, the only available format is XML.


Ok, I'll look into writing a binary archiver that assumes that the CPU 
architecture on the deserializing end is the same as that on the 
serializing end.  If it works, maybe Orange is a good choice.

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 00:09, dsimcha wrote:

On 8/6/2011 5:38 PM, jdrewsen wrote:

AFAIK David Nadlinger is handling serialization in his GSOC Thrift
project that he is working on currently.

/Jonas


Good to know, but what flavor? As I see it there is a three-way tradeoff
in serialization. In order of importance for distributed parallelism,
the qualities are:


I can address these tradeoffs for the Orange serialization library, 
http://dsource.org/projects/orange/.



1. Efficiency. How much does it cost to serialize/unserialize something
and how much space overhead is there?


I haven't done any measurements but I would guess it depends on which 
archive type is used. The actual serializer tries to do quite a lot, 
where possible, at compile time. But it also stores a reference for 
every serialized value, in case a pointer points to the value.



2. Flexibility w.r.t. types: How many types can be serialized? How
faithfully are they reproduced on the other end w.r.t. things like
pointer/reference/slice aliasing?


If I haven't missed something Orange can serialize almost all types, 
except unions, function pointers, void pointers and delegates.



3. Standardization: How universally understood is the format? Can it be
used to send data across different CPU architectures? Across languages?
Is it human readable? Is it based on some meta-format like XML?


Currently, the only available format is XML.

--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-07 Thread Jacob Carlborg


On 2011-08-07 02:24, Sean Kelly wrote:

Is the archive formatter dynamically pluggable?


I'm not exactly sure what you mean but you can create new archive types 
and use them with the existing serializer. When creating a new 
serializer it takes an archive (as an interface) as a parameter.



Sent from my iPhone

On Aug 6, 2011, at 11:51 AM, Jacob Carlborg  wrote:


On 2011-08-06 18:32, Sean Kelly wrote:

I'd love to be able to send classes between processes, but first we need a good 
serialization/deserialization mechanism.


Have a look at Orange, I don't know if it's considered good but it works for 
almost all types available in D, the only available archive is currently XML. 
http://dsource.org/projects/orange/

--
/Jacob Carlborg



--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread dsimcha


On 8/6/2011 8:26 PM, Sean Kelly wrote:

I'm hoping to simply extend the existing API. The crucial portion will be the 
addition of a Node (base) type.


So Node would be the equivalent of Tid in the current API?

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Sean Kelly

I'm hoping to simply extend the existing API. The crucial portion will be the 
addition of a Node (base) type. 

Sent from my iPhone

On Aug 6, 2011, at 2:38 PM, jdrewsen  wrote:

> On 06-08-2011 05:51, dsimcha wrote:
>> I've finally bitten the bullet and learned MPI
>> (http://en.wikipedia.org/wiki/Message_passing_interface) for an ultra
>> computationally intensive research project I've been working on lately.
>> I wrote all the MPI-calling code in D against the C API, using a very
>> quick-and-dirty (i.e. not releasable) translation of the parts of the
>> header I needed.
>> 
>> I'm halfway-thinking of writing a std.concurrency-like interface on top
>> of MPI in D. A few questions:
>> 
>> 1. Is anyone besides me interested in this?
>> 
>> 2. Is anyone already working on something similar?
>> 
>> 3. Would this be Phobos material even though it would depend on MPI, or
>> would it better be kept as a 3rd party library?
> 
> I think std.concurrency needs to define a new interface for passing messages 
> out-of-process, i.e. to another process or host. The implementation itself should 
> probably be 3rd party since there are many serialized representations and 
> protocols out there to pick from.
> 
>> 4. std.concurrency in its current incarnation doesn't allow objects with
>> mutable indirection to be passed as messages. This makes sense when
>> passing messages between threads in the same address space. However, for
>> passing between MPI processes, the object is going to be copied anyhow.
>> Should the restriction be kept (for consistency) or removed (because it
>> doesn't serve much of a purpose in the MPI context)?
> >
>> 5. For passing complex object graphs, serialization would obviously be
>> necessary. What's the current state of the art in serialization in D? I
>> want something that's efficient and general first and foremost. I really
>> don't care about human readability or standards compliance (in other
>> words, no XML or JSON or anything like that).
> 
> AFAIK David Nadlinger is handling serialization in his GSOC Thrift project 
> that he is working on currently.
> 
> /Jonas

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Sean Kelly

Is the archive formatter dynamically pluggable?

Sent from my iPhone

On Aug 6, 2011, at 11:51 AM, Jacob Carlborg  wrote:

> On 2011-08-06 18:32, Sean Kelly wrote:
>> I'd love to be able to send classes between processes, but first we need a 
>> good serialization/deserialization mechanism.
> 
> Have a look at Orange, I don't know if it's considered good but it works for 
> almost all types available in D, the only available archive is currently XML. 
> http://dsource.org/projects/orange/
> 
> -- 
> /Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread dsimcha


On 8/6/2011 5:38 PM, jdrewsen wrote:

AFAIK David Nadlinger is handling serialization in his GSOC Thrift
project that he is working on currently.

/Jonas


Good to know, but what flavor?  As I see it there is a three-way 
tradeoff in serialization.  In order of importance for distributed 
parallelism, the qualities are:


1.  Efficiency.  How much does it cost to serialize/unserialize 
something and how much space overhead is there?


2.  Flexibility w.r.t. types:  How many types can be serialized?  How 
faithfully are they reproduced on the other end w.r.t. things like 
pointer/reference/slice aliasing?


3.  Standardization:  How universally understood is the format?  Can it 
be used to send data across different CPU architectures?  Across 
languages?  Is it human readable?  Is it based on some meta-format like XML?


For enterprisey use cases, I think this ordering would probably be 
completely reversed.  For example, in a typical MPI cluster all nodes 
are of the same architecture, so it's usually perfectly reasonable to 
send arrays of primitives as just raw bits.  I imagine this is a 
terrible idea in other contexts that I know less about.
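
As a rough illustration of the "raw bits" approach described above, the following D sketch reinterprets an array of primitives as bytes and back, with no per-element encoding. The helper names toRawBytes and fromRawBytes are hypothetical, the actual transport step is left out, and the round trip is only meaningful under the stated assumption that sender and receiver share the same architecture (element size, endianness, floating-point format).

```d
import std.stdio : writeln;
import std.traits : hasIndirections;

/// View an array of primitives as raw bytes. No copy and no conversion
/// takes place; the result aliases the memory of `values`.
const(ubyte)[] toRawBytes(T)(const(T)[] values)
    if (!hasIndirections!T)
{
    return cast(const(ubyte)[]) values;
}

/// Reinterpret received bytes as an array of T. The bytes are copied first
/// so that the result owns its own memory.
T[] fromRawBytes(T)(const(ubyte)[] bytes)
    if (!hasIndirections!T)
{
    assert(bytes.length % T.sizeof == 0,
           "byte count is not a multiple of T.sizeof");
    return cast(T[]) bytes.dup;
}

void main()
{
    double[] data = [1.5, -2.0, 3.25];
    auto wire = toRawBytes(data);          // what would be handed to the transport
    auto back = fromRawBytes!double(wire); // what the receiving side would rebuild
    assert(back == data);
    writeln(back);
}
```

The hasIndirections constraint rejects element types that contain pointers, slices or class references, which is exactly where raw bits stop being enough and a real serializer is needed.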

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread jdrewsen


On 06-08-2011 05:51, dsimcha wrote:

> I've finally bitten the bullet and learned MPI
> (http://en.wikipedia.org/wiki/Message_passing_interface) for an ultra
> computationally intensive research project I've been working on lately.
> I wrote all the MPI-calling code in D against the C API, using a very
> quick-and-dirty (i.e. not releasable) translation of the parts of the
> header I needed.
>
> I'm halfway-thinking of writing a std.concurrency-like interface on top
> of MPI in D. A few questions:
>
> 1. Is anyone besides me interested in this?
>
> 2. Is anyone already working on something similar?
>
> 3. Would this be Phobos material even though it would depend on MPI, or
> would it better be kept as a 3rd party library?

I think std.concurrency needs to define a new interface for passing
messages out of process, i.e. to another process or host. The implementation
itself should probably be 3rd party, since there are many serialized
representations and protocols out there to pick from.

> 4. std.concurrency in its current incarnation doesn't allow objects with
> mutable indirection to be passed as messages. This makes sense when
> passing messages between threads in the same address space. However, for
> passing between MPI processes, the object is going to be copied anyhow.
> Should the restriction be kept (for consistency) or removed (because it
> doesn't serve much of a purpose in the MPI context)?
>
> 5. For passing complex object graphs, serialization would obviously be
> necessary. What's the current state of the art in serialization in D? I
> want something that's efficient and general first and foremost. I really
> don't care about human readability or standards compliance (in other
> words, no XML or JSON or anything like that).

AFAIK David Nadlinger is handling serialization in his GSOC Thrift
project that he is working on currently.

/Jonas

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Jonathan M Davis

On Friday 05 August 2011 23:51:24 dsimcha wrote:
> I've finally bitten the bullet and learned MPI
> (http://en.wikipedia.org/wiki/Message_passing_interface) for an ultra
> computationally intensive research project I've been working on lately.
>   I wrote all the MPI-calling code in D against the C API, using a very
> quick-and-dirty (i.e. not releasable) translation of the parts of the
> header I needed.
> 
> I'm halfway-thinking of writing a std.concurrency-like interface on top
> of MPI in D.  A few questions:
> 
> 1.  Is anyone besides me interested in this?

Personally, I've never heard of MPI and have no interest in it whatsoever, but 
I don't do much with concurrent programming. Others will probably be far more 
interested though.

> 3.  Would this be Phobos material even though it would depend on MPI, or
> would it better be kept as a 3rd party library?

If MPI is something which can be found by default on your typical OS install, 
then it may be okay to have it in Phobos. But in general, I would think that 
if you need to install 3rd party libraries to use it, it should probably be a 
3rd party library itself. It may be a bit of a grey area though. If we have 
many libraries such as that, then we may want to create an official (or at 
least pseudo-official) project which contains the major ones to make them easy 
to find. Regardless, we _don't_ want Phobos to require extra dependencies which 
aren't normally found on your typical OS install such that you have to install 
them even if you don't use the functionality that they're needed for.

- Jonathan M Davis

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Jacob Carlborg


On 2011-08-06 18:32, Sean Kelly wrote:

> I'd love to be able to send classes between processes, but first we need a good
> serialization/deserialization mechanism.

Have a look at Orange. I don't know if it's considered good, but it works
for almost all types available in D. The only available archive is
currently XML. http://dsource.org/projects/orange/


--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Sean Kelly

I'd love to be able to send classes between processes, but first we need a good 
serialization/deserialization mechanism. 

Sent from my iPhone

On Aug 5, 2011, at 8:51 PM, dsimcha wrote:

> I've finally bitten the bullet and learned MPI 
> (http://en.wikipedia.org/wiki/Message_passing_interface) for an ultra 
> computationally intensive research project I've been working on lately.  I 
> wrote all the MPI-calling code in D against the C API, using a very 
> quick-and-dirty (i.e. not releasable) translation of the parts of the header 
> I needed.
> 
> I'm halfway-thinking of writing a std.concurrency-like interface on top of 
> MPI in D.  A few questions:
> 
> 1.  Is anyone besides me interested in this?
> 
> 2.  Is anyone already working on something similar?
> 
> 3.  Would this be Phobos material even though it would depend on MPI, or 
> would it better be kept as a 3rd party library?
> 
> 4.  std.concurrency in its current incarnation doesn't allow objects with 
> mutable indirection to be passed as messages.   This makes sense when passing 
> messages between threads in the same address space. However, for passing 
> between MPI processes, the object is going to be copied anyhow.  Should the 
> restriction be kept (for consistency) or removed (because it doesn't serve 
> much of a purpose in the MPI context)?
> 
> 5.  For passing complex object graphs, serialization would obviously be 
> necessary.  What's the current state of the art in serialization in D? I want 
> something that's efficient and general first and foremost.  I really don't 
> care about human readability or standards compliance (in other words, no XML 
> or JSON or anything like that).

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread dsimcha


On 8/6/2011 2:57 AM, Russel Winder wrote:

> The main problem here is going to be that when anything gets released
> performance will be the only yardstick by which things are measured.
> Simplicity of code, ease of evolution of code, all the things
> professional developers value, will go out of the window.  It's HPC
> after all :-)


Now that I think of it, there's also the option of porting boost::mpi to 
D and then possibly writing a std.concurrency-like wrapper on top of 
that (or not).

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Russel Winder

On Sat, 2011-08-06 at 10:09 -0400, dsimcha wrote:
[ . . . ]
> Anyhow, D has one key advantage that makes it more tolerant of 
> communication overhead than most languages:  std.parallelism.  At least 
> the way things are set up on the cluster here at Johns Hopkins, each 
> node has 8 cores.  The "traditional" MPI way of doing things is 
> apparently to allocate 8 MPI processes per node in this case, one per 
> core.  Instead, I'm allocating one process per node, using MPI only for 
> very coarse grained parallelism and using std.parallelism for more 
> fine-grained parallelism to keep all 8 cores occupied with one MPI process.

I think increasingly the idiom in the Fortran/C/C++ HPC community is to
use MPI on a per address space basis, rather than a per ALU basis, and
to use OpenMP to handle the thread control in a given address space
handling the multicores.  (OpenMP being something totally different to
OpenMPI.)

In the C++ arena, though, there is Threading Building Blocks (TBB), which
has an element of arcaneness but is a whole lot better than OpenMP.

As you point out there are much better, generally higher-level,
abstractions that would make HPC code faster as well as much, much
easier to maintain.  However even with Intel's high budget marketing of
some of the alternatives, the HPC community seem steadfast in their
support of MPI and OpenMP.  Of course they also have codes from the
1970s and 1980s that are still in use because no one is prepared to
rewrite them.

-- 
Russel.
=
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


Re: std.concurrency wrapper over MPI?

2011-08-06 Thread dsimcha


On 8/6/2011 2:57 AM, Russel Winder wrote:

> The main problem here is going to be that when anything gets released
> performance will be the only yardstick by which things are measured.
> Simplicity of code, ease of evolution of code, all the things
> professional developers value, will go out of the window.  It's HPC
> after all :-)


This is why, even though I do stuff that's arguably HPC, I can't stand 
the HPC community.  Of course performance is important, but nothing 
should be so sacred as to be completely immune to tradeoffs.  The thing 
that drew me to D is that you can get pretty good performance out of it 
without sacrificing that much ease of use compared to dynamic languages.
Besides, you can always provide a high-level but not-that-efficient 
API for most cases and a lower-level API for when more control is needed.


Anyhow, D has one key advantage that makes it more tolerant of 
communication overhead than most languages:  std.parallelism.  At least 
the way things are set up on the cluster here at Johns Hopkins, each 
node has 8 cores.  The "traditional" MPI way of doing things is 
apparently to allocate 8 MPI processes per node in this case, one per 
core.  Instead, I'm allocating one process per node, using MPI only for 
very coarse grained parallelism and using std.parallelism for more 
fine-grained parallelism to keep all 8 cores occupied with one MPI process.
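
To make the fine-grained half of that hybrid scheme concrete, here is a minimal sketch of what a single process might do on its node with std.parallelism. The function name sumOfSquares and the workload are made up for illustration, and the MPI calls that would distribute the coarse-grained pieces between nodes are deliberately left out.

```d
import std.algorithm : map;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writeln;

/// Sum of the squares of 0 .. n-1, computed by the worker threads of the
/// default task pool. In the hybrid setup each MPI process would run
/// something like this on its own share of the problem.
long sumOfSquares(long n)
{
    // taskPool.reduce splits the range into work units and combines the
    // partial results; integer addition is associative, so the result
    // does not depend on how the work is split.
    return taskPool.reduce!"a + b"(0L, iota(n).map!"a * a");
}

void main()
{
    writeln(sumOfSquares(1_000));
}
```

By default the task pool sizes itself from the number of cores it detects, so one process per node is enough to keep all of them busy.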

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread bearophile

dsimcha:

> 1.  Is anyone besides me interested in this?

Other people are interested.


> 3.  Would this be Phobos material even though it would depend on MPI, or 
> would it better be kept as a 3rd party library?

I'd like one or more Phobos modules built on top of the basic MPI, so I think 
it's better to have MPI too in Phobos.


> Should the restriction be kept (for consistency) or 
> removed (because it doesn't serve much of a purpose in the MPI context)?

This is not easy for me to say right now.

Bye,
bearophile

Re: std.concurrency wrapper over MPI?

2011-08-06 Thread Jacob Carlborg


On 2011-08-06 05:51, dsimcha wrote:

> 5. For passing complex object graphs, serialization would obviously be
> necessary. What's the current state of the art in serialization in D? I
> want something that's efficient and general first and foremost. I really
> don't care about human readability or standards compliance (in other
> words, no XML or JSON or anything like that).


My rewrite of Orange is almost finished. It can currently only serialize 
to XML, but it's possible to create new archive types for other formats. 
I have no idea about the performance; I'm mostly focusing on being able to 
serialize as many types as possible.


http://dsource.org/projects/orange/

--
/Jacob Carlborg

Re: std.concurrency wrapper over MPI?

2011-08-05 Thread Russel Winder

On Fri, 2011-08-05 at 23:51 -0400, dsimcha wrote:
[ . . . ]
> 1.  Is anyone besides me interested in this?

MPI may be ancient, it may be a bit daft in terms of its treatment of
marshalling, unmarshalling and serializing, it may be only a Fortran and
C thing bolted into C++ (quite well) but it is the de facto standard for
HPC.  OK so HPC is about 10% of world-wide computing, probably less than
that of spend despite the enormous per installation price, but it is
about 90% of political marketing.  Any short term parallelism strategy
must include MPI -- and work with OpenMPI and MPICH2.

So I don't think it is a matter of just interest for D, I would say that
if D is to stand with C++, C and Fortran then there has to be an MPI
API.  Even though MPI should be banned going forward.

> 2.  Is anyone already working on something similar?
> 
> 3.  Would this be Phobos material even though it would depend on MPI, or 
> would it better be kept as a 3rd party library?

Given that it requires a transitive dependency then either Phobos goes
forward with optional dependencies or the MPI API is a separate thing.
Given my personal opinion that actor model, dataflow model, agents, etc.
should be the application level concurrency and parallelism model, I
would be quite happy with an MPI API not being in Phobos.  Keep Phobos
for that which every D installation will need.  MPI is a niche market in
that respect.

Optional dependencies sort of work but are sort of a real pain in the
Java/Maven milieu.

> 4.  std.concurrency in its current incarnation doesn't allow objects 
> with mutable indirection to be passed as messages.   This makes sense 
> when passing messages between threads in the same address space. 
> However, for passing between MPI processes, the object is going to be 
> copied anyhow.  Should the restriction be kept (for consistency) or 
> removed (because it doesn't serve much of a purpose in the MPI context)?

At the root of this issue is local thread-based parallelism in a shared
memory context, vs cluster parallelism.  MPI is a cluster solution --
even though it can be used in a multicore shared-memory situation.  The
point about enforced copying vs. potential sharing is core to this
obviously.  This has to be handled with absolute top notch performance
in mind.  It is arguably a situation where programming language
semantics and purity have to be sacrificed at the altar of performance.
There are already far too many MPI applications that are written with
far too much comms code in the application simply to ensure performance
-- because the MPI infrastructure cannot be trusted to do things fast
enough if you use anything other than the bottom most layer.

> 5.  For passing complex object graphs, serialization would obviously be 
> necessary.  What's the current state of the art in serialization in D? 
> I want something that's efficient and general first and foremost.  I 
> really don't care about human readability or standards compliance (in 
> other words, no XML or JSON or anything like that).

Again performance is everything, so nothing must get in the way of
having something that cannot be made faster.

The main problem here is going to be that when anything gets released
performance will be the only yardstick by which things are measured.
Simplicity of code, ease of evolution of code, all the things
professional developers value, will go out of the window.  It's HPC
after all :-)

I still think D needs a dataflow, CSP and data parallelism strategy, cf.
Go, GPars, Akka, even Haskell.  Having actors is good, but having only
actors is not good, cf. Scala and Akka.

-- 
Russel.
=
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.win...@ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: rus...@russel.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder


std.concurrency wrapper over MPI?

2011-08-05 Thread dsimcha

I've finally bitten the bullet and learned MPI 
(http://en.wikipedia.org/wiki/Message_passing_interface) for an ultra 
computationally intensive research project I've been working on lately.
I wrote all the MPI-calling code in D against the C API, using a very 
quick-and-dirty (i.e. not releasable) translation of the parts of the 
header I needed.


I'm halfway-thinking of writing a std.concurrency-like interface on top 
of MPI in D.  A few questions:


1.  Is anyone besides me interested in this?

2.  Is anyone already working on something similar?

3.  Would this be Phobos material even though it would depend on MPI, or 
would it better be kept as a 3rd party library?


4.  std.concurrency in its current incarnation doesn't allow objects 
with mutable indirection to be passed as messages.   This makes sense 
when passing messages between threads in the same address space. 
However, for passing between MPI processes, the object is going to be 
copied anyhow.  Should the restriction be kept (for consistency) or 
removed (because it doesn't serve much of a purpose in the MPI context)?


5.  For passing complex object graphs, serialization would obviously be 
necessary.  What's the current state of the art in serialization in D? 
I want something that's efficient and general first and foremost.  I 
really don't care about human readability or standards compliance (in 
other words, no XML or JSON or anything like that).
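
For readers unfamiliar with the restriction raised in question 4, the sketch below shows how it looks with plain in-process std.concurrency; it does not involve MPI at all. The names worker and sumViaWorker are hypothetical. A slice of immutable elements can be sent to another thread, while sending a mutable slice is rejected at compile time, as the commented-out lines indicate.

```d
import std.concurrency : receive, receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

/// Runs in its own thread: waits for one array and replies with its sum.
void worker(Tid owner)
{
    receive((immutable(int)[] xs) {
        int total = 0;
        foreach (x; xs)
            total += x;
        owner.send(total);
    });
}

/// Sends `data` to a freshly spawned worker and returns the worker's reply.
int sumViaWorker(immutable(int)[] data)
{
    auto tid = spawn(&worker, thisTid);
    tid.send(data); // accepted: the elements are immutable, so sharing is safe

    // int[] scratch = data.dup;
    // tid.send(scratch); // rejected at compile time: mutable indirection

    return receiveOnly!int();
}

void main()
{
    writeln(sumViaWorker([1, 2, 3]));
}
```

Between threads the check prevents two threads from silently sharing mutable data. Between processes the payload has to be copied anyway, which is why the question of keeping the check for consistency comes up.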
