Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-04-17 Thread Thierry Goubier
2018-04-17 14:28 GMT+02:00 Clément Bera :
> SharedMemory on heterogenous hardware ? Do you mean you need to physically
> plug the memory into a Raspberry Pie and an x86 computer ? Or you mean
> exporting the RAM of one hardware as NFS to the others ? I was just thinking
> sharing the memory between multiple pairs of image+VM on the same machine to
> be able to run some multi-threaded algorithm on the shared buffer. I know
> it's not much but we need to start somewhere.

No, it's having a memory abstraction (memory chunks) handled by a
server on a host (x86 or ARM), sending them over MPI to clients (x86
or ARM), each client accessing it through the OS shared memory,
releasing it when done so that other tasks can work on it (distributed
pipeline).
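
A minimal sketch of that acquire/process/release cycle, in Python rather
than Pharo and under loud assumptions: ChunkServer, run_stage and
run_pipeline are hypothetical names, the MPI transport is replaced by direct
method calls, and every stage runs in one process on one host. Only the OS
shared-memory part (multiprocessing.shared_memory) is real API.

```python
from multiprocessing import shared_memory


class ChunkServer:
    """Hypothetical chunk owner: creates chunks and leases each to one client at a time."""

    def __init__(self):
        self._owned = {}      # chunk name -> owning SharedMemory handle
        self._leased = set()  # names currently held by a client

    def create(self, payload):
        shm = shared_memory.SharedMemory(create=True, size=len(payload))
        shm.buf[:len(payload)] = payload
        self._owned[shm.name] = shm
        return shm.name

    def acquire(self, name):
        if name in self._leased:
            raise RuntimeError(f"chunk {name} is already in use")
        self._leased.add(name)

    def release(self, name):
        self._leased.discard(name)

    def destroy(self, name, length):
        """Return the final contents and free the OS segment."""
        shm = self._owned.pop(name)
        try:
            return bytes(shm.buf[:length])
        finally:
            shm.close()
            shm.unlink()


def run_stage(server, name, length, transform):
    """One pipeline task: lease the chunk, rewrite it in place, release it."""
    server.acquire(name)
    shm = shared_memory.SharedMemory(name=name)  # attach through the OS, by name
    try:
        # transform must return exactly `length` bytes
        shm.buf[:length] = transform(bytes(shm.buf[:length]))
    finally:
        shm.close()
        server.release(name)


def run_pipeline(payload, stages):
    """Push one non-empty chunk through every stage, in order."""
    server = ChunkServer()
    name = server.create(payload)
    for stage in stages:
        run_stage(server, name, len(payload), stage)
    return server.destroy(name, len(payload))
```

For example, run_pipeline(b"pharo", [bytes.upper, lambda b: b[::-1]]) sends
one chunk through two stages. In the real setting each stage would be a
separate client, possibly on a different architecture, and the chunk name
would travel in a message instead of a function argument.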

> The student would have 6 months, including 1 month to write the thesis so it
> cannot be too heavy. Something that we can re-use with a minor research
> contribution would be nice.

Given the state of the art in the field (20+ years of work on shared
memory, distributed shared memory, and distributed object stores and
migration), the only easy one that I know of is heterogeneity.

And even there, doing something worthy of a paper is hard.

Thierry


Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-04-17 Thread Clément Bera
On Tue, Apr 17, 2018 at 2:08 PM, Thierry Goubier 
wrote:

> I can forward that to a researcher here working on distributed shared
> memory.
>
> Research question can be:
> - heterogeneity (x86 + pi at the same time)
> - load balancing between competing images on heterogeneous hardware
> ...
> - Object migration (pointer forwarding). Probably not, state of the
> art is very advanced on that which means a costly implementation to
> reach parity.
>
>
Shared memory on heterogeneous hardware? Do you mean you need to physically
plug the memory into both a Raspberry Pi and an x86 computer? Or do you mean
exporting the RAM of one machine to the others, as with NFS? I was just
thinking of sharing the memory between multiple image+VM pairs on the same
machine, to be able to run some multi-threaded algorithms on the shared
buffer. I know it's not much, but we need to start somewhere.
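
A minimal sketch of that kind of experiment, in Python rather than Pharo:
workers attach to one shared buffer of int32 values by name, each sorts its
own slice in place, and the sorted runs are merged at the end. Assumptions:
sort_chunk and parallel_sort are hypothetical names, threads stand in for
the separate image+VM pairs, and the final merge is sequential.

```python
import heapq
import struct
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import shared_memory

INT32 = 4  # bytes per element


def sort_chunk(name, start, stop):
    """Attach to the segment by name and sort elements [start, stop) in place."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        fmt = f"<{stop - start}i"
        values = struct.unpack_from(fmt, shm.buf, start * INT32)
        struct.pack_into(fmt, shm.buf, start * INT32, *sorted(values))
    finally:
        shm.close()


def parallel_sort(values, workers=4):
    """Sort a list of int32 values through a shared buffer split among workers."""
    n = len(values)
    shm = shared_memory.SharedMemory(create=True, size=max(n, 1) * INT32)
    try:
        struct.pack_into(f"<{n}i", shm.buf, 0, *values)
        # Contiguous, non-overlapping slices, one per worker.
        bounds = [(i * n // workers, (i + 1) * n // workers) for i in range(workers)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(lambda b: sort_chunk(shm.name, *b), bounds))
        # Each slice is now a sorted run; merge the runs into one sorted list.
        runs = [struct.unpack_from(f"<{hi - lo}i", shm.buf, lo * INT32)
                for lo, hi in bounds]
        return list(heapq.merge(*runs))
    finally:
        shm.close()
        shm.unlink()
```

Because the slices do not overlap, the sorting phase needs no locking at
all. Synchronization only becomes interesting once workers share elements,
for instance in a parallel merge.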

The student would have 6 months, including 1 month to write the thesis, so
the project cannot be too heavy. Something that we can reuse, with a minor
research contribution, would be nice.



-- 
Clément Béra
https://clementbera.github.io/
https://clementbera.wordpress.com/


Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-04-17 Thread Thierry Goubier
2018-04-17 14:03 GMT+02:00 Clément Bera :
> Hi Pavel,
>
> I'm looking at PosixSharedMemory again since I have to write a Master
> student proposal and I think this could be a good topic. I'm not really
> expert on SharedMemory, so I'm going to share what I have in mind, we can
> discuss about it, you could also co-supervise the student indirectly (though
> officially the supervisor has to be VUB staff).
>
> Implementation-wise, I was thinking what could be interesting are:
> - Making PosixSharedMemory compatible with TaskIt so from you image you can
> create a SharedMemory buffer, spawn another image+VM and attach it to the
> shared memory to have multiple threads working on the shared section.
> - Add the #at:if:put: primitive, which write into the shared memory using
> compare and swap instruction for efficient thread-safe access.
> - Add on SharedMemory all the primitives to read/write native types to the
> buffer (int64, double, etc.) with CAS and non CAS instructions.
> - Maybe add APIs to read/write objects through Fuel to pass them by copy,
> though this looks difficult in some cases.
> - Implement a lock system or a semaphore system on top of the CAS
> - implement lock-free and lock-full algorithm using CAS and non CAS
> instructions (I think a first try would be parallelSort on a 1Gb buffer of
> int32 with 4 native threads)
>
> What do you think ? Do you have ideas ? Are you interested ? Do you thing
> having a student on this would be nice ?
>
> The master thesis proposal has to include a research question. I am not sure
> what other languages do regarding shared memory.It's not clear so far what
> the research question is.

I can forward that to a researcher here working on distributed shared memory.

Research questions could be:
- heterogeneity (x86 + pi at the same time)
- load balancing between competing images on heterogeneous hardware
...
- Object migration (pointer forwarding). Probably not: the state of the
art is very advanced there, which means a costly implementation to
reach parity.

Thierry




Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-04-17 Thread Clément Bera
Hi Pavel,

I'm looking at PosixSharedMemory again since I have to write a Master
student proposal and I think this could be a good topic. I'm not really an
expert on shared memory, so I'm going to share what I have in mind and we
can discuss it. You could also co-supervise the student indirectly
(though officially the supervisor has to be VUB staff).

Implementation-wise, I was thinking the following could be interesting:
- Making PosixSharedMemory compatible with TaskIt, so that from your image
you can create a SharedMemory buffer, spawn another image+VM and attach it
to the shared memory to have multiple threads working on the shared section.
- Adding the #at:if:put: primitive, which writes into the shared memory
using a compare-and-swap (CAS) instruction for efficient thread-safe access.
- Adding to SharedMemory all the primitives to read/write native types
from/to the buffer (int64, double, etc.) with CAS and non-CAS instructions.
- Maybe adding APIs to read/write objects through Fuel to pass them by copy,
though this looks difficult in some cases.
- Implementing a lock system or a semaphore system on top of the CAS.
- Implementing lock-free and lock-based algorithms using CAS and non-CAS
instructions (I think a first try would be parallelSort on a 1 GB buffer of
int32 with 4 native threads).
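
To make the CAS items concrete, here is a small sketch in Python rather than
Pharo. Everything in it is an assumption for illustration: SharedCells,
atomic_add and SpinLock are hypothetical names, the argument order of
at_if_put (offset, expected, new) is one reading of #at:if:put:, and the
atomicity of the CAS is emulated with a lock, since a real primitive would
rely on a hardware compare-and-swap instruction inside the VM.

```python
import struct
import threading
import time


class SharedCells:
    """Byte buffer with typed accessors and an emulated compare-and-swap."""

    def __init__(self, size):
        self.buf = bytearray(size)
        # Assumption: this lock only emulates the atomicity that a hardware
        # CAS instruction would provide inside a real VM primitive.
        self._cas_guard = threading.Lock()

    def read_int64(self, offset):
        return struct.unpack_from("<q", self.buf, offset)[0]

    def write_int64(self, offset, value):
        with self._cas_guard:  # keep plain stores from interleaving with a CAS
            struct.pack_into("<q", self.buf, offset, value)

    def read_double(self, offset):
        return struct.unpack_from("<d", self.buf, offset)[0]

    def write_double(self, offset, value):
        struct.pack_into("<d", self.buf, offset, value)

    def at_if_put(self, offset, expected, new):
        """Store `new` only if the int64 at `offset` equals `expected`."""
        with self._cas_guard:
            if struct.unpack_from("<q", self.buf, offset)[0] != expected:
                return False
            struct.pack_into("<q", self.buf, offset, new)
            return True


def atomic_add(cells, offset, delta):
    """Lock-free style update: reread and retry until the CAS succeeds."""
    while True:
        old = cells.read_int64(offset)
        if cells.at_if_put(offset, old, old + delta):
            return old + delta


class SpinLock:
    """Lock built on the CAS: the cell holds 0 when free and 1 when held."""

    def __init__(self, cells, offset):
        self.cells = cells
        self.offset = offset

    def acquire(self):
        while not self.cells.at_if_put(self.offset, 0, 1):
            time.sleep(0)  # yield to other threads instead of burning the CPU

    def release(self):
        self.cells.write_int64(self.offset, 0)
```

atomic_add shows the lock-free pattern (retry on conflict) and SpinLock the
lock-based one. Comparing the two on the same workload is the kind of
measurement the last item suggests.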

What do you think? Do you have ideas? Are you interested? Do you think
having a student on this would be nice?

The master thesis proposal has to include a research question. I am not
sure what other languages do regarding shared memory. It's not clear so far
what the research question is.

Best,

On Mon, Feb 12, 2018 at 10:33 AM, Pavel Krivanek 
wrote:

> Hi,
>
> among other less interesting things, I spent some time on existing
> PosixSharedMemory project. It is a UFFI binding for the LibC methods
> that provide support for the memory allocation between several
> separate processes. I significantly improved the performance by
> implementing the block access. Writing of 10MB byte array takes about
> 1 millisecond, reading of it from other image took me about 4
> milliseconds. While serialization with Fuel is very fast, it opens
> interesting possibilities.
> To have a shared memory without synchronization tools is not very
> useful so I wrote a basic UFFI interface for the POSIX named
> semaphores. They are quite easy to use and work nicely with Pharo. The
> VM can all wait on the semaphore or it can check the status of it
> periodically in an image thread. It has two small disadvantages. It
> requires to dynamically link the next library (pthread) and they must
> be cleaned manually. I plan to look at System V alternative in future.
> Now we should write a nice framework for inter-image communication on
> top of it or/and adopt Seamless for it ;-)
>
> Cheers,
> -- Pavel
>
>


-- 
Clément Béra
https://clementbera.github.io/
https://clementbera.wordpress.com/


Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-15 Thread Marcus Denker


> On 15 Feb 2018, at 12:56, Thierry Goubier  wrote:
> 
> 2018-02-15 11:31 GMT+01:00 Marcus Denker :
>> 
>> 
>>> On 14 Feb 2018, at 21:34, Thierry Goubier  wrote:
>>> 
>>> On 14/02/2018 at 20:19, Stephane Ducasse wrote:
>>>> Thanks Pavel this looks quite fast :).
>>>> Do you have a scenario in mind that could take advantage of this?
>>> 
>>> I'd be very interested to see that used with image segments for objects 
>>> migration between images running in different processes (or coupled with a 
>>> distributed shared memory implementation like [1]).
>>> 
>>> If latency is low enough, I think multi-window applications could be 
>>> developped as multi-process single window images (and we could get 
>>> scalability to thousands of cores, if the application design is right. Even 
>>> code synchronisation between images would be easy to do).
>>> 
>>> 
>> Yes, that is the direction to explore :-)
> 
> When I worked on that sort of systems, one of the questions I had was:
> what about the GC ? But yes, in today's context, focusing on process
> based concurrency can be interesting.
> 
>> This started with me and Pavel discussion about the general idea that it 
>> would be interesting if Objects would actually be “really recursive” (and 
>> the whole system would be such an Object, too).
> 
> I'm not sure I get the concept. Would you be ready to explain ?

I have started to write something about it (I need to make it clearer for 
myself, too).
When it is in some presentable shape (currently just notes) I will forward it.

Marcus


Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-15 Thread Thierry Goubier
2018-02-15 11:31 GMT+01:00 Marcus Denker :
>
>
>> On 14 Feb 2018, at 21:34, Thierry Goubier  wrote:
>>
>> On 14/02/2018 at 20:19, Stephane Ducasse wrote:
>>> Thanks Pavel this looks quite fast :).
>>> Do you have a scenario in mind that could take advantage of this?
>>
>> I'd be very interested to see that used with image segments for objects 
>> migration between images running in different processes (or coupled with a 
>> distributed shared memory implementation like [1]).
>>
>> If latency is low enough, I think multi-window applications could be 
>> developped as multi-process single window images (and we could get 
>> scalability to thousands of cores, if the application design is right. Even 
>> code synchronisation between images would be easy to do).
>>
>>
> Yes, that is the direction to explore :-)

When I worked on that sort of system, one of the questions I had was:
what about the GC? But yes, in today's context, focusing on
process-based concurrency can be interesting.

> This started with me and Pavel discussion about the general idea that it 
> would be interesting if Objects would actually be “really recursive” (and the 
> whole system would be such an Object, too).

I'm not sure I get the concept. Would you be willing to explain?

> And then asking the question: what is the minimal thing we need to explore 
> that direction?

On a meta-level, I can understand that.

Thierry

>>
>> [1] https://hal.archives-ouvertes.fr/hal-01679052
>>
>
> I will read that.

I talked to Loic about those sorts of ideas (his S-DSM could do even
more interesting things), but also about the fact that we don't have
the resources and time to explore them :(

Thierry

>
> Marcus
>
>



Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-15 Thread Marcus Denker


> On 14 Feb 2018, at 21:34, Thierry Goubier  wrote:
> 
> On 14/02/2018 at 20:19, Stephane Ducasse wrote:
>> Thanks Pavel this looks quite fast :).
>> Do you have a scenario in mind that could take advantage of this?
> 
> I'd be very interested to see that used with image segments for objects 
> migration between images running in different processes (or coupled with a 
> distributed shared memory implementation like [1]).
> 
> If latency is low enough, I think multi-window applications could be 
> developped as multi-process single window images (and we could get 
> scalability to thousands of cores, if the application design is right. Even 
> code synchronisation between images would be easy to do).
> 
> 
Yes, that is the direction to explore :-)

This started with a discussion between Pavel and me about the general idea
that it would be interesting if objects were actually “really recursive”
(and the whole system were such an object, too).
We then asked the question: what is the minimal thing we need to explore
that direction?

> 
> [1] https://hal.archives-ouvertes.fr/hal-01679052
> 

I will read that.

Marcus




Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-14 Thread Tudor Girba
Hi Pavel,

This is very cool. Is the code available?

Cheers,
Doru


> On Feb 12, 2018, at 10:33 AM, Pavel Krivanek  wrote:
> 
> Hi,
> 
> among other less interesting things, I spent some time on existing
> PosixSharedMemory project. It is a UFFI binding for the LibC methods
> that provide support for the memory allocation between several
> separate processes. I significantly improved the performance by
> implementing the block access. Writing of 10MB byte array takes about
> 1 millisecond, reading of it from other image took me about 4
> milliseconds. While serialization with Fuel is very fast, it opens
> interesting possibilities.
> To have a shared memory without synchronization tools is not very
> useful so I wrote a basic UFFI interface for the POSIX named
> semaphores. They are quite easy to use and work nicely with Pharo. The
> VM can all wait on the semaphore or it can check the status of it
> periodically in an image thread. It has two small disadvantages. It
> requires to dynamically link the next library (pthread) and they must
> be cleaned manually. I plan to look at System V alternative in future.
> Now we should write a nice framework for inter-image communication on
> top of it or/and adopt Seamless for it ;-)
> 
> Cheers,
> -- Pavel
> 

--
www.tudorgirba.com
www.feenk.com

"From an abstract enough point of view, any two things are similar."







Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-14 Thread Thierry Goubier

On 14/02/2018 at 20:19, Stephane Ducasse wrote:
> Thanks Pavel this looks quite fast :).
> Do you have a scenario in mind that could take advantage of this?

I'd be very interested to see that used with image segments for object
migration between images running in different processes (or coupled with
a distributed shared memory implementation like [1]).

If latency is low enough, I think multi-window applications could be
developed as multi-process, single-window images (and we could get
scalability to thousands of cores, if the application design is right.
Even code synchronisation between images would be easy to do).


Thierry

[1] https://hal.archives-ouvertes.fr/hal-01679052


Re: [Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-14 Thread Stephane Ducasse
Thanks Pavel this looks quite fast :).
Do you have a scenario in mind that could take advantage of this?

Stef

On Mon, Feb 12, 2018 at 10:33 AM, Pavel Krivanek
 wrote:
> Hi,
>
> among other less interesting things, I spent some time on existing
> PosixSharedMemory project. It is a UFFI binding for the LibC methods
> that provide support for the memory allocation between several
> separate processes. I significantly improved the performance by
> implementing the block access. Writing of 10MB byte array takes about
> 1 millisecond, reading of it from other image took me about 4
> milliseconds. While serialization with Fuel is very fast, it opens
> interesting possibilities.
> To have a shared memory without synchronization tools is not very
> useful so I wrote a basic UFFI interface for the POSIX named
> semaphores. They are quite easy to use and work nicely with Pharo. The
> VM can all wait on the semaphore or it can check the status of it
> periodically in an image thread. It has two small disadvantages. It
> requires to dynamically link the next library (pthread) and they must
> be cleaned manually. I plan to look at System V alternative in future.
> Now we should write a nice framework for inter-image communication on
> top of it or/and adopt Seamless for it ;-)
>
> Cheers,
> -- Pavel
>



[Pharo-dev] Pavel's ChangeLog week of 2018-02-05

2018-02-12 Thread Pavel Krivanek
Hi,

Among other less interesting things, I spent some time on the existing
PosixSharedMemory project. It is a UFFI binding for the LibC functions
that support memory shared between several separate processes. I
significantly improved the performance by implementing block access.
Writing a 10 MB byte array takes about 1 millisecond; reading it from
another image took about 4 milliseconds. As serialization with Fuel is
very fast, this opens interesting possibilities.
Shared memory without synchronization tools is not very useful, so I
wrote a basic UFFI interface for the POSIX named semaphores. They are
quite easy to use and work nicely with Pharo. The whole VM can wait on
the semaphore, or its status can be checked periodically in an image
thread. The semaphores have two small disadvantages: they require
dynamically linking another library (pthread), and they must be cleaned
up manually. I plan to look at the System V alternative in the future.
Now we should write a nice framework for inter-image communication on
top of it and/or adopt Seamless for it ;-)

Cheers,
-- Pavel
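
The block access described above can be sketched in Python, whose
multiprocessing.shared_memory module wraps POSIX shared memory on POSIX
systems. This is not the PosixSharedMemory API from Pharo: publish, fetch
and roundtrip are hypothetical names, both sides run in one process here,
and synchronization between writer and reader is left out entirely.

```python
from multiprocessing import shared_memory


def publish(payload):
    """Create a segment and copy the whole payload in with one block write."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload  # a single slice assignment, no per-byte loop
    return shm


def fetch(name, length):
    """Attach to an existing segment by name and read it out as one block."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length])
    finally:
        shm.close()


def roundtrip(payload):
    """Write a non-empty payload, read it back by name, then clean up."""
    writer = publish(payload)
    try:
        return fetch(writer.name, len(payload))
    finally:
        writer.close()
        writer.unlink()  # named segments persist until explicitly removed
```

In a two-image setting the reader would call fetch from another process,
with only the segment name and the length passed across. The explicit
unlink mirrors the manual cleanup mentioned above for named semaphores.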