Re: [hpx-users] Recommended approach for heterogeneous computing

2019-06-06 Thread Michael Levine

Hi, just following up - can anyone please help me with this?

Thanks!

On May 29, 2019 9:44:18 a.m. Michael Levine  wrote:

Hi all,
I was wondering whether anyone could provide some direction as to the 
recommended approach to incorporating GPU-based execution within an HPX 
application, either in general or, preferably, for the way I (think I) 
want to use it:


I have a number of host systems (CPU), each possibly containing one or more 
GPGPUs.  For the sake of simplicity, right now all of my GPUs are NVIDIA.


The specific application I'm working on is a distributed optimizer 
implementing a global search algorithm (right now, I have implemented 
Differential Evolution).  In its distributed form, it treats each node as a 
separate island with occasional migration.


When run on separate SLURM / HPX (CPU-based) nodes, each node has a set of 
N trial parameter vectors.  Each node iterates independently.  When certain 
conditions are met, a node can initiate a migration of data from a 
logically-adjacent node.
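
To make the island scheme above concrete, here is a minimal sketch in plain 
C++ (no HPX). The names Individual, Island, and migrate_from_neighbour are 
hypothetical, and both the ring topology and the "replace my worst with the 
neighbour's best" rule are assumptions made for illustration; the actual 
migration conditions are not described in this post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct Individual {
    std::vector<double> params;  // one trial parameter vector
    double cost;                 // objective value, lower is better
};
using Island = std::vector<Individual>;  // the N trial vectors of one node

static bool cheaper(const Individual& a, const Individual& b) {
    return a.cost < b.cost;
}

// Pull-style migration on a logical ring: island `dst` copies the best
// individual of its left neighbour over its own worst one, but only if
// that is an improvement. Returns true if a migrant was accepted.
bool migrate_from_neighbour(std::vector<Island>& islands, std::size_t dst) {
    const std::size_t n = islands.size();
    assert(dst < n);
    const std::size_t src = (dst + n - 1) % n;  // logically adjacent island
    if (src == dst) return false;               // a single island has no neighbour
    assert(!islands[src].empty() && !islands[dst].empty());

    const Individual& incoming =
        *std::min_element(islands[src].begin(), islands[src].end(), cheaper);
    Individual& outgoing =
        *std::max_element(islands[dst].begin(), islands[dst].end(), cheaper);

    if (incoming.cost >= outgoing.cost) return false;  // nothing to gain
    outgoing = incoming;
    return true;
}
```

In a distributed setting the read of the neighbour's best individual would 
be a remote call rather than a local vector access; treating a GPU as an 
island would mean the same exchange happens between host and device memory.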


Given the feature sets of the latest versions of CUDA, I believe that it 
should be possible for me to treat each GPGPU as a node as well.  I realize 
it won't be a native node via SLURM (as an Intel Phi node might be), but 
would rather be initialized in some other way after the application is 
initialized across the cluster.


I've seen the HPXCL project repo, but it hasn't had any updates in more 
than 4 months.  I've also seen at one time, though I cannot find them right 
now, scattered bits of code around the internet with a couple of different 
approaches.  Finally, I am under the impression that work is being done on 
executors to provide this or related functionality.


Since I think this should be a reasonably common use case, it might be 
helpful if some official tutorials were written at some point to assist 
users with this.  In the meantime, if anyone (and everyone) can provide 
some general direction or guidance on this, I would be very appreciative.  
I would also be very willing to try to provide some additional tutorials 
and user documentation for this application.


Thank you and I would greatly appreciate any help that you can provide.

Regards,
Shmuel Levine



___
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users


Re: [hpx-users] Recommended approach for heterogeneous computing

2019-06-06 Thread Hartmut Kaiser
Resending my response...

Regards Hartmut
---
http://stellar.cct.lsu.edu
https://github.com/STEllAR-GROUP/hpx


> -----Original Message-----
> From: Hartmut Kaiser 
> Sent: Thursday, May 30, 2019 7:03 AM
> To: 'hpx-users@stellar.cct.lsu.edu' 
> Subject: RE: [hpx-users] Recommended approach for heterogeneous computing
>
> Michael,
>
> We've made several attempts at providing higher-level support for
> integrating GPGPUs with HPX. HPXCL is one of those; HPX.Compute is another.
> Neither has resulted in something we're satisfied with. However, we have
> created a couple of lower-level facilities that I believe are useful.
>
> The main idea is (as always) to expose the operations the CPU schedules on
> the device (data transfer, running a kernel, etc.) through API functions
> that return an hpx::future. That allows for nicely hiding latencies and
> communication overheads. It also allows integrating the GPU work into
> the overall asynchronous execution flow on the CPU.
>
> HPX.Compute has also introduced the concept of 'targets' (i.e. places in
> the system) that can be used to create a) allocators, and b) executors.
> This is important to be able to control data and execution placement from
> user land while still using system facilities like parallel algorithms,
> etc.
>
> HPXCL has solved the problem of remote GPU access by encapsulating a
> device in an HPX component, which allows submitting data and kernels for
> execution on a remote device.
>
> There is essentially no documentation of any of the above. As I said, we
> were not satisfied with the overall design or implementation, so we
> decided not to spend time on describing things.
>
> As always, any help you might be willing to give would be highly
> appreciated.
>
> Thanks!
> Regards Hartmut
> ---
> http://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx
>
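
The future-based idea described in the reply above can be sketched roughly 
as follows. This is only an illustration under stated assumptions: 
std::future and std::async stand in for hpx::future so that the example 
builds without HPX, and FakeDevice, copy_to_device, run_scale_kernel, and 
offload_and_sum are hypothetical names, not part of HPX, HPX.Compute, or 
HPXCL. The "device" is simulated on the CPU.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU. A real version would wrap CUDA streams
// and return hpx::future; std::future is used so the sketch needs no HPX.
struct FakeDevice {
    // Launch policy for the simulated device work (async by default).
    std::launch policy = std::launch::async;

    // "Transfer" host data to the device; the result is available later.
    std::future<std::vector<double>> copy_to_device(std::vector<double> host) {
        return std::async(policy,
            [data = std::move(host)]() mutable { return std::move(data); });
    }

    // "Run a kernel" that scales every element once the input has arrived.
    std::future<std::vector<double>> run_scale_kernel(
        std::future<std::vector<double>> input, double factor) {
        assert(input.valid());
        return std::async(policy,
            [in = std::move(input), factor]() mutable {
                std::vector<double> buf = in.get();  // wait for the transfer
                for (double& x : buf) x *= factor;
                return buf;
            });
    }
};

// Chain transfer and kernel, then block only where the result is needed.
double offload_and_sum(FakeDevice& dev, std::vector<double> host) {
    auto on_device = dev.copy_to_device(std::move(host));
    auto scaled = dev.run_scale_kernel(std::move(on_device), 2.0);
    // The CPU could do unrelated work here while the "device" is busy.
    std::vector<double> out = scaled.get();
    return std::accumulate(out.begin(), out.end(), 0.0);
}
```

Because every operation hands back a future, the caller composes transfers 
and kernels without blocking and waits only at the final get(). With 
hpx::future the chaining would normally be expressed through continuations 
instead of a blocking get() inside the second task.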





Re: [hpx-users] C++20 Modules and HPX library modularization

2019-06-06 Thread Michael Levine
Hi Hartmut,
Thanks for the quick reply - unfortunately it was deemed spam and since my 
spam folder does not sync on my phone, I didn't see it until I forced an 
update.

I'd certainly be interested and willing to contribute to this effort as 
well as to looking into a possible solution for GPGPUs, although I am going 
to need some guidance and help.

I will respond to each topic in its appropriate thread for continuity.

Given that the modularization of HPX seems to be the key to the successful 
integration of C++ modules, presumably, this should be fairly 
straightforward - the library module boundaries should probably correspond 
to the C++ module boundaries.

Having said that, I will need someone to help me work out and review how to 
map the HPX modules onto the Modules TS (e.g. which should be "super-modules" 
that just import other modules, which modules might best be broken into 
submodules, etc.).

Is there anyone in the community who might be interested and able to 
provide some guidance?

Thanks,
Michael

--
On May 30, 2019 7:50:57 a.m. "Hartmut Kaiser"  wrote:

> Hey Michael,
>
>> I am wondering whether there are any plans in the short or mid-term to
>> provide support in HPX for the new Modules TS which is expected to be part
>> of C++20?
>>
>>
>> While compile times are not really the main benefit of modules, I
>> understand that we can expect to see some variable degree of improvement
>> when using modules as compared to the existing approach.
>>
>>
>> At a very early stage of the Modules TS, I took a quick look through the
>> HPX codebase to see whether I thought it might be possible at all to try
>> and add some conditional code to allow for possible modules TS support and
>> was quickly discouraged.
>>
>>
>> However, it also seems to me that the current modularization initiative
>> might make it possible to export HPX as modules.
>>
>>
>> Has anyone given this any consideration?
>> Is this an ultimate goal in anyone's vision?
>> Would an attempt to incorporate the features of the modules TS be welcomed
>> by the community? At least, perhaps, as a proof-of-concept?
>>
>>
>> I'd appreciate your thoughts and feedback
>
> This topic has not been discussed so far and I personally have not looked much
> into C++20 modules. However, I believe that the current modularization effort
> in HPX is most likely to help with introducing C++20 modules. Any help you
> might want to give would be highly appreciated.
>
> Thanks!
> Regards Hartmut
> ---
> http://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx






Re: [hpx-users] C++20 Modules and HPX library modularization

2019-06-06 Thread Hartmut Kaiser
Michael,

> Thanks for the quick reply - unfortunately it was deemed spam and since my
> spam folder does not sync on my phone, I didn't see it until I forced an
> update.
>
> I'd certainly be interested and willing to contribute to this effort as
> well as to looking into a possible solution for GPGPUs, although I am
> going to need some guidance and help.
>
> I will respond to each topic in its appropriate thread for continuity.
>
> Given that the modularization of HPX seems to be the key to the successful
> integration of C++ modules, presumably, this should be fairly
> straightforward - the library module boundaries should probably correspond
> to the C++ module boundaries.
>
> Having said that, I will need someone to help me work out and review how to
> map the HPX modules onto the Modules TS (e.g. which should be "super-modules"
> that just import other modules, which modules might best be broken into
> submodules, etc.).
>
> Is there anyone in the community who might be interested and able to
> provide some guidance?

The modularization effort is spearheaded by the guys at CSCS (Mikhal and 
Auriane, both should be subscribed here). They are probably the best people 
to contact in order to coordinate things. Alternatively, they can normally 
be reached through our IRC channel.

Thanks!
Regards Hartmut
---
http://stellar.cct.lsu.edu
https://github.com/STEllAR-GROUP/hpx







Re: [hpx-users] Recommended approach for heterogeneous computing

2019-06-06 Thread Michael Levine
Thanks Hartmut for getting back to me on this. I'm sorry to have missed 
this originally (it was caught in my spam filter).

--
On May 30, 2019 8:03:00 a.m. "Hartmut Kaiser"  wrote:

> ...
>
> We've made several attempts at providing higher-level support for integrating
> GPGPUs with HPX. HPXCL is one of those; HPX.Compute is another.
> Neither has resulted in something we're satisfied with. However, we have
> created a couple of lower-level facilities that I believe are useful.
>
> The main idea is (as always) to expose the operations the CPU schedules on the
> device (data transfer, running a kernel, etc.) through API functions that
> return an hpx::future. That allows for nicely hiding latencies and
> communication overheads. It also allows integrating the GPU work into the
> overall asynchronous execution flow on the CPU.
>
> HPX.Compute has also introduced the concept of 'targets' (i.e. places in the
> system) that can be used to create a) allocators, and b) executors. This is
> important to be able to control data and execution placement from user land
> while still using system facilities like parallel algorithms, etc.
>
> HPXCL has solved the problem of remote GPU access by encapsulating a device in
> an HPX component, which allows submitting data and kernels for execution on a
> remote device.
>
> There is essentially no documentation of any of the above. As I said, we were
> not satisfied with the overall design or implementation, so we decided not to
> spend time on describing things.
>
> As always, any help you might be willing to give would be highly appreciated.
>
> Thanks!
> Regards Hartmut
> ---
> http://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx

I am certainly interested and excited to contribute in any way I can. 
However, I need to add the caveat that I will certainly need some guidance 
and direction from those with considerably more experience than I have.

For example, I would need to understand the previous attempts and what was 
deemed lacking or problematic in their designs, what the broader design 
goals for this module would be, etc.

If someone is interested and available to assist and provide guidance, 
advice, etc. - basically to act as a mentor for this project - I would be 
very happy for the help and would be willing to contribute to this project 
to the best of my abilities.

Regards,
Michael



