Hi Kor,

Regarding the SLURM error, there is a workaround that I learned about in
this issue: https://github.com/STEllAR-GROUP/hpx/issues/4297

  ./application --hpx:ignore-batch-env --hpx:localities=[N]

Here [N] is the number of nodes you use (this works only with one locality
per node). This should get rid of the "every locality thinks it's rank 0"
problem. Please let me know if it works and whether the two issues were
indeed related.
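
In case it helps, here is a minimal sketch of how those two flags could be assembled inside a job script instead of hard-coding [N]. It assumes one locality per node, as above. The helper name hpx_workaround_args is hypothetical (it is not part of HPX), and SLURM_JOB_NUM_NODES is the environment variable that SLURM sets to the number of nodes allocated to the job.

```shell
#!/bin/sh
# Hypothetical helper, not part of HPX: prints the two workaround flags.
# The locality count comes from the first argument if one is given,
# otherwise from SLURM_JOB_NUM_NODES, otherwise it falls back to 1.
# Assumption: one locality per node.
hpx_workaround_args() {
    nodes="${1:-${SLURM_JOB_NUM_NODES:-1}}"
    printf '%s %s\n' "--hpx:ignore-batch-env" "--hpx:localities=${nodes}"
}

# Possible use in a job script (how the processes are launched depends
# on your setup):
#   ./application $(hpx_workaround_args)
```

Passing the count explicitly, for example hpx_workaround_args 4, overrides whatever SLURM reports.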

Kind regards,

Kilian


On Fri, 3 Sep 2021 15:12:58 +0200
  Kor de Jong <k.dejo...@uu.nl> wrote:
> Hi HPX experts,
>
> Once in a while I try to generate OTF2 traces for distributed HPX runs.
> This has never worked for me. A year ago, I asked this mailing list for
> help. Even using the helpful replies I could not get it to work, and I
> moved on.
>
> But being able to generate these traces would actually be very useful
> for me, so I tried it yet another time, with the current HPX 1.7.1.
>
> I wonder whether anybody is actually doing this as well, and succeeding.
> If so, can you please explain how?
>
> Generating a trace for a single process works fine. Generating a trace
> for multiple processes (8) on the same node(!) fails:
> 
> OTF2 Error: INVALID, Unknown error code
>
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> OTF2 Error: INVALID, Unknown error code
>
> OTF2 Error: INVALID, Unknown error code
>
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> OTF2 Error: INVALID, Unknown error code
>
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> OTF2 Error: INVALID, Unknown error code
>
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> OTF2 Error: INVALID, Unknown error code
>
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> [OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't create directories on root.
> OTF2 Error: INVALID, Unknown error code
> 
> BTW, I always get the messages below as well. I am not sure whether this
> is relevant. As per the reply by Hartmut (April 19, this list), I am
> ignoring these messages.
> 
> hpx::init: command line warning: --hpx:localities used when running with
> SLURM, requesting a different number of localities (8) than have been
> assigned by SLURM (1), the application might not run properly.
>
> ...
> repeated 8 times
> ...
> 
> Hartmut writes:
>
> "We have never been able to properly figure that one out. In my
> experience, you can ignore the warning (as long as the binding
> information looks correct)."
>
> In his reply to my question last year John B writes (Sep 10 2020):
>
> "This reminds me of the error that is produced when all ranks think
> they're rank 0 - so apex is not getting the correct initialization info.
> (Ranks 1-N-1 try to create the otf files and clobber each other)"
>
> I have no idea, but maybe these things are related?
>
> Thanks for any info!
>
> Kor
> _______________________________________________
> hpx-users mailing list
> hpx-users@stellar-group.org
> https://mail.cct.lsu.edu/mailman/listinfo/hpx-users


