https://github.com/jtb20 created 
https://github.com/llvm/llvm-project/pull/200408

The OpenMP 6.0 'saved' firstprivate modifier (see [7.2] together
with [14.3]) requires each list item to be snapshotted once at
recording time and observable on every replay of the recorded
task.  libomp reuses the same task descriptor across every replay
of a taskgraph-owned task, so the '.kmp_privates.t' tail struct
that holds the firstprivate values is also the natural home for
the saved data environment.  Getting that right needs two
changes, which this patch lands together: the destructor of each
list item must fire exactly once at end-of-taskgraph (not after
every replay), and non-trivially-copyable list items must be
re-constructed per replay so that copy constructors and inner
self-references are respected.
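
As a rough illustration of the intended semantics (a plain C++ model,
not libomp's data layout), the sketch below snapshots a list item once,
copy-constructs a fresh private copy from the snapshot for each replay,
and destroys the snapshot only when the enclosing scope, standing in
for the taskgraph, ends.  Item, Counters, ReplayLog and
record_and_replay are hypothetical names invented for this example;
how libomp disposes of the per-replay copies is not modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bookkeeping for the illustration only.
struct Counters {
  int copy_ctors = 0;
  int snapshot_dtors = 0;
};

// Stand-in for a non-trivially-copyable firstprivate list item.
struct Item {
  Counters *counters;
  int value;
  bool is_snapshot = false; // deliberately not propagated by copies
  Item(Counters *c, int v) : counters(c), value(v) {}
  Item(const Item &o) : counters(o.counters), value(o.value) {
    ++counters->copy_ctors;
  }
  ~Item() {
    if (is_snapshot)
      ++counters->snapshot_dtors;
  }
};

struct ReplayLog {
  std::vector<int> seen; // value observed by each replay
  int copy_ctors = 0;
  int snapshot_dtors_during_replays = 0;
  int snapshot_dtors_total = 0;
};

ReplayLog record_and_replay(int replays) {
  assert(replays >= 0);
  ReplayLog log;
  Counters c;
  {
    Item original(&c, 42);
    Item saved(original);   // snapshot taken once, at recording time
    saved.is_snapshot = true;
    original.value = 7;     // later writes must not leak into replays
    for (int i = 0; i < replays; ++i) {
      Item priv(saved);     // re-constructed per replay via the copy ctor
      log.seen.push_back(priv.value);
      priv.value = -1;      // a replay only mutates its own copy
    }
    log.snapshot_dtors_during_replays = c.snapshot_dtors;
  }                         // "end-of-taskgraph": snapshot destroyed once
  log.copy_ctors = c.copy_ctors;
  log.snapshot_dtors_total = c.snapshot_dtors;
  return log;
}
```

Calling record_and_replay(3) in this model yields the snapshotted value
on every replay, with the snapshot's destructor running only after the
last replay.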

On the runtime side, move the per-task destructor-thunk
invocation from __kmp_task_finish (which previously fired it at
the end of every replay, leaving the saved snapshot in a
destructed state for the next replay) to __kmp_taskgraph_free,
so it fires exactly once per task at end-of-taskgraph.  While
there, make the loop in __kmp_taskgraph_free skip taskwait nodes
(record_map entries with task == nullptr), to avoid a latent
nullptr dereference that the existing tests do not exercise.
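
The shape of that runtime change can be sketched with toy types.
ToyTask, ToyNode, toy_task_finish and toy_taskgraph_free below are
hypothetical stand-ins, not libomp's real structures or entry points;
the only points being illustrated are that finishing a replay no longer
runs the destructor thunk, and that the free loop skips entries whose
task pointer is null.

```cpp
#include <cassert>
#include <vector>

// Toy task descriptor: counts how often its destructor thunk ran.
struct ToyTask {
  int destructor_calls = 0;
  void (*destructors)(ToyTask *) = nullptr; // per-task destructor thunk
};

// Toy record_map entry; task == nullptr models a taskwait node.
struct ToyNode {
  ToyTask *task;
};

void run_destructors(ToyTask *t) { ++t->destructor_calls; }

// After the change: finishing a replay leaves the saved privates alive.
void toy_task_finish(ToyTask *) {}

// Runs each task's destructor thunk once and returns how many ran.
int toy_taskgraph_free(std::vector<ToyNode> &record_map) {
  int thunks_run = 0;
  for (ToyNode &node : record_map) {
    if (node.task == nullptr)
      continue; // taskwait node: nothing to destroy, nothing to deref
    if (node.task->destructors) {
      node.task->destructors(node.task);
      ++thunks_run;
    }
  }
  return thunks_run;
}

struct FreeResult {
  int thunks_run;
  int calls_a;
  int calls_b;
};

FreeResult demo_free_after_replays(int replays) {
  assert(replays >= 0);
  ToyTask a, b;
  a.destructors = run_destructors;
  b.destructors = run_destructors;
  std::vector<ToyNode> record_map = {{&a}, {nullptr}, {&b}};
  for (int r = 0; r < replays; ++r)
    for (ToyNode &node : record_map)
      if (node.task)
        toy_task_finish(node.task);
  int thunks_run = toy_taskgraph_free(record_map);
  return {thunks_run, a.destructor_calls, b.destructor_calls};
}
```

In this model the number of replays has no effect on how often each
thunk runs.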

The runtime previously cloned each replay's task descriptor with a
bitwise memcpy in __kmp_taskgraph_clone_task, and a FIXME noted that
this silently corrupts firstprivate list items whose type is not
trivially copyable (self-referential structs, types with user-defined
copy constructors / destructors, types holding inner pointers into
themselves).  On the compiler side, emit a per-task clone helper

  void __omp_task_clone.NN(kmp_task_t *dst, kmp_task_t *src,
                           int lastpriv);

modelled on emitTaskDupFunction and reusing emitPrivatesInit (now
extended with a tri-state PrivatesInitMode of Normal / ForDup /
ForClone).  The helper re-runs the copy constructor of each
firstprivate list item into the freshly allocated descriptor's
'.kmp_privates.t'.  Tasks whose firstprivates are all trivially
copyable still rely on the runtime's memcpy fast-path and get no
clone helper.  emitTaskCall passes the helper to the runtime as
the new trailing argument of __kmpc_taskgraph_task (null when no
helper is needed).
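
To see why a bitwise copy is not enough, consider a self-referential
type.  The sketch below is a simplified model rather than the generated
code: ToyCloneFn, clone_selfref and toy_clone_privates are hypothetical
names, and the toy helper takes raw storage pointers instead of the
(kmp_task_t *dst, kmp_task_t *src, int lastpriv) signature shown above.
A null helper selects the memcpy path; a non-null helper re-runs the
copy constructor with placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Self-referential list item: 'self' must point at this object's 'data'.
struct SelfRef {
  int data;
  int *self;
  explicit SelfRef(int v) : data(v), self(&data) {}
  SelfRef(const SelfRef &o) : data(o.data), self(&data) {}
};

// Hypothetical helper type for this sketch only.
using ToyCloneFn = void (*)(void *dst, const void *src);

// Copy-constructs the item into raw destination storage.
void clone_selfref(void *dst, const void *src) {
  new (dst) SelfRef(*static_cast<const SelfRef *>(src));
}

// Null helper: bitwise fast path.  Non-null helper: run the copy ctor.
void toy_clone_privates(void *dst, const void *src, std::size_t size,
                        ToyCloneFn helper) {
  assert(dst && src);
  if (helper)
    helper(dst, src);
  else
    std::memcpy(dst, src, size);
}

// True if the pointer stored in the clone targets the clone's own 'data'.
bool clone_is_self_consistent(ToyCloneFn helper) {
  SelfRef saved(42);
  alignas(SelfRef) unsigned char storage[sizeof(SelfRef)];
  toy_clone_privates(storage, &saved, sizeof(SelfRef), helper);

  // Inspect the raw bytes so the check is the same for both paths.
  int *stored = nullptr;
  std::memcpy(&stored, storage + offsetof(SelfRef, self), sizeof stored);
  int *expected = reinterpret_cast<int *>(storage + offsetof(SelfRef, data));
  return stored == expected;
}
```

With a null helper the clone's inner pointer still refers to the source
object, which is the kind of corruption the FIXME describes; with
clone_selfref it refers to the clone's own storage.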

Two previously-XFAIL'd taskgraph runtime tests
(taskgraph_replayable_saved_stack_depth.cpp and
taskgraph_shared_stack_depth.cpp) now pass and are un-XFAIL'd.  New
tests cover the added functionality.

Assisted-By: Claude Opus 4.7



_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
