** Description changed:
## FFe ##
### General
As part of our efforts to have the ROCm stack v7.1.1 at least in resolute, I
created a merge proposal linked to this bug. Please see the merge proposal
for details; I am filing this bug to increase the traceability and
visibility of the request.
### Patch changes
+
+ changes:
+ - rocm-hipamd: https://github.com/ROCm/clr/compare/rocm-7.1.0...rocm-7.1.1
+ - hip module: https://github.com/ROCm/hip/compare/rocm-7.1.0...rocm-7.1.1
+ changes view-as-a-patch:
+ - rocm-hipamd:
https://github.com/ROCm/clr/compare/rocm-7.1.0...rocm-7.1.1.patch
+ - hip module:
https://github.com/ROCm/hip/compare/rocm-7.1.0...rocm-7.1.1.patch
+
+ What Changed & Justifications
+ This update combines critical bug fixes and feature improvements
rolled up into the rocm-hipamd and HIP module releases for ROCm 7.1.1.
+
+ - Improvement (Feature Addition): Implements the hipHostRegisterIoMemory
+ flag for hipHostRegister.
+
+ - Justification: Allows I/O memory (like memory belonging to a
+ third-party PCIe device) to be registered with the HIP runtime so it can
+ be directly accessed by the GPU, which is crucial for RDMA and Virtual
+ Memory Management (VMM) operations.
+
+ - Improvement (API Correction): Updates the API signatures for
+ hipLibraryLoadData and hipLibraryLoadFromFile.
+
+ - Justification: Fixes an API inconsistency by changing the
+ jitOptions and libraryOptions parameters from double pointers (**) to
+ standard single pointers (*), maintaining ABI backward compatibility for
+ the rocprofiler-sdk.
+
+ - Bug Fix (Stability): Fixes a use-after-free race condition in
+ asynchronous event handlers.
+
+ - Justification: Prevents memory corruption in HIP graph-related
+ applications where pending signal handlers were accessing device memory
+ that had already been released.
+
+ - Bug Fix (Memory Management): Fixes memory release handling for
+ hostcalls during a device reset.
+
+ - Justification: Prevents teardown errors. Previously,
+ hipDeviceReset would attempt to access stale memory objects. The tracker
+ now securely tracks allocations by their virtual address instead of
+ direct object pointers.
+
+ - Bug Fix (Crash Prevention): Adds queue validation to the submitMarker
+ synchronization path.
+
+ - Justification: Prevents segmentation faults when the dynamic queue
+ management mechanism is enabled by ensuring the GPU queue is not NULL
+ before flushing.
+
+ - Bug Fix (Logging): Corrects Compute Unit (CU) mask printing.
+
+ - Justification: Ensures accurate debugging information by setting
+ the correct output field width for the CU mask logs.
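
To illustrate the kind of field-width problem the CU mask logging fix addresses, here is a minimal C++ sketch. It is not the CLR logging code: the helper name formatCuMask and the layout (32-bit words, least significant word first) are assumptions made for the example. The point is only that each word needs a fixed width of 8 hex digits, otherwise leading zeros inside a word are dropped and the printed mask is wrong.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROCm): renders a CU mask stored as
// 32-bit words, least significant word first, as a single hex string.
std::string formatCuMask(const std::vector<std::uint32_t>& words) {
    std::ostringstream out;
    out << "0x" << std::hex << std::setfill('0');
    // Emit the most significant word first so the string reads as one number.
    for (auto it = words.rbegin(); it != words.rend(); ++it) {
        // std::setw applies only to the next insertion, so it is set per word.
        out << std::setw(8) << *it;
    }
    return out.str();
}
```

Without the per-word width, a mask such as {0x0000000F, 0x1} would print as 0x1f instead of 0x000000010000000f.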
** Changed in: rocm-hipamd (Ubuntu)
Assignee: (unassigned) => Igor Luppi (igorluppi)
** Description changed:
## FFe ##
### General
As part of our efforts to have the ROCm stack v7.1.1 at least in resolute, I
created a merge proposal linked to this bug. Please see the merge proposal
for details; I am filing this bug to increase the traceability and
visibility of the request.
### Patch changes
- changes:
+ changes:
- rocm-hipamd: https://github.com/ROCm/clr/compare/rocm-7.1.0...rocm-7.1.1
- hip module: https://github.com/ROCm/hip/compare/rocm-7.1.0...rocm-7.1.1
- changes view-as-a-patch:
+ changes view-as-a-patch:
- rocm-hipamd:
https://github.com/ROCm/clr/compare/rocm-7.1.0...rocm-7.1.1.patch
- hip module:
https://github.com/ROCm/hip/compare/rocm-7.1.0...rocm-7.1.1.patch
What Changed & Justifications
This update combines critical bug fixes and feature improvements
rolled up into the rocm-hipamd and HIP module releases for ROCm 7.1.1.
- Improvement (Feature Addition): Implements the hipHostRegisterIoMemory
flag for hipHostRegister.
- - Justification: Allows I/O memory (like memory belonging to a
+ - Justification: Allows I/O memory (like memory belonging to a
third-party PCIe device) to be registered with the HIP runtime so it can
be directly accessed by the GPU, which is crucial for RDMA and Virtual
Memory Management (VMM) operations.
- Improvement (API Correction): Updates the API signatures for
hipLibraryLoadData and hipLibraryLoadFromFile.
- - Justification: Fixes an API inconsistency by changing the
+ - Justification: Fixes an API inconsistency by changing the
jitOptions and libraryOptions parameters from double pointers (**) to
standard single pointers (*), maintaining ABI backward compatibility for
the rocprofiler-sdk.
- Bug Fix (Stability): Fixes a use-after-free race condition in
asynchronous event handlers.
- - Justification: Prevents memory corruption in HIP graph-related
+ - Justification: Prevents memory corruption in HIP graph-related
applications where pending signal handlers were accessing device memory
that had already been released.
- Bug Fix (Memory Management): Fixes memory release handling for
hostcalls during a device reset.
- - Justification: Prevents teardown errors. Previously,
+ - Justification: Prevents teardown errors. Previously,
hipDeviceReset would attempt to access stale memory objects. The tracker
now securely tracks allocations by their virtual address instead of
direct object pointers.
- Bug Fix (Crash Prevention): Adds queue validation to the submitMarker
synchronization path.
- - Justification: Prevents segmentation faults when the dynamic queue
+ - Justification: Prevents segmentation faults when the dynamic queue
management mechanism is enabled by ensuring the GPU queue is not NULL
before flushing.
- Bug Fix (Logging): Corrects Compute Unit (CU) mask printing.
- - Justification: Ensures accurate debugging information by setting
+ - Justification: Ensures accurate debugging information by setting
the correct output field width for the CU mask logs.
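
The hostcall memory fix in the list above is described as tracking allocations by virtual address instead of by direct object pointer. A rough C++ sketch of that idea follows. The class name AddressTracker and its methods are hypothetical and do not reflect the actual CLR implementation; the sketch only shows why address-keyed bookkeeping avoids touching objects that may already be gone.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical tracker (not the CLR class): stores plain addresses and sizes,
// so teardown never has to dereference a possibly stale memory object.
class AddressTracker {
public:
    // Record a live allocation by its virtual address.
    void track(std::uintptr_t va, std::size_t size) { live_[va] = size; }

    // Forget an allocation; returns false if the address was not tracked,
    // which makes a double release harmless.
    bool release(std::uintptr_t va) { return live_.erase(va) == 1; }

    std::size_t liveCount() const { return live_.size(); }

    // Reset path: hand every remaining address to a caller-supplied
    // release function, then clear the table.
    template <typename ReleaseFn>
    void reset(ReleaseFn releaseFn) {
        for (const auto& entry : live_) {
            releaseFn(entry.first, entry.second);
        }
        live_.clear();
    }

private:
    std::unordered_map<std::uintptr_t, std::size_t> live_;
};
```

Because the table holds only integers, releasing an address twice or resetting after an early free is a lookup miss rather than an access to freed memory.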
+
+ ## TL;DR
+ This patch updates HIP to version 7.1.1, introducing support for
mapping third-party PCIe I/O memory to the GPU (hipHostRegisterIoMemory).
Alongside this feature, it delivers stability fixes that prevent
segmentation faults in queue management, resolve use-after-free memory
corruption in HIP graphs, and handle device teardown safely. It also corrects
two API signatures and the CU mask log output.
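
The queue-management fix mentioned in the summary amounts to a guard before the flush. A minimal C++ sketch of such a guard is shown here. The names FakeQueue, MarkerStatus, and submitMarkerSketch are invented for illustration; the real submitMarker path in CLR uses its own types and is not reproduced.

```cpp
#include <cstddef>

// Hypothetical stand-in for a GPU queue; it only counts flushes.
struct FakeQueue {
    std::size_t flushCount = 0;
    void flush() { ++flushCount; }
};

enum class MarkerStatus { Submitted, NoQueue };

// Sketch of the validation step: with dynamic queue management the queue
// pointer may be null, so it is checked before any member access.
MarkerStatus submitMarkerSketch(FakeQueue* queue) {
    if (queue == nullptr) {
        return MarkerStatus::NoQueue;  // nothing to flush, and no crash
    }
    queue->flush();
    return MarkerStatus::Submitted;
}
```

Returning a status instead of dereferencing unconditionally turns a segmentation fault into a condition the caller can handle.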
--
https://bugs.launchpad.net/bugs/2148498
Title:
FFe: New upstream version 7.1.1 (Bug fix)