Hi John,

> On Mar 9, 2026, at 8:06 PM, John Hubbard <[email protected]> wrote:
>
> On 3/9/26 4:41 PM, Joel Fernandes wrote:
>>>> On Mar 9, 2026, at 5:22 PM, Joel Fernandes <[email protected]> wrote:
>>> On Fri, Feb 27, 2026 at 09:32:08PM +0900, Eliot Courtney wrote:
>>>> Expose the `hInternalClient` and `hInternalSubdevice` handles. These are
>>>> needed for RM control calls.
>>>>
>>>> Signed-off-by: Eliot Courtney <[email protected]>
>>>> ---
>>>> drivers/gpu/nova-core/gsp/commands.rs | 16 ++++++++++++++++
>>>> drivers/gpu/nova-core/gsp/fw/commands.rs | 10 ++++++++++
>>>> 2 files changed, 26 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/nova-core/gsp/commands.rs
>>>> b/drivers/gpu/nova-core/gsp/commands.rs
>>>> index 4740cda0b51c..2cadfcaf9a8a 100644
>>>> --- a/drivers/gpu/nova-core/gsp/commands.rs
>>>> +++ b/drivers/gpu/nova-core/gsp/commands.rs
>>>> @@ -197,6 +197,8 @@ fn init(&self) -> impl Init<Self::Command,
>>>> Self::InitError> {
>>>> /// The reply from the GSP to the [`GetGspInfo`] command.
>>>> pub(crate) struct GetGspStaticInfoReply {
>>>> gpu_name: [u8; 64],
>>>> + h_client: u32,
>>>> + h_subdevice: u32,
>>>
>>> I would rather have more descriptive names please. 'client_handle',
>
> Maybe it's better to mirror the Open RM names, which are ancient and
> well known in those circles. Changing them at this point is probably
> going to result in a slightly worse situation, because there are
> probably millions of lines of code out there that use the existing
> nomenclature.

I have to disagree a bit here. Saying h_ in code is a bit meaningless: there is no mention of the word "handle" anywhere near these fields. h_ could mean "higher", "hardware", or any number of things. The only reason I know it means "handle" is because of expertise with Nvidia drivers. The `_handle` suffix is self-documenting; `h_` is not.

>
> However...
>
>>> 'subdevice_handle'. Also some explanation of what a client and a sub-device
>>> mean somewhere in the comments or documentation would be nice.
>
> Yes, although I expect you can simply refer to some well known pre-
> existing documentation from NVIDIA for that!

I apologize, but I am a bit concerned with this approach: it feels like we are drifting into black-box development instead of encouraging more code comments, documentation, and cleaner code. We need to make the driver as readable and well documented as possible; we do not want another Nouveau with magic numbers and magic variable names. At the very least, I would expect one or two lines of code comments explaining what a handle is, what a client is, and how an internal client handle differs from a non-internal one. I guess I do not understand the hesitation. Sure, external documentation is good, but to clarify, I am referring to a few code comments. That's not much to ask, right? Elaborate documentation files in the kernel can be optional, but there is probably no harm in citing external references from in-kernel docs too. But again, I was more concerned about code comments and variable names.

>
>>
>> Also just checking if we can have repr wrappers around the u32 for clients /
>> handles. These concepts are quite common in Nvidia drivers so we should
>> probably create new types for them.
>>
>> And if we can please document the terminology, device, subset, clients
>> handles etc. and add new Documentation/ entries even.
>>
>> Thoughts?
>>
>
> This has already been done countless times by countless people I
> think, and so we don't need to do it again. Just refer to existing
> docs.

Not sure if you are referring to nova-core repr or docs, but the repr technique of wrapping raw integers is pretty common in the firmware layer of nova-core.

>
> btw, as an aside:
>
> I'm checking with our GSP firmware team to be sure, but my
> understanding is that much of this is actually very temporary. Because
> the GSP team does not want to continue on with this model in which
> GSP has to maintain that kind of state: an internal hierarchy of
> objects. Instead, they are hoping to move to an API in which nova
> would directly refer to each object/item in GSP. And subdevice, in

Even if this is "temporary", we can't just say "Oh, I'll just add these legacy things and write them badly, because they're going away anyway at some unpredictable point in the future". After all, this is the Linux kernel we are talking about :-). The bar is quite high.

> particular, is an old SLI term that no one wants to keep around
> either. It was an ugly hack in Open RM that took more than a decade
> to recover from, by moving the SLI concept out to user space.
>
> So even though we should document what we're doing now, I would like
> to also note that we suspect a certain amount of this will
> disappear, to be replaced with a somewhat simpler API, in the
> somewhat near future.

Sure, but client handles are a broader GPU driver concept even if this particular one is GSP-internal. We are certainly going to need a Rust type to represent a client, right? Other GPU drivers also have a concept of clients. The point is not that `hInternalClient` represents a GPU user today (it may well be temporary, as you note), but that using `#[repr(transparent)]` newtypes for raw u32 handles costs nothing and makes the code better and more readable. This pattern is already well established in nova-core itself: see `PackedRegistryEntry`, for example, which is a repr type. IMHO, there is little reason for the struct to carry magic u32 numbers in Rust code for concepts like "handles".

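A minimal sketch of the newtype idea, for illustration only. All names here (`ClientHandle`, `SubdeviceHandle`, `from_raw`, `as_raw`, `StaticInfoHandles`) are hypothetical and are not existing nova-core APIs; the doc comments only restate what this thread says about the handles:

```rust
/// Hypothetical newtype for the GSP-internal client handle
/// (`hInternalClient` in Open RM naming).
/// `#[repr(transparent)]` guarantees the same layout as the wrapped `u32`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ClientHandle(u32);

impl ClientHandle {
    /// Wraps a raw handle value as received from the firmware.
    pub const fn from_raw(raw: u32) -> Self {
        Self(raw)
    }

    /// Returns the raw value, e.g. for placing into an RM control call.
    pub const fn as_raw(self) -> u32 {
        self.0
    }
}

/// Hypothetical newtype for the GSP-internal subdevice handle
/// (`hInternalSubdevice` in Open RM naming).
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SubdeviceHandle(u32);

impl SubdeviceHandle {
    /// Wraps a raw handle value as received from the firmware.
    pub const fn from_raw(raw: u32) -> Self {
        Self(raw)
    }

    /// Returns the raw value.
    pub const fn as_raw(self) -> u32 {
        self.0
    }
}

/// Illustrative stand-in for the two fields added by the patch, using the
/// newtypes and `_handle` names instead of `h_client: u32` / `h_subdevice: u32`.
pub struct StaticInfoHandles {
    pub client_handle: ClientHandle,
    pub subdevice_handle: SubdeviceHandle,
}
```

With distinct types, passing a subdevice handle where a client handle is expected becomes a compile error rather than a silent mix-up of two u32 values, and the doc comments give the terminology an obvious home.
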
All I am saying is: let us think this through before taking the shortcut of using u32 for client handles, etc. Rust gives us rich types; let's use them.

thanks,

--
Joel Fernandes
