Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting
On Thu, 10 Sep 2020 at 00:07, Jeremy Cline wrote: > > On Wed, Sep 09, 2020 at 10:22:14AM +0200, Karol Herbst wrote: > > On Wed, Sep 9, 2020 at 6:06 AM Ben Skeggs wrote: > > > > > > On Thu, 13 Aug 2020 at 06:50, Jeremy Cline wrote: > > > > > > > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > > > > new gp1xx temperature sensor") added support for reading finer-grain > > > > temperatures, but continued to report temperatures in 1 degree Celsius > > > > increments via nvkm_therm_temp_get(). > > > > > > > > Rather than altering nvkm_therm_temp_get() to report finer-grain > > > > temperatures, which would be inconvenient for other users of the > > > > function, a second interface has been added to line up with hwmon's > > > > native unit of temperature. > > > Hey Jeremy, > > > > > > Sorry this slipped past me until now. I'm OK with adding support for > > > millidegree temperature reporting, but don't think we need to keep > > > both interfaces around and would rather see the existing code > > > converted to return millidegrees (even on GPUs that don't support it) > > > instead of degrees. > > > > > > Thanks! > > > Ben. > > > > > > > just a note as I was trying something like that in the past: we have a > > lot of code which fetches the temperature and there are a lot of > > places where we would then do a divide by 1000, so having a wrapper > > function at least makes sense. If we want to keep the func pointers? > > well.. I guess the increased CPU overhead wouldn't be as bad if all > > sub classes do this static * 1000 as well either. I just think we > > should have both interfaces in subdev/therm.h so we can just keep most > > of the current code as is. > > > > Indeed. My initial plan was to change the meaning of temp_get() and > adjust all the users, but there's around a dozen of them and based on my > understanding of the users none of them cared much about such accuracy. > > However, I do find having one way to do things appealing, so if there's > a strong preference for it, I'm happy to produce a somewhat more > invasive patch where all users expect millidegrees. Yeah, I do have a strong preference for not having to keep multiple interfaces around unnecessarily. It'd be somewhat different if we had external interactions, but this is all stuff within the module and we should be able to make these kinds of changes without too much pain. Ben. > > - Jeremy >
Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting
On Wed, Sep 09, 2020 at 10:22:14AM +0200, Karol Herbst wrote: > On Wed, Sep 9, 2020 at 6:06 AM Ben Skeggs wrote: > > > > On Thu, 13 Aug 2020 at 06:50, Jeremy Cline wrote: > > > > > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > > > new gp1xx temperature sensor") added support for reading finer-grain > > > temperatures, but continued to report temperatures in 1 degree Celsius > > > increments via nvkm_therm_temp_get(). > > > > > > Rather than altering nvkm_therm_temp_get() to report finer-grain > > > temperatures, which would be inconvenient for other users of the > > > function, a second interface has been added to line up with hwmon's > > > native unit of temperature. > > Hey Jeremy, > > > > Sorry this slipped past me until now. I'm OK with adding support for > > millidegree temperature reporting, but don't think we need to keep > > both interfaces around and would rather see the existing code > > converted to return millidegrees (even on GPUs that don't support it) > > instead of degrees. > > > > Thanks! > > Ben. > > > > just a note as I was trying something like that in the past: we have a > lot of code which fetches the temperature and there are a lot of > places where we would then do a divide by 1000, so having a wrapper > function at least makes sense. If we want to keep the func pointers? > well.. I guess the increased CPU overhead wouldn't be as bad if all > sub classes do this static * 1000 as well either. I just think we > should have both interfaces in subdev/therm.h so we can just keep most > of the current code as is. > Indeed. My initial plan was to change the meaning of temp_get() and adjust all the users, but there's around a dozen of them and based on my understanding of the users none of them cared much about such accuracy. However, I do find having one way to do things appealing, so if there's a strong preference for it, I'm happy to produce a somewhat more invasive patch where all users expect millidegrees. - Jeremy
Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting
On Wed, Sep 9, 2020 at 6:06 AM Ben Skeggs wrote: > > On Thu, 13 Aug 2020 at 06:50, Jeremy Cline wrote: > > > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > > new gp1xx temperature sensor") added support for reading finer-grain > > temperatures, but continued to report temperatures in 1 degree Celsius > > increments via nvkm_therm_temp_get(). > > > > Rather than altering nvkm_therm_temp_get() to report finer-grain > > temperatures, which would be inconvenient for other users of the > > function, a second interface has been added to line up with hwmon's > > native unit of temperature. > Hey Jeremy, > > Sorry this slipped past me until now. I'm OK with adding support for > millidegree temperature reporting, but don't think we need to keep > both interfaces around and would rather see the existing code > converted to return millidegrees (even on GPUs that don't support it) > instead of degrees. > > Thanks! > Ben. > just a note as I was trying something like that in the past: we have a lot of code which fetches the temperature and there are a lot of places where we would then do a divide by 1000, so having a wrapper function at least makes sense. If we want to keep the func pointers? well.. I guess the increased CPU overhead wouldn't be as bad if all sub classes do this static * 1000 as well either. I just think we should have both interfaces in subdev/therm.h so we can just keep most of the current code as is. > > > > Signed-off-by: Jeremy Cline > > --- > > .../drm/nouveau/include/nvkm/subdev/therm.h | 18 + > > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 4 +-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 16 > > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +-- > > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 1 + > > 5 files changed, 60 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > index 62c34f98c930..7b9928dd001c 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > > @@ -100,6 +100,24 @@ struct nvkm_therm { > > }; > > > > int nvkm_therm_temp_get(struct nvkm_therm *); > > + > > +/** > > + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees > > + * @therm: The thermal device to read from. > > + * > > + * This interface reports temperatures in units of millidegree Celsius to > > + * align with the hwmon API. Some cards may only be capable of reporting in > > + * units of Celsius, and those that report finer grain temperatures may > > not be > > + * capable of millidegree Celsius accuracy, > > + * > > + * For cases where millidegree temperature is too fine-grain, the > > + * nvkm_therm_temp_get() interface reports temperatures in one degree > > Celsius > > + * increments. > > + * > > + * Return: The temperature in millidegrees Celsius, or -ENODEV if > > temperature > > + * reporting is not supported. > > + */ > > +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm); > > int nvkm_therm_fan_sense(struct nvkm_therm *); > > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > > void nvkm_therm_clkgate_init(struct nvkm_therm *, > > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > index 1c3104d20571..e96355f93ce5 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > > channel, long *val) > > case hwmon_temp_input: > > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > > return -EINVAL; > > - ret = nvkm_therm_temp_get(therm); > > - *val = ret < 0 ? ret : (ret * 1000); > > + ret = nvkm_therm_temp_millidegree_get(therm); > > + *val = ret; > > break; > > case hwmon_temp_max: > > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > index 4a4d1e224126..e655b32c78b8 100644 > > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > > @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm) > > return -ENODEV; > > } > > > > +int > > +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm) > > +{ > > + int ret = -ENODEV; > > + > > + if (therm->func->temp_millidegree_get) > > + return therm->func->temp_millidegree_get(therm); > > + > > + if (therm->func->temp_get) { > > + ret = therm->func->temp_get(therm); > > + if (ret > 0) > > + ret *= 1000; > > + }
Re: [Nouveau] [PATCH] drm/nouveau: Add fine-grain temperature reporting
On Thu, 13 Aug 2020 at 06:50, Jeremy Cline wrote: > > Commit d32656373857 ("drm/nouveau/therm/gp100: initial implementation of > new gp1xx temperature sensor") added support for reading finer-grain > temperatures, but continued to report temperatures in 1 degree Celsius > increments via nvkm_therm_temp_get(). > > Rather than altering nvkm_therm_temp_get() to report finer-grain > temperatures, which would be inconvenient for other users of the > function, a second interface has been added to line up with hwmon's > native unit of temperature. Hey Jeremy, Sorry this slipped past me until now. I'm OK with adding support for millidegree temperature reporting, but don't think we need to keep both interfaces around and would rather see the existing code converted to return millidegrees (even on GPUs that don't support it) instead of degrees. Thanks! Ben. > > Signed-off-by: Jeremy Cline > --- > .../drm/nouveau/include/nvkm/subdev/therm.h | 18 + > drivers/gpu/drm/nouveau/nouveau_hwmon.c | 4 +-- > .../gpu/drm/nouveau/nvkm/subdev/therm/base.c | 16 > .../gpu/drm/nouveau/nvkm/subdev/therm/gp100.c | 25 +-- > .../gpu/drm/nouveau/nvkm/subdev/therm/priv.h | 1 + > 5 files changed, 60 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > index 62c34f98c930..7b9928dd001c 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h > @@ -100,6 +100,24 @@ struct nvkm_therm { > }; > > int nvkm_therm_temp_get(struct nvkm_therm *); > + > +/** > + * nvkm_therm_temp_millidegree_get() - get the temperature in millidegrees > + * @therm: The thermal device to read from. > + * > + * This interface reports temperatures in units of millidegree Celsius to > + * align with the hwmon API. Some cards may only be capable of reporting in > + * units of Celsius, and those that report finer grain temperatures may not > be > + * capable of millidegree Celsius accuracy, > + * > + * For cases where millidegree temperature is too fine-grain, the > + * nvkm_therm_temp_get() interface reports temperatures in one degree Celsius > + * increments. > + * > + * Return: The temperature in millidegrees Celsius, or -ENODEV if temperature > + * reporting is not supported. > + */ > +int nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm); > int nvkm_therm_fan_sense(struct nvkm_therm *); > int nvkm_therm_cstate(struct nvkm_therm *, int, int); > void nvkm_therm_clkgate_init(struct nvkm_therm *, > diff --git a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > index 1c3104d20571..e96355f93ce5 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_hwmon.c > +++ b/drivers/gpu/drm/nouveau/nouveau_hwmon.c > @@ -428,8 +428,8 @@ nouveau_temp_read(struct device *dev, u32 attr, int > channel, long *val) > case hwmon_temp_input: > if (drm_dev->switch_power_state != DRM_SWITCH_POWER_ON) > return -EINVAL; > - ret = nvkm_therm_temp_get(therm); > - *val = ret < 0 ? ret : (ret * 1000); > + ret = nvkm_therm_temp_millidegree_get(therm); > + *val = ret; > break; > case hwmon_temp_max: > *val = therm->attr_get(therm, NVKM_THERM_ATTR_THRS_DOWN_CLK) > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > index 4a4d1e224126..e655b32c78b8 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c > @@ -34,6 +34,22 @@ nvkm_therm_temp_get(struct nvkm_therm *therm) > return -ENODEV; > } > > +int > +nvkm_therm_temp_millidegree_get(struct nvkm_therm *therm) > +{ > + int ret = -ENODEV; > + > + if (therm->func->temp_millidegree_get) > + return therm->func->temp_millidegree_get(therm); > + > + if (therm->func->temp_get) { > + ret = therm->func->temp_get(therm); > + if (ret > 0) > + ret *= 1000; > + } > + return ret; > +} > + > static int > nvkm_therm_update_trip(struct nvkm_therm *therm) > { > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > index 9f0dea3f61dc..4c3c2895a3cb 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gp100.c > @@ -24,7 +24,7 @@ > #include "priv.h" > > static int > -gp100_temp_get(struct nvkm_therm *therm) > +gp100_temp_get_raw(struct nvkm_therm *therm) > { > struct nvkm_device *device = therm->subdev.device; > struct nvkm_subdev *subdev = &therm->subdev; > @@ -37,14 +37,35 @@ gp100_temp_get(struct nvkm_therm *therm) > > /* device valid */ > if (tsensor & 0x