Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-30 Thread Danilo Krummrich
On Fri, Sep 27, 2024 at 12:27:24PM -0300, Jason Gunthorpe wrote: > On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote: > > > When you say things like this it comes across as though you are > > > implying there are two tiers to the community. Ie those that set the > > > strategy and th

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-29 Thread Zhi Wang
On Sun, 22 Sep 2024 05:49:22 -0700 Zhi Wang wrote: +Ben. Forget to add you. My bad. > 1. Background > > NVIDIA vGPU[1] software enables powerful GPU performance for workloads > ranging from graphics-rich virtual workstations to data science and > AI, enabling IT to leverage

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-29 Thread Zhi Wang
On 23/09/2024 11.38, Danilo Krummrich wrote: > On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote: >> On Sun, 22 Sep 2024 05:49:22 -0700 >> Zhi Wang wrote: >> >> +Ben. >> >> Forget to add you. My bad. > > Please also add the d

[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-29 Thread Zhi Wang
1. Background: NVIDIA vGPU[1] software enables powerful GPU performance for workloads ranging from graphics-rich virtual workstations to data science and AI, enabling IT to leverage the management and security benefits of virtualization as well as the performance of NVIDIA GPUs requir

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-27 Thread Jason Gunthorpe
On Fri, Sep 27, 2024 at 04:22:32PM +0200, Danilo Krummrich wrote: > > When you say things like this it comes across as though you are > > implying there are two tiers to the community. Ie those that set the > > strategy and those that don't. > > This isn't true, I just ask you to consider the goal

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-27 Thread Danilo Krummrich
On Fri, Sep 27, 2024 at 09:51:15AM -0300, Jason Gunthorpe wrote: > On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote: > > On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote: > > > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote: > > > > > > > > No, I do object t

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-27 Thread Jason Gunthorpe
On Fri, Sep 27, 2024 at 12:42:56AM +0200, Danilo Krummrich wrote: > On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote: > > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote: > > > > > > No, I do object to "we are ignoring the driver being proposed by the > > > developers involv

RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Tian, Kevin
> From: Jason Gunthorpe > Sent: Friday, September 27, 2024 6:57 AM > > On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote: > > > I'm not entirely sure yet what this whole 'mgr' component is actually > > doing though. > > Looking more closely I think some of it is certainly appropri

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Jason Gunthorpe
On Thu, Sep 26, 2024 at 09:55:28AM -0300, Jason Gunthorpe wrote: > I'm not entirely sure yet what this whole 'mgr' component is actually > doing though. Looking more closely I think some of it is certainly appropriate to be in vfio. Like when something opens the VFIO device it should allocate the

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Danilo Krummrich
On Thu, Sep 26, 2024 at 11:40:57AM -0300, Jason Gunthorpe wrote: > On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote: > > > > No, I do object to "we are ignoring the driver being proposed by the > > developers involved for this hardware by adding to the old one instead" > > which it seems li

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Danilo Krummrich
On Thu, Sep 26, 2024 at 11:07:56AM -0700, Andy Ritger wrote: > > I hope and expect the nova and vgpu_mgr efforts to ultimately converge. > > First, for the fw ABI debacle: yes, it is unfortunate that we still don't > have a stable ABI from GSP. We /are/ working on it, though there isn't > anythi

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Andy Ritger
I hope and expect the nova and vgpu_mgr efforts to ultimately converge. First, for the fw ABI debacle: yes, it is unfortunate that we still don't have a stable ABI from GSP. We /are/ working on it, though there isn't anything to show, yet. FWIW, I expect the end result will be a much simpler i

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Jason Gunthorpe
On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote: > That's fine, but again, do NOT make design decisions based on what you > can, and can not, feel you can slide by one of these companies to get it > into their old kernels. That's what I take objection to here. It is not slide by. It is a

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Danilo Krummrich
On Thu, Sep 26, 2024 at 02:54:38PM +0200, Greg KH wrote: > On Thu, Sep 26, 2024 at 09:42:39AM -0300, Jason Gunthorpe wrote: > > On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote: > > > On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote: > > > > On Mon, Sep 23, 2024 at 10:49:07AM

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Jason Gunthorpe
On Thu, Sep 26, 2024 at 06:43:44AM +, Tian, Kevin wrote: > Then there comes an open whether VFIO is a right place to host such > vendor specific provisioning interface. The existing mdev type based > provisioning mechanism was considered a bad fit already. > IIRC the previous discussion came

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Greg KH
On Thu, Sep 26, 2024 at 09:42:39AM -0300, Jason Gunthorpe wrote: > On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote: > > On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote: > > > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > > > > 2. Proposal for upstream

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Jason Gunthorpe
On Thu, Sep 26, 2024 at 11:14:27AM +0200, Greg KH wrote: > On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote: > > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > > > 2. Proposal for upstream > > > > > > > > > > What is the strategy in th

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-26 Thread Greg KH
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote: > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > > 2. Proposal for upstream > > > > > > > What is the strategy in the mid / long term with this? > > > > As you know, we're trying to mo

RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-25 Thread Tian, Kevin
> From: Jason Gunthorpe > Sent: Monday, September 23, 2024 11:02 PM > > On Mon, Sep 23, 2024 at 06:22:33AM +, Tian, Kevin wrote: > > > From: Zhi Wang > > > Sent: Sunday, September 22, 2024 8:49 PM > > > > > [...] > > > > > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provide

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-25 Thread Jason Gunthorpe
On Wed, Sep 25, 2024 at 11:08:40AM +1000, Dave Airlie wrote: > On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe wrote: > > > > On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote: > > > > > Currently - and please correct me if I'm wrong - you make it sound to me > > > as if > > > you'r

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-25 Thread Danilo Krummrich
On Tue, Sep 24, 2024 at 09:53:19PM -0300, Jason Gunthorpe wrote: > On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote: > > > Currently - and please correct me if I'm wrong - you make it sound to me as > > if > > you're not willing to respect the decisions that have been taken by Nou

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Jason Gunthorpe
On Wed, Sep 25, 2024 at 10:18:44AM +1000, Dave Airlie wrote: > > ? nova core, meaning nova rust, meaning vfio depends on rust, doesn't > > seem acceptable ? We need to keep rust isolated to DRM for the > > foreseeable future. Just need to find a separation that can do that. > > That isn't going t

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Dave Airlie
On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe wrote: > > On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote: > > > Currently - and please correct me if I'm wrong - you make it sound to me as > > if > > you're not willing to respect the decisions that have been taken by Nouveau > >

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Jason Gunthorpe
On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote: > Currently - and please correct me if I'm wrong - you make it sound to me as if > you're not willing to respect the decisions that have been taken by Nouveau > and > DRM maintainers. I've never said anything about your work, go d

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Dave Airlie
> > Well, no, I am calling a core driver to be the very minimal parts that > are actually shared between vfio and drm. It should definitely not > include key parts you want to work on in rust, like the command > marshaling. Unfortunately not, the fw ABI is the unsolved problem, rust is our best so

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Jason Gunthorpe
On Wed, Sep 25, 2024 at 08:52:32AM +1000, Dave Airlie wrote: > On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich wrote: > > > > On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote: > > > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote: > > > > > > > > From the VFIO side

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Dave Airlie
On Wed, 25 Sept 2024 at 05:57, Danilo Krummrich wrote: > > On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote: > > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote: > > > > > > From the VFIO side I would like to see something like this merged in > > > > nearish future

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Danilo Krummrich
On Tue, Sep 24, 2024 at 01:41:51PM -0300, Jason Gunthorpe wrote: > On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote: > > > > From the VFIO side I would like to see something like this merged in > > > nearish future as it would bring a previously out of tree approach to > > > be ful

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-24 Thread Jason Gunthorpe
On Tue, Sep 24, 2024 at 12:50:55AM +0200, Danilo Krummrich wrote: > > From the VFIO side I would like to see something like this merged in > > nearish future as it would bring a previously out of tree approach to > > be fully intree using our modern infrastructure. This is a big win for > > the VF

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-23 Thread Danilo Krummrich
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote: > On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > > 2. Proposal for upstream > > > > > > > What is the strategy in the mid / long term with this? > > > > As you know, we're trying to mo

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-23 Thread Jason Gunthorpe
On Mon, Sep 23, 2024 at 06:22:33AM +, Tian, Kevin wrote: > > From: Zhi Wang > > Sent: Sunday, September 22, 2024 8:49 PM > > > [...] > > > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides > > extended management and features, e.g. selecting the vGPU types, support > > l

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-23 Thread Jason Gunthorpe
On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote: > > 2. Proposal for upstream > > > > What is the strategy in the mid / long term with this? > > As you know, we're trying to move to Nova and the blockers with the device / > driver infrastructure have been

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-23 Thread Danilo Krummrich
Hi Zhi, Thanks for the very detailed cover letter. On Sun, Sep 22, 2024 at 05:49:22AM -0700, Zhi Wang wrote: > 1. Background > > NVIDIA vGPU[1] software enables powerful GPU performance for workloads > ranging from graphics-rich virtual workstations to data science and AI, > enab

Re: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-23 Thread Danilo Krummrich
On Sun, Sep 22, 2024 at 04:11:21PM +0300, Zhi Wang wrote: > On Sun, 22 Sep 2024 05:49:22 -0700 > Zhi Wang wrote: > > +Ben. > > Forget to add you. My bad. Please also add the driver maintainers! I had to fetch the patchset from the KVM list, since they did not hit the nouveau list (I'm trying

RE: [RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support

2024-09-22 Thread Tian, Kevin
> From: Zhi Wang > Sent: Sunday, September 22, 2024 8:49 PM > [...] > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides > extended management and features, e.g. selecting the vGPU types, support > live migration and driver warm update. > > Like other devices that VFIO suppo