On Tue, Jun 23, 2020 at 02:44:24PM -0400, Felix Kuehling wrote:
> Am 2020-06-23 um 3:39 a.m. schrieb Daniel Vetter:
> > On Fri, Jun 12, 2020 at 1:35 AM Felix Kuehling
> > wrote:
> >> Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
> >>> On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel
Am 2020-06-23 um 3:39 a.m. schrieb Daniel Vetter:
> On Fri, Jun 12, 2020 at 1:35 AM Felix Kuehling wrote:
>> Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
>>> On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> I still have my doubts about allowing fence waiting from
On Fri, Jun 19, 2020 at 04:10:11PM -0400, Jerome Glisse wrote:
> Maybe we can audit how user ptr buffer are use today and see if
> we can define a usage pattern that would allow to cut corner in
> kernel. For instance we could use mmu notifier just to block CPU
> pte update while we do GUP and
On Mon, Jun 22, 2020 at 04:15:40PM -0400, Jerome Glisse wrote:
> On Mon, Jun 22, 2020 at 08:46:17AM -0300, Jason Gunthorpe wrote:
> > On Fri, Jun 19, 2020 at 04:31:47PM -0400, Jerome Glisse wrote:
> > > Not doable as page refcount can change for things unrelated to GUP, with
> > > John changes we
On Fri, Jun 12, 2020 at 1:35 AM Felix Kuehling wrote:
>
> Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
> > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> >>> I still have my doubts about allowing fence waiting from within shrinkers.
> >>> IMO ideally they should use a
On Mon, Jun 22, 2020 at 08:46:17AM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 19, 2020 at 04:31:47PM -0400, Jerome Glisse wrote:
> > Not doable as page refcount can change for things unrelated to GUP, with
> > John changes we can identify GUP and we could potentialy copy GUPed page
> > instead of
On Fri, Jun 19, 2020 at 04:31:47PM -0400, Jerome Glisse wrote:
> Not doable as page refcount can change for things unrelated to GUP, with
> John changes we can identify GUP and we could potentialy copy GUPed page
> instead of COW but this can potentialy slow down fork() and i am not sure
> how
On Fri, Jun 19, 2020 at 10:43:20PM +0200, Daniel Vetter wrote:
> On Fri, Jun 19, 2020 at 10:10 PM Jerome Glisse wrote:
> >
> > On Fri, Jun 19, 2020 at 03:18:49PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> > > > On Fri, Jun 19, 2020 at
On Fri, Jun 19, 2020 at 10:10 PM Jerome Glisse wrote:
>
> On Fri, Jun 19, 2020 at 03:18:49PM -0300, Jason Gunthorpe wrote:
> > On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> > > On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> > > > On Fri, Jun 19, 2020 at
On Fri, Jun 19, 2020 at 04:55:38PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 19, 2020 at 03:48:49PM -0400, Felix Kuehling wrote:
> > Am 2020-06-19 um 2:18 p.m. schrieb Jason Gunthorpe:
> > > On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> > >> On Fri, Jun 19, 2020 at 02:23:08PM
On Fri, Jun 19, 2020 at 03:18:49PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> > On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
> > >
> > > > The madness is
Am 2020-06-19 um 3:55 p.m. schrieb Jason Gunthorpe:
> On Fri, Jun 19, 2020 at 03:48:49PM -0400, Felix Kuehling wrote:
>> Am 2020-06-19 um 2:18 p.m. schrieb Jason Gunthorpe:
>>> On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason
On Fri, Jun 19, 2020 at 03:30:32PM -0400, Felix Kuehling wrote:
> We have a potential problem with CPU updating page tables while the GPU
> is retrying on page table entries because 64 bit CPU transactions don't
> arrive in device memory atomically.
Except for 32 bit platforms atomicity is
On Fri, Jun 19, 2020 at 03:48:49PM -0400, Felix Kuehling wrote:
> Am 2020-06-19 um 2:18 p.m. schrieb Jason Gunthorpe:
> > On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> >> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> >>> On Fri, Jun 19, 2020 at 06:19:41PM
Am 2020-06-19 um 2:18 p.m. schrieb Jason Gunthorpe:
> On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
>> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
>>> On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
>>>
The madness is only that device B's
On Fri, Jun 19, 2020 at 03:30:32PM -0400, Felix Kuehling wrote:
>
> Am 2020-06-19 um 3:11 p.m. schrieb Alex Deucher:
> > On Fri, Jun 19, 2020 at 2:09 PM Jerome Glisse wrote:
> >> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> >>> On Fri, Jun 19, 2020 at 06:19:41PM +0200,
Am 2020-06-19 um 3:11 p.m. schrieb Alex Deucher:
> On Fri, Jun 19, 2020 at 2:09 PM Jerome Glisse wrote:
>> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
>>> On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
>>>
The madness is only that device B's mmu notifier
On Fri, Jun 19, 2020 at 2:09 PM Jerome Glisse wrote:
>
> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> > On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
> >
> > > The madness is only that device B's mmu notifier might need to wait
> > > for fence_B so that the
On Fri, Jun 19, 2020 at 02:09:35PM -0400, Jerome Glisse wrote:
> On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> > On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
> >
> > > The madness is only that device B's mmu notifier might need to wait
> > > for fence_B so
On Thu, Jun 11, 2020 at 07:35:35PM -0400, Felix Kuehling wrote:
> Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
> > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> >>> I still have my doubts about allowing fence waiting from within shrinkers.
> >>> IMO ideally they should
On Fri, Jun 19, 2020 at 02:23:08PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
>
> > The madness is only that device B's mmu notifier might need to wait
> > for fence_B so that the dma operation finishes. Which in turn has to
> > wait for device
On Fri, Jun 19, 2020 at 06:19:41PM +0200, Daniel Vetter wrote:
> The madness is only that device B's mmu notifier might need to wait
> for fence_B so that the dma operation finishes. Which in turn has to
> wait for device A to finish first.
So, it sound, fundamentally you've got this graph of
On Fri, Jun 19, 2020 at 5:15 PM Jason Gunthorpe wrote:
>
> On Fri, Jun 19, 2020 at 05:06:04PM +0200, Daniel Vetter wrote:
> > On Fri, Jun 19, 2020 at 1:39 PM Jason Gunthorpe wrote:
> > >
> > > On Fri, Jun 19, 2020 at 09:22:09AM +0200, Daniel Vetter wrote:
> > > > > As I've understood GPU that
On Fri, Jun 19, 2020 at 05:06:04PM +0200, Daniel Vetter wrote:
> On Fri, Jun 19, 2020 at 1:39 PM Jason Gunthorpe wrote:
> >
> > On Fri, Jun 19, 2020 at 09:22:09AM +0200, Daniel Vetter wrote:
> > > > As I've understood GPU that means you need to show that the commands
> > > > associated with the
On Fri, Jun 19, 2020 at 09:22:09AM +0200, Daniel Vetter wrote:
> > As I've understood GPU that means you need to show that the commands
> > associated with the buffer have completed. This is all local stuff
> > within the driver, right? Why use fence (other than it already exists)
>
> Because
On Fri, Jun 19, 2020 at 1:39 PM Jason Gunthorpe wrote:
>
> On Fri, Jun 19, 2020 at 09:22:09AM +0200, Daniel Vetter wrote:
> > > As I've understood GPU that means you need to show that the commands
> > > associated with the buffer have completed. This is all local stuff
> > > within the driver,
On Fri, Jun 19, 2020 at 8:58 AM Jason Gunthorpe wrote:
>
> On Thu, Jun 18, 2020 at 05:00:51PM +0200, Daniel Vetter wrote:
> > On Wed, Jun 17, 2020 at 12:28:35PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Jun 17, 2020 at 08:48:50AM +0200, Daniel Vetter wrote:
> > >
> > > > Now my understanding
On Thu, Jun 18, 2020 at 05:00:51PM +0200, Daniel Vetter wrote:
> On Wed, Jun 17, 2020 at 12:28:35PM -0300, Jason Gunthorpe wrote:
> > On Wed, Jun 17, 2020 at 08:48:50AM +0200, Daniel Vetter wrote:
> >
> > > Now my understanding for rdma is that if you don't have hw page fault
> > > support,
> >
On Wed, Jun 17, 2020 at 12:28:35PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 17, 2020 at 08:48:50AM +0200, Daniel Vetter wrote:
>
> > Now my understanding for rdma is that if you don't have hw page fault
> > support,
>
> The RDMA ODP feature is restartable HW page faulting just like nouveau
>
On Wed, Jun 17, 2020 at 12:29:40PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 17, 2020 at 09:57:54AM +0200, Daniel Vetter wrote:
>
> > > At the very least I think there should be some big warning that
> > > dma_fence in notifiers should be avoided.
> >
> > Yeah I'm working on documentation, and
On Wed, Jun 17, 2020 at 08:48:50AM +0200, Daniel Vetter wrote:
> Now my understanding for rdma is that if you don't have hw page fault
> support,
The RDMA ODP feature is restartable HW page faulting just like nouveau
has. The classical MR feature doesn't have this. Only mlx5 HW supports
ODP
On Wed, Jun 17, 2020 at 09:57:54AM +0200, Daniel Vetter wrote:
> > At the very least I think there should be some big warning that
> > dma_fence in notifiers should be avoided.
>
> Yeah I'm working on documentation, and also the notifiers here
> hopefully make it clear it's massive pain. I think
On Wed, Jun 17, 2020 at 9:27 AM Jason Gunthorpe wrote:
>
> On Tue, Jun 16, 2020 at 02:07:19PM +0200, Daniel Vetter wrote:
> > > > I've pinged a bunch of armsoc gpu driver people and ask them how much
> > > > this
> > > > hurts, so that we have a clear answer. On x86 I don't think we have much
>
On Tue, Jun 16, 2020 at 2:07 PM Daniel Vetter wrote:
>
> Hi Jason,
>
> Somehow this got stuck somewhere in the mail queues, only popped up just
> now ...
>
> On Thu, Jun 11, 2020 at 11:15:15AM -0300, Jason Gunthorpe wrote:
> > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> > > >
On Tue, Jun 16, 2020 at 02:07:19PM +0200, Daniel Vetter wrote:
> > > I've pinged a bunch of armsoc gpu driver people and ask them how much this
> > > hurts, so that we have a clear answer. On x86 I don't think we have much
> > > of a choice on this, with userptr in amd and i915 and hmm work in
Hi Jason,
Somehow this got stuck somewhere in the mail queues, only popped up just
now ...
On Thu, Jun 11, 2020 at 11:15:15AM -0300, Jason Gunthorpe wrote:
> On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> > > I still have my doubts about allowing fence waiting from within
On Fri, Jun 12, 2020 at 1:35 AM Felix Kuehling wrote:
>
> Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
> > On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> >>> I still have my doubts about allowing fence waiting from within shrinkers.
> >>> IMO ideally they should use a
Am 2020-06-11 um 10:15 a.m. schrieb Jason Gunthorpe:
> On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
>>> I still have my doubts about allowing fence waiting from within shrinkers.
>>> IMO ideally they should use a trywait approach, in order to allow memory
>>> allocation during
On Thu, Jun 11, 2020 at 10:34:30AM +0200, Daniel Vetter wrote:
> > I still have my doubts about allowing fence waiting from within shrinkers.
> > IMO ideally they should use a trywait approach, in order to allow memory
> > allocation during command submission for drivers that
> > publish fences
On Thu, Jun 11, 2020 at 09:30:12AM +0200, Thomas Hellström (Intel) wrote:
>
> On 6/4/20 10:12 AM, Daniel Vetter wrote:
> > Two in one go:
> > - it is allowed to call dma_fence_wait() while holding a
> >dma_resv_lock(). This is fundamental to how eviction works with ttm,
> >so required.
>
On 6/4/20 10:12 AM, Daniel Vetter wrote:
Two in one go:
- it is allowed to call dma_fence_wait() while holding a
dma_resv_lock(). This is fundamental to how eviction works with ttm,
so required.
- it is allowed to call dma_fence_wait() from memory reclaim contexts,
specifically from
41 matches
Mail list logo