[Nouveau] [Bug 98138] New: Random Freeze - nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]

2016-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98138

    Bug ID: 98138
   Summary: Random Freeze - nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a
            [CTXSW_TIMEOUT]
   Product: xorg
   Version: unspecified
  Hardware: Other
        OS: All
    Status: NEW
  Severity: normal
  Priority: medium
 Component: Driver/nouveau
  Assignee: nouveau@lists.freedesktop.org
  Reporter: stefan.kele...@gmx.de
QA Contact: xorg-t...@lists.x.org

Since I started using KDE Plasma, the system freezes randomly overnight
when the desktop is locked.

kern.log when the problem begins

Oct  5 02:28:06 druuhl kernel: [73499.905933] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Oct  5 02:28:06 druuhl kernel: [73499.905939] nouveau 0000:01:00.0: fifo: sw engine fault on channel 5, recovering...
Oct  5 02:28:08 druuhl kernel: [73501.905656] nouveau 0000:01:00.0: fifo: runlist 0 update timeout
Oct  5 02:28:10 druuhl kernel: [73504.200746] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Oct  5 02:28:15 druuhl kernel: [73508.495393] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Oct  5 02:28:19 druuhl kernel: [73512.790100] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Oct  5 02:28:23 druuhl kernel: [73517.084810] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Oct  5 02:28:27 druuhl kernel: [73521.379625] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]

When the desktop is left unlocked overnight, there is no freeze. Some
discussions of this error describe problems with fullscreen applications such
as LibreOffice, which would match my findings.
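
For triage, it can help to pull the error code and name out of each fifo
entry so that repeated errors can be counted. The following is only a sketch:
the function name parse_sched_error is hypothetical, and it assumes each log
entry sits on a single line, as in an unwrapped kern.log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the code and name from a line such as
 *   "... nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]"
 * Returns 1 on success, 0 if the line holds no SCHED_ERROR entry.
 * parse_sched_error is a hypothetical helper, not part of nouveau. */
static int parse_sched_error(const char *line, unsigned int *code,
                             char *name, size_t name_len)
{
    const char *p;
    char buf[64];

    assert(line && code && name);

    p = strstr(line, "SCHED_ERROR ");
    if (!p || name_len == 0)
        return 0;

    /* %x reads the hex code; %63[^]] reads everything up to the ']'. */
    if (sscanf(p, "SCHED_ERROR %x [%63[^]]]", code, buf) != 2)
        return 0;

    snprintf(name, name_len, "%s", buf);
    return 1;
}
```

Lines without a SCHED_ERROR entry, such as the "runlist 0 update timeout"
message, make the function return 0 and leave the outputs untouched.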

I am using Neptune Linux (Debian) with KDE Plasma, and I see the problem with
kernels 4.4.13 through 4.4.20.

Linux druuhl 4.4.20 #8 SMP PREEMPT Thu Sep 8 10:25:38 CEST 2016 x86_64
GNU/Linux
kdeplasma 5.6 to 5.8
xserver-xorg-video-nouveau 1.0.12
xorg 7.7

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [Bug 98129] New: X hung with nouveau 'INVALID_CMD' and 'INVALID_ADDRESS_ALIGNMENT' errors on GeForce 9600 GT [10de:0622]

2016-10-06 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98129

    Bug ID: 98129
   Summary: X hung with nouveau 'INVALID_CMD' and
            'INVALID_ADDRESS_ALIGNMENT' errors on GeForce 9600 GT
            [10de:0622]
   Product: xorg
   Version: git
  Hardware: x86-64 (AMD64)
        OS: All
    Status: NEW
  Severity: critical
  Priority: medium
 Component: Driver/nouveau
  Assignee: nouveau@lists.freedesktop.org
  Reporter: ad...@happyassassin.net
QA Contact: xorg-t...@lists.x.org

Recently my desktop - running Fedora 25, kernel 4.8.0-0.rc8.git0.1.fc25.x86_64,
xorg-x11-drv-nouveau-1.0.13-1.fc25.x86_64 - hung in the middle of normal
operation. X was totally stuck, but I could get in and shut down via ssh. The
journal shows these nouveau errors at the time of the hang:

Oct 03 16:55:33 adam.happyassassin.net kernel: nouveau :01:00.0: fifo:
DMA_PUSHER - ch 6 [systemd-logind[1245]] get 00200218e0 put 0020024010 ib_get
032c ib_put 032d state 80004861 (err: INVALID_CMD) push 00504031
Oct 03 16:55:33 adam.happyassassin.net kernel: nouveau :01:00.0: fifo:
DMA_PUSHER - ch 6 [systemd-logind[1245]] get 0020024010 put 0020024010 ib_get
032c ib_put 0348 state 8000 (err: INVALID_CMD) push 00406040
Oct 03 16:55:33 adam.happyassassin.net kernel: nouveau :01:00.0: gr:
DATA_ERROR 000b [INVALID_ADDRESS_ALIGNMENT]
Oct 03 16:55:33 adam.happyassassin.net kernel: nouveau :01:00.0: gr:
0010 [] ch 6 [001fbe5000 systemd-logind[1245]] subc 0 class 5039 mthd 0328
data 

This is on a GeForce 9600 GT [10de:0622] with two displays connected via
DVI.



[Nouveau] [PATCH v5 2/3] drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook

2016-10-06 Thread Ard Biesheuvel
The 100c10 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default 32-bit mask when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle).

So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.
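
The ordering problem can be modelled in user space. The sketch below is not
kernel code: every toy_* name is hypothetical, the mapping rule is reduced to
"the address must fit the mask", and the RAM base address is only an example
value. It shows the allocate, map, and undo-on-failure sequence that the new
oneinit() body follows.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel objects; all names here are hypothetical. */
struct toy_device {
    uint64_t dma_mask;   /* highest address the device may be handed */
    uint64_t lowest_ram; /* lowest physical address backed by RAM */
};

struct toy_fb {
    void    *page;       /* models fb->r100c10_page */
    uint64_t dma_addr;   /* models fb->r100c10 */
};

/* Simplified model of dma_map_page(): the mapping succeeds only if the
 * page's address fits the current mask. Returns 0 on success. */
static int toy_map_page(const struct toy_device *dev, uint64_t *dma_addr)
{
    if (dev->lowest_ram > dev->dma_mask)
        return -1;
    *dma_addr = dev->lowest_ram;
    return 0;
}

/* Models the oneinit() body: allocate, map, and release the allocation
 * again if the mapping fails, so that no page is leaked. */
static int toy_fb_oneinit(const struct toy_device *dev, struct toy_fb *fb)
{
    assert(dev && fb);

    fb->page = calloc(1, 4096);
    if (!fb->page)
        return -ENOMEM;

    if (toy_map_page(dev, &fb->dma_addr)) {
        free(fb->page);
        fb->page = NULL;
        return -EFAULT;
    }
    return 0;
}
```

With a 32-bit mask and all RAM above 4 GB, toy_fb_oneinit() returns -EFAULT;
once the mask has been widened to cover the RAM, the same call succeeds.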

Signed-off-by: Ard Biesheuvel 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c | 31 
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
index 76433cc66fff..c1995c0024ef 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
@@ -50,24 +50,39 @@ gf100_fb_intr(struct nvkm_fb *base)
 }
 
 int
-gf100_fb_oneinit(struct nvkm_fb *fb)
+gf100_fb_oneinit(struct nvkm_fb *base)
 {
-   struct nvkm_device *device = fb->subdev.device;
+   struct gf100_fb *fb = gf100_fb(base);
+   struct nvkm_device *device = fb->base.subdev.device;
int ret, size = 0x1000;
 
size = nvkm_longopt(device->cfgopt, "MmuDebugBufferSize", size);
size = min(size, 0x1000);
 
ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000,
- false, &fb->mmu_rd);
+ false, &base->mmu_rd);
if (ret)
return ret;
 
ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000,
- false, &fb->mmu_wr);
+ false, &base->mmu_wr);
if (ret)
return ret;
 
+   fb->r100c10_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+   if (!fb->r100c10_page) {
+   nvkm_error(&fb->base.subdev, "failed 100c10 page alloc\n");
+   return -ENOMEM;
+   }
+
+   fb->r100c10 = dma_map_page(device->dev, fb->r100c10_page, 0, PAGE_SIZE,
+  DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(device->dev, fb->r100c10)) {
+   nvkm_error(&fb->base.subdev, "failed to map 100c10 page\n");
+   __free_page(fb->r100c10_page);
+   return -EFAULT;
+   }
+
return 0;
 }
 
@@ -123,14 +138,6 @@ gf100_fb_new_(const struct nvkm_fb_func *func, struct nvkm_device *device,
nvkm_fb_ctor(func, device, index, &fb->base);
*pfb = &fb->base;
 
-   fb->r100c10_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
-   if (fb->r100c10_page) {
-   fb->r100c10 = dma_map_page(device->dev, fb->r100c10_page, 0,
-  PAGE_SIZE, DMA_BIDIRECTIONAL);
-   if (dma_mapping_error(device->dev, fb->r100c10))
-   return -EFAULT;
-   }
-
return 0;
 }
 
-- 
2.7.4



[Nouveau] [PATCH v5 3/3] drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook

2016-10-06 Thread Ard Biesheuvel
The 100c08 scratch page is mapped using dma_map_page() before the TTM
layer has had a chance to set the DMA mask. This means we are still
running with the default 32-bit mask when this code executes, and this causes
problems for platforms with no memory below 4 GB (such as AMD Seattle).

So move the dma_map_page() to the .oneinit hook, which executes after the
DMA mask has been set.

Signed-off-by: Ard Biesheuvel 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c | 33 ++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c
index 1b5fb02eab2a..d9bc4d11f145 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c
@@ -210,6 +210,28 @@ nv50_fb_intr(struct nvkm_fb *base)
nvkm_fifo_chan_put(fifo, flags, &chan);
 }
 
+static int
+nv50_fb_oneinit(struct nvkm_fb *base)
+{
+   struct nv50_fb *fb = nv50_fb(base);
+
+   fb->r100c08_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
+   if (!fb->r100c08_page) {
+   nvkm_error(&fb->base.subdev, "failed 100c08 page alloc\n");
+   return -ENOMEM;
+   }
+
+   fb->r100c08 = dma_map_page(base->subdev.device->dev, fb->r100c08_page,
+                              0, PAGE_SIZE, DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(base->subdev.device->dev, fb->r100c08)) {
+   nvkm_error(&fb->base.subdev, "failed to map 100c08 page\n");
+   __free_page(fb->r100c08_page);
+   return -EFAULT;
+   }
+
+   return 0;
+}
+
 static void
 nv50_fb_init(struct nvkm_fb *base)
 {
@@ -245,6 +267,7 @@ nv50_fb_dtor(struct nvkm_fb *base)
 static const struct nvkm_fb_func
 nv50_fb_ = {
.dtor = nv50_fb_dtor,
+   .oneinit = nv50_fb_oneinit,
.init = nv50_fb_init,
.intr = nv50_fb_intr,
.ram_new = nv50_fb_ram_new,
@@ -263,16 +286,6 @@ nv50_fb_new_(const struct nv50_fb_func *func, struct nvkm_device *device,
fb->func = func;
*pfb = &fb->base;
 
-   fb->r100c08_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
-   if (fb->r100c08_page) {
-   fb->r100c08 = dma_map_page(device->dev, fb->r100c08_page, 0,
-  PAGE_SIZE, DMA_BIDIRECTIONAL);
-   if (dma_mapping_error(device->dev, fb->r100c08))
-   return -EFAULT;
-   } else {
-   nvkm_warn(&fb->base.subdev, "failed 100c08 page alloc\n");
-   }
-
return 0;
 }
 
-- 
2.7.4



[Nouveau] [PATCH v5 0/3] drm/nouveau: set DMA mask before mapping scratch page

2016-10-06 Thread Ard Biesheuvel
This v5 is a 3 piece series (since v4), after Alexandre pointed out that
both GF 100 and NV50 are affected by the same issue, and that a related issue
has been solved already for Tegra in commit 9d0394c6bed5
("drm/nouveau/instmem/gk20a: set DMA mask early").

The issue that this series addresses is the fact that the Nouveau driver
invokes the DMA API before setting the DMA mask. In both cases addressed
here, these are simply static bidirectional mappings of scratch pages whose
purpose is not well understood, and in most cases, it does not matter that
these pages are always allocated below 4 GB even if the hardware can access
memory much higher up.

However, on platforms without any RAM below 4 GB, the preliminary DMA mask
of 32 is preventing the nouveau driver from loading on GF100 and NV50
hardware with an error like the following one:

   nouveau :02:00.0: enabling device ( -> 0003)
   nouveau :02:00.0: NVIDIA GT218 (0a8280b1)
   nouveau :02:00.0: bios: version 70.18.a6.00.00
   nouveau :02:00.0: fb ctor failed, -14
   nouveau: probe of :02:00.0 failed with error -14
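
The -14 in this log is -EFAULT, which is the value the FB constructors return
when dma_mapping_error() reports a failed mapping. A small sketch for decoding
such codes follows; errname and errtext are hypothetical helper names, and the
numeric values (EFAULT is 14, ENOMEM is 12) are those used on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map a negative kernel-style return code (e.g. -14) to its errno symbol.
 * Only the two codes used by the scratch-page setup are listed.
 * errname is a hypothetical helper for illustration. */
static const char *errname(int ret)
{
    assert(ret <= 0); /* kernel-style codes are zero or negative */

    switch (-ret) {
    case EFAULT: return "EFAULT";
    case ENOMEM: return "ENOMEM";
    default:     return "unknown";
    }
}

/* Human-readable text for the same code, as provided by the C library. */
static const char *errtext(int ret)
{
    return strerror(-ret);
}
```

Calling errname(-14) yields "EFAULT" on Linux, which ties the probe failure
back to the dma_mapping_error() check in the constructor.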

So fix this by setting a preliminary DMA mask based on the MMU device 'dma_bits'
property (patch #1), and postpone mapping the scratch pages to the respective
FB .oneinit() hooks (#2 and #3).

v5: move setting of preliminary DMA mask to nvkm_device_pci_new() (#1)
move allocation and DMA mapping of scratch pages to .oneinit hooks (#2, #3)
v4: split and move dma_set_mask to probe hook (Alexandre)
v3: rework code to get rid of DMA_ERROR_CODE references, which is not
defined on all architectures
v2: replace incorrect comparison of dma_addr_t type var against NULL

Ard Biesheuvel (3):
  drm/nouveau: set streaming DMA mask early
  drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit()
hook
  drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit()
hook

 drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c | 37 ++--
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c   | 31 +---
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/nv50.c| 33 +++--
 3 files changed, 69 insertions(+), 32 deletions(-)

-- 
2.7.4



[Nouveau] [PATCH v5 1/3] drm/nouveau: set streaming DMA mask early

2016-10-06 Thread Ard Biesheuvel
Some subdevices (i.e., fb/nv50.c and fb/gf100.c) map a scratch page using
dma_map_page() way before the TTM layer has had a chance to set the DMA
mask. This may prevent the driver from loading at all on platforms whose
system memory is not covered by the default DMA mask of 32-bit (i.e., when
all RAM is above 4 GB).

So set a preliminary DMA mask right after constructing the PCI device, and
base it on the .dma_bits member of the MMU subdevice, which is what the TTM
layer will base the DMA mask on as well.
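
To make the arithmetic concrete, here is a user-space sketch of what a mask of
a given width covers. dma_bit_mask() mirrors the kernel's DMA_BIT_MASK()
macro; dma_mask_covers() is a hypothetical helper, and the 40-bit width used
in the note below is only an example value, not a claim about any particular
GPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space equivalent of the kernel's DMA_BIT_MASK(n): the low n bits set.
 * n == 64 is special-cased because shifting a 64-bit value by 64 bits is
 * undefined behaviour in C. */
static uint64_t dma_bit_mask(unsigned int n)
{
    assert(n >= 1 && n <= 64);
    return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/* True if a buffer of `size` bytes starting at address `addr` lies entirely
 * within the range the mask can express. Hypothetical helper. */
static bool dma_mask_covers(uint64_t mask, uint64_t addr, uint64_t size)
{
    assert(size > 0);
    return addr <= mask && size - 1 <= mask - addr;
}
```

A page at exactly 4 GB (0x100000000) is not covered by dma_bit_mask(32), but
it is covered by dma_bit_mask(40), which is why the mask has to be widened
before the first mapping is created.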

Signed-off-by: Ard Biesheuvel 
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c | 37 ++--
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
index 62ad0300cfa5..0030cd9543b2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
@@ -1665,14 +1665,31 @@ nvkm_device_pci_new(struct pci_dev *pci_dev, const char *cfg, const char *dbg,
*pdevice = &pdev->device;
pdev->pdev = pci_dev;
 
-   return nvkm_device_ctor(&nvkm_device_pci_func, quirk, &pci_dev->dev,
-   pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE :
-   pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ?
-   NVKM_DEVICE_AGP : NVKM_DEVICE_PCI,
-   (u64)pci_domain_nr(pci_dev->bus) << 32 |
-pci_dev->bus->number << 16 |
-PCI_SLOT(pci_dev->devfn) << 8 |
-PCI_FUNC(pci_dev->devfn), name,
-   cfg, dbg, detect, mmio, subdev_mask,
-   &pdev->device);
+   ret = nvkm_device_ctor(&nvkm_device_pci_func, quirk, &pci_dev->dev,
+  pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE :
+  pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ?
+  NVKM_DEVICE_AGP : NVKM_DEVICE_PCI,
+  (u64)pci_domain_nr(pci_dev->bus) << 32 |
+   pci_dev->bus->number << 16 |
+   PCI_SLOT(pci_dev->devfn) << 8 |
+   PCI_FUNC(pci_dev->devfn), name,
+  cfg, dbg, detect, mmio, subdev_mask,
+  &pdev->device);
+
+   if (ret)
+   return ret;
+
+   /*
+* Set a preliminary DMA mask based on the .dma_bits member of the
+* MMU subdevice. This allows other subdevices to create DMA mappings
+* in their init() or oneinit() methods, which may be called before the
+* TTM layer sets the DMA mask definitively.
+* This is necessary for platforms where the default DMA mask of 32
+* does not cover any system memory, i.e., when all RAM is > 4 GB.
+*/
+   if (subdev_mask & BIT(NVKM_SUBDEV_MMU))
+   dma_set_mask_and_coherent(&pci_dev->dev,
+   DMA_BIT_MASK(pdev->device.mmu->dma_bits));
+
+   return 0;
 }
-- 
2.7.4



Re: [Nouveau] [PATCH v4 1/3] drm/nouveau: set streaming DMA mask early

2016-10-06 Thread Ard Biesheuvel
On 3 October 2016 at 06:39, Alexandre Courbot  wrote:
> On Mon, Sep 26, 2016 at 9:32 PM, Ard Biesheuvel
>  wrote:
>> Some subdevices (i.e., fb/nv50.c and fb/gf100.c) map a scratch page using
>> dma_map_page() way before the TTM layer has had a chance to set the DMA
>> mask. This may prevent the driver from loading at all on platforms whose
>> system memory is not covered by the default DMA mask of 32-bit (i.e., when
>> all RAM is above 4 GB).
>>
>> So set a preliminary DMA mask right after constructing the PCI device, and
>> base it on the .dma_bits member of the MMU subdevice, which is what the TTM
>> layer will base the DMA mask on as well.
>>
>> Signed-off-by: Ard Biesheuvel 
>> ---
>>  drivers/gpu/drm/nouveau/nouveau_drm.c | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> index 652ab111dd74..e61e9a0adb51 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
>> @@ -361,6 +361,17 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>>
>> pci_set_master(pdev);
>>
>> +   /*
>> +* Set a preliminary DMA mask based on the .dma_bits member of the
>> +* MMU subdevice. This allows other subdevices to create DMA mappings
>> +* in their init() function, which are called before the TTM layer sets
>> +* the DMA mask definitively.
>> +* This is necessary for platforms where the default DMA mask of 32
>> +* does not cover any system memory, i.e., when all RAM is > 4 GB.
>> +*/
>> +   dma_set_mask_and_coherent(device->dev,
>> + DMA_BIT_MASK(device->mmu->dma_bits));
>
> I would just move this to nvkm_device_pci_new() so that it perfectly
> mirrors the same call done in nvkm_device_tegra_new(), which was done
> for the same purpose. Otherwise, looks good to me.

OK, will do that.