Re: Bug report: HiBMC crash

2018-09-23 Thread John Garry

On 21/09/2018 15:28, Chris Wilson wrote:

Quoting John Garry (2018-09-21 09:11:19)

On 21/09/2018 06:49, Liuxinliang (Matthew Liu) wrote:

Hi John,
Thank you for reporting this bug.
I am now using 4.18.7. I haven't found this issue yet.
I will try linux-next and figure out what's wrong with it.

Thanks,
Xinliang




As mentioned in internal mail, the issue may be that the surface
depth/bpp we were using in the driver was previously invalid, but
code has since been added in v4.19 to reject this. Specifically it looks
like this patch:

commit 70109354fed232dfce8fb2c7cadf635acbe03e19
Author: Chris Wilson 
Date:   Wed Sep 5 16:31:16 2018 +0100

 drm: Reject unknown legacy bpp and depth for drm_mode_addfb ioctl



diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
index b92595c477ef..f3e7f41e6781 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
@@ -71,7 +71,6 @@ static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
DRM_DEBUG_DRIVER("surface width(%d), height(%d) and bpp(%d)\n",
 sizes->surface_width, sizes->surface_height,
 sizes->surface_bpp);
-   sizes->surface_depth = 32;

bytes_per_pixel = DIV_ROUND_UP(sizes->surface_bpp, 8);

@@ -192,7 +191,6 @@ int hibmc_fbdev_init(struct hibmc_drm_private *priv)
return -ENOMEM;
}

-   priv->fbdev = hifbdev;
drm_fb_helper_prepare(priv->dev, &hifbdev->helper,
  &hibmc_fbdev_helper_funcs);

@@ -246,6 +244,7 @@ int hibmc_fbdev_init(struct hibmc_drm_private *priv)
 fix->ypanstep, fix->ywrapstep, fix->line_length,
 fix->accel, fix->capabilities);

+   priv->fbdev = hifbdev;
return 0;

 fini:

>
> Apply chunks 2&3 first to confirm they fix the GPF.
> -Chris

Hi Chris,

So relocating where priv->fbdev is set does fix the crash.
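
One possible reading of this result (an assumption, not something confirmed in this thread): if priv->fbdev is published before initialisation has succeeded, a later teardown that tests priv->fbdev can end up working on an object the error path has already released. The sketch below is plain userspace C with hypothetical demo_* names standing in for the driver structures; it only illustrates the "publish the pointer last" pattern from chunks 2 and 3.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the real hibmc types. */
struct demo_fbdev {
	int initialised;
};

struct demo_priv {
	struct demo_fbdev *fbdev;
};

/*
 * Initialise the fbdev state and publish it in priv only once every
 * step has succeeded. On failure the object is released and
 * priv->fbdev is left untouched (NULL), so teardown sees nothing.
 */
static int demo_fbdev_init(struct demo_priv *priv, int fail_setup)
{
	struct demo_fbdev *fbdev = calloc(1, sizeof(*fbdev));

	if (!fbdev)
		return -ENOMEM;

	if (fail_setup) {
		/* Error path: free the object, never publish the pointer. */
		free(fbdev);
		return -EINVAL;
	}

	fbdev->initialised = 1;
	priv->fbdev = fbdev;	/* published last */
	return 0;
}

/* Teardown relies on priv->fbdev being either NULL or fully initialised. */
static void demo_fbdev_fini(struct demo_priv *priv)
{
	if (!priv->fbdev)
		return;

	free(priv->fbdev);
	priv->fbdev = NULL;
}
```

With this ordering, calling demo_fbdev_fini() after a failed demo_fbdev_init() is a no-op rather than a second release of the same object.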

However then applying chunk #1 introduces another crash:

[9.229007] pci 0007:90:00.0: can't derive routing for PCI INT A
[9.235082] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[9.240457] [TTM] Zone  kernel: Available graphics memory: 16297792 kiB
[9.247147] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[9.253744] [TTM] Initializing pool allocator
[9.258148] [TTM] Initializing DMA pool allocator
[9.262951] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[9.269636] [drm] No driver support for vblank timestamp query.
[9.280967] Unable to handle kernel NULL pointer dereference at virtual address 0150

[9.289849] Mem abort info:
[9.292666]   ESR = 0x9644
[9.295747]   Exception class = DABT (current EL), IL = 32 bits
[9.301728]   SET = 0, FnV = 0
[9.304809]   EA = 0, S1PTW = 0
[9.307977] Data abort info:
[9.310882]   ISV = 0, ISS = 0x0044
[9.314754]   CM = 0, WnR = 1
[9.317744] [0150] user address but active_mm is swapper
[9.324166] Internal error: Oops: 9644 [#1] PREEMPT SMP
[9.329793] Modules linked in:
[9.332874] CPU: 16 PID: 293 Comm: kworker/16:1 Not tainted 4.19.0-rc4-next-20180920-1-g9b0012c-dirty #345
[9.342983] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.352223] Workqueue: events work_for_cpu_fn
[9.356621] pstate: 8005 (Nzcv daif -PAN -UAO)
[9.361461] pc : hibmc_drm_fb_create+0x20c/0x3c0
[9.366122] lr : hibmc_drm_fb_create+0x1e4/0x3c0
[9.370781] sp : 0aeebb50
[9.374123] x29: 0aeebb50 x28: 
[9.379489] x27: 0aeebca0 x26: 8017b3830800
[9.384854] x25: 8017b3828018 x24: 8017b3850018
[9.390219] x23: 8017b3830670 x22: 8017b3830800
[9.395583] x21: 000eb000 x20: 8017b3830a70
[9.400948] x19: 091f9000 x18: 
[9.406313] x17:  x16: 8017d4168000
[9.411678] x15: 091f96c8 x14: 09049000
[9.417042] x13:  x12: 
[9.422407] x11: 8017daf39940 x10: 0040
[9.427772] x9 : 8017b53e02b0 x8 : 8017daf39918
[9.433136] x7 : 8017daf39a60 x6 : 8017b3840800
[9.438500] x5 :  x4 : 
[9.443865] x3 : 8017b53e0290 x2 : 

Re: Bug report: HiBMC crash

2018-09-23 Thread John Garry

On 21/09/2018 06:49, Liuxinliang (Matthew Liu) wrote:

Hi John,
Thank you for reporting this bug.
I am now using 4.18.7. I haven't found this issue yet.
I will try linux-next and figure out what's wrong with it.

Thanks,
Xinliang




As mentioned in internal mail, the issue may be that the surface 
depth/bpp we were using in the driver was previously invalid, but 
code has since been added in v4.19 to reject this. Specifically it looks 
like this patch:


commit 70109354fed232dfce8fb2c7cadf635acbe03e19
Author: Chris Wilson 
Date:   Wed Sep 5 16:31:16 2018 +0100

drm: Reject unknown legacy bpp and depth for drm_mode_addfb ioctl
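
To make the suspected failure concrete, here is a small userspace sketch of the legacy bpp/depth lookup as I understand it to behave after that commit: only a fixed set of pairs maps to a pixel format, and anything else is rejected. The function name and the use of strings instead of fourcc codes are illustrative only. It also assumes the driver requests 16 bpp for fbdev, which is not stated in this thread; combined with the hard-coded surface_depth of 32 that would be an unknown pair.

```c
#include <stddef.h>

/*
 * Illustrative lookup: return the name of the pixel format for a
 * known legacy bpp/depth pair, or NULL when the pair is unknown.
 * The real kernel helper returns fourcc codes; names are used here
 * purely to keep the sketch readable.
 */
static const char *legacy_fb_format_name(unsigned int bpp, unsigned int depth)
{
	switch (bpp) {
	case 8:
		return depth == 8 ? "C8" : NULL;
	case 16:
		if (depth == 15)
			return "XRGB1555";
		if (depth == 16)
			return "RGB565";
		return NULL;
	case 24:
		return depth == 24 ? "RGB888" : NULL;
	case 32:
		if (depth == 24)
			return "XRGB8888";
		if (depth == 30)
			return "XRGB2101010";
		if (depth == 32)
			return "ARGB8888";
		return NULL;
	default:
		return NULL;
	}
}
```

Under these assumptions legacy_fb_format_name(16, 32) yields NULL, whereas (16, 16) or (32, 32) would be accepted.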


Thanks,
John


On 2018/9/20 19:23, John Garry wrote:

On 20/09/2018 11:04, John Garry wrote:

Hi,

I am seeing this crash below on linux-next (20 Sept).

This is on an arm64 D05 board, which includes the HiBMC device. D06 was
also crashing for what looked like the same reason. I am using the standard
defconfig, except that DRM and DRM_HISI_HIBMC are built-in.

Is this a known issue? I tested v4.19-rc3 and it had no such crash.

The origin seems to be here, where the pointer info is not checked for NULL
before it is dereferenced:
static int framebuffer_check(struct drm_device *dev,
 const struct drm_mode_fb_cmd2 *r)
{
...

/* now let the driver pick its own format info */
info = drm_get_format_info(dev, r);

...

for (i = 0; i < info->num_planes; i++) {
unsigned int width = fb_plane_width(r->width, info, i);
unsigned int height = fb_plane_height(r->height, info, i);
unsigned int cpp = info->cpp[i];




Upon closer inspection, the crash actually comes from the hibmc probe
error-handling path: hibmc_fbdev_destroy()->drm_framebuffer_put() is
called with fb holding the error value from hibmc_framebuffer_init(),
as shown:

static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
   struct drm_fb_helper_surface_size *sizes)
{

...

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);

*** hi_fbdev->fb holds error code ***

DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}


static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
{
struct hibmc_framebuffer *gfb = fbdev->fb;
struct drm_fb_helper *fbh = &fbdev->helper;

drm_fb_helper_unregister_fbi(fbh);

drm_fb_helper_fini(fbh);

*** gfb holds error code, not pointer ***

if (gfb)
drm_framebuffer_put(&gfb->fb);
}

This change fixes the crash for me:

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);
+hi_fbdev->fb = NULL;
DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}
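
The reason the "if (gfb)" test does not help is that an error-encoded pointer is non-NULL. The following userspace sketch shows this, using stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR() that are assumed to mirror the kernel helpers; the demo_* names are hypothetical and nothing here is driver code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins, assumed to mirror the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the fbdev state that owns the framebuffer. */
struct demo_fbdev {
	void *fb;
};

/* Hypothetical init that always fails, like the path hit in this report. */
static void *demo_framebuffer_init(void)
{
	return ERR_PTR(-EINVAL);
}

/* Create step; clear_on_error selects whether the one-line fix is applied. */
static int demo_fb_create(struct demo_fbdev *fbdev, int clear_on_error)
{
	fbdev->fb = demo_framebuffer_init();
	if (IS_ERR(fbdev->fb)) {
		int ret = (int)PTR_ERR(fbdev->fb);

		if (clear_on_error)
			fbdev->fb = NULL;	/* the fix: do not keep the error value */
		return ret;
	}
	return 0;
}

/* Mirrors the "if (gfb)" test in the destroy path: 1 means put would run. */
static int demo_destroy_would_put(const struct demo_fbdev *fbdev)
{
	return fbdev->fb != NULL;
}
```

Without the fix, demo_destroy_would_put() reports that the put would run on the error value; with the fix it does not.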

Why we're hitting the error path at all, I don't know.

And, having said all that, the code I pointed out in
framebuffer_check() still does not seem safe, for the same reason I
mentioned originally.

John


John

[9.220446] pci 0007:90:00.0: can't derive routing for PCI INT A
[9.226517] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[9.231847] [TTM] Zone  kernel: Available graphics memory: 16297696 kiB
[9.238536] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[9.245133] [TTM] Initializing pool allocator
[9.249536] [TTM] Initializing DMA pool allocator
[9.254340] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[9.261026] [drm] No driver support for vblank timestamp query.
[9.272431] WARNING: CPU: 16 PID: 293 at drivers/gpu/drm/drm_fourcc.c:221 drm_format_info.part.1+0x0/0x8
[9.282014] Modules linked in:
[9.285095] CPU: 16 PID: 293 Comm: kworker/16:1 Not tainted 4.19.0-rc4-next-20180920-1-g9b0012c #322
[9.294677] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.303915] Workqueue: events work_for_cpu_fn
[9.308314] pstate: 6005 (nZCv daif -PAN -UAO)
[9.313150] pc : drm_format_info.part.1+0x0/0x8
[9.317724] lr : drm_get_format_info+0x90/0x98
[9.322208] sp : 0af1baf0
[9.325549] x29: 0af1baf0 x28: 
[9.330915] x27: 0af1bcb0 x26: 8017d3018800
[9.336279] x25: 8017d28a0018 x24: 8017d2f80018
[9.341644] x23: 8017d3018670 x22: 0af1bbf0
[9.347009] x21: 8017d3018a70 x20: 0af1bbf0
[9.352373] x19: 0af1bbf0 x18: 
[9.357737] x17:  x16: 
[9.363102] x15: 092296c8 x14: 09074000
[9.368466] x13:  x12: 
[9.373831] x11: 8017fbffe008 x10: 8017db9307e8
[9.379195] x9 :  x8 : 8017b517c800
[9.384560] x7 :  x6 : 003f
[9.389924] x5 : 0040 x4 : 
[9.395289] x3 : 08d04000 x2 : 56555941
[

Re: Bug report: HiBMC crash

2018-09-21 Thread Chris Wilson
Quoting John Garry (2018-09-21 09:11:19)
> On 21/09/2018 06:49, Liuxinliang (Matthew Liu) wrote:
> > Hi John,
> > Thank you for reporting this bug.
> > I am now using 4.18.7. I haven't found this issue yet.
> > I will try linux-next and figure out what's wrong with it.
> >
> > Thanks,
> > Xinliang
> >
> >
> 
> As mentioned in internal mail, the issue may be that the surface 
> depth/bpp we were using in the driver was previously invalid, but 
> code has since been added in v4.19 to reject this. Specifically it looks 
> like this patch:
> 
> commit 70109354fed232dfce8fb2c7cadf635acbe03e19
> Author: Chris Wilson 
> Date:   Wed Sep 5 16:31:16 2018 +0100
> 
>  drm: Reject unknown legacy bpp and depth for drm_mode_addfb ioctl


diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
index b92595c477ef..f3e7f41e6781 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
@@ -71,7 +71,6 @@ static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
DRM_DEBUG_DRIVER("surface width(%d), height(%d) and bpp(%d)\n",
 sizes->surface_width, sizes->surface_height,
 sizes->surface_bpp);
-   sizes->surface_depth = 32;

bytes_per_pixel = DIV_ROUND_UP(sizes->surface_bpp, 8);

@@ -192,7 +191,6 @@ int hibmc_fbdev_init(struct hibmc_drm_private *priv)
return -ENOMEM;
}

-   priv->fbdev = hifbdev;
drm_fb_helper_prepare(priv->dev, &hifbdev->helper,
  &hibmc_fbdev_helper_funcs);

@@ -246,6 +244,7 @@ int hibmc_fbdev_init(struct hibmc_drm_private *priv)
 fix->ypanstep, fix->ywrapstep, fix->line_length,
 fix->accel, fix->capabilities);

+   priv->fbdev = hifbdev;
return 0;

 fini:

Apply chunks 2&3 first to confirm they fix the GPF.
-Chris


Re: Bug report: HiBMC crash

2018-09-21 Thread John Garry

On 20/09/2018 11:04, John Garry wrote:

Hi,

I am seeing this crash below on linux-next (20 Sept).

This is on an arm64 D05 board, which includes the HiBMC device. D06 was
also crashing for what looked like the same reason. I am using the standard
defconfig, except that DRM and DRM_HISI_HIBMC are built-in.

Is this a known issue? I tested v4.19-rc3 and it had no such crash.

The origin seems to be here, where the pointer info is not checked for NULL
before it is dereferenced:
static int framebuffer_check(struct drm_device *dev,
 const struct drm_mode_fb_cmd2 *r)
{
...

/* now let the driver pick its own format info */
info = drm_get_format_info(dev, r);

...

for (i = 0; i < info->num_planes; i++) {
unsigned int width = fb_plane_width(r->width, info, i);
unsigned int height = fb_plane_height(r->height, info, i);
unsigned int cpp = info->cpp[i];




Upon closer inspection, the crash actually comes from the hibmc probe
error-handling path: hibmc_fbdev_destroy()->drm_framebuffer_put() is
called with fb holding the error value from hibmc_framebuffer_init(),
as shown:


static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
   struct drm_fb_helper_surface_size *sizes)
{

...

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);

*** hi_fbdev->fb holds error code ***

DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}


static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
{
struct hibmc_framebuffer *gfb = fbdev->fb;
struct drm_fb_helper *fbh = &fbdev->helper;

drm_fb_helper_unregister_fbi(fbh);

drm_fb_helper_fini(fbh);

*** gfb holds error code, not pointer ***

if (gfb)
drm_framebuffer_put(&gfb->fb);
}

This change fixes the crash for me:

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);
+   hi_fbdev->fb = NULL;
DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}

Why we're hitting the error path at all, I don't know.

And, having said all that, the code I pointed out in framebuffer_check()
still does not seem safe, for the same reason I mentioned originally.


John


John

[9.220446] pci 0007:90:00.0: can't derive routing for PCI INT A
[9.226517] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[9.231847] [TTM] Zone  kernel: Available graphics memory: 16297696 kiB
[9.238536] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[9.245133] [TTM] Initializing pool allocator
[9.249536] [TTM] Initializing DMA pool allocator
[9.254340] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[9.261026] [drm] No driver support for vblank timestamp query.
[9.272431] WARNING: CPU: 16 PID: 293 at drivers/gpu/drm/drm_fourcc.c:221 drm_format_info.part.1+0x0/0x8
[9.282014] Modules linked in:
[9.285095] CPU: 16 PID: 293 Comm: kworker/16:1 Not tainted 4.19.0-rc4-next-20180920-1-g9b0012c #322
[9.294677] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.303915] Workqueue: events work_for_cpu_fn
[9.308314] pstate: 6005 (nZCv daif -PAN -UAO)
[9.313150] pc : drm_format_info.part.1+0x0/0x8
[9.317724] lr : drm_get_format_info+0x90/0x98
[9.322208] sp : 0af1baf0
[9.325549] x29: 0af1baf0 x28: 
[9.330915] x27: 0af1bcb0 x26: 8017d3018800
[9.336279] x25: 8017d28a0018 x24: 8017d2f80018
[9.341644] x23: 8017d3018670 x22: 0af1bbf0
[9.347009] x21: 8017d3018a70 x20: 0af1bbf0
[9.352373] x19: 0af1bbf0 x18: 
[9.357737] x17:  x16: 
[9.363102] x15: 092296c8 x14: 09074000
[9.368466] x13:  x12: 
[9.373831] x11: 8017fbffe008 x10: 8017db9307e8
[9.379195] x9 :  x8 : 8017b517c800
[9.384560] x7 :  x6 : 003f
[9.389924] x5 : 0040 x4 : 
[9.395289] x3 : 08d04000 x2 : 56555941
[9.400654] x1 : 08d04f70 x0 : 0044
[9.406019] Call trace:
[9.408483]  drm_format_info.part.1+0x0/0x8
[9.412705]  drm_helper_mode_fill_fb_struct+0x20/0x80
[9.417807]  hibmc_framebuffer_init+0x48/0xd0
[9.422204]  hibmc_drm_fb_create+0x1ec/0x3c8
[9.426513]  __drm_fb_helper_initial_config_and_unlock+0x1cc/0x418
[9.432756]  drm_fb_helper_initial_config+0x3c/0x48
[9.437681]  hibmc_fbdev_init+0xb4/0x198
[9.441638]  hibmc_pci_probe+0x2f4/0x3c8
[9.445598]  local_pci_probe+0x3c/0xb0
[9.449379] 

Re: Bug report: HiBMC crash

2018-09-21 Thread xinliang

Hi John,
Thank you for reporting this bug.
I am now using 4.18.7. I haven't found this issue yet.
I will try linux-next and figure out what's wrong with it.

Thanks,
Xinliang


On 2018/9/20 19:23, John Garry wrote:

On 20/09/2018 11:04, John Garry wrote:

Hi,

I am seeing this crash below on linux-next (20 Sept).

This is on an arm64 D05 board, which includes the HiBMC device. D06 was
also crashing for what looked like the same reason. I am using the standard
defconfig, except that DRM and DRM_HISI_HIBMC are built-in.

Is this a known issue? I tested v4.19-rc3 and it had no such crash.

The origin seems to be here, where the pointer info is not checked for NULL
before it is dereferenced:
static int framebuffer_check(struct drm_device *dev,
 const struct drm_mode_fb_cmd2 *r)
{
...

/* now let the driver pick its own format info */
info = drm_get_format_info(dev, r);

...

for (i = 0; i < info->num_planes; i++) {
unsigned int width = fb_plane_width(r->width, info, i);
unsigned int height = fb_plane_height(r->height, info, i);
unsigned int cpp = info->cpp[i];




Upon closer inspection, the crash actually comes from the hibmc probe
error-handling path: hibmc_fbdev_destroy()->drm_framebuffer_put() is
called with fb holding the error value from hibmc_framebuffer_init(),
as shown:


static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
   struct drm_fb_helper_surface_size *sizes)
{

...

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);

*** hi_fbdev->fb holds error code ***

DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}


static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
{
struct hibmc_framebuffer *gfb = fbdev->fb;
struct drm_fb_helper *fbh = &fbdev->helper;

drm_fb_helper_unregister_fbi(fbh);

drm_fb_helper_fini(fbh);

*** gfb holds error code, not pointer ***

if (gfb)
drm_framebuffer_put(&gfb->fb);
}

This change fixes the crash for me:

hi_fbdev->fb = hibmc_framebuffer_init(priv->dev, &mode_cmd, gobj);
if (IS_ERR(hi_fbdev->fb)) {
ret = PTR_ERR(hi_fbdev->fb);
+hi_fbdev->fb = NULL;
DRM_ERROR("failed to initialize framebuffer: %d\n", ret);
goto out_release_fbi;
}

Why we're hitting the error path at all, I don't know.

And, having said all that, the code I pointed out in
framebuffer_check() still does not seem safe, for the same reason I
mentioned originally.


John


John

[9.220446] pci 0007:90:00.0: can't derive routing for PCI INT A
[9.226517] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[9.231847] [TTM] Zone  kernel: Available graphics memory: 16297696 kiB
[9.238536] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[9.245133] [TTM] Initializing pool allocator
[9.249536] [TTM] Initializing DMA pool allocator
[9.254340] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[9.261026] [drm] No driver support for vblank timestamp query.
[9.272431] WARNING: CPU: 16 PID: 293 at drivers/gpu/drm/drm_fourcc.c:221 drm_format_info.part.1+0x0/0x8
[9.282014] Modules linked in:
[9.285095] CPU: 16 PID: 293 Comm: kworker/16:1 Not tainted 4.19.0-rc4-next-20180920-1-g9b0012c #322
[9.294677] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.303915] Workqueue: events work_for_cpu_fn
[9.308314] pstate: 6005 (nZCv daif -PAN -UAO)
[9.313150] pc : drm_format_info.part.1+0x0/0x8
[9.317724] lr : drm_get_format_info+0x90/0x98
[9.322208] sp : 0af1baf0
[9.325549] x29: 0af1baf0 x28: 
[9.330915] x27: 0af1bcb0 x26: 8017d3018800
[9.336279] x25: 8017d28a0018 x24: 8017d2f80018
[9.341644] x23: 8017d3018670 x22: 0af1bbf0
[9.347009] x21: 8017d3018a70 x20: 0af1bbf0
[9.352373] x19: 0af1bbf0 x18: 
[9.357737] x17:  x16: 
[9.363102] x15: 092296c8 x14: 09074000
[9.368466] x13:  x12: 
[9.373831] x11: 8017fbffe008 x10: 8017db9307e8
[9.379195] x9 :  x8 : 8017b517c800
[9.384560] x7 :  x6 : 003f
[9.389924] x5 : 0040 x4 : 
[9.395289] x3 : 08d04000 x2 : 56555941
[9.400654] x1 : 08d04f70 x0 : 0044
[9.406019] Call trace:
[9.408483]  drm_format_info.part.1+0x0/0x8
[9.412705]  drm_helper_mode_fill_fb_struct+0x20/0x80
[9.417807]  hibmc_framebuffer_init+0x48/0xd0
[9.422204]  hibmc_drm_fb_create+0x1ec/0x3c8
[9.426513] __drm_fb_helper_initial_config_and_unlock+0x1cc/0x418
[9.432756]  drm_fb_helper_initial_config+0x3c/0x48
[9.437681]  hibmc_fbdev_init+0xb4/0x198
[9.441638]  

Bug report: HiBMC crash

2018-09-21 Thread John Garry

Hi,

I am seeing this crash below on linux-next (20 Sept).

This is on an arm64 D05 board, which includes the HiBMC device. D06 was
also crashing for what looked like the same reason. I am using the standard
defconfig, except that DRM and DRM_HISI_HIBMC are built-in.

Is this a known issue? I tested v4.19-rc3 and it had no such crash.

The origin seems to be here, where the pointer info is not checked for NULL
before it is dereferenced:

static int framebuffer_check(struct drm_device *dev,
 const struct drm_mode_fb_cmd2 *r)
{
...

/* now let the driver pick its own format info */
info = drm_get_format_info(dev, r);

...

for (i = 0; i < info->num_planes; i++) {
unsigned int width = fb_plane_width(r->width, info, i);
unsigned int height = fb_plane_height(r->height, info, i);
unsigned int cpp = info->cpp[i];
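
For illustration only, the kind of defensive check meant here could look like the userspace sketch below. All demo_* names, the single-entry format table, and the pitch test are hypothetical; this is not the DRM core code, just the pattern of bailing out before the lookup result is dereferenced.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified format description. */
struct demo_format_info {
	unsigned int num_planes;
	unsigned int cpp[3];	/* bytes per pixel, per plane */
};

#define DEMO_FMT_XRGB8888 1u	/* made-up identifier, not a real fourcc */

static const struct demo_format_info demo_xrgb8888 = {
	.num_planes = 1,
	.cpp = { 4, 0, 0 },
};

/* Hypothetical lookup: returns NULL for any format it does not know. */
static const struct demo_format_info *demo_get_format_info(unsigned int format)
{
	return format == DEMO_FMT_XRGB8888 ? &demo_xrgb8888 : NULL;
}

/*
 * Validate a framebuffer request. The lookup result is checked for
 * NULL before the per-plane loop touches info->num_planes or cpp[].
 */
static int demo_framebuffer_check(unsigned int format, unsigned int width,
				  const unsigned int *pitches)
{
	const struct demo_format_info *info = demo_get_format_info(format);
	unsigned int i;

	if (!info)
		return -EINVAL;

	for (i = 0; i < info->num_planes; i++) {
		/* Widen before multiplying so the product cannot wrap. */
		unsigned long long min_pitch =
			(unsigned long long)width * info->cpp[i];

		if (pitches[i] < min_pitch)
			return -EINVAL;
	}

	return 0;
}
```

An unknown format then fails with -EINVAL instead of faulting in the loop.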


John

[9.220446] pci 0007:90:00.0: can't derive routing for PCI INT A
[9.226517] hibmc-drm 0007:91:00.0: PCI INT A: no GSI
[9.231847] [TTM] Zone  kernel: Available graphics memory: 16297696 kiB
[9.238536] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[9.245133] [TTM] Initializing pool allocator
[9.249536] [TTM] Initializing DMA pool allocator
[9.254340] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[9.261026] [drm] No driver support for vblank timestamp query.
[9.272431] WARNING: CPU: 16 PID: 293 at drivers/gpu/drm/drm_fourcc.c:221 drm_format_info.part.1+0x0/0x8
[9.282014] Modules linked in:
[9.285095] CPU: 16 PID: 293 Comm: kworker/16:1 Not tainted 4.19.0-rc4-next-20180920-1-g9b0012c #322
[9.294677] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.303915] Workqueue: events work_for_cpu_fn
[9.308314] pstate: 6005 (nZCv daif -PAN -UAO)
[9.313150] pc : drm_format_info.part.1+0x0/0x8
[9.317724] lr : drm_get_format_info+0x90/0x98
[9.322208] sp : 0af1baf0
[9.325549] x29: 0af1baf0 x28: 
[9.330915] x27: 0af1bcb0 x26: 8017d3018800
[9.336279] x25: 8017d28a0018 x24: 8017d2f80018
[9.341644] x23: 8017d3018670 x22: 0af1bbf0
[9.347009] x21: 8017d3018a70 x20: 0af1bbf0
[9.352373] x19: 0af1bbf0 x18: 
[9.357737] x17:  x16: 
[9.363102] x15: 092296c8 x14: 09074000
[9.368466] x13:  x12: 
[9.373831] x11: 8017fbffe008 x10: 8017db9307e8
[9.379195] x9 :  x8 : 8017b517c800
[9.384560] x7 :  x6 : 003f
[9.389924] x5 : 0040 x4 : 
[9.395289] x3 : 08d04000 x2 : 56555941
[9.400654] x1 : 08d04f70 x0 : 0044
[9.406019] Call trace:
[9.408483]  drm_format_info.part.1+0x0/0x8
[9.412705]  drm_helper_mode_fill_fb_struct+0x20/0x80
[9.417807]  hibmc_framebuffer_init+0x48/0xd0
[9.422204]  hibmc_drm_fb_create+0x1ec/0x3c8
[9.426513]  __drm_fb_helper_initial_config_and_unlock+0x1cc/0x418
[9.432756]  drm_fb_helper_initial_config+0x3c/0x48
[9.437681]  hibmc_fbdev_init+0xb4/0x198
[9.441638]  hibmc_pci_probe+0x2f4/0x3c8
[9.445598]  local_pci_probe+0x3c/0xb0
[9.449379]  work_for_cpu_fn+0x18/0x28
[9.453161]  process_one_work+0x1e0/0x318
[9.457207]  worker_thread+0x228/0x450
[9.460988]  kthread+0x128/0x130
[9.464244]  ret_from_fork+0x10/0x18
[9.467850] ---[ end trace 2695ffa0af5be373 ]---
[9.472525] WARNING: CPU: 16 PID: 293 at drivers/gpu/drm/drm_framebuffer.c:730 drm_framebuffer_init+0x18/0x110
[9.482634] Modules linked in:
[9.485714] CPU: 16 PID: 293 Comm: kworker/16:1 Tainted: G        W         4.19.0-rc4-next-20180920-1-g9b0012c #322
[9.496702] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[9.505936] Workqueue: events work_for_cpu_fn
[9.510333] pstate: 6005 (nZCv daif -PAN -UAO)
[9.515170] pc : drm_framebuffer_init+0x18/0x110
[9.519831] lr : hibmc_framebuffer_init+0x60/0xd0
[9.524578] sp : 0af1baf0
[9.527920] x29: 0af1baf0 x28: 
[9.533284] x27: 0af1bcb0 x26: 8017d3018800
[9.538649] x25: 8017d28a0018 x24: 8017d2f80018
[9.544014] x23: 8017d3018670 x22: 0af1bbf0
[9.549378] x21: 8017d3018a70 x20: 8017d242
[9.554743] x19: 8017b517c700 x18: 
[9.560108] x17:  x16: 
[9.565472] x15: 092296c8 x14: 09074000
[9.570837] x13:  x12: 
[9.576201] x11: 8017fbffe008 x10: 8017db9307e8
[9.581566] x9 :  x8 : 8017b517c800
[9.586930] x7 :  x6 : 003f
[9.592295] x5 : 0040 x4 : 
[