[Intel-gfx] [PATCH] iommu/intel: quirk to disable DMAR for QM57 igfx

2019-01-20 Thread Eric Wong
Joonas Lahtinen  wrote:
> Quoting Eric Wong (2019-01-04 03:06:26)
> > Yeah, so the Debian bpo 4.17(.17) kernel did not set
> > CONFIG_INTEL_IOMMU_DEFAULT_ON, so I didn't encounter problems.
> > My self-built kernels all set CONFIG_INTEL_IOMMU_DEFAULT_ON.
> 
> So it's the case that IOMMU never worked on your machine.
> 
> My recommendation would be to simply use intel_iommu=igfx_off if you
> need IOMMU.
> 
> Old hardware is known to have issues with IOMMU, and retroactively
> enabling IOMMU on those machines just brings them up :/

How about we use a quirk in case distros make IOMMU the default
one day?

8<
Subject: [PATCH] iommu/intel: quirk to disable DMAR for QM57 igfx

Like the GM45, it seems the integrated graphics on QM57 seems
broken and hanging graphics with "intel_iommu=on".  So allow
future users to unconditionally enable DMAR support and not have
to remember or know to specify "intel_iommu=igfx_off"

cf. https://lore.kernel.org/lkml/20181227114948.ev4b3jte3ubsc5us@dcvr/
cf. 
https://lore.kernel.org/lkml/154659116310.4596.13613897418163029...@jlahtine-desk.ger.corp.intel.com/

Signed-off-by: Eric Wong 
---
 drivers/iommu/intel-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 048b5ab36a02..dc2507a01580 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5399,7 +5399,7 @@ const struct iommu_ops intel_iommu_ops = {
 
 static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
 {
-   /* G4x/GM45 integrated gfx dmar support is totally busted. */
+   /* G4x/GM45/QM57 integrated gfx dmar support is totally busted. */
pr_info("Disabling IOMMU for graphics on this chipset\n");
dmar_map_gfx = 0;
 }
@@ -5411,6 +5411,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e20, 
quirk_iommu_g4x_gfx);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e30, quirk_iommu_g4x_gfx);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_g4x_gfx);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_g4x_gfx);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_iommu_g4x_gfx);
 
 static void quirk_iommu_rwbf(struct pci_dev *dev)
 {
@@ -5457,7 +5458,6 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev 
*dev)
}
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, 
quirk_calpella_no_shadow_gtt);
-DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, 
quirk_calpella_no_shadow_gtt);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, 
quirk_calpella_no_shadow_gtt);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, 
quirk_calpella_no_shadow_gtt);
 
-- 
EW
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] iommu_intel or i915 regression in 4.18, 4.19.12 and drm-tip

2019-01-09 Thread Eric Wong
I just got a used Thinkpad X201 (Core i5 M 520, Intel QM57
chipset) and hit some kernel panics while trying to view
image/animation-intensive stuff in Firefox (X11) unless I use
"iommu_intel=igfx_off".

With Debian stable backport kernels, "linux-image-4.17.0-0.bpo.3-amd64"
(4.17.17-1~bpo9+1) has no problems.  But "linux-image-4.18.0-0.bpo.3-amd64"
(4.18.20-2~bpo9+1) gives a blank screen before I can login via agetty
and run startx.

Building 4.19.12 myself got me into X11 and able to start
Firefox to panic the kernel.  I also updated to the latest BIOS
(1.40), but it's an EOL laptop (but it's still the most powerful
laptop I use).  I intend to replace the BIOS with Coreboot soon...

Initially, I thought I was hitting another GPU hang from 4.18+:

https://bugs.freedesktop.org/show_bug.cgi?id=107945

But building drm-tip @ commit 28bb1fc015cedadf3b099b8bd0bb27609849f362
("drm-tip: 2018y-12m-25d-08h-12m-37s UTC integration manifest")
I was still able to reproduce the panic unless I use iommu_intel=igfx_off
"i915.reset=1" did not help matters, either.

Below is what I got from netconsole while on drm-tip:

Kernel panic - not syncing: DMAR hardware is malfunctioning
Shutting down cpus with NMI
Kernel Offset: disabled
---[ end Kernel panic - not syncing: DMAR hardware is malfunctioning ]---
[ cut here ]
sched: Unexpected reschedule of offline CPU#3!
WARNING: CPU: 0 PID: 105 at native_smp_send_reschedule+0x34/0x40
Modules linked in: netconsole ccm snd_hda_codec_hdmi snd_hda_codec_conexant 
snd_hda_codec_generic intel_powerclamp coretemp kvm_intel kvm irqbypass 
crc32_pclmul crc32c_intel ghash_clmulni_intel arc4 iwldvm aesni_intel 
aes_x86_64 crypto_simd cryptd mac80211 glue_helper intel_cstate iwlwifi 
intel_uncore i915 intel_gtt i2c_algo_bit iosf_mbi drm_kms_helper cfbfillrect 
syscopyarea intel_ips cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea 
thinkpad_acpi prime_numbers cfg80211 ledtrig_audio i2c_i801 sg snd_hda_intel 
led_class snd_hda_codec drm ac drm_panel_orientation_quirks snd_hwdep battery 
e1000e agpgart snd_hda_core snd_pcm snd_timer ptp snd soundcore pps_core 
ehci_pci ehci_hcd lpc_ich video mfd_core button acpi_cpufreq ecryptfs ip_tables 
x_tables ipv6 evdev thermal [last unloaded: netconsole]
CPU: 0 PID: 105 Comm: kworker/u8:3 Not tainted 4.20.0-rc7b1+ #1
Hardware name: LENOVO 3680FBU/3680FBU, BIOS 6QET70WW (1.40 ) 10/11/2012
Workqueue: i915 __i915_gem_free_work [i915]
RIP: 0010:native_smp_send_reschedule+0x34/0x40
Code: 05 69 c6 c9 00 73 15 48 8b 05 18 2d b3 00 be fd 00 00 00 48 8b 40 30 e9 
9a 58 7d 00 89 fe 48 c7 c7 78 73 af 81 e8 dc c2 01 00 <0f> 0b c3 66 0f 1f 84 00 
00 00 00 00 66 66 66 66 90 8b 05 0d 7d df
RSP: 0018:888075003d98 EFLAGS: 00010092
RAX: 002e RBX: 8880751a0740 RCX: 0006
RDX: 0007 RSI: 0082 RDI: 888075015440
RBP: 88806e823700 R08:  R09: 888072fc07c0
R10: 888075003d60 R11: fff5c002 R12: 8880751a0740
R13: 8880751a0740 R14:  R15: 0003
FS:  () GS:88807500() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fdb1f53f000 CR3: 01c0a004 CR4: 000206f0
Call Trace:
 
 ? check_preempt_curr+0x4e/0x90
 ? ttwu_do_wakeup.isra.19+0x14/0xf0
 ? try_to_wake_up+0x323/0x410
 ? autoremove_wake_function+0xe/0x30
 ? __wake_up_common+0x8d/0x140
 ? __wake_up_common_lock+0x6c/0x90
 ? irq_work_run_list+0x49/0x80
 ? tick_sched_handle.isra.6+0x50/0x50
 ? update_process_times+0x3b/0x50
 ? tick_sched_handle.isra.6+0x30/0x50
 ? tick_sched_timer+0x3b/0x80
 ? __hrtimer_run_queues+0xea/0x270
 ? hrtimer_interrupt+0x101/0x240
 ? smp_apic_timer_interrupt+0x6a/0x150
 ? apic_timer_interrupt+0xf/0x20
 
 ? panic+0x1ca/0x212
 ? panic+0x1c7/0x212
 ? __iommu_flush_iotlb+0x19e/0x1c0
 ? iommu_flush_iotlb_psi+0x96/0xf0
 ? intel_unmap+0xbf/0xf0
 ? i915_gem_object_put_pages_gtt+0x36/0x220 [i915]
 ? drm_ht_remove+0x20/0x20 [drm]
 ? drm_mm_remove_node+0x1ad/0x310 [drm]
 ? __pm_runtime_resume+0x54/0x70
 ? __i915_gem_object_unset_pages+0x129/0x170 [i915]
 ? __i915_gem_object_put_pages+0x70/0xa0 [i915]
 ? __i915_gem_free_objects+0x245/0x4e0 [i915]
 ? __switch_to_asm+0x24/0x60
 ? __i915_gem_free_work+0x65/0xa0 [i915]
 ? process_one_work+0x1fd/0x410
 ? worker_thread+0x49/0x3f0
 ? kthread+0xf8/0x130
 ? process_one_work+0x410/0x410
 ? kthread_park+0x90/0x90
 ? ret_from_fork+0x35/0x40
WARNING: CPU: 0 PID: 105 at native_smp_send_reschedule+0x34/0x40
---[ end trace 7dd2184d8c86cef5 ]---
[ cut here ]
sched: Unexpected reschedule of offline CPU#2!
WARNING: CPU: 0 PID: 105 at native_smp_send_reschedule+0x34/0x40
Modules linked in: netconsole ccm snd_hda_codec_hdmi snd_hda_codec_conexant 
snd_hda_codec_generic intel_powerclamp coretemp kvm_intel kvm irqbypass 
crc32_pclmul crc32c_intel ghash_clmulni_intel arc4 iwldvm aesni_intel 
aes_x86_64 crypto_simd cryptd mac80211 glue

Re: [Intel-gfx] iommu_intel or i915 regression in 4.18, 4.19.12 and drm-tip

2019-01-09 Thread Eric Wong
Joonas Lahtinen  wrote:
> Quoting Eric Wong (2018-12-27 13:49:48)
> > I just got a used Thinkpad X201 (Core i5 M 520, Intel QM57
> > chipset) and hit some kernel panics while trying to view
> > image/animation-intensive stuff in Firefox (X11) unless I use
> > "iommu_intel=igfx_off".
> > 
> > With Debian stable backport kernels, "linux-image-4.17.0-0.bpo.3-amd64"
> > (4.17.17-1~bpo9+1) has no problems.  But "linux-image-4.18.0-0.bpo.3-amd64"
> > (4.18.20-2~bpo9+1) gives a blank screen before I can login via agetty
> > and run startx.

> Most confusing about this is that 4.17 would have worked to begin with,
> without intel_iommu=igfx_off (unless it was the default for older
> kernel?)

Yeah, so the Debian bpo 4.17(.17) kernel did not set
CONFIG_INTEL_IOMMU_DEFAULT_ON, so I didn't encounter problems.
My self-built kernels all set CONFIG_INTEL_IOMMU_DEFAULT_ON.

Booting the Debian 4.17 kernel with "intel_iommu=on" gives the
same hanging problem I hit with self-built 4.19.{12,13} kernels.

I'm not sure how far back the problem goes (maybe forever),
since I only got this hardware.  Not sure what's the problem
with Debian 4.18, either; but (self-built) 4.19.13 is fine w/o
CONFIG_INTEL_IOMMU_DEFAULT_ON.

Debian backports doesn't have kernels for 4.19 or 4.20, yet.

> Did you maybe update other parts of the system while updating the
> kernel?

Definitely not; just the kernel + headers ("make bindeb-pkg)".

> If you could attach full boot dmesg from working and non-working kernel +
> have config file of both kernel's in Bugzilla. That'd be a good start!

Sorry, I get anxiety attacks when it comes to logins and forms.
Anyways, I managed to get the Debian kernel dmesg output uploaded
with and without iommu_intel=on:
https://bugs.freedesktop.org/attachment.cgi?bugid=109219
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx