Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2018-01-10 Thread Sylvain
Hi,

I also get screen artifacts with the new -4 and -5 kernels:
- display going on/off/on regularly, usually on first key press after
switching between different screens in a multi-display setup
- current window briefly flickering, usually on first key press after
switching between different screens in a multi-display setup
- main display's left origin briefly shifted to the right causing a
"wrapping" effect

I didn't experience a system crash though.

Cheers!
Sylvain



Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2018-01-07 Thread agafnd
Further adventures.

From https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=884116#45 :
> I am using Kde with a theme that relies heavily on transparency and I guess
> this makes things worse
> [...] Enabling a theme with less transparency effects makes the system work
> longer before crashing.

It occurred to me after reading this that I, too, have been using transparency
effects. I have some semi-transparent MATE panels, and have the compositing
window manager option turned on. So I tried disabling these, and updating the
kernel -- from
4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64 to
4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 (the current
kernel).
Perhaps this is just apophenia, but I think I experienced fewer screen judders
(for lack of a better term). (Note that, unlike others, I've never experienced
a crash from this bug.)

I decided to try the backports kernel (well, *a* backports kernel -- there seem
to be multiple ones). The specific kernel is
4.14.0-0.bpo.2-amd64 #1 SMP Debian 4.14.7-1~bpo9+1 (2017-12-22) x86_64

So far, I still haven't re-enabled transparency or compositing. Frankly, I'm
too nervous to -- I prefer a stable system to messing around and possibly
getting an unstable one.

Still, it's a shame I can't use the current kernel -- judging from the date of
the backports kernel build, it seems unlikely it's been patched for Meltdown.
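
For anyone who wants to repeat the build-date reasoning above on their own
kernel strings, here is a minimal Python sketch. It assumes the string has the
same shape as the three quoted in this message, and it uses the public
disclosure date of Meltdown (2018-01-03) as a rough cutoff. The function names
and the cutoff constant are illustrative only. A build date before the cutoff
is merely a hint: it does not prove whether a given package carries the fix.

```python
import re
from datetime import date

# Public disclosure date of Meltdown, used only as a rough cutoff (assumption).
MELTDOWN_DISCLOSED = date(2018, 1, 3)

# Matches strings such as
# "4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64".
KERNEL_RE = re.compile(
    r"^(?P<release>\S+) #\d+ SMP Debian (?P<version>\S+) "
    r"\((?P<built>\d{4}-\d{2}-\d{2})\)"
)


def parse_kernel_string(text):
    """Return (release, package_version, build_date) or raise ValueError."""
    match = KERNEL_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognised kernel string: %r" % text)
    year, month, day = (int(part) for part in match.group("built").split("-"))
    return match.group("release"), match.group("version"), date(year, month, day)


def built_before_disclosure(text):
    """True if the build date predates the cutoff (a hint, not proof)."""
    return parse_kernel_string(text)[2] < MELTDOWN_DISCLOSED


# The three kernel strings quoted in this message.
KERNELS = [
    "4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64",
    "4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64",
    "4.14.0-0.bpo.2-amd64 #1 SMP Debian 4.14.7-1~bpo9+1 (2017-12-22) x86_64",
]

if __name__ == "__main__":
    for kernel in KERNELS:
        release, version, built = parse_kernel_string(kernel)
        print(release, version, built, built_before_disclosure(kernel))
```

Only the 4.9.0-5 string has a build date after the cutoff, which matches the
suspicion above about the backports build.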

Regards, Agafnd


Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-23 Thread Jerome

On 16/12/17 19:03, Jerome wrote:
> I have a similar issue, for me it occurred when upgrading from kernel
> package 4.9.0-3 (stable) to 4.9.0-4.
>
> [...]
> I tried a newer kernel, 4.13.0-0.bpo.1 from backports: it helped a bit.
> Where 4.9.0-4 froze very quickly, sometimes right at the SDDM login other
> times after less than a minute, 4.13 could last several hours and some
> suspend/resume cycles. But it also had the same issue eventually.
>
> I enabled the IOMMU (intel_iommu=on), and it caught something. There seem
> to be an access error before the freeze: [...]


Actually kernel 4.13 from backports is fine on my machine. The issues I 
saw were related to enabling the IOMMU. Without enabling the IOMMU, 4.13 
is fine after one week of use, and doesn't show the problems of 4.9.0-4.


I had enabled the IOMMU in the hope of containing the issue with 4.9.0-4 
instead of having my laptop locked solid, and it somehow worked. But 
this had other effects: with the IOMMU enabled, the stable version 
4.9.0-3 started to have GPU hang issues due to IOMMU errors. So I 
disabled it and retried 4.13, and it was stable. That's a fishy 
situation, but at least the system is usable.


So people experiencing this bug may want to try the 4.13 kernel from 
backports.
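
Since the outcome here depended on whether intel_iommu=on was set, it may help
to confirm what the running kernel was actually booted with. The following is
a minimal Python sketch, not an official tool: the helper names are made up
for illustration, it only detects an explicit intel_iommu=on parameter (a
kernel may also enable the IOMMU by default through its build configuration),
and it does not handle quoted parameters that contain spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # Split on the first "=" only, so "root=UUID=..." keeps its value intact.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def intel_iommu_enabled(cmdline):
    """True only if intel_iommu=on was passed explicitly on the command line."""
    return parse_cmdline(cmdline).get("intel_iommu") == "on"


if __name__ == "__main__":
    # On a running Linux system the command line can be read from /proc/cmdline.
    try:
        with open("/proc/cmdline") as handle:
            current = handle.read()
    except OSError:
        current = ""
    print("intel_iommu=on passed:", intel_iommu_enabled(current))
```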




Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-16 Thread Jerome

    Hi,

    I have a similar issue; for me it occurred when upgrading from 
kernel package 4.9.0-3 (stable) to 4.9.0-4. With the latter I sometimes 
got display content corruption, and always very quickly got an X 
freeze/lock-up. Once or twice I could switch to a console and observed 
the "*ERROR* CPU pipe A FIFO underrun" message in the kernel log. Trying 
to stop/start the X session always led to a complete lock-up.


    Some other bugs are similar: #859639, #884001. The kernel version 
in which people experienced the issue varies, though.


    I tried using the Intel driver instead of modesetting, as it helped 
for some people, but it didn't help in my case.


    I tried a newer kernel, 4.13.0-0.bpo.1 from backports: it helped a 
bit. Where 4.9.0-4 froze very quickly, sometimes right at the SDDM login 
and other times after less than a minute, 4.13 could last several hours 
and some suspend/resume cycles. But it eventually had the same issue.


    I enabled the IOMMU (intel_iommu=on), and it caught something. 
There seems to be an access error before the freeze:


[14312.568400] DMAR: DRHD: handling fault status reg 3
[14312.568406] DMAR: [DMA Write] Request device [00:02.0] fault addr 1197000 [fault reason 23] Unknown
[14319.871599] [drm] GPU HANG: ecode 8:0:0x85db, in Xorg [1265], reason: Hang on rcs0, action: reset

[14319.871639] drm/i915: Resetting chip after gpu hang
[14327.894140] drm/i915: Resetting chip after gpu hang
[14337.878141] drm/i915: Resetting chip after gpu hang
[14349.878136] drm/i915: Resetting chip after gpu hang
[14448.886146] drm/i915: Resetting chip after gpu hang

    So where I previously observed the "CPU pipe A FIFO underrun" 
message, I no longer see it; it is replaced by this IOMMU error (DMAR), 
followed by the GPU HANG. When this happens, the X display freezes but I 
can get back to a console reliably, even if it typically takes several 
seconds. If I try to restart the X session, however, I quickly run into 
the same problems. When I can get to a console I see many IOMMU exceptions:


[14457.416489] DMAR: DRHD: handling fault status reg 3
[14457.416496] DMAR: [DMA Write] Request device [00:02.0] fault addr 1e2 [fault reason 23] Unknown

[14457.416584] DMAR: DRHD: handling fault status reg 2
[14457.416591] DMAR: [DMA Write] Request device [00:02.0] fault addr 1e2 [fault reason 23] Unknown

[14461.966545] dmar_fault: 544 callbacks suppressed
[14461.966549] DMAR: DRHD: handling fault status reg 3
[14461.966562] DMAR: [DMA Write] Request device [00:02.0] fault addr 1e25000 [fault reason 23] Unknown

[14461.966751] DMAR: DRHD: handling fault status reg 2

    Even with the IOMMU enabled the system ends up freezing solid if I 
persist, requiring a power cycle. The only reliable way to recover from 
such an error is a power cycle anyway.
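
    To compare how often the faults occur across kernels, the DMAR and 
GPU HANG lines quoted above can be tallied with a small script. This is 
only a sketch in Python: the regular expressions are modelled on the 
exact wording of the lines in this message (other kernel versions may 
phrase them differently), the function and variable names are made up 
for illustration, and it expects unwrapped lines as printed by dmesg 
rather than lines re-wrapped by a mail client.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this message (assumption:
# other kernel versions may word these messages differently).
DMAR_FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<kind>Read|Write)\] Request device \[(?P<device>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\]"
)
GPU_HANG_RE = re.compile(r"\[drm\] GPU HANG: ecode (?P<ecode>\S+),")


def summarise(log_text):
    """Count DMAR faults per (device, kind) and collect GPU hang error codes."""
    faults = Counter()
    hangs = []
    for line in log_text.splitlines():
        fault = DMAR_FAULT_RE.search(line)
        if fault:
            faults[(fault.group("device"), fault.group("kind"))] += 1
        hang = GPU_HANG_RE.search(line)
        if hang:
            hangs.append(hang.group("ecode"))
    return faults, hangs


# A few of the lines from this message, unwrapped.
SAMPLE = """\
[14312.568400] DMAR: DRHD: handling fault status reg 3
[14312.568406] DMAR: [DMA Write] Request device [00:02.0] fault addr 1197000 [fault reason 23] Unknown
[14319.871599] [drm] GPU HANG: ecode 8:0:0x85db, in Xorg [1265], reason: Hang on rcs0, action: reset
[14461.966562] DMAR: [DMA Write] Request device [00:02.0] fault addr 1e25000 [fault reason 23] Unknown
"""

if __name__ == "__main__":
    faults, hangs = summarise(SAMPLE)
    print(dict(faults), hangs)
```

    Feeding it the full output of dmesg from each kernel would give a 
rough fault count per device, keeping in mind that the kernel suppresses 
repeated callbacks, so the count is a lower bound.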


    All this on an up to date Debian Stretch 9.3, on a Thinkpad X1 with 
5th gen CPU (i5-5200U). I use SDDM with KDE5/Plasma.


    For now I'm back on kernel 4.9.0-3, which is the last usable one for 
me. It doesn't mean the underlying issue is not there (bug report #859639 
has the issue starting with 4.9.0-1); maybe small changes make the 
probability of hitting the issue vary widely depending on system and 
configuration?
    I'm not competent to investigate this further on my own, but if 
anyone has suggestions for tests that would help investigate this issue, 
let me know.


Thanks



Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-14 Thread Richard James Salts
Package: src:linux
Version: 4.9.65-3
Followup-For: Bug #884116



-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
sys_vendor: 
product_name: 
product_version: 
chassis_vendor: 
chassis_version: 
bios_vendor: Intel Corporation
bios_version: RYBDWi35.86A.0360.2016.1109.0946
board_vendor: Intel Corporation
board_name: NUC5i7RYB
board_version: H73774-101

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Broadwell-U Host Bridge -OPI [8086:1604] (rev 09)
Subsystem: Intel Corporation Broadwell-U Host Bridge -OPI [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: bdw_uncore

00:02.0 VGA compatible controller [0300]: Intel Corporation Iris Graphics 6100 [8086:162b] (rev 09) (prog-if 00 [VGA controller])
Subsystem: Intel Corporation Iris Graphics 6100 [8086:2057]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: i915
Kernel modules: i915

00:03.0 Audio device [0403]: Intel Corporation Broadwell-U Audio Controller [8086:160c] (rev 09)
Subsystem: Intel Corporation Broadwell-U Audio Controller [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

00:14.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB xHCI Controller [8086:9cb1] (rev 03) (prog-if 30 [XHCI])
Subsystem: Intel Corporation Wildcat Point-LP USB xHCI Controller [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci

00:16.0 Communication controller [0780]: Intel Corporation Wildcat Point-LP MEI Controller #1 [8086:9cba] (rev 03)
Subsystem: Intel Corporation Wildcat Point-LP MEI Controller [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: mei_me
Kernel modules: mei_me

00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection (3) I218-V [8086:15a3] (rev 03)
Subsystem: Intel Corporation Ethernet Connection (3) I218-V [8086:2057]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: e1000e
Kernel modules: e1000e

00:1b.0 Audio device [0403]: Intel Corporation Wildcat Point-LP High Definition Audio Controller [8086:9ca0] (rev 03)
Subsystem: Intel Corporation Wildcat Point-LP High Definition Audio Controller [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel

00:1c.0 PCI bridge [0604]: Intel Corporation Wildcat Point-LP PCI Express Root Port #1 [8086:9c90] (rev e3) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities:
Kernel driver in use: pcieport
Kernel modules: shpchp

00:1c.3 PCI bridge [0604]: Intel Corporation Wildcat Point-LP PCI Express Root Port #4 [8086:9c96] (rev e3) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities:
Kernel driver in use: pcieport
Kernel modules: shpchp

00:1d.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB EHCI Controller [8086:9ca6] (rev 03) (prog-if 20 [EHCI])
Subsystem: Intel Corporation Wildcat Point-LP USB EHCI Controller [8086:2057]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-

Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-12 Thread Agafnd
Package: src:linux
Followup-For: Bug #884116

I also encountered this problem after the latest apt update && apt upgrade. The 
system didn't crash, though -- just the annoying right-to-left flicker 
occasionally.

After asking around in #debian on freenode, I installed (with dpkg -i)
http://snapshot.debian.org/archive/debian/20170929T215212Z/pool/main/l/linux/linux-image-4.9.0-4-amd64_4.9.51-1_amd64.deb
and rebooted. The system works fine now.
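
Note that the package name (linux-image-4.9.0-4-amd64) and the package version
(4.9.51-1) are separate things: the same package name later shipped the newer
4.9.65-3 version. To double-check which version a snapshot URL points at
before installing it, something like the following Python sketch can split the
URL apart. It assumes the URL layout seen above and the usual Debian
name_version_architecture.deb file naming; the function name is made up for
illustration.

```python
from datetime import datetime
from urllib.parse import urlparse
import posixpath

URL = ("http://snapshot.debian.org/archive/debian/20170929T215212Z/"
       "pool/main/l/linux/linux-image-4.9.0-4-amd64_4.9.51-1_amd64.deb")


def describe_snapshot_deb(url):
    """Return (snapshot_time, package, version, architecture) for a .deb URL."""
    path = urlparse(url).path
    parts = path.strip("/").split("/")
    # Layout seen in the URL above: archive/<archive-name>/<timestamp>/pool/...
    if len(parts) < 4 or parts[0] != "archive":
        raise ValueError("unexpected snapshot URL layout: %r" % url)
    snapshot_time = datetime.strptime(parts[2], "%Y%m%dT%H%M%SZ")
    filename = posixpath.basename(path)
    if not filename.endswith(".deb"):
        raise ValueError("not a .deb file: %r" % filename)
    # Debian binary package files are named <package>_<version>_<arch>.deb.
    package, version, arch = filename[:-len(".deb")].split("_")
    return snapshot_time, package, version, arch


if __name__ == "__main__":
    print(describe_snapshot_deb(URL))
```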

Here is some information that I think is relevant (I'd be happy to supply more 
if requested):

00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09) (prog-if 00 [VGA controller])
Subsystem: Hewlett-Packard Company 2nd Generation Core Processor Family Integrated Graphics Controller [103c:161c]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: i915
Kernel modules: i915

-- System Information:
Debian Release: 9.3
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)



Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-11 Thread Hannes Eberhardt
Source: linux
Version: 4.9.0-4
Followup-For: Bug #884116

I've got the same error after upgrading from 4.9.0-3 to the newer kernel.
I can provide more information if required.


-- System Information:
Debian Release: 9.3
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.9.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)



Bug#884116: linux-image-4.9.0-4-amd64: screen atrifacts then crash

2017-12-11 Thread root
Package: src:linux
Version: 4.9.65-3
Severity: critical
Justification: breaks the whole system

Dear Maintainer,

* What led up to the situation?
I did an upgrade today and linux image was upgraded from 4.9.51-1 to 4.9.65-3.

* What was the outcome of this action?
Various screen artifacts appear under Xorg; fortunately the console still 
works, so I can write this report.
The most noticeable artifact is the whole screen image shifting to the right 
and then back very quickly.
Artifacts are more violent under compiz, but they appear with xfwm4 too.
After a while the system crashes: no response to the keyboard or mouse.
Reverting to linux-image-4.9.0-3-amd64 makes the problem go away.


-- Package-specific info:
** Version:
Linux version 4.9.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.65-3 (2017-12-03)

** Command line:
BOOT_IMAGE=/vmlinuz-4.9.0-4-amd64 root=UUID=e9287df1-d10d-4d6e-8910-7887cf5e5900 ro quiet

** Tainted: O (4096)
 * Out-of-tree module has been loaded.

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: To be filled by O.E.M.
product_name: To be filled by O.E.M.
product_version: To be filled by O.E.M.
chassis_vendor: To Be Filled By O.E.M.
chassis_version: To Be Filled By O.E.M.
bios_vendor: American Megatrends Inc.
bios_version: 5.6.5
board_vendor: INTEL Corporation
board_name: CRESCENTBAY
board_version: To be filled by O.E.M.

** Loaded modules:
dm_mod
binfmt_misc
rfcomm
joydev
hid_generic
usbhid
pci_stub
vboxpci(O)
vboxnetadp(O)
vboxnetflt(O)
vboxdrv(O)
bnep
uas
btusb
btrtl
btbcm
btintel
usb_storage
bluetooth
crc16
intel_rapl
x86_pkg_temp_thermal
intel_powerclamp
coretemp
kvm_intel
kvm
irqbypass
crct10dif_pclmul
crc32_pclmul
snd_hda_codec_hdmi
ghash_clmulni_intel
intel_cstate
arc4
rt2800pci
rt2800mmio
rt2800lib
rt2x00pci
rt2x00mmio
rt2x00lib
iTCO_wdt
eeprom_93cx6
iTCO_vendor_support
evdev
mac80211
intel_uncore
cfg80211
crc_ccitt
intel_rapl_perf
pcspkr
rfkill
sg
snd_hda_codec_realtek
snd_hda_codec_generic
snd_hda_intel
battery
i915
snd_hda_codec
dw_dmac
snd_soc_ssm4567
snd_soc_rt5640
snd_soc_sst_acpi
dw_dmac_core
video
snd_soc_sst_match
elan_i2c
acpi_als
snd_soc_rl6231
snd_hda_core
snd_soc_core
snd_hwdep
snd_compress
snd_pcm
drm_kms_helper
snd_timer
snd
drm
kfifo_buf
soundcore
acpi_pad
button
industrialio
i2c_algo_bit
shpchp
lpc_ich
mfd_core
parport_pc
ppdev
lp
sunrpc
parport
ip_tables
x_tables
autofs4
xfs
raid10
raid456
async_raid6_recov
async_memcpy
async_pq
async_xor
async_tx
xor
raid6_pq
libcrc32c
crc32c_generic
raid1
multipath
linear
raid0
md_mod
sd_mod
crc32c_intel
aesni_intel
aes_x86_64
glue_helper
lrw
gf128mul
ablk_helper
cryptd
i2c_i801
i2c_smbus
ahci
libahci
libata
ehci_pci
ehci_hcd
xhci_pci
scsi_mod
xhci_hcd
r8169
mii
usbcore
usb_common
fan
thermal
sdhci_acpi
sdhci
mmc_core
i2c_hid
hid
i2c_designware_platform
i2c_designware_core

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Broadwell-U Host Bridge -OPI [8086:1604] (rev 09)
Subsystem: Intel Corporation Broadwell-U Host Bridge -OPI [8086:1604]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: bdw_uncore

00:02.0 VGA compatible controller [0300]: Intel Corporation Iris Graphics 6100 [8086:162b] (rev 09) (prog-if 00 [VGA controller])
Subsystem: Intel Corporation Iris Graphics 6100 [8086:162b]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- 
SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID , PMEStatus- PMEPending-
DevCap2: