> On Nov 8, 2019, at 9:06 AM, Thomas Zimmermann <tzimmerm...@suse.de> wrote:
> 
> Hi
> 
> Am 08.11.19 um 13:55 schrieb John Donnelly:
>> 
>> 
>>> On Nov 8, 2019, at 1:46 AM, Thomas Zimmermann <tzimmerm...@suse.de> wrote:
>>> 
>>> Hi John
>>> 
>>> Am 07.11.19 um 23:14 schrieb John Donnelly:
>>>> 
>>>> 
>>>>> On Nov 7, 2019, at 10:13 AM, John Donnelly <john.p.donne...@oracle.com> 
>>>>> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Nov 7, 2019, at 7:42 AM, Thomas Zimmermann <tzimmerm...@suse.de> 
>>>>>> wrote:
>>>>>> 
>>>>>> Hi John
>>>>>> 
>>>>>> Am 07.11.19 um 14:12 schrieb John Donnelly:
>>>>>>> Hi Thomas, thank you for reaching out. 
>>>>>>> 
>>>>>>> See inline: 
>>>>>>> 
>>>>>>>> On Nov 7, 2019, at 1:54 AM, Thomas Zimmermann <tzimmerm...@suse.de> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Hi John,
>>>>>>>> 
>>>>>>>> apparently the vgaarb was not the problem.
>>>>>>>> 
>>>>>>>> Am 07.11.19 um 03:29 schrieb John Donnelly:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I am investigating an issue where we lose video activity when the 
>>>>>>>>> display is switched from “text mode” to “graphic mode” 
>>>>>>>>> on a number of servers using this driver, specifically when starting 
>>>>>>>>> the GNOME desktop. 
>>>>>>>> 
>>>>>>>> When you say "text mode", do you mean VGA text mode or the graphical
>>>>>>>> console that emulates text mode?
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> By “text mode” I mean the 24x80 ASCII mode, NOT GRAPHICS, i.e. 
>>>>>>> run-level 3. So I guess your term for it is VGA. 
>>>>>> 
>>>>>> Yes.
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> When you enable graphics mode, does it set the correct resolution? A 
>>>>>>>> lot
>>>>>>>> of work went into memory management recently. I could imagine that the
>>>>>>>> driver sets the correct resolution, but then fails to display the
>>>>>>>> correct framebuffer.
>>>>>>> 
>>>>>>> There is no display at all, so there is no resolution to mention. 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> If possible, could you try to update to the latest drm-tip and attach
>>>>>>>> the output of
>>>>>>>> 
>>>>>>>> /sys/kernel/debug/dri/0/vram-mm
>>>>>>> 
>>>>>>> I don’t see that file. Is there something else I need to do? 
>>>>>> 
>>>>>> That file is fairly new and maybe it's not in the mainline kernel yet.
>>>>>> See below for how to get it.
>>>>> 
>>>>> I built your “tip”; still no graphics displayed. 
>>>>> 
>>>>> 
>>>>> mount -t debugfs none /sys/kernel
>>>>> 
>>>>> cat /proc/cmdline 
>>>>> BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.4.0-rc6.drm.+ 
>>>>> root=/dev/mapper/ol_ca--dev55-root ro crashkernel=auto 
>>>>> resume=/dev/mapper/ol_ca--dev55-swap rd.lvm.lv=ol_ca-dev55/root 
>>>>> rd.lvm.lv=ol_ca-dev55/swap console=ttyS0,9600,8,n,1 drm.debug=0xff
>>>>> 
>>>>> 
>>>>> cat  /sys/kernel/dri/0/vram-mm 
>>>>> 
>>>>> In VGA mode :
>>>>> 
>>>>> 
>>>>> cat  /sys/kernel/dri/0/vram-mm 
>>>>> 0x0000000000000000-0x0000000000000300: 768: used
>>>>> 0x0000000000000300-0x0000000000000600: 768: used
>>>>> 0x0000000000000600-0x00000000000007ee: 494: free
>>>>> 0x00000000000007ee-0x00000000000007ef: 1: used
>>>>> 0x00000000000007ef-0x00000000000007f0: 1: used
>>>>> 
>>>>> 
>>>>> In GRAPHICS mode (if it matters):
>>>>> 
>>>>> 
>>>>> cat  /sys/kernel/dri/0/vram-mm 
>>>>> 0x0000000000000000-0x0000000000000300: 768: used
>>>>> 0x0000000000000300-0x0000000000000600: 768: used
>>>>> 0x0000000000000600-0x00000000000007ee: 494: free
>>>>> 0x00000000000007ee-0x00000000000007ef: 1: used
>>>>> 0x00000000000007ef-0x00000000000007f0: 1: used
>>>>> total: 2032, used 1538 free 494
>>>>> 
>>> 
>>> This is interesting. In the graphics mode, you see two buffers of 768
>>> pages each. That's the main framebuffers as used by X (it's double
>>> buffered). Then there's a free area and finally two pages for cursor
>>> images (also double buffered). That looks as expected.
>>> 
>>> The thing is that in text mode, the areas are allocated. But the driver
>>> shouldn't be active, so the file shouldn't exist or only show a single
>>> free area.
>>> 
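The listing can be sanity-checked with a few lines of shell. This is only a sketch: the function name summarize_vram_mm is made up for illustration, and it assumes every region line has the form "0x<start>-0x<end>: <pages>: used|free" as in the output quoted above. The framebuffer arithmetic at the end additionally assumes 4 KiB pages and a 1024x768 mode at 32 bits per pixel, neither of which is stated in this thread.

```shell
#!/bin/sh
# Summarise a vram-mm style listing read from stdin.
# Assumed line format: 0x<start>-0x<end>: <pages>: used|free
# Any other line (such as an existing "total:" line) is ignored.
summarize_vram_mm() {
    awk -F': ' '
        $3 == "used" { used += $2 }
        $3 == "free" { free += $2 }
        END { printf "total: %d, used %d free %d\n", used + free, used, free }
    '
}

# Assumption: 1024x768 at 4 bytes per pixel, 4096-byte pages.
# Prints how many pages one such framebuffer would occupy.
fb_pages() {
    echo $((1024 * 768 * 4 / 4096))
}
```

On the device this would be run as "summarize_vram_mm < /sys/kernel/debug/dri/0/vram-mm" (or wherever debugfs is mounted). Under the stated assumptions, fb_pages gives 768, which would match each of the two large "used" regions.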
>> 
>>      If you want me to double-check this, I will. I have GNOME installed, 
>> but the machine boots to runlevel 3, and then I start the desktop using init 
>> 5. I am pretty sure I took that output when the machine was in graphics 
>> mode at runlevel 5. 
>> 
>> 
>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> I’ve attached /var/lib/gdm/.local/share/xorg/Xorg.0.log instead. 
>>>>>> 
>>>>>> Good! Looking through that log file, the card is found at line 79 and
>>>>>> the generic X modesetting driver initializes below. That works as 
>>>>>> expected.
>>>>>> 
>>>>>> I noticed that several operations are not permitted (lines 78 and 87). I
>>>>>> guess you're starting X from a regular user account? IIRC special
>>>>>> permission is required to acquire control of the display. What happens
>>>>>> if you start X as root user?
>>>>> 
>>>>> 
>>>>> I am starting GNOME as root by doing “init 5” from either the console 
>>>>> session or from ssh.
>>>>> 
>>>>> The default runlevel is 3 on boot.
>>>>> 
>>>>> On the failing session running your 5.4.0-rc6:
>>>>> 
>>>>> 78 [   237.712] xf86EnableIOPorts: failed to set IOPL for I/O (Operation 
>>>>> not permitted)
>>>>> 
>>>>> 87 [   237.712] (EE) open /dev/fb0: Permission denied
>>>>> 
>>>>> Booting a 4.18 kernel yields the same errors in 
>>>>> /var/lib/gdm/.local/share/xorg/Xorg.0.log:
>>>>> 
>>>>> 78 [   101.334] xf86EnableIOPorts: failed to set IOPL for I/O (Operation 
>>>>> not permitted)
>>>>> 
>>>>> 87 [   101.334] (EE) open /dev/fb0: Permission denied
>>>>> 
>>>>> 
>>>>> What is strange is that the X log files (bad and Ok) essentially appear 
>>>>> as if GNOME started!
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> <Xorg.0.log.bad><Xorg.0.log.Ok>
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Here is my cmdline. I just tested 5.3.0 and it fails too (my last 
>>>>>>> test was 5.3.8 and it failed also). 
>>>>>>> 
>>>>>>> # cat /proc/cmdline 
>>>>>>> BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.3.0+ 
>>>>>>> root=/dev/mapper/ol_ca--dev55-root ro crashkernel=auto 
>>>>>>> resume=/dev/mapper/ol_ca--dev55-swap rd.lvm.lv=ol_ca-dev55/root 
>>>>>>> rd.lvm.lv=ol_ca-dev55/swap console=ttyS0,9600,8,n,1 drm.debug=0xff
>>>>>>> 
>>>>>>> When you say “tip”, are you referring to a specific kernel? I can 
>>>>>>> build a 5.4.0-rc6. The problem appears to have been introduced 
>>>>>>> around the 5.3 time frame. 
>>>>>> 
>>>>>> The latest and greatest DRM code is in the drm-tip branch at
>>>>>> 
>>>>>> git://anongit.freedesktop.org/drm/drm-tip
>>>>>> 
>>>>>> If you build this version you should find
>>>>>> 
>>>>>> /sys/kernel/debug/dri/0/vram-mm
>>>>>> 
>>>>>> on the device. You have to build with debugfs enabled and
>>>>>> maybe have to mount debugfs at /sys/kernel/debug.
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> before and after switching to graphics mode. The file lists the
>>>>>>>> allocated regions of the VRAM.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This adapter is the Server Engines Integrated Remote Video Acceleration 
>>>>>>>>> Subsystem (RVAS) and is used as the remote console in iLO/DRAC 
>>>>>>>>> environments. 
>>>>>>>>> 
>>>>>>>>> I don’t see any specific errors in the gdm logs or message file other 
>>>>>>>>> than this:
>>>>>>>> 
>>>>>>>> You can boot with drm.debug=0xff on the kernel command line to enable
>>>>>>>> more warnings.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Could you please attach the output of lspci -v for the VGA adapter?
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Here is the output from the current machine; The previous addresses 
>>>>>>> were from another model using the same SE device:
>>>>>>> 
>>>>>>> 
>>>>>>> Nov  7 04:42:50 ca-dev55 kernel: mgag200 0000:3d:00.0: 
>>>>>>> remove_conflicting_pci_framebuffers: bar 0: 0xc5000000 -> 0xc5ffffff
>>>>>>> Nov  7 04:42:50 ca-dev55 kernel: mgag200 0000:3d:00.0: 
>>>>>>> remove_conflicting_pci_framebuffers: bar 1: 0xc6810000 -> 0xc6813fff
>>>>>>> Nov  7 04:42:50 ca-dev55 kernel: mgag200 0000:3d:00.0: 
>>>>>>> remove_conflicting_pci_framebuffers: bar 2: 0xc6000000 -> 0xc67fffff
>>>>>>> Nov  7 04:42:50 ca-dev55 kernel: mgag200 0000:3d:00.0: vgaarb: 
>>>>>>> deactivate vga console
>>>>>>> 
>>>>>>> 
>>>>>>> lspci -s 3d:00.0 -vvv -k 
>>>>>>> 3d:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA 
>>>>>>> G200e [Pilot] ServerEngines (SEP1) (rev 05) (prog-if 00 [VGA 
>>>>>>> controller])
>>>>>>>         Subsystem: Oracle/SUN Device 4852
>>>>>>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
>>>>>>> ParErr+ Stepping- SERR+ FastB2B- DisINTx-
>>>>>>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
>>>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>>>>         Latency: 0, Cache Line Size: 64 bytes
>>>>>>>         Interrupt: pin A routed to IRQ 16
>>>>>>>         NUMA node: 0
>>>>>>>         Region 0: Memory at c5000000 (32-bit, non-prefetchable) 
>>>>>>> [size=16M]
>>>>>>>         Region 1: Memory at c6810000 (32-bit, non-prefetchable) 
>>>>>>> [size=16K]
>>>>>>>         Region 2: Memory at c6000000 (32-bit, non-prefetchable) 
>>>>>>> [size=8M]
>>>>>>>         Expansion ROM at 000c0000 [disabled] [size=128K]
>>>>>>>         Capabilities: [dc] Power Management version 2
>>>>>>>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
>>>>>>> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>>>>>>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>>>>>>>         Capabilities: [e4] Express (v1) Legacy Endpoint, MSI 00
>>>>>>>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 
>>>>>>> <64ns, L1 <1us
>>>>>>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>>>>>>>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ 
>>>>>>> Unsupported-
>>>>>>>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>>>>                         MaxPayload 128 bytes, MaxReadReq 128 bytes
>>>>>>>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ 
>>>>>>> AuxPwr- TransPend-
>>>>>>>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, 
>>>>>>> Exit Latency L0s <64ns
>>>>>>>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>>>>>>>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>>>>>>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>>>>>>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ 
>>>>>>> DLActive- BWMgmt- ABWMgmt-
>>>>>>>         Capabilities: [54] MSI: Enable- Count=1/1 Maskable- 64bit-
>>>>>>>                 Address: 00000000  Data: 0000
>>>>>>>         Kernel driver in use: mgag200
>>>>>>>         Kernel modules: mgag200
>>>>>> 
>>>>>> Looks all normal.
>>>>>> 
>>>>>> Best regards
>>>>>> Thomas
>>>>>> 
>>>> 
>>>> ==============  Snip  ===========
>>>> 
>>>> 
>>>> Hi Thomas,
>>>> 
>>>> I have hopefully narrowed down the breakage to somewhere between these 
>>>> upstream commits, which are v5.2 and 5.3.0-rc1: 
>>>> 
>>>> 
>>>> between: 0ecfebd2b524 2019-07-07 | Linux 5.2
>>>> to:      5f9e832c1370 2019-07-21 | Linus 5.3-rc1
>>>> 
>>>> 
>>>> I started to bisect this range by date, day by day, based on the changes 
>>>> made in:
>>>> 
>>>> drivers/gpu/drm/
>>>> 
>>>> fec88ab0af97 2019-07-14 | Merge tag 'for-linus-hmm' of 
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma  ;  works 
>>>> 
>>>> Hopefully something in drivers/gpu/drm/ between the date range of 
>>>> 2019-07-14 to 2019-07-21 will surface tomorrow.
>>> 
>>> Great, thanks for bisecting.
>>> 
>>> Could you attach your kernel config file? I'd like to compare with my
>>> config and try to reproduce the issue.
>>> 
>>> Best regards
>>> Thomas
>> 
>>  Hi.
>> 
>>  Here are config files generated by a “make oldconfig” that started 
>> from an original .config file, taken from a master file we use for 5.4.0-rc4:
>> 
>>    config.5.2.21 - works with that flavor
>>    config.5.3 - fails with 5.3 and later. 
>> 
>>  Do you have access to an mgag200-style adapter? 
> 
> I do.
> 
> I think I've been able to reproduce the issue. Buffers seem to remain in
> video ram after they have been pinned there. I'll investigate next week.
> I hope your bisecting session can point to the cause.
> 
> Best regards
> Thomas

Hi Thomas,


 Wonderful!  

 I think I have narrowed it down to this merge, which I built as 
vmlinuz-5.2.0-rc5+:


be8454afc50f 2019-07-15 | Merge tag 'drm-next-2019-07-16' of 
git://anongit.freedesktop.org/drm/drm

  Specifically, this merge included these two changes:

  94dc57b10399 2019-06-13 | drm/mgag200: Rewrite cursor handling
  f4ce5af71bc2 2019-06-13 | drm/mgag200: Pin framebuffer BO during dirty update


I tried reverting them, but the resulting driver doesn’t build afterwards due 
to drm calls. 

If I build a kernel from:

fec88ab0af97 2019-07-14 | Merge tag 'for-linus-hmm' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

which is dated the day before be8454afc50f, the GNOME desktop works. 
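Since fec88ab0af97 works and be8454afc50f does not, the remaining range could be bisected commit by commit rather than by date. A rough sketch, to be run inside a kernel git tree: the function name bisect_drm_range and the GIT override (handy for a dry run with GIT=echo) are made up here for illustration, while "git bisect start <bad> <good> -- <path>" is the standard form.

```shell
#!/bin/sh
# Sketch: bisect between the last good and first bad merges named in this
# thread, considering only commits that touch drivers/gpu/drm.
# GIT is a hypothetical override so the command can be previewed (GIT=echo).
bisect_drm_range() {
    good=fec88ab0af97   # 2019-07-14 merge: GNOME desktop works
    bad=be8454afc50f    # 2019-07-15 drm-next merge: no display
    ${GIT:-git} bisect start "$bad" "$good" -- drivers/gpu/drm || return 1
    echo "Build and boot the checked-out commit, then mark it with:"
    echo "  git bisect good   # desktop appears"
    echo "  git bisect bad    # no display"
}
```

Calling bisect_drm_range starts the session. After each build and boot test, "git bisect good" or "git bisect bad" moves to the next candidate, and "git bisect reset" returns to the original checkout when done.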






