Re: [gentoo-user] SOLVED Can't get the GUI to stay up for more than a minute or so before crashing

2024-06-26 Thread Michael
On Wednesday, 26 June 2024 01:28:47 BST Dale wrote:
> Michael wrote:

> > The above message indicates the same problem you had experienced before
> > you reinstalled.  The monitor is not sending its EDID table, or the card
> > can't read it.
> > 
> > Your Xorg sets a default dummy resolution of 640 x 480, because it can't
> > find anything connected to the card.
> > 
> > Things I would try, until someone who can grok nvidia contributes better
> > ideas:
> > 
> > Eliminate the hardware being the cause of the problem, e.g.: try a
> > different cable, different monitor, then try the same card (with same
> > drivers and same kernel settings) on your other PC.  If this proves
> > there's nothing wrong with the cable, card, or kernel settings:
> > 
> > 1. Try different ports and restart display-manager each time.
> > 
> > 2. Add these two lines at the bottom of /usr/share/sddm/scripts/Xsetup:
> > 
> > xrandr --setprovideroutputsource modesetting NVIDIA-0
> > xrandr --auto
> > 
> > Again restart display-manager.
> > 
> > 3. Add a file /etc/X11/xorg.conf.d/20nvidia.conf
> > 
> > Section "Device"
> >     Identifier  "nvidia"
> >     Driver      "nvidia"
> >     BusID       "PCI:9:0:0"
> >     Option      "UseEDID" "false" ## Try this too ##
> > EndSection
> > 
> > Again restart display-manager.
> > 
> > Every time you try a setting and it doesn't produce the goods, revert it
> > before you try the next thing.  Make notes and keep an eye on your logs in
> > case you spot a difference.
> > 
> > If none of these tweaks work, then you can try capturing the EDID table
> > and creating a file for the card to load.
[snip ...]
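
For the "revert it before you try the next thing" part, here is a rough sketch of how I would keep a pristine copy of each file around.  The function names are made up for illustration, they are not part of any package, and a file you created from scratch (like the xorg.conf.d snippet) you would simply delete instead:

```shell
#!/bin/sh
# Hypothetical helpers: keep an untouched copy of a config file before an
# experiment and put it back afterwards.

# save_original FILE: copy FILE to FILE.orig unless a backup already exists,
# so repeated experiments never overwrite the pristine copy.
save_original() {
    [ -e "$1.orig" ] || cp -p -- "$1" "$1.orig"
}

# revert_original FILE: restore FILE from FILE.orig.
# Fails (non-zero status) if no backup was ever saved.
revert_original() {
    [ -e "$1.orig" ] && cp -p -- "$1.orig" "$1"
}
```

You would run e.g. save_original /usr/share/sddm/scripts/Xsetup before adding the xrandr lines, then revert_original on the same path if the test makes no difference.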

> I was
> even thinking of moving my main rig monitor to the new rig and see what
> it did.  I'd already tried a different card so didn't see any need in
> repeating that.  Then I had a thought.  Why is it saying port DP-3?  Why
> is it not port DP-0?

Your PC indicated DFP-3 was what it had booted at - from your Xorg.0.log:

[44.311] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:9:0:0
[44.311] (--) NVIDIA(0): DFP-0
[44.311] (--) NVIDIA(0): DFP-1
[44.311] (--) NVIDIA(0): DFP-2
[44.311] (--) NVIDIA(0): DFP-3 (boot)
[44.311] (--) NVIDIA(0): DFP-4
[44.311] (--) NVIDIA(0): DFP-5
[44.311] (--) NVIDIA(0): DFP-6
[44.311] (--) NVIDIA(0): DFP-7

which is the display device connector nvidia identifies the monitor as being 
connected to.  However, it then prints this discouraging message:

[44.312] (--) NVIDIA(GPU-0): 
[44.332] (--) NVIDIA(GPU-0): DFP-3: disconnected  <== This ===
[44.332] (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
[44.332] (--) NVIDIA(GPU-0): DFP-3: 165.0 MHz maximum pixel clock
[44.332] (--) NVIDIA(GPU-0): 
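
If you want to pull just these connection states out of the log after each attempt, something along these lines should do.  It is only a sketch: the function name is made up, and it assumes the driver always writes the state in the "<port>: connected" / "<port>: disconnected" form seen above:

```shell
#!/bin/sh
# port_states LOGFILE: print "<port> <state>" for every line where the NVIDIA
# driver reports a port as connected or disconnected.
# Assumption: the lines look like the Xorg.0.log excerpt above.
port_states() {
    awk '/NVIDIA\(GPU-[0-9]+\): [A-Z]+-[0-9]+: (connected|disconnected)/ {
        for (i = 2; i <= NF; i++)
            if ($i == "connected" || $i == "disconnected") {
                port = $(i - 1)
                sub(/:$/, "", port)   # drop the trailing colon from "DFP-3:"
                print port, $i
            }
    }' "$1"
}
```

Called as port_states /var/log/Xorg.0.log, it prints one line per reported port, which makes it easier to compare runs when you move the cable.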


> I thought the first port was the one on the
> bottom.  Turns out, the top port is the first one.  So, I moved the
> cable to the first port, DP-0.

I thought you had already tried this prior to reinstalling, when I had 
suggested to try different ports.


> I booted the rig up, started DM, got the
> login screen as usual and guess what was next, a complete desktop.  I
> changed it to not power off or switch to a screensaver so that it would
> stay on and I could keep an eye on it. I heated up supper, ate, typing
> this reply and it is still running, in 1080P no less. 

YES!  :-D


> Now tell me this, why would it not work on DP-3 or DP-2 when I tried
> those earlier on?  Does one always have to have a monitor connected to
> DP-0 first then others as monitors are added? 

It may have something to do with auto-detecting PNP display devices, like 
DisplayPort monitors.  There is an HPD (Hot Plug Detect) pin on the DP 
connector, which lets the card know if a monitor turns off.  This seems to 
cause the card's driver to detect the display as "disconnected", which then 
disables the port.
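
You can also ask the kernel what it thinks is plugged in, independently of Xorg.  A sketch, assuming the connectors show up under /sys/class/drm (with the proprietary driver I believe this depends on nvidia-drm being loaded, but I can't verify that here) and keeping in mind the kernel's connector names need not match the DFP-n/DP-n names in the Xorg log:

```shell
#!/bin/sh
# connector_status [DRM_DIR]: print "<connector> <status>" for every DRM
# connector that exposes a readable status file.
# DRM_DIR defaults to /sys/class/drm; the parameter exists only so the
# function can be pointed at another directory.
connector_status() {
    drm_dir=${1:-/sys/class/drm}
    for f in "$drm_dir"/card*-*/status; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        name=${f%/status}                # strip the trailing "/status"
        printf '%s %s\n' "${name##*/}" "$(cat "$f")"
    done
}
```

Running connector_status once with the monitor awake and once after it has gone to sleep would show whether the state really flips to "disconnected".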

The question is why would the monitor turn off.  Well, it might be taking too 
long for the card to walk from DP-1 to DP-3, by which time the monitor has 
gone to sleep to save energy.  If the monitor is on DP-1, then it doesn't get 
a chance to do this.

Alternatively, the Quadro P1000 video card being a 'pro' graphics card may 
have been designed with the assumption a monitor (the primary monitor) is 
*always* connected on the first port, or else the PC is configured as a 
headless server - I don't know really.

I think if you capture the EDID table and feed it manually to the card's 
driver, it may work differently - but again, I have no experience with Nvidia.  
By accident or good fortune I've always had 'linux-friendly' AMD-Radeon cards 
on my PCs.
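
As a starting point for the capture, a minimal sketch which copies the EDID blob the kernel exposes for a connector into a file.  The function name and paths are my own, and it assumes you run it while the monitor sits on a port where the EDID can actually be read:

```shell
#!/bin/sh
# capture_edid CONNECTOR_DIR OUTFILE: copy CONNECTOR_DIR/edid to OUTFILE.
# Returns non-zero, and leaves no OUTFILE behind, if nothing could be read.
# The size is checked on the copy, because sysfs may report a size of 0 for
# the source file even when it has content.
capture_edid() {
    cat "$1/edid" > "$2" || return 1
    if [ ! -s "$2" ]; then
        rm -f -- "$2"
        return 1
    fi
}
```

For example capture_edid /sys/class/drm/card0-DP-1 /root/monitor-edid.bin, where both the connector name and the output path are placeholders.  The NVIDIA driver README documents a CustomEDID X option next to UseEDID for loading such a file; I have not tried it, so check the README for the exact value syntax.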

One thing I have noticed with my DisplayPort monitor: it needs to be powered 
on while the PC boots up/shuts down.  If the monitor is switched off, it will 
not get detected after boot, and the shutdown process is also cancelled.  :-/


> Now comes the next question.  To move just KDE stuff over, desktop
> settings and such.  ~/.local and .config.  Are those 

Re: [gentoo-user] SOLVED Can't get the GUI to stay up for more than a minute or so before crashing

2024-06-25 Thread Dale
Michael wrote:
> On Tuesday, 25 June 2024 19:54:33 BST Dale wrote:
>> Michael wrote:
>>> You need to have USE="elogind -systemd" in your make.conf, then add the
>>> elogind service to the *boot* runlevel as shown here:
>>>
>>> https://wiki.gentoo.org/wiki/Elogind
>> I read down through that.  I did find that acl had made it into the USE
>> flag line.  I removed it.
> You shouldn't have.
>
>
>> It's not on my main rig so no idea where that
>> came from.
> It is enabled by the profile defaults:
>
> ~ $ euse -I acl
> global use flags (searching: acl)
> 
> [+ CD   ] /var/db/repos/gentoo/profiles/use.desc:acl - Add support for Access 
> Control Lists
> [snip ...]
>
>
>>> Can you please save and attach as plain text files your:
>>>
>>> 1. dmesg
>>> 2. Xorg.0.log
>>> 3. ~/.local/share/sddm/xorg-session.log
>>> 4. /var/log/sddm.log
>>>
>>> after you end up in a black screen, in case they reveal something.
>> Should be attached.  I blanked the files and then rebooted and started
>> display-manager, (DM).  You should have only the most recent info.  I'm
>> also putting a chunk of messages below.  It might help.  It isn't much. 
>> Same as before it seems.  I still say this is something simple but hard
>> to find.  :/ 
>>
>> Dale
>>
>> :-)  :-) 
>>
>> Messages:
> [snip ...] 
>
>> Jun 25 13:31:18 Gentoo-1 kernel: nvidia-modeset: WARNING: GPU:0: Unable
>> to read EDID for display device DP-3
> The above message indicates the same problem you had experienced before you 
> reinstalled.  The monitor is not sending its EDID table, or the card can't 
> read it.
>
> Your Xorg sets a default dummy resolution of 640 x 480, because it can't find 
> anything connected to the card.
>
> Things I would try, until someone who can grok nvidia contributes better 
> ideas:
>
> Eliminate the hardware being the cause of the problem, e.g.: try a different 
> cable, different monitor, then try the same card (with same drivers and same 
> kernel settings) on your other PC.  If this proves there's nothing wrong with 
> the cable, card, or kernel settings:
>
> 1. Try different ports and restart display-manager each time.
>
> 2. Add these two lines at the bottom of /usr/share/sddm/scripts/Xsetup:
>
> xrandr --setprovideroutputsource modesetting NVIDIA-0
> xrandr --auto
>
> Again restart display-manager.
>
> 3. Add a file /etc/X11/xorg.conf.d/20nvidia.conf
>
> Section "Device"
>     Identifier  "nvidia"
>     Driver      "nvidia"
>     BusID       "PCI:9:0:0"
>     Option      "UseEDID" "false" ## Try this too ##
> EndSection
>
> Again restart display-manager.
>
> Every time you try a setting and it doesn't produce the goods, revert it 
> before you try the next thing.  Make notes and keep an eye on your logs in 
> case you spot a difference.
>
> If none of these tweaks work, then you can try capturing the EDID table and 
> creating a file for the card to load.
>
> HTH.


We have some serious patience with this thing.  I think everyone else
evacuated. The reinstall wasn't likely to lead to a resolution but I did
get to fix the partition boo boo.  It's hot here.  I took a nap.  I did
walk to the mailbox and get my mail first.  My little 4 volt battery
came in for my spare electric fence charger, keeps the deer out. 
Anyway, when I woke up, I looked at the rig and was thinking.  I was
even thinking of moving my main rig monitor to the new rig and see what
it did.  I'd already tried a different card so didn't see any need in
repeating that.  Then I had a thought.  Why is it saying port DP-3?  Why
is it not port DP-0?  I thought the first port was the one on the
bottom.  Turns out, the top port is the first one.  So, I moved the
cable to the first port, DP-0.  I booted the rig up, started DM, got the
login screen as usual and guess what was next, a complete desktop.  I
changed it to not power off or switch to a screensaver so that it would
stay on and I could keep an eye on it. I heated up supper, ate, typing
this reply and it is still running, in 1080P no less. 

Now tell me this, why would it not work on DP-3 or DP-2 when I tried
those earlier on?  Does one always have to have a monitor connected to
DP-0 first then others as monitors are added? 

Now comes the next question.  To move just KDE stuff over, desktop
settings and such.  ~/.local and .config.  Are those the big ones? 
Also, I have a .kde4 directory, that's no longer used right?  I think it
died ages ago.  I forgot all about that thing.  I'll copy the other
stuff over at some point but just want to play with the big stuff at the
moment. 

In your list, #1 would have been the fix.  It also turns out, it was
me.  I plugged the cable in the wrong port.  No idea why everything else
worked fine tho.  All the boot media worked just fine.  This is a large
thread over something so simple.  ;-) 

Thanks so much for all the help.  The main rig is still sitting there at
1080P waiting on me.  Finally, after over $1,000 spent, days of
installing,