Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 07.09.23 at 18:33, suijingfeng wrote:
> Hi,
>
> On 2023/9/7 17:08, Christian König wrote:
>> I strongly suggest that you just completely drop this here
>
> Dropping this is OK, no problem. Then I will go and develop something else. This version was not originally intended to be merged, as it is an RFC. Also, the core mechanism is already finished; it is the first patch in this series. What is left is just policy (how to specify a device and parse the kernel command line), and nothing interesting is left. It is actually there to fulfill my promise from V3, which was to give some examples as use cases.
>
>> and go into the AST driver and try to fix it.
>
> Well, someone told me yesterday that this is well-defined behavior, which implies that it is not a bug. I'm not going to fix a non-bug.

Sorry for that, I wasn't realizing what you are actually trying to do.

> But if Thomas asks me to fix it, then I probably have to try. But I suggest that if things are not broken, don't fix them. Otherwise this may cause bigger trouble. For a server's single-display use case, it is good enough.

Yeah, exactly that's the reason why you shouldn't mess with this.

In theory you could try to re-program the necessary north bridge blocks to make integrated graphics work even if you installed a dedicated VGA adapter, but you will most likely be missing something.

The only real fix is to tell the BIOS that you want to use the integrated VGA device even if a dedicated one is detected.

If you want to learn more about the background, AMD has a bunch of documentation around this on their website: https://www.amd.com/en/search/documentation/hub.html

The most interesting document for you is probably the BIOS programming manual, but don't ask me what exactly the title of that one is. @Alex, do you remember what that was called?

IIRC Intel had similar documentation public, but I don't know where to find it offhand.

Regards,
Christian.

> Thanks.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/7 17:08, Christian König wrote:
> I strongly suggest that you just completely drop this here

Dropping this is OK, no problem. Then I will go and develop something else. This version was not originally intended to be merged, as it is an RFC. Also, the core mechanism is already finished; it is the first patch in this series. What is left is just policy (how to specify a device and parse the kernel command line), and nothing interesting is left. It is actually there to fulfill my promise from V3, which was to give some examples as use cases.

> and go into the AST driver and try to fix it.

Well, someone told me yesterday that this is well-defined behavior, which implies that it is not a bug. I'm not going to fix a non-bug.

But if Thomas asks me to fix it, then I probably have to try. But I suggest that if things are not broken, don't fix them. Otherwise this may cause bigger trouble. For a server's single-display use case, it is good enough.

Thanks.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 07.09.23 at 17:26, suijingfeng wrote:
> [SNIP]
> Then I'll give you another example; see below for a detailed description.
>
> I have one AMD BC160 GPU, see [1] to get what it looks like. The GPU has no display connector exposed. It can actually be seen as a render-only GPU or a compute-class GPU for bitcoin. But its firmware still declares this GPU as VGA compatible. When this GPU is mounted on the motherboard, the system always selects it as primary, but this GPU cannot be connected to a monitor.
>
> In such a situation, modprobe.blacklist=amdgpu doesn't work either, because vgaarb always selects this GPU as primary; this is a device-level decision.

It's not VGAARB which makes this selection, it's the BIOS. VGAARB just detects what the BIOS has decided.

> $ dmesg | grep vgaarb:
>
> [ 3.541405] pci :0c:00.0: vgaarb: BAR 0: [mem 0xa000-0xafff 64bit pref] contains firmware FB [0xa000-0xa02f]
> [ 3.901448] pci :05:00.0: vgaarb: setting as boot VGA device
> [ 3.905375] pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
> [ 3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device (overriding previous)
> [ 3.909375] pci :0c:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
> [ 3.913375] pci :0d:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
> [ 3.913377] vgaarb: loaded
> [ 13.513760] amdgpu :0c:00.0: vgaarb: deactivate vga console
> [ 19.020992] amdgpu :0c:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
>
> I'm using an Ubuntu 22.04 system; with ast.modeset=10 passed on the command line, I am still able to enter the graphical system, and this GPU is treated as a render-only GPU. I will probably continue to examine what's wrong. Apart from this, drm/amdgpu reports " *ERROR* IB test failed on sdma0 (-110)" to me. Does this count as a problem?

No, again that is perfectly expected behavior. Some BIOSes (or maybe most by modern standards) allow overriding this, but if you later override this from the OS, you run the hardware outside of what's validated.

When you put a VGA device into a board with an integrated VGA device, the integrated one gets disabled. This is even part of some PCIe specification IIRC.

So the problems you run into here are perfectly expected.

Regards,
Christian.

> Until I find a solution, I have to keep this de facto render-only GPU mounted, because I need to recompile the kernel module, install it, and test. All I need is a 2D video card to display something; ast drm is OK, despite being simple. It suits my daily usage with VIM, and that's enough for me.
>
> Now, the real questions that I want to ask are:
>
> 1) When the kernel driver module is blocked (by modprobe.blacklist=amdgpu) while vgaarb still selects the device as primary, which leaves the X server crashing (because no kernel-space driver is loaded), does that count as a problem?
>
> 2) Does my approach of mounting another GPU as the primary display adapter, while its real purpose is bug fixing and development for another GPU, count as a use case?
>
> $ cat demsg.txt | grep drm
>
> [ 10.099888] ACPI: bus type drm_connector registered
> [ 11.083920] etnaviv :0d:00.0: [drm] bind etnaviv-display, master name: :0d:00.0
> [ 11.084106] [drm] Initialized etnaviv 1.3.0 20151214 for :0d:00.0 on minor 0
> [ 13.301702] [drm] amdgpu kernel modesetting enabled.
> [ 13.359820] [drm] initializing kernel modesetting (NAVI12 0x1002:0x7360 0x1002:0x0A34 0xC7).
> [ 13.368246] [drm] register mmio base: 0xEB10
> [ 13.372861] [drm] register mmio size: 524288
> [ 13.380788] [drm] add ip block number 0
> [ 13.385661] [drm] add ip block number 1
> [ 13.390531] [drm] add ip block number 2
> [ 13.395405] [drm] add ip block number 3
> [ 13.399760] [drm] add ip block number 4
> [ 13.404111] [drm] add ip block number 5
> [ 13.408378] [drm] add ip block number 6
> [ 13.413249] [drm] add ip block number 7
> [ 13.433546] [drm] add ip block number 8
> [ 13.433547] [drm] add ip block number 9
> [ 13.497757] [drm] VCN decode is enabled in VM mode
> [ 13.502540] [drm] VCN encode is enabled in VM mode
> [ 13.508785] [drm] JPEG decode is enabled in VM mode
> [ 13.529596] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
> [ 13.564762] [drm] Detected VRAM RAM=8176M, BAR=256M
> [ 13.569628] [drm] RAM width 2048bits HBM
> [ 13.574167] [drm] amdgpu: 8176M of VRAM memory ready
> [ 13.579125] [drm] amdgpu: 15998M of GTT memory ready.
> [ 13.584184] [drm] GART: num cpu pages 131072, num gpu pages 131072
> [ 13.590505] [drm] PCIE GART of 512M enabled (table at 0x00800030).
> [ 13.598749] [drm] Found VCN firmware Version ENC: 1.16 DEC: 5 VEP: 0 Revision: 4
> [ 13.671786] [drm] reserve 0xe0 from 0x81fd00 for PSP TMR
> [ 13.801235] [drm] Display Core v3.2.247 initialized on DCN 2.0
> [ 13.807061] [drm] DP-HDM
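The vgaarb excerpt in the message above shows the boot VGA device being set twice, with the second line overriding the first. A minimal sketch of how such a log could be read mechanically follows; the function name `boot_vga_from_dmesg` and the regular expression are illustrative assumptions, not part of any kernel tooling, and the sample lines are copied from the thread as archived (including the shortened PCI addresses).

```python
import re

# Matches lines such as:
#   [ 3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device (overriding previous)
# The pattern is an assumption based on the log excerpt quoted in this thread.
BOOT_VGA_RE = re.compile(
    r"\]\s+\S+\s+(?P<dev>\S+):\s+vgaarb:\s+setting as boot VGA device"
)


def boot_vga_from_dmesg(lines):
    """Return the device named by the last 'setting as boot VGA device' line, or None."""
    boot = None
    for line in lines:
        match = BOOT_VGA_RE.search(line)
        if match:
            boot = match.group("dev")  # a later line overrides an earlier one
    return boot


SAMPLE = [
    "[ 3.901448] pci :05:00.0: vgaarb: setting as boot VGA device",
    "[ 3.905375] pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none",
    "[ 3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device (overriding previous)",
    "[ 3.913377] vgaarb: loaded",
]

if __name__ == "__main__":
    print(boot_vga_from_dmesg(SAMPLE))
```

This only reports what vgaarb logged; as Christian notes, the arbiter is recording a decision the BIOS already made, so the result says nothing about which device could be made primary.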
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/7 20:43, Christian König wrote:
> On 07.09.23 at 14:32, suijingfeng wrote:
>> Hi,
>>
>> On 2023/9/7 17:08, Christian König wrote:
>>> Well, I have over 25 years of experience with display hardware and what you describe here was never an issue.
>>
>> I want to give you an example to let you know more.
>>
>> I have an ASRock AD2550B-ITX board [1]. When another discrete video card is mounted into its mini PCIe slot or PCI slot, the IGD cannot be the primary display adapter anymore. The display is totally black. I have tried to draft a few trivial patches to help fix this [2].
>>
>> And I want to use the IGD as primary; does this count as an issue?
>
> No, this is completely expected behavior and a limitation of the hardware design.
>
> As far as I know both AMD and Intel GPUs work the same here.
>
> Regards,
> Christian.
>
>> [1] https://www.asrock.com/mb/Intel/AD2550-ITX/
>> [2] https://patchwork.freedesktop.org/series/123073/

Then I'll give you another example; see below for a detailed description.

I have one AMD BC160 GPU, see [1] to get what it looks like. The GPU has no display connector exposed. It can actually be seen as a render-only GPU or a compute-class GPU for bitcoin. But its firmware still declares this GPU as VGA compatible. When this GPU is mounted on the motherboard, the system always selects it as primary, but this GPU cannot be connected to a monitor.

In such a situation, modprobe.blacklist=amdgpu doesn't work either, because vgaarb always selects this GPU as primary; this is a device-level decision.

$ dmesg | grep vgaarb:

[3.541405] pci :0c:00.0: vgaarb: BAR 0: [mem 0xa000-0xafff 64bit pref] contains firmware FB [0xa000-0xa02f]
[3.901448] pci :05:00.0: vgaarb: setting as boot VGA device
[3.905375] pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[3.905382] pci :0c:00.0: vgaarb: setting as boot VGA device (overriding previous)
[3.909375] pci :0c:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[3.913375] pci :0d:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[3.913377] vgaarb: loaded
[ 13.513760] amdgpu :0c:00.0: vgaarb: deactivate vga console
[ 19.020992] amdgpu :0c:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem

I'm using an Ubuntu 22.04 system; with ast.modeset=10 passed on the command line, I am still able to enter the graphical system, and this GPU is treated as a render-only GPU. I will probably continue to examine what's wrong. Apart from this, drm/amdgpu reports " *ERROR* IB test failed on sdma0 (-110)" to me. Does this count as a problem?

Until I find a solution, I have to keep this de facto render-only GPU mounted, because I need to recompile the kernel module, install it, and test. All I need is a 2D video card to display something; ast drm is OK, despite being simple. It suits my daily usage with VIM, and that's enough for me.

Now, the real questions that I want to ask are:

1) When the kernel driver module is blocked (by modprobe.blacklist=amdgpu) while vgaarb still selects the device as primary, which leaves the X server crashing (because no kernel-space driver is loaded), does that count as a problem?

2) Does my approach of mounting another GPU as the primary display adapter, while its real purpose is bug fixing and development for another GPU, count as a use case?

$ cat demsg.txt | grep drm

[ 10.099888] ACPI: bus type drm_connector registered
[ 11.083920] etnaviv :0d:00.0: [drm] bind etnaviv-display, master name: :0d:00.0
[ 11.084106] [drm] Initialized etnaviv 1.3.0 20151214 for :0d:00.0 on minor 0
[ 13.301702] [drm] amdgpu kernel modesetting enabled.
[ 13.359820] [drm] initializing kernel modesetting (NAVI12 0x1002:0x7360 0x1002:0x0A34 0xC7).
[ 13.368246] [drm] register mmio base: 0xEB10
[ 13.372861] [drm] register mmio size: 524288
[ 13.380788] [drm] add ip block number 0
[ 13.385661] [drm] add ip block number 1
[ 13.390531] [drm] add ip block number 2
[ 13.395405] [drm] add ip block number 3
[ 13.399760] [drm] add ip block number 4
[ 13.404111] [drm] add ip block number 5
[ 13.408378] [drm] add ip block number 6
[ 13.413249] [drm] add ip block number 7
[ 13.433546] [drm] add ip block number 8
[ 13.433547] [drm] add ip block number 9
[ 13.497757] [drm] VCN decode is enabled in VM mode
[ 13.502540] [drm] VCN encode is enabled in VM mode
[ 13.508785] [drm] JPEG decode is enabled in VM mode
[ 13.529596] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[ 13.564762] [drm] Detected VRAM RAM=8176M, BAR=256M
[ 13.569628] [drm] RAM width 2048bits HBM
[ 13.574167] [drm] amdgpu: 8176M of VRAM memory ready
[ 13.579125] [drm] amdgpu: 15998M of GTT memory ready.
[ 13.584184] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 13.590505] [drm] PCIE GART of 512M enab
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 07.09.23 at 14:32, suijingfeng wrote:
> Hi,
>
> On 2023/9/7 17:08, Christian König wrote:
>> Well, I have over 25 years of experience with display hardware and what you describe here was never an issue.
>
> I want to give you an example to let you know more.
>
> I have an ASRock AD2550B-ITX board [1]. When another discrete video card is mounted into its mini PCIe slot or PCI slot, the IGD cannot be the primary display adapter anymore. The display is totally black. I have tried to draft a few trivial patches to help fix this [2].
>
> And I want to use the IGD as primary; does this count as an issue?

No, this is completely expected behavior and a limitation of the hardware design.

As far as I know both AMD and Intel GPUs work the same here.

Regards,
Christian.

> [1] https://www.asrock.com/mb/Intel/AD2550-ITX/
> [2] https://patchwork.freedesktop.org/series/123073/
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/7 17:08, Christian König wrote:
> Well, I have over 25 years of experience with display hardware and what you describe here was never an issue.

I want to give you an example to let you know more.

I have an ASRock AD2550B-ITX board [1]. When another discrete video card is mounted into its mini PCIe slot or PCI slot, the IGD cannot be the primary display adapter anymore. The display is totally black. I have tried to draft a few trivial patches to help fix this [2].

And I want to use the IGD as primary; does this count as an issue?

[1] https://www.asrock.com/mb/Intel/AD2550-ITX/
[2] https://patchwork.freedesktop.org/series/123073/
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On Wed, 06 Sep 2023, suijingfeng wrote:
> Another limitation of the 'nomodeset' parameter is that it is only
> available on recent upstream kernels. Older downstream kernels don't
> support this parameter yet, so this creates an inconsistent
> development experience. I believe that there are always some people
> who need to do back-port and upstream work for various reasons.

While that may be true, it's not an argument in favour of adding new module parameters or special values to existing module parameters. They would have to be backported just as well.

BR,
Jani.

--
Jani Nikula, Intel Open Source Graphics Center
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 07.09.23 at 04:30, Sui Jingfeng wrote:
> Hi,
>
> On 2023/9/6 17:40, Christian König wrote:
>> On 06.09.23 at 11:08, suijingfeng wrote:
>>> Well, welcome to correct me if I'm wrong.
>>
>> You seem to have some very basic misunderstandings here.
>>
>> The term framebuffer describes some VRAM memory used for scanout.
>>
>> This framebuffer is exposed to userspace through some framebuffer driver; on UEFI platforms that is usually efifb, but it can be quite a bunch of different drivers.
>>
>> When the DRM drivers load they remove the previous drivers using drm_aperture_remove_conflicting_pci_framebuffers() (or a similar function), but this does not mean that the framebuffer or scanout parameters are modified in any way. It just means that the framebuffer is no longer exposed through this driver.
>>
>> Take over is the perfectly right description here because that's exactly what's happening. The framebuffer configuration, including the VRAM memory as well as the parameters for scanout, is exposed by the newly loaded DRM driver.
>>
>> In other words, userspace can query through the DRM interfaces which monitors are already driven by the hardware and so, in your terminology, figure out which is the primary one.
>
> I'm not quite convinced about this idea, but you might be correct.

Well I can point you to the code if you don't believe me.

> But there are cases where there are multiple monitors and each video card connects to one.

Yeah, but this is irrelevant. The key point is that the configuration is taken over when the driver loads.

So whatever was set up before (one monitor showing the console, three monitors mirrored, whatever) should be there after loading the driver as well. This configuration is just immediately overwritten because nobody cares about it.

> It is also quite common that no monitor is connected: let the machine boot first, then find a monitor to connect to a random display output and see which one will display. I don't expect the primary to change with it. The primary one has to be determined as early as possible, because the VGA console and the framebuffer console may output directly to the primary.

Well that is simply not correct. There is no concept of a "primary" display; it can just be that a monitor was brought up by the BIOS or bootloader and we take over this configuration.

> Getting DDC and/or HPD involved may unnecessarily complicate the problem. There are ASpeed BMCs which add a virtual connector in order to be able to display remotely. There are also commands to force a connector into the connected status.
>
>> It's just that, as Thomas explained as well, this is completely irrelevant to any modern desktop. Both X and Wayland iterate the available devices and start rendering to them; which one was used during boot doesn't really matter to them.
>
> You may be correct, but I'm still not sure. I probably need more time to investigate. My colleagues and I are mainly using the X server; the versions range from 1.20.4 to 1.21.1.4. Even if this is true, the problems still exist for non-modern desktops.

Well, I have over 25 years of experience with display hardware and what you describe here was never an issue.

What you have is simply a broken display driver which for some reason can't handle your use case.

I strongly suggest that you just completely drop this here and go into the AST driver and try to fix it.

Regards,
Christian.

>> Apart from that, ranting like this and trying to explain stuff to people who obviously have a much better background in the topic is not going to help your patches getting upstream.
>
> Thanks for telling me so much; I have realized where the problems are now. I will try to resolve the concerns in the next version.
>
>> Regards,
>> Christian.
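Christian's point that userspace can ask DRM which outputs are already driven can be illustrated without the ioctl interface. The sketch below reads the per-connector `status` and `enabled` files that the kernel exposes under /sys/class/drm. This is only the sysfs view and an assumption about what is convenient for a quick check; a compositor would use the DRM API (for example through libdrm) instead. The function name `connector_states` is hypothetical.

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Map each connector (e.g. 'card0-HDMI-A-1') to its (status, enabled) pair.

    'status' is typically 'connected', 'disconnected' or 'unknown';
    'enabled' is typically 'enabled' or 'disabled'.
    """
    states = {}
    for entry in sorted(Path(drm_root).glob("card*-*")):
        status_file = entry / "status"
        enabled_file = entry / "enabled"
        if not status_file.is_file():
            continue  # not a connector directory
        status = status_file.read_text().strip()
        if enabled_file.is_file():
            enabled = enabled_file.read_text().strip()
        else:
            enabled = "unknown"
        states[entry.name] = (status, enabled)
    return states


if __name__ == "__main__":
    for name, (status, enabled) in connector_states().items():
        print(f"{name}: {status}, {enabled}")
```

On a machine with several cards this lists every connector of every device, which matches the description of X and Wayland iterating the available devices rather than relying on a single "primary" one.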
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/6 17:40, Christian König wrote:
> On 06.09.23 at 11:08, suijingfeng wrote:
>> Well, welcome to correct me if I'm wrong.
>
> You seem to have some very basic misunderstandings here.
>
> The term framebuffer describes some VRAM memory used for scanout.
>
> This framebuffer is exposed to userspace through some framebuffer driver; on UEFI platforms that is usually efifb, but it can be quite a bunch of different drivers.
>
> When the DRM drivers load they remove the previous drivers using drm_aperture_remove_conflicting_pci_framebuffers() (or a similar function), but this does not mean that the framebuffer or scanout parameters are modified in any way. It just means that the framebuffer is no longer exposed through this driver.
>
> Take over is the perfectly right description here because that's exactly what's happening. The framebuffer configuration, including the VRAM memory as well as the parameters for scanout, is exposed by the newly loaded DRM driver.
>
> In other words, userspace can query through the DRM interfaces which monitors are already driven by the hardware and so, in your terminology, figure out which is the primary one.

I'm not quite convinced about this idea, but you might be correct.

But there are cases where there are multiple monitors and each video card connects to one.

It is also quite common that no monitor is connected: let the machine boot first, then find a monitor to connect to a random display output and see which one will display. I don't expect the primary to change with it. The primary one has to be determined as early as possible, because the VGA console and the framebuffer console may output directly to the primary.

Getting DDC and/or HPD involved may unnecessarily complicate the problem. There are ASpeed BMCs which add a virtual connector in order to be able to display remotely. There are also commands to force a connector into the connected status.

> It's just that, as Thomas explained as well, this is completely irrelevant to any modern desktop. Both X and Wayland iterate the available devices and start rendering to them; which one was used during boot doesn't really matter to them.

You may be correct, but I'm still not sure. I probably need more time to investigate. My colleagues and I are mainly using the X server; the versions range from 1.20.4 to 1.21.1.4. Even if this is true, the problems still exist for non-modern desktops.

> Apart from that, ranting like this and trying to explain stuff to people who obviously have a much better background in the topic is not going to help your patches getting upstream.

Thanks for telling me so much; I have realized where the problems are now. I will try to resolve the concerns in the next version.

> Regards,
> Christian.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 06.09.23 at 11:48, suijingfeng wrote:
[...]
>> There's 'nomodeset', which disables all native drivers. It's useful for debugging or as a quick-fix if the graphics driver breaks. If you want to disable a specific driver, please use one of the options for blacklisting.
>
> Yeah, 'nomodeset' disables all native drivers; this is its strong point, but it is also its weak point.

Well, that's by design. Graphics is at the core of the user experience. We often cannot _not_ provide it. And if it's broken, there needs to be a reliable fallback. There needs to be at least enough graphics support to run a terminal and repair the system. And it also needs to be simple enough for the average user. Falling back to serial terminals is often not an option.

At least here at SUSE, when users or customers report a broken graphics driver, we can tell them to start with 'nomodeset' and get at least the basic graphics. That's good enough for most productivity/office software. In the meantime, we investigate the problem.

There were concerns about the need for nomodeset, but I think it has proven to be useful in practice.

> Sometimes, when you are developing a drm driver for a new device, you will see the pain. Too often a programmer's modification makes the entire Linux kernel hang. The problematic drm driver kernel module is already in the initrd. Then the real need is to disable only the malfunctioning drm driver kernel module, while what you recommend disables them all. There is a subtle difference.

I found that initcall_blacklist= works reliably for me.

> Another limitation of the 'nomodeset' parameter is that it is only available on recent upstream kernels. Older downstream kernels don't support this parameter yet, so this creates an inconsistent development experience. I believe that there are always some people who need to do back-port and upstream work for various reasons.

Nomodeset used to be there, but in a different form. It forced VGA text mode IIRC. 'git grep' for vga_text_force() in an old kernel. We adopted the parameter for all of graphics, because it already did what we needed.

Best regards
Thomas

> While (kindly, no offense) debating: since we have modprobe.blacklist, why do we still need the 'nomodeset' parameter? Why not try modprobe.blacklist="amdgpu,radeon,i915,ast,nouveau,gma500_gfx, ..." :-/
>
> But OK overall, I will listen to your advice.
>
>> Best regards
>> Thomas
>>
>> [1] https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83
>>
>>> For the modeset parameter, authors of various device drivers try to make the usage not conflict with others. I believe that this is a good thing for Linux users. It is probably the responsibility of the drm core maintainers to force the various drm drivers to reach a minimal consensus. It is probably painful to do so and doesn't pay off, but reaching a minimal consensus does benefit Linux users.
>>>
>>>> You can use modprobe.blacklist or initcall_blacklist on the kernel command line.
>>>
>>> There are some cases where modprobe.blacklist doesn't work; I have come across this several times in the past. Because the device selected by VGAARB is a device-level thing, it is not the driver's problem.
>>>
>>> Sometimes, when VGAARB has a bug, it will select a wrong device as primary. And the X server will use this wrong device as primary and completely crash there, due to the lack of a driver. Take my old S3 Graphics as an example:
>>>
>>> $ lspci | grep VGA
>>>
>>> 00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
>>> 07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
>>> 08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
>>>
>>> Before applying this patch:
>>>
>>> [ 0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
>>> [ 0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
>>> [ 0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>>> [ 0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>>> [ 0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>>> [ 0.361781] vgaarb: loaded
>>> [ 0.367838] pci :00:06.1: Overriding boot device as 1002:6778
>>> [ 0.367841] pci :00:06.1: Overriding boot device as 5333:9070
>>> [ 0.367843] pci :00:06.1: Overriding boot device as 5333:9070
>>>
>>> For known reason, one of my systems selects the S3 Graphics as the primary GPU. But this S3 Graphics does not even have a decent upstream drm driver yet. In such a case, I begin to believe that only a device which has a driver deserves to be primary.
>>>
>>> Under such a condition, I want to reboot and enter the graphical environment with other working video cards, either platform-integrated or discrete GPUs. This doesn't mean I should compromi
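The situation described in the message above (a VGA-class device chosen as boot VGA although no kernel driver is bound to it) can be checked from sysfs. The sketch below is one possible way to do that, assuming the standard PCI sysfs attributes `class`, `boot_vga` and the `driver` symlink; the function name `vga_devices` is made up for this illustration.

```python
from pathlib import Path


def vga_devices(pci_root="/sys/bus/pci/devices"):
    """List VGA-compatible PCI devices with their boot_vga flag and bound driver."""
    devices = []
    for dev in sorted(Path(pci_root).iterdir()):
        class_file = dev / "class"
        if not class_file.is_file():
            continue
        pci_class = int(class_file.read_text().strip(), 16)
        # Base class 0x03 (display), subclass 0x00 (VGA compatible controller).
        if pci_class >> 8 != 0x0300:
            continue
        boot_vga_file = dev / "boot_vga"
        boot_vga = boot_vga_file.is_file() and boot_vga_file.read_text().strip() == "1"
        driver_link = dev / "driver"
        # The 'driver' symlink only exists while a kernel driver is bound.
        driver = driver_link.resolve().name if driver_link.exists() else None
        devices.append({"address": dev.name, "boot_vga": boot_vga, "driver": driver})
    return devices


if __name__ == "__main__":
    for info in vga_devices():
        flag = "boot VGA" if info["boot_vga"] else "secondary"
        print(f"{info['address']}: {flag}, driver={info['driver']}")
```

An entry with `boot_vga` set and `driver` equal to `None` corresponds to the case reported here: the device marked as boot VGA has nothing driving it.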
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 06.09.23 at 12:31, Sui Jingfeng wrote:
> Hi,
>
> On 2023/9/6 14:45, Christian König wrote:
>>>>> The firmware framebuffer device already gets killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So, this series is definitely not meant to interact with the firmware framebuffer (or more intelligent framebuffer drivers). It is for user space programs, such as the X server and Wayland compositors. It is for Linux users or drm driver testers, allowing them to direct the graphical display server to use the hardware of interest as the primary video card.
>>>>>
>>>>> Also, I believe that the X server and Wayland compositors are the best test examples. If a specific DRM driver can't work with the X server as primary, then there is probably something wrong.
>>>>
>>>> But what's the use case for overriding this setting?
>>>
>>> On a specific machine with multiple GPUs mounted, only the primary graphics device gets POST-ed (initialized) by the firmware. Therefore, the DRM drivers for the rest of the video cards have to work without the prerequisite setup done by the firmware; this is called POST.
>>
>> Well, you don't seem to understand the background here. This is perfectly normal behavior.
>>
>> Secondary cards are posted after loading the appropriate DRM driver. At least for amdgpu this is done by calling the appropriate functions in the BIOS.
>
> Well, thanks for telling me this. You know more than me and definitely have a better understanding.
>
> Are you telling me that the POST function for AMDGPU resides in the BIOS? The kernel calls into the BIOS?

Yes, exactly that.

> Does the BIOS here refer to the UEFI runtime or ATOM BIOS or something else?

On dGPUs it's the VBIOS on a flashrom on the board; for iGPUs (APUs as AMD calls them) it's part of the system BIOS.

UEFI is actually just a small subsystem in the system BIOS which replaced the old interface used between system BIOS, video BIOS and operating system.

> But the POST function for drm ast resides in kernel space (in other words, in ast.ko). Is this statement correct?

I don't know the ast driver well enough to answer that, but I assume they just read the BIOS and execute the appropriate functions.

> I mean that for the ASpeed BMC chip, if the firmware does not POST the display controller, then we have to POST it in kernel space before doing the various modeset operations. We can only POST this chip by directly operating the various registers. Am I correct in my judgement about the ast drm driver?

Well, POST just means Power On Self Test, but what you mean is initializing the hardware.

Some drivers can of course initialize the hardware without the help of the BIOS, but I don't think AST can do that. As far as I know it's a relatively simple driver.

BTW firmware is not the same as the BIOS (which runs the POST); firmware usually refers to something run on microcontrollers inside the ASIC, while the (system or video) BIOS runs on the host CPU.

Regards,
Christian.

> Thanks for your reviews.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/6 14:45, Christian König wrote:
>>>> The firmware framebuffer device already gets killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So, this series is definitely not meant to interact with the firmware framebuffer (or more intelligent framebuffer drivers). It is for user space programs, such as the X server and Wayland compositors. It is for Linux users or drm driver testers, allowing them to direct the graphical display server to use the hardware of interest as the primary video card.
>>>>
>>>> Also, I believe that the X server and Wayland compositors are the best test examples. If a specific DRM driver can't work with the X server as primary, then there is probably something wrong.
>>>
>>> But what's the use case for overriding this setting?
>>
>> On a specific machine with multiple GPUs mounted, only the primary graphics device gets POST-ed (initialized) by the firmware. Therefore, the DRM drivers for the rest of the video cards have to work without the prerequisite setup done by the firmware; this is called POST.
>
> Well, you don't seem to understand the background here. This is perfectly normal behavior.
>
> Secondary cards are posted after loading the appropriate DRM driver. At least for amdgpu this is done by calling the appropriate functions in the BIOS.

Well, thanks for telling me this. You know more than me and definitely have a better understanding.

Are you telling me that the POST function for AMDGPU resides in the BIOS? The kernel calls into the BIOS? Does the BIOS here refer to the UEFI runtime or ATOM BIOS or something else?

But the POST function for drm ast resides in kernel space (in other words, in ast.ko). Is this statement correct?

I mean that for the ASpeed BMC chip, if the firmware does not POST the display controller, then we have to POST it in kernel space before doing the various modeset operations. We can only POST this chip by directly operating the various registers. Am I correct in my judgement about the ast drm driver?

Thanks for your reviews.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi, On 2023/9/6 16:05, Thomas Zimmermann wrote: Hi Am 05.09.23 um 17:59 schrieb suijingfeng: [...] FYI: per-driver modeset parameters are deprecated and not to be used. Please don't promote them. Well, please wait, I want to explain. drm/nouveau already promote it a little bit. Despite no code of conduct or specification guiding how the modules parameters should be. Noticed that there already have a lot of DRM drivers support the modeset parameters, Please look at the history and discussion around this parameter. To my knowledge, 'modeset' got introduced when modesetting with still done in userspace. It was an easy way of disabling the kernel driver if the system's Xorg did no yet support kernel mode setting. Fast forward a few years and all Linux' use kernel modesetting, which make the modeset parameters obsolete. We discussed and decided to keep them in, because many articles and blog posts refer to them. We didn't want to invalidate them. BUT modeset is deprecated and not allowed in new code. If you look at existing modeset usage, you will eventually come across the comment at [1]. OK, no problem. I agree what you said. There's 'nomodeset', which disables all native drivers. It's useful for debugging or as a quick-fix if the graphics driver breaks. If you want to disable a specific driver, please use one of the options for blacklisting. Yeah, the 'nomodeset' disables all native drivers, this is a good point of it, but this is also the weak point of it. Sometimes, when you are developing a drm driver for a new device. You will see the pain. Its too often a programmer's modification make the entire Linux kernel hang there. The problematic drm driver kernel module already in the initrd. Then, the real need to disable the ill-functional drm driver kernel module only. While what you recommend to disable them all. There are subtle difference. Another limitation of the 'nomodeset' parameter is that it is only available on recent upstream kernel. 
Low version downstream kernel don't has this parameter supported yet. So this create inconstant developing experience. I believe that there always some people need do back-port and upstream work for various reasons. While (kindly, no offensive) debating, since we have the modprobe.blacklist why we still need the 'nomodeset' parameter ? why not try modprobe.blacklist="amdgpu,radeon,i915,ast,nouveau,gma500_gfx, ..." :-/ But OK in overall, I will listen to your advice. Best regards Thomas [1] https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83 for the modeset parameter, authors of various device driver try to make the usage not conflict with others. I believe that this is good thing for Linux users. It is probably the responsibility of the drm core maintainers to force various drm drivers to reach a minimal consensus. Probably it pains to do so and doesn't pay off. But reach a minimal consensus do benefit to Linux users. You can use modprobe.blacklist or initcall_blacklist on the kernel command line. There are some cases where the modprobe.blacklist doesn't works, I have come cross several time during the past. Because the device selected by the VGAARB is device-level thing, it is not the driver's problem. Sometimes when VGAARB has a bug, it will select a wrong device as primary. And the X server will use this wrong device as primary and completely crash there, due to lack a driver. Take my old S3 Graphics as an example: $ lspci | grep VGA 00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01) 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM] 07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01) 08:00.0 VGA compatible controller: S3 Graphics Ltd. 
Device 9070 (rev 01) Before apply this patch: [ 0.361748] pci :00:06.1: vgaarb: setting as boot VGA device [ 0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none [ 0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361781] vgaarb: loaded [ 0.367838] pci :00:06.1: Overriding boot device as 1002:6778 [ 0.367841] pci :00:06.1: Overriding boot device as 5333:9070 [ 0.367843] pci :00:06.1: Overriding boot device as 5333:9070 For known reason, one of my system select the S3 Graphics as primary GPU. But this S3 Graphics not even have a decent drm upstream driver yet. Under such a case, I begin to believe that only the device who has a driver deserve the primary. Under such a condition, I want to reboot and enter the graphic environment with other working video cards. Either platform integrat
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 06.09.23 at 11:08, suijingfeng wrote:
> Well, welcome to correct me if I'm wrong.

You seem to have some very basic misunderstandings here.

The term framebuffer describes some VRAM used for scanout. This framebuffer is exposed to userspace through a framebuffer driver; on UEFI platforms that is usually efifb, but it can be any of quite a number of different drivers.

When the DRM drivers load, they remove the previous drivers using drm_aperture_remove_conflicting_pci_framebuffers() (or a similar function), but this does not mean that the framebuffer or the scanout parameters are modified in any way. It only means that the framebuffer is no longer exposed through that driver.

"Take over" is exactly the right description here, because that is what happens: the framebuffer configuration, including the VRAM as well as the scanout parameters, is now exposed by the newly loaded DRM driver.

In other words, userspace can query through the DRM interfaces which monitors are already driven by the hardware and so, in your terminology, figure out which one is the primary.

It's just that, as Thomas explained as well, this is completely irrelevant to any modern desktop. Both X and Wayland iterate over the available devices and start rendering to them; which one was used during boot doesn't really matter to them.

Apart from that, ranting like this and trying to explain stuff to people who obviously have a much better background in the topic is not going to help get your patches upstream.

Regards,
Christian.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi, On 2023/9/6 14:45, Christian König wrote: Am 05.09.23 um 15:30 schrieb suijingfeng: Hi, On 2023/9/5 18:45, Thomas Zimmermann wrote: Hi Am 04.09.23 um 21:57 schrieb Sui Jingfeng: From: Sui Jingfeng On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly. Yes, but by the time of DRM drivers get loaded successfully,the boot-up graphics already finished. This is an incorrect assumption. drm_aperture_remove_conflicting_pci_framebuffers() and co don't kill the framebuffer, Well, my original description to this technique point is that 1) "Firmware framebuffer device already get killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings)" 2) "By the time of DRM drivers get loaded successfully, the boot-up graphics already finished." The word "killed" here is rough and coarse description about how does the drm device driver take over the firmware framebuffer. Since there seems have something obscure our communication, lets make the things clear. See below for more elaborate description. they just remove the current framebuffer driver to avoid further updates. This statement doesn't sound right, for UEFI environment, a correct description is that they remove the platform device, not the framebuffer driver. For the machines with the UEFI firmware, framebuffer driver here definitely refer to the efifb. The efifb still reside in the system(linux kernel). Please see the aperture_detach_platform_device() function in video/aperture.c So what happens (at least for amdgpu) is that we take over the framebuffer, This statement here is also not an accurate description. Strictly speaking, drm/amdgpu takes over the device (the VRAM hardware), not the framebuffer. 
The phrase "take over" here is also dubious, because drm/amdgpu takes over nothing. From the perspective of the device-driver model, the GPU hardware *belongs* to the amdgpu driver. Why would you need to take over something that belonged to you in the first place? If you built drm/amdgpu into the kernel and had it loaded before efifb, there would be no need to use the firmware framebuffer (the discussion here is limited to boot graphics). In that case, the so-called "take over" would not happen at all.

The truth is that efifb creates a platform device which *occupies* part of the VRAM hardware resource. Thus efifb and drm/amdgpu conflict. They conflict because they share the same hardware resource: it is the hardware resources (address ranges) used by the two drivers that conflict, not the efifb driver itself that conflicts with the drm/amdgpu driver. Thus the drm_aperture_remove_conflicting_xx() functions have to kill one of the conflicting devices, not the driver.

Therefore, the correct word would be "reclaim": drm/amdgpu *reclaims* the hardware resource (the VRAM address range) that originally belonged to it. The modeset state (including the framebuffer content) still resides in the amdgpu device. You just get the dirty framebuffer image in the framebuffer object, but that framebuffer object has been dirty ever since the UEFI firmware stage.

In conclusion, "reclaim" is more accurate than "take over". As far as I understand, drm/amdgpu takes over nothing and gains nothing.

Well, welcome to correct me if I'm wrong.
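The argument above is that the conflict is between address ranges, not between drivers as such. A minimal sketch of that idea in C follows. The `struct range` type and the `ranges_overlap()` helper are hypothetical names chosen for illustration; this is not the code in video/aperture.c. It assumes non-empty ranges whose end does not wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [base, base + size). */
struct range {
    uint64_t base;
    uint64_t size;
};

/*
 * Return true when the two ranges share at least one address.
 * Illustrative only: assumes size > 0 and no wrap-around of base + size.
 */
bool ranges_overlap(struct range a, struct range b)
{
    uint64_t a_end = a.base + a.size;   /* one past the last byte of a */
    uint64_t b_end = b.base + b.size;   /* one past the last byte of b */

    /* Two ranges overlap exactly when each starts before the other ends. */
    return a.base < b_end && b.base < a_end;
}
```

With such a check, a firmware framebuffer that occupies part of a device's memory range is reported as conflicting with that device, while a framebuffer on a different device's range is not.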
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi Am 05.09.23 um 17:59 schrieb suijingfeng: [...] FYI: per-driver modeset parameters are deprecated and not to be used. Please don't promote them. Well, please wait, I want to explain. drm/nouveau already promote it a little bit. Despite no code of conduct or specification guiding how the modules parameters should be. Noticed that there already have a lot of DRM drivers support the modeset parameters, Please look at the history and discussion around this parameter. To my knowledge, 'modeset' got introduced when modesetting with still done in userspace. It was an easy way of disabling the kernel driver if the system's Xorg did no yet support kernel mode setting. Fast forward a few years and all Linux' use kernel modesetting, which make the modeset parameters obsolete. We discussed and decided to keep them in, because many articles and blog posts refer to them. We didn't want to invalidate them. BUT modeset is deprecated and not allowed in new code. If you look at existing modeset usage, you will eventually come across the comment at [1]. There's 'nomodeset', which disables all native drivers. It's useful for debugging or as a quick-fix if the graphics driver breaks. If you want to disable a specific driver, please use one of the options for blacklisting. Best regards Thomas [1] https://elixir.bootlin.com/linux/v6.5/source/include/drm/drm_module.h#L83 for the modeset parameter, authors of various device driver try to make the usage not conflict with others. I believe that this is good thing for Linux users. It is probably the responsibility of the drm core maintainers to force various drm drivers to reach a minimal consensus. Probably it pains to do so and doesn't pay off. But reach a minimal consensus do benefit to Linux users. You can use modprobe.blacklist or initcall_blacklist on the kernel command line. There are some cases where the modprobe.blacklist doesn't works, I have come cross several time during the past. 
Because the device selected by the VGAARB is device-level thing, it is not the driver's problem. Sometimes when VGAARB has a bug, it will select a wrong device as primary. And the X server will use this wrong device as primary and completely crash there, due to lack a driver. Take my old S3 Graphics as an example: $ lspci | grep VGA 00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01) 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM] 07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01) 08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01) Before apply this patch: [ 0.361748] pci :00:06.1: vgaarb: setting as boot VGA device [ 0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none [ 0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none [ 0.361781] vgaarb: loaded [ 0.367838] pci :00:06.1: Overriding boot device as 1002:6778 [ 0.367841] pci :00:06.1: Overriding boot device as 5333:9070 [ 0.367843] pci :00:06.1: Overriding boot device as 5333:9070 For known reason, one of my system select the S3 Graphics as primary GPU. But this S3 Graphics not even have a decent drm upstream driver yet. Under such a case, I begin to believe that only the device who has a driver deserve the primary. Under such a condition, I want to reboot and enter the graphic environment with other working video cards. Either platform integrated and discrete GPU. This don't means I should compromise by un-mount the S3 graphics card from the motherboard, this also don't means that I should update my BIOS setting. As sometimes, the BIOS is more worse. 
With this series applied, all I need to do is reboot the computer and pass a command line option. By forcibly overriding the primary with another video card (one that has decent driver support), I'm able to do the debugging in a graphical environment. I would like to examine what's wrong with vgaarb on a specific platform under the X server graphical environment. I could probably try compiling a driver for this card and see whether it works, then simply reboot without needing to change anything. It is so efficient. So this is probably the second use of my patch: it hands the right of control back to the graphics developer.

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
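The vgaarb messages quoted in this message mark the chosen device with the text "setting as boot VGA device". As a rough sketch, a log line of that shape can be checked in C as below. The helper name `boot_vga_addr()` is hypothetical, and the code assumes only the line format shown in the quoted log; it copies the device address exactly as it appears in the line.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * If `line` is a vgaarb "setting as boot VGA device" message, copy the
 * device address that precedes ": vgaarb:" into `out` and return true.
 * Hypothetical helper; the format is taken from the log quoted above.
 */
bool boot_vga_addr(const char *line, char *out, size_t outlen)
{
    static const char marker[] = ": vgaarb: setting as boot VGA device";
    const char *end = strstr(line, marker);
    const char *start;
    size_t len;

    if (end == NULL || outlen == 0)
        return false;

    /* Walk back to the space that precedes the device address. */
    start = end;
    while (start > line && start[-1] != ' ')
        start--;

    len = (size_t)(end - start);
    if (len == 0 || len >= outlen)
        return false;

    memcpy(out, start, len);
    out[len] = '\0';
    return true;
}
```

Feeding every line of a saved kernel log through this function would pick out the one device vgaarb selected at boot, which is the decision the series proposes to make overridable.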
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 06.09.23 at 05:08, suijingfeng wrote:
> Hi,
>
> On 2023/9/5 23:05, Thomas Zimmermann wrote:
>> However, on modern Linux systems the primary display does not really exist. 'Primary' is the device that is available via VGA, VESA or EFI.
>
> I may be missing the point, but what do you mean by the word "modern"? Are you trying to tell me that the X server is too old and Wayland is the modern display server?

It comes down to that. Xorg's device handling is out of date. Fixing it would require a redesign of the whole program. A 'modern' compositor delegates device handling to the kernel. All it does is to open the device files and use the provided functionality. I've briefly mentioned this in the other email.

There's more to 'modern', such as 'uses Wayland for compositing', 'Mesa for direct rendering' or 'does atomic modesetting'. But that's all unrelated here.

>> Our drivers don't use these interfaces, but the native registers.
>
> Yes and no? Yes for machines with UEFI firmware, but I'm not sure whether this statement is true for machines with legacy firmware.

What I mean is: the primary device is the one that owns the VGA/VESA/EFI I/O space. But DRM drivers don't program via VGA registers or VESA/EFI calls. They use the hardware's actual native registers in each device's I/O space. So each device operates on its own. They (usually) don't have to share/arbitrate access to the VGA registers. Hence the idea of a primary device does not make sense here. It's useful to pick an initial default, but further display setup should rather be left to userspace.

> The display controller in the ASpeed BMC is VGA compatible, so in theory it should work with the VGA console on a machine that has another VGA-compatible video card. So the ast_vga_set_decode() function provided in patch 0007 is probably useful in a legacy firmware environment. To be honest, I have tested this on various machines with UEFI firmware, but I didn't realize that I should also test in a legacy firmware environment before sending this patch. The testing effort needed seems quite exhausting, since all my machines come with UEFI firmware. So is it OK to leave the legacy part to someone else who is interested in it? Probably Alex is more experienced with legacy VGA routing stuff?

Maybe you can describe the user's problem to us. TBH I still don't understand what you're trying to solve. If you want to set the console's initial output device, you can make a parameter in vgaarb. But I also don't really see a need for that either.

Best regards
Thomas

> :-)

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 06.09.23 at 04:34, suijingfeng wrote:
> On 2023/9/5 23:05, Thomas Zimmermann wrote:
>> Hi
>>
>> On 05.09.23 at 15:30, suijingfeng wrote:
>>> Hi,
>>>
>>> On 2023/9/5 18:45, Thomas Zimmermann wrote:
>>>> Hi
>>>>
>>>> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>>>>> From: Sui Jingfeng
>>>>>
>>>>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned
>>>>
>>>> If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.
>>>
>>> Yes, but by the time the DRM drivers have loaded successfully, the boot-up graphics have already finished. The firmware framebuffer device has already been killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So this series is definitely not meant to interact with the firmware framebuffer
>>
>> Yes and no. The helpers you mention will attempt to remove the firmware framebuffer on the given PCI device. If you have multiple PCI devices, the other devices would not be affected.
>
> Yes and no. For the yes part: drm_aperture_remove_conflicting_pci_framebuffers() only kills the conflicting one. But on a specific machine with modern UEFI firmware, there should be only one firmware framebuffer driver. That should be EFIFB (UEFI GOP). I do have multiple PCI devices, but I don't understand when and why a system would have more than one firmware framebuffer.

Maybe somewhat unrelated to the actual discussion, but it's not as simple as you assume. Many non-x86 systems use DeviceTree. On Sparc, IIRC, there's the case of having multiple firmware framebuffers listed in the DT. We create a device for each and attach a DRM firmware driver; ofdrm in this case. I haven't seen this in the wild, but non-Sparc systems could also behave like that.

And in addition to that, ARM-based systems often use UEFI boot stub code that provides a simple UEFI environment to the kernel. For graphics we've had cases where we received the same firmware framebuffer from the DT and from the UEFI boot stub. We have to detect and handle such duplication in the kernel.

Best regards
Thomas

> Even for machines with a legacy BIOS, the fixed VGA aperture address range can only be owned by one firmware driver. It is just that we need to handle the routing; the ->set_decode() callback of vga_client_register() is used to do such work. Am I correct?

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 06.09.23 at 04:14, suijingfeng wrote:
> Hi,
>
> On 2023/9/5 23:05, Thomas Zimmermann wrote:
>> However, on modern Linux systems the primary display does not really exist.
>
> No, it does exist. The X server needs to know which one is the primary GPU. The '*' character in front of the (4@0:0:0) PCI device denotes the primary, see the log below.
>
> (II) xfree86: Adding drm device (/dev/dri/card2)
> (II) xfree86: Adding drm device (/dev/dri/card0)
> (II) Platform probe for /sys/devices/pci:00/:00:1c.5/:003:00.0/:04:00.0/drm/card0
> (II) xfree86: Adding drm device (/dev/dri/card3)
> (II) Platform probe for /sys/devices/pci:00/:00:1c.6/:005:00.0/drm/card3
> (--) PCI: (0@0:2:0) 8086:3e91:8086:3e91 rev 0, Mem @ 0xdb00/16216, 0xa000/536870912, I/O @ 0xf000/64, BIOS @ 0x/131072
> (--) PCI: (1@0:0:0) 1002:6771:1043:8636 rev 0, Mem @ 0xc000/2688435456, 0xdf22/131072, I/O @ 0xe000/256, BIOS @ 0x/131072
> (--) PCI:*(4@0:0:0) 1a03:2000:1a03:2000 rev 48, Mem @ 0xde00/166777216, 0xdf02/131072, I/O @ 0xc000/128, BIOS @ 0x/131072
> (--) PCI: (5@0:0:0) 10de:1288:174b:b324 rev 161, Mem @ 0xdc00/116777216, 0xd000/134217728, 0xd800/33554432, I/O @ 0xb000/128, BIOS @@0x/524288
>
> The modesetting driver of the X server will create the framebuffer on the primary video adapter. If a 2D video adapter (like the ASpeed BMC) is not the primary, it probably will not be used. Its only chance to display something is to function as an output slave. But the output slave technique needs PRIME support for cross-driver buffer sharing. So there is some difference between the primary and non-primary video adapters.

Xorg is a pretty bad example, because X parses the PCI bus and then tries to match devices to /dev/dri/ files. That's also not fixable in Xorg's current code base. Please don't promote Xorg's design. It dates back to the time when Xorg did the modesetting by itself.

Userspace should just open existing device files and start rendering. Maybe pick the previous settings and/or do some guess work about the arrangement of these devices. AFAIK that's what the modern compositors do.

Best regards
Thomas

>> 'Primary' is the device that is available via VGA, VESA or EFI. Our drivers don't use these interfaces, but the native registers. As you said yourself, these firmware devices (VGA, VESA, EFI) are removed ASAP by the native drivers.

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
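The Xorg log quoted in this message marks the primary device with a '*' directly after "PCI:". A small sketch in C of reading one such line is shown below. The `struct xorg_pci_entry` type and the `parse_xorg_pci_line()` helper are hypothetical names, and the parsing assumes only the line layout visible in the quoted log, not any documented Xorg format guarantee.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "(--) PCI:" line of an Xorg log. */
struct xorg_pci_entry {
    bool primary;         /* true when the line carries the '*' marker */
    unsigned int vendor;  /* PCI vendor ID, e.g. 0x1a03 */
    unsigned int device;  /* PCI device ID, e.g. 0x2000 */
};

/*
 * Parse a line such as
 *   (--) PCI:*(4@0:0:0) 1a03:2000:1a03:2000 rev 48, ...
 * Returns false if the line does not have that shape.
 */
bool parse_xorg_pci_line(const char *line, struct xorg_pci_entry *e)
{
    const char *p = strstr(line, "PCI:");
    const char *ids;

    if (p == NULL)
        return false;

    /* The character right after "PCI:" is '*' for the primary, ' ' otherwise. */
    e->primary = (p[4] == '*');

    /* The vendor:device pair follows the closing parenthesis. */
    ids = strchr(p, ')');
    if (ids == NULL)
        return false;

    return sscanf(ids + 1, " %x:%x", &e->vendor, &e->device) == 2;
}
```

Applied to the four PCI lines in the quoted log, only the 1a03:2000 entry would be reported as primary.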
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 05.09.23 at 15:30, suijingfeng wrote:
> Hi,
>
> On 2023/9/5 18:45, Thomas Zimmermann wrote:
>> Hi
>>
>> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>>> From: Sui Jingfeng
>>>
>>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned
>>
>> If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.
>
> Yes, but by the time the DRM drivers have loaded successfully, the boot-up graphics have already finished.

This is an incorrect assumption.

drm_aperture_remove_conflicting_pci_framebuffers() and co don't kill the framebuffer, they just remove the current framebuffer driver to avoid further updates.

So what happens (at least for amdgpu) is that we take over the framebuffer, including both mode and its contents, and provide a new framebuffer interface until DRM masters like X or Wayland take over.

> The firmware framebuffer device has already been killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So this series is definitely not meant to interact with the firmware framebuffer (or more intelligent framebuffer drivers). It is for user space programs, such as the X server and Wayland compositors. It is for Linux users or DRM driver testers, allowing them to direct the graphical display server to use the hardware they are interested in as the primary video card. Also, I believe that the X server and Wayland compositors are the best test examples. If a specific DRM driver can't work with the X server as the primary, then there is probably something wrong.
>
>> But what's the use case for overriding this setting?
>
> On a specific machine with multiple GPUs mounted, only the primary graphics device gets POST-ed (initialized) by the firmware. Therefore, the DRM drivers for the remaining video cards have to work without the prerequisite setup done by the firmware; this setup is called POST.

Well, you don't seem to understand the background here. This is perfectly normal behavior.

Secondary cards are posted after loading the appropriate DRM driver. At least for amdgpu this is done by calling the appropriate functions in the BIOS.

> One of the use cases of this series is to test whether a specific DRM driver works properly even though no prerequisite work has been done by the firmware at all. And it seems that the results are not satisfying in all cases. drm/ast is the first DRM driver which refused to work if not POST-ed by the firmware.

As far as I know this is expected as well. AST is a relatively simple driver and when it's not the primary one during boot the assumption is that it isn't used at all.

Regards,
Christian.

> Before applying this series, I was unable to make drm/ast the primary video card easily. In a multiple video card configuration, the monitor connected to the AST2400 does not light up. This is confusing; a naive programmer may suspect that PRIME is not working.
>
> After applying this series and passing ast.modeset=10 on the kernel command line, I found that the monitor connected to my AST2400 video card was still black. It didn't display anything.
>
> While studying drm/ast, I learned that the drm/ast driver ships with POST code; see the ast_post_gpu() function. So I wondered why this function didn't work. After a short (hasty) debugging session, I found that ast_post_gpu() didn't get run, because that depends on ast->config_mode.
>
> Without thinking too much, I hardcoded ast->config_mode as ast_use_p2a to force ast_post_gpu() to run.
>
> ```
> --- a/drivers/gpu/drm/ast/ast_main.c
> +++ b/drivers/gpu/drm/ast/ast_main.c
> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
>  		}
>  	}
>  
> +	ast->config_mode = ast_use_p2a;
> +
>  	switch (ast->config_mode) {
>  	case ast_use_defaults:
>  		drm_info(dev, "Using default configuration\n");
> ```
>
> Then the monitor lit up and displayed the Ubuntu greeter. Therefore, my patch is helpful, at least for Linux DRM driver testers and developers. It allows programmers to test a specific part of a specific driver without changing a line of source code and without needing sudo authority. It helps to improve the efficiency of testing and patch verification.
>
> I know about the PrimaryGPU option in the Xorg conf, but that approach remembers the setup that has been made; you need to modify it with root authority each time you want to switch the primary. When rapidly developing and/or testing multiple video drivers with only one computer available, what we really want is probably a one-shot command like the one this series provides.
>
> So this is the first use case. It probably also helps to test full modeset, PRIME and reverse PRIME on a machine with multiple video cards.
>
>> Best regards
>> Thomas
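When switching the primary between boots as described above, it can help to confirm which device the kernel currently reports as the boot VGA device. Linux exposes a `boot_vga` attribute in sysfs for VGA-class PCI devices. The sketch below reads one such attribute file; the helper name `read_boot_vga()` is hypothetical, and the sysfs path in the usage note is only an example address, not one taken from this thread.

```c
#include <stdio.h>

/*
 * Read a sysfs boot_vga attribute file.
 * Returns 1 if it reads "1", 0 if it reads "0", and -1 on any error
 * (file missing, unreadable, or unexpected content).
 */
int read_boot_vga(const char *attr_path)
{
    FILE *f = fopen(attr_path, "r");
    int c;

    if (f == NULL)
        return -1;

    c = fgetc(f);   /* the attribute is a single digit followed by a newline */
    fclose(f);

    if (c == '1')
        return 1;
    if (c == '0')
        return 0;
    return -1;
}
```

A call such as `read_boot_vga("/sys/bus/pci/devices/0000:00:02.0/boot_vga")` would then report whether that device is the boot VGA device, without needing root access.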
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/5 23:05, Thomas Zimmermann wrote:
> You might have found a bug in the ast driver. Ast has means to detect if the device has been POSTed and maybe do that. If this doesn't work correctly, it needs a fix.

That sounds fine. The bug is not a big deal; I'm just taking it as an example and reporting it to you. But a real fix can be complex, because quite a lot of servers ship with ASpeed BMC hardware. Honestly, I don't have the time to fix it in a formal way. I already have tons of patches pending, and I will focus on solving the VGAARB-related problem.

Because I want to test your patches occasionally, this series is useful to me in corner cases.
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/5 23:05, Thomas Zimmermann wrote:
> However, on modern Linux systems the primary display does not really exist. 'Primary' is the device that is available via VGA, VESA or EFI.

I may be missing the point, but what do you mean by the word "modern"? Are you trying to tell me that the X server is too old and Wayland is the modern display server?

> Our drivers don't use these interfaces, but the native registers.

Yes and no? Yes for machines with UEFI firmware, but I'm not sure whether this statement is true for machines with legacy firmware.

The display controller in the ASpeed BMC is VGA compatible, so in theory it should work with the VGA console on a machine that has another VGA-compatible video card. So the ast_vga_set_decode() function provided in patch 0007 is probably useful in a legacy firmware environment. To be honest, I have tested this on various machines with UEFI firmware, but I didn't realize that I should also test in a legacy firmware environment before sending this patch. The testing effort needed seems quite exhausting, since all my machines come with UEFI firmware. So is it OK to leave the legacy part to someone else who is interested in it? Probably Alex is more experienced with legacy VGA routing stuff? :-)
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 2023/9/5 23:05, Thomas Zimmermann wrote:
> Hi
>
> On 05.09.23 at 15:30, suijingfeng wrote:
>> Hi,
>>
>> On 2023/9/5 18:45, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>>>> From: Sui Jingfeng
>>>>
>>>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned
>>>
>>> If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.
>>
>> Yes, but by the time the DRM drivers have loaded successfully, the boot-up graphics have already finished. The firmware framebuffer device has already been killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So this series is definitely not meant to interact with the firmware framebuffer
>
> Yes and no. The helpers you mention will attempt to remove the firmware framebuffer on the given PCI device. If you have multiple PCI devices, the other devices would not be affected.

Yes and no. For the yes part: drm_aperture_remove_conflicting_pci_framebuffers() only kills the conflicting one. But on a specific machine with modern UEFI firmware, there should be only one firmware framebuffer driver. That should be EFIFB (UEFI GOP). I do have multiple PCI devices, but I don't understand when and why a system would have more than one firmware framebuffer.

Even for machines with a legacy BIOS, the fixed VGA aperture address range can only be owned by one firmware driver. It is just that we need to handle the routing; the ->set_decode() callback of vga_client_register() is used to do such work. Am I correct?
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/5 23:05, Thomas Zimmermann wrote:
> However, on modern Linux systems the primary display does not really exist.

No, it does exist. The X server needs to know which one is the primary GPU. The '*' character in front of the (4@0:0:0) PCI device denotes the primary; see the log below.

    (II) xfree86: Adding drm device (/dev/dri/card2)
    (II) xfree86: Adding drm device (/dev/dri/card0)
    (II) Platform probe for /sys/devices/pci:00/:00:1c.5/:003:00.0/:04:00.0/drm/card0
    (II) xfree86: Adding drm device (/dev/dri/card3)
    (II) Platform probe for /sys/devices/pci:00/:00:1c.6/:005:00.0/drm/card3
    (--) PCI: (0@0:2:0) 8086:3e91:8086:3e91 rev 0, Mem @ 0xdb00/16216, 0xa000/536870912, I/O @ 0xf000/64, BIOS @ 0x/131072
    (--) PCI: (1@0:0:0) 1002:6771:1043:8636 rev 0, Mem @ 0xc000/2688435456, 0xdf22/131072, I/O @ 0xe000/256, BIOS @ 0x/131072
    (--) PCI:*(4@0:0:0) 1a03:2000:1a03:2000 rev 48, Mem @ 0xde00/166777216, 0xdf02/131072, I/O @ 0xc000/128, BIOS @ 0x/131072
    (--) PCI: (5@0:0:0) 10de:1288:174b:b324 rev 161, Mem @ 0xdc00/116777216, 0xd000/134217728, 0xd800/33554432, I/O @ 0xb000/128, BIOS @@0x/524288

The modesetting driver of the X server creates the framebuffer on the primary video adapter. If a 2D video adapter (like the ASPEED BMC) is not the primary, it will probably not be used. Its only chance to display something is to function as an output slave, but output slave technology needs PRIME support for cross-driver buffer sharing. So there is a difference between the primary and non-primary video adapters.

> 'Primary' is the device that is available via VGA, VESA or EFI. Our drivers don't use these interfaces, but the native registers. As you said yourself, these firmware devices (VGA, VESA, EFI) are removed ASAP by the native drivers.
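For anyone who wants to check the '*' marker mechanically, a minimal sketch follows. It assumes the "(--) PCI:" line layout seen in the log above, which may differ between X server versions; the function name `xorg_pci_line_is_primary` is made up for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true when `line` is an Xorg "PCI:" probe line whose device is
 * flagged with '*' (the primary), and stores the PCI vendor and device
 * IDs that follow the "(n@b:d:f)" part. Returns false for every other
 * line, including PCI lines of non-primary devices.
 */
bool xorg_pci_line_is_primary(const char *line,
                              unsigned *vendor, unsigned *device)
{
    const char *p = strstr(line, "PCI:");
    if (!p)
        return false;

    p += strlen("PCI:");
    if (*p != '*')              /* non-primary devices have a space here */
        return false;

    p = strchr(p, ')');         /* end of the "(4@0:0:0)" location */
    if (!p)
        return false;

    /* The IDs are printed as hex "vendor:device:subvendor:subdevice". */
    return sscanf(p + 1, " %x:%x", vendor, device) == 2;
}
```

Fed the four "(--) PCI:" lines from the log, only the 1a03:2000 (ASPEED) line is reported as primary.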
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
On 2023/9/5 18:49, Thomas Zimmermann wrote:
> Hi
>
> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>> From: Sui Jingfeng
>>
>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned problem by introduced the ->be_primary() function stub. The specific device drivers can provide an implementation to hook up with this stub by calling the vga_client_register() function.
>>
>> Once the driver bound the device successfully, VGAARB will call back to the device driver. To query if the device drivers want to be primary or not. Device drivers can just pass NULL if have no such needs.
>>
>> Please note that:
>>
>> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would like to mount at least three video cards.
>>
>> 2) Typically, those non-x86 machines don't have a good UEFI firmware support, which doesn't support select primary GPU as firmware stage. Even on x86, there are old UEFI firmwares which already made undesired decision for you.
>>
>> 3) This series is attempt to solve the remain problems at the driver level, while another series[1] of me is target to solve the majority of the problems at device level.
>>
>> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics 630 is the default boot VGA, successfully override by ast2400 with ast.modeset=10 append at the kernel cmd line.
>
> FYI: per-driver modeset parameters are deprecated and not to be used. Please don't promote them.

Well, please wait, let me explain. drm/nouveau already promotes it a little. There is no code of conduct or specification guiding how module parameters should be defined. Still, many DRM drivers already support a modeset parameter, and the authors of the various device drivers try to keep its usage from conflicting with the others. I believe this is a good thing for Linux users. It is probably the responsibility of the DRM core maintainers to push the various DRM drivers toward a minimal consensus. It is probably painful to do and may not pay off, but reaching a minimal consensus does benefit Linux users.

> You can use modprobe.blacklist or initcall_blacklist on the kernel command line.

There are some cases where modprobe.blacklist doesn't work; I have come across them several times in the past. The device selected by VGAARB is a device-level matter, not a driver problem. Sometimes, when VGAARB has a bug, it selects the wrong device as primary. The X server then uses this wrong device as primary and crashes, because the device lacks a driver. Take my old S3 Graphics card as an example:

    $ lspci | grep VGA
    00:06.1 VGA compatible controller: Loongson Technology LLC DC (Display Controller) (rev 01)
    03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XT [Radeon HD 7470/8470 / R5 235/310 OEM]
    07:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)
    08:00.0 VGA compatible controller: S3 Graphics Ltd. Device 9070 (rev 01)

Before applying this patch:

    [0.361748] pci :00:06.1: vgaarb: setting as boot VGA device
    [0.361753] pci :00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
    [0.361765] pci :03:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
    [0.361773] pci :07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
    [0.361779] pci :08:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
    [0.361781] vgaarb: loaded
    [0.367838] pci :00:06.1: Overriding boot device as 1002:6778
    [0.367841] pci :00:06.1: Overriding boot device as 5333:9070
    [0.367843] pci :00:06.1: Overriding boot device as 5333:9070

For a known reason, one of my systems selects the S3 Graphics card as the primary GPU. But this S3 Graphics card does not even have a decent upstream DRM driver yet. In such a case, I begin to believe that only a device that has a driver deserves to be primary.

Under such conditions, I want to reboot and enter the graphical environment with one of the other working video cards, either the platform-integrated or a discrete GPU. This does not mean I should compromise by removing the S3 graphics card from the motherboard, nor that I should change my BIOS settings, as sometimes the BIOS is even worse.

With this series applied, all I need to do is reboot the computer and pass a command-line option. By forcibly overriding the primary with another video card (one that has decent driver support), I am able to debug in a graphical environment. I would like to examine what is wrong with vgaarb on a specific platform under the X server graphical environment, probably try to compile a driver for this card and see whether it works, and simply reboot without needing to change anything else. It is so efficient. So this is probably the second use of my patch. It hands control back to the graphics developer.
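The policy argued for above ("only a device that has a driver deserves to be primary") can be written down as a toy selection function. This is a userspace sketch under my own assumptions, not code from the series: `toy_gpu`, `toy_pick_primary` and the tie-break (the firmware's choice wins if it has a driver, otherwise the first device with a driver) are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_gpu {
    const char *slot;     /* label only, e.g. a PCI slot string */
    bool firmware_boot;   /* device the firmware/arbiter picked at boot */
    bool has_driver;      /* a working driver is bound to the device */
};

/*
 * Pick a primary among devices that have a driver. The boot device is
 * preferred when it has one; otherwise the first device with a driver
 * is used. Returns the index, or -1 when no device has a driver.
 */
int toy_pick_primary(const struct toy_gpu *gpus, size_t n)
{
    int fallback = -1;

    for (size_t i = 0; i < n; i++) {
        if (!gpus[i].has_driver)
            continue;               /* never primary without a driver */
        if (gpus[i].firmware_boot)
            return (int)i;
        if (fallback < 0)
            fallback = (int)i;
    }
    return fallback;
}
```

With four entries labelled after the lspci listing above, and assuming for the sake of the example that a driverless S3 card was picked at boot, the function falls back to the first device that does have a driver.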
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 05.09.23 at 15:30, suijingfeng wrote:
> Hi,
>
> On 2023/9/5 18:45, Thomas Zimmermann wrote:
>> Hi
>>
>> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>>> From: Sui Jingfeng
>>>
>>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned
>>
>> If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.
>
> Yes, but by the time of DRM drivers get loaded successfully, the boot-up graphics already finished. Firmware framebuffer device already get killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So, this series is definitely not to interact with the firmware framebuffer

Yes and no. The helpers you mention will attempt to remove the firmware framebuffer on the given PCI device. If you have multiple PCI devices, the other devices would not be affected. This also means that probing a non-primary card will not affect the firmware framebuffer on the primary card. You can have all these drivers co-exist next to each other.

If you link a full DRM driver into the kernel image, it might even be loaded before the firmware-framebuffer's driver. We had some funny bugs from these interactions.

> (or more intelligent framebuffer drivers). It is for user space program, such as X server and Wayland compositor. Its for Linux user or drm drivers testers, which allow them to direct graphic display server using right hardware of interested as primary video card.
>
> Also, I believe that X server and Wayland compositor are the best test examples. If a specific DRM driver can't work with X server as a primary, then there probably have something wrong.

If you want to run a userspace compositor or X11 on a certain device, you best configure this in the program's config files. But not on the kernel command line.

The whole concept of a 'primary' display is bogus IMHO. It only exists because old VGA and BIOS (and their equivalents on non-PC systems) were unable to use more than one graphics device. Hence, as you write below, only the first device got POSTed by the BIOS. If you had an additional card, the device driver needed to perform the POSTing.

However, on modern Linux systems the primary display does not really exist. 'Primary' is the device that is available via VGA, VESA or EFI. Our drivers don't use these interfaces, but the native registers. As you said yourself, these firmware devices (VGA, VESA, EFI) are removed ASAP by the native drivers.

>> But what's the use case for overriding this setting?
>
> On a specific machine with multiple GPUs mounted, only the primary graphics get POST-ed (initialized) by the firmware. Therefore, the DRM drivers for the rest video cards, have to choose to work without the prerequisite setups done by firmware, This is called as POST.
>
> One of the use cases of this series is to test if a specific DRM driver could works properly, even though there is no prerequisite works have been done by firmware at all. And it seems that the results is not satisfying in all cases. drm/ast is the first drm drivers which refused to work if not being POST-ed by the firmware.

You might have found a bug in the ast driver. Ast has means to detect if the device has been POSTed and maybe do that. If this doesn't work correctly, it needs a fix.

As Christian mentioned, if anything, you might add an option to specify the default card to vgaarb (e.g., as PCI slot). But userspace should avoid the idea of a primary card IMHO.

Best regards
Thomas

> Before apply this series, I was unable make drm/ast as the primary video card easily. On a multiple video card configuration, the monitor connected with the AST2400 not light up. While confusing, a naive programmer may suspect the PRIME is not working.
>
> After applied this series and passing ast.modeset=10 on the kernel cmd line, I found that the monitor connected with my ast2400 video card still black, It doesn't display and doesn't show image to me.
>
> While in the process of study drm/ast, I know that drm/ast driver has the POST code shipped. See the ast_post_gpu() function, then, I was wondering why this function doesn't works. After a short-time (hasty) debugging, I found that the ast_post_gpu() function didn't get run. Because it have something to do with the ast->config_mode. Without thinking too much, I hardcoded the ast->config_mode as ast_use_p2a to force the ast_post_gpu() function get run.
>
> ```
> --- a/drivers/gpu/drm/ast/ast_main.c
> +++ b/drivers/gpu/drm/ast/ast_main.c
> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
>  		}
>  	}
>  
> +	ast->config_mode = ast_use_p2a;
> +
>  	switch (ast->config_mode) {
>  	case ast_use_defaults:
>  		drm_info(dev, "Using default configuration\n");
> ```
>
> Then, the monitor light up, it display the Ubuntu gr
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi,

On 2023/9/5 18:45, Thomas Zimmermann wrote:
> Hi
>
> On 04.09.23 at 21:57, Sui Jingfeng wrote:
>> From: Sui Jingfeng
>>
>> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned
>
> If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.

Yes, but by the time the DRM drivers have loaded successfully, the boot-up graphics have already finished. The firmware framebuffer device has already been killed by the drm_aperture_remove_conflicting_pci_framebuffers() function (or its siblings). So this series is definitely not meant to interact with the firmware framebuffer (or more intelligent framebuffer drivers). It is for user-space programs, such as the X server and Wayland compositors. It is for Linux users and DRM driver testers; it allows them to direct the graphical display server to use the hardware of interest as the primary video card.

Also, I believe that the X server and Wayland compositors are the best test examples. If a specific DRM driver can't work with the X server as primary, then there is probably something wrong.

> But what's the use case for overriding this setting?

On a given machine with multiple GPUs installed, only the primary graphics device gets POST-ed (initialized) by the firmware. Therefore, the DRM drivers for the remaining video cards have to work without the prerequisite setup done by the firmware; that setup is what is called POST.

One of the use cases of this series is to test whether a specific DRM driver works properly even though no prerequisite work has been done by the firmware at all. And it seems that the results are not satisfying in all cases: drm/ast is the first DRM driver that refused to work when not POST-ed by the firmware.

Before applying this series, I was unable to easily make drm/ast the primary video card. In a multiple video card configuration, the monitor connected to the AST2400 did not light up. While this is confusing, a naive programmer may suspect that PRIME is not working.

After applying this series and passing ast.modeset=10 on the kernel command line, I found that the monitor connected to my ast2400 video card was still black. It did not display any image.

While studying drm/ast, I learned that the driver ships POST code; see the ast_post_gpu() function. So I wondered why this function doesn't work. After some short (hasty) debugging, I found that ast_post_gpu() did not get run, because that depends on ast->config_mode. Without thinking too much, I hardcoded ast->config_mode to ast_use_p2a to force ast_post_gpu() to run.

```
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
 		}
 	}
 
+	ast->config_mode = ast_use_p2a;
+
 	switch (ast->config_mode) {
 	case ast_use_defaults:
 		drm_info(dev, "Using default configuration\n");
```

Then the monitor lit up and displayed the Ubuntu greeter. Therefore, my patch is helpful, at least for Linux DRM driver testers and developers. It allows programmers to test a specific part of a specific driver without changing a line of source code and without needing sudo authority. It helps improve the efficiency of testing and patch verification.

I know about the PrimaryGPU option in the Xorg configuration, but that approach persists the setup; you need to modify it with root authority each time you want to switch the primary. When rapidly developing and/or testing multiple video drivers with only one computer available, what we really want is probably a one-shot command like the one this series provides. So this is the first use case.

This probably also helps with testing full modeset, PRIME and reverse PRIME on machines with multiple video cards.

> Best regards
> Thomas
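When testing with one-shot boot options like this, it can help to confirm from userspace what was actually passed, for example by reading /proc/cmdline. The sketch below only extracts an integer `key=value` token from such a string. It is not how the kernel or the ast driver parses module parameters, the helper name `toy_cmdline_get_long` is made up, and whether ast.modeset should be used at all is disputed elsewhere in this thread.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for a whitespace-separated "key=<decimal>" token in `cmdline`.
 * On success stores the value in *out and returns true. Tokens whose
 * value is not a plain decimal integer are ignored.
 */
bool toy_cmdline_get_long(const char *cmdline, const char *key, long *out)
{
    static const char ws[] = " \t\n";
    size_t klen = strlen(key);
    const char *p = cmdline;

    while (*p) {
        p += strspn(p, ws);                   /* skip separators */
        const char *end = p + strcspn(p, ws); /* one past the token */

        if ((size_t)(end - p) > klen &&
            strncmp(p, key, klen) == 0 && p[klen] == '=') {
            const char *start = p + klen + 1;
            char *stop;
            long v = strtol(start, &stop, 10);

            /* Accept only if the whole value was consumed. */
            if (stop != start && stop == end) {
                *out = v;
                return true;
            }
        }
        p = end;
    }
    return false;
}
```

Given the string "root=/dev/sda1 quiet ast.modeset=10" and the key "ast.modeset", the helper reports the value 10; a token such as "xast.modeset=10" is not matched because the key must start the token.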
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 04.09.23 at 21:57, Sui Jingfeng wrote:
> From: Sui Jingfeng
>
> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned problem by introduced the ->be_primary() function stub. The specific device drivers can provide an implementation to hook up with this stub by calling the vga_client_register() function.
>
> Once the driver bound the device successfully, VGAARB will call back to the device driver. To query if the device drivers want to be primary or not. Device drivers can just pass NULL if have no such needs.
>
> Please note that:
>
> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would like to mount at least three video cards.
>
> 2) Typically, those non-x86 machines don't have a good UEFI firmware support, which doesn't support select primary GPU as firmware stage. Even on x86, there are old UEFI firmwares which already made undesired decision for you.
>
> 3) This series is attempt to solve the remain problems at the driver level, while another series[1] of me is target to solve the majority of the problems at device level.
>
> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics 630 is the default boot VGA, successfully override by ast2400 with ast.modeset=10 append at the kernel cmd line.

FYI: per-driver modeset parameters are deprecated and not to be used. Please don't promote them.

You can use modprobe.blacklist or initcall_blacklist on the kernel command line.

Best regards
Thomas

> $ lspci | grep VGA
>
>  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XTX [Radeon HD 8490 / R5 235X OEM]
>  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
>  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] (rev a1)
>
> $ sudo dmesg | grep vgaarb
>
>  pci :00:02.0: vgaarb: setting as boot VGA device
>  pci :00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
>  pci :01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  pci :04:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  vgaarb: loaded
>  ast :04:00.0: vgaarb: Override as primary by driver
>  i915 :00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
>  radeon :01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
>  ast :04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
>
> v2:
>     * Add a simple implementation for drm/i915 and drm/ast
>     * Pick up all tags (Mario)
> v3:
>     * Fix a mistake for drm/i915 implement
>     * Fix patch can not be applied problem because of merge conflict.
> v4:
>     * Focus on solve the real problem.
>
> v1,v2 at https://patchwork.freedesktop.org/series/120059/
>    v3 at https://patchwork.freedesktop.org/series/120562/
>
> [1] https://patchwork.freedesktop.org/series/122845/
>
> Sui Jingfeng (9):
>   PCI/VGA: Allowing the user to select the primary video adapter at boot time
>   drm/nouveau: Implement .be_primary() callback
>   drm/radeon: Implement .be_primary() callback
>   drm/amdgpu: Implement .be_primary() callback
>   drm/i915: Implement .be_primary() callback
>   drm/loongson: Implement .be_primary() callback
>   drm/ast: Register as a VGA client by calling vga_client_register()
>   drm/hibmc: Register as a VGA client by calling vga_client_register()
>   drm/gma500: Register as a VGA client by calling vga_client_register()
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c      | 11 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c         | 13 -
>  drivers/gpu/drm/ast/ast_drv.c                   | 31 ++
>  drivers/gpu/drm/gma500/psb_drv.c                | 57 ++-
>  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c     | 15 +
>  drivers/gpu/drm/i915/display/intel_vga.c        | 15 -
>  drivers/gpu/drm/loongson/loongson_module.c      |  2 +-
>  drivers/gpu/drm/loongson/loongson_module.h      |  1 +
>  drivers/gpu/drm/loongson/lsdc_drv.c             | 10 +++-
>  drivers/gpu/drm/nouveau/nouveau_vga.c           | 11 +++-
>  drivers/gpu/drm/radeon/radeon_device.c          | 10 +++-
>  drivers/pci/vgaarb.c                            | 43 --
>  drivers/vfio/pci/vfio_pci_core.c                |  2 +-
>  include/linux/vgaarb.h                          |  8 ++-
>  14 files changed, 210 insertions(+), 19 deletions(-)

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
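As a reading aid for the mechanism the quoted cover letter describes (the arbiter calls back into a bound driver to ask whether it wants to be primary, and drivers may pass NULL), here is a toy userspace model. It is not the code from the series: `toy_client`, `toy_select_primary`, `toy_wants_primary`, the argument-less callback and the "first driver that asks wins" tie-break are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_client {
    const char *driver;        /* label only */
    bool firmware_boot;        /* device set up by the firmware at boot */
    bool (*be_primary)(void);  /* NULL: the driver has no preference */
};

/* Example callback for a driver that asks to become primary. */
bool toy_wants_primary(void)
{
    return true;
}

/*
 * The default primary is the firmware's boot device. A driver whose
 * callback returns true overrides that default. Returns the index of
 * the chosen client, or -1 when there is neither a boot device nor a
 * driver asking to be primary.
 */
int toy_select_primary(const struct toy_client *clients, size_t n)
{
    int primary = -1;

    for (size_t i = 0; i < n; i++) {
        if (clients[i].firmware_boot) {
            primary = (int)i;
            break;
        }
    }

    for (size_t i = 0; i < n; i++) {
        if (clients[i].be_primary && clients[i].be_primary())
            return (int)i;     /* override by driver request */
    }

    return primary;
}
```

With three clients labelled i915 (boot device, NULL callback), radeon (NULL callback) and ast (callback returning true), the model selects ast, which corresponds to the "Override as primary by driver" line in the quoted dmesg output; with no callback set, the boot device stays primary.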
Re: [Intel-gfx] [Nouveau] [RFC, drm-misc-next v4 0/9] PCI/VGA: Allowing the user to select the primary video adapter at boot time
Hi

On 04.09.23 at 21:57, Sui Jingfeng wrote:
> From: Sui Jingfeng
>
> On a machine with multiple GPUs, a Linux user has no control over which one is primary at boot time. This series tries to solve above mentioned

If anything, the primary graphics adapter is the one initialized by the firmware. I think our boot-up graphics also make this assumption implicitly.

But what's the use case for overriding this setting?

Best regards
Thomas

> problem by introduced the ->be_primary() function stub. The specific device drivers can provide an implementation to hook up with this stub by calling the vga_client_register() function.
>
> Once the driver bound the device successfully, VGAARB will call back to the device driver. To query if the device drivers want to be primary or not. Device drivers can just pass NULL if have no such needs.
>
> Please note that:
>
> 1) The ARM64, Loongarch, Mips servers have a lot PCIe slot, and I would like to mount at least three video cards.
>
> 2) Typically, those non-x86 machines don't have a good UEFI firmware support, which doesn't support select primary GPU as firmware stage. Even on x86, there are old UEFI firmwares which already made undesired decision for you.
>
> 3) This series is attempt to solve the remain problems at the driver level, while another series[1] of me is target to solve the majority of the problems at device level.
>
> Tested (limited) on x86 with four video card mounted, Intel UHD Graphics 630 is the default boot VGA, successfully override by ast2400 with ast.modeset=10 append at the kernel cmd line.
>
> $ lspci | grep VGA
>
>  00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
>  01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos XTX [Radeon HD 8490 / R5 235X OEM]
>  04:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
>  05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 720] (rev a1)
>
> $ sudo dmesg | grep vgaarb
>
>  pci :00:02.0: vgaarb: setting as boot VGA device
>  pci :00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
>  pci :01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  pci :04:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  pci :05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
>  vgaarb: loaded
>  ast :04:00.0: vgaarb: Override as primary by driver
>  i915 :00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
>  radeon :01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
>  ast :04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
>
> v2:
>     * Add a simple implementation for drm/i915 and drm/ast
>     * Pick up all tags (Mario)
> v3:
>     * Fix a mistake for drm/i915 implement
>     * Fix patch can not be applied problem because of merge conflict.
> v4:
>     * Focus on solve the real problem.
>
> v1,v2 at https://patchwork.freedesktop.org/series/120059/
>    v3 at https://patchwork.freedesktop.org/series/120562/
>
> [1] https://patchwork.freedesktop.org/series/122845/
>
> Sui Jingfeng (9):
>   PCI/VGA: Allowing the user to select the primary video adapter at boot time
>   drm/nouveau: Implement .be_primary() callback
>   drm/radeon: Implement .be_primary() callback
>   drm/amdgpu: Implement .be_primary() callback
>   drm/i915: Implement .be_primary() callback
>   drm/loongson: Implement .be_primary() callback
>   drm/ast: Register as a VGA client by calling vga_client_register()
>   drm/hibmc: Register as a VGA client by calling vga_client_register()
>   drm/gma500: Register as a VGA client by calling vga_client_register()
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c      | 11 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c         | 13 -
>  drivers/gpu/drm/ast/ast_drv.c                   | 31 ++
>  drivers/gpu/drm/gma500/psb_drv.c                | 57 ++-
>  .../gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c     | 15 +
>  drivers/gpu/drm/i915/display/intel_vga.c        | 15 -
>  drivers/gpu/drm/loongson/loongson_module.c      |  2 +-
>  drivers/gpu/drm/loongson/loongson_module.h      |  1 +
>  drivers/gpu/drm/loongson/lsdc_drv.c             | 10 +++-
>  drivers/gpu/drm/nouveau/nouveau_vga.c           | 11 +++-
>  drivers/gpu/drm/radeon/radeon_device.c          | 10 +++-
>  drivers/pci/vgaarb.c                            | 43 --
>  drivers/vfio/pci/vfio_pci_core.c                |  2 +-
>  include/linux/vgaarb.h                          |  8 ++-
>  14 files changed, 210 insertions(+), 19 deletions(-)

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)