Comments inline

On 9/13/21 2:31 PM, Jason Edgecombe wrote:
On the original server (ubuntu 20.04 w/AMD), I ran the sanity check instructions from the user guide and glxinfo returns an error:

```
% /opt/VirtualGL/bin/glxinfo -display :2 -c > /dev/null
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
X Error of failed request:  GLXBadContext
   Major opcode of failed request:  151 (GLX)
   Minor opcode of failed request:  6 (X_GLXIsDirect)
   Serial number of failed request:  141
   Current serial number in output stream:  140
```

I also found the following interesting logs in the Xorg stdout:

```
 MESA-LOADER: failed to open mgag200: /opt/amdgpu/lib/x86_64-linux-gnu/dri/mgag200_dri.so: cannot open shared object file: No such file or directory (search paths /opt/amdgpu/lib/x86_64-linux-gnu/dri)
  failed to load driver: mgag200
 amdgpu_device_initialize: DRM version is 1.0.0 but this driver is only compatible with 3.x.x.
 amdgpu_device_initialize: DRM version is 1.0.0 but this driver is only compatible with 3.x.x.
```

I recall now that the AMDGPU-PRO driver is not multi-GPU-friendly. Specifically, it installs its own OpenGL libraries and DRI driver under /opt/amdgpu-pro rather than using GLVND, so the AMDGPU-PRO driver is all-or-nothing with respect to a particular X.org instance. On my RHEL 8 system, which has both a Quadro and a Radeon Pro WX2100, I had to switch to the open source AMDGPU driver so I could attach the GPUs to separate screens of the same X server. (AMDGPU is GLVND-aware, but AMDGPU-PRO is not.) In your case, however, since you're using a dedicated X server for your AMD GPU, I would suggest manipulating LD_LIBRARY_PATH so that it contains /opt/amdgpu-pro/lib* only when starting or attempting to use your dedicated X.org instance (:2), but not otherwise. That may eliminate some of the error messages above, although it's unclear whether those error messages are related to the issue at hand. You definitely need to make sure that LD_LIBRARY_PATH contains /opt/amdgpu-pro/lib* before you attempt to use VirtualGL.
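
One way to scope LD_LIBRARY_PATH like that is a small wrapper, sketched below. This is only an illustration: the AMDGPU_PRO_PREFIX variable and both function names are made up for this example, and the set of lib* directories that actually exists under /opt/amdgpu-pro will depend on your installation.

```shell
#!/bin/sh
# Sketch only: scope the AMDGPU-PRO libraries to commands that use the
# dedicated X server, instead of exporting LD_LIBRARY_PATH globally.
# AMDGPU_PRO_PREFIX and both function names are hypothetical.

AMDGPU_PRO_PREFIX=${AMDGPU_PRO_PREFIX:-/opt/amdgpu-pro}

# Print the existing $AMDGPU_PRO_PREFIX/lib* directories, colon-separated.
amdgpu_pro_libpath() {
    result=
    for dir in "$AMDGPU_PRO_PREFIX"/lib*; do
        # An unmatched glob stays literal, so keep real directories only.
        [ -d "$dir" ] || continue
        result=${result:+$result:}$dir
    done
    printf '%s\n' "$result"
}

# Run "$@" with those directories prepended to LD_LIBRARY_PATH.
# The calling shell's own LD_LIBRARY_PATH is left unchanged.
with_amdgpu_pro() {
    extra=$(amdgpu_pro_libpath)
    if [ -z "$extra" ]; then
        "$@"    # nothing found: run the command unmodified
    elif [ -n "${LD_LIBRARY_PATH:-}" ]; then
        env LD_LIBRARY_PATH="$extra:$LD_LIBRARY_PATH" "$@"
    else
        env LD_LIBRARY_PATH="$extra" "$@"
    fi
}
```

With those functions sourced, the X server would be started as "with_amdgpu_pro /usr/lib/xorg/Xorg :2 -config /etc/X11/xorg.2.conf" and an application as "with_amdgpu_pro vglrun /opt/VirtualGL/bin/glxspheres64", while everything else on the machine keeps its normal library path. If Xorg is launched from a service unit, the same value would have to be set in the unit's environment instead.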


I was able to install Ubuntu 20.04 on a VM with an Nvidia Quadro w/PCI pass-through. It doesn't hang, but it still fails with the following message:

```
% vglrun /opt/VirtualGL/bin/glxspheres64
[VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
[VGL]    10.17.150.240, the IP address of your SSH client.
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x21
Context is Direct
OpenGL Renderer: Quadro M2000/PCIe/SSE2
X Error of failed request:  BadRequest (invalid request code or no such operation)
   Major opcode of failed request:  130 (MIT-SHM)
   Minor opcode of failed request:  1 (X_ShmAttach)
   Serial number of failed request:  13
   Current serial number in output stream:  14
```

That seems like an unrelated issue, something amiss with the 2D X server's MIT-SHM extension. You can try setting VGL_USEXSHM=0 in the environment to work around the issue.
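
As a sketch of that workaround, the variable can be set for a single command rather than for the whole session. The helper name without_xshm is made up for this example.

```shell
#!/bin/sh
# Sketch: set VGL_USEXSHM=0 for one command only, so the rest of the
# session is unaffected. without_xshm is a hypothetical helper name;
# VGL_USEXSHM is the variable named in the message above.
without_xshm() {
    env VGL_USEXSHM=0 "$@"
}
```

For the failing case above, that would be invoked as "without_xshm vglrun /opt/VirtualGL/bin/glxspheres64".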


It's worth noting that my console blanks out on all of my working servers when the VGL Xorg process is running and comes back when the process stops. I'm not sure how to make Xorg run on a different virtual console, but I would love to fix my blank console issue.

You should be able to pass 'vtXX' to the Xorg process, where XX is the number of a virtual console.
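
For illustration, here is how that argument would fit into the Xorg command line quoted earlier in this thread. The function name and the choice of virtual console 8 are assumptions made for this sketch; pick a console that nothing else on the machine uses.

```shell
#!/bin/sh
# Sketch: print an Xorg command line pinned to a chosen virtual console.
# The binary path and config file are the ones quoted earlier in this
# thread; xorg_on_vt and the VT number used below are hypothetical.
xorg_on_vt() {
    # $1: display number, $2: virtual console number, $3: config file
    printf '%s :%s vt%s -config %s\n' /usr/lib/xorg/Xorg "$1" "$2" "$3"
}

xorg_on_vt 2 8 /etc/X11/xorg.2.conf
```

The printed line is what would go into the service definition that currently starts Xorg on display :2.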


Thanks,
Jason
---------------------------------------------------------------------------
Jason Edgecombe | Linux Administrator
UNC Charlotte | Office of OneIT
9201 University City Blvd. | Charlotte, NC 28223-0001
Phone: 704-687-1943
[email protected] | oneit.charlotte.edu
---------------------------------------------------------------------------
If you are not the intended recipient of this transmission or a person responsible for delivering it to the intended recipient, any disclosure, copying, distribution, or other use of any of the information in this transmission is strictly prohibited. If you have received this transmission in error, please notify me immediately by reply e-mail or by telephone at 704-687-1943.  Thank you.


On Mon, Sep 13, 2021 at 12:58 PM DRC <[email protected]> wrote:

    Some questions/comments:

    - Is a display manager running on the Matrox card whenever the issue
    occurs?  What happens if you stop the display manager?

    - We've established that the issue didn't occur on Ubuntu 16.04 with
    an AMD GPU, but can you test whether the issue occurs with Ubuntu
    20.04 and an nVidia GPU or with RHEL 8 and an AMD GPU?  Those would
    be useful data points.  If the issue occurs with RHEL 8 and an AMD
    GPU, then that is a configuration I can reproduce in my own lab, but
    if the issue is specific to Ubuntu 20.04, then unfortunately I won't
    be able to reproduce the failing configuration.

    - I also wonder aloud whether this could have something to do with
    virtual consoles.  It would be worth investigating whether your
    dedicated 3D X server is launching on a different virtual console.
    If so, does switching input to that virtual console eliminate the
    issue?  If so, then it may be that your headless X.org instance is
    still connected to the physical mouse and keyboard somehow.

    I don't have a lot of experience with running separate X servers for
    local logins and VirtualGL, so these are just shots in the dark at
    the moment.

    DRC

    On 9/13/21 10:29 AM, Jason Edgecombe wrote:
    Hi DRC,

    I'm not using Display :0 for VirtualGL. The console graphics card
    is a Matrox. The 2nd graphics card is an AMD Radeon Pro card in a
    headless config that is used for VirtualGL. This is an HP server,
    if that matters.

    I have run "vglserver_config -config +s +f -t" to
    configure VirtualGL. I'm not sure that my use case is covered in
    the User's Guide.

    To reduce confusion, I've moved Xorg for the AMD card to
    DISPLAY=:2, but it still fails. Xorg on display :2 is run as a
    service with the following command: "/usr/lib/xorg/Xorg :2 -config
    /etc/X11/xorg.2.conf". Display 2 doesn't run GDM or any kind of
    greeter. It's just raw Xorg that is dedicated to VirtualGL. I have
    similar configs working on RHEL7, and RHEL8 with Nvidia cards, and
    Ubuntu 16.04 with an identical AMD card.

    glxspheres works directly and confirms that access is granted:

        DISPLAY=:2 /opt/VirtualGL/bin/glxspheres64
        Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
        libGL error: unable to load driver: swrast_dri.so
        libGL error: failed to load driver: swrast
        GLX FB config ID of window: 0x187 (8/8/8/8)
        Visual ID of window: 0x4e7
        Context is Direct
        OpenGL Renderer: Radeon Pro WX 9100
        270.736061 frames/sec - 302.141444 Mpixels/sec
        329.714567 frames/sec - 367.961457 Mpixels/sec
        ^C


    but vglrun hangs:

        % env | grep VGL
        VGL_FORCEALPHA=1
        VGL_COMPRESS=proxy
        VGL_DISPLAY=:2
        % VGL_TRACE=1 vglrun glxinfo
        [VGL] NOTICE: Automatically setting VGL_CLIENT environment
        variable to
        [VGL]    10.17.150.240, the IP address of your SSH client.
        [VGL 0x223daf00] XOpenDisplay (name=NULL [VGL] dlopen
        (filename=libX11-xcb.so.1 flag=258 retval=0x55dbe47c96b0)
        [VGL] dlopen (filename=libxcb.so.1 flag=258 retval=0x7fef227bd530)
        [VGL] dlopen (filename=libxshmfence.so.1 flag=258
        retval=0x55dbe47c9d30)
        [VGL] dlopen (filename=libxcb-dri3.so.0 flag=258
        retval=0x55dbe47ca310)
        [VGL] dlopen (filename=libxcb-dri2.so.0 flag=258
        retval=0x55dbe47ca920)
        [VGL] dlopen (filename=libxcb-randr.so.0 flag=258
        retval=0x55dbe47caf30)
        [VGL] dlopen (filename=libxcb-sync.so.1 flag=258
        retval=0x55dbe47cb540)
        [VGL] dlopen (filename=libX11.so.6 flag=258 retval=0x7fef2337e4f0)
        [VGL] dlopen (filename=libxcb-present.so.0 flag=258
        retval=0x55dbe47cbc60)
        [VGL] dlopen (filename=libxcb-glx.so.0 flag=258
        retval=0x55dbe47cc280)
        [VGL] dlopen (filename=libXfixes.so.3 flag=258
        retval=0x55dbe47cc890)
        [VGL] dlopen (filename=libXdamage.so.1 flag=258
        retval=0x55dbe47cd080)
        [VGL] dlopen (filename=libXext.so.6 flag=258
        retval=0x7fef227bc510)
        [VGL] dlopen (filename=libXxf86vm.so.1 flag=258
        retval=0x55dbe47cd750)
        [VGL] dlopen (filename=libXau.so.6 flag=258 retval=0x7fef22404000)
        [VGL] dlopen (filename=libXdmcp.so.6 flag=258
        retval=0x7fef22404510)
        dpy=0x55dbe47cddf0(localhost:10.0) ) 160.062075 ms
        name of display: localhost:10.0
        [VGL 0x223daf00] glXChooseVisual
        (dpy=0x55dbe47cddf0(localhost:10.0) screen=0
        attrib_list=[0x0004 0x0008=0x0001 0x0009=0x0001 0x000a=0x0001
        0x000c=0x0001 0x000d=0x0001 0x000e=0x0001 0x000f=0x0001
        0x0010=0x0001 0x0011=0x0001 0x0005 ] glxattribs=[0x000c=0x0001
        0x000d=0x0001 0x000e=0x0001 0x000f=0x0001 0x0010=0x0001
        0x0011=0x0001 0x0005=0x0001 0x0008=0x0001 0x0009=0x0001
        0x000a=0x0001 0x000b=0x0001 0x8011=0x0001 0x8010=0x0006 ]
        [VGL] dlopen (filename=NULL flag=258 retval=0x7fef233cf190)
        ^C
        %

    Thanks,
    Jason


    On Mon, Sep 13, 2021 at 10:43 AM DRC <[email protected]> wrote:


        With recent (Wayland-enabled) releases of GDM, the greeter (the
        program that displays and processes the login prompt) is a
        Wayland application, so the display manager doesn't normally run
        an X.org instance while it is sitting at the login prompt.  Since
        VirtualGL's GLX back end needs an X.org instance in order to
        communicate with the GPU, vglserver_config in VGL 2.6.2 and later
        modifies the display manager configuration so that the greeter
        uses X.org instead of Wayland (but note that this modification
        makes it impossible to log in locally to the machine using a
        Wayland session).  Recent (Wayland-enabled) releases of GDM also
        use a different instance of X.org for the greeter and the
        logged-in session.  Thus, the expected behavior once you have
        followed the setup instructions in the VirtualGL User's Guide is:

        (a) Whenever the DM is sitting at the login prompt, Display :0
        will work as a 3D X server display (VGL_DISPLAY) for all
        VirtualGL users on the system.

        (b) Whenever someone is logged in, Display :1 will work as a 3D X
        server display (VGL_DISPLAY) for the logged-in user only.

        Assuming you have properly configured the machine per the
        instructions in the VirtualGL User's Guide, (b) seems to not be
        true for your machine.  I'm not sure why that's the case.  Can
        you confirm whether (a) is true?

        DRC

        On 9/13/21 9:07 AM, Jason Edgecombe wrote:
        > Hello,
        >
        > I'm having trouble getting VirtualGL to work on Ubuntu 20.04.
        > Running `vglrun glxinfo` or any other 3D program just hangs. I
        > have a server with an AMD Radeon PRO WX 9100 GPU as the 2nd
        > graphics card. I've installed the
        > amdgpu-pro-21.10-1263777-ubuntu-20.04 AMD driver. I can run
        > glxspheres directly on the AMD card (DISPLAY=:1), but running
        > `vglrun glxspheres` just hangs.
        >
        > The log below shows that glxspheres runs fine on the AMD card
        > (DISPLAY=:1) without vglrun:
        >
        > ```
        > % env | grep VGL
        > VGL_FORCEALPHA=1
        > VGL_COMPRESS=proxy
        > VGL_DISPLAY=:1
        > VGL_VERBOSE=1
        > % DISPLAY=:1 /opt/VirtualGL/bin/glxspheres64
        > Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
        > libGL error: unable to load driver: swrast_dri.so
        > libGL error: failed to load driver: swrast
        > Visual ID of window: 0x4e7
        > Context is Direct
        > OpenGL Renderer: Radeon Pro WX 9100
        > 326.487701 frames/sec - 364.360274 Mpixels/sec
        > 309.610468 frames/sec - 345.525283 Mpixels/sec
        > 327.475757 frames/sec - 365.462944 Mpixels/sec
        > ^C
        > %
        > ```
        >
        > Here is an example of `vglrun glxinfo` hanging:
        >
        > ```
        > % VGL_TRACE=1 vglrun glxinfo
        > [VGL] NOTICE: Automatically setting VGL_CLIENT environment variable to
        > [VGL]    10.17.150.240, the IP address of your SSH client.
        > [VGL] Shared memory segment ID for vglconfig: 10
        > [VGL 0xc4a0ce40] XOpenDisplay (name=NULL [VGL] VirtualGL v2.6.1 64-bit
        > (Build 20190411)
        > [VGL] dlopen (filename=libX11-xcb.so.1 flag=258 retval=0x5586355a96b0)
        > [VGL] dlopen (filename=libxcb.so.1 flag=258 retval=0x7f04c4def530)
        > [VGL] dlopen (filename=libxshmfence.so.1 flag=258 retval=0x5586355a9d30)
        > [VGL] dlopen (filename=libxcb-dri3.so.0 flag=258 retval=0x5586355aa310)
        > [VGL] dlopen (filename=libxcb-dri2.so.0 flag=258 retval=0x5586355aa920)
        > [VGL] dlopen (filename=libxcb-randr.so.0 flag=258 retval=0x5586355aaf30)
        > [VGL] dlopen (filename=libxcb-sync.so.1 flag=258 retval=0x5586355ab540)
        > [VGL] dlopen (filename=libX11.so.6 flag=258[VGL] NOTICE: Replacing
        > dlopen("libX11.so.6") with dlopen("libvglfaker.so")
        >   retval=0x7f04c59a34f0)
        > [VGL] dlopen (filename=libxcb-present.so.0 flag=258 retval=0x5586355abc60)
        > [VGL] dlopen (filename=libxcb-glx.so.0 flag=258 retval=0x5586355ac280)
        > [VGL] dlopen (filename=libXfixes.so.3 flag=258 retval=0x5586355ac890)
        > [VGL] dlopen (filename=libXdamage.so.1 flag=258 retval=0x5586355ad080)
        > [VGL] dlopen (filename=libXext.so.6 flag=258 retval=0x7f04c4dee510)
        > [VGL] dlopen (filename=libXxf86vm.so.1 flag=258 retval=0x5586355ad750)
        > [VGL] dlopen (filename=libXau.so.6 flag=258 retval=0x7f04c4a36000)
        > [VGL] dlopen (filename=libXdmcp.so.6 flag=258 retval=0x7f04c4a36510)
        > dpy=0x5586355addf0(localhost:10.0) ) 193.939924 ms
        > name of display: localhost:10.0
        > [VGL 0xc4a0ce40] glXChooseVisual (dpy=0x5586355addf0(localhost:10.0)
        > screen=0 attrib_list=[0x0004 0x0008=0x0001 0x0009=0x0001 0x000a=0x0001
        > 0x000c=0x0001 0x000d=0x0001 0x000e=0x0001 0x000f=0x0001 0x0010=0x0001
        > 0x0011=0x0001 0x0005 ] glxattribs=[0x000c=0x0001 0x000d=0x0001
        > 0x000e=0x0001 0x000f=0x0001 0x0010=0x0001 0x0011=0x0001 0x0005=0x0001
        > 0x8011=0x0001 0x0008=0x0001 0x0009=0x0001 0x000a=0x0001 0x000b=0x0001
        > 0x8010=0x0006 0x0022=0x8002 ] [VGL] Opening connection to 3D X server :1
        > [VGL] dlopen (filename=NULL flag=258 retval=0x7f04c59f4190)
        > ```
        >
        > I had this working fine under Ubuntu 16.04 on the exact same hardware.
        >
        > Does anyone know how I can fix this?
        >
        > Thanks,
        > Jason

    --
    You received this message because you are subscribed to the Google
    Groups "VirtualGL User Discussion/Support" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected].
    To view this discussion on the web visit
    https://groups.google.com/d/msgid/virtualgl-users/7f20057e-0913-615e-f48f-0ed1bfb64583%40virtualgl.org.
