[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Ilija Hadzic

>
> So I think we do have enough people interested in this and should be able
> to cobble together something that does The Right Thing.
>

We indeed have a non-trivial set of people interested in the same set of 
problems, and each of us has a partial and maybe competing solution. I want 
to make it clear that my proposal, however disruptive and different from 
the plan-of-record it may be, should not be viewed as destructive or 
distracting. I am just offering to the community what I think is useful.

If this discussion sparks some joint effort that will bring us to the 
solution that everyone is happy with, even if no line of my code is found 
useful, I am perfectly fine with that (and I'll join the effort).

So at this point I think I should put out my back-of-the-napkin 
desiderata. That will hopefully shed some light on where I am coming from 
with the VCRTCM proposal.

I want to be able to pull pixels out of the GPU and redirect them to an 
arbitrary device that can do something useful with them. This should not 
be limited to shooting photons into human eyeballs. I want to be able to 
run my applications without having to run X. I'd like the solution to be 
transparent to the application; that is, if I can write an application 
that renders something to a full screen, I want to redirect that 
"screen" wherever I want without having to rewrite, recompile or relink 
the application. Actually, I want to do that redirection at runtime. I'd 
like to support all of the above in a way that also helps solve more 
imminent shortcomings of the Linux graphics system (Optimus, DisplayLink, 
etc. ... cf. previous e-mails in this thread). I'd like it to work with 
multiple render nodes on the same GPU (something like Dave's multiseat 
work, in which both the GPU and its display resources are virtual).

The logical consequence of this is that the render node and the display 
node should at some point become logically separate (different driver 
modules) even if they are physically on the same GPU. They are really two 
different subsystems that just happen to reside on the same circuit 
board, so it makes sense to separate them.

I don't think anything I am saying is unique, and it probably 
overlaps in good part with what others also want from the graphics 
subsystem. I can see the role of VCRTCM in all of the above, but I am 
open-minded. If we end up with a solution that has nothing to do with 
VCRTCM, I have no emotional ties to my code (or to the code of the 
colleagues who have worked with me so far).

-- Ilija




[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Ilija Hadzic


On Thu, 24 Nov 2011, Dave Airlie wrote:

> Okay so that's pretty much how I expected it to work, I don't think
> Virtual makes sense for a displaylink attached device though,
> again if you were using a real driver you would just re-use whatever
> output type it uses, though I'm not sure how well that works,

That is a consequence of the fact that virtual CRTCs are created at 
startup time, when the attached CTD is not known, while CTDs are attached at 
runtime. So when I register the virtual CRTC and the associated connector, 
I have to use something for the connector type.

Admitting that my logic is biased by my design, to me the "Virtual" connector 
type indicates that, from the GPU's perspective, this is a connector that 
does not physically exist and is yet to be attached to some real display 
device. At that point the properties of the attached display become known 
to the system.
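
For illustration, here is a minimal sketch of what that registration and the 
later attach step could look like against the stock DRM API. The vcrtc_* 
names are made up, not the actual VCRTCM entry points; 
DRM_MODE_CONNECTOR_VIRTUAL is the type added by the VMware patch referenced 
elsewhere in this thread; error handling is elided:

#include <drm/drmP.h>
#include <drm/drm_crtc.h>

/* Register a CRTC/connector pair that has no physical port behind it. */
static void vcrtc_register(struct drm_device *dev,
			   struct drm_crtc *crtc,
			   struct drm_connector *connector,
			   const struct drm_crtc_funcs *crtc_funcs,
			   const struct drm_connector_funcs *conn_funcs)
{
	drm_crtc_init(dev, crtc, crtc_funcs);
	/* Before the Virtual type existed, an existing type such as
	 * DisplayPort had to be "abused" here. */
	drm_connector_init(dev, connector, conn_funcs,
			   DRM_MODE_CONNECTOR_VIRTUAL);
}

/* Runtime attach of a CTD: the display properties are now known, so
 * report the connector as connected and notify userspace exactly as
 * a monitor plug would. */
static void vcrtc_ctd_attached(struct drm_device *dev,
			       struct drm_connector *connector)
{
	connector->status = connector_status_connected;
	drm_sysfs_hotplug_event(dev);
}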

>
> Do you propagate full EDID information and all the modes or just the
> supported modes? we use this in userspace to put monitor names in
> GNOME display settings etc.
>

Right now we propagate the entire list of modes that the attached CTD 
device has queried from the connected display (monitor). Propagating the 
full EDID is really easy to add. That is, if the CTD is a driver for some 
real display. If the CTD is just a "make-believe" display whose purpose is 
to be the conduit to some other pixel-processing component (e.g. V4L2CTD), 
then at some point in the chain we have to make up the set of modes that 
the logical display accepts, and in that case the EDID does not exist by 
definition.
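
As a sketch of that "make up the modes" case (assuming the stock 
drm_add_modes_noedid() helper; the hook name is illustrative), the CTD's 
get_modes callback could simply synthesize a standard mode list up to 
whatever ceiling the device can handle:

#include <drm/drmP.h>
#include <drm/drm_crtc.h>

/* get_modes hook for a "make-believe" display with no EDID: fill the
 * probed-modes list with the DRM standard modes up to 1920x1080 and
 * return how many were added. */
static int v4l2ctd_get_modes(struct drm_connector *connector)
{
	return drm_add_modes_noedid(connector, 1920, 1080);
}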

> what does xrandr output look like for a radeon GPU with 4 vcrtcs? do
> you see 4 disconnected connectors? that again isn't a pretty user
> experience.
>

Yes, it shows 4 disconnected monitors. To me that is a logical consequence 
of the design, in which virtual CRTCs and associated virtual connectors are 
always there. By now, it's clear to me that you are not too thrilled 
about it, but please allow me to turn the question back to you: in your 
solution with a udl-v2 driver and a dedicated DDX for it, can you do the big 
desktop that spans across the GPU's local and "foreign" displays and have 
acceleration on both? If not, what would it take to get there, and how 
complex would the end result be?

I'll get to the Optimus/PRIME use case later, but if we for the moment focus 
on the use case in which a dumb framebuffer device extends the number of 
displays of a rendering-capable GPU, I think that VCRTCM offers quite a 
complete and universal solution, and it is completely transparent with 
regard to the application, window manager, and display server.

Radeon + DisplayLink is the specific example, but in general it's any GPU 
+ any fbdev. It's not just one use case; it's a whole class of use cases 
that follow the same principle, and for them the VCRTC alone suffices.


> My main problem with this is, as I'll explain below, that it only covers
> some of the use cases, and I don't want a 50% solution at this point. By
> doing something like this you are making it harder to get proper
> support into something like wayland, as they can ignore some of the
> problems; however, since this doesn't solve all the other problems, it
> means getting to a finished solution is actually less likely to
> happen.
>

I presume that by "50% solution" you are referring to the Optimus/PRIME use 
case. That case actually consists of two related, but different, problems. 
The first is "render on node X and display on node Y"; the second is 
"dynamically and hitlessly switch rendering between node X and Y".

I have never claimed that VCRTCs solve the second problem (I could switch 
by restarting Xorg, but I know that this is not the solution you are 
looking for). I fully understand why you want both problems solved at the 
same time. However, I don't understand why solving one first would inhibit 
solving the other.

On the other hand, the Radeon + DisplayLink tandem use case (or, in general, 
the GPU + fbdev tandem) consists only of the "render on X, display on Y" 
problem. Here you will probably say that one can also switch there between 
hardware and software rendering, so it has both problems too. That is true, 
but unlike the Optimus/PRIME use case, using an fbdev as a display extension 
to a GPU is still useful alone. My point is that there is value in solving 
one problem first and then following with the other.

I think the crux of the problem is that you are not convinced that the 
VCRTCM solution for problem #1 will make solving problem #2 easier, and 
maybe you are afraid that it will make it harder. If that's a fair 
statement, and if having me create an existence proof for problem #2 that 
still uses VCRTCM will help bring our positions closer, I am perfectly 
willing to do so. I guess I've just signed up for some hacking ;-)

Note that for hitless GPU switching, I fully agree that support must be in 
userspace (you have to swap out paths in Mesa and DDX before even getting 
to the kernel), but like I said, that is a 

[PATCH 3/3 v2] drm/i915: hot plug/unplug notification to HDMI audio driver

2011-11-24 Thread Wu Fengguang
On Thu, Nov 24, 2011 at 03:26:49AM +0800, Keith Packard wrote:
> On Wed, 23 Nov 2011 16:29:58 +0800, Wu Fengguang wrote:
> 
> > What I need is a hot plug hook that knows whether the monitor is
> > plugged or removed, which is only possible if the hook is called
> > after ->detect().
> 
> That would be mode_set to tell you that the monitor is in use, and the
> disable function to tell you when the monitor is no longer in use.
> 
> You do not want to do anything to the hardware in the hot_plug paths;
> those are strictly informative; telling user space which connectors are
> present.

Thanks a lot for the tips! When doing things in the right path, I got
a much reduced patch :-)

Due to DP being a bit more tricky than HDMI and no convenient DP test
environment, I'll have to delay the DP part to next week...

Thanks,
Fengguang
---
Subject: drm/i915: HDMI hot remove notification to audio driver
Date: Fri Nov 11 13:49:04 CST 2011

On HDMI monitor hot remove, clear SDVO_AUDIO_ENABLE accordingly, so that
the audio driver will receive hot plug events and take action to refresh
its device state and ELD contents.

The cleared SDVO_AUDIO_ENABLE bit needs to be restored to prevent losing
HDMI audio after DPMS on.

CC: Wang Zhenyu 
Signed-off-by: Wu Fengguang 
---
 drivers/gpu/drm/i915/intel_dp.c   |4 +++-
 drivers/gpu/drm/i915/intel_hdmi.c |8 ++--
 2 files changed, 9 insertions(+), 3 deletions(-)

--- linux.orig/drivers/gpu/drm/i915/intel_hdmi.c	2011-11-24 17:11:38.0 +0800
+++ linux/drivers/gpu/drm/i915/intel_hdmi.c	2011-11-24 17:15:03.0 +0800
@@ -269,6 +269,10 @@ static void intel_hdmi_dpms(struct drm_e
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder);
u32 temp;
+   u32 enable_bits = SDVO_ENABLE;
+
+   if (intel_hdmi->has_audio)
+   enable_bits |= SDVO_AUDIO_ENABLE;

temp = I915_READ(intel_hdmi->sdvox_reg);

@@ -281,9 +285,9 @@ static void intel_hdmi_dpms(struct drm_e
}

if (mode != DRM_MODE_DPMS_ON) {
-   temp &= ~SDVO_ENABLE;
+   temp &= ~enable_bits;
} else {
-   temp |= SDVO_ENABLE;
+   temp |= enable_bits;
}

I915_WRITE(intel_hdmi->sdvox_reg, temp);


[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Dave Airlie
On Thu, Nov 24, 2011 at 12:58 PM, Alan Cox  wrote:
>> The thing is this is how optimus works, the nvidia gpus have an engine
>> that you can program to move data from the nvidia tiled VRAM format to
>
> This is even more of a special case than DisplayLink ;-)
>
>> Probably a good idea to do some more research on intel/nvidia GPUs.
>> With intel you can't read back from UMA since it'll be uncached memory
>> so unusable, so you'll need to use the GPU to detile and move to some
>> sort of cached linear area you can readback from.
>
> It's main memory so there are various ways to read it or pull it into
> cached space.

We have no way to detile on the CPU for lots of intel corner cases; I don't 
hold out for it being a proper solution, though in theory for hibernate it's 
a requirement to figure out. But you can expose stuff via the GTT using 
fences to detile; you can't then get cached access to it. So no, not really 
various ways, and none of them useful or faster than getting the GPU to 
blit somewhere linear and flipping the dest mapping.

>
>> So if I merge this VCRTC stuff I give a lot of people an excuse for not
>> bothering to fix the harder problems that hotplug and dynamic GPUs put
>> in front of you.
>
> I think both cases are slightly missing the mark, both are specialist
> corner cases and once you add things like cameras to the mix that will
> become even more painfully obvious.
>
> The underlying need I think is a way to negotiate a shared buffer format
> or pipeline between two devices. You also need in some cases to think
> about shared fencing, and that is the bit that is really scary.
>
> Figuring out the transform from A to B ('let's both use this buffer
> format') or 'I can render then convert' is one thing. Dealing with two
> GPUs firing into the same buffer while scanning it out I just pray
> doesn't ever need shared fences.

But we have a project looking into all that, called dmabuf, we also
have the PRIME work which we hope to build on top of dmabuf.

The thing is there are lots of building blocks we need to put in
place, and we've mostly identified what they are; it's just typing now.

Dave.


[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Alan Cox
> The thing is this is how optimus works, the nvidia gpus have an engine
> that you can program to move data from the nvidia tiled VRAM format to

This is even more of a special case than DisplayLink ;-)

> Probably a good idea to do some more research on intel/nvidia GPUs.
> With intel you can't read back from UMA since it'll be uncached memory
> so unusable, so you'll need to use the GPU to detile and move to some
> sort of cached linear area you can readback from.

It's main memory so there are various ways to read it or pull it into
cached space.

> So if I merge this VCRTC stuff I give a lot of people an excuse for not
> bothering to fix the harder problems that hotplug and dynamic GPUs put
> in front of you.

I think both cases are slightly missing the mark, both are specialist
corner cases and once you add things like cameras to the mix that will
become even more painfully obvious.

The underlying need I think is a way to negotiate a shared buffer format
or pipeline between two devices. You also need in some cases to think
about shared fencing, and that is the bit that is really scary.

Figuring out the transform from A to B ('let's both use this buffer
format') or 'I can render then convert' is one thing. Dealing with two
GPUs firing into the same buffer while scanning it out I just pray
doesn't ever need shared fences.

Alan
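
For illustration, a purely hypothetical sketch of the negotiation Alan 
describes; none of these names exist in any kernel API. Each device 
advertises the buffer layouts it can handle, and a negotiator picks a common 
one or signals that a render-then-convert step is unavoidable:

#include <linux/types.h>

struct xdev_buffer_caps {
	u32 fourcc[8];		/* pixel formats this side can handle */
	u32 num_formats;
	bool tiled;		/* accepts tiled layouts? */
};

/* Return a format both sides support ("let's both use this buffer
 * format"), or 0 to fall back to "I can render then convert". */
static u32 xdev_negotiate_format(const struct xdev_buffer_caps *a,
				 const struct xdev_buffer_caps *b)
{
	u32 i, j;

	for (i = 0; i < a->num_formats; i++)
		for (j = 0; j < b->num_formats; j++)
			if (a->fourcc[i] == b->fourcc[j])
				return a->fourcc[i];
	return 0;
}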




vmwgfx: strange loop in vmw_kms_update_layout_ioctl()

2011-11-24 Thread Xi Wang
Hi,

I came across this code snippet at vmwgfx_kms.c:2000.  The loop variable i is 
never used and rects is checked again and again.  Should it be something like 
rects[i].x instead of rects->x?  Thanks.

for (i = 0; i < arg->num_outputs; ++i) {
if (rects->x < 0 ||
rects->y < 0 ||
rects->x + rects->w > mode_config->max_width ||
rects->y + rects->h > mode_config->max_height) {
DRM_ERROR("Invalid GUI layout.\n");
ret = -EINVAL;
goto out_free;
}
}

- xi
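
For reference, the fix the question points at would index the array; a 
sketch only, the actual patch may differ:

for (i = 0; i < arg->num_outputs; ++i) {
        /* validate each rect, not rects[0] over and over */
        if (rects[i].x < 0 ||
            rects[i].y < 0 ||
            rects[i].x + rects[i].w > mode_config->max_width ||
            rects[i].y + rects[i].h > mode_config->max_height) {
                DRM_ERROR("Invalid GUI layout.\n");
                ret = -EINVAL;
                goto out_free;
        }
}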


[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Daniel Vetter
On Thu, Nov 24, 2011 at 08:52:45AM +, Dave Airlie wrote:
> So the main problem with taking all this code on-board is it sort of
> solves (a), and (b) needs another bunch of work. Now I'd rather not
> solve 50% of the issue and have future userspace apps just think they
> can ignore the problem. As much as I dislike the whole dual-gpu setups
> the fact is they exist and we can't change that, so writing userspace
> to ignore the problem because it's too hard isn't going to work. So if
> I merge this VCRTC stuff I give a lot of people an excuse for not
> bothering to fix the harder problems that hotplug and dynamic GPUs put
> in front of you.

My 2 Rappen on this: I agree completely with your point that we should aim
for a full solution. GPU memory management across different devices is
hard, but solvable.

Furthermore I fear that a 50% solution that hides the memory management
and shuffling issues from userspace will end up being a leaky abstraction
(e.g. how and when is stuff transferred to the usb dp port, the kernel
might pin scanout buffers behind userspace's back screwing up the vram
accounting in userspace, random hotplugging of outputs ...). Also
v4l/embedded folks have similar issues (and the same tendency to just go
with a "simple" solution fitting their usecase) and with Intel dead-set on
entering the SoC market I'll have the joy to mess around with this stuff
pretty soon, too.

So I think we do have enough people interested in this and should be able
to cobble together something that does The Right Thing.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413

2011-11-24 Thread Markus Trippelsdorf
On 2011.11.23 at 10:06 -0600, Christoph Lameter wrote:
> On Wed, 23 Nov 2011, Markus Trippelsdorf wrote:
> 
> > > FIX idr_layer_cache: Marking all objects used
> >
> > Yesterday I couldn't reproduce the issue at all. But today I've hit
> > exactly the same spot again. (CCing the drm list)
> 
> Well, this looks like a write after free.
> 
> > =
> > BUG idr_layer_cache: Poison overwritten
> > -
> > Object 8802156487c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> > Object 8802156487d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> > Object 8802156487e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> > Object 8802156487f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> > Object 880215648800: 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> > Object 880215648810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  
> > 
> 
> And it's an integer-sized write of 0. If you look at the struct definition
> and look up the offset, you should be able to locate the field that
> was modified.

Here are two more BUGs that seem to point to the same bug:

1)
...
Nov 21 18:30:30 x4 kernel: [drm] radeon: irq initialized.
Nov 21 18:30:30 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Nov 21 18:30:30 x4 kernel: [drm] Loading RS780 Microcode
Nov 21 18:30:30 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0xC004).
Nov 21 18:30:30 x4 kernel: radeon :01:05.0: WB enabled
Nov 21 18:30:30 x4 kernel: =
Nov 21 18:30:30 x4 kernel: BUG task_xstate: Not a valid slab page
Nov 21 18:30:30 x4 kernel: -
Nov 21 18:30:30 x4 kernel:
Nov 21 18:30:30 x4 kernel: INFO: Slab 0xea044300 objects=32767 used=65535 fp=0x  (null) flags=0x0401
Nov 21 18:30:30 x4 kernel: Pid: 9, comm: ksoftirqd/1 Not tainted 3.2.0-rc2-00274-g6fe4c6d-dirty #75
Nov 21 18:30:30 x4 kernel: Call Trace:
Nov 21 18:30:30 x4 kernel: [81101c1d] slab_err+0x7d/0x90
Nov 21 18:30:30 x4 kernel: [8103e29f] ? dump_trace+0x16f/0x2e0
Nov 21 18:30:30 x4 kernel: [81044764] ? free_thread_xstate+0x24/0x40
Nov 21 18:30:30 x4 kernel: [81044764] ? free_thread_xstate+0x24/0x40
Nov 21 18:30:30 x4 kernel: [81102566] check_slab+0x96/0xc0
Nov 21 18:30:30 x4 kernel: [814c5c29] free_debug_processing+0x34/0x19c
Nov 21 18:30:30 x4 kernel: [81101d9a] ? set_track+0x5a/0x190
Nov 21 18:30:30 x4 kernel: [8110cf2b] ? sys_open+0x1b/0x20
Nov 21 18:30:30 x4 kernel: [814c5e55] __slab_free+0x33/0x2d0
Nov 21 18:30:30 x4 kernel: [8110cf2b] ? sys_open+0x1b/0x20
Nov 21 18:30:30 x4 kernel: [81105134] kmem_cache_free+0x104/0x120
Nov 21 18:30:30 x4 kernel: [81044764] free_thread_xstate+0x24/0x40
Nov 21 18:30:30 x4 kernel: [81044794] free_thread_info+0x14/0x30
Nov 21 18:30:30 x4 kernel: [8106a4ff] free_task+0x2f/0x50
Nov 21 18:30:30 x4 kernel: [8106a5d0] __put_task_struct+0xb0/0x110
Nov 21 18:30:30 x4 kernel: [8106eb4b] delayed_put_task_struct+0x3b/0xa0
Nov 21 18:30:30 x4 kernel: [810aa01a] __rcu_process_callbacks+0x12a/0x350
Nov 21 18:30:30 x4 kernel: [810aa2a2] rcu_process_callbacks+0x62/0x140
Nov 21 18:30:30 x4 kernel: [81072e18] __do_softirq+0xa8/0x200
Nov 21 18:30:30 x4 kernel: [81073077] run_ksoftirqd+0x107/0x210
Nov 21 18:30:30 x4 kernel: [81072f70] ? __do_softirq+0x200/0x200
Nov 21 18:30:30 x4 kernel: [8108bb87] kthread+0x87/0x90
Nov 21 18:30:30 x4 kernel: [814cdcf4] kernel_thread_helper+0x4/0x10
Nov 21 18:30:30 x4 kernel: [8108bb00] ? kthread_flush_work_fn+0x10/0x10
Nov 21 18:30:30 x4 kernel: [814cdcf0] ? gs_change+0xb/0xb
Nov 21 18:30:30 x4 kernel: FIX task_xstate: Object at 0x8110cf2b not freed
Nov 21 18:30:30 x4 kernel: [drm] ring test succeeded in 1 usecs
Nov 21 18:30:30 x4 kernel: [drm] radeon: ib pool ready.
Nov 21 18:30:30 x4 kernel: [drm] ib test succeeded in 0 usecs
Nov 21 18:30:30 x4 kernel: [drm] Radeon Display Connectors
Nov 21 18:30:30 x4 kernel: [drm] Connector 0

2)
...
Nov 21 17:04:38 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
Nov 21 17:04:38 x4 kernel: Console: switching to colour frame buffer device 131x105
Nov 21 17:04:38 x4 kernel: fb0: radeondrmfb frame buffer device
Nov 21 17:04:38 x4 kernel: drm: registered panic notifier
Nov 21 17:04:38 x4 kernel: [drm] Initialized radeon 2.11.0 20080528 for :01:05.0 on minor 0
Nov 21 17:04:38 x4 kernel: loop: module loaded
Nov 21 17:04:38 x4 kernel: ahci :00:11.0: version 3.0
Nov 21 17:04:38 x4 kernel: ahci :00:11.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
Nov 21 17:04:38 x4 kernel: ahci :00:11.0: AHCI 0001.0100 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
Nov 21 17:04:38 x4 kernel: ahci :00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part ccc
Nov 21 17:04:38 x4 kernel: scsi0 : ahci
Nov 21 17:04:38 x4 kernel: scsi1 : ahci
Nov 21 17:04:38 x4 kernel: 

[RFC] Virtual CRTCs (proposal + experimental code)

2011-11-24 Thread Dave Airlie
On Thu, Nov 24, 2011 at 5:59 AM, Ilija Hadzic wrote:
>
>
> On Wed, 23 Nov 2011, Dave Airlie wrote:
>
>> So another question I have is how you would intend this to work from a
>> user POV, like how it would integrate with a desktop environment, X or
>> wayland, i.e. with little or no configuration.
>>
>
> First thing to understand is that when a virtual CRTC is created, it looks
> to the user like the GPU has an additional DisplayPort connector.
> At present I "abuse" DisplayPort, but I have seen that you pushed a patch
> from VMware that adds a Virtual connector, so eventually I'll switch to that
> naming. The number of virtual CRTCs is determined when the driver loads and
> that is a static configuration parameter. This does not restrict the user
> because unused virtual CRTCs are just like disconnected connectors on the
> GPU. In the extreme case, a user could max out the number of virtual CRTCs
> (i.e. 32 minus #-of-physical-CRTCs), but in general the system needs to be
> booted with the maximum number of anticipated CRTCs. Run-time addition and
> removal of CRTCs is not supported at this time; that would be much harder
> to implement and would affect the whole DRM module everywhere.
>
> So now we have a system that booted up and DRM sees all of its real
> connectors as well as virtual ones (as DisplayPorts at present). If there is
> no CTD device attached to virtual CRTCs, these virtual connectors are
> disconnected as far as DRM is concerned. Now the userspace must call
> "attach/fps" ioctl to associate CTDs with CRTCs. I'll explain shortly how to
> automate that and how to eliminate the burden from the user, but for now,
> please assume that "attach/fps" gets called from userland somehow.
>
> When the attach happens, that is a hotplug event (VCRTCM generates it) to
> DRM, just like someone plugged in the monitor. Then when XOrg starts, it
> will use the DisplayPort that represents a virtual CRTC just like any other
> connector. How it will use it will depend on what the xorg.conf says, but
> the key point is that this connector is no different from any other
> connector that the GPU provides and is thus used as an "equal citizen". No
> special configuration is necessary once attached to CTD.
>
> If CTD is detached and new CTD attached, that is just like yanking out
> monitor cable and plugging in the new one. DRM will get all hotplug events
> and windowing system will do the same thing it would normally do with any
> other port. If RANDR is called to resize the desktop it will also work and X
> will have no idea that one of the connectors is on a virtual CRTC. I also
> have another feature, where when CTD is attached, it can ask the device it
> drives for the connection status and propagate that all the way back to DRM
> (this is useful for CTD devices that drive real monitors, like DisplayLink).

Okay so that's pretty much how I expected it to work, I don't think
Virtual makes sense for a displaylink attached device though,
again if you were using a real driver you would just re-use whatever
output type it uses, though I'm not sure how well that works,

Do you propagate full EDID information and all the modes or just the
supported modes? we use this in userspace to put monitor names in
GNOME display settings etc.

what does xrandr output look like for a radeon GPU with 4 vcrtcs? do
you see 4 disconnected connectors? that again isn't a pretty user
experience.

> So this is your hotplug demo, but the difference is that the new desktop can
> use direct rendering. Also, everything that would work for a normal
> connector works here without having to do any additional tricks. RANDR also
> works seamlessly without having to do anything special. If you move away
> from Xorg, to some other system (Wayland?), it still works for as long as
> the new system knows how to deal with connectors that connect and
> disconnect.

My main problem with this is, as I'll explain below, that it only covers
some of the use cases, and I don't want a 50% solution at this point. By
doing something like this you are making it harder to get proper
support into something like wayland, as they can ignore some of the
problems; however, since this doesn't solve all the other problems, it
means getting to a finished solution is actually less likely to
happen.

>> I still foresee problems with tiling, we generally don't encourage
>> accel code to live in the kernel, and you'll really want a
>> tiled->untiled blit for this thing,
>
> Accel code should not go into the kernel (that I fully agree with) and
> there is nothing here that would behove us to do so. Restricting my
> comments to the Radeon GPU (which is the only one that I know well enough),
> shaders for blit copy live in the kernel irrespective of the VCRTCM work.
> I rely on them to move the frame buffer out of VRAM to the CTD device, but
> I don't add any additional features.
>
> Now for detiling, I think that it should be the responsibility of the
> receiving CTD device, not the GPU pushing the data 

No subject

2011-11-24 Thread Ilija Hadzic
are looking for. I recognize that it disrupts your current views/plans on 
how this should be done, but I do want to work with you to find a suitable 
middle ground that covers most of the possibilities.

In case you are looking at my code to follow the above-described 
scenarios, please make sure you pull the latest stuff from my github 
repository. I have been pushing new stuff since my original announcement.


> I still foresee problems with tiling, we generally don't encourage
> accel code to live in the kernel, and you'll really want a
> tiled->untiled blit for this thing,

Accel code should not go into the kernel (that I fully agree with) and there 
is nothing here that would behove us to do so. Restricting my comments to 
the Radeon GPU (which is the only one that I know well enough), shaders for 
blit copy live in the kernel irrespective of the VCRTCM work. I rely on 
them to move the frame buffer out of VRAM to the CTD device, but I don't 
add any additional features.

Now for detiling, I think that it should be the responsibility of the 
receiving CTD device, not the GPU pushing the data (Alan mentioned that 
during the initial set of comments and, although I didn't say anything to 
it at the time, that has been my view as well).

Even if you wanted to use the GPU for detiling (and I'll explain shortly why 
you should not), it would not require any new accel code in the kernel. It 
would merely require one bit flip in the setup of the blit copy that already 
lives in the kernel.

However, de-tiling in the GPU is a bad idea. I tried it, just as an 
experiment, on Radeon GPUs and watched with a PCI Express analyzer what 
happens on the bus (yeah, I have some "heavy weapons" in my lab). Normally 
a tile is a contiguous array of memory locations in VRAM. If the blit-copy 
function is told to assume a tiled source and a linear destination 
(de-tiling), it will read a contiguous set of addresses in VRAM, but then 
scatter 8 rows of 8 pixels each over a non-contiguous set of addresses at 
the destination. If the destination is the PCI Express bus, this results 
in 8 32-byte write transactions instead of 2 128-byte transactions per 
tile. That will choke the throughput of the bus right there.
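
For illustration, the arithmetic behind those numbers, assuming a 32-bpp 
format (the thread does not state the pixel size explicitly):

	8x8-pixel tile at 4 bytes/pixel = 256 contiguous bytes in VRAM
	tiled (contiguous) destination  : 256 B -> 2 x 128-byte PCIe writes
	de-tiled destination            : 8 rows x (8 px x 4 B) -> 8 x 32-byte PCIe writes

That is four times as many transactions per tile, each carrying a quarter 
of the payload, with the per-transaction overhead paid every time.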

BTW, this is the crux of the blit-copy performance improvement that you 
got from me back in October. Since blit-copy deals with copying a linear 
array, playing with the tiled/non-tiled bits only affects the order in 
which addresses are accessed, so the trick was to get rid of short PCIe 
transactions and also to shape the linear-to-rectangle mapping to make the 
address pattern more friendly for the host.


> also for Intel GPUs where you have
> UMA, would you read from the UMA.
>

Yes, the read would be from UMA. I have not yet looked at Intel GPUs in 
detail, so I don't have an answer for you on what problems would pop up 
and how to solve them, but I'll be glad to revisit the Intel discussion 
once I do some homework.

Some initial thoughts: frame buffers on Intel are, at the end of the day, 
pages in system memory, so anyone/anything can get to them if they are 
correctly mapped.


> It also doesn't solve the optimus GPU problem in any useful fashion,
> since it can't deal with all the use cases, so we still have to write
> an alternate solution that can deal with them, so we just end up with
> two answers.
>

Can you elaborate on some specific use cases that concern you? I have had 
this case in mind and I think I can make it work. First I would have to 
add CTD functionality to the Intel driver. That should be straightforward. 
Once I get there, I'll be ready to experiment, and we'll probably be in a 
better position to discuss the specifics then (i.e. when we have something 
working to compare with what you did in the PRIME experiment), but it 
would be good to know your specific concerns early.


thanks,

Ilija


