Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Kristian Høgsberg
On Fri, Sep 14, 2012 at 5:14 PM, Jesse Barnes jbar...@virtuousgeek.org wrote:
 On Wed, 12 Sep 2012 21:58:31 +0300
 Ville Syrjälä ville.syrj...@linux.intel.com wrote:

 On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
  On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
   But I think we could still do this w/ one ioctl per crtc for 
   atomic-pageflip.
  
   We could, if we want to sacrifice the synced multi display case. I just
   think it might be a real use case at some point. IVI feels like the most
   likely short term cadidate to me, but perhaps someone would finally
   introduce some new style phone/tablet thingy. I have seen
   concepts/prototypes of such multi display gadgets in the past, but the
   industry apparently got a bit stuck on the rectangular slab with
   touchscreen on one side design.
 
  I could be wrong, but I think IVI the screens would normally be too
  far apart to matter?

 I was thinking of something like a display on the dash that normally
 sits low with only a small sliver visible, and extends upwards when
 you fire up a movie player for example. Internally it could be made
 up of two displays for power savings purposes.

  Anyways, it is really only a problem if you can't do two ioctl()s
  within one vblank period. If it actually turns out to be a real
  problem,

 Well exactly that's the problem this whole atomic pageflip stuff is
 trying to tackle, no? ;)

  we could always add later an ioctl that takes an array of
  'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
  really useful or not.. but maybe I'm thinking too much about how
  weston does it's rendering of different output's independently.

 I'm just now thinking of surfaceflinger's way of doing things, with
 its prepare and commit phases. If you need to issue two ioctls to handle
 cloned displays, then you can end up in a somewhat funky situation.

 Let's say you have a video overlay in use (one each display naturally),
 and you increase the downscaling factor enough so that you now have
 enough memory bandwith to support only one overlay. With independent
 check ioctls for each display, you never have the full device state
 available in the kernel, so each check succeeds during the prepare
 phase. So you decide that you can keep using the video overlays.

 You then proceed to commit the state, but after the first display has
 been commited you get an error when trying to commit the second one.
 What can you do now? The only option is to keep displaying the old
 frame on the other displays for some time longer, and then on the
 next frame you can switch to GPU composition. But on the next frame you
 perhaps no longer need to use GPU composition, but since you can't
 verify that in the prepare phase, you have no other option but to use
 GPU composition.

 So when you run into a configuration that can be supported only
 partially, you get animation stalls on some displays due to skipped
 frames, and you always have to fall back to GPU composition for the
 next frame.

 If on the other hand you would check the whole state in one ioctl,
 you'd get the error in the first prepare phase, and could fall back
 to GPU composition immediately.

 Am I too much of a perfectionst in considering such things? I don't
 think so, but perhaps other people disagree.

 I don't think there's any harm in having multiple ioctls for different
 things.

 I was initially hoping the nuclear page flip would be very simple.
 Intended for simply updating buffers of several planes associated with
 a single display.  That makes the inner loop of something like Wayland
 or SF simple, but obviously doesn't cover every case (in fact I had
 avoided dealing with moving planes initially).

 Rob's patchset goes further than that, but obviously not as far as you
 propose.

 OTOH, keeping things really simple and not very featureful means there
 are fewer points of failure, which is what I think callers would expect
 from a flip API...

 So where does that leave us?  I'd propose we have a very simple,
 stripped down, single crtc flip ioctl, along with a big atomic mode set
 ioctl, and then perhaps a fancier multi-crtc flip ioctl.

I think (hope) the consensus coming out of this thread is something
along these lines:

 - We use properties for specifying what to change to be future
compatible with new crtc features, but also to allow exposing
hw-specific properties and tie them into the atomicity of the
pageflip.  The KMS overlays are a lowest-common denominator for all
the various overlay types out there and it should be possible to write
a piece of chipset specific compositor code to use features that can't
be expressed through KMS overlays.

 - We have two types of properties: dynamic and non-dynamic ones.
Dynamic properties can always be changed in the next frame (fb bos, hw
cursor position, overlay position, for example), 

Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 05:46:35PM -0400, Kristian Høgsberg wrote:
 I think (hope) the consensus coming out of this thread is something
 along these lines:
 
  - We use properties for specifying what to change to be future
 compatible with new crtc features, but also to allow exposing
 hw-specific properties and tie them into the atomicity of the
 pageflip.  The KMS overlays are a lowest-common denominator for all
 the various overlay types out there and it should be possible to write
 a piece of chipset specific compositor code to use features that can't
 be expressed through KMS overlays.

Properties are good. Check.

  - We have two types of properties: dynamic and non-dynamic ones.
 Dynamic properties can always be changed in the next frame (fb bos, hw
 cursor position, overlay position, for example), non-dynamic
 properties typically involve changing the way bandwidth are allocated
 and changing them may fail.

There's just no way to make such a general split. The simple fact is
that even moving an overlay can fail due to timing/bandwith related
constraints.

  - We need a test ioctl that can verify whether changing non-dynamic
 properties will work.  Using the atomic modeset for that with a
 test-only flag seems like a good option since that already has the
 logic to analyze bandwidth allocation across all crtcs.  On the other
 hand, it may make more sense to use the multiflip ioctl as well here.
 What we need to check is whether the change made by a multifflip is
 possible, so it seems natural to communicate that change to the kernel
 using the same ioctl and data structs as the multiflip itself.  The
 bandwidth calculation is a global decision and involves all crtcs and
 the current state, so the kernel can decide just fine if a multiflip
 is possible or not, based on the current state and the requested
 multiflip.

Ie. multiflip and atomic modeset need exactly the same thing here.

  - Atomic multiflip for one crtc is essential for avoiding flicker and
 artifacts, but ill-defined for multiple crtcs simultaneously and even
 in the genlock case, the failure mode is hardly noticable (one crtc
 may drop a frame in case the compositor is racing with vsync, in which
 case multiflip just means both crtcs drop a frame).

Sorry I don't follow. With two ioctls in the genlocked case, one crtc
could drop, and the other might not. That is going to be a problem if
both crtcs handle parts of the same physical display. Apart from the
possible IVI and phone/tablet/gadget uses, I can imagine this being
useful for large advertisement/presentation/simulation displays too.

Also allowing multi crtc flips in the non-genlocked case makes cloned
displays trivial to implement. This is especially useful if the system
is push based like surfaceflinger.

 For flipping
 multiple fbs and planes, on one crtc, however, atomicity means that we
 can combine gpu rendering and overlays in a reliable way, without
 having to worry about flicker when sprites turn on a frame later after
 we've already erased the surface contents from the main fb.  We need
 to be able to render the scene graph split across various planes at
 certain positions and know for certain that when we flip, that's the
 configuration that ends up on the output.

Sure, that's the main goal of this work.

  - Pageflip events can be controlled by a flag (as for the current
 pageflip ioctl) or perhaps disabled by setting user_data to 0, but the
 user data is passed in with each nuclear pageflip ioctl and each ioctl
 generates one event (if requested) which returns the user data that
 was passed in at ioctl time.  This is how it currently works, the
 event mechanism is already in place, I see no reason to change this
 behaviour.  Surely, we're not concerned about 8 extra bytes in the
 ioctl struct?  The atomic modeset event (in test mode or not) never
 generates an event, so there's no need for user data there.

There's no reason why you couldn't send the event in the blocking
modeset case too. Also it would open the door for asynchronous modeset,
if someone has the cojones to implement it.

  - Pageflip for multiple crtc may be useful in case of gen-locked
 crtc, but it is a corner case and not likely to be present or relevant
 in mainstream hw.

I've already provided many ideas where it could be used, and I don't
even consider myself a very imaginative person.

I don't see the point of forcing everyone with such a setup to add
hacks in order to work around artificial restrictions imposed by the
API. Do we want to make a system that people *want* to use, or one they
*have* to use.

 With the properties being an extensible mechanism,
 we could probably expose gen-locked crtcs through the properties or
 such and in worst case make a new ioctl as Jesses suggests.

Well, I just don't see the point of going about it in such a
roundabout way.

My current prototype code basically handles this case already, except
that I added an artifical restriction to avoid the async 

Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org wrote:
  note that the test phase doesn't need vblank events, and also
  shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
  that however we go about pageflip (one super-ioctl, or one per crtc),
  we could use the atomic-modeset ioctl for the test step
 
 actually, I think I take this back..  one thing that was discussed on
 IRC, but didn't make it to this email thread is the behavior of
 non-specified properties.  What I am thinking:
 
 modeset: unspecified properties revert to default
 pageflip: unspecified properties preserve current value

Why on earth would you want to revert stuff to default? That's only
going to make the code more complex.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Rob Clark
On Sat, Sep 15, 2012 at 9:53 AM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Sep 14, 2012 at 05:46:35PM -0400, Kristian Høgsberg wrote:
 I think (hope) the consensus coming out of this thread is something
 along these lines:

  - We use properties for specifying what to change to be future
 compatible with new crtc features, but also to allow exposing
 hw-specific properties and tie them into the atomicity of the
 pageflip.  The KMS overlays are a lowest-common denominator for all
 the various overlay types out there and it should be possible to write
 a piece of chipset specific compositor code to use features that can't
 be expressed through KMS overlays.

 Properties are good. Check.

  - We have two types of properties: dynamic and non-dynamic ones.
 Dynamic properties can always be changed in the next frame (fb bos, hw
 cursor position, overlay position, for example), non-dynamic
 properties typically involve changing the way bandwidth are allocated
 and changing them may fail.

 There's just no way to make such a general split. The simple fact is
 that even moving an overlay can fail due to timing/bandwith related
 constraints.

fwiw, the driver should indicate this by setting a flag on the
property, this way userspace knows what can be changed dynamically and
what can not.

  - We need a test ioctl that can verify whether changing non-dynamic
 properties will work.  Using the atomic modeset for that with a
 test-only flag seems like a good option since that already has the
 logic to analyze bandwidth allocation across all crtcs.  On the other
 hand, it may make more sense to use the multiflip ioctl as well here.
 What we need to check is whether the change made by a multifflip is
 possible, so it seems natural to communicate that change to the kernel
 using the same ioctl and data structs as the multiflip itself.  The
 bandwidth calculation is a global decision and involves all crtcs and
 the current state, so the kernel can decide just fine if a multiflip
 is possible or not, based on the current state and the requested
 multiflip.

 Ie. multiflip and atomic modeset need exactly the same thing here.

  - Atomic multiflip for one crtc is essential for avoiding flicker and
 artifacts, but ill-defined for multiple crtcs simultaneously and even
 in the genlock case, the failure mode is hardly noticable (one crtc
 may drop a frame in case the compositor is racing with vsync, in which
 case multiflip just means both crtcs drop a frame).

 Sorry I don't follow. With two ioctls in the genlocked case, one crtc
 could drop, and the other might not. That is going to be a problem if
 both crtcs handle parts of the same physical display. Apart from the
 possible IVI and phone/tablet/gadget uses, I can imagine this being
 useful for large advertisement/presentation/simulation displays too.

 Also allowing multi crtc flips in the non-genlocked case makes cloned
 displays trivial to implement. This is especially useful if the system
 is push based like surfaceflinger.

 For flipping
 multiple fbs and planes, on one crtc, however, atomicity means that we
 can combine gpu rendering and overlays in a reliable way, without
 having to worry about flicker when sprites turn on a frame later after
 we've already erased the surface contents from the main fb.  We need
 to be able to render the scene graph split across various planes at
 certain positions and know for certain that when we flip, that's the
 configuration that ends up on the output.

 Sure, that's the main goal of this work.

  - Pageflip events can be controlled by a flag (as for the current
 pageflip ioctl) or perhaps disabled by setting user_data to 0, but the
 user data is passed in with each nuclear pageflip ioctl and each ioctl
 generates one event (if requested) which returns the user data that
 was passed in at ioctl time.  This is how it currently works, the
 event mechanism is already in place, I see no reason to change this
 behaviour.  Surely, we're not concerned about 8 extra bytes in the
 ioctl struct?  The atomic modeset event (in test mode or not) never
 generates an event, so there's no need for user data there.

 There's no reason why you couldn't send the event in the blocking
 modeset case too. Also it would open the door for asynchronous modeset,
 if someone has the cojones to implement it.

  - Pageflip for multiple crtc may be useful in case of gen-locked
 crtc, but it is a corner case and not likely to be present or relevant
 in mainstream hw.

 I've already provided many ideas where it could be used, and I don't
 even consider myself a very imaginative person.

 I don't see the point of forcing everyone with such a setup to add
 hacks in order to work around artificial restrictions imposed by the
 API. Do we want to make a system that people *want* to use, or one they
 *have* to use.

 With the properties being an extensible mechanism,
 we could probably expose gen-locked crtcs through the properties or
 such and in worst case make a 

Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Rob Clark
On Sat, Sep 15, 2012 at 9:56 AM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org wrote:
  note that the test phase doesn't need vblank events, and also
  shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
  that however we go about pageflip (one super-ioctl, or one per crtc),
  we could use the atomic-modeset ioctl for the test step

 actually, I think I take this back..  one thing that was discussed on
 IRC, but didn't make it to this email thread is the behavior of
 non-specified properties.  What I am thinking:

 modeset: unspecified properties revert to default
 pageflip: unspecified properties preserve current value

 Why on earth would you want to revert stuff to default? That's only
 going to make the code more complex.

well, you need to do it *somewhere*..  possibly it can be on drm file
close or dropmaster.  But modeset seems like a sensible place.  I
really hate the v4l2 approach of preserving settings for the next
process that opens the device.

BR,
-R

 --
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Ville Syrjälä
On Sat, Sep 15, 2012 at 11:07:02AM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 9:56 AM, Ville Syrjälä syrj...@sci.fi wrote:
  On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org wrote:
   note that the test phase doesn't need vblank events, and also
   shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
   that however we go about pageflip (one super-ioctl, or one per crtc),
   we could use the atomic-modeset ioctl for the test step
 
  actually, I think I take this back..  one thing that was discussed on
  IRC, but didn't make it to this email thread is the behavior of
  non-specified properties.  What I am thinking:
 
  modeset: unspecified properties revert to default
  pageflip: unspecified properties preserve current value
 
  Why on earth would you want to revert stuff to default? That's only
  going to make the code more complex.
 
 well, you need to do it *somewhere*..  possibly it can be on drm file
 close or dropmaster.  But modeset seems like a sensible place.  I
 really hate the v4l2 approach of preserving settings for the next
 process that opens the device.

Ah so it's the same workaround for lack of proper state management.
Each master should just have its own state. Or if that's too much to
ask, at least the reset could be done only when the master changes.

If you do it at modeset time, which props do you reset anyway? All of
them for the whole device? Just the ones related to the CRTCs undergoing
the modeset? What if there's some conflict between the default values
on that CRTC and the current values on another CRTC? What about properties
for planes that can move across CRTCs? This kind of partial state reset
opens up a lot of open questions, so a full reset at master switch seems
a lot more sensible.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Rob Clark
On Sat, Sep 15, 2012 at 12:04 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Sat, Sep 15, 2012 at 11:07:02AM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 9:56 AM, Ville Syrjälä syrj...@sci.fi wrote:
  On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org wrote:
   note that the test phase doesn't need vblank events, and also
   shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
   that however we go about pageflip (one super-ioctl, or one per crtc),
   we could use the atomic-modeset ioctl for the test step
 
  actually, I think I take this back..  one thing that was discussed on
  IRC, but didn't make it to this email thread is the behavior of
  non-specified properties.  What I am thinking:
 
  modeset: unspecified properties revert to default
  pageflip: unspecified properties preserve current value
 
  Why on earth would you want to revert stuff to default? That's only
  going to make the code more complex.

 well, you need to do it *somewhere*..  possibly it can be on drm file
 close or dropmaster.  But modeset seems like a sensible place.  I
 really hate the v4l2 approach of preserving settings for the next
 process that opens the device.

 Ah so it's the same workaround for lack of proper state management.
 Each master should just have its own state. Or if that's too much to
 ask, at least the reset could be done only when the master changes.

 If you do it at modeset time, which props do you reset anyway? All of
 them for the whole device? Just the ones related to the CRTCs undergoing
 the modeset? What if there's some conflict between the default values
 on that CRTC and the current values on another CRTC? What about properties
 for planes that can move across CRTCs? This kind of partial state reset
 opens up a lot of open questions, so a full reset at master switch seems
 a lot more sensible.

Well, if you reset *all* properties on modeset, then crtcs's that
aren't set in the modeset go off..  atomic-modeset is userspace saying
here is the entire config I want.. go make it happen.  But I guess
it does get a bit easier to implement legacy setcrtc on top of the new
mechanism if untouched properties preserve their value.

I could live w/ just reset on master change.. that meets my minimum
requirement of not carrying state between different processes using
the device.  Having a flag indicating 'reset untouched properties'
would be useful if the default behavior is to preserve.

I still think setcrtc and pageflip shouldn't be mashed into a single ioctl :-)

BR,
-R

 --
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Ville Syrjälä
On Sat, Sep 15, 2012 at 11:05:29AM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 9:53 AM, Ville Syrjälä syrj...@sci.fi wrote:
  On Fri, Sep 14, 2012 at 05:46:35PM -0400, Kristian Høgsberg wrote:
  I think (hope) the consensus coming out of this thread is something
  along these lines:
 
   - We use properties for specifying what to change to be future
  compatible with new crtc features, but also to allow exposing
  hw-specific properties and tie them into the atomicity of the
  pageflip.  The KMS overlays are a lowest-common denominator for all
  the various overlay types out there and it should be possible to write
  a piece of chipset specific compositor code to use features that can't
  be expressed through KMS overlays.
 
  Properties are good. Check.
 
   - We have two types of properties: dynamic and non-dynamic ones.
  Dynamic properties can always be changed in the next frame (fb bos, hw
  cursor position, overlay position, for example), non-dynamic
  properties typically involve changing the way bandwidth are allocated
  and changing them may fail.
 
  There's just no way to make such a general split. The simple fact is
  that even moving an overlay can fail due to timing/bandwith related
  constraints.
 
 fwiw, the driver should indicate this by setting a flag on the
 property, this way userspace knows what can be changed dynamically and
 what can not.

OK maybe user space could notice that all of the properties it's
going to manipulate have that flag in the correct position. User space
could then skip the check ioctl, and proceed straight to the commit
phase with the nice feeling that it should not fail. But that's just
an optimization.

Or are you actually suggesting that changing any property with the
flag in the wrong position would require a full modeset, ie. force
you to take the blocking code path? That just won't fly.

  So I propose that we have:
  - One ioctl that takes an arbitrary number of obj/prop/value tuples
 
 well, to be fair, if we convert everything to properties, maybe
 drm/kms only needs one ioctl for everything :-P

Sure, why not ;) Well, we still have all the enumerations stuff and
whatnot. I see no point in changing those when they work adequately.
But actually setting the state of the hardware can be handled through
a single ioctl.

 But different ioctls are cheap..  I don't think it hurts to have two
 instead of one.  I really don't see modeset and pageflip as the same
 thing. Maybe from the front-end of the video pipe, they are.  But
 from encoder and back they are different.  Modeset can take many
 vblank cycles to complete.  And is infrequent.  Introducing a
 state-machine to try to make this asynchronous is just adding a lot of
 complexity and potential fail for not really much gain.

I don't entirely agree with the infrequent part. Fullscreen video on
external displays is one case where you really may want to change the
mode quite often. Or you may not want to change the actual timings on
the display, but you still want to change the resolution of the CRTC,
and let a panel fitter handle the difference in input and output
resolutions. But thanks to the way kms is designed those two things
are both linked to the display mode of the CRTC, so you still need a
modeset to handle it.

 Even flip on multiple crtcs introduces some new edge cases, like
 moving a plane from one crtc to another.  If this is split into two
 ioctl calls, then the test on the 2nd crtc can -EBUSY because the
 plane is still pending disconnect from the first.  But the test on the
 1st crtc can succeed.  I can see the usefulness of flip on
 multi-crtc... but since it isn't nearly as useful/important as
 flipping multiple planes on a single crtc, I don't see the harm in
 starting simple and adding this later.

Forcing you to rewrite user space multiple times. And keeping all the
old codepaths around in both kernel and user space side to maintain
ABI compatibility both ways. Also the speed at which these things
trickle through to the actual users is very slow, so adding new ioctls
every six months to handle overlapping tasks is not sensible IMHO.

My _point_ is that there is no need to hardcode these restrictions
into the API. We already know what kind of API will handle all these
cases, so why can't we just go with that API from the very beginning?

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Ville Syrjälä
On Sat, Sep 15, 2012 at 12:15:59PM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 12:04 PM, Ville Syrjälä syrj...@sci.fi wrote:
  On Sat, Sep 15, 2012 at 11:07:02AM -0500, Rob Clark wrote:
  On Sat, Sep 15, 2012 at 9:56 AM, Ville Syrjälä syrj...@sci.fi wrote:
   On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org 
   wrote:
note that the test phase doesn't need vblank events, and also
shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
that however we go about pageflip (one super-ioctl, or one per crtc),
we could use the atomic-modeset ioctl for the test step
  
   actually, I think I take this back..  one thing that was discussed on
   IRC, but didn't make it to this email thread is the behavior of
   non-specified properties.  What I am thinking:
  
   modeset: unspecified properties revert to default
   pageflip: unspecified properties preserve current value
  
   Why on earth would you want to revert stuff to default? That's only
   going to make the code more complex.
 
  well, you need to do it *somewhere*..  possibly it can be on drm file
  close or dropmaster.  But modeset seems like a sensible place.  I
  really hate the v4l2 approach of preserving settings for the next
  process that opens the device.
 
  Ah so it's the same workaround for lack of proper state management.
  Each master should just have its own state. Or if that's too much to
  ask, at least the reset could be done only when the master changes.
 
  If you do it at modeset time, which props do you reset anyway? All of
  them for the whole device? Just the ones related to the CRTCs undergoing
  the modeset? What if there's some conflict between the default values
  on that CRTC and the current values on another CRTC? What about properties
  for planes that can move across CRTCs? This kind of partial state reset
  opens up a lot of open questions, so a full reset at master switch seems
  a lot more sensible.
 
 Well, if you reset *all* properties on modeset, then crtcs's that
 aren't set in the modeset go off..  atomic-modeset is userspace saying
 here is the entire config I want.. go make it happen.  But I guess
 it does get a bit easier to implement legacy setcrtc on top of the new
 mechanism if untouched properties preserve their value.

Yeah. I don't see much point in maintaining the state stometimes,
but sometimes not. Either do or do not.

 I could live w/ just reset on master change.. that meets my minimum
 requirement of not carrying state between different processes using
 the device.

 Having a flag indicating 'reset untouched properties'
 would be useful if the default behavior is to preserve.

Perhaps. Or perhaps some way to query the default values of properties.

 I still think setcrtc and pageflip shouldn't be mashed into a single ioctl :-)

So instead of one ioctl w/ async flag, you want one sync ioctl and one
async ioctl. Sure, why not. Both would en up doing much of the same
things when collecting and verifying the state, but sharing the code is
easy anyway.

But I'm actually not sure what everyone wants from the sync ioctl,
especially when you use it for somthing that doesn't involve changing
the timings. Should it behave like the current setcrtc, setplane etc.
where the ioctl is free to execute asynchronously, but without any way
to get a completion event? Or should it always block until the operation
is truly complete?

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Rob Clark
On Sat, Sep 15, 2012 at 1:00 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Sat, Sep 15, 2012 at 11:05:29AM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 9:53 AM, Ville Syrjälä syrj...@sci.fi wrote:
  On Fri, Sep 14, 2012 at 05:46:35PM -0400, Kristian Høgsberg wrote:
  I think (hope) the consensus coming out of this thread is something
  along these lines:
 
   - We use properties for specifying what to change to be future
  compatible with new crtc features, but also to allow exposing
  hw-specific properties and tie them into the atomicity of the
  pageflip.  The KMS overlays are a lowest-common denominator for all
  the various overlay types out there and it should be possible to write
  a piece of chipset specific compositor code to use features that can't
  be expressed through KMS overlays.
 
  Properties are good. Check.
 
   - We have two types of properties: dynamic and non-dynamic ones.
  Dynamic properties can always be changed in the next frame (fb bos, hw
  cursor position, overlay position, for example), non-dynamic
  properties typically involve changing the way bandwidth are allocated
  and changing them may fail.
 
  There's just no way to make such a general split. The simple fact is
  that even moving an overlay can fail due to timing/bandwith related
  constraints.

 fwiw, the driver should indicate this by setting a flag on the
 property, this way userspace knows what can be changed dynamically and
 what can not.

 OK maybe user space could notice that all of the properties it's
 going to manipulate have that flag in the correct position. User space
 could then skip the check ioctl, and proceed straight to the commit
 phase with the nice feeling that it should not fail. But that's just
 an optimization.

 Or are you actually suggesting that changing any property with the
 flag in the wrong position would require a full modeset, ie. force
 you to take the blocking code path? That just won't fly.

no, no..  the flag would just be a hint to userspace that it was a
property that could be safely changed without a test step.  Otherwise,
userspace should do the test call first to confirm that the hw could
handle the new property values.

  So I propose that we have:
  - One ioctl that takes an arbitrary number of obj/prop/value tuples

 well, to be fair, if we convert everything to properties, maybe
 drm/kms only needs one ioctl for everything :-P

 Sure, why not ;) Well, we still have all the enumerations stuff and
 whatnot. I see no point in changing those when they work adequately.
 But actually setting the state of the hardware can be handled through
 a single ioctl.

 But different ioctls are cheap..  I don't think it hurts to have two
 instead of one.  I really don't see modeset and pageflip as the same
 thing. Maybe from the front-end of the video pipe, they are.  But
 from encoder and back they are different.  Modeset can take many
 vblank cycles to complete.  And is infrequent.  Introducing a
 state-machine to try to make this asynchronous is just adding a lot of
 complexity and potential fail for not really much gain.

 I don't entirely agree with the infrequent part. Fullscreen video on
 external displays is one case where you really may want to change the
 mode quite often. Or you may not want to change the actual timings on
 the display, but you still want to change the resolution of the CRTC,
 and let a panel fitter handle the difference in input and output
 resolutions. But thanks to the way kms is designed those two things
 are both linked to the display mode of the CRTC, so you still need a
 modeset to handle it.

well, possibly some properties could be added to circumvent that.. I
was more thinking of actual timing changes

 Even flip on multiple crtcs introduces some new edge cases, like
 moving a plane from one crtc to another.  If this is split into two
 ioctl calls, then the test on the 2nd crtc can -EBUSY because the
 plane is still pending disconnect from the first.  But the test on the
 1st crtc can succeed.  I can see the usefulness of flip on
 multi-crtc... but since it isn't nearly as useful/important as
 flipping multiple planes on a single crtc, I don't see the harm in
 starting simple and adding this later.

 Forcing you to rewrite user space multiple times. And keeping all the
 old codepaths around in both kernel and user space side to maintain
 ABI compatibility both ways. Also the speed at which these things
 trickle through to the actual users is very slow, so adding new ioctls
 every six months to handle overlapping tasks is not sensible IMHO.

well, I'm not quite advocating a new ioctl every 6 months.. but what I
meant is that the cost isn't high to have two ioctls, one for async
pageflip, and one for modeset.. and the sync modeset ioctl is required
if you are changing timings or lighting up a new display.

I think for pageflip, we could have just one ioctl, with a single
user-data.  And later add flags to indicate when the event should be
sent back 

Re: [RFC 0/9] nuclear pageflip

2012-09-15 Thread Rob Clark
On Sat, Sep 15, 2012 at 2:08 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Sat, Sep 15, 2012 at 12:15:59PM -0500, Rob Clark wrote:
 On Sat, Sep 15, 2012 at 12:04 PM, Ville Syrjälä syrj...@sci.fi wrote:
  On Sat, Sep 15, 2012 at 11:07:02AM -0500, Rob Clark wrote:
  On Sat, Sep 15, 2012 at 9:56 AM, Ville Syrjälä syrj...@sci.fi wrote:
   On Fri, Sep 14, 2012 at 09:12:35PM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org 
   wrote:
note that the test phase doesn't need vblank events, and also
shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
that however we go about pageflip (one super-ioctl, or one per crtc),
we could use the atomic-modeset ioctl for the test step
  
   actually, I think I take this back..  one thing that was discussed on
   IRC, but didn't make it to this email thread is the behavior of
   non-specified properties.  What I am thinking:
  
   modeset: unspecified properties revert to default
   pageflip: unspecified properties preserve current value
  
   Why on earth would you want to revert stuff to default? That's only
   going to make the code more complex.
 
  well, you need to do it *somewhere*..  possibly it can be on drm file
  close or dropmaster.  But modeset seems like a sensible place.  I
  really hate the v4l2 approach of preserving settings for the next
  process that opens the device.
 
  Ah so it's the same workaround for lack of proper state management.
  Each master should just have its own state. Or if that's too much to
  ask, at least the reset could be done only when the master changes.
 
  If you do it at modeset time, which props do you reset anyway? All of
  them for the whole device? Just the ones related to the CRTCs undergoing
  the modeset? What if there's some conflict between the default values
  on that CRTC and the current values on another CRTC? What about properties
  for planes that can move across CRTCs? This kind of partial state reset
  opens up a lot of open questions, so a full reset at master switch seems
  a lot more sensible.

 Well, if you reset *all* properties on modeset, then crtcs's that
 aren't set in the modeset go off..  atomic-modeset is userspace saying
 here is the entire config I want.. go make it happen.  But I guess
 it does get a bit easier to implement legacy setcrtc on top of the new
 mechanism if untouched properties preserve their value.

 Yeah. I don't see much point in maintaining the state stometimes,
 but sometimes not. Either do or do not.

 I could live w/ just reset on master change.. that meets my minimum
 requirement of not carrying state between different processes using
 the device.

 Having a flag indicating 'reset untouched properties'
 would be useful if the default behavior is to preserve.

 Perhaps. Or perhaps some way to query the default values of properties.

Being able to query default values is probably useful.  But we might
want the flag anyways to make life easier on userspace.  But I guess
these can be added over time if needed, they shouldn't block getting
the basic multi-plane page-flip in place.

 I still think setcrtc and pageflip shouldn't be mashed into a single ioctl 
 :-)

 So instead of one ioctl w/ async flag, you want one sync ioctl and one
 async ioctl. Sure, why not. Both would en up doing much of the same
 things when collecting and verifying the state, but sharing the code is
 easy anyway.

yup, I see both using a lot of the same code.. all the stuff about
splitting out the object state is applicable to both.  The property
enhancements to support object properties, signed properties, etc,
this all applies for both.

 But I'm actually not sure what everyone wants from the sync ioctl,
 especially when you use it for somthing that doesn't involve changing
 the timings. Should it behave like the current setcrtc, setplane etc.
 where the ioctl is free to execute asynchronously, but without any way
 to get a completion event? Or should it always block until the operation
 is truly complete?

I'm thinking of the sync ioctl in particular for things that change
timings or light up a new display.  Stuff like adding a new plane,
flipping/moving an existing plane or crtc, etc.. these can all be done
via the async ioctl.  The crtc property on a plane, for example, would
probably not have the 'dynamic' flag set[*], so that would require a
test step.

[*] in my current patchset, the dynamic flags are set in core, but
this really isn't the way it should be, the driver should be in
control of which properties have the 'dynamic' flag set, but I needed
something temporary to get things up and running.

BR,
-R

 --
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Ville Syrjälä
On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
 But I think we could still do this w/ one ioctl per crtc for 
 atomic-pageflip.

 We could, if we want to sacrifice the synced multi display case. I 
 just
 think it might be a real use case at some point. IVI feels like 
 the most
 likely short term cadidate to me, but perhaps someone would finally
 introduce some new style phone/tablet thingy. I have seen
 concepts/prototypes of such multi display gadgets in the past, but 
 the
 industry apparently got a bit stuck on the rectangular slab with
 touchscreen on one side design.
   
I could be wrong, but I think IVI the screens would normally be too
far apart to matter?
   
I was thinking of something like a display on the dash that normally
sits low with only a small sliver visible, and extends upwards when
you fire up a movie player for example. Internally it could be made
up of two displays for power savings purposes.
   
Anyways, it is really only a problem if you can't do two ioctl()s
within one vblank period. If it actually turns out to be a real
problem,
   
Well exactly that's the problem this whole atomic pageflip stuff is
trying to tackle, no? ;)
   
we could always add later an ioctl that takes an array of
'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
really useful or not.. but maybe I'm thinking too much about how
weston does it's rendering of different output's independently.
   
I'm just now thinking of surfaceflinger's way of doing things, with
its prepare and commit phases. If you need to issue two ioctls to 
handle
cloned displays, then you can end up in a somewhat funky situation.
   
Let's say you have a video overlay in use (one each display 
naturally),
and you increase the downscaling factor enough so that you now have
enough memory bandwith to support only one overlay. With independent
check ioctls for each display, you never have the full device state
available in the kernel, so each check succeeds during the prepare
phase. So you decide that you can keep using the video overlays.
   
You then proceed to commit the state, but after the first display has
been commited you get an error when trying to commit the second one.
What can you do now? The only option is to keep displaying the old
frame on the other displays for some time longer, and then on the
next frame you can switch to GPU composition. But on the next frame 
you
perhaps no longer need to use GPU composition, but since you can't
verify that in the prepare phase, you have no other option but to use
GPU composition.
   
So when you run into a configuration that can be supported only
partially, you get animation stalls on some displays due to skipped
frames, and you always have to fall back to GPU composition for the
next frame.
   
If on the other hand you would check the whole state in one ioctl,
you'd get the error in the first prepare phase, and could fall back
to GPU composition immediately.
   
Am I too much of a perfectionst in considering such things? I don't
think so, but perhaps other people disagree.
  
   Ok, if you have a case where the state of the two crtc's are not
   actually independent, then I think you have a valid point.
  
   I'm not quite sure what userspace would do about it, though.. for the
   general case where vsync isn't locked, and you can't even necessarily
   assume vsync period is the same, then I don't really think you want to
   couple rendering to the displays.
  
   I would say this is going to be the most common use case if you consider
   just the number of shipping devices. It's pretty much what every Android
   phone/tablet with a HDMI port has to do.
 
  bleh, surfaceflinger kinda sucks then..
 
  Why? This use case is not enforced by surfaceflinger, it's just the use
  case most devices would have.
 
  I don't think there's anything wrong with the way surfaceflinger is designed
  with the prepare and commit phases. How else would you do it?
 
 well, maybe I misunderstood how surfaceflinger works, but it sounded
 like it has one prepare/commit phase across outputs, vs what weston
 compositor does where each output is rendered and 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
[snip]
  
   I would say this is going to be the most common use case if you consider
   just the number of shipping devices. It's pretty much what every Android
   phone/tablet with a HDMI port has to do.
 
  bleh, surfaceflinger kinda sucks then..
 
  Why? This use case is not enforced by surfaceflinger, it's just the use
  case most devices would have.
 
  I don't think there's anything wrong with the way surfaceflinger is 
  designed
  with the prepare and commit phases. How else would you do it?

 well, maybe I misunderstood how surfaceflinger works, but it sounded
 like it has one prepare/commit phase across outputs, vs what weston
 compositor does where each output is rendered and flipped
 independently at the rate of that particular output.  If the two
 outputs just happen to be vsync aligned, you would end up flipping at
 the same time, but if the are not locked you don't have any artificial
 constraint in the rendering/flipping.

 OK so it's purely a pull based model, whereas surfaceflinger is more
 push based.

 I suppose it might be possible to make surfaceflinger support a pull
 model by driving the compositor loop through a combined signal from
 multiple outputs. But IIRC it did have some timing related code in
 there somewhere, so it might not be happy about it. It might also

As I understood, at least in older versions android versions,
rendering was based on a timer as there was no vblank event to
userspace on most SoC platforms (which sounds strange, but so far most
SoC's are using fbdev and/or crazy hacks rather than drm/kms)

not sure if the timer is still there.. but I hope it goes away, it is
really a horrible way to keep track of vsync

 affect the clients' rendering speed since the compositor would be
 pulling their buffers from queue at non-constant speed. I don't
 remember the details of the buffer management very well, so I can't be
 sure though. But I probably wouldn't bother trying this, since the
 straightforward approach is so simple, and the results are reasonably
 good.

 The pull model does seem more flexible. But it does require a bit of
 extra complexity in the compositor to avoid compositing the same scene
 multiple times needlessly when multiple cloned displays are involved.
 I suppose ideally you'd want to recompose for each display to minimize
 visible latency, but from power usage POV it may not be a good idea.

fwiw, weston is already being pretty clever about keeping track of
damage and minimizing the area of the screen that must be re-rendered.
 I'm not sure if SF does anything like this.

   From userspace API, I guess something like:
  
   struct drm_mode_crtc_atomic_page_flip {
 uint32_t flags;
 uint32_t count_crtcs;
 uint64_t crtc_ids_ptr;  /* array of uint32_t */
 uint64_t count_props_ptr; /* array of uint32_t, # of prop's per 
   crtc */
 uint64_t props_ptr;  /* ptr to array of 
   drm_mode_obj_set_property */
 uint64_t user_data;
   };
  
   Starting to look much like my drm_mode_atomic struct :)
  
   Let's compare:
  
   struct drm_mode_atomic {
   __u32 flags;
   __u32 count_objs;
   __u64 objs_ptr;
   __u64 count_props_ptr;
   __u64 props_ptr;
   __u64 prop_values_ptr;
   __u64 blob_values_ptr;
   };
 
  well, you do miss userdata, I think
 
  Sure, because I didn't add the event stuff yet.

 note that the test phase doesn't need vblank events, and also
 shouldn't -EBUSY if there is still a pending flip[*],

 Right. Personally I'm not a fan of the EBUSY behaviour at all. Seems
 a bit pointless since user space can take care of it via the event
 mechanism. But I suppose you want it for omap so that you can avoid
 having to write software workarounds to overcome the GO bit
 limitations.

I the main issue is disconnecting an overlay from one crtc and
connecting to another.. I would expect that any hw which can connect
an ovl to more than one possible crtc would have the same limit (ie.
have to wait until scanout on previous crtc completes), so I think
EBUSY is a good way to indicate to userspace that the requested
configuration is not possible *now* but would be possible in the
future.

 so I'd propose
 that however we go about pageflip (one super-ioctl, or one per crtc),
 we could use the atomic-modeset ioctl for the test step

 Yeah that seems reasonable. If we do that, then it doesn't matter that
 much whether we have commit per pipe or not, except for the synced
 displays case, which I'd still like to support if possible. At least
 someone explicitly wanted such a feature in 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
 [snip]
   
I would say this is going to be the most common use case if you 
consider
just the number of shipping devices. It's pretty much what every 
Android
phone/tablet with a HDMI port has to do.
  
   bleh, surfaceflinger kinda sucks then..
  
   Why? This use case is not enforced by surfaceflinger, it's just the use
   case most devices would have.
  
   I don't think there's anything wrong with the way surfaceflinger is 
   designed
   with the prepare and commit phases. How else would you do it?
 
  well, maybe I misunderstood how surfaceflinger works, but it sounded
  like it has one prepare/commit phase across outputs, vs what weston
  compositor does where each output is rendered and flipped
  independently at the rate of that particular output.  If the two
  outputs just happen to be vsync aligned, you would end up flipping at
  the same time, but if the are not locked you don't have any artificial
  constraint in the rendering/flipping.
 
  OK so it's purely a pull based model, whereas surfaceflinger is more
  push based.
 
  I suppose it might be possible to make surfaceflinger support a pull
  model by driving the compositor loop through a combined signal from
  multiple outputs. But IIRC it did have some timing related code in
  there somewhere, so it might not be happy about it. It might also
 
 As I understood, at least in older versions android versions,
 rendering was based on a timer as there was no vblank event to
 userspace on most SoC platforms (which sounds strange, but so far most
 SoC's are using fbdev and/or crazy hacks rather than drm/kms)
 
 not sure if the timer is still there.. but I hope it goes away, it is
 really a horrible way to keep track of vsync

I've only looked at ICS in any detail. At least there we used the page
flip event from one display to set the pace of the compositor loop.
IIRC JB is supposed to have some vsync related changes, but I haven't
looked at the code.

  affect the clients' rendering speed since the compositor would be
  pulling their buffers from queue at non-constant speed. I don't
  remember the details of the buffer management very well, so I can't be
  sure though. But I probably wouldn't bother trying this, since the
  straightforward approach is so simple, and the results are reasonably
  good.
 
  The pull model does seem more flexible. But it does require a bit of
  extra complexity in the compositor to avoid compositing the same scene
  multiple times needlessly when multiple cloned displays are involved.
  I suppose ideally you'd want to recompose for each display to minimize
  visible latency, but from power usage POV it may not be a good idea.
 
 fwiw, weston is already being pretty clever about keeping track of
 damage and minimizing the area of the screen that must be re-rendered.
  I'm not sure if SF does anything like this.

IIRC it can do that, but the EGL implementation needs to support
EGL_BUFFER_PRESERVED.

I suppose the best way to implement EGL_BUFFER_PRESERVED with
page flips would be to schedule the flip and immediately perform
a blit from the new front buffer to the new back buffer. Well,
unless the hardware has some more clever mechanism for it.

Does weston depend on preserved flips too, or can it even track
damage independently for each buffer?

From userspace API, I guess something like:
   
struct drm_mode_crtc_atomic_page_flip {
  uint32_t flags;
  uint32_t count_crtcs;
  uint64_t crtc_ids_ptr;  /* array of uint32_t */
  uint64_t count_props_ptr; /* array of uint32_t, # of prop's 
per crtc */
  uint64_t props_ptr;  /* ptr to array of 
drm_mode_obj_set_property */
  uint64_t user_data;
};
   
Starting to look much like my drm_mode_atomic struct :)
   
Let's compare:
   
struct drm_mode_atomic {
__u32 flags;
__u32 count_objs;
__u64 objs_ptr;
__u64 count_props_ptr;
__u64 props_ptr;
__u64 prop_values_ptr;
__u64 blob_values_ptr;
};
  
   well, you do miss userdata, I think
  
   Sure, because I didn't add the event stuff yet.
 
  note that the test phase doesn't need vblank events, and also
  shouldn't -EBUSY if there is still a pending flip[*],
 
  Right. Personally I'm not a fan of the EBUSY behaviour at all. Seems
  a bit pointless since user space can take care of it via the event
  mechanism. But I suppose you want it for omap so that you can avoid
  having to write software workarounds to 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
  On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
 [snip]
   
I would say this is going to be the most common use case if you 
consider
just the number of shipping devices. It's pretty much what every 
Android
phone/tablet with a HDMI port has to do.
  
   bleh, surfaceflinger kinda sucks then..
  
   Why? This use case is not enforced by surfaceflinger, it's just the use
   case most devices would have.
  
   I don't think there's anything wrong with the way surfaceflinger is 
   designed
   with the prepare and commit phases. How else would you do it?
 
  well, maybe I misunderstood how surfaceflinger works, but it sounded
  like it has one prepare/commit phase across outputs, vs what weston
  compositor does where each output is rendered and flipped
  independently at the rate of that particular output.  If the two
  outputs just happen to be vsync aligned, you would end up flipping at
  the same time, but if the are not locked you don't have any artificial
  constraint in the rendering/flipping.
 
  OK so it's purely a pull based model, whereas surfaceflinger is more
  push based.
 
  I suppose it might be possible to make surfaceflinger support a pull
  model by driving the compositor loop through a combined signal from
  multiple outputs. But IIRC it did have some timing related code in
  there somewhere, so it might not be happy about it. It might also

 As I understood, at least in older versions android versions,
 rendering was based on a timer as there was no vblank event to
 userspace on most SoC platforms (which sounds strange, but so far most
 SoC's are using fbdev and/or crazy hacks rather than drm/kms)

 not sure if the timer is still there.. but I hope it goes away, it is
 really a horrible way to keep track of vsync

 I've only looked at ICS in any detail. At least there we used the page
 flip event from one display to set the pace of the compositor loop.
 IIRC JB is supposed to have some vsync related changes, but I haven't
 looked at the code.

  affect the clients' rendering speed since the compositor would be
  pulling their buffers from queue at non-constant speed. I don't
  remember the details of the buffer management very well, so I can't be
  sure though. But I probably wouldn't bother trying this, since the
  straightforward approach is so simple, and the results are reasonably
  good.
 
  The pull model does seem more flexible. But it does require a bit of
  extra complexity in the compositor to avoid compositing the same scene
  multiple times needlessly when multiple cloned displays are involved.
  I suppose ideally you'd want to recompose for each display to minimize
  visible latency, but from power usage POV it may not be a good idea.

 fwiw, weston is already being pretty clever about keeping track of
 damage and minimizing the area of the screen that must be re-rendered.
  I'm not sure if SF does anything like this.

 IIRC it can do that, but the EGL implementation needs to support
 EGL_BUFFER_PRESERVED.

 I suppose the best way to implement EGL_BUFFER_PRESERVED with
 page flips would be to schedule the flip and immediately perform
 a blit from the new front buffer to the new back buffer. Well,
 unless the hardware has some more clever mechanism for it.

 Does weston depend on preserved flips too, or can it even track
 damage independently for each buffer?

well, weston knows how many buffers are at play.  So it takes the
union of the damage from the last time the buffer was used (well,
currently it assumes only double buffered) and the new damage.  This
way it avoids need for the gl driver, which doesn't know as well what
is going on as the app, from needing to do a back-blit.  It can do
this because w/ drm/gbm egl winsys, eglSwapBuffers() doesn't actually
swap the buffers on the display and weston is in charge of which
buffer is displayed or rendered.  Weston explicitly calls page flip
ioctl.  The good news being that it can atomically flip overlay layers
at the same time once the new ioctl is in place.

Maybe it is useful to look at http://github.com/robclark/kmscube .. it
doesn't actually use planes, but shows the interaction of egl and kms.
 Maybe I should enhance it w/ multiple rotating cubes on different
overlays. ;-)

From userspace API, I guess something like:
   
struct drm_mode_crtc_atomic_page_flip {
  uint32_t flags;
  uint32_t count_crtcs;
  uint64_t crtc_ids_ptr;  /* array of uint32_t */
  uint64_t count_props_ptr; /* array 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
  [snip]

 I would say this is going to be the most common use case if you 
 consider
 just the number of shipping devices. It's pretty much what every 
 Android
 phone/tablet with a HDMI port has to do.
   
bleh, surfaceflinger kinda sucks then..
   
Why? This use case is not enforced by surfaceflinger, it's just the 
use
case most devices would have.
   
I don't think there's anything wrong with the way surfaceflinger is 
designed
with the prepare and commit phases. How else would you do it?
  
   well, maybe I misunderstood how surfaceflinger works, but it sounded
   like it has one prepare/commit phase across outputs, vs what weston
   compositor does where each output is rendered and flipped
   independently at the rate of that particular output.  If the two
   outputs just happen to be vsync aligned, you would end up flipping at
   the same time, but if the are not locked you don't have any artificial
   constraint in the rendering/flipping.
  
   OK so it's purely a pull based model, whereas surfaceflinger is more
   push based.
  
   I suppose it might be possible to make surfaceflinger support a pull
   model by driving the compositor loop through a combined signal from
   multiple outputs. But IIRC it did have some timing related code in
   there somewhere, so it might not be happy about it. It might also
 
  As I understood, at least in older versions android versions,
  rendering was based on a timer as there was no vblank event to
  userspace on most SoC platforms (which sounds strange, but so far most
  SoC's are using fbdev and/or crazy hacks rather than drm/kms)
 
  not sure if the timer is still there.. but I hope it goes away, it is
  really a horrible way to keep track of vsync
 
  I've only looked at ICS in any detail. At least there we used the page
  flip event from one display to set the pace of the compositor loop.
  IIRC JB is supposed to have some vsync related changes, but I haven't
  looked at the code.
 
   affect the clients' rendering speed since the compositor would be
   pulling their buffers from queue at non-constant speed. I don't
   remember the details of the buffer management very well, so I can't be
   sure though. But I probably wouldn't bother trying this, since the
   straightforward approach is so simple, and the results are reasonably
   good.
  
   The pull model does seem more flexible. But it does require a bit of
   extra complexity in the compositor to avoid compositing the same scene
   multiple times needlessly when multiple cloned displays are involved.
   I suppose ideally you'd want to recompose for each display to minimize
   visible latency, but from power usage POV it may not be a good idea.
 
  fwiw, weston is already being pretty clever about keeping track of
  damage and minimizing the area of the screen that must be re-rendered.
   I'm not sure if SF does anything like this.
 
  IIRC it can do that, but the EGL implementation needs to support
  EGL_BUFFER_PRESERVED.
 
  I suppose the best way to implement EGL_BUFFER_PRESERVED with
  page flips would be to schedule the flip and immediately perform
  a blit from the new front buffer to the new back buffer. Well,
  unless the hardware has some more clever mechanism for it.
 
  Does weston depend on preserved flips too, or can it even track
  damage independently for each buffer?
 
 well, weston knows how many buffers are at play.  So it takes the
 union of the damage from the last time the buffer was used (well,
 currently it assumes only double buffered) and the new damage.

With more buffer it'll get a bit more complicate as it needs to keep 
accumulating the damage for all buffers. But it should still be fairly
trivial when you're in full control of the buffers.

 This
 way it avoids need for the gl driver, which doesn't know as well what
 is going on as the app, from needing to do a back-blit.  It can do
 this because w/ drm/gbm egl winsys, eglSwapBuffers() doesn't actually
 swap the buffers on the display and weston is in charge of which
 buffer is displayed or rendered.  Weston explicitly calls page flip
 ioctl.  The good news being that it can atomically flip overlay layers
 at the same time once the new ioctl is in place.

Yeah, with EGL in the mix, as can be the case with Android, the layering
can start to work against you a little bit. Well, it's 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
   On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
  [snip]

 I would say this is going to be the most common use case if you 
 consider
 just the number of shipping devices. It's pretty much what every 
 Android
 phone/tablet with a HDMI port has to do.
   
bleh, surfaceflinger kinda sucks then..
   
Why? This use case is not enforced by surfaceflinger, it's just the 
use
case most devices would have.
   
I don't think there's anything wrong with the way surfaceflinger is 
designed
with the prepare and commit phases. How else would you do it?
  
   well, maybe I misunderstood how surfaceflinger works, but it sounded
   like it has one prepare/commit phase across outputs, vs what weston
   compositor does where each output is rendered and flipped
   independently at the rate of that particular output.  If the two
   outputs just happen to be vsync aligned, you would end up flipping at
   the same time, but if the are not locked you don't have any artificial
   constraint in the rendering/flipping.
  
   OK so it's purely a pull based model, whereas surfaceflinger is more
   push based.
  
   I suppose it might be possible to make surfaceflinger support a pull
   model by driving the compositor loop through a combined signal from
   multiple outputs. But IIRC it did have some timing related code in
   there somewhere, so it might not be happy about it. It might also
 
  As I understood, at least in older versions android versions,
  rendering was based on a timer as there was no vblank event to
  userspace on most SoC platforms (which sounds strange, but so far most
  SoC's are using fbdev and/or crazy hacks rather than drm/kms)
 
  not sure if the timer is still there.. but I hope it goes away, it is
  really a horrible way to keep track of vsync
 
  I've only looked at ICS in any detail. At least there we used the page
  flip event from one display to set the pace of the compositor loop.
  IIRC JB is supposed to have some vsync related changes, but I haven't
  looked at the code.
 
   affect the clients' rendering speed since the compositor would be
   pulling their buffers from queue at non-constant speed. I don't
   remember the details of the buffer management very well, so I can't be
   sure though. But I probably wouldn't bother trying this, since the
   straightforward approach is so simple, and the results are reasonably
   good.
  
   The pull model does seem more flexible. But it does require a bit of
   extra complexity in the compositor to avoid compositing the same scene
   multiple times needlessly when multiple cloned displays are involved.
   I suppose ideally you'd want to recompose for each display to minimize
   visible latency, but from power usage POV it may not be a good idea.
 
  fwiw, weston is already being pretty clever about keeping track of
  damage and minimizing the area of the screen that must be re-rendered.
   I'm not sure if SF does anything like this.
 
  IIRC it can do that, but the EGL implementation needs to support
  EGL_BUFFER_PRESERVED.
 
  I suppose the best way to implement EGL_BUFFER_PRESERVED with
  page flips would be to schedule the flip and immediately perform
  a blit from the new front buffer to the new back buffer. Well,
  unless the hardware has some more clever mechanism for it.
 
  Does weston depend on preserved flips too, or can it even track
  damage independently for each buffer?

 well, weston knows how many buffers are at play.  So it takes the
 union of the damage from the last time the buffer was used (well,
 currently it assumes only double buffered) and the new damage.

 With more buffer it'll get a bit more complicate as it needs to keep
 accumulating the damage for all buffers. But it should still be fairly
 trivial when you're in full control of the buffers.

well, just track previous damage per buffer.. but yeah, slightly more
complicated

 This
 way it avoids need for the gl driver, which doesn't know as well what
 is going on as the app, from needing to do a back-blit.  It can do
 this because w/ drm/gbm egl winsys, eglSwapBuffers() doesn't actually
 swap the buffers on the display and weston is in charge of which
 buffer is displayed or rendered.  Weston explicitly calls page flip
 ioctl.  The good news being that it can atomically flip overlay layers
 at the 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 11:29:04AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
   On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:

  I wonder if you've though about omap's FIFO merge. It can cause similar
  issues, that is some operations may need two vblanks to complete. And it
  looks like I'll get to worry about this stuff too since there are some
  watermark related wait_for_vblank() workarounds in the IVB sprite code,
  sigh.
 
 yeah, FIFO merge is a nice big headache.. and not really ideal for
 latency unless you have some advanced warning to disable FIFO merge
 before userspace wants to switch on an extra overlay.
 
 I think the best way to deal is just start switching off FIFO merge
 when userspace first does test w/ overlay, but return EBUSY.  It means
 we'll use the gpu for rendering for one frame, but I think that is
 better than blocking the compositor for a vblank or two.  Thou shalt
 not block the compositor.

Yeah, I suppose it could be handled through another property as well.
Perhaps some kind of LOW_POWER_OPTIMIZATIONS property that you'd
disable one vblank in advance. But then it's starting to be a bit
hardware specific ie. you more or less have to know the circumstances
when the property must be disabled. On omap it would be when more than
one plane is used, on IVB it appears that you need it when you want to
enable scaling.

I'm not too thrilled about the idea that the test phase would actually
touch the hardware. What happens if you do multiple test steps between
commits? But I can't immediately think of a good solution that would
avoid the need for hardware specific knowledge.

  And if we do support multiple crtc's w/ pageflip, I'm not sure if
  there is a good way to enforce two-steps.  Having a standardized way
  to tell userspace to try later seems like a good thing.
 
  Sure, for that it seems reasonable.
 
 Also, if you pageflip on multiple CRTC's, should the be multiple
 vblank events, and multiple userdata's?

 That's a bit of an open question. I was considering several 
 options:
   
the thing I like about one ioctl per crtc is that it avoids this 
whole
question..
   
And, I think as long as you have to update multiple different scanout
address registers, there is always going to be a race in multi-crtc
flipping.  Having a single ioctl does make the race smaller.  I'm not
sure how important that point is.
   
Which race?
  
   ie. if you set REG_CRTC1_ADDR just immediately before vblank and
   REG_CRTC2_ADDR just after
  
   Well, with unsynced crtcs I wouldn't call that any kind of meaningful 
   race.
   The same problem after all exists even with a single crtc. You either 
   make
   the deadline and write the register before vblank, or you don't make it
   and end up with a repeated frame.
 
  I meant w/ sync'd crtc's, there is still no 100% guarantee that the
  two flip at the same time.
 
  Sure there is. That's what the vblank evade stuff gives you. I just
  happen to need it even when using just one crtc because the hardware
  doesn't have the necessary mechanism to flip several planes atomically.
 
 hmm, I guess I don't quite follow then.  But I guess I don't know the
 intel hw well enough.  It seemed like you weren't atomically updating
 scanout registers.

I guarantee the atomicity by making sure I'm not too close to the start
of vblank when I write the registers. It's a very generic solution that
will work on any hardware with double buffered registers that get
flipped on vblank. Even if some of the registers would get flipped at
slightly different times (eg. plane A flips at vbl_start+1, plane B at
vbl_start+10) you could still use this method by extending the range of
scanlines to be avoided.

I suppose the most difficult bit is determining how long you need to
write all the necessary registers. You want to make it long enough to
guarantee atomic operation, but short enough to avoid blocking
needlessly. Currently I just have a hardcoded number there. If the
driver is going to be deployed on a specific device, it's easy enough
to measure it and tweak the number, but it would be nice to have the
driver calibrate itself. Or you just have a sysfs knob so that it
could tweaked more easily for specific systems by non-developers.

 But anyways, I think it is probably ok to not need the crtc up-front.
 We can catch issues w/ pending vblank at the atomic_test() stage.
 Still not sure what to 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 12:02 PM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Fri, Sep 14, 2012 at 11:29:04AM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
   On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:

  I wonder if you've though about omap's FIFO merge. It can cause similar
  issues, that is some operations may need two vblanks to complete. And it
  looks like I'll get to worry about this stuff too since there are some
  watermark related wait_for_vblank() workarounds in the IVB sprite code,
  sigh.

 yeah, FIFO merge is a nice big headache.. and not really ideal for
 latency unless you have some advanced warning to disable FIFO merge
 before userspace wants to switch on an extra overlay.

 I think the best way to deal is just start switching off FIFO merge
 when userspace first does test w/ overlay, but return EBUSY.  It means
 we'll use the gpu for rendering for one frame, but I think that is
 better than blocking the compositor for a vblank or two.  Thou shalt
 not block the compositor.

 Yeah, I suppose it could be handled through another property as well.
 Perhaps some kind of LOW_POWER_OPTIMIZATIONS property that you'd
 disable one vblank in advance. But then it's starting to be a bit
 hardware specific ie. you more or less have to know the circumstances
 when the property must be disabled. On omap it would be when more than
 one plane is used, on IVB it appears that you need it when you want to
 enable scaling.

 I'm not too thrilled about the idea that the test phase would actually
 touch the hardware. What happens if you do multiple test steps between
 commits? But I can't immediately think of a good solution that would
 avoid the need for hardware specific knowledge.

yeah, that is the problem..

But then again, I suppose we could make fifo-merge disabled by default
and explicitly controlled by userspace via property.  A generic
userspace, and the property would never be set.  A userspace a bit
more optimized/customized for the hw, however, is in a better position
to know if there is likely to be changes in the next frame (ie. based
on user input, etc) and could make perhaps some more intelligent
decisions about when to enable it.

  And if we do support multiple crtc's w/ pageflip, I'm not sure if
  there is a good way to enforce two-steps.  Having a standardized way
  to tell userspace to try later seems like a good thing.
 
  Sure, for that it seems reasonable.
 
 Also, if you pageflip on multiple CRTC's, should the be multiple
 vblank events, and multiple userdata's?

 That's a bit of an open question. I was considering several 
 options:
   
the thing I like about one ioctl per crtc is that it avoids this 
whole
question..
   
And, I think as long as you have to update multiple different 
scanout
address registers, there is always going to be a race in multi-crtc
flipping.  Having a single ioctl does make the race smaller.  I'm 
not
sure how important that point is.
   
Which race?
  
   ie. if you set REG_CRTC1_ADDR just immediately before vblank and
   REG_CRTC2_ADDR just after
  
   Well, with unsynced crtcs I wouldn't call that any kind of meaningful 
   race.
   The same problem after all exists even with a single crtc. You either 
   make
   the deadline and write the register before vblank, or you don't make it
   and end up with a repeated frame.
 
  I meant w/ sync'd crtc's, there is still no 100% guarantee that the
  two flip at the same time.
 
  Sure there is. That's what the vblank evade stuff gives you. I just
  happen to need it even when using just one crtc because the hardware
  doesn't have the necessary mechanism to flip several planes atomically.

 hmm, I guess I don't quite follow then.  But I guess I don't know the
 intel hw well enough.  It seemed like you weren't atomically updating
 scanout registers.

 I guarantee the atomicity by making sure I'm not too close to the start
 of vblank when I write the registers. It's a very generic solution that
 will work on any hardware with double buffered registers that get
 flipped on vblank. Even if some of the registers would get flipped at
 slightly different times (eg. plane A flips at vbl_start+1, plane B at
 vbl_start+10) you could still use this method by extending the range of
 scanlines to be avoided.

ahh, ok, double-buffered..  well, if they are double buffered you
should be able to tolerate two ioctl() calls, 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Ville Syrjälä
On Fri, Sep 14, 2012 at 12:34:59PM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 12:02 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 11:29:04AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
   On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  That's a bit of an open question. I was considering several 
  options:

 the thing I like about one ioctl per crtc is that it avoids this 
 whole
 question..

 And, I think as long as you have to update multiple different 
 scanout
 address registers, there is always going to be a race in 
 multi-crtc
 flipping.  Having a single ioctl does make the race smaller.  I'm 
 not
 sure how important that point is.

 Which race?
   
ie. if you set REG_CRTC1_ADDR just immediately before vblank and
REG_CRTC2_ADDR just after
   
Well, with unsynced crtcs I wouldn't call that any kind of meaningful 
race.
The same problem after all exists even with a single crtc. You either 
make
the deadline and write the register before vblank, or you don't make 
it
and end up with a repeated frame.
  
   I meant w/ sync'd crtc's, there is still no 100% guarantee that the
   two flip at the same time.
  
   Sure there is. That's what the vblank evade stuff gives you. I just
   happen to need it even when using just one crtc because the hardware
   doesn't have the necessary mechanism to flip several planes atomically.
 
  hmm, I guess I don't quite follow then.  But I guess I don't know the
  intel hw well enough.  It seemed like you weren't atomically updating
  scanout registers.
 
  I guarantee the atomicity by making sure I'm not too close to the start
  of vblank when I write the registers. It's a very generic solution that
  will work on any hardware with double buffered registers that get
  flipped on vblank. Even if some of the registers would get flipped at
  slightly different times (eg. plane A flips at vbl_start+1, plane B at
  vbl_start+10) you could still use this method by extending the range of
  scanlines to be avoided.
 
 ahh, ok, double-buffered..  well, if they are double buffered you
 should be able to tolerate two ioctl() calls, because you have a
 relatively large window to update all the registers ;-)

Hey, if should would be good enough, there would be no need for an
atomic page flip ioctl.

And somehwat ironically, if I didn't have double buffered registers,
I'd just write the lot of them from the vblank irq handler, which would
be simpler in some sense. Well, to tell the truth, not all registers in
Intel HW are double buffered. Gamma tables/ramps for example are single
buffered, and if we actually start to care about accurate color
reproduction we may need to mix the two approaches. The other approach
would be to reject changes to features backed by single buffered registers
while the relevant piece of hardware is enabled.

-- 
Ville Syrjälä
syrj...@sci.fi
http://www.sci.fi/~syrjala/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Jesse Barnes
On Wed, 12 Sep 2012 21:58:31 +0300
Ville Syrjälä ville.syrj...@linux.intel.com wrote:

 On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
  On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
   But I think we could still do this w/ one ioctl per crtc for 
   atomic-pageflip.
  
   We could, if we want to sacrifice the synced multi display case. I just
   think it might be a real use case at some point. IVI feels like the most
   likely short term cadidate to me, but perhaps someone would finally
   introduce some new style phone/tablet thingy. I have seen
   concepts/prototypes of such multi display gadgets in the past, but the
   industry apparently got a bit stuck on the rectangular slab with
   touchscreen on one side design.
  
  I could be wrong, but I think IVI the screens would normally be too
  far apart to matter?
 
 I was thinking of something like a display on the dash that normally
 sits low with only a small sliver visible, and extends upwards when
 you fire up a movie player for example. Internally it could be made
 up of two displays for power savings purposes.
 
  Anyways, it is really only a problem if you can't do two ioctl()s
  within one vblank period. If it actually turns out to be a real
  problem,
 
 Well exactly that's the problem this whole atomic pageflip stuff is
 trying to tackle, no? ;)
 
  we could always add later an ioctl that takes an array of
  'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
  really useful or not.. but maybe I'm thinking too much about how
  weston does it's rendering of different output's independently.
 
 I'm just now thinking of surfaceflinger's way of doing things, with
 its prepare and commit phases. If you need to issue two ioctls to handle
 cloned displays, then you can end up in a somewhat funky situation.
 
 Let's say you have a video overlay in use (one each display naturally),
 and you increase the downscaling factor enough so that you now have
 enough memory bandwith to support only one overlay. With independent
 check ioctls for each display, you never have the full device state
 available in the kernel, so each check succeeds during the prepare
 phase. So you decide that you can keep using the video overlays.
 
 You then proceed to commit the state, but after the first display has
 been commited you get an error when trying to commit the second one.
 What can you do now? The only option is to keep displaying the old
 frame on the other displays for some time longer, and then on the
 next frame you can switch to GPU composition. But on the next frame you
 perhaps no longer need to use GPU composition, but since you can't
 verify that in the prepare phase, you have no other option but to use
 GPU composition.
 
 So when you run into a configuration that can be supported only
 partially, you get animation stalls on some displays due to skipped
 frames, and you always have to fall back to GPU composition for the 
 next frame.
 
 If on the other hand you would check the whole state in one ioctl,
 you'd get the error in the first prepare phase, and could fall back
 to GPU composition immediately.
 
 Am I too much of a perfectionst in considering such things? I don't
 think so, but perhaps other people disagree.

I don't think there's any harm in having multiple ioctls for different
things.

I was initially hoping the nuclear page flip would be very simple.
Intended for simply updating buffers of several planes associated with
a single display.  That makes the inner loop of something like Wayland
or SF simple, but obviously doesn't cover every case (in fact I had
avoided dealing with moving planes initially).

Rob's patchset goes further than that, but obviously not as far as you
propose.

OTOH, keeping things really simple and not very featureful means there
are fewer points of failure, which is what I think callers would expect
from a flip API...

So where does that leave us?  I'd propose we have a very simple,
stripped down, single crtc flip ioctl, along with a big atomic mode set
ioctl, and then perhaps a fancier multi-crtc flip ioctl.

-- 
Jesse Barnes, Intel Open Source Technology Center
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 4:14 PM, Jesse Barnes jbar...@virtuousgeek.org wrote:
 On Wed, 12 Sep 2012 21:58:31 +0300
 Ville Syrjälä ville.syrj...@linux.intel.com wrote:

 On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
  On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
   But I think we could still do this w/ one ioctl per crtc for 
   atomic-pageflip.
  
   We could, if we want to sacrifice the synced multi display case. I just
   think it might be a real use case at some point. IVI feels like the most
   likely short term cadidate to me, but perhaps someone would finally
   introduce some new style phone/tablet thingy. I have seen
   concepts/prototypes of such multi display gadgets in the past, but the
   industry apparently got a bit stuck on the rectangular slab with
   touchscreen on one side design.
 
  I could be wrong, but I think IVI the screens would normally be too
  far apart to matter?

 I was thinking of something like a display on the dash that normally
 sits low with only a small sliver visible, and extends upwards when
 you fire up a movie player for example. Internally it could be made
 up of two displays for power savings purposes.

  Anyways, it is really only a problem if you can't do two ioctl()s
  within one vblank period. If it actually turns out to be a real
  problem,

 Well exactly that's the problem this whole atomic pageflip stuff is
 trying to tackle, no? ;)

  we could always add later an ioctl that takes an array of
  'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
  really useful or not.. but maybe I'm thinking too much about how
  weston does it's rendering of different output's independently.

 I'm just now thinking of surfaceflinger's way of doing things, with
 its prepare and commit phases. If you need to issue two ioctls to handle
 cloned displays, then you can end up in a somewhat funky situation.

 Let's say you have a video overlay in use (one each display naturally),
 and you increase the downscaling factor enough so that you now have
 enough memory bandwith to support only one overlay. With independent
 check ioctls for each display, you never have the full device state
 available in the kernel, so each check succeeds during the prepare
 phase. So you decide that you can keep using the video overlays.

 You then proceed to commit the state, but after the first display has
 been commited you get an error when trying to commit the second one.
 What can you do now? The only option is to keep displaying the old
 frame on the other displays for some time longer, and then on the
 next frame you can switch to GPU composition. But on the next frame you
 perhaps no longer need to use GPU composition, but since you can't
 verify that in the prepare phase, you have no other option but to use
 GPU composition.

 So when you run into a configuration that can be supported only
 partially, you get animation stalls on some displays due to skipped
 frames, and you always have to fall back to GPU composition for the
 next frame.

 If on the other hand you would check the whole state in one ioctl,
 you'd get the error in the first prepare phase, and could fall back
 to GPU composition immediately.

 Am I too much of a perfectionst in considering such things? I don't
 think so, but perhaps other people disagree.

 I don't think there's any harm in having multiple ioctls for different
 things.

 I was initially hoping the nuclear page flip would be very simple.
 Intended for simply updating buffers of several planes associated with
 a single display.  That makes the inner loop of something like Wayland
 or SF simple, but obviously doesn't cover every case (in fact I had
 avoided dealing with moving planes initially).

 Rob's patchset goes further than that, but obviously not as far as you
 propose.

 OTOH, keeping things really simple and not very featureful means there
 are fewer points of failure, which is what I think callers would expect
 from a flip API...

 So where does that leave us?  I'd propose we have a very simple,
 stripped down, single crtc flip ioctl, along with a big atomic mode set
 ioctl, and then perhaps a fancier multi-crtc flip ioctl.

well, I do think it is quite useful to be able to enable/disable
planes as part as part of the flip.. you really don't want to have to
block the compositor for a synchronous operation to enable/disable a
plane..  even if you have to return -EBUSY for a transition (ie. if a
plane is still pending vblank on other crtc to switch over, etc)

I am on the fence about multi-crtc flip.  Although the
everything-is-a-property approach does, I suppose, means we could
support it with one ioctl.  Maybe we should start without.  If later
we want to support multi-crtc flip, we can add a driver cap to give
userspace an idea about what it could expect to work.

What I am leaning towards now is an ioctl a bit more like 

Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Fri, Sep 14, 2012 at 1:23 PM, Ville Syrjälä syrj...@sci.fi wrote:
 On Fri, Sep 14, 2012 at 12:34:59PM -0500, Rob Clark wrote:
 On Fri, Sep 14, 2012 at 12:02 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Fri, Sep 14, 2012 at 11:29:04AM -0500, Rob Clark wrote:
  On Fri, Sep 14, 2012 at 10:48 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Fri, Sep 14, 2012 at 09:45:18AM -0500, Rob Clark wrote:
   On Fri, Sep 14, 2012 at 8:58 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Fri, Sep 14, 2012 at 08:25:53AM -0500, Rob Clark wrote:
On Fri, Sep 14, 2012 at 7:50 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 11:35:59AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  That's a bit of an open question. I was considering several 
  options:

 the thing I like about one ioctl per crtc is that it avoids this 
 whole
 question..

 And, I think as long as you have to update multiple different 
 scanout
 address registers, there is always going to be a race in 
 multi-crtc
 flipping.  Having a single ioctl does make the race smaller.  
 I'm not
 sure how important that point is.

 Which race?
   
ie. if you set REG_CRTC1_ADDR just immediately before vblank and
REG_CRTC2_ADDR just after
   
Well, with unsynced crtcs I wouldn't call that any kind of 
meaningful race.
The same problem after all exists even with a single crtc. You 
either make
the deadline and write the register before vblank, or you don't make 
it
and end up with a repeated frame.
  
   I meant w/ sync'd crtc's, there is still no 100% guarantee that the
   two flip at the same time.
  
   Sure there is. That's what the vblank evade stuff gives you. I just
   happen to need it even when using just one crtc because the hardware
   doesn't have the necessary mechanism to flip several planes atomically.
 
  hmm, I guess I don't quite follow then.  But I guess I don't know the
  intel hw well enough.  It seemed like you weren't atomically updating
  scanout registers.
 
  I guarantee the atomicity by making sure I'm not too close to the start
  of vblank when I write the registers. It's a very generic solution that
  will work on any hardware with double buffered registers that get
  flipped on vblank. Even if some of the registers would get flipped at
  slightly different times (eg. plane A flips at vbl_start+1, plane B at
  vbl_start+10) you could still use this method by extending the range of
  scanlines to be avoided.

 ahh, ok, double-buffered..  well, if they are double buffered you
 should be able to tolerate two ioctl() calls, because you have a
 relatively large window to update all the registers ;-)

 Hey, if should would be good enough, there would be no need for an
 atomic page flip ioctl.

Well, true.. but there is a bit of a difference of scale.. I mean
flipping multiple layers on a single CRTC that is not vblank sync'd
with another CRTC should be a common case, and you can get incorrect
results on the screen, so an it *should* work solution is less
acceptable.

vsync locked crtc's seem like it would be less common, and less likely
to be noticed if once in a while the flip is off by a frame.  So it
seems like a less urgent issue to solve.

Just playing devil's advocate here ;-)

BR,
-R

 And somehwat ironically, if I didn't have double buffered registers,
 I'd just write the lot of them from the vblank irq handler, which would
 be simpler in some sense. Well, to tell the truth, not all registers in
 Intel HW are double buffered. Gamma tables/ramps for example are single
 buffered, and if we actually start to care about accurate color
 reproduction we may need to mix the two approaches. The other approach
 would be to reject changes to features backed by single buffered registers
 while the relevant piece of hardware is enabled.

 --
 Ville Syrjälä
 syrj...@sci.fi
 http://www.sci.fi/~syrjala/
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-14 Thread Rob Clark
On Thu, Sep 13, 2012 at 11:35 AM, Rob Clark rob.cl...@linaro.org wrote:
 note that the test phase doesn't need vblank events, and also
 shouldn't -EBUSY if there is still a pending flip[*], so I'd propose
 that however we go about pageflip (one super-ioctl, or one per crtc),
 we could use the atomic-modeset ioctl for the test step

actually, I think I take this back..  one thing that was discussed on
IRC, but didn't make it to this email thread is the behavior of
non-specified properties.  What I am thinking:

modeset: unspecified properties revert to default
pageflip: unspecified properties preserve current value

So I definitely do think there should be two ioctls, and that test for
pageflip should go via atomic-pageflip ioctl to be consistent w/ the
preserve-current-values approach.  Instead I'll just move the
is-there-a-pending-vblank to the top of atomic_commit() so it doesn't
get in the way if you try to test for frame n+1 while waiting for
vblank from frame n.


BR.
-R
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-13 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
  On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
   But I think we could still do this w/ one ioctl per crtc for 
   atomic-pageflip.
  
   We could, if we want to sacrifice the synced multi display case. I just
   think it might be a real use case at some point. IVI feels like the most
   likely short term cadidate to me, but perhaps someone would finally
   introduce some new style phone/tablet thingy. I have seen
   concepts/prototypes of such multi display gadgets in the past, but the
   industry apparently got a bit stuck on the rectangular slab with
   touchscreen on one side design.
 
  I could be wrong, but I think IVI the screens would normally be too
  far apart to matter?
 
  I was thinking of something like a display on the dash that normally
  sits low with only a small sliver visible, and extends upwards when
  you fire up a movie player for example. Internally it could be made
  up of two displays for power savings purposes.
 
  Anyways, it is really only a problem if you can't do two ioctl()s
  within one vblank period. If it actually turns out to be a real
  problem,
 
  Well exactly that's the problem this whole atomic pageflip stuff is
  trying to tackle, no? ;)
 
  we could always add later an ioctl that takes an array of
  'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
  really useful or not.. but maybe I'm thinking too much about how
  weston does it's rendering of different output's independently.
 
  I'm just now thinking of surfaceflinger's way of doing things, with
  its prepare and commit phases. If you need to issue two ioctls to handle
  cloned displays, then you can end up in a somewhat funky situation.
 
  Let's say you have a video overlay in use (one each display naturally),
  and you increase the downscaling factor enough so that you now have
  enough memory bandwith to support only one overlay. With independent
  check ioctls for each display, you never have the full device state
  available in the kernel, so each check succeeds during the prepare
  phase. So you decide that you can keep using the video overlays.
 
  You then proceed to commit the state, but after the first display has
  been commited you get an error when trying to commit the second one.
  What can you do now? The only option is to keep displaying the old
  frame on the other displays for some time longer, and then on the
  next frame you can switch to GPU composition. But on the next frame you
  perhaps no longer need to use GPU composition, but since you can't
  verify that in the prepare phase, you have no other option but to use
  GPU composition.
 
  So when you run into a configuration that can be supported only
  partially, you get animation stalls on some displays due to skipped
  frames, and you always have to fall back to GPU composition for the
  next frame.
 
  If on the other hand you would check the whole state in one ioctl,
  you'd get the error in the first prepare phase, and could fall back
  to GPU composition immediately.
 
  Am I too much of a perfectionst in considering such things? I don't
  think so, but perhaps other people disagree.
 
 Ok, if you have a case where the state of the two crtc's are not
 actually independent, then I think you have a valid point.
 
 I'm not quite sure what userspace would do about it, though.. for the
 general case where vsync isn't locked, and you can't even necessarily
 assume vsync period is the same, then I don't really think you want to
 couple rendering to the displays.

I would say this is going to be the most common use case if you consider
just the number of shipping devices. It's pretty much what every Android
phone/tablet with a HDMI port has to do.

The solution we came up when working with Medfield was to throttle to
the internal display (usually ~60Hz), and let the external display drop
however many frames it needs. But I still wanted the external display to
always show the most recent frame possible, hence I came up with drm_flip
which supports that just fine. So we used three buffers and just issued
flips at the rate of the internal display, and the external display with
a 30Hz or 24Hz refresh rate ended up dropping some of the frames. It did
look fairly good actually. 

As a proof of the drm_flip implementation, I also did a quick hack where
I bumped the number of buffers up to ~15 or so, and then removed the
display related throttling completely (apart from preventing the GPU
from writing to an active scanout buffer). This allowed the app and
compositor to render at ~400 fps, and each displays would end up
displaying 60 or 30 of those frames each second without tearing. Now,
the only reason I needed so many 

Re: [RFC 0/9] nuclear pageflip

2012-09-13 Thread Rob Clark
On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
  On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
   But I think we could still do this w/ one ioctl per crtc for 
   atomic-pageflip.
  
   We could, if we want to sacrifice the synced multi display case. I just
   think it might be a real use case at some point. IVI feels like the most
   likely short term cadidate to me, but perhaps someone would finally
   introduce some new style phone/tablet thingy. I have seen
   concepts/prototypes of such multi display gadgets in the past, but the
   industry apparently got a bit stuck on the rectangular slab with
   touchscreen on one side design.
 
  I could be wrong, but I think IVI the screens would normally be too
  far apart to matter?
 
  I was thinking of something like a display on the dash that normally
  sits low with only a small sliver visible, and extends upwards when
  you fire up a movie player for example. Internally it could be made
  up of two displays for power savings purposes.
 
  Anyways, it is really only a problem if you can't do two ioctl()s
  within one vblank period. If it actually turns out to be a real
  problem,
 
  Well exactly that's the problem this whole atomic pageflip stuff is
  trying to tackle, no? ;)
 
  we could always add later an ioctl that takes an array of
  'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
  really useful or not.. but maybe I'm thinking too much about how
  weston does it's rendering of different output's independently.
 
  I'm just now thinking of surfaceflinger's way of doing things, with
  its prepare and commit phases. If you need to issue two ioctls to handle
  cloned displays, then you can end up in a somewhat funky situation.
 
  Let's say you have a video overlay in use (one each display naturally),
  and you increase the downscaling factor enough so that you now have
  enough memory bandwith to support only one overlay. With independent
  check ioctls for each display, you never have the full device state
  available in the kernel, so each check succeeds during the prepare
  phase. So you decide that you can keep using the video overlays.
 
  You then proceed to commit the state, but after the first display has
  been commited you get an error when trying to commit the second one.
  What can you do now? The only option is to keep displaying the old
  frame on the other displays for some time longer, and then on the
  next frame you can switch to GPU composition. But on the next frame you
  perhaps no longer need to use GPU composition, but since you can't
  verify that in the prepare phase, you have no other option but to use
  GPU composition.
 
  So when you run into a configuration that can be supported only
  partially, you get animation stalls on some displays due to skipped
  frames, and you always have to fall back to GPU composition for the
  next frame.
 
  If on the other hand you would check the whole state in one ioctl,
  you'd get the error in the first prepare phase, and could fall back
  to GPU composition immediately.
 
  Am I too much of a perfectionst in considering such things? I don't
  think so, but perhaps other people disagree.

 Ok, if you have a case where the state of the two crtc's are not
 actually independent, then I think you have a valid point.

 I'm not quite sure what userspace would do about it, though.. for the
 general case where vsync isn't locked, and you can't even necessarily
 assume vsync period is the same, then I don't really think you want to
 couple rendering to the displays.

 I would say this is going to be the most common use case if you consider
 just the number of shipping devices. It's pretty much what every Android
 phone/tablet with a HDMI port has to do.

bleh, surfaceflinger kinda sucks then..

 The solution we came up when working with Medfield was to throttle to
 the internal display (usually ~60Hz), and let the external display drop
 however many frames it needs. But I still wanted the external display to
 always show the most recent frame possible, hence I came up with drm_flip
 which supports that just fine. So we used three buffers and just issued
 flips at the rate of the internal display, and the external display with
 a 30Hz or 24Hz refresh rate ended up dropping some of the frames. It did
 look fairly good actually.

 As a proof of the drm_flip implementation, I also did a quick hack where
 I bumped the number of buffers up to ~15 or so, and then removed the
 display related throttling completely (apart from preventing the GPU
 from writing to an active scanout buffer). This allowed the app and
 compositor to render at ~400 

Re: [RFC 0/9] nuclear pageflip

2012-09-13 Thread Ville Syrjälä
On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
   On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
But I think we could still do this w/ one ioctl per crtc for 
atomic-pageflip.
   
We could, if we want to sacrifice the synced multi display case. I 
just
think it might be a real use case at some point. IVI feels like the 
most
likely short term cadidate to me, but perhaps someone would finally
introduce some new style phone/tablet thingy. I have seen
concepts/prototypes of such multi display gadgets in the past, but the
industry apparently got a bit stuck on the rectangular slab with
touchscreen on one side design.
  
   I could be wrong, but I think IVI the screens would normally be too
   far apart to matter?
  
   I was thinking of something like a display on the dash that normally
   sits low with only a small sliver visible, and extends upwards when
   you fire up a movie player for example. Internally it could be made
   up of two displays for power savings purposes.
  
   Anyways, it is really only a problem if you can't do two ioctl()s
   within one vblank period. If it actually turns out to be a real
   problem,
  
   Well exactly that's the problem this whole atomic pageflip stuff is
   trying to tackle, no? ;)
  
   we could always add later an ioctl that takes an array of
   'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
   really useful or not.. but maybe I'm thinking too much about how
   weston does it's rendering of different output's independently.
  
   I'm just now thinking of surfaceflinger's way of doing things, with
   its prepare and commit phases. If you need to issue two ioctls to handle
   cloned displays, then you can end up in a somewhat funky situation.
  
   Let's say you have a video overlay in use (one each display naturally),
   and you increase the downscaling factor enough so that you now have
   enough memory bandwith to support only one overlay. With independent
   check ioctls for each display, you never have the full device state
   available in the kernel, so each check succeeds during the prepare
   phase. So you decide that you can keep using the video overlays.
  
   You then proceed to commit the state, but after the first display has
   been commited you get an error when trying to commit the second one.
   What can you do now? The only option is to keep displaying the old
   frame on the other displays for some time longer, and then on the
   next frame you can switch to GPU composition. But on the next frame you
   perhaps no longer need to use GPU composition, but since you can't
   verify that in the prepare phase, you have no other option but to use
   GPU composition.
  
   So when you run into a configuration that can be supported only
   partially, you get animation stalls on some displays due to skipped
   frames, and you always have to fall back to GPU composition for the
   next frame.
  
   If on the other hand you would check the whole state in one ioctl,
   you'd get the error in the first prepare phase, and could fall back
   to GPU composition immediately.
  
   Am I too much of a perfectionst in considering such things? I don't
   think so, but perhaps other people disagree.
 
  Ok, if you have a case where the state of the two crtc's are not
  actually independent, then I think you have a valid point.
 
  I'm not quite sure what userspace would do about it, though.. for the
  general case where vsync isn't locked, and you can't even necessarily
  assume vsync period is the same, then I don't really think you want to
  couple rendering to the displays.
 
  I would say this is going to be the most common use case if you consider
  just the number of shipping devices. It's pretty much what every Android
  phone/tablet with a HDMI port has to do.
 
 bleh, surfaceflinger kinda sucks then..

Why? This use case is not enforced by surfaceflinger, it's just the use
case most devices would have.

I don't think there's anything wrong with the way surfaceflinger is designed
with the prepare and commit phases. How else would you do it?

  From userspace API, I guess something like:
 
  struct drm_mode_crtc_atomic_page_flip {
uint32_t flags;
uint32_t count_crtcs;
uint64_t crtc_ids_ptr;  /* array of uint32_t */
uint64_t count_props_ptr; /* array of uint32_t, # of prop's per crtc 
  */
uint64_t props_ptr;  /* ptr to array of drm_mode_obj_set_property */
uint64_t user_data;
  };
 
  Starting to look much like my drm_mode_atomic struct :)
 
  

Re: [RFC 0/9] nuclear pageflip

2012-09-13 Thread Rob Clark
On Thu, Sep 13, 2012 at 9:29 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Thu, Sep 13, 2012 at 08:39:54AM -0500, Rob Clark wrote:
 On Thu, Sep 13, 2012 at 3:40 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 02:40:56PM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
   On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
But I think we could still do this w/ one ioctl per crtc for 
atomic-pageflip.
   
We could, if we want to sacrifice the synced multi display case. I 
just
think it might be a real use case at some point. IVI feels like the 
most
likely short term cadidate to me, but perhaps someone would finally
introduce some new style phone/tablet thingy. I have seen
concepts/prototypes of such multi display gadgets in the past, but 
the
industry apparently got a bit stuck on the rectangular slab with
touchscreen on one side design.
  
   I could be wrong, but I think IVI the screens would normally be too
   far apart to matter?
  
   I was thinking of something like a display on the dash that normally
   sits low with only a small sliver visible, and extends upwards when
   you fire up a movie player for example. Internally it could be made
   up of two displays for power savings purposes.
  
   Anyways, it is really only a problem if you can't do two ioctl()s
   within one vblank period. If it actually turns out to be a real
   problem,
  
   Well exactly that's the problem this whole atomic pageflip stuff is
   trying to tackle, no? ;)
  
   we could always add later an ioctl that takes an array of
   'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
   really useful or not.. but maybe I'm thinking too much about how
   weston does it's rendering of different output's independently.
  
   I'm just now thinking of surfaceflinger's way of doing things, with
   its prepare and commit phases. If you need to issue two ioctls to handle
   cloned displays, then you can end up in a somewhat funky situation.
  
   Let's say you have a video overlay in use (one each display naturally),
   and you increase the downscaling factor enough so that you now have
   enough memory bandwith to support only one overlay. With independent
   check ioctls for each display, you never have the full device state
   available in the kernel, so each check succeeds during the prepare
   phase. So you decide that you can keep using the video overlays.
  
   You then proceed to commit the state, but after the first display has
   been commited you get an error when trying to commit the second one.
   What can you do now? The only option is to keep displaying the old
   frame on the other displays for some time longer, and then on the
   next frame you can switch to GPU composition. But on the next frame you
   perhaps no longer need to use GPU composition, but since you can't
   verify that in the prepare phase, you have no other option but to use
   GPU composition.
  
   So when you run into a configuration that can be supported only
   partially, you get animation stalls on some displays due to skipped
   frames, and you always have to fall back to GPU composition for the
   next frame.
  
   If on the other hand you would check the whole state in one ioctl,
   you'd get the error in the first prepare phase, and could fall back
   to GPU composition immediately.
  
   Am I too much of a perfectionst in considering such things? I don't
   think so, but perhaps other people disagree.
 
  Ok, if you have a case where the state of the two crtc's are not
  actually independent, then I think you have a valid point.
 
  I'm not quite sure what userspace would do about it, though.. for the
  general case where vsync isn't locked, and you can't even necessarily
  assume vsync period is the same, then I don't really think you want to
  couple rendering to the displays.
 
  I would say this is going to be the most common use case if you consider
  just the number of shipping devices. It's pretty much what every Android
  phone/tablet with a HDMI port has to do.

 bleh, surfaceflinger kinda sucks then..

 Why? This use case is not enforced by surfaceflinger, it's just the use
 case most devices would have.

 I don't think there's anything wrong with the way surfaceflinger is designed
 with the prepare and commit phases. How else would you do it?

well, maybe I misunderstood how surfaceflinger works, but it sounded
like it has one prepare/commit phase across outputs, vs what weston
compositor does where each output is rendered and flipped
independently at the rate of that particular output.  If the two
outputs just happen to be vsync aligned, you would end up flipping at
the same time, but if the are not 

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
 On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
  On Sun, 9 Sep 2012 22:19:59 -0500
  Rob Clark r...@ti.com wrote:
 
  On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
   From: Rob Clark r...@ti.com
  
   This is following a bit the approach that Ville is taking for atomic-
   modeset, in that it is switching over to using properties for everything.
   The advantage of this approach is that it makes it easier to add new
   attributes to set as part of a page-flip (and even opens the option to
   add new object types).
 
  oh, and for those wondering what the hell this is all about,
  nuclear-pageflip is a pageflip that atomically updates the CRTC layer
  and zero or more attached plane layers, optionally changing various
  properties such as z-order (or even blending modes, etc) atomically.
  It includes the option for a test step so userspace compositor can
  test a proposed configuration (if any properties not marked as
  'dynamic' are updated).  This intended to allow, for example, weston
  compositor to synchronize updates to plane (sprite) layers with CRTC
  layer.
 
 
  Can we please name this something different? I complained about this in
  IRC, but I don't know if you hang around in #intel-gfx.
 
  Some suggestions:
  multiplane pageflip
  muliplane atomic pageflip
  atomic multiplane pageflip
  atomic multiflip
  pageflip of atomic and multiplane nature
 
  Nuclear is not at all descriptive and requires as your response shows
  :-)
 
 
 fair enough.. I was originally calling it atomic-pageflip until
 someone (I don't remember who) started calling it nuclear-pageflip to
 differentiate from atomic-modeset.  But IMO we could just call the two
 ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
 pageflip2, but that seems much more boring)

My plan has been to use the same ioctl for both uses. They'll need
nearly identical handling anyway on the ioctl level. I have something
semi-working currently, but I need to clean it up a bit before I can
show it to anyone.

-- 
Ville Syrjälä
Intel OTC
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
 On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
  On Sun, 9 Sep 2012 22:19:59 -0500
  Rob Clark r...@ti.com wrote:
 
  On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
   From: Rob Clark r...@ti.com
  
   This is following a bit the approach that Ville is taking for atomic-
   modeset, in that it is switching over to using properties for 
   everything.
   The advantage of this approach is that it makes it easier to add new
   attributes to set as part of a page-flip (and even opens the option to
   add new object types).
 
  oh, and for those wondering what the hell this is all about,
  nuclear-pageflip is a pageflip that atomically updates the CRTC layer
  and zero or more attached plane layers, optionally changing various
  properties such as z-order (or even blending modes, etc) atomically.
  It includes the option for a test step so userspace compositor can
  test a proposed configuration (if any properties not marked as
  'dynamic' are updated).  This intended to allow, for example, weston
  compositor to synchronize updates to plane (sprite) layers with CRTC
  layer.
 
 
  Can we please name this something different? I complained about this in
  IRC, but I don't know if you hang around in #intel-gfx.
 
  Some suggestions:
  multiplane pageflip
  muliplane atomic pageflip
  atomic multiplane pageflip
  atomic multiflip
  pageflip of atomic and multiplane nature
 
  Nuclear is not at all descriptive and requires as your response shows
  :-)
 

 fair enough.. I was originally calling it atomic-pageflip until
 someone (I don't remember who) started calling it nuclear-pageflip to
 differentiate from atomic-modeset.  But IMO we could just call the two
 ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
 pageflip2, but that seems much more boring)

 My plan has been to use the same ioctl for both uses. They'll need
 nearly identical handling anyway on the ioctl level. I have something
 semi-working currently, but I need to clean it up a bit before I can
 show it to anyone.

I do think the atomic-pageflip ioctl really should take the crtc-id..
so probably should be two ioctls, but nearly identical

BR,
-R

 --
 Ville Syrjälä
 Intel OTC
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
  On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
   On Sun, 9 Sep 2012 22:19:59 -0500
   Rob Clark r...@ti.com wrote:
  
   On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
From: Rob Clark r...@ti.com
   
This is following a bit the approach that Ville is taking for atomic-
modeset, in that it is switching over to using properties for 
everything.
The advantage of this approach is that it makes it easier to add new
attributes to set as part of a page-flip (and even opens the option to
add new object types).
  
   oh, and for those wondering what the hell this is all about,
   nuclear-pageflip is a pageflip that atomically updates the CRTC layer
   and zero or more attached plane layers, optionally changing various
   properties such as z-order (or even blending modes, etc) atomically.
   It includes the option for a test step so userspace compositor can
   test a proposed configuration (if any properties not marked as
   'dynamic' are updated).  This intended to allow, for example, weston
   compositor to synchronize updates to plane (sprite) layers with CRTC
   layer.
  
  
   Can we please name this something different? I complained about this in
   IRC, but I don't know if you hang around in #intel-gfx.
  
   Some suggestions:
   multiplane pageflip
   muliplane atomic pageflip
   atomic multiplane pageflip
   atomic multiflip
   pageflip of atomic and multiplane nature
  
   Nuclear is not at all descriptive and requires as your response shows
   :-)
  
 
  fair enough.. I was originally calling it atomic-pageflip until
  someone (I don't remember who) started calling it nuclear-pageflip to
  differentiate from atomic-modeset.  But IMO we could just call the two
  ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
  pageflip2, but that seems much more boring)
 
  My plan has been to use the same ioctl for both uses. They'll need
  nearly identical handling anyway on the ioctl level. I have something
  semi-working currently, but I need to clean it up a bit before I can
  show it to anyone.
 
 I do think the atomic-pageflip ioctl really should take the crtc-id..
 so probably should be two ioctls, but nearly identical

But then you can't support atomic pageflips across multiple crtcs even
if the hardware would allow it. I would hate to add such limitation to
the API. I can immediately think of a use case; combining several
smaller displays to form a single larger display.

With a unified API you could just add some kind of flag that tells the
kernel that user space really wants an atomic update, and if the
driver/hardware can't do it, it can return an error.

-- 
Ville Syrjälä
Intel OTC
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
  On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
   On Sun, 9 Sep 2012 22:19:59 -0500
   Rob Clark r...@ti.com wrote:
  
   On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org 
   wrote:
From: Rob Clark r...@ti.com
   
This is following a bit the approach that Ville is taking for atomic-
modeset, in that it is switching over to using properties for 
everything.
The advantage of this approach is that it makes it easier to add new
attributes to set as part of a page-flip (and even opens the option 
to
add new object types).
  
   oh, and for those wondering what the hell this is all about,
   nuclear-pageflip is a pageflip that atomically updates the CRTC layer
   and zero or more attached plane layers, optionally changing various
   properties such as z-order (or even blending modes, etc) atomically.
   It includes the option for a test step so userspace compositor can
   test a proposed configuration (if any properties not marked as
   'dynamic' are updated).  This intended to allow, for example, weston
   compositor to synchronize updates to plane (sprite) layers with CRTC
   layer.
  
  
   Can we please name this something different? I complained about this in
   IRC, but I don't know if you hang around in #intel-gfx.
  
   Some suggestions:
   multiplane pageflip
   muliplane atomic pageflip
   atomic multiplane pageflip
   atomic multiflip
   pageflip of atomic and multiplane nature
  
   Nuclear is not at all descriptive and requires as your response shows
   :-)
  
 
  fair enough.. I was originally calling it atomic-pageflip until
  someone (I don't remember who) started calling it nuclear-pageflip to
  differentiate from atomic-modeset.  But IMO we could just call the two
  ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
  pageflip2, but that seems much more boring)
 
  My plan has been to use the same ioctl for both uses. They'll need
  nearly identical handling anyway on the ioctl level. I have something
  semi-working currently, but I need to clean it up a bit before I can
  show it to anyone.

 I do think the atomic-pageflip ioctl really should take the crtc-id..
 so probably should be two ioctls, but nearly identical

 But then you can't support atomic pageflips across multiple crtcs even
 if the hardware would allow it. I would hate to add such limitation to
 the API. I can immediately think of a use case; combining several
 smaller displays to form a single larger display.

 With a unified API you could just add some kind of flag that tells the
 kernel that user space really wants an atomic update, and if the
 driver/hardware can't do it, it can return an error.

well, that is really only a problem for X11..  and atomic flip across
multiple crtc's is a potential mess from performance standpoint unless
your displays are vsync'd lock.

weston already renders each output individually, rather than spanning
a single fb across multiple displays like x11 does.  So this problem
really doesn't exist for weston.

BR,
-R

 --
 Ville Syrjälä
 Intel OTC
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
   On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
On Sun, 9 Sep 2012 22:19:59 -0500
Rob Clark r...@ti.com wrote:
   
On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org 
wrote:
 From: Rob Clark r...@ti.com

 This is following a bit the approach that Ville is taking for 
 atomic-
 modeset, in that it is switching over to using properties for 
 everything.
 The advantage of this approach is that it makes it easier to add 
 new
 attributes to set as part of a page-flip (and even opens the 
 option to
 add new object types).
   
oh, and for those wondering what the hell this is all about,
nuclear-pageflip is a pageflip that atomically updates the CRTC layer
and zero or more attached plane layers, optionally changing various
properties such as z-order (or even blending modes, etc) atomically.
It includes the option for a test step so userspace compositor can
test a proposed configuration (if any properties not marked as
'dynamic' are updated).  This intended to allow, for example, weston
compositor to synchronize updates to plane (sprite) layers with CRTC
layer.
   
   
Can we please name this something different? I complained about this 
in
IRC, but I don't know if you hang around in #intel-gfx.
   
Some suggestions:
multiplane pageflip
muliplane atomic pageflip
atomic multiplane pageflip
atomic multiflip
pageflip of atomic and multiplane nature
   
Nuclear is not at all descriptive and requires as your response shows
:-)
   
  
   fair enough.. I was originally calling it atomic-pageflip until
   someone (I don't remember who) started calling it nuclear-pageflip to
   differentiate from atomic-modeset.  But IMO we could just call the two
   ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
   pageflip2, but that seems much more boring)
  
   My plan has been to use the same ioctl for both uses. They'll need
   nearly identical handling anyway on the ioctl level. I have something
   semi-working currently, but I need to clean it up a bit before I can
   show it to anyone.
 
  I do think the atomic-pageflip ioctl really should take the crtc-id..
  so probably should be two ioctls, but nearly identical
 
  But then you can't support atomic pageflips across multiple crtcs even
  if the hardware would allow it. I would hate to add such limitation to
  the API. I can immediately think of a use case; combining several
  smaller displays to form a single larger display.
 
  With a unified API you could just add some kind of flag that tells the
  kernel that user space really wants an atomic update, and if the
  driver/hardware can't do it, it can return an error.
 
 well, that is really only a problem for X11..  and atomic flip across
 multiple crtc's is a potential mess from performance standpoint unless
 your displays are vsync'd lock.

It won't be truly atomic unless they are vsync locked. But anyways more 
buffers can be used to solve the performance problem. But that's a
separate issue and in that case it doesn't really matter whether you
issue separate ioctls for each crtc.

 weston already renders each output individually, rather than spanning
 a single fb across multiple displays like x11 does.  So this problem
 really doesn't exist for weston.

It does if you want to make sure the user sees the flip on both displays
at the same time.

-- 
Ville Syrjälä
Intel OTC
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
   On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net 
   wrote:
On Sun, 9 Sep 2012 22:19:59 -0500
Rob Clark r...@ti.com wrote:
   
On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org 
wrote:
 From: Rob Clark r...@ti.com

 This is following a bit the approach that Ville is taking for 
 atomic-
 modeset, in that it is switching over to using properties for 
 everything.
 The advantage of this approach is that it makes it easier to add 
 new
 attributes to set as part of a page-flip (and even opens the 
 option to
 add new object types).
   
oh, and for those wondering what the hell this is all about,
nuclear-pageflip is a pageflip that atomically updates the CRTC 
layer
and zero or more attached plane layers, optionally changing various
properties such as z-order (or even blending modes, etc) atomically.
It includes the option for a test step so userspace compositor can
test a proposed configuration (if any properties not marked as
'dynamic' are updated).  This intended to allow, for example, weston
compositor to synchronize updates to plane (sprite) layers with CRTC
layer.
   
   
Can we please name this something different? I complained about this 
in
IRC, but I don't know if you hang around in #intel-gfx.
   
Some suggestions:
multiplane pageflip
muliplane atomic pageflip
atomic multiplane pageflip
atomic multiflip
pageflip of atomic and multiplane nature
   
Nuclear is not at all descriptive and requires as your response shows
:-)
   
  
   fair enough.. I was originally calling it atomic-pageflip until
   someone (I don't remember who) started calling it nuclear-pageflip to
   differentiate from atomic-modeset.  But IMO we could just call the two
   ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
   pageflip2, but that seems much more boring)
  
   My plan has been to use the same ioctl for both uses. They'll need
   nearly identical handling anyway on the ioctl level. I have something
   semi-working currently, but I need to clean it up a bit before I can
   show it to anyone.
 
  I do think the atomic-pageflip ioctl really should take the crtc-id..
  so probably should be two ioctls, but nearly identical
 
  But then you can't support atomic pageflips across multiple crtcs even
  if the hardware would allow it. I would hate to add such limitation to
  the API. I can immediately think of a use case; combining several
  smaller displays to form a single larger display.
 
  With a unified API you could just add some kind of flag that tells the
  kernel that user space really wants an atomic update, and if the
  driver/hardware can't do it, it can return an error.

 well, that is really only a problem for X11..  and atomic flip across
 multiple crtc's is a potential mess from performance standpoint unless
 your displays are vsync'd lock.

 It won't be truly atomic unless they are vsync locked. But anyways more
 buffers can be used to solve the performance problem. But that's a
 separate issue and in that case it doesn't really matter whether you
 issue separate ioctls for each crtc.

that was basically my thinking.. separate ioctls for each crtc.  The
way my branch works currently, you can do this.  A page-flip on crtc
#2 won't care that there is still a flip pending on crtc #1.

I guess that doesn't strictly guarantee that the two pageflips happen
at the exact same time, but unless you have some way to vsync lock the
two displays, I don't think that is possible anyways.  So I'm not
really sure it is worth over-complicating the ioctl to support two
crtc's.  The error checking in case a vsync is still pending is much
easier in the driver if you know the crtc up-front at the
atomic_begin() point.  Which is why I prefer to pass the crtc_id as a
field in the ioctl.

BR,
-R

 weston already renders each output individually, rather than spanning
 a single fb across multiple displays like x11 does.  So this problem
 really doesn't exist for weston.

 It does if you want to make sure the user sees the flip on both displays
 at the same time.

 --
 Ville Syrjälä
 Intel OTC
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net 
wrote:
 On Sun, 9 Sep 2012 22:19:59 -0500
 Rob Clark r...@ti.com wrote:

 On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org 
 wrote:
  From: Rob Clark r...@ti.com
 
  This is following a bit the approach that Ville is taking for 
  atomic-
  modeset, in that it is switching over to using properties for 
  everything.
  The advantage of this approach is that it makes it easier to 
  add new
  attributes to set as part of a page-flip (and even opens the 
  option to
  add new object types).

 oh, and for those wondering what the hell this is all about,
 nuclear-pageflip is a pageflip that atomically updates the CRTC 
 layer
 and zero or more attached plane layers, optionally changing 
 various
 properties such as z-order (or even blending modes, etc) 
 atomically.
 It includes the option for a test step so userspace compositor can
 test a proposed configuration (if any properties not marked as
 'dynamic' are updated).  This intended to allow, for example, 
 weston
 compositor to synchronize updates to plane (sprite) layers with 
 CRTC
 layer.


 Can we please name this something different? I complained about 
 this in
 IRC, but I don't know if you hang around in #intel-gfx.

 Some suggestions:
 multiplane pageflip
 muliplane atomic pageflip
 atomic multiplane pageflip
 atomic multiflip
 pageflip of atomic and multiplane nature

 Nuclear is not at all descriptive and requires as your response 
 shows
 :-)

   
fair enough.. I was originally calling it atomic-pageflip until
someone (I don't remember who) started calling it nuclear-pageflip to
differentiate from atomic-modeset.  But IMO we could just call the 
two
ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
pageflip2, but that seems much more boring)
   
My plan has been to use the same ioctl for both uses. They'll need
nearly identical handling anyway on the ioctl level. I have something
semi-working currently, but I need to clean it up a bit before I can
show it to anyone.
  
   I do think the atomic-pageflip ioctl really should take the crtc-id..
   so probably should be two ioctls, but nearly identical
  
   But then you can't support atomic pageflips across multiple crtcs even
   if the hardware would allow it. I would hate to add such limitation to
   the API. I can immediately think of a use case; combining several
   smaller displays to form a single larger display.
  
   With a unified API you could just add some kind of flag that tells the
   kernel that user space really wants an atomic update, and if the
   driver/hardware can't do it, it can return an error.
 
  well, that is really only a problem for X11..  and atomic flip across
  multiple crtc's is a potential mess from performance standpoint unless
  your displays are vsync'd lock.
 
  It won't be truly atomic unless they are vsync locked. But anyways more
  buffers can be used to solve the performance problem. But that's a
  separate issue and in that case it doesn't really matter whether you
  issue separate ioctls for each crtc.
 
 that was basically my thinking.. separate ioctls for each crtc.  The
 way my branch works currently, you can do this.  A page-flip on crtc
 #2 won't care that there is still a flip pending on crtc #1.
 
 I guess that doesn't strictly guarantee that the two pageflips happen
 at the exact same time, but unless you have some way to vsync lock the
 two displays, I don't think that is possible anyways.

Sure you need hardware to keep the pipes in sync.

 So I'm not
 really sure it is worth over-complicating the ioctl to support two
 crtc's. The error checking in case a vsync is still pending is much
 easier in the driver if you know the crtc up-front at the
 atomic_begin() point.  Which is why I prefer to pass the crtc_id as a
 field in the ioctl.

Doing such checks in atomic_begin() is way too early. Unless you want
to block/return immediately if there's a pending flip.

I want to allow user space to force feed the driver with flips at
speeds greater than the display refresh. The last frame to finish
rendering before the vblank is the one that should end up on the
screen. That way you can do tear-free triple buffering without

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 10:12 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net 
wrote:
 On Sun, 9 Sep 2012 22:19:59 -0500
 Rob Clark r...@ti.com wrote:

 On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark 
 rob.cl...@linaro.org wrote:
  From: Rob Clark r...@ti.com
 
  This is following a bit the approach that Ville is taking for 
  atomic-
  modeset, in that it is switching over to using properties for 
  everything.
  The advantage of this approach is that it makes it easier to 
  add new
  attributes to set as part of a page-flip (and even opens the 
  option to
  add new object types).

 oh, and for those wondering what the hell this is all about,
 nuclear-pageflip is a pageflip that atomically updates the CRTC 
 layer
 and zero or more attached plane layers, optionally changing 
 various
 properties such as z-order (or even blending modes, etc) 
 atomically.
 It includes the option for a test step so userspace compositor 
 can
 test a proposed configuration (if any properties not marked as
 'dynamic' are updated).  This intended to allow, for example, 
 weston
 compositor to synchronize updates to plane (sprite) layers with 
 CRTC
 layer.


 Can we please name this something different? I complained about 
 this in
 IRC, but I don't know if you hang around in #intel-gfx.

 Some suggestions:
 multiplane pageflip
 muliplane atomic pageflip
 atomic multiplane pageflip
 atomic multiflip
 pageflip of atomic and multiplane nature

 Nuclear is not at all descriptive and requires as your response 
 shows
 :-)

   
fair enough.. I was originally calling it atomic-pageflip until
someone (I don't remember who) started calling it nuclear-pageflip 
to
differentiate from atomic-modeset.  But IMO we could just call the 
two
ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
pageflip2, but that seems much more boring)
   
My plan has been to use the same ioctl for both uses. They'll need
nearly identical handling anyway on the ioctl level. I have something
semi-working currently, but I need to clean it up a bit before I can
show it to anyone.
  
   I do think the atomic-pageflip ioctl really should take the crtc-id..
   so probably should be two ioctls, but nearly identical
  
   But then you can't support atomic pageflips across multiple crtcs even
   if the hardware would allow it. I would hate to add such limitation to
   the API. I can immediately think of a use case; combining several
   smaller displays to form a single larger display.
  
   With a unified API you could just add some kind of flag that tells the
   kernel that user space really wants an atomic update, and if the
   driver/hardware can't do it, it can return an error.
 
  well, that is really only a problem for X11..  and atomic flip across
  multiple crtc's is a potential mess from performance standpoint unless
  your displays are vsync'd lock.
 
  It won't be truly atomic unless they are vsync locked. But anyways more
  buffers can be used to solve the performance problem. But that's a
  separate issue and in that case it doesn't really matter whether you
  issue separate ioctls for each crtc.

 that was basically my thinking.. separate ioctls for each crtc.  The
 way my branch works currently, you can do this.  A page-flip on crtc
 #2 won't care that there is still a flip pending on crtc #1.

 I guess that doesn't strictly guarantee that the two pageflips happen
 at the exact same time, but unless you have some way to vsync lock the
 two displays, I don't think that is possible anyways.

 Sure you need hardware to keep the pipes in sync.

 So I'm not
 really sure it is worth over-complicating the ioctl to support two
 crtc's. The error checking in case a vsync is still pending is much
 easier in the driver if you know the crtc up-front at the
 atomic_begin() point.  Which is why I prefer to pass the crtc_id as a
 field in the ioctl.

 Doing such checks in atomic_begin() is way too early. Unless you want
 to block/return immediately if there's a pending flip.

I want to return -EBUSY immediately if there is a pending flip.

 I want to allow user space to force feed the driver with flips at
 speeds greater than the display 

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 10:23:48AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 10:12 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
 On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net 
 wrote:
  On Sun, 9 Sep 2012 22:19:59 -0500
  Rob Clark r...@ti.com wrote:
 
  On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark 
  rob.cl...@linaro.org wrote:
   From: Rob Clark r...@ti.com
  
   This is following a bit the approach that Ville is taking 
   for atomic-
   modeset, in that it is switching over to using properties 
   for everything.
   The advantage of this approach is that it makes it easier to 
   add new
   attributes to set as part of a page-flip (and even opens the 
   option to
   add new object types).
 
  oh, and for those wondering what the hell this is all about,
  nuclear-pageflip is a pageflip that atomically updates the 
  CRTC layer
  and zero or more attached plane layers, optionally changing 
  various
  properties such as z-order (or even blending modes, etc) 
  atomically.
  It includes the option for a test step so userspace compositor 
  can
  test a proposed configuration (if any properties not marked as
  'dynamic' are updated).  This intended to allow, for example, 
  weston
  compositor to synchronize updates to plane (sprite) layers 
  with CRTC
  layer.
 
 
  Can we please name this something different? I complained about 
  this in
  IRC, but I don't know if you hang around in #intel-gfx.
 
  Some suggestions:
  multiplane pageflip
  muliplane atomic pageflip
  atomic multiplane pageflip
  atomic multiflip
  pageflip of atomic and multiplane nature
 
  Nuclear is not at all descriptive and requires as your response 
  shows
  :-)
 

 fair enough.. I was originally calling it atomic-pageflip until
 someone (I don't remember who) started calling it 
 nuclear-pageflip to
 differentiate from atomic-modeset.  But IMO we could just call 
 the two
 ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
 pageflip2, but that seems much more boring)

 My plan has been to use the same ioctl for both uses. They'll need
 nearly identical handling anyway on the ioctl level. I have 
 something
 semi-working currently, but I need to clean it up a bit before I 
 can
 show it to anyone.
   
I do think the atomic-pageflip ioctl really should take the crtc-id..
so probably should be two ioctls, but nearly identical
   
But then you can't support atomic pageflips across multiple crtcs even
if the hardware would allow it. I would hate to add such limitation to
the API. I can immediately think of a use case; combining several
smaller displays to form a single larger display.
   
With a unified API you could just add some kind of flag that tells the
kernel that user space really wants an atomic update, and if the
driver/hardware can't do it, it can return an error.
  
   well, that is really only a problem for X11..  and atomic flip across
   multiple crtc's is a potential mess from performance standpoint unless
   your displays are vsync'd lock.
  
   It won't be truly atomic unless they are vsync locked. But anyways more
   buffers can be used to solve the performance problem. But that's a
   separate issue and in that case it doesn't really matter whether you
   issue separate ioctls for each crtc.
 
  that was basically my thinking.. separate ioctls for each crtc.  The
  way my branch works currently, you can do this.  A page-flip on crtc
  #2 won't care that there is still a flip pending on crtc #1.
 
  I guess that doesn't strictly guarantee that the two pageflips happen
  at the exact same time, but unless you have some way to vsync lock the
  two displays, I don't think that is possible anyways.
 
  Sure you need hardware to keep the pipes in sync.
 
  So I'm not
  really sure it is worth over-complicating the ioctl to support two
  crtc's. The error checking in case a vsync is still pending is much
  easier in the driver if you know the crtc up-front at the
  atomic_begin() point.  Which is why I prefer to pass the crtc_id as a
  field in the ioctl.
 
  Doing such checks in atomic_begin() is way too early. Unless you want
  to block/return immediately if 

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 10:32 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 10:23:48AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 10:12 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
 On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky 
 b...@bwidawsk.net wrote:
  On Sun, 9 Sep 2012 22:19:59 -0500
  Rob Clark r...@ti.com wrote:
 
  On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark 
  rob.cl...@linaro.org wrote:
   From: Rob Clark r...@ti.com
  
   This is following a bit the approach that Ville is taking 
   for atomic-
   modeset, in that it is switching over to using properties 
   for everything.
   The advantage of this approach is that it makes it easier 
   to add new
   attributes to set as part of a page-flip (and even opens 
   the option to
   add new object types).
 
  oh, and for those wondering what the hell this is all about,
  nuclear-pageflip is a pageflip that atomically updates the 
  CRTC layer
  and zero or more attached plane layers, optionally changing 
  various
  properties such as z-order (or even blending modes, etc) 
  atomically.
  It includes the option for a test step so userspace 
  compositor can
  test a proposed configuration (if any properties not marked as
  'dynamic' are updated).  This intended to allow, for example, 
  weston
  compositor to synchronize updates to plane (sprite) layers 
  with CRTC
  layer.
 
 
  Can we please name this something different? I complained 
  about this in
  IRC, but I don't know if you hang around in #intel-gfx.
 
  Some suggestions:
  multiplane pageflip
  muliplane atomic pageflip
  atomic multiplane pageflip
  atomic multiflip
  pageflip of atomic and multiplane nature
 
  Nuclear is not at all descriptive and requires as your 
  response shows
  :-)
 

 fair enough.. I was originally calling it atomic-pageflip until
 someone (I don't remember who) started calling it 
 nuclear-pageflip to
 differentiate from atomic-modeset.  But IMO we could just call 
 the two
 ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
 pageflip2, but that seems much more boring)

 My plan has been to use the same ioctl for both uses. They'll need
 nearly identical handling anyway on the ioctl level. I have 
 something
 semi-working currently, but I need to clean it up a bit before I 
 can
 show it to anyone.
   
I do think the atomic-pageflip ioctl really should take the 
crtc-id..
so probably should be two ioctls, but nearly identical
   
But then you can't support atomic pageflips across multiple crtcs 
even
if the hardware would allow it. I would hate to add such limitation 
to
the API. I can immediately think of a use case; combining several
smaller displays to form a single larger display.
   
With a unified API you could just add some kind of flag that tells 
the
kernel that user space really wants an atomic update, and if the
driver/hardware can't do it, it can return an error.
  
   well, that is really only a problem for X11..  and atomic flip across
   multiple crtc's is a potential mess from performance standpoint unless
   your displays are vsync'd lock.
  
   It won't be truly atomic unless they are vsync locked. But anyways more
   buffers can be used to solve the performance problem. But that's a
   separate issue and in that case it doesn't really matter whether you
   issue separate ioctls for each crtc.
 
  that was basically my thinking.. separate ioctls for each crtc.  The
  way my branch works currently, you can do this.  A page-flip on crtc
  #2 won't care that there is still a flip pending on crtc #1.
 
  I guess that doesn't strictly guarantee that the two pageflips happen
  at the exact same time, but unless you have some way to vsync lock the
  two displays, I don't think that is possible anyways.
 
  Sure you need hardware to keep the pipes in sync.
 
  So I'm not
  really sure it is worth over-complicating the ioctl to support two
  crtc's. The error checking in case a vsync is still pending is much
  easier in the driver if you know the crtc up-front at the
  atomic_begin() point.  Which is why I prefer to pass the crtc_id as a
  field in the ioctl.
 
  

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 10:32 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 10:23:48AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 10:12 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
  On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky 
  b...@bwidawsk.net wrote:
   On Sun, 9 Sep 2012 22:19:59 -0500
   Rob Clark r...@ti.com wrote:
  
   On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark 
   rob.cl...@linaro.org wrote:
From: Rob Clark r...@ti.com
   
This is following a bit the approach that Ville is taking 
for atomic-
modeset, in that it is switching over to using properties 
for everything.
The advantage of this approach is that it makes it easier 
to add new
attributes to set as part of a page-flip (and even opens 
the option to
add new object types).
  
   oh, and for those wondering what the hell this is all about,
   nuclear-pageflip is a pageflip that atomically updates the 
   CRTC layer
   and zero or more attached plane layers, optionally changing 
   various
   properties such as z-order (or even blending modes, etc) 
   atomically.
   It includes the option for a test step so userspace 
   compositor can
   test a proposed configuration (if any properties not marked 
   as
   'dynamic' are updated).  This intended to allow, for 
   example, weston
   compositor to synchronize updates to plane (sprite) layers 
   with CRTC
   layer.
  
  
   Can we please name this something different? I complained 
   about this in
   IRC, but I don't know if you hang around in #intel-gfx.
  
   Some suggestions:
   multiplane pageflip
   muliplane atomic pageflip
   atomic multiplane pageflip
   atomic multiflip
   pageflip of atomic and multiplane nature
  
   Nuclear is not at all descriptive and requires as your 
   response shows
   :-)
  
 
  fair enough.. I was originally calling it atomic-pageflip until
  someone (I don't remember who) started calling it 
  nuclear-pageflip to
  differentiate from atomic-modeset.  But IMO we could just call 
  the two
  ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 
  and
  pageflip2, but that seems much more boring)
 
  My plan has been to use the same ioctl for both uses. They'll 
  need
  nearly identical handling anyway on the ioctl level. I have 
  something
  semi-working currently, but I need to clean it up a bit before 
  I can
  show it to anyone.

 I do think the atomic-pageflip ioctl really should take the 
 crtc-id..
 so probably should be two ioctls, but nearly identical

 But then you can't support atomic pageflips across multiple crtcs 
 even
 if the hardware would allow it. I would hate to add such 
 limitation to
 the API. I can immediately think of a use case; combining several
 smaller displays to form a single larger display.

 With a unified API you could just add some kind of flag that tells 
 the
 kernel that user space really wants an atomic update, and if the
 driver/hardware can't do it, it can return an error.
   
well, that is really only a problem for X11..  and atomic flip across
multiple crtc's is a potential mess from performance standpoint 
unless
your displays are vsync'd lock.
   
It won't be truly atomic unless they are vsync locked. But anyways 
more
buffers can be used to solve the performance problem. But that's a
separate issue and in that case it doesn't really matter whether you
issue separate ioctls for each crtc.
  
   that was basically my thinking.. separate ioctls for each crtc.  The
   way my branch works currently, you can do this.  A page-flip on crtc
   #2 won't care that there is still a flip pending on crtc #1.
  
   I guess that doesn't strictly guarantee that the two pageflips happen
   at the exact same time, but unless you have some way to vsync lock the
   two displays, I don't think that is possible anyways.
  
   Sure you need hardware to keep the pipes in sync.
  
   So I'm not
   really sure it is worth over-complicating the ioctl to support two
   crtc's. The error 

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Ville Syrjälä
On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
 On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
  But I think we could still do this w/ one ioctl per crtc for 
  atomic-pageflip.
 
  We could, if we want to sacrifice the synced multi display case. I just
  think it might be a real use case at some point. IVI feels like the most
  likely short term cadidate to me, but perhaps someone would finally
  introduce some new style phone/tablet thingy. I have seen
  concepts/prototypes of such multi display gadgets in the past, but the
  industry apparently got a bit stuck on the rectangular slab with
  touchscreen on one side design.
 
 I could be wrong, but I think IVI the screens would normally be too
 far apart to matter?

I was thinking of something like a display on the dash that normally
sits low with only a small sliver visible, and extends upwards when
you fire up a movie player for example. Internally it could be made
up of two displays for power savings purposes.

 Anyways, it is really only a problem if you can't do two ioctl()s
 within one vblank period. If it actually turns out to be a real
 problem,

Well exactly that's the problem this whole atomic pageflip stuff is
trying to tackle, no? ;)

 we could always add later an ioctl that takes an array of
 'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
 really useful or not.. but maybe I'm thinking too much about how
 weston does it's rendering of different output's independently.

I'm just now thinking of surfaceflinger's way of doing things, with
its prepare and commit phases. If you need to issue two ioctls to handle
cloned displays, then you can end up in a somewhat funky situation.

Let's say you have a video overlay in use (one each display naturally),
and you increase the downscaling factor enough so that you now have
enough memory bandwith to support only one overlay. With independent
check ioctls for each display, you never have the full device state
available in the kernel, so each check succeeds during the prepare
phase. So you decide that you can keep using the video overlays.

You then proceed to commit the state, but after the first display has
been commited you get an error when trying to commit the second one.
What can you do now? The only option is to keep displaying the old
frame on the other displays for some time longer, and then on the
next frame you can switch to GPU composition. But on the next frame you
perhaps no longer need to use GPU composition, but since you can't
verify that in the prepare phase, you have no other option but to use
GPU composition.

So when you run into a configuration that can be supported only
partially, you get animation stalls on some displays due to skipped
frames, and you always have to fall back to GPU composition for the 
next frame.

If on the other hand you would check the whole state in one ioctl,
you'd get the error in the first prepare phase, and could fall back
to GPU composition immediately.

Am I too much of a perfectionst in considering such things? I don't
think so, but perhaps other people disagree.

 I wonder.. if you have some special hw setup where you can keep the
 multiple display's in sync-lock, maybe a special virtual crtc would
 work.  That way userspace it appears as a single display.  I'm not
 really sure how crazy that would be to implement.

Sure, add more abstraction layers and you can solve any problem :)

 But I think in the
 common case, where the displays are not vsync locked, treating the
 page flips of different crtc's independently, and rendering the
 contents for different outputs independently (like weston) makes the
 most sense.

My API design doesn't preclude that at all. In light of my android tale
above how would weston decide whether it can use overlays in such a
case?

-- 
Ville Syrjälä
Intel OTC
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Rob Clark
On Wed, Sep 12, 2012 at 1:58 PM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 01:00:19PM -0500, Clark, Rob wrote:
 On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
  But I think we could still do this w/ one ioctl per crtc for 
  atomic-pageflip.
 
  We could, if we want to sacrifice the synced multi display case. I just
  think it might be a real use case at some point. IVI feels like the most
  likely short term cadidate to me, but perhaps someone would finally
  introduce some new style phone/tablet thingy. I have seen
  concepts/prototypes of such multi display gadgets in the past, but the
  industry apparently got a bit stuck on the rectangular slab with
  touchscreen on one side design.

 I could be wrong, but I think IVI the screens would normally be too
 far apart to matter?

 I was thinking of something like a display on the dash that normally
 sits low with only a small sliver visible, and extends upwards when
 you fire up a movie player for example. Internally it could be made
 up of two displays for power savings purposes.

 Anyways, it is really only a problem if you can't do two ioctl()s
 within one vblank period. If it actually turns out to be a real
 problem,

 Well exactly that's the problem this whole atomic pageflip stuff is
 trying to tackle, no? ;)

 we could always add later an ioctl that takes an array of
 'struct drm_mode_crtc_atomic_page_flip's.  I'm not sure if this is
 really useful or not.. but maybe I'm thinking too much about how
 weston does it's rendering of different output's independently.

 I'm just now thinking of surfaceflinger's way of doing things, with
 its prepare and commit phases. If you need to issue two ioctls to handle
 cloned displays, then you can end up in a somewhat funky situation.

 Let's say you have a video overlay in use (one each display naturally),
 and you increase the downscaling factor enough so that you now have
 enough memory bandwith to support only one overlay. With independent
 check ioctls for each display, you never have the full device state
 available in the kernel, so each check succeeds during the prepare
 phase. So you decide that you can keep using the video overlays.

 You then proceed to commit the state, but after the first display has
 been commited you get an error when trying to commit the second one.
 What can you do now? The only option is to keep displaying the old
 frame on the other displays for some time longer, and then on the
 next frame you can switch to GPU composition. But on the next frame you
 perhaps no longer need to use GPU composition, but since you can't
 verify that in the prepare phase, you have no other option but to use
 GPU composition.

 So when you run into a configuration that can be supported only
 partially, you get animation stalls on some displays due to skipped
 frames, and you always have to fall back to GPU composition for the
 next frame.

 If on the other hand you would check the whole state in one ioctl,
 you'd get the error in the first prepare phase, and could fall back
 to GPU composition immediately.

 Am I too much of a perfectionst in considering such things? I don't
 think so, but perhaps other people disagree.

Ok, if you have a case where the state of the two crtc's are not
actually independent, then I think you have a valid point.

I'm not quite sure what userspace would do about it, though.. for the
general case where vsync isn't locked, and you can't even necessarily
assume vsync period is the same, then I don't really think you want to
couple rendering to the displays.  OTOH, the 'test' step can come
independent of vblank, so maybe you'd want to do the 'test' step
together, and then the flip parts independently.  Hmm...

 I wonder.. if you have some special hw setup where you can keep the
 multiple display's in sync-lock, maybe a special virtual crtc would
 work.  That way userspace it appears as a single display.  I'm not
 really sure how crazy that would be to implement.

 Sure, add more abstraction layers and you can solve any problem :)

well, not really.. but my point was this sort of setup would be a
somewhat custom hardware setup, so maybe some hack in that case isn't
such a bad idea.  I dunno..

 But I think in the
 common case, where the displays are not vsync locked, treating the
 page flips of different crtc's independently, and rendering the
 contents for different outputs independently (like weston) makes the
 most sense.

 My API design doesn't preclude that at all. In light of my android tale
 above how would weston decide whether it can use overlays in such a
 case?

From userspace API, I guess something like:

struct drm_mode_crtc_atomic_page_flip {
uint32_t flags;
uint32_t count_crtcs;
uint64_t crtc_ids_ptr;  /* array of uint32_t */
uint64_t count_props_ptr; /* array of uint32_t, # of prop's per crtc */
 

Re: [RFC 0/9] nuclear pageflip

2012-09-12 Thread Clark, Rob
On Wed, Sep 12, 2012 at 12:27 PM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 10:48:16AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 10:32 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Wed, Sep 12, 2012 at 10:23:48AM -0500, Rob Clark wrote:
  On Wed, Sep 12, 2012 at 10:12 AM, Ville Syrjälä
  ville.syrj...@linux.intel.com wrote:
   On Wed, Sep 12, 2012 at 09:42:27AM -0500, Rob Clark wrote:
   On Wed, Sep 12, 2012 at 9:34 AM, Ville Syrjälä
   ville.syrj...@linux.intel.com wrote:
On Wed, Sep 12, 2012 at 09:28:43AM -0500, Rob Clark wrote:
On Wed, Sep 12, 2012 at 9:23 AM, Ville Syrjälä
ville.syrj...@linux.intel.com wrote:
 On Wed, Sep 12, 2012 at 07:30:18AM -0500, Rob Clark wrote:
 On Wed, Sep 12, 2012 at 3:59 AM, Ville Syrjälä
 ville.syrj...@linux.intel.com wrote:
  On Tue, Sep 11, 2012 at 05:07:49PM -0500, Rob Clark wrote:
  On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky 
  b...@bwidawsk.net wrote:
   On Sun, 9 Sep 2012 22:19:59 -0500
   Rob Clark r...@ti.com wrote:
  
   On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark 
   rob.cl...@linaro.org wrote:
From: Rob Clark r...@ti.com
   
This is following a bit the approach that Ville is 
taking for atomic-
modeset, in that it is switching over to using 
properties for everything.
The advantage of this approach is that it makes it 
easier to add new
attributes to set as part of a page-flip (and even opens 
the option to
add new object types).
  
   oh, and for those wondering what the hell this is all 
   about,
   nuclear-pageflip is a pageflip that atomically updates the 
   CRTC layer
   and zero or more attached plane layers, optionally 
   changing various
   properties such as z-order (or even blending modes, etc) 
   atomically.
   It includes the option for a test step so userspace 
   compositor can
   test a proposed configuration (if any properties not 
   marked as
   'dynamic' are updated).  This intended to allow, for 
   example, weston
   compositor to synchronize updates to plane (sprite) layers 
   with CRTC
   layer.
  
  
   Can we please name this something different? I complained 
   about this in
   IRC, but I don't know if you hang around in #intel-gfx.
  
   Some suggestions:
   multiplane pageflip
   muliplane atomic pageflip
   atomic multiplane pageflip
   atomic multiflip
   pageflip of atomic and multiplane nature
  
   Nuclear is not at all descriptive and requires as your 
   response shows
   :-)
  
 
  fair enough.. I was originally calling it atomic-pageflip 
  until
  someone (I don't remember who) started calling it 
  nuclear-pageflip to
  differentiate from atomic-modeset.  But IMO we could just 
  call the two
  ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 
  and
  pageflip2, but that seems much more boring)
 
  My plan has been to use the same ioctl for both uses. They'll 
  need
  nearly identical handling anyway on the ioctl level. I have 
  something
  semi-working currently, but I need to clean it up a bit before 
  I can
  show it to anyone.

 I do think the atomic-pageflip ioctl really should take the 
 crtc-id..
 so probably should be two ioctls, but nearly identical

 But then you can't support atomic pageflips across multiple crtcs 
 even
 if the hardware would allow it. I would hate to add such 
 limitation to
 the API. I can immediately think of a use case; combining several
 smaller displays to form a single larger display.

 With a unified API you could just add some kind of flag that 
 tells the
 kernel that user space really wants an atomic update, and if the
 driver/hardware can't do it, it can return an error.
   
well, that is really only a problem for X11..  and atomic flip 
across
multiple crtc's is a potential mess from performance standpoint 
unless
your displays are vsync'd lock.
   
It won't be truly atomic unless they are vsync locked. But anyways 
more
buffers can be used to solve the performance problem. But that's a
separate issue and in that case it doesn't really matter whether you
issue separate ioctls for each crtc.
  
   that was basically my thinking.. separate ioctls for each crtc.  The
   way my branch works currently, you can do this.  A page-flip on crtc
   #2 won't care that there is still a flip pending on crtc #1.
  
   I guess that doesn't strictly guarantee that the two pageflips happen
   at the exact same time, but unless you have some way to vsync lock the
   two displays, I don't think that is possible anyways.
  
   Sure you need hardware to keep the pipes in sync.
  

Re: [RFC 0/9] nuclear pageflip

2012-09-11 Thread Ben Widawsky
On Sun, 9 Sep 2012 22:19:59 -0500
Rob Clark r...@ti.com wrote:

 On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
  From: Rob Clark r...@ti.com
 
  This is following a bit the approach that Ville is taking for atomic-
  modeset, in that it is switching over to using properties for everything.
  The advantage of this approach is that it makes it easier to add new
  attributes to set as part of a page-flip (and even opens the option to
  add new object types).
 
 oh, and for those wondering what the hell this is all about,
 nuclear-pageflip is a pageflip that atomically updates the CRTC layer
 and zero or more attached plane layers, optionally changing various
 properties such as z-order (or even blending modes, etc) atomically.
 It includes the option for a test step so userspace compositor can
 test a proposed configuration (if any properties not marked as
 'dynamic' are updated).  This intended to allow, for example, weston
 compositor to synchronize updates to plane (sprite) layers with CRTC
 layer.
 

Can we please name this something different? I complained about this in
IRC, but I don't know if you hang around in #intel-gfx.

Some suggestions:
multiplane pageflip
muliplane atomic pageflip
atomic multiplane pageflip
atomic multiflip
pageflip of atomic and multiplane nature

Nuclear is not at all descriptive and requires as your response shows
:-)


 BR,
 -R
 
  The basic principles are:
   a) split out object state (in this case, plane and crtc, although I
  expect more to be added as atomic-modeset is added) into seperate
  structures that can be atomically commited or rolled back
   b) expand the object property API (set_property()) to take a state
  object.  The obj-set_property() simply updates the state object
  without actually applying the changes to the hw.
   c) after all the property updates are done, the updated state can
  be checked for correctness and against the hw capabilities, and
  then either discarded or committed atomically.
 
  Since we need to include properties in the nuclear-pageflip scheme,
  doing everything via properties avoids updating a bunch of additional
  driver provided callbacks.  Instead we just drop crtc-page_flip()
  and plane-update_plane().  By splitting out the object's mutable
  state into drm_{plane,crtc,etc}_state structs (which are wrapped by
  the individual drivers to add their own hw specific state), we can
  use some helpers (drm_{plane,crtc,etc}_set_property() and
  drm_{plane,crtc,etc}_check_state()) to keep core error checking in
  drm core and avoid pushing the burden of dealing with common
  properties to the individual drivers.
 
  So far, I've only updated omapdrm to the new APIs, as a proof of
  concept.  Only a few drivers support drm plane, so I expect the
  updates to convert drm-plane to properties should not be so hard.
  Possibly for crtc/pageflip we might need to have a transition period
  where we still support crtc-page_flip() code path until all drivers
  are updated.
 
  My complete branch is here:
 
https://github.com/robclark/kernel-omap4/commits/drm_nuclear
git://github.com/robclark/kernel-omap4.git drm_nuclear
 
  Rob Clark (9):
drm: add atomic fxns
drm: add object property type
drm: add DRM_MODE_PROP_DYNAMIC property flag
drm: convert plane to properties
drm: add drm_plane_state
drm: convert page_flip to properties
drm: add drm_crtc_state
drm: nuclear pageflip
drm/omap: update for atomic age
 
   drivers/gpu/drm/drm_crtc.c|  769 
  +++--
   drivers/gpu/drm/drm_crtc_helper.c |   51 +--
   drivers/gpu/drm/drm_drv.c |1 +
   drivers/gpu/drm/drm_fb_helper.c   |   11 +-
   drivers/staging/omapdrm/Makefile  |1 +
   drivers/staging/omapdrm/omap_atomic.c |  270 
   drivers/staging/omapdrm/omap_atomic.h |   52 +++
   drivers/staging/omapdrm/omap_crtc.c   |  247 +--
   drivers/staging/omapdrm/omap_drv.c|5 +
   drivers/staging/omapdrm/omap_drv.h|   35 +-
   drivers/staging/omapdrm/omap_fb.c |   44 +-
   drivers/staging/omapdrm/omap_plane.c  |  270 ++--
   include/drm/drm.h |2 +
   include/drm/drmP.h|   32 ++
   include/drm/drm_crtc.h|  140 --
   include/drm/drm_mode.h|   48 ++
   16 files changed, 1378 insertions(+), 600 deletions(-)
   create mode 100644 drivers/staging/omapdrm/omap_atomic.c
   create mode 100644 drivers/staging/omapdrm/omap_atomic.h
 
  --
  1.7.9.5
 
 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Ben Widawsky, Intel Open Source Technology Center
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC 0/9] nuclear pageflip

2012-09-11 Thread Rob Clark
On Tue, Sep 11, 2012 at 4:15 PM, Ben Widawsky b...@bwidawsk.net wrote:
 On Sun, 9 Sep 2012 22:19:59 -0500
 Rob Clark r...@ti.com wrote:

 On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
  From: Rob Clark r...@ti.com
 
  This is following a bit the approach that Ville is taking for atomic-
  modeset, in that it is switching over to using properties for everything.
  The advantage of this approach is that it makes it easier to add new
  attributes to set as part of a page-flip (and even opens the option to
  add new object types).

 oh, and for those wondering what the hell this is all about,
 nuclear-pageflip is a pageflip that atomically updates the CRTC layer
 and zero or more attached plane layers, optionally changing various
 properties such as z-order (or even blending modes, etc) atomically.
 It includes the option for a test step so userspace compositor can
 test a proposed configuration (if any properties not marked as
 'dynamic' are updated).  This intended to allow, for example, weston
 compositor to synchronize updates to plane (sprite) layers with CRTC
 layer.


 Can we please name this something different? I complained about this in
 IRC, but I don't know if you hang around in #intel-gfx.

 Some suggestions:
 multiplane pageflip
 muliplane atomic pageflip
 atomic multiplane pageflip
 atomic multiflip
 pageflip of atomic and multiplane nature

 Nuclear is not at all descriptive and requires as your response shows
 :-)


fair enough.. I was originally calling it atomic-pageflip until
someone (I don't remember who) started calling it nuclear-pageflip to
differentiate from atomic-modeset.  But IMO we could just call the two
ioclts atomic-modeset and atomic-pageflip.  (Or even modeset2 and
pageflip2, but that seems much more boring)

BR,
-R


 BR,
 -R

  The basic principles are:
   a) split out object state (in this case, plane and crtc, although I
  expect more to be added as atomic-modeset is added) into seperate
  structures that can be atomically commited or rolled back
   b) expand the object property API (set_property()) to take a state
  object.  The obj-set_property() simply updates the state object
  without actually applying the changes to the hw.
   c) after all the property updates are done, the updated state can
  be checked for correctness and against the hw capabilities, and
  then either discarded or committed atomically.
 
  Since we need to include properties in the nuclear-pageflip scheme,
  doing everything via properties avoids updating a bunch of additional
  driver provided callbacks.  Instead we just drop crtc-page_flip()
  and plane-update_plane().  By splitting out the object's mutable
  state into drm_{plane,crtc,etc}_state structs (which are wrapped by
  the individual drivers to add their own hw specific state), we can
  use some helpers (drm_{plane,crtc,etc}_set_property() and
  drm_{plane,crtc,etc}_check_state()) to keep core error checking in
  drm core and avoid pushing the burden of dealing with common
  properties to the individual drivers.
 
  So far, I've only updated omapdrm to the new APIs, as a proof of
  concept.  Only a few drivers support drm plane, so I expect the
  updates to convert drm-plane to properties should not be so hard.
  Possibly for crtc/pageflip we might need to have a transition period
  where we still support crtc-page_flip() code path until all drivers
  are updated.
 
  My complete branch is here:
 
https://github.com/robclark/kernel-omap4/commits/drm_nuclear
git://github.com/robclark/kernel-omap4.git drm_nuclear
 
  Rob Clark (9):
drm: add atomic fxns
drm: add object property type
drm: add DRM_MODE_PROP_DYNAMIC property flag
drm: convert plane to properties
drm: add drm_plane_state
drm: convert page_flip to properties
drm: add drm_crtc_state
drm: nuclear pageflip
drm/omap: update for atomic age
 
   drivers/gpu/drm/drm_crtc.c|  769 
  +++--
   drivers/gpu/drm/drm_crtc_helper.c |   51 +--
   drivers/gpu/drm/drm_drv.c |1 +
   drivers/gpu/drm/drm_fb_helper.c   |   11 +-
   drivers/staging/omapdrm/Makefile  |1 +
   drivers/staging/omapdrm/omap_atomic.c |  270 
   drivers/staging/omapdrm/omap_atomic.h |   52 +++
   drivers/staging/omapdrm/omap_crtc.c   |  247 +--
   drivers/staging/omapdrm/omap_drv.c|5 +
   drivers/staging/omapdrm/omap_drv.h|   35 +-
   drivers/staging/omapdrm/omap_fb.c |   44 +-
   drivers/staging/omapdrm/omap_plane.c  |  270 ++--
   include/drm/drm.h |2 +
   include/drm/drmP.h|   32 ++
   include/drm/drm_crtc.h|  140 --
   include/drm/drm_mode.h|   48 ++
   16 files changed, 1378 insertions(+), 600 deletions(-)
   create mode 100644 drivers/staging/omapdrm/omap_atomic.c
   create mode 100644 drivers/staging/omapdrm/omap_atomic.h
 
  --
  

Re: [RFC 0/9] nuclear pageflip

2012-09-09 Thread Rob Clark
On Sun, Sep 9, 2012 at 10:03 PM, Rob Clark rob.cl...@linaro.org wrote:
 From: Rob Clark r...@ti.com

 This is following a bit the approach that Ville is taking for atomic-
 modeset, in that it is switching over to using properties for everything.
 The advantage of this approach is that it makes it easier to add new
 attributes to set as part of a page-flip (and even opens the option to
 add new object types).

oh, and for those wondering what the hell this is all about,
nuclear-pageflip is a pageflip that atomically updates the CRTC layer
and zero or more attached plane layers, optionally changing various
properties such as z-order (or even blending modes, etc) atomically.
It includes the option for a test step so userspace compositor can
test a proposed configuration (if any properties not marked as
'dynamic' are updated).  This intended to allow, for example, weston
compositor to synchronize updates to plane (sprite) layers with CRTC
layer.

BR,
-R

 The basic principles are:
  a) split out object state (in this case, plane and crtc, although I
 expect more to be added as atomic-modeset is added) into seperate
 structures that can be atomically commited or rolled back
  b) expand the object property API (set_property()) to take a state
 object.  The obj-set_property() simply updates the state object
 without actually applying the changes to the hw.
  c) after all the property updates are done, the updated state can
 be checked for correctness and against the hw capabilities, and
 then either discarded or committed atomically.

 Since we need to include properties in the nuclear-pageflip scheme,
 doing everything via properties avoids updating a bunch of additional
 driver provided callbacks.  Instead we just drop crtc-page_flip()
 and plane-update_plane().  By splitting out the object's mutable
 state into drm_{plane,crtc,etc}_state structs (which are wrapped by
 the individual drivers to add their own hw specific state), we can
 use some helpers (drm_{plane,crtc,etc}_set_property() and
 drm_{plane,crtc,etc}_check_state()) to keep core error checking in
 drm core and avoid pushing the burden of dealing with common
 properties to the individual drivers.

 So far, I've only updated omapdrm to the new APIs, as a proof of
 concept.  Only a few drivers support drm plane, so I expect the
 updates to convert drm-plane to properties should not be so hard.
 Possibly for crtc/pageflip we might need to have a transition period
 where we still support crtc-page_flip() code path until all drivers
 are updated.

 My complete branch is here:

   https://github.com/robclark/kernel-omap4/commits/drm_nuclear
   git://github.com/robclark/kernel-omap4.git drm_nuclear

 Rob Clark (9):
   drm: add atomic fxns
   drm: add object property type
   drm: add DRM_MODE_PROP_DYNAMIC property flag
   drm: convert plane to properties
   drm: add drm_plane_state
   drm: convert page_flip to properties
   drm: add drm_crtc_state
   drm: nuclear pageflip
   drm/omap: update for atomic age

  drivers/gpu/drm/drm_crtc.c|  769 
 +++--
  drivers/gpu/drm/drm_crtc_helper.c |   51 +--
  drivers/gpu/drm/drm_drv.c |1 +
  drivers/gpu/drm/drm_fb_helper.c   |   11 +-
  drivers/staging/omapdrm/Makefile  |1 +
  drivers/staging/omapdrm/omap_atomic.c |  270 
  drivers/staging/omapdrm/omap_atomic.h |   52 +++
  drivers/staging/omapdrm/omap_crtc.c   |  247 +--
  drivers/staging/omapdrm/omap_drv.c|5 +
  drivers/staging/omapdrm/omap_drv.h|   35 +-
  drivers/staging/omapdrm/omap_fb.c |   44 +-
  drivers/staging/omapdrm/omap_plane.c  |  270 ++--
  include/drm/drm.h |2 +
  include/drm/drmP.h|   32 ++
  include/drm/drm_crtc.h|  140 --
  include/drm/drm_mode.h|   48 ++
  16 files changed, 1378 insertions(+), 600 deletions(-)
  create mode 100644 drivers/staging/omapdrm/omap_atomic.c
  create mode 100644 drivers/staging/omapdrm/omap_atomic.h

 --
 1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel