[RFC 0/7+1] Add in-kernel vblank delaying mechanism

2014-12-04 Thread Daniel Vetter
On Wed, Nov 19, 2014 at 8:47 PM, Paulo Zanoni  wrote:
>
>  2. How should the driver interface look like?
>
>a. All the possibilities are passed through the function call, so the 
> drm.ko
>   code needs to set the struct members itself.
>b. The caller already sets the struct members instead of passing them as
>   parameters to the function.
>c. Something different?

We need b because the caller must allocate the structure (the point
where we can return -ENOMEM to userspace might be much after the point
where we need to schedule a vblank callback for e.g. async flips). But
for simple interfaces we need a few things passed directly I think
(since I expect that we'll reuse vblank callback structures similar to
how we reuse timers/work items).

I'll follow up with a detailed review of the new interface exposed to
drivers and what I think it should look like, need to head off now.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[RFC 0/7+1] Add in-kernel vblank delaying mechanism

2014-11-27 Thread Daniel Vetter
On Thu, Nov 27, 2014 at 09:25:15AM +1000, Dave Airlie wrote:
> >> ---
> >> TL;DR summary:
> >> I wrote patches. Help me choose the best implementation and interface.
> >> ---
> >>
> >> So the i915.ko driver could use some mechanism to run functions after a 
> >> given
> >> number of vblanks. Implementations for this mechanism were already 
> >> proposed in
> >> the past (by Chris Wilson and others), but they were i915-specific instead 
> >> of a
> >> generic drm.ko implementation. We even had patches that make use of this 
> >> new
> >> mechanism.
> >>
> >> Since this is very similar to the vblank IOCTL we currently have, but with 
> >> the
> >> caller being a Kernel module instead of user space, I decided to write an
> >> implementation that tries to reuse the current IOCTL infrastructure.
> >>
> >> In the first patch we add all the necessary code to allow the modules to 
> >> call
> >> the vblank ioctl from Kernel space: they provide a callback function that 
> >> is
> >> going to be called instead of the traditional send_vblank_event(), which 
> >> means
> >> the Kernel callers don't have to deal with the file descriptors and DRM 
> >> events.
> >>
> >> In the second patch we add a simple extension of the feature, to allow the
> >> drivers to have their callbacks called in a non-atomic context.
> >>
> >> In all the other subsequent patches we rework the underlying code so that
> >> the common aspects between the user space vblank IOCTL and the Kernel 
> >> interface
> >> are all stored in a single structure (struct drm_vblank_wait_item), and 
> >> then
> >> anything that is specific to the different users is stored in a structure 
> >> that
> >> contains struct drm_vblank_wait_item. This way, most of the drm.ko code 
> >> only
> >> needs to deal with struct drm_vblank_wait_item, not really knowing how it 
> >> is
> >> wrapped. The advantage of this rework is that it reduces the number of 
> >> memory
> >> allocations needed in some cases (from 2 to 1) and it also reduces the 
> >> amount of
> >> memory used on each allocation.
> >>
> >> But while this all sounds nice in the above description, I had the feeling 
> >> that
> >> this adds a lot of complexity and makes the code not-so-beautiful. If we 
> >> ever
> >> need more extensions/options to this feature, we're going to have to 
> >> untangle
> >> the code even more from the IOCTL part. We also have a small "abstraction 
> >> break"
> >> in change introduced in patch 6. And there's the risk that some of the 
> >> reworks
> >> may have added a regression somewhere.
> >>
> >> Based on that, I decided to also provide an alternative implementation. In 
> >> this
> >> implementation we add a new dev->vblank_kernel_list instead of reusing
> >> dev->vblank_event_list, and add new code to handle that this. This new 
> >> code is
> >> completely based on the code that handles dev->vblank_kernel_list, so 
> >> there's
> >> some duplication here. On the other hand, since the in-Kernel 
> >> infrastructure is
> >> not tied to the IOCTL structure, we can very easily grow and adapt the 
> >> Kernel
> >> code without having to fight against the IOCTL code. And the risk of 
> >> regressions
> >> is greatly reduced.
> >>
> >> The second difference of this alternative implementation is the signature 
> >> of the
> >> drm_wait_vblank_kernel() function. While it could be exactly the same as 
> >> the one
> >> used in the other series, I decided to make it different so we can also 
> >> choose
> >> which one to use. In the 7-patch series implementation, all the user needs 
> >> to do
> >> is to allocate the structure, and call the function, properly setting all 
> >> its
> >> arguments. Then the function is responsible for parsing the arguments and
> >> populating the structure based on that. On the alternative implementation, 
> >> the
> >> user has to fill the structure with the appropriate arguments, and then 
> >> call
> >> drm_wait_vblank_kernel() passing just the allocated structure as an 
> >> argument.
> >>
> >> If you notice, you will see that these patches add a new debugfs file to
> >> i915.ko. This file is just a very simple example user of the new interface,
> >> which I used while developing. If you have difficulty understanding the new
> >> interface, please also look at the i915/i915_debugfs.c diff. Of course, I 
> >> plan
> >> to exclude this debugfs interface from the final to-be-merged patches.
> >>
> >> So, in summary, we have a few things we need to discuss:
> >>
> >>  1. Which implementation to use?
> >>
> >>a. Just the 2 first patches of the 7-patch series?
> >>   Pros:
> >> - the Kernel users basically call the vblank IOCTL
> >> - the code changes are minimal
> >>  Cons:
> >> - The amount of memory allocations and memory space consumed is not
> >>   optimal
> >>
> >>b. The full 7-patch series?
> >>   Pros:
> >> - The same backend is used for both the Kernel and IOCTL.
> >>   Cons:
> >

[RFC 0/7+1] Add in-kernel vblank delaying mechanism

2014-11-27 Thread Dave Airlie
>> ---
>> TL;DR summary:
>> I wrote patches. Help me choose the best implementation and interface.
>> ---
>>
>> So the i915.ko driver could use some mechanism to run functions after a given
>> number of vblanks. Implementations for this mechanism were already proposed 
>> in
>> the past (by Chris Wilson and others), but they were i915-specific instead 
>> of a
>> generic drm.ko implementation. We even had patches that make use of this new
>> mechanism.
>>
>> Since this is very similar to the vblank IOCTL we currently have, but with 
>> the
>> caller being a Kernel module instead of user space, I decided to write an
>> implementation that tries to reuse the current IOCTL infrastructure.
>>
>> In the first patch we add all the necessary code to allow the modules to call
>> the vblank ioctl from Kernel space: they provide a callback function that is
>> going to be called instead of the traditional send_vblank_event(), which 
>> means
>> the Kernel callers don't have to deal with the file descriptors and DRM 
>> events.
>>
>> In the second patch we add a simple extension of the feature, to allow the
>> drivers to have their callbacks called in a non-atomic context.
>>
>> In all the other subsequent patches we rework the underlying code so that
>> the common aspects between the user space vblank IOCTL and the Kernel 
>> interface
>> are all stored in a single structure (struct drm_vblank_wait_item), and then
>> anything that is specific to the different users is stored in a structure 
>> that
>> contains struct drm_vblank_wait_item. This way, most of the drm.ko code only
>> needs to deal with struct drm_vblank_wait_item, not really knowing how it is
>> wrapped. The advantage of this rework is that it reduces the number of memory
>> allocations needed in some cases (from 2 to 1) and it also reduces the 
>> amount of
>> memory used on each allocation.
>>
>> But while this all sounds nice in the above description, I had the feeling 
>> that
>> this adds a lot of complexity and makes the code not-so-beautiful. If we ever
>> need more extensions/options to this feature, we're going to have to untangle
>> the code even more from the IOCTL part. We also have a small "abstraction 
>> break"
>> in change introduced in patch 6. And there's the risk that some of the 
>> reworks
>> may have added a regression somewhere.
>>
>> Based on that, I decided to also provide an alternative implementation. In 
>> this
>> implementation we add a new dev->vblank_kernel_list instead of reusing
>> dev->vblank_event_list, and add new code to handle that this. This new code 
>> is
>> completely based on the code that handles dev->vblank_kernel_list, so there's
>> some duplication here. On the other hand, since the in-Kernel infrastructure 
>> is
>> not tied to the IOCTL structure, we can very easily grow and adapt the Kernel
>> code without having to fight against the IOCTL code. And the risk of 
>> regressions
>> is greatly reduced.
>>
>> The second difference of this alternative implementation is the signature of 
>> the
>> drm_wait_vblank_kernel() function. While it could be exactly the same as the 
>> one
>> used in the other series, I decided to make it different so we can also 
>> choose
>> which one to use. In the 7-patch series implementation, all the user needs 
>> to do
>> is to allocate the structure, and call the function, properly setting all its
>> arguments. Then the function is responsible for parsing the arguments and
>> populating the structure based on that. On the alternative implementation, 
>> the
>> user has to fill the structure with the appropriate arguments, and then call
>> drm_wait_vblank_kernel() passing just the allocated structure as an argument.
>>
>> If you notice, you will see that these patches add a new debugfs file to
>> i915.ko. This file is just a very simple example user of the new interface,
>> which I used while developing. If you have difficulty understanding the new
>> interface, please also look at the i915/i915_debugfs.c diff. Of course, I 
>> plan
>> to exclude this debugfs interface from the final to-be-merged patches.
>>
>> So, in summary, we have a few things we need to discuss:
>>
>>  1. Which implementation to use?
>>
>>a. Just the 2 first patches of the 7-patch series?
>>   Pros:
>> - the Kernel users basically call the vblank IOCTL
>> - the code changes are minimal
>>  Cons:
>> - The amount of memory allocations and memory space consumed is not
>>   optimal
>>
>>b. The full 7-patch series?
>>   Pros:
>> - The same backend is used for both the Kernel and IOCTL.
>>   Cons:
>> - The code gets more complex.
>> - Extending the Kernel interface can get complicated due to the fact
>>   that it shares code with the IOCTL interface.
>> - I didn't really like this solution.
>>
>>c. The alternative implementation I wrote?
>>   Pros:
>> - It's independent from the IOCTL code, making furt

[RFC 0/7+1] Add in-kernel vblank delaying mechanism

2014-11-26 Thread Daniel Vetter
On Wed, Nov 19, 2014 at 05:47:07PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni 
> 
> Hi
> 
> ---
> TL;DR summary:
> I wrote patches. Help me choose the best implementation and interface.
> ---
> 
> So the i915.ko driver could use some mechanism to run functions after a given
> number of vblanks. Implementations for this mechanism were already proposed in
> the past (by Chris Wilson and others), but they were i915-specific instead of 
> a
> generic drm.ko implementation. We even had patches that make use of this new
> mechanism.
> 
> Since this is very similar to the vblank IOCTL we currently have, but with the
> caller being a Kernel module instead of user space, I decided to write an
> implementation that tries to reuse the current IOCTL infrastructure.
> 
> In the first patch we add all the necessary code to allow the modules to call
> the vblank ioctl from Kernel space: they provide a callback function that is
> going to be called instead of the traditional send_vblank_event(), which means
> the Kernel callers don't have to deal with the file descriptors and DRM 
> events.
> 
> In the second patch we add a simple extension of the feature, to allow the
> drivers to have their callbacks called in a non-atomic context.
> 
> In all the other subsequent patches we rework the underlying code so that
> the common aspects between the user space vblank IOCTL and the Kernel 
> interface
> are all stored in a single structure (struct drm_vblank_wait_item), and then
> anything that is specific to the different users is stored in a structure that
> contains struct drm_vblank_wait_item. This way, most of the drm.ko code only
> needs to deal with struct drm_vblank_wait_item, not really knowing how it is
> wrapped. The advantage of this rework is that it reduces the number of memory
> allocations needed in some cases (from 2 to 1) and it also reduces the amount 
> of
> memory used on each allocation.
> 
> But while this all sounds nice in the above description, I had the feeling 
> that
> this adds a lot of complexity and makes the code not-so-beautiful. If we ever
> need more extensions/options to this feature, we're going to have to untangle
> the code even more from the IOCTL part. We also have a small "abstraction 
> break"
> in change introduced in patch 6. And there's the risk that some of the reworks
> may have added a regression somewhere.
> 
> Based on that, I decided to also provide an alternative implementation. In 
> this
> implementation we add a new dev->vblank_kernel_list instead of reusing
> dev->vblank_event_list, and add new code to handle that this. This new code is
> completely based on the code that handles dev->vblank_kernel_list, so there's
> some duplication here. On the other hand, since the in-Kernel infrastructure 
> is
> not tied to the IOCTL structure, we can very easily grow and adapt the Kernel
> code without having to fight against the IOCTL code. And the risk of 
> regressions
> is greatly reduced.
> 
> The second difference of this alternative implementation is the signature of 
> the
> drm_wait_vblank_kernel() function. While it could be exactly the same as the 
> one
> used in the other series, I decided to make it different so we can also choose
> which one to use. In the 7-patch series implementation, all the user needs to 
> do
> is to allocate the structure, and call the function, properly setting all its
> arguments. Then the function is responsible for parsing the arguments and
> populating the structure based on that. On the alternative implementation, the
> user has to fill the structure with the appropriate arguments, and then call
> drm_wait_vblank_kernel() passing just the allocated structure as an argument.
> 
> If you notice, you will see that these patches add a new debugfs file to
> i915.ko. This file is just a very simple example user of the new interface,
> which I used while developing. If you have difficulty understanding the new
> interface, please also look at the i915/i915_debugfs.c diff. Of course, I plan
> to exclude this debugfs interface from the final to-be-merged patches.
> 
> So, in summary, we have a few things we need to discuss:
> 
>  1. Which implementation to use?
> 
>a. Just the 2 first patches of the 7-patch series?
>   Pros:
> - the Kernel users basically call the vblank IOCTL
> - the code changes are minimal
>  Cons:
> - The amount of memory allocations and memory space consumed is not
>   optimal
> 
>b. The full 7-patch series?
>   Pros:
> - The same backend is used for both the Kernel and IOCTL.
>   Cons:
> - The code gets more complex.
> - Extending the Kernel interface can get complicated due to the fact
>   that it shares code with the IOCTL interface.
> - I didn't really like this solution.
> 
>c. The alternative implementation I wrote?
>   Pros:
> - It's independent from the IOCTL code, making further patches easier.
>   

[RFC 0/7+1] Add in-kernel vblank delaying mechanism

2014-11-19 Thread Paulo Zanoni
From: Paulo Zanoni 

Hi

---
TL;DR summary:
I wrote patches. Help me choose the best implementation and interface.
---

So the i915.ko driver could use some mechanism to run functions after a given
number of vblanks. Implementations for this mechanism were already proposed in
the past (by Chris Wilson and others), but they were i915-specific instead of a
generic drm.ko implementation. We even had patches that make use of this new
mechanism.

Since this is very similar to the vblank IOCTL we currently have, but with the
caller being a Kernel module instead of user space, I decided to write an
implementation that tries to reuse the current IOCTL infrastructure.

In the first patch we add all the necessary code to allow the modules to call
the vblank ioctl from Kernel space: they provide a callback function that is
going to be called instead of the traditional send_vblank_event(), which means
the Kernel callers don't have to deal with the file descriptors and DRM events.

In the second patch we add a simple extension of the feature, to allow the
drivers to have their callbacks called in a non-atomic context.

In all the other subsequent patches we rework the underlying code so that
the common aspects between the user space vblank IOCTL and the Kernel interface
are all stored in a single structure (struct drm_vblank_wait_item), and then
anything that is specific to the different users is stored in a structure that
contains struct drm_vblank_wait_item. This way, most of the drm.ko code only
needs to deal with struct drm_vblank_wait_item, not really knowing how it is
wrapped. The advantage of this rework is that it reduces the number of memory
allocations needed in some cases (from 2 to 1) and it also reduces the amount of
memory used on each allocation.

But while this all sounds nice in the above description, I had the feeling that
this adds a lot of complexity and makes the code not-so-beautiful. If we ever
need more extensions/options to this feature, we're going to have to untangle
the code even more from the IOCTL part. We also have a small "abstraction break"
in change introduced in patch 6. And there's the risk that some of the reworks
may have added a regression somewhere.

Based on that, I decided to also provide an alternative implementation. In this
implementation we add a new dev->vblank_kernel_list instead of reusing
dev->vblank_event_list, and add new code to handle that this. This new code is
completely based on the code that handles dev->vblank_kernel_list, so there's
some duplication here. On the other hand, since the in-Kernel infrastructure is
not tied to the IOCTL structure, we can very easily grow and adapt the Kernel
code without having to fight against the IOCTL code. And the risk of regressions
is greatly reduced.

The second difference of this alternative implementation is the signature of the
drm_wait_vblank_kernel() function. While it could be exactly the same as the one
used in the other series, I decided to make it different so we can also choose
which one to use. In the 7-patch series implementation, all the user needs to do
is to allocate the structure, and call the function, properly setting all its
arguments. Then the function is responsible for parsing the arguments and
populating the structure based on that. On the alternative implementation, the
user has to fill the structure with the appropriate arguments, and then call
drm_wait_vblank_kernel() passing just the allocated structure as an argument.

If you notice, you will see that these patches add a new debugfs file to
i915.ko. This file is just a very simple example user of the new interface,
which I used while developing. If you have difficulty understanding the new
interface, please also look at the i915/i915_debugfs.c diff. Of course, I plan
to exclude this debugfs interface from the final to-be-merged patches.

So, in summary, we have a few things we need to discuss:

 1. Which implementation to use?

   a. Just the 2 first patches of the 7-patch series?
  Pros:
- the Kernel users basically call the vblank IOCTL
- the code changes are minimal
 Cons:
- The amount of memory allocations and memory space consumed is not
  optimal

   b. The full 7-patch series?
  Pros:
- The same backend is used for both the Kernel and IOCTL.
  Cons:
- The code gets more complex.
- Extending the Kernel interface can get complicated due to the fact
  that it shares code with the IOCTL interface.
- I didn't really like this solution.

   c. The alternative implementation I wrote?
  Pros:
- It's independent from the IOCTL code, making further patches easier.
  Cons:
- There is some duplicated code.

   d. Something totally different from these alternatives?

 2. How should the driver interface look like?

   a. All the possibilities are passed through the function call, so the drm.ko
  code needs to set the struct members itself.