[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Terje Bergström
On 11.03.2013 09:18, Thierry Reding wrote:
> This sound a bit over-engineered at this point in time. DRM is currently
> the only user. Is anybody working on any non-DRM drivers that would use
> this?

Well, this contains beginning of that:

http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=blob;f=drivers/media/video/tegra_v4l2_camera.c;h=644d0be5380367aca4c826c49724c03aad08387c;hb=l4t/l4t-r16-r2

I don't want to give these guys any excuse not to port it over to host1x
code base. :-)

> Even that aside, I don't think host1x_mem_handle is a good choice of
> name here. The objects are much more than handles. They are in fact
> buffer objects, which can optionally be attached to a handle. I also
> think that using a void * to store the handle specific data isn't such a
> good idea.

Naming if not an issue for me - we can easily agree on using _bo.

> So how about the following proposal, which I think might satisfy both of
> us:
> 
>   struct host1x_bo;
> 
>   struct host1x_bo_ops {
>   struct host1x_bo *(*get)(struct host1x_bo *bo);
>   void (*put)(struct host1x_bo *bo);
>   dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
>   ...
>   };
> 
>   struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
>   void host1x_bo_put(struct host1x_bo *bo);
>   dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
>   ...
> 
>   struct host1x_bo {
>   const struct host1x_bo_ops *ops;
>   ...
>   };
> 
>   struct tegra_drm_bo {
>   struct host1x_bo base;
>   ...
>   };
> 
> That way you can get rid of the host1x_memmgr_create_handle() helper and
> instead embed host1x_bo into driver-/framework-specific structures with
> the necessary initialization.

This would make sense. We'll get back when we have enough of
implementation done to understand it all. One consequence is that we
cannot use drm_gem_cma_create() anymore. We'll have to introduce a
function that does the same as drm_gem_cma_create(), but it takes a
pre-allocated drm_gem_cma_object pointer. That way we can allocate the
struct, and use DRM CMA just to initialize the drm_gem_cma_object.

Other way would be just taking a copy of DRM CMA helper, but I'd like to
defer that to the next step when we implement IOMMU aware allocator.

> It also allows you to interact directly with the objects instead of
> having to go through the memmgr API. The memory manager doesn't really
> exist anymore so keeping the name in the API is only confusing. Your
> current proposal deals with memory handles directly already so it's
> really just making the naming more consistent.

The memmgr APIs are currently just a shortcut wrapper to the ops, so in
that sense the memmgr does not really exist. I think it might still make
sense to keep static inline wrappers for calling the ops within, but we
could rename them to host1x_bo_somethingandother. Then it'd follow the
pattern we are using for the hw ops in the latest set.

Terje


[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Thierry Reding
On Mon, Mar 11, 2013 at 11:21:05AM +0200, Terje Bergstr?m wrote:
> On 11.03.2013 09:18, Thierry Reding wrote:
> > This sound a bit over-engineered at this point in time. DRM is currently
> > the only user. Is anybody working on any non-DRM drivers that would use
> > this?
> 
> Well, this contains beginning of that:
> 
> http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=blob;f=drivers/media/video/tegra_v4l2_camera.c;h=644d0be5380367aca4c826c49724c03aad08387c;hb=l4t/l4t-r16-r2
> 
> I don't want to give these guys any excuse not to port it over to host1x
> code base. :-)

I was aware of that driver but I didn't realize it had been available
publicly. It's great to see this, though, and one more argument in
favour of not binding the host1x_bo too tightly to DRM/GEM.

> > So how about the following proposal, which I think might satisfy both of
> > us:
> > 
> > struct host1x_bo;
> > 
> > struct host1x_bo_ops {
> > struct host1x_bo *(*get)(struct host1x_bo *bo);
> > void (*put)(struct host1x_bo *bo);
> > dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
> > ...
> > };
> > 
> > struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
> > void host1x_bo_put(struct host1x_bo *bo);
> > dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
> > ...
> > 
> > struct host1x_bo {
> > const struct host1x_bo_ops *ops;
> > ...
> > };
> > 
> > struct tegra_drm_bo {
> > struct host1x_bo base;
> > ...
> > };
> > 
> > That way you can get rid of the host1x_memmgr_create_handle() helper and
> > instead embed host1x_bo into driver-/framework-specific structures with
> > the necessary initialization.
> 
> This would make sense. We'll get back when we have enough of
> implementation done to understand it all. One consequence is that we
> cannot use drm_gem_cma_create() anymore. We'll have to introduce a
> function that does the same as drm_gem_cma_create(), but it takes a
> pre-allocated drm_gem_cma_object pointer. That way we can allocate the
> struct, and use DRM CMA just to initialize the drm_gem_cma_object.

I certainly think that introducing a drm_gem_cma_object_init() function
shouldn't pose a problem. If you do, make sure to update the existing
drm_gem_cma_create() to use it. Having both lets users have the choice
to use drm_gem_cma_create() if they don't need to embed it, or
drm_gem_cma_object_init() otherwise.

> Other way would be just taking a copy of DRM CMA helper, but I'd like to
> defer that to the next step when we implement IOMMU aware allocator.

I'm not sure I understand what you're saying, but if you add a function
as discussed above this shouldn't be necessary.

> > It also allows you to interact directly with the objects instead of
> > having to go through the memmgr API. The memory manager doesn't really
> > exist anymore so keeping the name in the API is only confusing. Your
> > current proposal deals with memory handles directly already so it's
> > really just making the naming more consistent.
> 
> The memmgr APIs are currently just a shortcut wrapper to the ops, so in
> that sense the memmgr does not really exist. I think it might still make
> sense to keep static inline wrappers for calling the ops within, but we
> could rename them to host1x_bo_somethingandother. Then it'd follow the
> pattern we are using for the hw ops in the latest set.

Yes, that's exactly what I had in mind in the above proposal. They could
be inline, but it's probably also okay if they're not. They aren't meant
to be used very frequently so the extra function call shouldn't matter
much. It might be easier to do add some additional checks if they aren't
inlined. I'm fine either way.

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 



[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Terje Bergström
On 08.03.2013 22:43, Thierry Reding wrote:
> A bo is just a buffer object, so I don't see why the name shouldn't
> be used. The name is in no way specific to DRM or GEM. But the point
> that I was trying to make was that there is nothing to suggest that
> we couldn't use drm_gem_object as the underlying scaffold to base all
> host1x buffer objects on.
> 
> Furthermore I don't understand why you've chosen this approach. It
> is completely different from what other drivers do and therefore
> makes it more difficult to comprehend. That alone I could live with
> if there were any advantages to that approach, but as far as I can
> tell there are none.

I was following the plan we agreed on earlier in email discussion with
you and Lucas:

On 29.11.2012 11:09, Lucas Stach wrote:
> We should aim for a clean split here. GEM handles are something which
> is really specific to how DRM works and as such should be constructed
> by tegradrm. nvhost should really just manage allocations/virtual
> address space and provide something that is able to back all the GEM
> handle operations.
> 
> nvhost has really no reason at all to even know about GEM handles.
> If you back a GEM object by a nvhost object you can just peel out
> the nvhost handles from the GEM wrapper in the tegradrm submit ioctl
> handler and queue the job to nvhost using it's native handles.
> 
> This way you would also be able to construct different handles (like
> GEM obj or V4L2 buffers) from the same backing nvhost object. Note
> that I'm not sure how useful this would be, but it seems like a
> reasonable design to me being able to do so.

With this structure, we are already prepared for non-DRM APIs. Tt's a
matter of familiarity of code versus future expansion. Code paths for
both are as simple/complex, so neither has a direct technical
superiority in performance.

I know other DRM drivers have opted to hard code GEM dependency
throughout the code. Then again, host1x hardware is managing much more
than graphics, so we need to think outside the DRM box, too.

Terje


[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Thierry Reding
On Mon, Mar 11, 2013 at 08:29:59AM +0200, Terje Bergstr?m wrote:
> On 08.03.2013 22:43, Thierry Reding wrote:
> > A bo is just a buffer object, so I don't see why the name shouldn't
> > be used. The name is in no way specific to DRM or GEM. But the point
> > that I was trying to make was that there is nothing to suggest that
> > we couldn't use drm_gem_object as the underlying scaffold to base all
> > host1x buffer objects on.
> > 
> > Furthermore I don't understand why you've chosen this approach. It
> > is completely different from what other drivers do and therefore
> > makes it more difficult to comprehend. That alone I could live with
> > if there were any advantages to that approach, but as far as I can
> > tell there are none.
> 
> I was following the plan we agreed on earlier in email discussion with
> you and Lucas:
> 
> On 29.11.2012 11:09, Lucas Stach wrote:
> > We should aim for a clean split here. GEM handles are something which
> > is really specific to how DRM works and as such should be constructed
> > by tegradrm. nvhost should really just manage allocations/virtual
> > address space and provide something that is able to back all the GEM
> > handle operations.
> > 
> > nvhost has really no reason at all to even know about GEM handles.
> > If you back a GEM object by a nvhost object you can just peel out
> > the nvhost handles from the GEM wrapper in the tegradrm submit ioctl
> > handler and queue the job to nvhost using it's native handles.
> > 
> > This way you would also be able to construct different handles (like
> > GEM obj or V4L2 buffers) from the same backing nvhost object. Note
> > that I'm not sure how useful this would be, but it seems like a
> > reasonable design to me being able to do so.
> 
> With this structure, we are already prepared for non-DRM APIs. Tt's a
> matter of familiarity of code versus future expansion. Code paths for
> both are as simple/complex, so neither has a direct technical
> superiority in performance.
> 
> I know other DRM drivers have opted to hard code GEM dependency
> throughout the code. Then again, host1x hardware is managing much more
> than graphics, so we need to think outside the DRM box, too.

This sound a bit over-engineered at this point in time. DRM is currently
the only user. Is anybody working on any non-DRM drivers that would use
this?

Even that aside, I don't think host1x_mem_handle is a good choice of
name here. The objects are much more than handles. They are in fact
buffer objects, which can optionally be attached to a handle. I also
think that using a void * to store the handle specific data isn't such a
good idea.

So how about the following proposal, which I think might satisfy both of
us:

struct host1x_bo;

struct host1x_bo_ops {
struct host1x_bo *(*get)(struct host1x_bo *bo);
void (*put)(struct host1x_bo *bo);
dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
...
};

struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
void host1x_bo_put(struct host1x_bo *bo);
dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
...

struct host1x_bo {
const struct host1x_bo_ops *ops;
...
};

struct tegra_drm_bo {
struct host1x_bo base;
...
};

That way you can get rid of the host1x_memmgr_create_handle() helper and
instead embed host1x_bo into driver-/framework-specific structures with
the necessary initialization.

It also allows you to interact directly with the objects instead of
having to go through the memmgr API. The memory manager doesn't really
exist anymore so keeping the name in the API is only confusing. Your
current proposal deals with memory handles directly already so it's
really just making the naming more consistent.

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 



Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Thierry Reding
On Mon, Mar 11, 2013 at 11:21:05AM +0200, Terje Bergström wrote:
> On 11.03.2013 09:18, Thierry Reding wrote:
> > This sound a bit over-engineered at this point in time. DRM is currently
> > the only user. Is anybody working on any non-DRM drivers that would use
> > this?
> 
> Well, this contains beginning of that:
> 
> http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=blob;f=drivers/media/video/tegra_v4l2_camera.c;h=644d0be5380367aca4c826c49724c03aad08387c;hb=l4t/l4t-r16-r2
> 
> I don't want to give these guys any excuse not to port it over to host1x
> code base. :-)

I was aware of that driver but I didn't realize it had been available
publicly. It's great to see this, though, and one more argument in
favour of not binding the host1x_bo too tightly to DRM/GEM.

> > So how about the following proposal, which I think might satisfy both of
> > us:
> > 
> > struct host1x_bo;
> > 
> > struct host1x_bo_ops {
> > struct host1x_bo *(*get)(struct host1x_bo *bo);
> > void (*put)(struct host1x_bo *bo);
> > dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
> > ...
> > };
> > 
> > struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
> > void host1x_bo_put(struct host1x_bo *bo);
> > dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
> > ...
> > 
> > struct host1x_bo {
> > const struct host1x_bo_ops *ops;
> > ...
> > };
> > 
> > struct tegra_drm_bo {
> > struct host1x_bo base;
> > ...
> > };
> > 
> > That way you can get rid of the host1x_memmgr_create_handle() helper and
> > instead embed host1x_bo into driver-/framework-specific structures with
> > the necessary initialization.
> 
> This would make sense. We'll get back when we have enough of
> implementation done to understand it all. One consequence is that we
> cannot use drm_gem_cma_create() anymore. We'll have to introduce a
> function that does the same as drm_gem_cma_create(), but it takes a
> pre-allocated drm_gem_cma_object pointer. That way we can allocate the
> struct, and use DRM CMA just to initialize the drm_gem_cma_object.

I certainly think that introducing a drm_gem_cma_object_init() function
shouldn't pose a problem. If you do, make sure to update the existing
drm_gem_cma_create() to use it. Having both lets users have the choice
to use drm_gem_cma_create() if they don't need to embed it, or
drm_gem_cma_object_init() otherwise.

> Other way would be just taking a copy of DRM CMA helper, but I'd like to
> defer that to the next step when we implement IOMMU aware allocator.

I'm not sure I understand what you're saying, but if you add a function
as discussed above this shouldn't be necessary.

> > It also allows you to interact directly with the objects instead of
> > having to go through the memmgr API. The memory manager doesn't really
> > exist anymore so keeping the name in the API is only confusing. Your
> > current proposal deals with memory handles directly already so it's
> > really just making the naming more consistent.
> 
> The memmgr APIs are currently just a shortcut wrapper to the ops, so in
> that sense the memmgr does not really exist. I think it might still make
> sense to keep static inline wrappers for calling the ops within, but we
> could rename them to host1x_bo_somethingandother. Then it'd follow the
> pattern we are using for the hw ops in the latest set.

Yes, that's exactly what I had in mind in the above proposal. They could
be inline, but it's probably also okay if they're not. They aren't meant
to be used very frequently so the extra function call shouldn't matter
much. It might be easier to do add some additional checks if they aren't
inlined. I'm fine either way.

Thierry


pgp7Ra4rYMFfZ.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Terje Bergström
On 11.03.2013 09:18, Thierry Reding wrote:
> This sound a bit over-engineered at this point in time. DRM is currently
> the only user. Is anybody working on any non-DRM drivers that would use
> this?

Well, this contains beginning of that:

http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=blob;f=drivers/media/video/tegra_v4l2_camera.c;h=644d0be5380367aca4c826c49724c03aad08387c;hb=l4t/l4t-r16-r2

I don't want to give these guys any excuse not to port it over to host1x
code base. :-)

> Even that aside, I don't think host1x_mem_handle is a good choice of
> name here. The objects are much more than handles. They are in fact
> buffer objects, which can optionally be attached to a handle. I also
> think that using a void * to store the handle specific data isn't such a
> good idea.

Naming if not an issue for me - we can easily agree on using _bo.

> So how about the following proposal, which I think might satisfy both of
> us:
> 
>   struct host1x_bo;
> 
>   struct host1x_bo_ops {
>   struct host1x_bo *(*get)(struct host1x_bo *bo);
>   void (*put)(struct host1x_bo *bo);
>   dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
>   ...
>   };
> 
>   struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
>   void host1x_bo_put(struct host1x_bo *bo);
>   dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
>   ...
> 
>   struct host1x_bo {
>   const struct host1x_bo_ops *ops;
>   ...
>   };
> 
>   struct tegra_drm_bo {
>   struct host1x_bo base;
>   ...
>   };
> 
> That way you can get rid of the host1x_memmgr_create_handle() helper and
> instead embed host1x_bo into driver-/framework-specific structures with
> the necessary initialization.

This would make sense. We'll get back when we have enough of
implementation done to understand it all. One consequence is that we
cannot use drm_gem_cma_create() anymore. We'll have to introduce a
function that does the same as drm_gem_cma_create(), but it takes a
pre-allocated drm_gem_cma_object pointer. That way we can allocate the
struct, and use DRM CMA just to initialize the drm_gem_cma_object.

Other way would be just taking a copy of DRM CMA helper, but I'd like to
defer that to the next step when we implement IOMMU aware allocator.

> It also allows you to interact directly with the objects instead of
> having to go through the memmgr API. The memory manager doesn't really
> exist anymore so keeping the name in the API is only confusing. Your
> current proposal deals with memory handles directly already so it's
> really just making the naming more consistent.

The memmgr APIs are currently just a shortcut wrapper to the ops, so in
that sense the memmgr does not really exist. I think it might still make
sense to keep static inline wrappers for calling the ops within, but we
could rename them to host1x_bo_somethingandother. Then it'd follow the
pattern we are using for the hw ops in the latest set.

Terje
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-11 Thread Thierry Reding
On Mon, Mar 11, 2013 at 08:29:59AM +0200, Terje Bergström wrote:
> On 08.03.2013 22:43, Thierry Reding wrote:
> > A bo is just a buffer object, so I don't see why the name shouldn't
> > be used. The name is in no way specific to DRM or GEM. But the point
> > that I was trying to make was that there is nothing to suggest that
> > we couldn't use drm_gem_object as the underlying scaffold to base all
> > host1x buffer objects on.
> > 
> > Furthermore I don't understand why you've chosen this approach. It
> > is completely different from what other drivers do and therefore
> > makes it more difficult to comprehend. That alone I could live with
> > if there were any advantages to that approach, but as far as I can
> > tell there are none.
> 
> I was following the plan we agreed on earlier in email discussion with
> you and Lucas:
> 
> On 29.11.2012 11:09, Lucas Stach wrote:
> > We should aim for a clean split here. GEM handles are something which
> > is really specific to how DRM works and as such should be constructed
> > by tegradrm. nvhost should really just manage allocations/virtual
> > address space and provide something that is able to back all the GEM
> > handle operations.
> > 
> > nvhost has really no reason at all to even know about GEM handles.
> > If you back a GEM object by a nvhost object you can just peel out
> > the nvhost handles from the GEM wrapper in the tegradrm submit ioctl
> > handler and queue the job to nvhost using it's native handles.
> > 
> > This way you would also be able to construct different handles (like
> > GEM obj or V4L2 buffers) from the same backing nvhost object. Note
> > that I'm not sure how useful this would be, but it seems like a
> > reasonable design to me being able to do so.
> 
> With this structure, we are already prepared for non-DRM APIs. Tt's a
> matter of familiarity of code versus future expansion. Code paths for
> both are as simple/complex, so neither has a direct technical
> superiority in performance.
> 
> I know other DRM drivers have opted to hard code GEM dependency
> throughout the code. Then again, host1x hardware is managing much more
> than graphics, so we need to think outside the DRM box, too.

This sound a bit over-engineered at this point in time. DRM is currently
the only user. Is anybody working on any non-DRM drivers that would use
this?

Even that aside, I don't think host1x_mem_handle is a good choice of
name here. The objects are much more than handles. They are in fact
buffer objects, which can optionally be attached to a handle. I also
think that using a void * to store the handle specific data isn't such a
good idea.

So how about the following proposal, which I think might satisfy both of
us:

struct host1x_bo;

struct host1x_bo_ops {
struct host1x_bo *(*get)(struct host1x_bo *bo);
void (*put)(struct host1x_bo *bo);
dma_addr_t (*pin)(struct host1x_bo *bo, struct sg_table **sgt);
...
};

struct host1x_bo *host1x_bo_get(struct host1x_bo *bo);
void host1x_bo_put(struct host1x_bo *bo);
dma_addr_t host1x_bo_pin(struct host1x_bo *bo, struct sg_table **sgt);
...

struct host1x_bo {
const struct host1x_bo_ops *ops;
...
};

struct tegra_drm_bo {
struct host1x_bo base;
...
};

That way you can get rid of the host1x_memmgr_create_handle() helper and
instead embed host1x_bo into driver-/framework-specific structures with
the necessary initialization.

It also allows you to interact directly with the objects instead of
having to go through the memmgr API. The memory manager doesn't really
exist anymore so keeping the name in the API is only confusing. Your
current proposal deals with memory handles directly already so it's
really just making the naming more consistent.

Thierry


pgpAZ7oCR9wg0.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-10 Thread Terje Bergström
On 08.03.2013 22:43, Thierry Reding wrote:
> A bo is just a buffer object, so I don't see why the name shouldn't
> be used. The name is in no way specific to DRM or GEM. But the point
> that I was trying to make was that there is nothing to suggest that
> we couldn't use drm_gem_object as the underlying scaffold to base all
> host1x buffer objects on.
> 
> Furthermore I don't understand why you've chosen this approach. It
> is completely different from what other drivers do and therefore
> makes it more difficult to comprehend. That alone I could live with
> if there were any advantages to that approach, but as far as I can
> tell there are none.

I was following the plan we agreed on earlier in email discussion with
you and Lucas:

On 29.11.2012 11:09, Lucas Stach wrote:
> We should aim for a clean split here. GEM handles are something which
> is really specific to how DRM works and as such should be constructed
> by tegradrm. nvhost should really just manage allocations/virtual
> address space and provide something that is able to back all the GEM
> handle operations.
> 
> nvhost has really no reason at all to even know about GEM handles.
> If you back a GEM object by a nvhost object you can just peel out
> the nvhost handles from the GEM wrapper in the tegradrm submit ioctl
> handler and queue the job to nvhost using it's native handles.
> 
> This way you would also be able to construct different handles (like
> GEM obj or V4L2 buffers) from the same backing nvhost object. Note
> that I'm not sure how useful this would be, but it seems like a
> reasonable design to me being able to do so.

With this structure, we are already prepared for non-DRM APIs. Tt's a
matter of familiarity of code versus future expansion. Code paths for
both are as simple/complex, so neither has a direct technical
superiority in performance.

I know other DRM drivers have opted to hard code GEM dependency
throughout the code. Then again, host1x hardware is managing much more
than graphics, so we need to think outside the DRM box, too.

Terje
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-08 Thread Thierry Reding
On Fri, Mar 08, 2013 at 06:16:16PM +0200, Terje Bergstr?m wrote:
> On 26.02.2013 11:48, Terje Bergstr?m wrote:
> > On 25.02.2013 17:24, Thierry Reding wrote:
[...]
> >>> +struct mem_handle;
> >>> +struct platform_device;
> >>> +
> >>> +struct host1x_job_unpin_data {
> >>> + struct mem_handle *h;
> >>> + struct sg_table *mem;
> >>> +};
> >>> +
> >>> +enum mem_mgr_flag {
> >>> + mem_mgr_flag_uncacheable = 0,
> >>> + mem_mgr_flag_write_combine = 1,
> >>> +};
> >>
> >> I'd like to see this use a more object-oriented approach and more common
> >> terminology. All of these handles are essentially buffer objects, so
> >> maybe something like host1x_bo would be a nice and short name.
> 
> We did this a bit differently, but following pretty much the same
> principles. We have host1x_mem_handle, which contains an ops pointer.
> The handle gets encapsulated inside drm_gem_cma_object.
> 
> _bo structs seem to usually contains a drm_gem_object, so we thought
> it's better not to reuse that term.
> 
> Please check the code and let us know what you think. This pretty much
> follows what Lucas proposed a while ago, and keeps neatly the DRM
> specific parts inside the drm directory.

A bo is just a buffer object, so I don't see why the name shouldn't be
used. The name is in no way specific to DRM or GEM. But the point that I
was trying to make was that there is nothing to suggest that we couldn't
use drm_gem_object as the underlying scaffold to base all host1x buffer
objects on.

Furthermore I don't understand why you've chosen this approach. It is
completely different from what other drivers do and therefore makes it
more difficult to comprehend. That alone I could live with if there were
any advantages to that approach, but as far as I can tell there are
none.

Thierry
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 



[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-08 Thread Terje Bergström
On 26.02.2013 11:48, Terje Bergstr?m wrote:
> On 25.02.2013 17:24, Thierry Reding wrote:
>> You use two different styles to indent the function parameters. You
>> might want to stick to one, preferably aligning them with the first
>> parameter on the first line.
> 
> I've generally favored "two tabs" indenting, but we'll anyway
> standardize on one.

We standardized on the convention used in tegradrm, i.e. aligning with
first parameter.

>> There's nothing in this function that requires a platform_device, so
>> passing struct device should be enough. Or maybe host1x_cdma should get
>> a struct device * field?
> 
> I think we'll just start using struct device * in general in code.
> Arto's been already fixing a lot of these, so he might've already fixed
> this.

We did a sweep in the code and now I hope everything that can, uses
struct device *. The side effect was getting rid of a lot of casting,
which is good.

>> Why don't you use any of the kernel's reference counting mechanisms?
>>
>>> +void host1x_channel_put(struct host1x_channel *ch)
>>> +{
>>> + mutex_lock(&ch->reflock);
>>> + if (ch->refcount == 1) {
>>> + host1x_get_host(ch->dev)->cdma_op.stop(&ch->cdma);
>>> + host1x_cdma_deinit(&ch->cdma);
>>> + }
>>> + ch->refcount--;
>>> + mutex_unlock(&ch->reflock);
>>> +}
>>
>> I think you can do all of this using a kref.
> 
> I think the original reason was that there's no reason to use atomic
> kref, as we anyway have to do mutual exclusion via mutex. But, using
> kref won't be any problem, so we could use that.

Actually, we ended up with a problem with this. kref assumes that once
refcount goes to zero, it gets destroyed. In ch->refcount, going to zero
is just fine and just indicates that we need to initialize. And, we
anyway need to do locking, so we didn't do the conversion to kref.

>>> +struct host1x_channel *host1x_channel_alloc(struct platform_device *pdev)
>>> +{
(...)
>>> +}
>>
>> I think the critical section could be shorter here. It's probably not
>> worth the extra trouble, though, given that channels are not often
>> allocated.
> 
> Yeah, boot time isn't measured in microseconds. :-) But, if we just make
> allocated_channels an atomic, we should be able to drop chlist_mutex
> altogether and it could simplify the code.

There wasn't much we could have moved outside the critical section, so
we didn't touch this area.

>> Also, is it really necessary to abstract these into an ops structure? I
>> get that newer hardware revisions might require different ops for sync-
>> point handling because the register layout or number of syncpoints may
>> be different, but the CDMA and push buffer (below) concepts are pretty
>> much a software abstraction, and as such its implementation is unlikely
>> to change with some future hardware revision.
> 
> Pushbuffer ops can become generic. There's only one catch - init uses
> the restart opcode. But the opcode is not going to change, so we can
> generalize that.

We ended up keeping the init as an operation, but rest of push buffer
ops became generic.

>>
>>> +/*
>>> + * Push two words to the push buffer
>>> + * Caller must ensure push buffer is not full
>>> + */
>>> +static void push_buffer_push_to(struct push_buffer *pb,
>>> + struct mem_handle *handle,
>>> + u32 op1, u32 op2)
>>> +{
>>> + u32 cur = pb->cur;
>>> + u32 *p = (u32 *)((u32)pb->mapped + cur);
>>
>> You do all this extra casting to make sure to increment by bytes and not
>> 32-bit words. How about you change pb->cur to contain the word index, so
>> that you don't have to go through hoops each time around.

When we changed DMASTART and DMAEND to actually denote the push buffer
area, we noticed that DMAGET and DMAPUT are actually relative to
DMASTART and DMAEND. This and the need to access both CPU and device
virtual addresses coupled with changing to word indexes didn't actually
simplify the code, so we kept still using byte indexes.

>>
>>> +/*
>>> + * Return the number of two word slots free in the push buffer
>>> + */
>>> +static u32 push_buffer_space(struct push_buffer *pb)
>>> +{
>>> + return ((pb->fence - pb->cur) & (PUSH_BUFFER_SIZE - 1)) / 8;
>>> +}
>>
>> Why & (PUSH_BUFFER_SIZE - 1) here? fence - cur can never be larger than
>> PUSH_BUFFER_SIZE, can it?
> 
> You're right, this function doesn't need to worry about wrapping.

Arto noticed this, but actually I was wrong - the wrapping is very
possible. We just have to remember that if we're processing something at
the end of push buffer, cur might be in the end, and fence in the beginning.

>>> diff --git a/drivers/gpu/host1x/memmgr.h b/drivers/gpu/host1x/memmgr.h
>> [...]
>>> +struct mem_handle;
>>> +struct platform_device;
>>> +
>>> +struct host1x_job_unpin_data {
>>> + struct mem_handle *h;
>>> + struct sg_table *mem;
>>> +};
>>> +
>>> +enum mem_mgr_flag {
>>> + mem_mgr_flag_uncacheable = 0,
>>> + mem_mgr_flag_write_combine = 1,
>>> +};
>>
>> I'd 

Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-08 Thread Thierry Reding
On Fri, Mar 08, 2013 at 06:16:16PM +0200, Terje Bergström wrote:
> On 26.02.2013 11:48, Terje Bergström wrote:
> > On 25.02.2013 17:24, Thierry Reding wrote:
[...]
> >>> +struct mem_handle;
> >>> +struct platform_device;
> >>> +
> >>> +struct host1x_job_unpin_data {
> >>> + struct mem_handle *h;
> >>> + struct sg_table *mem;
> >>> +};
> >>> +
> >>> +enum mem_mgr_flag {
> >>> + mem_mgr_flag_uncacheable = 0,
> >>> + mem_mgr_flag_write_combine = 1,
> >>> +};
> >>
> >> I'd like to see this use a more object-oriented approach and more common
> >> terminology. All of these handles are essentially buffer objects, so
> >> maybe something like host1x_bo would be a nice and short name.
> 
> We did this a bit differently, but following pretty much the same
> principles. We have host1x_mem_handle, which contains an ops pointer.
> The handle gets encapsulated inside drm_gem_cma_object.
> 
> _bo structs seem to usually contains a drm_gem_object, so we thought
> it's better not to reuse that term.
> 
> Please check the code and let us know what you think. This pretty much
> follows what Lucas proposed a while ago, and keeps neatly the DRM
> specific parts inside the drm directory.

A bo is just a buffer object, so I don't see why the name shouldn't be
used. The name is in no way specific to DRM or GEM. But the point that I
was trying to make was that there is nothing to suggest that we couldn't
use drm_gem_object as the underlying scaffold to base all host1x buffer
objects on.

Furthermore I don't understand why you've chosen this approach. It is
completely different from what other drivers do and therefore makes it
more difficult to comprehend. That alone I could live with if there were
any advantages to that approach, but as far as I can tell there are
none.

Thierry


pgpwKhzUdPjOx.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-03-08 Thread Terje Bergström
On 26.02.2013 11:48, Terje Bergström wrote:
> On 25.02.2013 17:24, Thierry Reding wrote:
>> You use two different styles to indent the function parameters. You
>> might want to stick to one, preferably aligning them with the first
>> parameter on the first line.
> 
> I've generally favored "two tabs" indenting, but we'll anyway
> standardize on one.

We standardized on the convention used in tegradrm, i.e. aligning with
first parameter.

>> There's nothing in this function that requires a platform_device, so
>> passing struct device should be enough. Or maybe host1x_cdma should get
>> a struct device * field?
> 
> I think we'll just start using struct device * in general in code.
> Arto's been already fixing a lot of these, so he might've already fixed
> this.

We did a sweep in the code and now I hope everything that can, uses
struct device *. The side effect was getting rid of a lot of casting,
which is good.

>> Why don't you use any of the kernel's reference counting mechanisms?
>>
>>> +void host1x_channel_put(struct host1x_channel *ch)
>>> +{
>>> + mutex_lock(&ch->reflock);
>>> + if (ch->refcount == 1) {
>>> + host1x_get_host(ch->dev)->cdma_op.stop(&ch->cdma);
>>> + host1x_cdma_deinit(&ch->cdma);
>>> + }
>>> + ch->refcount--;
>>> + mutex_unlock(&ch->reflock);
>>> +}
>>
>> I think you can do all of this using a kref.
> 
> I think the original reason was that there's no reason to use atomic
> kref, as we anyway have to do mutual exclusion via mutex. But, using
> kref won't be any problem, so we could use that.

Actually, we ended up with a problem with this. kref assumes that once
refcount goes to zero, it gets destroyed. In ch->refcount, going to zero
is just fine and just indicates that we need to initialize. And, we
anyway need to do locking, so we didn't do the conversion to kref.

>>> +struct host1x_channel *host1x_channel_alloc(struct platform_device *pdev)
>>> +{
(...)
>>> +}
>>
>> I think the critical section could be shorter here. It's probably not
>> worth the extra trouble, though, given that channels are not often
>> allocated.
> 
> Yeah, boot time isn't measured in microseconds. :-) But, if we just make
> allocated_channels an atomic, we should be able to drop chlist_mutex
> altogether and it could simplify the code.

There wasn't much we could have moved outside the critical section, so
we didn't touch this area.

>> Also, is it really necessary to abstract these into an ops structure? I
>> get that newer hardware revisions might require different ops for sync-
>> point handling because the register layout or number of syncpoints may
>> be different, but the CDMA and push buffer (below) concepts are pretty
>> much a software abstraction, and as such its implementation is unlikely
>> to change with some future hardware revision.
> 
> Pushbuffer ops can become generic. There's only one catch - init uses
> the restart opcode. But the opcode is not going to change, so we can
> generalize that.

We ended up keeping the init as an operation, but rest of push buffer
ops became generic.

>>
>>> +/*
>>> + * Push two words to the push buffer
>>> + * Caller must ensure push buffer is not full
>>> + */
>>> +static void push_buffer_push_to(struct push_buffer *pb,
>>> + struct mem_handle *handle,
>>> + u32 op1, u32 op2)
>>> +{
>>> + u32 cur = pb->cur;
>>> + u32 *p = (u32 *)((u32)pb->mapped + cur);
>>
>> You do all this extra casting to make sure to increment by bytes and not
>> 32-bit words. How about you change pb->cur to contain the word index, so
>> that you don't have to go through hoops each time around.

When we changed DMASTART and DMAEND to actually denote the push buffer
area, we noticed that DMAGET and DMAPUT are actually relative to
DMASTART and DMAEND. This and the need to access both CPU and device
virtual addresses coupled with changing to word indexes didn't actually
simplify the code, so we kept still using byte indexes.

>>
>>> +/*
>>> + * Return the number of two word slots free in the push buffer
>>> + */
>>> +static u32 push_buffer_space(struct push_buffer *pb)
>>> +{
>>> + return ((pb->fence - pb->cur) & (PUSH_BUFFER_SIZE - 1)) / 8;
>>> +}
>>
>> Why & (PUSH_BUFFER_SIZE - 1) here? fence - cur can never be larger than
>> PUSH_BUFFER_SIZE, can it?
> 
> You're right, this function doesn't need to worry about wrapping.

Arto noticed this, but actually I was wrong - the wrapping is very
possible. We just have to remember that if we're processing something at
the end of push buffer, cur might be in the end, and fence in the beginning.

>>> diff --git a/drivers/gpu/host1x/memmgr.h b/drivers/gpu/host1x/memmgr.h
>> [...]
>>> +struct mem_handle;
>>> +struct platform_device;
>>> +
>>> +struct host1x_job_unpin_data {
>>> + struct mem_handle *h;
>>> + struct sg_table *mem;
>>> +};
>>> +
>>> +enum mem_mgr_flag {
>>> + mem_mgr_flag_uncacheable = 0,
>>> + mem_mgr_flag_write_combine = 1,
>>> +};
>>
>> I'd 

[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-02-27 Thread Thierry Reding
On Tue, Feb 26, 2013 at 11:48:18AM +0200, Terje Bergstr?m wrote:
> On 25.02.2013 17:24, Thierry Reding wrote:
> > On Tue, Jan 15, 2013 at 01:43:59PM +0200, Terje Bergstrom wrote:
[...]
> >> +/*
> >> + * Start timer for a buffer submition that has completed yet.
> > 
> > "submission". And I don't understand the "that has completed yet" part.
> 
> It should become "Start timer that tracks the time spent by the job".

Yes, that's a lot better.

> >> + if (list_empty(&cdma->sync_queue) &&
> >> + cdma->event == CDMA_EVENT_SYNC_QUEUE_EMPTY)
> >> + signal = true;
> > 
> > This looks funny, maybe:
> > 
> > if (cdma->event == CDMA_EVENT_SYNC_QUEUE_EMPTY &&
> > list_empty(&cdma->sync_queue))
> > signal = true;
> > 
> > ?
> 
> Indenting at least is strange. I don't have a preference for the
> ordering of conditions, so if you like the latter order, we can just use
> that.

I just happen to find it easier to read that way. If you want to keep
the ordering that's fine, but the indentation needs to be fixed.

> >> +{
> >> + u32 get_restart;
> > 
> > Maybe just call this "restart" or "restart_addr". get_restart sounds
> > like a function name.
> 
> Ok, how about "restart_dmaget_addr"? That indicates what we're doing
> with the restart address.

Sounds good.

> >> + list_for_each_entry(job, &cdma->sync_queue, list) {
> >> + if (syncpt_val < job->syncpt_end)
> >> + break;
> >> +
> >> + host1x_job_dump(&dev->dev, job);
> >> + }
> > 
> > That's potentially a lot of debug output. I wonder if it might make
> > sense to control parts of this via a module parameter. Then again, if
> > somebody really needs to debug this, maybe they really want *all* the
> > information.
> 
> host1x_job_dump() uses dev_dbg(), so it only dumps a lot if DEBUG has
> been defined in that file.

Okay, let's leave it like that then.

> >> +/*
> >> + * Destroy a cdma
> >> + */
> >> +void host1x_cdma_deinit(struct host1x_cdma *cdma)
> >> +{
> >> + struct push_buffer *pb = &cdma->push_buffer;
> >> + struct host1x *host1x = cdma_to_host1x(cdma);
> >> +
> >> + if (cdma->running) {
> >> + pr_warn("%s: CDMA still running\n",
> >> + __func__);
> >> + } else {
> >> + host1x->cdma_pb_op.destroy(pb);
> >> + host1x->cdma_op.timeout_destroy(cdma);
> >> + }
> >> +}
> > 
> > There's no way to recover from the situation where a cdma is still
> > running. Can this not return an error code (-EBUSY?) if the cdma can't
> > be destroyed?
> 
> It's called from close(), which cannot return an error code. It's
> actually more of a power optimization. The effect is that if there are
> no users for channel, we'll just not free up the push buffer.
> 
> I think the proper fix would actually be to check in host1x_cdma_init()
> if push buffer is already allocated and cdma->running. In that case we
> could skip most of initialization.

Yes, in that case it might be useful to do this. I still think it's
worth to return an error code to the caller, even if it can't be
propagated. That way the caller at least has the possibility to react.

I'm still not quite sure I understand the necessity for this, though.
Maybe you can give an example of when this will actually happen?

> >> +/*
> >> + * cdma
> >> + *
> >> + * This is in charge of a host command DMA channel.
> >> + * Sends ops to a push buffer, and takes responsibility for unpinning
> >> + * (& possibly freeing) of memory after those ops have completed.
> >> + * Producer:
> >> + *   begin
> >> + *   push - send ops to the push buffer
> >> + *   end - start command DMA and enqueue handles to be unpinned
> >> + * Consumer:
> >> + *   update - call to update sync queue and push buffer, unpin memory
> >> + */
> > 
> > I find the name to be a bit confusing. For some reason I automatically
> > think of GSM when I read CDMA. This really is more of a job queue, so
> > maybe calling it host1x_job_queue might be more appropriate. But I've
> > already requested a lot of things to be renamed, so I think I can live
> > with this being called CDMA if you don't want to change it.
> > 
> > Alternatively all of these could be moved to the struct host1x_channel
> > given that there's only one of each of the push_buffer, buffer_timeout
> > and host1x_cma objects per channel.
> 
> I did consider merging those two at a time. That should work, as they
> both deal with channels essentially. I also saw that the resulting file
> and data structures became quite large, so I have so far preferred to
> keep them separate.
> 
> This way I can keep the "higher level" stuff (inserting setclass,
> serializing, allocating sync point ranges, etc) in one file and lower
> level stuff (write to hardware, deal with push buffer pointers, etc) in
> another.

Alright. I can live with that.

> >> +int host1x_channel_submit(struct host1x_job *

Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-02-27 Thread Thierry Reding
On Tue, Feb 26, 2013 at 11:48:18AM +0200, Terje Bergström wrote:
> On 25.02.2013 17:24, Thierry Reding wrote:
> > On Tue, Jan 15, 2013 at 01:43:59PM +0200, Terje Bergstrom wrote:
[...]
> >> +/*
> >> + * Start timer for a buffer submition that has completed yet.
> > 
> > "submission". And I don't understand the "that has completed yet" part.
> 
> It should become "Start timer that tracks the time spent by the job".

Yes, that's a lot better.

> >> + if (list_empty(&cdma->sync_queue) &&
> >> + cdma->event == CDMA_EVENT_SYNC_QUEUE_EMPTY)
> >> + signal = true;
> > 
> > This looks funny, maybe:
> > 
> > if (cdma->event == CDMA_EVENT_SYNC_QUEUE_EMPTY &&
> > list_empty(&cdma->sync_queue))
> > signal = true;
> > 
> > ?
> 
> Indenting at least is strange. I don't have a preference for the
> ordering of conditions, so if you like the latter order, we can just use
> that.

I just happen to find it easier to read that way. If you want to keep
the ordering that's fine, but the indentation needs to be fixed.

> >> +{
> >> + u32 get_restart;
> > 
> > Maybe just call this "restart" or "restart_addr". get_restart sounds
> > like a function name.
> 
> Ok, how about "restart_dmaget_addr"? That indicates what we're doing
> with the restart address.

Sounds good.

> >> + list_for_each_entry(job, &cdma->sync_queue, list) {
> >> + if (syncpt_val < job->syncpt_end)
> >> + break;
> >> +
> >> + host1x_job_dump(&dev->dev, job);
> >> + }
> > 
> > That's potentially a lot of debug output. I wonder if it might make
> > sense to control parts of this via a module parameter. Then again, if
> > somebody really needs to debug this, maybe they really want *all* the
> > information.
> 
> host1x_job_dump() uses dev_dbg(), so it only dumps a lot if DEBUG has
> been defined in that file.

Okay, let's leave it like that then.

> >> +/*
> >> + * Destroy a cdma
> >> + */
> >> +void host1x_cdma_deinit(struct host1x_cdma *cdma)
> >> +{
> >> + struct push_buffer *pb = &cdma->push_buffer;
> >> + struct host1x *host1x = cdma_to_host1x(cdma);
> >> +
> >> + if (cdma->running) {
> >> + pr_warn("%s: CDMA still running\n",
> >> + __func__);
> >> + } else {
> >> + host1x->cdma_pb_op.destroy(pb);
> >> + host1x->cdma_op.timeout_destroy(cdma);
> >> + }
> >> +}
> > 
> > There's no way to recover from the situation where a cdma is still
> > running. Can this not return an error code (-EBUSY?) if the cdma can't
> > be destroyed?
> 
> It's called from close(), which cannot return an error code. It's
> actually more of a power optimization. The effect is that if there are
> no users for channel, we'll just not free up the push buffer.
> 
> I think the proper fix would actually be to check in host1x_cdma_init()
> if push buffer is already allocated and cdma->running. In that case we
> could skip most of initialization.

Yes, in that case it might be useful to do this. I still think it's
worth to return an error code to the caller, even if it can't be
propagated. That way the caller at least has the possibility to react.

I'm still not quite sure I understand the necessity for this, though.
Maybe you can give an example of when this will actually happen?

> >> +/*
> >> + * cdma
> >> + *
> >> + * This is in charge of a host command DMA channel.
> >> + * Sends ops to a push buffer, and takes responsibility for unpinning
> >> + * (& possibly freeing) of memory after those ops have completed.
> >> + * Producer:
> >> + *   begin
> >> + *   push - send ops to the push buffer
> >> + *   end - start command DMA and enqueue handles to be unpinned
> >> + * Consumer:
> >> + *   update - call to update sync queue and push buffer, unpin memory
> >> + */
> > 
> > I find the name to be a bit confusing. For some reason I automatically
> > think of GSM when I read CDMA. This really is more of a job queue, so
> > maybe calling it host1x_job_queue might be more appropriate. But I've
> > already requested a lot of things to be renamed, so I think I can live
> > with this being called CDMA if you don't want to change it.
> > 
> > Alternatively all of these could be moved to the struct host1x_channel
> > given that there's only one of each of the push_buffer, buffer_timeout
> > and host1x_cma objects per channel.
> 
> I did consider merging those two at a time. That should work, as they
> both deal with channels essentially. I also saw that the resulting file
> and data structures became quite large, so I have so far preferred to
> keep them separate.
> 
> This way I can keep the "higher level" stuff (inserting setclass,
> serializing, allocating sync point ranges, etc) in one file and lower
> level stuff (write to hardware, deal with push buffer pointers, etc) in
> another.

Alright. I can live with that.

> >> +int host1x_channel_submit(struct host1x_job *

[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-02-26 Thread Terje Bergström
On 25.02.2013 17:24, Thierry Reding wrote:
> * PGP Signed by an unknown key
> 
> On Tue, Jan 15, 2013 at 01:43:59PM +0200, Terje Bergstrom wrote:
> [...]
>> diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
>> index e89fb2b..57680a6 100644
>> --- a/drivers/gpu/host1x/Kconfig
>> +++ b/drivers/gpu/host1x/Kconfig
>> @@ -3,4 +3,27 @@ config TEGRA_HOST1X
>>   help
>> Driver for the Tegra host1x hardware.
>>
>> -   Required for enabling tegradrm.
>> +   Required for enabling tegradrm and 2D acceleration.
> 
> I don't think I commented on this in the other patches, but I think this
> could use a bit more information about what host1x is. Also mentioning
> that it is a requirement for tegra-drm and 2D acceleration isn't very
> useful because it can equally well be expressed in Kconfig. If you add
> some description about what host1x is, people will know that they want
> to enable it.

Ok, we'll rewrite that. I think we can reuse the text from commit msg
that I stole from Stephen's appnote.

> 
>> +if TEGRA_HOST1X
>> +
>> +config TEGRA_HOST1X_CMA
>> + bool "Support DRM CMA buffers"
>> + depends on DRM
>> + default y
>> + select DRM_GEM_CMA_HELPER
>> + select DRM_KMS_CMA_HELPER
>> + help
>> +   Say yes if you wish to use DRM CMA buffers.
>> +
>> +   If unsure, choose Y.
> 
> Perhaps make this not user-selectable (for now)? If somebody disables
> this explicitly they won't get a working driver, right?

True, there's no alternative, so it should not be user selectable.

> 
>> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
> [...]
>> +#include "cdma.h"
>> +#include "channel.h"
>> +#include "dev.h"
>> +#include "memmgr.h"
>> +#include "job.h"
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#define TRACE_MAX_LENGTH 128U
> 
> "" includes generally follow <> ones.

Will do.

> 
>> +/*
>> + * Add an entry to the sync queue.
>> + */
>> +static void add_to_sync_queue(struct host1x_cdma *cdma,
>> +   struct host1x_job *job,
>> +   u32 nr_slots,
>> +   u32 first_get)
>> +{
>> + if (job->syncpt_id == NVSYNCPT_INVALID) {
>> + dev_warn(&job->ch->dev->dev, "%s: Invalid syncpt\n",
>> + __func__);
>> + return;
>> + }
>> +
>> + job->first_get = first_get;
>> + job->num_slots = nr_slots;
>> + host1x_job_get(job);
>> + list_add_tail(&job->list, &cdma->sync_queue);
>> +}
> 
> It's a bit odd that you pass a job in here along with some parameters
> that are then assigned to the job's fields. Couldn't you just assign
> them to the job's fields before passing the job into this function?
> 
> I also see that you only use this function once, so maybe you could
> open-code it instead.

I think open coding would be the best choice. There's no real reason to
have this as separate function. That'd solve the odd parameters
phenomenon, too.

> 
>> +/*
>> + * Return the status of the cdma's sync queue or push buffer for the given 
>> event
>> + *  - sq empty: returns 1 for empty, 0 for not empty (as in "1 empty queue" 
>> :-)
>> + *  - pb space: returns the number of free slots in the channel's push 
>> buffer
>> + * Must be called with the cdma lock held.
>> + */
>> +static unsigned int cdma_status_locked(struct host1x_cdma *cdma,
>> + enum cdma_event event)
>> +{
>> + struct host1x *host1x = cdma_to_host1x(cdma);
>> + switch (event) {
>> + case CDMA_EVENT_SYNC_QUEUE_EMPTY:
>> + return list_empty(&cdma->sync_queue) ? 1 : 0;
>> + case CDMA_EVENT_PUSH_BUFFER_SPACE: {
>> + struct push_buffer *pb = &cdma->push_buffer;
>> + return host1x->cdma_pb_op.space(pb);
>> + }
>> + default:
>> + return 0;
>> + }
>> +}
> 
> Similarly this function is only used in one place and it requires a
> whole lot of documentation to define the meaning of the return value. If
> you implement this functionality directly in host1x_cdma_wait_locked()
> you have much more context and don't require all this "protocol".

I agree, this function is confusing. For some future functionality, it's
going to be called from a second place with CDMA_EVENT_SYNC_QUEUE_EMPTY,
but it's better of both of those calls are just opened up to get rid of
the extra switch().

> 
>> +/*
>> + * Start timer for a buffer submition that has completed yet.
> 
> "submission". And I don't understand the "that has completed yet" part.

It should become "Start timer that tracks the time spent by the job".

> 
>> + * Must be called with the cdma lock held.
>> + */
>> +static void cdma_start_timer_locked(struct host1x_cdma *cdma,
>> + struct host1x_job *job)
> 
> You use two different styles to indent the function parameters. You
> might want to stick to one, preferably aligning them with the first
> parameter on the first line.

I've generally fa

[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-02-25 Thread Thierry Reding
On Tue, Jan 15, 2013 at 01:43:59PM +0200, Terje Bergstrom wrote:
[...]
> diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
> index e89fb2b..57680a6 100644
> --- a/drivers/gpu/host1x/Kconfig
> +++ b/drivers/gpu/host1x/Kconfig
> @@ -3,4 +3,27 @@ config TEGRA_HOST1X
>   help
> Driver for the Tegra host1x hardware.
>  
> -   Required for enabling tegradrm.
> +   Required for enabling tegradrm and 2D acceleration.

I don't think I commented on this in the other patches, but I think this
could use a bit more information about what host1x is. Also mentioning
that it is a requirement for tegra-drm and 2D acceleration isn't very
useful because it can equally well be expressed in Kconfig. If you add
some description about what host1x is, people will know that they want
to enable it.

> +if TEGRA_HOST1X
> +
> +config TEGRA_HOST1X_CMA
> + bool "Support DRM CMA buffers"
> + depends on DRM
> + default y
> + select DRM_GEM_CMA_HELPER
> + select DRM_KMS_CMA_HELPER
> + help
> +   Say yes if you wish to use DRM CMA buffers.
> +
> +   If unsure, choose Y.

Perhaps make this not user-selectable (for now)? If somebody disables
this explicitly they won't get a working driver, right?

> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
[...]
> +#include "cdma.h"
> +#include "channel.h"
> +#include "dev.h"
> +#include "memmgr.h"
> +#include "job.h"
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define TRACE_MAX_LENGTH 128U

"" includes generally follow <> ones.

> +/*
> + * Add an entry to the sync queue.
> + */
> +static void add_to_sync_queue(struct host1x_cdma *cdma,
> +   struct host1x_job *job,
> +   u32 nr_slots,
> +   u32 first_get)
> +{
> + if (job->syncpt_id == NVSYNCPT_INVALID) {
> + dev_warn(&job->ch->dev->dev, "%s: Invalid syncpt\n",
> + __func__);
> + return;
> + }
> +
> + job->first_get = first_get;
> + job->num_slots = nr_slots;
> + host1x_job_get(job);
> + list_add_tail(&job->list, &cdma->sync_queue);
> +}

It's a bit odd that you pass a job in here along with some parameters
that are then assigned to the job's fields. Couldn't you just assign
them to the job's fields before passing the job into this function?

I also see that you only use this function once, so maybe you could
open-code it instead.

> +/*
> + * Return the status of the cdma's sync queue or push buffer for the given 
> event
> + *  - sq empty: returns 1 for empty, 0 for not empty (as in "1 empty queue" 
> :-)
> + *  - pb space: returns the number of free slots in the channel's push buffer
> + * Must be called with the cdma lock held.
> + */
> +static unsigned int cdma_status_locked(struct host1x_cdma *cdma,
> + enum cdma_event event)
> +{
> + struct host1x *host1x = cdma_to_host1x(cdma);
> + switch (event) {
> + case CDMA_EVENT_SYNC_QUEUE_EMPTY:
> + return list_empty(&cdma->sync_queue) ? 1 : 0;
> + case CDMA_EVENT_PUSH_BUFFER_SPACE: {
> + struct push_buffer *pb = &cdma->push_buffer;
> + return host1x->cdma_pb_op.space(pb);
> + }
> + default:
> + return 0;
> + }
> +}

Similarly this function is only used in one place and it requires a
whole lot of documentation to define the meaning of the return value. If
you implement this functionality directly in host1x_cdma_wait_locked()
you have much more context and don't require all this "protocol".

> +/*
> + * Start timer for a buffer submition that has completed yet.

"submission". And I don't understand the "that has completed yet" part.

> + * Must be called with the cdma lock held.
> + */
> +static void cdma_start_timer_locked(struct host1x_cdma *cdma,
> + struct host1x_job *job)

You use two different styles to indent the function parameters. You
might want to stick to one, preferably aligning them with the first
parameter on the first line.

> +{
> + struct host1x *host = cdma_to_host1x(cdma);
> +
> + if (cdma->timeout.clientid) {
> + /* timer already started */
> + return;
> + }
> +
> + cdma->timeout.clientid = job->clientid;
> + cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
> + cdma->timeout.syncpt_val = job->syncpt_end;
> + cdma->timeout.start_ktime = ktime_get();
> +
> + schedule_delayed_work(&cdma->timeout.wq,
> + msecs_to_jiffies(job->timeout));
> +}
> +
> +/*
> + * Stop timer when a buffer submition completes.

"submission"

> +/*
> + * For all sync queue entries that have already finished according to the
> + * current sync point registers:
> + *  - unpin & unref their mems
> + *  - pop their push buffer slots
> + *  - remove them from the sync queue
> + * This is normally called from the host code's worker thread, but can be
> + * called m

Re: [PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-02-25 Thread Thierry Reding
On Tue, Jan 15, 2013 at 01:43:59PM +0200, Terje Bergstrom wrote:
[...]
> diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
> index e89fb2b..57680a6 100644
> --- a/drivers/gpu/host1x/Kconfig
> +++ b/drivers/gpu/host1x/Kconfig
> @@ -3,4 +3,27 @@ config TEGRA_HOST1X
>   help
> Driver for the Tegra host1x hardware.
>  
> -   Required for enabling tegradrm.
> +   Required for enabling tegradrm and 2D acceleration.

I don't think I commented on this in the other patches, but I think this
could use a bit more information about what host1x is. Also mentioning
that it is a requirement for tegra-drm and 2D acceleration isn't very
useful because it can equally well be expressed in Kconfig. If you add
some description about what host1x is, people will know that they want
to enable it.

> +if TEGRA_HOST1X
> +
> +config TEGRA_HOST1X_CMA
> + bool "Support DRM CMA buffers"
> + depends on DRM
> + default y
> + select DRM_GEM_CMA_HELPER
> + select DRM_KMS_CMA_HELPER
> + help
> +   Say yes if you wish to use DRM CMA buffers.
> +
> +   If unsure, choose Y.

Perhaps make this not user-selectable (for now)? If somebody disables
this explicitly they won't get a working driver, right?

> diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
[...]
> +#include "cdma.h"
> +#include "channel.h"
> +#include "dev.h"
> +#include "memmgr.h"
> +#include "job.h"
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define TRACE_MAX_LENGTH 128U

"" includes generally follow <> ones.

> +/*
> + * Add an entry to the sync queue.
> + */
> +static void add_to_sync_queue(struct host1x_cdma *cdma,
> +   struct host1x_job *job,
> +   u32 nr_slots,
> +   u32 first_get)
> +{
> + if (job->syncpt_id == NVSYNCPT_INVALID) {
> + dev_warn(&job->ch->dev->dev, "%s: Invalid syncpt\n",
> + __func__);
> + return;
> + }
> +
> + job->first_get = first_get;
> + job->num_slots = nr_slots;
> + host1x_job_get(job);
> + list_add_tail(&job->list, &cdma->sync_queue);
> +}

It's a bit odd that you pass a job in here along with some parameters
that are then assigned to the job's fields. Couldn't you just assign
them to the job's fields before passing the job into this function?

I also see that you only use this function once, so maybe you could
open-code it instead.

> +/*
> + * Return the status of the cdma's sync queue or push buffer for the given 
> event
> + *  - sq empty: returns 1 for empty, 0 for not empty (as in "1 empty queue" 
> :-)
> + *  - pb space: returns the number of free slots in the channel's push buffer
> + * Must be called with the cdma lock held.
> + */
> +static unsigned int cdma_status_locked(struct host1x_cdma *cdma,
> + enum cdma_event event)
> +{
> + struct host1x *host1x = cdma_to_host1x(cdma);
> + switch (event) {
> + case CDMA_EVENT_SYNC_QUEUE_EMPTY:
> + return list_empty(&cdma->sync_queue) ? 1 : 0;
> + case CDMA_EVENT_PUSH_BUFFER_SPACE: {
> + struct push_buffer *pb = &cdma->push_buffer;
> + return host1x->cdma_pb_op.space(pb);
> + }
> + default:
> + return 0;
> + }
> +}

Similarly this function is only used in one place and it requires a
whole lot of documentation to define the meaning of the return value. If
you implement this functionality directly in host1x_cdma_wait_locked()
you have much more context and don't require all this "protocol".

> +/*
> + * Start timer for a buffer submition that has completed yet.

"submission". And I don't understand the "that has completed yet" part.

> + * Must be called with the cdma lock held.
> + */
> +static void cdma_start_timer_locked(struct host1x_cdma *cdma,
> + struct host1x_job *job)

You use two different styles to indent the function parameters. You
might want to stick to one, preferably aligning them with the first
parameter on the first line.

> +{
> + struct host1x *host = cdma_to_host1x(cdma);
> +
> + if (cdma->timeout.clientid) {
> + /* timer already started */
> + return;
> + }
> +
> + cdma->timeout.clientid = job->clientid;
> + cdma->timeout.syncpt = host1x_syncpt_get(host, job->syncpt_id);
> + cdma->timeout.syncpt_val = job->syncpt_end;
> + cdma->timeout.start_ktime = ktime_get();
> +
> + schedule_delayed_work(&cdma->timeout.wq,
> + msecs_to_jiffies(job->timeout));
> +}
> +
> +/*
> + * Stop timer when a buffer submition completes.

"submission"

> +/*
> + * For all sync queue entries that have already finished according to the
> + * current sync point registers:
> + *  - unpin & unref their mems
> + *  - pop their push buffer slots
> + *  - remove them from the sync queue
> + * This is normally called from the host code's worker thread, but can be
> + * called m

[PATCHv5,RESEND 3/8] gpu: host1x: Add channel support

2013-01-15 Thread Terje Bergstrom
Add support for host1x client modules, and host1x channels to submit
work to the clients. The work is submitted in GEM CMA buffers, so
this patch adds support for them.

Signed-off-by: Terje Bergstrom 
---
 drivers/gpu/host1x/Kconfig  |   25 +-
 drivers/gpu/host1x/Makefile |5 +
 drivers/gpu/host1x/cdma.c   |  439 +++
 drivers/gpu/host1x/cdma.h   |  107 +
 drivers/gpu/host1x/channel.c|  140 ++
 drivers/gpu/host1x/channel.h|   58 +++
 drivers/gpu/host1x/cma.c|  116 +
 drivers/gpu/host1x/cma.h|   43 ++
 drivers/gpu/host1x/dev.c|   13 +
 drivers/gpu/host1x/dev.h|   59 +++
 drivers/gpu/host1x/host1x.h |   29 ++
 drivers/gpu/host1x/hw/cdma_hw.c |  475 +
 drivers/gpu/host1x/hw/cdma_hw.h |   37 ++
 drivers/gpu/host1x/hw/channel_hw.c  |  148 +++
 drivers/gpu/host1x/hw/host1x01.c|6 +
 drivers/gpu/host1x/hw/host1x01_hardware.h   |  124 ++
 drivers/gpu/host1x/hw/hw_host1x01_channel.h |  102 +
 drivers/gpu/host1x/hw/hw_host1x01_sync.h|   12 +
 drivers/gpu/host1x/hw/hw_host1x01_uclass.h  |  168 
 drivers/gpu/host1x/hw/syncpt_hw.c   |   10 +
 drivers/gpu/host1x/intr.c   |   29 +-
 drivers/gpu/host1x/intr.h   |6 +
 drivers/gpu/host1x/job.c|  612 +++
 drivers/gpu/host1x/job.h|  164 +++
 drivers/gpu/host1x/memmgr.c |  173 
 drivers/gpu/host1x/memmgr.h |   72 
 drivers/gpu/host1x/syncpt.c |   11 +
 drivers/gpu/host1x/syncpt.h |4 +
 include/trace/events/host1x.h   |  211 +
 29 files changed, 3396 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/host1x/cdma.c
 create mode 100644 drivers/gpu/host1x/cdma.h
 create mode 100644 drivers/gpu/host1x/channel.c
 create mode 100644 drivers/gpu/host1x/channel.h
 create mode 100644 drivers/gpu/host1x/cma.c
 create mode 100644 drivers/gpu/host1x/cma.h
 create mode 100644 drivers/gpu/host1x/host1x.h
 create mode 100644 drivers/gpu/host1x/hw/cdma_hw.c
 create mode 100644 drivers/gpu/host1x/hw/cdma_hw.h
 create mode 100644 drivers/gpu/host1x/hw/channel_hw.c
 create mode 100644 drivers/gpu/host1x/hw/hw_host1x01_channel.h
 create mode 100644 drivers/gpu/host1x/hw/hw_host1x01_uclass.h
 create mode 100644 drivers/gpu/host1x/job.c
 create mode 100644 drivers/gpu/host1x/job.h
 create mode 100644 drivers/gpu/host1x/memmgr.c
 create mode 100644 drivers/gpu/host1x/memmgr.h

diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
index e89fb2b..57680a6 100644
--- a/drivers/gpu/host1x/Kconfig
+++ b/drivers/gpu/host1x/Kconfig
@@ -3,4 +3,27 @@ config TEGRA_HOST1X
help
  Driver for the Tegra host1x hardware.

- Required for enabling tegradrm.
+ Required for enabling tegradrm and 2D acceleration.
+
+if TEGRA_HOST1X
+
+config TEGRA_HOST1X_CMA
+   bool "Support DRM CMA buffers"
+   depends on DRM
+   default y
+   select DRM_GEM_CMA_HELPER
+   select DRM_KMS_CMA_HELPER
+   help
+ Say yes if you wish to use DRM CMA buffers.
+
+ If unsure, choose Y.
+
+config TEGRA_HOST1X_FIREWALL
+   bool "Enable HOST1X security firewall"
+   default y
+   help
+ Say yes if kernel should protect command streams from tampering.
+
+ If unsure, choose Y.
+
+endif
diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile
index 5ef47ff..cdd87c8 100644
--- a/drivers/gpu/host1x/Makefile
+++ b/drivers/gpu/host1x/Makefile
@@ -4,6 +4,11 @@ host1x-y = \
syncpt.o \
dev.o \
intr.o \
+   cdma.o \
+   channel.o \
+   job.o \
+   memmgr.o \
hw/host1x01.o

+host1x-$(CONFIG_TEGRA_HOST1X_CMA) += cma.o
 obj-$(CONFIG_TEGRA_HOST1X) += host1x.o
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
new file mode 100644
index 000..d6a38d2
--- /dev/null
+++ b/drivers/gpu/host1x/cdma.c
@@ -0,0 +1,439 @@
+/*
+ * Tegra host1x Command DMA
+ *
+ * Copyright (c) 2010-2013, NVIDIA Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include "cdma.h"
+#include "channel.h"
+#in