[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Rob Clark
On Thu, Oct 1, 2015 at 8:59 PM, Stephen Boyd  wrote:
> On 10/01, Rob Clark wrote:
>> On Thu, Oct 1, 2015 at 3:25 PM, Rob Clark  wrote:
>> > On Thu, Oct 1, 2015 at 2:00 PM, Stephen Boyd  
>> > wrote:
>> >>
>> >> The simplest thing to do is to split the memory between GPU and
>> >> vidc statically. The other use cases with preemption and eviction
>> >> and DMA add a lot of complexity that we can explore at a later
>> >> time if need be.
>> >
>> > true, as long as one of the clients is the static gpu client, I guess
>> > we could reasonably easily support up to two clients...
>>
>> btw, random thought..  drm_mm is a utility in drm that serves a
>> similar function to genalloc for graphics drivers to manage their
>> address space(s) (used for everything from mmap virtual address space
>> of buffers allocated against the device to managing vram and gart
>> allocations, etc... when the vram carveout is used w/ drm/msm (due to
>> no iommu) I use it to manage allocations from the carveout).  It has
>> some potentially convenient twists, like supporting allocation from
>> the top of the "address space" instead of the bottom.  I'm thinking
>> in particular of making "narrow mode" allocations from the top and
>> "wide mode" allocations from the bottom, since wide vs narrow can
>> only be set per region and not per macro within the region.  (It can
>> also search by first-fit or best-fit.. although not sure if that is
>> useful to us, since OCMEM size is relatively constrained.)
>>
>> Not that I really want to keep the ocmem allocator in drm.. I'd
>> really rather it be someone else's headache once it gets to
>> implementing the crazy stuff for all the random use-cases of the
>> other OCMEM users, since the gpu's use of OCMEM is rather
>> simple/static..
>>
>> The way the downstream driver does this is with a bunch of extra
>> bookkeeping on top of genalloc so it can do a dummy allocation to
>> force "from the top" allocation (and then immediately free the dummy
>> allocation)..  Maybe it just makes sense to teach genalloc how to do
>> from-top vs from-bottom allocations?  Not really sure..
>>
>
> It looks like video and GPU both use wide mode, if I'm reading
> the code correctly. So it seems that we don't need to do anything
> special for allocations: just hand the GPU and vidc each a chunk of
> the memory in wide mode and then let the GPU and vidc drivers
> manage buffers within their chunks however they see fit.

right, I think it is just audio that uses narrow-mode..  and at that
point I think we need to use rpm for configuring things, since I guess
audio wants to do some of that from the dsp for low-power audio
playback type use-cases?  Maybe we can ignore this for a while..
although when we get to the point of fixing it, spiffing up genalloc
sounds more interesting than layering a bunch of bookkeeping on top of
genalloc..

> One pitfall is going to be power management. ocmem is closely
> tied to the GPU power domains, so when video is using its chunk
> of memory we're going to need to keep ocmem powered up even if
> the GPU is off. I suppose for now we can just leave it always
> powered on once the driver probes, but if we ever want to turn it
> off we're going to need some tracking mechanism to make sure we
> don't turn it off when buffers are in use.

part of the problem is that right now drm/msm is going to hold onto
its allocation even when it clocks down the gpu.  I suppose that's not
terribly difficult to fix (although it is one reason why I didn't
bother making things terribly dynamic w/ clk mgmt or macro mode state
in the current ocmem patch)..  at this point I'm still more at the
'just try to make it work in the first place' stage, rather than
worrying too much about optimizing for power..

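A minimal sketch of what dropping the allocation on a GPU power-down
might look like (hedged: the suspend hook and the to_adreno_gpu() /
to_a3xx_gpu() plumbing are illustrative, not part of this patch):

static int a3xx_gpu_suspend(struct msm_gpu *gpu)
{
        struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
        struct a3xx_gpu *a3xx_gpu = to_a3xx_gpu(adreno_gpu);

        if (a3xx_gpu->ocmem_hdl) {
                /* gmem contents are lost; resume must re-allocate + re-init */
                ocmem_free(OCMEM_GRAPHICS, a3xx_gpu->ocmem_hdl);
                a3xx_gpu->ocmem_hdl = NULL;
        }

        return 0;
}
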
BR,
-R


[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Rob Clark
On Thu, Oct 1, 2015 at 3:25 PM, Rob Clark  wrote:
> On Thu, Oct 1, 2015 at 2:00 PM, Stephen Boyd  wrote:
>> On 10/01, Stanimir Varbanov wrote:
>>> On 09/30/2015 02:45 PM, Rob Clark wrote:
>>> > On Wed, Sep 30, 2015 at 7:31 AM, Rob Clark  wrote:
>>> >> On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
>>> >>  wrote:
>>> >>> On 09/29/2015 10:48 PM, Rob Clark wrote:
>>> >> was mandatory or just a power optimization..  but yes, the plan was to
>>> >> move it somewhere else (not sure quite where, drivers/soc/qcom?)
>>> >> sometime..  Preferably when someone who understands all the other
>>> >> ocmem use-cases better could figure out what we really need to add to
>>> >> the driver.
>>> >>
>>> >> In the downstream driver there is a lot of complexity that appears to
>>> >> exist to allow two clients to each allocate a portion of a macro
>>> >> within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
>>> >> etc), and I wanted to figure out if that is even a valid use-case
>>> >> before trying to make ocmem something that could actually support
>>> >> multiple clients.
>>> >>
>>> >> There is also some complexity around ensuring that, if clients aren't
>>> >> split up on region boundaries, you don't have one client in a region
>>> >> asking for wide-mode and another for narrow-mode..
>>> >> (switch_region_mode()) but maybe we could handle that by just
>>> >> allocating wide from the bottom and narrow from the top.  There also
>>> >> seems to be some craziness for allowing one client to pre-empt/evict
>>> >> another.. a dm engine, etc, etc..
>>> >>
>>> >> All I know is the gpu just statically allocates one big region-aligned
>>> >> chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
>>> >> hypothetical) use-cases for now...
>>>
>>> OK, I will try to sort out ocmem use cases for vidc driver.
>>>
>>
>> The simplest thing to do is to split the memory between GPU and
>> vidc statically. The other use cases with preemption and eviction
>> and DMA add a lot of complexity that we can explore at a later
>> time if need be.
>
> true, as long as one of the clients is the static gpu client, I guess
> we could reasonably easily support up to two clients...

btw, random thought..  drm_mm is a utility in drm that serves a
similar function to genalloc for graphics drivers to manage their
address space(s) (used for everything from mmap virtual address space
of buffers allocated against the device to managing vram and gart
allocations, etc... when the vram carveout is used w/ drm/msm (due to
no iommu) I use it to manage allocations from the carveout).  It has
some potentially convenient twists, like supporting allocation from
the top of the "address space" instead of the bottom.  I'm thinking
in particular of making "narrow mode" allocations from the top and
"wide mode" allocations from the bottom, since wide vs narrow can only
be set per region and not per macro within the region.  (It can also
search by first-fit or best-fit.. although not sure if that is useful
to us, since OCMEM size is relatively constrained.)

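(For what it's worth, a sketch of that split with drm_mm, written
against the current drm_mm_insert_node_in_range() interface; the
2015-era API spelled top-down placement as DRM_MM_SEARCH_BELOW /
DRM_MM_CREATE_TOP flags instead.  The 64K region-sized alignment is an
assumption:)

#include <linux/sizes.h>
#include <drm/drm_mm.h>

/* caller provides a zeroed drm_mm_node per allocation */
static int ocmem_mm_alloc(struct drm_mm *mm, struct drm_mm_node *node,
                          u64 size, bool wide)
{
        /*
         * Pack wide-mode allocations from the bottom and narrow-mode
         * from the top, so no region ever sees both modes at once.
         */
        return drm_mm_insert_node_in_range(mm, node, size,
                        SZ_64K,         /* assumed region-sized alignment */
                        0,              /* no color tagging */
                        0, U64_MAX,     /* search the whole ocmem range */
                        wide ? DRM_MM_INSERT_LOW : DRM_MM_INSERT_HIGH);
}
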
Not that I really want to keep the ocmem allocator in drm.. I'd really
rather it be someone else's headache once it gets to implementing the
crazy stuff for all the random use-cases of the other OCMEM users,
since the gpu's use of OCMEM is rather simple/static..

The way the downstream driver does this is with a bunch of extra
bookkeeping on top of genalloc so it can do a dummy allocation to
force "from the top" allocation (and then immediately free the dummy
allocation)..  Maybe it just makes sense to teach genalloc how to do
from-top vs from-bottom allocations?  Not really sure..

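(Rough sketch of what teaching genalloc might look like: the from-top
scan that a custom allocation callback would need.  The genpool_algo_t
prototype has changed across kernel versions, so treat the signature
as approximate:)

#include <linux/bitmap.h>
#include <linux/genalloc.h>

/*
 * First-fit, but scanning from the highest offset downward.  Returns
 * the start bit of a clear run of @nr bits, or >= @size on failure
 * (the usual genalloc convention).  Naive O(size * nr) scan, which is
 * fine for a sketch; @start and @data are ignored here.
 */
static unsigned long gen_pool_fit_from_top(unsigned long *map,
                unsigned long size, unsigned long start, unsigned int nr,
                void *data, struct gen_pool *pool)
{
        unsigned long offset;

        if (nr == 0 || nr > size)
                return size;

        for (offset = size - nr; ; offset--) {
                /* succeed if [offset, offset + nr) has no set bit */
                if (find_next_bit(map, offset + nr, offset) >= offset + nr)
                        return offset;
                if (offset == 0)
                        break;
        }
        return size;
}

(Today gen_pool_set_algo() installs one algorithm per pool, so picking
top vs bottom per allocation would still need a new entry point; the
scan above would be the core of it either way.)
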
BR,
-R


[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Stephen Boyd
On 10/01, Rob Clark wrote:
> On Thu, Oct 1, 2015 at 3:25 PM, Rob Clark  wrote:
> > On Thu, Oct 1, 2015 at 2:00 PM, Stephen Boyd  
> > wrote:
> >>
> >> The simplest thing to do is to split the memory between GPU and
> >> vidc statically. The other use cases with preemption and eviction
> >> and DMA add a lot of complexity that we can explore at a later
> >> time if need be.
> >
> > true, as long as one of the clients is the static gpu client, I guess
> > we could reasonably easily support up to two clients...
> 
> btw, random thought..  drm_mm is a utility in drm that serves a
> similar function to genalloc for graphics drivers to manage their
> address space(s) (used for everything from mmap virtual address space
> of buffers allocated against the device to managing vram and gart
> allocations, etc... when the vram carveout is used w/ drm/msm (due to
> no iommu) I use it to manage allocations from the carveout).  It has
> some potentially convenient twists, like supporting allocation from
> the top of the "address space" instead of the bottom.  I'm thinking
> in particular of making "narrow mode" allocations from the top and
> "wide mode" allocations from the bottom, since wide vs narrow can
> only be set per region and not per macro within the region.  (It can
> also search by first-fit or best-fit.. although not sure if that is
> useful to us, since OCMEM size is relatively constrained.)
> 
> Not that I really want to keep the ocmem allocator in drm.. I'd
> really rather it be someone else's headache once it gets to
> implementing the crazy stuff for all the random use-cases of the
> other OCMEM users, since the gpu's use of OCMEM is rather
> simple/static..
> 
> The way the downstream driver does this is with a bunch of extra
> bookkeeping on top of genalloc so it can do a dummy allocation to
> force "from the top" allocation (and then immediately free the dummy
> allocation)..  Maybe it just makes sense to teach genalloc how to do
> from-top vs from-bottom allocations?  Not really sure..
> 

It looks like video and GPU both use wide mode, if I'm reading
the code correctly. So it seems that we don't need to do anything
special for allocations: just hand the GPU and vidc each a chunk of
the memory in wide mode and then let the GPU and vidc drivers
manage buffers within their chunks however they see fit.

One pitfall is going to be power management. ocmem is closely
tied to the GPU power domains, so when video is using its chunk
of memory we're going to need to keep ocmem powered up even if
the GPU is off. I suppose for now we can just leave it always
powered on once the driver probes, but if we ever want to turn it
off we're going to need some tracking mechanism to make sure we
don't turn it off when buffers are in use.
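
(A sketch of that tracking, essentially the usage counting that
runtime PM provides; ocmem_power_on()/ocmem_power_off() are
placeholders for whatever clock/power-domain ops the driver ends up
needing, not a real API:)

#include <linux/mutex.h>

/* hypothetical fields on the ocmem device struct */
struct ocmem {
        struct mutex lock;
        unsigned int users;     /* allocations currently outstanding */
};

/* called whenever a client allocates a buffer */
static void ocmem_get(struct ocmem *ocmem)
{
        mutex_lock(&ocmem->lock);
        if (ocmem->users++ == 0)
                ocmem_power_on(ocmem);          /* placeholder */
        mutex_unlock(&ocmem->lock);
}

/* called whenever a client frees a buffer */
static void ocmem_put(struct ocmem *ocmem)
{
        mutex_lock(&ocmem->lock);
        if (--ocmem->users == 0)
                ocmem_power_off(ocmem);         /* placeholder */
        mutex_unlock(&ocmem->lock);
}

(In practice pm_runtime_get_sync()/pm_runtime_put() on the ocmem
device would give the same refcounting for free, once ocmem's power
domain is decoupled from the GPU's.)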

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Rob Clark
On Thu, Oct 1, 2015 at 2:00 PM, Stephen Boyd  wrote:
> On 10/01, Stanimir Varbanov wrote:
>> On 09/30/2015 02:45 PM, Rob Clark wrote:
>> > On Wed, Sep 30, 2015 at 7:31 AM, Rob Clark  wrote:
>> >> On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
>> >>  wrote:
>> >>> On 09/29/2015 10:48 PM, Rob Clark wrote:
>> >> was mandatory or just a power optimization..  but yes, the plan was to
>> >> move it somewhere else (not sure quite where, drivers/soc/qcom?)
>> >> sometime..  Preferably when someone who understands all the other
>> >> ocmem use-cases better could figure out what we really need to add to
>> >> the driver.
>> >>
>> >> In the downstream driver there is a lot of complexity that appears to
>> >> exist to allow two clients to each allocate a portion of a macro
>> >> within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
>> >> etc), and I wanted to figure out if that is even a valid use-case
>> >> before trying to make ocmem something that could actually support
>> >> multiple clients.
>> >>
>> >> There is also some complexity around ensuring that, if clients aren't
>> >> split up on region boundaries, you don't have one client in a region
>> >> asking for wide-mode and another for narrow-mode..
>> >> (switch_region_mode()) but maybe we could handle that by just
>> >> allocating wide from the bottom and narrow from the top.  There also
>> >> seems to be some craziness for allowing one client to pre-empt/evict
>> >> another.. a dm engine, etc, etc..
>> >>
>> >> All I know is the gpu just statically allocates one big region-aligned
>> >> chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
>> >> hypothetical) use-cases for now...
>>
>> OK, I will try to sort out ocmem use cases for vidc driver.
>>
>
> The simplest thing to do is to split the memory between GPU and
> vidc statically. The other use cases with preemption and eviction
> and DMA add a lot of complexity that we can explore at a later
> time if need be.

true, as long as one of the clients is the static gpu client, I guess
we could reasonably easily support up to two clients...

BR,
-R

> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project


[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Stanimir Varbanov
On 09/30/2015 02:45 PM, Rob Clark wrote:
> On Wed, Sep 30, 2015 at 7:31 AM, Rob Clark  wrote:
>> On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
>>  wrote:
>>> Hi Rob,
>>>
>>> Thanks for your work.
>>>
>>> On 09/29/2015 10:48 PM, Rob Clark wrote:
 For now, since the GPU is the only upstream consumer, just stuff this
 into drm/msm.  Eventually if we have other consumers, we'll have to
>>>
>>> As the video encoder/decoder driver (vidc) for apq8084 && msm8974 also
>>> uses the ocmem for scratch buffers, might it be better to relocate the
>>> ocmem driver to drivers/soc/qcom from the beginning?
>>
>> I wasn't really sure how soon vidc would support 8084/8974 (I assume
>> first target is 8916 which fortunately doesn't have ocmem), or if it

My expectation is that the same driver will work on apq8084 as well.

>> was mandatory or just a power optimization..  but yes, the plan was to
>> move it somewhere else (not sure quite where, drivers/soc/qcom?)
>> sometime..  Preferably when someone who understands all the other
>> ocmem use-cases better could figure out what we really need to add to
>> the driver.
>>
>> In the downstream driver there is a lot of complexity that appears to
>> exist to allow two clients to each allocate a portion of a macro
>> within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
>> etc), and I wanted to figure out if that is even a valid use-case
>> before trying to make ocmem something that could actually support
>> multiple clients.
>>
>> There is also some complexity around ensuring that, if clients aren't
>> split up on region boundaries, you don't have one client in a region
>> asking for wide-mode and another for narrow-mode..
>> (switch_region_mode()) but maybe we could handle that by just
>> allocating wide from the bottom and narrow from the top.  There also
>> seems to be some craziness for allowing one client to pre-empt/evict
>> another.. a dm engine, etc, etc..
>>
>> All I know is the gpu just statically allocates one big region-aligned
>> chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
>> hypothetical) use-cases for now...

OK, I will try to sort out ocmem use cases for vidc driver.

> 
> btw, I should add that I don't mind adding it to drivers/soc/qcom (or
> somewhere else?) to start with if others prefer; I just didn't want to
> give the wrong impression that it is ready for multiple clients yet.

I see. Then, to avoid confusion, you could remove all clients except
graphics from the ocmem_client enum.

> 
> All I know is the gpu uses it statically and is pretty much useless
> without ocmem, so for lack of understanding of the other use cases I
> just tried to simply add what the gpu needs..

Got it now, thanks.

-- 
regards,
Stan


[PATCH 6/6] drm/msm: add OCMEM driver

2015-10-01 Thread Stephen Boyd
On 10/01, Stanimir Varbanov wrote:
> On 09/30/2015 02:45 PM, Rob Clark wrote:
> > On Wed, Sep 30, 2015 at 7:31 AM, Rob Clark  wrote:
> >> On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
> >>  wrote:
> >>> On 09/29/2015 10:48 PM, Rob Clark wrote:
> >> was mandatory or just a power optimization..  but yes, the plan was to
> >> move it somewhere else (not sure quite where, drivers/soc/qcom?)
> >> sometime..  Preferably when someone who understands all the other
> >> ocmem use-cases better could figure out what we really need to add to
> >> the driver.
> >>
> >> In the downstream driver there is a lot of complexity that appears to
> >> exist to allow two clients to each allocate a portion of a macro
> >> within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
> >> etc), and I wanted to figure out if that is even a valid use-case
> >> before trying to make ocmem something that could actually support
> >> multiple clients.
> >>
> >> There is also some complexity around ensuring that, if clients aren't
> >> split up on region boundaries, you don't have one client in a region
> >> asking for wide-mode and another for narrow-mode..
> >> (switch_region_mode()) but maybe we could handle that by just
> >> allocating wide from the bottom and narrow from the top.  There also
> >> seems to be some craziness for allowing one client to pre-empt/evict
> >> another.. a dm engine, etc, etc..
> >>
> >> All I know is the gpu just statically allocates one big region-aligned
> >> chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
> >> hypothetical) use-cases for now...
> 
> OK, I will try to sort out ocmem use cases for vidc driver.
> 

The simplest thing to do is to split the memory between GPU and
vidc statically. The other use cases with preemption and eviction
and DMA add a lot of complexity that we can explore at a later
time if need be.
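
(Sketch of what that static split might look like on the ocmem side;
OCMEM_VIDEO and the 512K sizes are placeholder assumptions, only
OCMEM_GRAPHICS exists in the patch:)

#include <linux/sizes.h>

/* hypothetical compile-time partition of the ocmem aperture */
static const struct ocmem_partition {
        enum ocmem_client client;
        unsigned long offset;   /* from the start of ocmem */
        unsigned long size;
} ocmem_partitions[] = {
        { OCMEM_GRAPHICS, 0,       SZ_512K },   /* gpu gmem, bottom half */
        { OCMEM_VIDEO,    SZ_512K, SZ_512K },   /* vidc scratch, top half */
};

(ocmem_allocate() would then simply hand each client its fixed range,
and the client drivers manage buffers within it however they see fit.)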

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH 6/6] drm/msm: add OCMEM driver

2015-09-30 Thread Stanimir Varbanov
Hi Rob,

Thanks for your work.

On 09/29/2015 10:48 PM, Rob Clark wrote:
> For now, since the GPU is the only upstream consumer, just stuff this
> into drm/msm.  Eventually if we have other consumers, we'll have to

As the video encoder/decoder driver (vidc) for apq8084 && msm8974 also
uses the ocmem for scratch buffers, might it be better to relocate the
ocmem driver to drivers/soc/qcom from the beginning?

I'm working on the upstream version of the vidc driver, so it will be
nice to test the ocmem driver from it, too.

> split this out and make the allocation less hard-coded.  But I'll punt
> on that until I better understand the non-gpu use-cases (and whether
> the allocation *really* needs to be as complicated as it is in the
> downstream driver).
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/Makefile  |   3 +-
>  drivers/gpu/drm/msm/adreno/a3xx_gpu.c |  19 +-
>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c |  19 +-
>  drivers/gpu/drm/msm/msm_drv.c |   2 +
>  drivers/gpu/drm/msm/msm_gpu.h |   3 +
>  drivers/gpu/drm/msm/ocmem/ocmem.c | 399 ++
>  drivers/gpu/drm/msm/ocmem/ocmem.h |  46 
>  7 files changed, 463 insertions(+), 28 deletions(-)
>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.c
>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.h
> 



-- 
regards,
Stan


[PATCH 6/6] drm/msm: add OCMEM driver

2015-09-30 Thread Rob Clark
On Wed, Sep 30, 2015 at 7:31 AM, Rob Clark  wrote:
> On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
>  wrote:
>> Hi Rob,
>>
>> Thanks for your work.
>>
>> On 09/29/2015 10:48 PM, Rob Clark wrote:
>>> For now, since the GPU is the only upstream consumer, just stuff this
>>> into drm/msm.  Eventually if we have other consumers, we'll have to
>>
>> As the video encoder/decoder driver (vidc) for apq8084 && msm8974 also
>> uses the ocmem for scratch buffers, might it be better to relocate the
>> ocmem driver to drivers/soc/qcom from the beginning?
>
> I wasn't really sure how soon vidc would support 8084/8974 (I assume
> first target is 8916 which fortunately doesn't have ocmem), or if it
> was mandatory or just a power optimization..  but yes, the plan was to
> move it somewhere else (not sure quite where, drivers/soc/qcom?)
> sometime..  Preferably when someone who understands all the other
> ocmem use-cases better could figure out what we really need to add to
> the driver.
>
> In the downstream driver there is a lot of complexity that appears to
> exist to allow two clients to each allocate a portion of a macro
> within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
> etc), and I wanted to figure out if that is even a valid use-case
> before trying to make ocmem something that could actually support
> multiple clients.
>
> There is also some complexity around ensuring that, if clients aren't
> split up on region boundaries, you don't have one client in a region
> asking for wide-mode and another for narrow-mode..
> (switch_region_mode()) but maybe we could handle that by just
> allocating wide from the bottom and narrow from the top.  There also
> seems to be some craziness for allowing one client to pre-empt/evict
> another.. a dm engine, etc, etc..
>
> All I know is the gpu just statically allocates one big region-aligned
> chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
> hypothetical) use-cases for now...

btw, I should add that I don't mind adding it to drivers/soc/qcom (or
somewhere else?) to start with if others prefer; I just didn't want to
give the wrong impression that it is ready for multiple clients yet.

All I know is the gpu uses it statically and is pretty much useless
without ocmem, so for lack of understanding of the other use cases I
just tried to simply add what the gpu needs..

BR,
-R

>
>> I'm working on the upstream version of the vidc driver, so it will
>> be nice to test the ocmem driver from it, too.
>>
>>> split this out and make the allocation less hard-coded.  But I'll punt
>>> on that until I better understand the non-gpu use-cases (and whether
>>> the allocation *really* needs to be as complicated as it is in the
>>> downstream driver).
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>>  drivers/gpu/drm/msm/Makefile  |   3 +-
>>>  drivers/gpu/drm/msm/adreno/a3xx_gpu.c |  19 +-
>>>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c |  19 +-
>>>  drivers/gpu/drm/msm/msm_drv.c |   2 +
>>>  drivers/gpu/drm/msm/msm_gpu.h |   3 +
>>>  drivers/gpu/drm/msm/ocmem/ocmem.c | 399 ++
>>>  drivers/gpu/drm/msm/ocmem/ocmem.h |  46 
>>>  7 files changed, 463 insertions(+), 28 deletions(-)
>>>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.c
>>>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.h
>>>
>>
>> 
>>
>> --
>> regards,
>> Stan


[PATCH 6/6] drm/msm: add OCMEM driver

2015-09-30 Thread Rob Clark
On Wed, Sep 30, 2015 at 3:51 AM, Stanimir Varbanov
 wrote:
> Hi Rob,
>
> Thanks for your work.
>
> On 09/29/2015 10:48 PM, Rob Clark wrote:
>> For now, since the GPU is the only upstream consumer, just stuff this
>> into drm/msm.  Eventually if we have other consumers, we'll have to
>
> As the video encoder/decoder driver (vidc) for apq8084 && msm8974 also
> uses the ocmem for scratch buffers, might it be better to relocate the
> ocmem driver to drivers/soc/qcom from the beginning?

I wasn't really sure how soon vidc would support 8084/8974 (I assume
first target is 8916 which fortunately doesn't have ocmem), or if it
was mandatory or just a power optimization..  but yes, the plan was to
move it somewhere else (not sure quite where, drivers/soc/qcom?)
sometime..  Preferably when someone who understands all the other
ocmem use-cases better could figure out what we really need to add to
the driver.

In the downstream driver there is a lot of complexity that appears to
exist to allow two clients to each allocate a portion of a macro
within a region (i.e. aggregate_macro_state(), apply_macro_vote(),
etc), and I wanted to figure out if that is even a valid use-case
before trying to make ocmem something that could actually support
multiple clients.

There is also some complexity around ensuring that, if clients aren't
split up on region boundaries, you don't have one client in a region
asking for wide-mode and another for narrow-mode..
(switch_region_mode()) but maybe we could handle that by just
allocating wide from the bottom and narrow from the top.  There also
seems to be some craziness for allowing one client to pre-empt/evict
another.. a dm engine, etc, etc..

All I know is the gpu just statically allocates one big region-aligned
chunk of ocmem, so I ignored the rest of the crazy (maybe or maybe not
hypothetical) use-cases for now...

BR,
-R

> I'm working on the upstream version of the vidc driver, so it will be
> nice to test the ocmem driver from it, too.
>
>> split this out and make the allocation less hard-coded.  But I'll punt
>> on that until I better understand the non-gpu use-cases (and whether
>> the allocation *really* needs to be as complicated as it is in the
>> downstream driver).
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  drivers/gpu/drm/msm/Makefile  |   3 +-
>>  drivers/gpu/drm/msm/adreno/a3xx_gpu.c |  19 +-
>>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c |  19 +-
>>  drivers/gpu/drm/msm/msm_drv.c |   2 +
>>  drivers/gpu/drm/msm/msm_gpu.h |   3 +
>>  drivers/gpu/drm/msm/ocmem/ocmem.c | 399 ++
>>  drivers/gpu/drm/msm/ocmem/ocmem.h |  46 
>>  7 files changed, 463 insertions(+), 28 deletions(-)
>>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.c
>>  create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.h
>>
>
> 
>
> --
> regards,
> Stan


[PATCH 6/6] drm/msm: add OCMEM driver

2015-09-29 Thread Rob Clark
For now, since the GPU is the only upstream consumer, just stuff this
into drm/msm.  Eventually if we have other consumers, we'll have to
split this out and make the allocation less hard-coded.  But I'll punt
on that until I better understand the non-gpu use-cases (and whether
the allocation *really* needs to be as complicated as it is in the
downstream driver).

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/Makefile  |   3 +-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c |  19 +-
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c |  19 +-
 drivers/gpu/drm/msm/msm_drv.c |   2 +
 drivers/gpu/drm/msm/msm_gpu.h |   3 +
 drivers/gpu/drm/msm/ocmem/ocmem.c | 399 ++
 drivers/gpu/drm/msm/ocmem/ocmem.h |  46 
 7 files changed, 463 insertions(+), 28 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.c
 create mode 100644 drivers/gpu/drm/msm/ocmem/ocmem.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 0a543eb..8ddf6fa 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -48,7 +48,8 @@ msm-y := \
msm_iommu.o \
msm_perf.o \
msm_rd.o \
-   msm_ringbuffer.o
+   msm_ringbuffer.o \
+   ocmem/ocmem.o

 msm-$(CONFIG_DRM_MSM_FBDEV) += msm_fbdev.o
 msm-$(CONFIG_COMMON_CLK) += mdp/mdp4/mdp4_lvds_pll.o
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index ca29688..2a8bf4c 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -17,10 +17,7 @@
  * this program.  If not, see .
  */

-#ifdef CONFIG_MSM_OCMEM
-#  include 
-#endif
-
+#include "ocmem/ocmem.h"
 #include "a3xx_gpu.h"

 #define A3XX_INT0_MASK \
@@ -322,10 +319,8 @@ static void a3xx_destroy(struct msm_gpu *gpu)

adreno_gpu_cleanup(adreno_gpu);

-#ifdef CONFIG_MSM_OCMEM
-   if (a3xx_gpu->ocmem_base)
+   if (a3xx_gpu->ocmem_hdl)
ocmem_free(OCMEM_GRAPHICS, a3xx_gpu->ocmem_hdl);
-#endif

kfree(a3xx_gpu);
 }
@@ -539,6 +534,7 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
struct msm_gpu *gpu;
struct msm_drm_private *priv = dev->dev_private;
struct platform_device *pdev = priv->gpu_pdev;
+   struct ocmem_buf *ocmem_hdl;
int ret;

if (!pdev) {
@@ -569,18 +565,13 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
goto fail;

/* if needed, allocate gmem: */
-   if (adreno_is_a330(adreno_gpu)) {
-#ifdef CONFIG_MSM_OCMEM
-   /* TODO this is different/missing upstream: */
-   struct ocmem_buf *ocmem_hdl =
-   ocmem_allocate(OCMEM_GRAPHICS, adreno_gpu->gmem);
-
+   ocmem_hdl = ocmem_allocate(OCMEM_GRAPHICS, adreno_gpu->gmem);
+   if (!IS_ERR(ocmem_hdl)) {
a3xx_gpu->ocmem_hdl = ocmem_hdl;
a3xx_gpu->ocmem_base = ocmem_hdl->addr;
adreno_gpu->gmem = ocmem_hdl->len;
DBG("using %dK of OCMEM at 0x%08x", adreno_gpu->gmem / 1024,
a3xx_gpu->ocmem_base);
-#endif
}

if (!gpu->mmu) {
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index a53f1be..17f084d 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -10,10 +10,9 @@
  * GNU General Public License for more details.
  *
  */
+
+#include "ocmem/ocmem.h"
 #include "a4xx_gpu.h"
-#ifdef CONFIG_MSM_OCMEM
-#  include 
-#endif

 #define A4XX_INT0_MASK \
(A4XX_INT0_RBBM_AHB_ERROR |\
@@ -289,10 +288,8 @@ static void a4xx_destroy(struct msm_gpu *gpu)

adreno_gpu_cleanup(adreno_gpu);

-#ifdef CONFIG_MSM_OCMEM
-   if (a4xx_gpu->ocmem_base)
+   if (a4xx_gpu->ocmem_hdl)
ocmem_free(OCMEM_GRAPHICS, a4xx_gpu->ocmem_hdl);
-#endif

kfree(a4xx_gpu);
 }
@@ -538,6 +535,7 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
struct msm_gpu *gpu;
struct msm_drm_private *priv = dev->dev_private;
struct platform_device *pdev = priv->gpu_pdev;
+   struct ocmem_buf *ocmem_hdl;
int ret;

if (!pdev) {
@@ -568,18 +566,13 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
goto fail;

/* if needed, allocate gmem: */
-   if (adreno_is_a4xx(adreno_gpu)) {
-#ifdef CONFIG_MSM_OCMEM
-   /* TODO this is different/missing upstream: */
-   struct ocmem_buf *ocmem_hdl =
-   ocmem_allocate(OCMEM_GRAPHICS, adreno_gpu->gmem);
-
+   ocmem_hdl = ocmem_allocate(OCMEM_GRAPHICS, adreno_gpu->gmem);
+   if (!IS_ERR(ocmem_hdl)) {
a4xx_gpu->ocmem_hdl = ocmem_hdl;
a4xx_gpu->ocmem_base = ocmem_hdl->addr;
adreno_gpu->gmem = ocmem_hdl->len;
DBG("using %dK of OCMEM at 0x%08x",