Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-23 Thread Tom Stellard
On Sun, Jun 22, 2014 at 04:05:49PM +0200, Francisco Jerez wrote:
> Bruno Jimenez  writes:
> 
> > On Sat, 2014-06-21 at 17:39 +0200, Francisco Jerez wrote:
> >[...]
> >> The implementation of PIPE_TRANSFER_MAP_DIRECTLY introduced in PATCH 10
> >> has somewhat worrying semantics: A mapping with this flag might become
> >> stale unpredictably if a kernel is run, maybe from a different command
> >> queue.  Clover's transfer functions don't hit that path right now on
> >> single-threaded applications, but they might in the future as we start
> >> accelerating the APIs we currently implement with soft_copy_op().  This
> >> is a bug IMHO: even direct mappings should last until the corresponding
> >> unmap call.
> >
> > I think I'm not fully understanding you here. I tried to use
> > PIPE_TRANSFER_MAP_DIRECTLY only with clEnqueue{Write,Read} functions,
> > which map the resource, copy it and unmap it when finished. Is it
> > possible for another kernel to access the memory of a buffer that is
> > being read/written?
> >
> AFAICT, yes.  All command queues created on the same device share the
> same memory pool, so a kernel being queued for execution in one could
> invalidate a concurrent mapping done with PIPE_TRANSFER_MAP_DIRECTLY by
> one of the transfer functions.  On top of that the transfer functions
> might start queuing kernels themselves in the future to accelerate
> certain operations we currently do on the CPU, which would make this
> scenario more likely.
> 
> > I had no intention of having user mappings made with that flag.
> > [Although a possible solution, with a lot of warnings of course, for the
> > above problem could be to allow a user to use this flag]
> >
> >> I'm not advocating a revert of the series because it fixes a serious
> >> bug, but please don't push patches 10-11; we should probably start
> >> looking for a different solution.  Some suggestions are:
> >
> > I also asked for them not to be pushed. And given your reasons, until
> > we find a better way or we change how buffers are handled, I won't
> > propose them again.
> >
> >>  - Why do you even need a pool?  Wouldn't it be possible to create a
> >>huge RAT, e.g. covering a 4GB portion of the GPU memory and then use
> >>a special memory domain or some sort of flag to tell the kernel to
> >>allocate a buffer from that region (or relocate if it's already been
> >>allocated elsewhere)?  This is especially easy on hardware with
> >>virtual memory, as you could simply reserve an arbitrarily large
> >>block of virtual memory, bind it as e.g. RAT0, and then map other
> >>buffer objects into the block on-demand as they're bound to the
> >>compute pipeline -- There would be no need to move the actual bits
> >>around.  This is similar to the approach I used in my original
> >>proof-of-concept implementation of the compute API on nv50.
> >
> > This is one of the things I have been wondering recently: given that
> > radeonsi doesn't use a pool, why does r600 need one? I still have to
> > understand AMD docs and how *exactly* everything works.
> >
> 
> Probably because on SI compute kernels can access random locations of
> memory without going through an RAT?  I have little actual experience
> with radeons; Tom should know the low-level details.
>

The reason there is no memory pool in radeonsi is that SI and newer support
virtual memory, so there is already one contiguous address space, and also
that there is no limit to the number of resources that can be accessed by
a shader.

-Tom
 
> > 4GB seems like a lot of memory to me; my little cedar has only
> > 512MB :)
> >
> >>  - If you insist on using a pool, you could (mostly) avoid the storage
> >>duplication and the mapping copies by allocating buffer objects
> >>directly from the pool as it was before this series, and then keep
> >>some sort of reference count specific to the pool storage that would
> >>be incremented on map and decremented on unmap.  Once you need to
> >>grow the pool you'd keep the old storage around temporarily and
> >>migrate buffers to the new storage lazily as they are required or
> >>unmapped.  Once the reference count drops to zero you'd be free to
> >>release the backing BO to the system.  The fact that you'd keep both
> >>storage buffers around for a bit means that you'd be able to use DMA
> >>to migrate the pool contents instead of the CPU copies you're doing
> >>now, which is likely to be substantially more efficient.
> >
> > I see how this would solve the slow mappings problem, but I think that
> > it could mean higher memory usage. In the case of a user creating some
> > buffers, mapping one of them and then adding more so that the pool has
> > to grow, we would have to keep the full size of the old pool just for
> > one buffer, plus the new pool.
> >
> 
> That's a fair point; this solution would only get rid of the extra
> copying but it wouldn't solve the memory usage problem in some
> situations (long-lived mappings).

Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-22 Thread Francisco Jerez
Bruno Jimenez  writes:

> On Sat, 2014-06-21 at 17:39 +0200, Francisco Jerez wrote:
>[...]
>> The implementation of PIPE_TRANSFER_MAP_DIRECTLY introduced in PATCH 10
>> has somewhat worrying semantics: A mapping with this flag might become
>> stale unpredictably if a kernel is run, maybe from a different command
>> queue.  Clover's transfer functions don't hit that path right now on
>> single-threaded applications, but they might in the future as we start
>> accelerating the APIs we currently implement with soft_copy_op().  This
>> is a bug IMHO: even direct mappings should last until the corresponding
>> unmap call.
>
> I think I'm not fully understanding you here. I tried to use
> PIPE_TRANSFER_MAP_DIRECTLY only with clEnqueue{Write,Read} functions,
> which map the resource, copy it and unmap it when finished. Is it
> possible for another kernel to access the memory of a buffer that is
> being read/written?
>
AFAICT, yes.  All command queues created on the same device share the
same memory pool, so a kernel being queued for execution in one could
invalidate a concurrent mapping done with PIPE_TRANSFER_MAP_DIRECTLY by
one of the transfer functions.  On top of that the transfer functions
might start queuing kernels themselves in the future to accelerate
certain operations we currently do on the CPU, which would make this
scenario more likely.
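
To make the hazard concrete, here is a minimal, self-contained C sketch
(all names invented; realloc() stands in for reallocating the backing
BO) of how a mapping handed out directly into shared pool storage can
dangle as soon as any queue forces the pool to grow:

#include <stdio.h>
#include <stdlib.h>

struct pool {
    char  *storage;   /* backing storage shared by all command queues */
    size_t size;
};

/* A "direct" map: hands out a pointer straight into the pool storage. */
static char *map_directly(struct pool *p, size_t offset)
{
    return p->storage + offset;
}

/* Growing the pool may relocate the storage; realloc() plays the role
 * of allocating a new BO and copying the old contents over. */
static void grow_pool(struct pool *p, size_t new_size)
{
    p->storage = realloc(p->storage, new_size);
    p->size = new_size;
}

int main(void)
{
    struct pool p = { malloc(64), 64 };
    char *mapping = map_directly(&p, 0); /* points into the old storage */

    /* A kernel launch on *any* queue may need more pool space... */
    grow_pool(&p, 1 << 20);

    /* ...after which 'mapping' may dangle: it still refers to the old
     * storage, not to wherever the pool contents live now. */
    printf("mapping %p vs. current storage %p\n",
           (void *)mapping, (void *)p.storage);
    free(p.storage);
    return 0;
}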

> I had no intention of having user mappings made with that flag.
> [Although a possible solution, with a lot of warnings of course, for the
> above problem could be to allow a user to use this flag]
>
>> I'm not advocating a revert of the series because it fixes a serious
>> bug, but please don't push patches 10-11; we should probably start
>> looking for a different solution.  Some suggestions are:
>
> I also asked for them not to be pushed. And given your reasons, until
> we find a better way or we change how buffers are handled, I won't
> propose them again.
>
>>  - Why do you even need a pool?  Wouldn't it be possible to create a
>>huge RAT, e.g. covering a 4GB portion of the GPU memory and then use
>>a special memory domain or some sort of flag to tell the kernel to
>>allocate a buffer from that region (or relocate if it's already been
>>allocated elsewhere)?  This is especially easy on hardware with
>>virtual memory, as you could simply reserve an arbitrarily large
>>block of virtual memory, bind it as e.g. RAT0, and then map other
>>buffer objects into the block on-demand as they're bound to the
>>compute pipeline -- There would be no need to move the actual bits
>>around.  This is similar to the approach I used in my original
>>proof-of-concept implementation of the compute API on nv50.
>
> This is one of the things I have been wondering recently: given that
> radeonsi doesn't use a pool, why does r600 need one? I still have to
> understand AMD docs and how *exactly* everything works.
>

Probably because on SI compute kernels can access random locations of
memory without going through an RAT?  I have little actual experience
with radeons; Tom should know the low-level details.

> 4GB seems like a lot of memory to me; my little cedar has only
> 512MB :)
>
>>  - If you insist on using a pool, you could (mostly) avoid the storage
>>duplication and the mapping copies by allocating buffer objects
>>directly from the pool as it was before this series, and then keep
>>some sort of reference count specific to the pool storage that would
>>be incremented on map and decremented on unmap.  Once you need to
>>grow the pool you'd keep the old storage around temporarily and
>>migrate buffers to the new storage lazily as they are required or
>>unmapped.  Once the reference count drops to zero you'd be free to
>>release the backing BO to the system.  The fact that you'd keep both
>>storage buffers around for a bit means that you'd be able to use DMA
>>to migrate the pool contents instead of the CPU copies you're doing
>>now, which is likely to be substantially more efficient.
>
> I see how this would solve the slow mappings problem, but I think that
> it could mean higher memory usage. In the case of a user creating some
> buffers, mapping one of them and then adding more so that the pool has
> to grow, we would have to keep the full size of the old pool just for
> one buffer, plus the new pool.
>

That's a fair point; this solution would only get rid of the extra
copying but it wouldn't solve the memory usage problem in some situations
(long-lived mappings).  IMHO the former is more worrying because it has
an impact on every map operation no matter what, while the increased
memory usage will only have an impact in situations of heavy memory
pressure -- And even in those cases the kernel might be able to avoid an
OOM situation by evicting the old pool storage from graphics memory
(which, I must admit, might be rather punishing too), as we can be sure
that there will be no further reference
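
A rough sketch of the reference-counting scheme suggested above (plain
C, every name hypothetical): maps pin the storage generation they point
into, growing the pool keeps the previous generation alive, and the old
BO is released only once its last mapping goes away:

#include <stdlib.h>

struct pool_storage {
    char     *bo;        /* stands in for the backing buffer object */
    size_t    size;
    unsigned  map_count; /* live mappings into this storage */
};

struct pool {
    struct pool_storage *current;
    struct pool_storage *old;    /* kept until its last unmap */
};

char *pool_map(struct pool *p, size_t offset)
{
    p->current->map_count++;      /* pin this storage generation */
    return p->current->bo + offset;
}

void pool_unmap(struct pool *p, struct pool_storage *s)
{
    if (--s->map_count == 0 && s == p->old) {
        free(s->bo);              /* last mapping gone: release old BO */
        free(s);
        p->old = NULL;
    }
}

void pool_grow(struct pool *p, size_t new_size)
{
    struct pool_storage *next = calloc(1, sizeof(*next));
    next->bo = malloc(new_size);
    next->size = new_size;

    /* Keep the old storage instead of copying everything right away:
     * items migrate lazily (ideally via DMA) as they are next used or
     * unmapped, so live mappings into the old storage stay valid.
     * The sketch assumes at most one old generation at a time. */
    p->old = p->current;
    p->current = next;
}

Note how Bruno's objection shows up directly in the code: p->old is held
in full for as long as a single long-lived mapping keeps it pinned.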

Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-21 Thread Bruno Jimenez
On Sat, 2014-06-21 at 17:39 +0200, Francisco Jerez wrote:
> Bruno Jiménez  writes:
> 
> > Hi,
> >
> > This is my second attempt to fix the mapping bug, adding all the
> > suggestions that Tom Stellard sent, and, so far, it seems that
> > it is resolved.
> >
> > This series completely changes how OpenCL buffers are handled
> > by the r600g driver. Before this, we would add them directly to
> > a pool, and this pool would grow whenever we needed more space.
> > But this process implied destroying the pool and creating a new
> > one. There could be cases where a buffer would be mapped and
> > the pool would grow, leaving one side of the mapping pointing
> > to where the item was. This is the 'mapping bug'.
> >
> > Now, items will have an intermediate resource, where all mappings
> > can be done, and when a buffer is going to be used with a kernel
> > it is promoted to the pool. In the case where a promoted item
> > is going to be mapped, it is demoted first, so even if
> > the pool changes its location due to growing, the map remains
> > valid. In the case of a buffer mapped for reading, and used
> > by a kernel to read from it, we will duplicate this buffer,
> > having the intermediate buffer, where the user has its map, and
> > an item in the pool, which is the one that the kernel is going
> > to use.
> >
> > As a summary for v2:
> > Patches 1-8: These are the main part of the series, and solve
> > the mapping bug.
> > Patches 1 and 7 now use fewer explicit casts
> > Patch 2 is new and introduces the 'is_item_in_pool'
> > function, which is used in patches 3 and 8
> >
> > Patch 9: Is a complete rewrite of v1 patch 8 using gallium
> > utils for double lists
> >
> > Patches 10 and 11: These are just a proof of concept for avoiding
> > transfers GPU <-> GPU when using all CL Read/Write functions.
> > They are v1 patch 9 split in two to separate r600g changes
> > from clover changes.
> > Now, on clover's side it introduces and uses
> > 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesn't collide with
> > any other OpenCL flag.
> >
> > Please review and Thanks :)
> >
> > Bruno Jiménez (11):
> >   r600g/compute: Add an intermediate resource for OpenCL buffers
> >   r600g/compute: Add an util function to know if an item is in the pool
> >   r600g/compute: Add statuses to the compute_memory_items
> >   r600g/compute: divide the item list in two
> >   r600g/compute: Only move to the pool the buffers marked for promoting
> >   r600g/compute: Avoid problems when promoting items mapped for reading
> >   r600g/compute: Implement compute_memory_demote_item
> >   r600g/compute: Map only against intermediate buffers
> >   r600g/compute: Use gallium util functions for double lists
> >   r600g/compute: Map directly the pool in some cases
> >   clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers
> >
> >  src/gallium/drivers/r600/compute_memory_pool.c | 294 -
> >  src/gallium/drivers/r600/compute_memory_pool.h |  31 ++-
> >  src/gallium/drivers/r600/evergreen_compute.c   |  38 ++-
> >  src/gallium/state_trackers/clover/api/transfer.cpp |   4 +-
> >  src/gallium/state_trackers/clover/core/object.hpp  |   4 +
> >  .../state_trackers/clover/core/resource.cpp|   2 +
> >  6 files changed, 233 insertions(+), 140 deletions(-)
> >
> > -- 
> > 2.0.0
> >
> 
> Hi Bruno, 
> 
> I'm sorry for having taken so long to comment on this; I hadn't had
> time to read through your patch series carefully until last night.

Hi Francisco,

It's OK, we can't always be here to review everything :)

> I'm not convinced that this is going to be a satisfactory solution: it
> punishes well-behaved applications by duplicating memory usage under
> some circumstances, and by slowing down buffer mapping with (in most
> cases) unnecessary shadow copies: Even though using a buffer while
> mapped is something to keep in mind because it's allowed by the spec
> with some restrictions, I think it's reasonable to assume that we're
> only going to see a comparatively small number of active buffer mappings
> during the execution of any compute kernel, so it seems unfortunate to
> me that the rest of mappings will have to pay for this one corner case.

I have to say that I perfectly understand your concerns about memory
duplication and slow mappings. In fact, the memory duplication is mainly
caused by the possibility of having a buffer mapped for reading that a
kernel uses + relocation of the pool. And the slow mappings are caused
by any map that could stay alive while a kernel is launched +
relocation.

Of the OpenCL code that I have seen, I haven't yet encountered a
program that left a buffer mapped for reading and at the same time
launched a kernel that reads from it. But I have encountered programs
that left buffers mapped while they launched kernels that don't touch
those buffers. But I agree that punishing mappings made simply for
read/write between kernel e

Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-21 Thread Francisco Jerez
Bruno Jiménez  writes:

> Hi,
>
> This is my second attempt to fix the mapping bug, adding all the
> suggestions that Tom Stellard sent, and, so far, it seems that
> it is resolved.
>
> This series completely changes how OpenCL buffers are handled
> by the r600g driver. Before this, we would add them directly to
> a pool, and this pool would grow whenever we needed more space.
> But this process implied destroying the pool and creating a new
> one. There could be cases where a buffer would be mapped and
> the pool would grow, leaving one side of the mapping pointing
> to where the item was. This is the 'mapping bug'.
>
> Now, items will have an intermediate resource, where all mappings
> can be done, and when a buffer is going to be used with a kernel
> it is promoted to the pool. In the case where a promoted item
> is going to be mapped, it is demoted first, so even if
> the pool changes its location due to growing, the map remains
> valid. In the case of a buffer mapped for reading, and used
> by a kernel to read from it, we will duplicate this buffer,
> having the intermediate buffer, where the user has its map, and
> an item in the pool, which is the one that the kernel is going
> to use.
>
> As a summary for v2:
> Patches 1-8: These are the main part of the series, and solve
> the mapping bug.
> Patches 1 and 7 now use fewer explicit casts
> Patch 2 is new and introduces the 'is_item_in_pool'
> function, which is used in patches 3 and 8
>
> Patch 9: Is a complete rewrite of v1 patch 8 using gallium
> utils for double lists
>
> Patches 10 and 11: These are just a proof of concept for avoiding
> transfers GPU <-> GPU when using all CL Read/Write functions.
> They are v1 patch 9 split in two to separate r600g changes
> from clover changes.
> Now, on clover's side it introduces and uses
> 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesn't collide with
> any other OpenCL flag.
>
> Please review and Thanks :)
>
> Bruno Jiménez (11):
>   r600g/compute: Add an intermediate resource for OpenCL buffers
>   r600g/compute: Add an util function to know if an item is in the pool
>   r600g/compute: Add statuses to the compute_memory_items
>   r600g/compute: divide the item list in two
>   r600g/compute: Only move to the pool the buffers marked for promoting
>   r600g/compute: Avoid problems when promoting items mapped for reading
>   r600g/compute: Implement compute_memory_demote_item
>   r600g/compute: Map only against intermediate buffers
>   r600g/compute: Use gallium util functions for double lists
>   r600g/compute: Map directly the pool in some cases
>   clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers
>
>  src/gallium/drivers/r600/compute_memory_pool.c | 294 -
>  src/gallium/drivers/r600/compute_memory_pool.h |  31 ++-
>  src/gallium/drivers/r600/evergreen_compute.c   |  38 ++-
>  src/gallium/state_trackers/clover/api/transfer.cpp |   4 +-
>  src/gallium/state_trackers/clover/core/object.hpp  |   4 +
>  .../state_trackers/clover/core/resource.cpp|   2 +
>  6 files changed, 233 insertions(+), 140 deletions(-)
>
> -- 
> 2.0.0
>

Hi Bruno, 

I'm sorry for having taken so long to comment on this; I hadn't had
time to read through your patch series carefully until last night.

I'm not convinced that this is going to be a satisfactory solution: it
punishes well-behaved applications by duplicating memory usage under
some circumstances, and by slowing down buffer mapping with (in most
cases) unnecessary shadow copies: Even though using a buffer while
mapped is something to keep in mind because it's allowed by the spec
with some restrictions, I think it's reasonable to assume that we're
only going to see a comparatively small number of active buffer mappings
during the execution of any compute kernel, so it seems unfortunate to
me that the rest of mappings will have to pay for this one corner case.

The implementation of PIPE_TRANSFER_MAP_DIRECTLY introduced in PATCH 10
has somewhat worrying semantics: A mapping with this flag might become
stale unpredictably if a kernel is run, maybe from a different command
queue.  Clover's transfer functions don't hit that path right now on
single-threaded applications, but they might in the future as we start
accelerating the APIs we currently implement with soft_copy_op().  This
is a bug IMHO: even direct mappings should last until the corresponding
unmap call.

I'm not advocating a revert of the series because it fixes a serious
bug, but please don't push patches 10-11; we should probably start
looking for a different solution.  Some suggestions are:

 - Why do you even need a pool?  Wouldn't it be possible to create a
   huge RAT, e.g. covering a 4GB portion of the GPU memory and then use
   a special memory domain or some sort of flag to tell the kernel to
   allocate a buffer from that region (or relocate if it's already been
   allocated elsewhere)?  This is especially easy on hardware with
   virtual memory, as you could simply reserve an arbitrarily large
   block of virtual memory, bind it as e.g. RAT0, and then map other
   buffer objects into the block on-demand as they're bound to the
   compute pipeline -- There would be no need to move the actual bits
   around.  This is similar to the approach I used in my original
   proof-of-concept implementation of the compute API on nv50.
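
For what it's worth, a toy sketch of the bookkeeping this would involve
(plain C; the numbers and names are made up, and on real hardware
"binding" would mean updating page tables rather than copying data):

#include <stdint.h>
#include <stdio.h>

/* One large block of GPU virtual address space, reserved once and
 * bound as e.g. RAT0 for the lifetime of the context. */
struct va_block {
    uint64_t base;
    uint64_t size;
    uint64_t next_free;   /* trivial bump allocator, enough for a sketch */
};

/* "Binding" a buffer just assigns it an address inside the reserved
 * block; no data moves. */
uint64_t bind_buffer(struct va_block *blk, uint64_t bo_size)
{
    uint64_t gpu_addr = blk->base + blk->next_free;
    blk->next_free += bo_size;    /* no overflow or free-list handling */
    return gpu_addr;
}

int main(void)
{
    struct va_block rat0 = { 0x100000000ull, 4ull << 30, 0 };
    uint64_t a = bind_buffer(&rat0, 1 << 20);
    uint64_t b = bind_buffer(&rat0, 1 << 16);
    printf("buffer A at 0x%llx, buffer B at 0x%llx\n",
           (unsigned long long)a, (unsigned long long)b);
    return 0;
}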

Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-20 Thread Bruno Jimenez
On Fri, 2014-06-20 at 13:50 -0400, Tom Stellard wrote:
> On Wed, Jun 18, 2014 at 05:01:50PM +0200, Bruno Jiménez wrote:
> > Hi,
> > 
> > This is my second attempt to fix the mapping bug, adding all the
> > suggestions that Tom Stellard sent, and, so far, it seems that
> > it is resolved.
> >
> > This series completely changes how OpenCL buffers are handled
> > by the r600g driver. Before this, we would add them directly to
> > a pool, and this pool would grow whenever we needed more space.
> > But this process implied destroying the pool and creating a new
> > one. There could be cases where a buffer would be mapped and
> > the pool would grow, leaving one side of the mapping pointing
> > to where the item was. This is the 'mapping bug'.
> >
> > Now, items will have an intermediate resource, where all mappings
> > can be done, and when a buffer is going to be used with a kernel
> > it is promoted to the pool. In the case where a promoted item
> > is going to be mapped, it is demoted first, so even if
> > the pool changes its location due to growing, the map remains
> > valid. In the case of a buffer mapped for reading, and used
> > by a kernel to read from it, we will duplicate this buffer,
> > having the intermediate buffer, where the user has its map, and
> > an item in the pool, which is the one that the kernel is going
> > to use.
> >
> 
> I've just pushed patches 1-9.  Nice work!

Thanks!

And thanks to you for all the help!
Bruno

> 
> -Tom
> 
> > As a summary for v2:
> > Patches 1-8: These are the main part of the series, and solve
> > the mapping bug.
> > Patches 1 and 7 now use fewer explicit casts
> > Patch 2 is new and introduces the 'is_item_in_pool'
> > function, which is used in patches 3 and 8
> > 
> > Patch 9: Is a complete rewrite of v1 patch 8 using gallium
> > utils for double lists
> > 
> > Patches 10 and 11: These are just a proof of concept for avoiding
> > transfers GPU <-> GPU when using all CL Read/Write functions.
> > They are v1 patch 9 split in two to separate r600g changes
> > from clover changes.
> > Now, on clover's side it introduces and uses
> > 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesn't collide with
> > any other OpenCL flag.
> > 
> > Please review and Thanks :)
> > 
> > Bruno Jiménez (11):
> >   r600g/compute: Add an intermediate resource for OpenCL buffers
> >   r600g/compute: Add an util function to know if an item is in the pool
> >   r600g/compute: Add statuses to the compute_memory_items
> >   r600g/compute: divide the item list in two
> >   r600g/compute: Only move to the pool the buffers marked for promoting
> >   r600g/compute: Avoid problems when promoting items mapped for reading
> >   r600g/compute: Implement compute_memory_demote_item
> >   r600g/compute: Map only against intermediate buffers
> >   r600g/compute: Use gallium util functions for double lists
> >   r600g/compute: Map directly the pool in some cases
> >   clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers
> > 
> >  src/gallium/drivers/r600/compute_memory_pool.c | 294 -
> >  src/gallium/drivers/r600/compute_memory_pool.h |  31 ++-
> >  src/gallium/drivers/r600/evergreen_compute.c   |  38 ++-
> >  src/gallium/state_trackers/clover/api/transfer.cpp |   4 +-
> >  src/gallium/state_trackers/clover/core/object.hpp  |   4 +
> >  .../state_trackers/clover/core/resource.cpp|   2 +
> >  6 files changed, 233 insertions(+), 140 deletions(-)
> > 
> > -- 
> > 2.0.0
> > 


Re: [Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-20 Thread Tom Stellard
On Wed, Jun 18, 2014 at 05:01:50PM +0200, Bruno Jiménez wrote:
> Hi,
> 
> This is my second attempt to fix the mapping bug, adding all the
> suggestions that Tom Stellard sent, and, so far, it seems that
> it is resolved.
>
> This series completely changes how OpenCL buffers are handled
> by the r600g driver. Before this, we would add them directly to
> a pool, and this pool would grow whenever we needed more space.
> But this process implied destroying the pool and creating a new
> one. There could be cases where a buffer would be mapped and
> the pool would grow, leaving one side of the mapping pointing
> to where the item was. This is the 'mapping bug'.
>
> Now, items will have an intermediate resource, where all mappings
> can be done, and when a buffer is going to be used with a kernel
> it is promoted to the pool. In the case where a promoted item
> is going to be mapped, it is demoted first, so even if
> the pool changes its location due to growing, the map remains
> valid. In the case of a buffer mapped for reading, and used
> by a kernel to read from it, we will duplicate this buffer,
> having the intermediate buffer, where the user has its map, and
> an item in the pool, which is the one that the kernel is going
> to use.
>

I've just pushed patches 1-9.  Nice work!

-Tom

> As a summary for v2:
> Patches 1-8: These are the main part of the series, and solve
> the mapping bug.
> Patches 1 and 7 now use fewer explicit casts
> Patch 2 is new and introduces the 'is_item_in_pool'
> function, which is used in patches 3 and 8
> 
> Patch 9: Is a complete rewrite of v1 patch 8 using gallium
> utils for double lists
> 
> Patches 10 and 11: These are just a proof of concept for avoiding
> transfers GPU <-> GPU when using all CL Read/Write functions.
> They are v1 patch 9 split in two to separate r600g changes
> from clover changes.
> Now, on clover's side it introduces and uses
> 'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesn't collide with
> any other OpenCL flag.
> 
> Please review and Thanks :)
> 
> Bruno Jiménez (11):
>   r600g/compute: Add an intermediate resource for OpenCL buffers
>   r600g/compute: Add an util function to know if an item is in the pool
>   r600g/compute: Add statuses to the compute_memory_items
>   r600g/compute: divide the item list in two
>   r600g/compute: Only move to the pool the buffers marked for promoting
>   r600g/compute: Avoid problems when promoting items mapped for reading
>   r600g/compute: Implement compute_memory_demote_item
>   r600g/compute: Map only against intermediate buffers
>   r600g/compute: Use gallium util functions for double lists
>   r600g/compute: Map directly the pool in some cases
>   clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers
> 
>  src/gallium/drivers/r600/compute_memory_pool.c | 294 -
>  src/gallium/drivers/r600/compute_memory_pool.h |  31 ++-
>  src/gallium/drivers/r600/evergreen_compute.c   |  38 ++-
>  src/gallium/state_trackers/clover/api/transfer.cpp |   4 +-
>  src/gallium/state_trackers/clover/core/object.hpp  |   4 +
>  .../state_trackers/clover/core/resource.cpp|   2 +
>  6 files changed, 233 insertions(+), 140 deletions(-)
> 
> -- 
> 2.0.0
> 


[Mesa-dev] [PATCH 00/11] [RFC v2] Solve the mapping bug

2014-06-18 Thread Bruno Jiménez
Hi,

This is my second attempt to fix the mapping bug, adding all the
suggestions that Tom Stellard sent, and, so far, it seems that
it is resolved.

This series completely changes how OpenCL buffers are handled
by the r600g driver. Before this, we would add them directly to
a pool, and this pool would grow whenever we needed more space.
But this process implied destroying the pool and creating a new
one. There could be cases where a buffer would be mapped and
the pool would grow, leaving one side of the mapping pointing
to where the item was. This is the 'mapping bug'.

Now, items will have an intermediate resource, where all mappings
can be done, and when a buffer is going to be used with a kernel
it is promoted to the pool. In the case where a promoted item
is going to be mapped, it is demoted first, so even if
the pool changes its location due to growing, the map remains
valid. In the case of a buffer mapped for reading, and used
by a kernel to read from it, we will duplicate this buffer,
having the intermediate buffer, where the user has its map, and
an item in the pool, which is the one that the kernel is going
to use.
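
As a rough illustration of the life cycle described above (C sketch; the
state names are invented for clarity and need not match the ones used in
the actual patches):

/* Every item owns an intermediate resource that all user mappings
 * target; the pooled copy exists only while kernels need it. */
enum item_location {
    ITEM_INTERMEDIATE,   /* lives in its own resource; safe to map */
    ITEM_IN_POOL         /* lives in the pool; usable by kernels */
};

struct memory_item {
    enum item_location where;
    int mapped_for_reading;   /* user holds a live read mapping */
};

void promote(struct memory_item *item)
{
    /* Copy intermediate resource -> pool here.  If the item is mapped
     * for reading, the intermediate copy is kept too, so the user's
     * mapping survives any later pool relocation (duplication case). */
    item->where = ITEM_IN_POOL;
}

void demote(struct memory_item *item)
{
    /* Copy pool -> intermediate resource here, so the new mapping
     * targets memory that pool growth can never relocate. */
    item->where = ITEM_INTERMEDIATE;
}

int main(void)
{
    struct memory_item item = { ITEM_INTERMEDIATE, 0 };
    promote(&item);   /* a kernel is about to use the buffer */
    demote(&item);    /* the user wants to map it again */
    return item.where == ITEM_INTERMEDIATE ? 0 : 1;
}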

As a summary for v2:
Patches 1-8: These are the main part of the series, and solve
the mapping bug.
Patches 1 and 7 now use fewer explicit casts
Patch 2 is new and introduces the 'is_item_in_pool'
function, which is used in patches 3 and 8

Patch 9: Is a complete rewrite of v1 patch 8 using gallium
utils for double lists

Patches 10 and 11: These are just a proof of concept for avoiding
transfers GPU <-> GPU when using all CL Read/Write functions.
They are v1 patch 9 split in two to separate r600g changes
from clover changes.
Now, on clover's side it introduces and uses
'CLOVER_TRANSFER_MAP_DIRECTLY' so it doesn't collide with
any other OpenCL flag.
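
The flag plumbing might look roughly like this (C sketch; both flag
values below are illustrative, not the real gallium or clover ones):

#define PIPE_TRANSFER_MAP_DIRECTLY   (1u << 2)   /* gallium-side flag */
#define CLOVER_TRANSFER_MAP_DIRECTLY (1u << 31)  /* clover-private bit */

/* Translate clover's private bit into the gallium flag only at the
 * driver boundary, so it can never collide with OpenCL map flags. */
unsigned to_pipe_flags(unsigned clover_flags)
{
    unsigned pipe_flags = clover_flags & ~CLOVER_TRANSFER_MAP_DIRECTLY;
    if (clover_flags & CLOVER_TRANSFER_MAP_DIRECTLY)
        pipe_flags |= PIPE_TRANSFER_MAP_DIRECTLY;
    return pipe_flags;
}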

Please review and Thanks :)

Bruno Jiménez (11):
  r600g/compute: Add an intermediate resource for OpenCL buffers
  r600g/compute: Add an util function to know if an item is in the pool
  r600g/compute: Add statuses to the compute_memory_items
  r600g/compute: divide the item list in two
  r600g/compute: Only move to the pool the buffers marked for promoting
  r600g/compute: Avoid problems when promoting items mapped for reading
  r600g/compute: Implement compute_memory_demote_item
  r600g/compute: Map only against intermediate buffers
  r600g/compute: Use gallium util functions for double lists
  r600g/compute: Map directly the pool in some cases
  clover: Use PIPE_TRANSFER_MAP_DIRECTLY when writing/reading buffers

 src/gallium/drivers/r600/compute_memory_pool.c | 294 -
 src/gallium/drivers/r600/compute_memory_pool.h |  31 ++-
 src/gallium/drivers/r600/evergreen_compute.c   |  38 ++-
 src/gallium/state_trackers/clover/api/transfer.cpp |   4 +-
 src/gallium/state_trackers/clover/core/object.hpp  |   4 +
 .../state_trackers/clover/core/resource.cpp|   2 +
 6 files changed, 233 insertions(+), 140 deletions(-)

-- 
2.0.0
