Dave Airlie wrote:
>> Dave, I'd like to see the flag DRM_BO_FLAG_CACHED really mean cache-coherent
>> memory, that is, memory that stays cache coherent even while visible to the
>> GPU. There are HW implementations out there (Poulsbo at least) where this
>> option actually seems to work, although it's considerably slower for things
>> like texturing. It's also a requirement for user BOs, since they will have
>> VMAs that we can't kill and remap.
>>     
>
> Most PCIE cards will be cache coherent, but AGP cards not so much, so we
> need to think about whether a generic _CACHED makes sense. For something
> like radeon, will I have to pass different flags depending on the GART
> type? That seems ugly, so maybe a separate flag makes more sense.
>
>   
OK. We're using this functionality in Poulsbo, so we should probably
sort this out to avoid breaking things.
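
To make the GART-type concern concrete, here is a minimal driver-side
sketch of how the split could look. Everything in it is illustrative:
dev_is_pcie(), the flag values, and DRM_BO_FLAG_MAPPED_CACHED (proposed
below) are stand-ins, not the current drm.h definitions.

#include <stdint.h>

/* Illustrative stand-ins only; not the real drm.h values. */
#define DRM_BO_FLAG_READ          (1ULL << 0)
#define DRM_BO_FLAG_WRITE         (1ULL << 1)
#define DRM_BO_FLAG_CACHED        (1ULL << 2) /* coherent even while GPU-visible */
#define DRM_BO_FLAG_MAPPED_CACHED (1ULL << 3) /* cached mapping; flushed on validate */

struct drm_device;                              /* opaque here */
extern int dev_is_pcie(struct drm_device *dev); /* hypothetical query */

/* Pick the caching flag per GART type, so a generic _CACHED request
 * from user space does not silently break on AGP. */
static uint64_t default_bo_flags(struct drm_device *dev)
{
	uint64_t flags = DRM_BO_FLAG_READ | DRM_BO_FLAG_WRITE;

	if (dev_is_pcie(dev))
		flags |= DRM_BO_FLAG_CACHED;        /* snooped, stays coherent */
	else
		flags |= DRM_BO_FLAG_MAPPED_CACHED; /* AGP: not coherent while bound */

	return flags;
}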
>> Could we perhaps change the flag DRM_BO_FLAG_READ_CACHED to mean
>> DRM_BO_FLAG_MAPPED_CACHED to implement the behaviour you describe? This
>> would also indicate that the buffer cannot be used for user-space
>> sub-allocators, since in that case we must be able to guarantee that the
>> CPU can access parts of the buffer while other parts are validated for
>> the GPU.
>
> Yes, to be honest, sub-allocators should be avoided for most use-cases if
> possible. We should be able to make the kernel interface fast enough for
> most things if we don't have to switch caching flags on the fly at
> map/destroy etc.
>   
Yes, Eric seems to have the same opinion. I'm not quite sure I 
understand the reasoning behind it.
Is it the added complexity or something else?

While it's super to have a fast kernel interface, the inherent latency and
allocation granularity will probably always make a user-space sub-allocator
a desirable thing, particularly something like a slab allocator, which would
also, to some extent, avoid fragmentation.
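
To illustrate what I mean, here's a minimal user-space sketch. The names
(struct bo_slab, the opaque BO pointer) are made up for the example; the
point is only the bookkeeping: one kernel allocation backs many equal-sized
blocks, so the common alloc/free path never enters the kernel, and
equal-sized blocks can't fragment the slab.

#include <stdint.h>

#define SLAB_BLOCKS 64

struct bo_slab {
	void *bo;           /* the single kernel buffer object */
	uint32_t block_size;
	uint64_t free_mask; /* bit i set => block i is free */
};

/* Hand out the offset of a free block; no ioctl involved. */
static int slab_alloc(struct bo_slab *s, uint32_t *offset)
{
	for (unsigned int i = 0; i < SLAB_BLOCKS; ++i) {
		if (s->free_mask & (1ULL << i)) {
			s->free_mask &= ~(1ULL << i);
			*offset = i * s->block_size;
			return 0;
		}
	}
	return -1; /* slab exhausted: only now ask the kernel for a new BO */
}

static void slab_free(struct bo_slab *s, uint32_t offset)
{
	s->free_mask |= 1ULL << (offset / s->block_size);
}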

My view of TTM has shifted a bit, to approach this from the opposite side:
Let's say we have a fast user-space per-client allocator.
What kernel functionality would we require to make sure that it
can assume it's the sole owner of the memory it manages?

For a repeated usage pattern like batch buffers, we end up allocating pages
from the kernel, setting up one VMA per buffer, and modifying GART and page
tables, and in the worst case even the caching policy, for each and every
use. Even if this can be made reasonably fast, isn't that a CPU overhead we
really shouldn't be paying?
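
As a concrete (hypothetical) sketch of the alternative: keep a small ring
of already-mapped batch buffers and recycle one the GPU has finished with,
so the pages, the VMA, the GART binding and the caching policy all survive
across submissions. bo_idle() below stands in for whatever fence query the
driver exposes; none of this is an existing interface.

#include <stddef.h>

#define POOL_SIZE 8

struct drm_bo;                         /* opaque for the sketch */
extern int bo_idle(struct drm_bo *bo); /* hypothetical fence check */

struct batch_pool {
	struct drm_bo *bo[POOL_SIZE];
	int next;
};

/* Return a batch buffer the GPU is done with, reusing its mapping
 * and GART binding instead of setting everything up again. */
static struct drm_bo *batch_pool_get(struct batch_pool *p)
{
	for (int i = 0; i < POOL_SIZE; ++i) {
		int slot = (p->next + i) % POOL_SIZE;
		if (bo_idle(p->bo[slot])) {
			p->next = (slot + 1) % POOL_SIZE;
			return p->bo[slot];
		}
	}
	return NULL; /* everything still in flight: block or allocate fresh */
}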

/Thomas