ttm: Rework validation & memory space allocation

Thomas Hellström Sat, 05 Dec 2009 14:31:42 -0800

Jerome Glisse wrote:
> On Sat, Dec 05, 2009 at 01:14:29PM +0100, Thomas Hellström wrote:
>   
>> Jerome,
>>
>> I'm just reviewing the API change here. I'll do a more thorough code 
>> review when the API is agreed upon.
>>
>>     
>>> +
>>> +/**
>>> + * struct ttm_placement
>>> + *
>>> + * @fpfn:          first valid page frame number to put the object
>>> + * @lpfn:          last valid page frame number to put the object
>>> + * @placements:            prefered placements, 4bits per placement,
>>> + * first 4 bits are the the number of placement
>>> + * @busy_placements:       prefered placements when need to evict,
>>> + * 4bits per placement, first 4 bits are the the number of placement
>>> + * @flags: Additional placement flags
>>> + *
>>> + * Structure indicating the placement you request for an object.
>>> + */
>>> +struct ttm_placement {
>>> +   unsigned        fpfn;
>>> +   unsigned        lpfn;
>>> +   u64             placements;
>>> +   u32             flags;
>>> +};
>>> +
>>>   
>>>       
>> The names fpfn and lpfn are a bit misleading, since the pfn can be 
>> confused with kernel pfns. These are really page offsets into the 
>> managed area? At least this confused me a bit.
>>
>> For @placements and @busy_placements, I take it you encode both the 
>> possible placements and the priority order (although you leave out the 
>> @busy_placements in the struct. I think @busy_placements are needed, 
>> though. Consider a situation where you want to place a texture in vram, 
>> but if that causes a pipeline stall due to VRAM eviction, rather place 
>> it in AGP.
>>
>> I dislike the coding of all this information in a single member, also I 
>> think one should distinguish between allowable placement and priority. 
>> The validation code may want to alter priority, which is not easily done 
>> in current TTM, and core TTM basically only wants to be able to move 
>> objects out to cached or uncached system memory.
>>
>> What about
>>
>> struct ttm_placement {
>>        unsigned long fpfn;
>>        unsigned long lpfn;
>>        const u64 *placements;             //Encoded the traditional way
>>        unsigned num_placements;
>>        const u64 *busy_placements;
>>        unsigned num_busy_placements;
>> }
>>     
>
> I removed busy placement, i think giving placement preference is enough.
>   
Obviously at least OpenChrome is using that functionality.
A typical use-case where this isn't enough is where you have a texture 
that's preferrably
placed in VRAM but it may also be placed in TT. If both these are 
unavailable,
you'd like to start evicting from TT, because that's done in an instant, 
compared to PCI DMA from
VRAM which is at 60MB/s on some chipsets.



> If you look at code you can see that ttm will try with first placement,
> than second, ... My change allow to change the priority as you provide
> one with each validation.
>
>   

Yes. Allowing to change the priority for each validation is a thing 
that's missing, and I'd like to see
that functionality go in.

> For instance if you prefer a bo in vram and only allow vram :
> placements = (VRAM << 4) | 1
>
> If you prefere bo in VRAM but the bo can also be in TT :
> placements = (VRAM << 4) | (TT << 8) | 2
>
> If you prefer bo in TT but can also be in vram :
> placements = (TT << 4) | (VRAM << 8) | 2
>
> I would like to avoid having pointer, using a quadword seems enought
> to fit all need i can think of (15 different placement). So my
> change is all about allowing driver to alter priority for each
> validation.
>   

Packing the information in a quadword is also very ugly, and makes the 
unpacking code
hard to follow. 
In the typical case, either the driver decides placement based on 
previous usage history,
and in that case only one placement would be given. Or the legacy case 
where a
driver uses a static array for priorities. Hence a pointer to an array 
is IMHO the obvious choice.

> On the avoiding eviction thing, the plan i was thinking of is mostly
> to let the driver come up with an heuristic to decide when to put
> a bo in vram or not. This would mean driver won't set vram as first
> priority placement if the driver don't think it's worth evicting
> other bo for it. Thought adding busy might be a good idea, i can
> understand that it could be usefull.
>   

One thing that should be considered if you want to implement an advanced 
heuristic in the driver,
is to add a bool to the validate function that stops it from evicting. 
In that case the driver could
do that itself, and implement whatever strategy it likes.


> We can avoid using a structure to pack arguments but giving the
> number of them i thought it was a good idea to pack them. Don't
> know if there is a kernel policy about that.
>
>   
>> // For core ttm use.
>> static inline struct ttm_placement *ttm_encode_simple_placement(u64 
>> desired, struct ttm_placement *placement)
>> {
>>         placement->fpfn = placement->lpfn = 0;
>>         placement->placements = placement->busy_placements = &desired;
>>         placement->num_placements = placement->num_busy_placements = 1;
>> }
>>
>>
>>     
>>> @@ -445,7 +468,6 @@ extern int ttm_bo_check_placement(struct 
>>> ttm_buffer_object *bo,
>>>   *
>>>   * @bdev: Pointer to a ttm_bo_device struct.
>>>   * @mem_type: The memory type.
>>> - * @p_offset: offset for managed area in pages.
>>>   * @p_size: size managed area in pages.
>>>   *
>>>   * Initialize a manager for a given memory type.
>>> @@ -458,7 +480,7 @@ extern int ttm_bo_check_placement(struct 
>>> ttm_buffer_object *bo,
>>>   */
>>>  
>>>  extern int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
>>> -                     unsigned long p_offset, unsigned long p_size);
>>> +                           unsigned long p_size);
>>>   
>>>       
>> Hmm,
>> Why remove p_offset here? It allows splitting a PCI memory region into 
>> multiple TTM memory regions.
>>
>> Thanks,
>> Thomas
>>     
>
> It seemed redondant with the io field, i think we can deal with
> the use case you are describing without p_offset. Also i did this
> so it's really easy to convert fpfn, lpfn from 0 to valid range.
> (if lpfn == 0 then lpfn = man->size).
>   

Yes. The downside is that you have to implement each memory type 
differently in the
manager initialization. Anyway, that's a rare use-case so given the 
obvious advantage for
fpfn and lpfn it looks like a price worth paying.

Thanks,
Thomas




------------------------------------------------------------------------------
Join us December 9, 2009 for the Red Hat Virtual Experience,
a free event focused on virtualization and cloud computing. 
Attend in-depth sessions from your desk. Your couch. Anywhere.
http://p.sf.net/sfu/redhat-sfdev2dev
--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel

Re: [PATCH 2/3] drm/ttm: Rework validation & memory space allocation

Reply via email to