>
> Since Poulsbo is CMA, to avoid the SMP ipi issue, it should be possible
> to enclose the whole reloc fixup within a spinlock and use
> kmap_atomic which should be faster than kmap.
> Since within a spinlock, also preemption is disabled we can guarantee
> that a batchbuffer write followed by a clflush executes on the same
> processor => no need for ipi, and the clflush can follow immediately
> after a write.
> We've used this technique in psb_mmu.c, although we're using
> preempt_disable() / preempt_enable() to collect per-processor clflushes.
>
> So, basically something like the following should be a fast ipi-free way
> to do this:
>
> spin_lock()
> while(more_relocs_to_do) {
>  kmap_atomic(dst_buffer); // Reuse old map if same page
>  apply_reloc():
>  clflush(newly_written_address);
>  kunmap_atomic(dst_buffer);
> }
> spin_unlock();

So this should work fine if every cacheline of the buffer being 
relocated contains a relocation, so that the snoop logic invalidates that 
cacheline on the other processors. But if you have very sparse relocations 
I could see something like:

CPU0 writes the relocation bo initially - one page with no relocations
stays in its cache
-> schedule
CPU1 enters the kernel preempt-disabled section and starts relocating,
never hitting that page
CPU1 clflushes
GPU never sees the one page with no relocs.

Now maybe I'm missing something, but I'm not sure how to protect against 
that.
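
To make the concern concrete, here is a rough userspace model of the
scenario above. It is only a sketch, not driver code: the function name
unflushed_lines(), the 64-byte cacheline size, and the MAX_LINES bound are
all assumptions made for illustration, and it assumes no relocation
straddles two cachelines. It counts the cachelines that a
flush-only-what-you-relocate loop would never clflush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHELINE 64u   /* assumed cacheline size in bytes */
#define MAX_LINES 1024u /* upper bound on lines this model tracks */

/*
 * Hypothetical model: every cacheline of the buffer is assumed dirty in
 * CPU0's cache after the initial write. CPU1 then applies each relocation
 * and flushes only the line it just wrote. Returns how many lines of the
 * buffer were never flushed by that per-relocation scheme.
 */
static size_t unflushed_lines(size_t buf_size,
                              const size_t *reloc_offsets, size_t n_relocs)
{
    bool flushed[MAX_LINES] = { false };
    size_t n_lines = (buf_size + CACHELINE - 1) / CACHELINE;
    size_t remaining = n_lines;

    assert(n_lines <= MAX_LINES);

    for (size_t i = 0; i < n_relocs; i++) {
        size_t line = reloc_offsets[i] / CACHELINE;

        assert(reloc_offsets[i] < buf_size);
        /* Marking the line stands in for clflush(newly_written_address). */
        if (!flushed[line]) {
            flushed[line] = true;
            remaining--;
        }
    }
    return remaining;
}
```

In this model, a two-page (8192 byte) buffer with relocations at offsets
0, 4 and 64 leaves 126 of its 128 lines unflushed. Whether those lines
actually stay stale from the GPU's point of view depends on the hardware,
which the model does not try to capture.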

Dave.


--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
