> allocate pixmap gets cached memory
> copy data into the pixmap
> pre-use from hardware we flush the cache lines and tlb
> use the pixmap in hardware
> pre-free we need to set the page back to cached so we flush the tlb
> free the memory.

> Now the big issue here on SMP is that the cache and/or tlb flushes
> require IPIs and they are very noticeable on the profiles,

Blame intel ;)

> Any other ideas and suggestions?

Without knowing exactly what you are doing:

- Copies to uncached memory are very expensive on an x86 processor
(so it might be faster not to write and flush)
- It's not clear from your description how intelligent your transfer
system is.


I'd expect for example that the process was something like

        Parse pending commands until either
        1. Queue empties
        2. A time target passes

        For each command that needs a pixmap shoved over, add it
        to the buffer to transfer

        Do a single CLFLUSH and maybe IPI

        Fire up the command queue

        Keep the buffers hanging around until there is memory pressure
        if we may reuse that pixmap

Can you clarify that?
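
To make the batching idea above concrete, here is a rough userspace sketch in C.
Every name in it (struct batch, process_pending, flush_caches_once and so on) is
made up for illustration and is not from any real driver. The "time target" is
modelled as a plain command budget, the flush and the queue kick are stand-ins
that only count how often they are called, and the "keep buffers around until
memory pressure" step is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 64

/* Hypothetical types, for illustration only. */
struct pixmap { void *data; size_t len; };

struct command {
    struct pixmap *pix;   /* NULL if the command needs no pixmap transfer */
};

struct batch {
    struct pixmap *bufs[BATCH_MAX];
    size_t count;         /* pixmaps collected for the next submission */
    unsigned flushes;     /* how many expensive flushes were issued */
    unsigned fired;       /* how many times the command queue was started */
};

/* Stand-in for the expensive step (CLFLUSH over the buffers, maybe an IPI). */
static void flush_caches_once(struct batch *b)
{
    b->flushes++;
}

/* Stand-in for kicking the hardware command queue. */
static void fire_command_queue(struct batch *b)
{
    b->fired++;
    b->count = 0;
}

/*
 * Parse pending commands until the queue empties or the budget runs out,
 * collect the pixmaps they need, then flush once and fire.
 * Returns the number of commands consumed.
 */
size_t process_pending(struct batch *b, const struct command *cmds,
                       size_t ncmds, size_t budget)
{
    size_t i;

    assert(b != NULL);

    for (i = 0; i < ncmds && i < budget; i++) {
        if (cmds[i].pix == NULL)
            continue;                   /* nothing to transfer */
        if (b->count == BATCH_MAX)
            break;                      /* batch full: stop and submit */
        b->bufs[b->count++] = cmds[i].pix;
    }

    if (b->count > 0) {
        flush_caches_once(b);           /* one flush for the whole batch */
        fire_command_queue(b);
    }
    return i;
}
```

The point of the shape is that the cost of the flush is paid once per batch
rather than once per pixmap, so four queued commands cost one flush, not four.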

If the hugepage anti-frag stuff ever gets merged this would also help as
you could possibly grab a huge page from the allocator for this purpose
and have to flip only one TLB entry.
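
For a feel of the numbers, a small sketch of the arithmetic. The helper name
mappings_needed is hypothetical, and the sizes assume x86 with 4 KiB base pages
and 2 MiB huge pages (PAE or x86-64; non-PAE 32-bit uses 4 MiB huge pages).

```c
#include <stddef.h>

/*
 * Number of page mappings needed to cover 'bytes' using pages of
 * 'page_size' bytes, rounded up. Each mapping is one entry whose
 * caching attribute would have to be flipped.
 */
size_t mappings_needed(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}
```

Under those assumptions a 2 MiB pixmap pool needs 512 mappings with 4 KiB pages
but only one with a single huge page.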

Alan
