On Thu, Sep 9, 1999, Dave Baukus <[EMAIL PROTECTED]> wrote:

>Finally to the questions:
>       1.)     Is anybody actually running an ebsa285
>               with caching enabled ?
>
>               If so what type of changes are
>               required to the drivers ?
>
>       2.)     Are we way off base here ?
>               Shouldn't the bigphysmem or changes to kmalloc()
>               solve the problem ?
>
>               What subtlety are we missing ?

This is a generic problem with all SA110/21285 platforms (EBSA, Netwinder,
CATS, ...).

I discussed this recently with Philip and Russell. The problem is that the
SA110/21285 pair doesn't handle cache coherency. That means that when a
device accesses memory over PCI as a bus master, nothing is done to
make sure the data is coherent with the CPU cache.
(The architecture of the SA110 cache, based on virtual addresses, would
probably have made this almost impossible without a bunch of hacks in the
silicon.)

Some drivers have been "fixed" for this behaviour by adding cache
flushes and invalidates on memory ranges at specific locations. Basically,
you need to invalidate a range which will be written by the device and
flush (write back) a range which will be read by the device. This is how
the tulip driver was fixed, and I recently fixed the pcnet32 driver the
same way (I can send you patches). I've also seen a fixed sym53c8xx in
CVS recently.
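
To make the rule concrete, here is a minimal sketch of the bookkeeping
only. It is not the tulip or pcnet32 patch. It assumes a 32-byte cache
line, and every name in it (dma_dir, cache_op, sync_plan, plan_sync) is
made up for illustration; a real driver would pass the resulting range to
the architecture's cache maintenance routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* assumed cache line size, in bytes */

/* Who touches the buffer next (hypothetical names). */
enum dma_dir { DMA_DEVICE_WRITES, DMA_DEVICE_READS };

/* What the CPU must do to its cache before the transfer starts. */
enum cache_op { CACHE_INVALIDATE, CACHE_FLUSH };

struct sync_plan {
    enum cache_op op;
    uintptr_t start;  /* rounded down to a line boundary */
    uintptr_t end;    /* rounded up to a line boundary, exclusive */
};

/*
 * Pick the cache operation for a buffer and widen the range to whole
 * cache lines, since the cache cannot operate on anything smaller.
 */
struct sync_plan plan_sync(uintptr_t addr, size_t len, enum dma_dir dir)
{
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE - 1);
    struct sync_plan p;

    assert(len > 0);
    /* Device will write: drop stale lines. Device will read: write back. */
    p.op = (dir == DMA_DEVICE_WRITES) ? CACHE_INVALIDATE : CACHE_FLUSH;
    p.start = addr & mask;
    p.end = (addr + len + CACHE_LINE - 1) & mask;
    return p;
}
```

Note that an 8-byte buffer at 0x1004 ends up covering 0x1000 to 0x1020:
the operation always applies to whole lines, whatever else lives in them.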

The problem with this solution is that it's far from perfect. It forces
you to place those flushes and invalidates at critical locations, which
can make the drivers quite unmaintainable. Also, when dealing with "shared
regions" where both the CPU and the device read and write small pieces of
data (words), there are still some potential coherency issues. The CPU
flushes whole cache lines, so when flushing, let's say, a word, the entire
cache line containing this word will be written back, possibly overwriting
whatever the device has written elsewhere in the same cache line. The
current fixed tulip and pcnet32 drivers seem to work fine, probably
because the SA110 supports a mechanism of "half cache lines" (2 dirty bits
per line), making the half cache line the same size as a ring descriptor
entry for those drivers.
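
The failure mode can be shown with a toy user-space model of a single
cache line. This is only a sketch: the 32-byte line (8 words, two 16-byte
halves) is an assumption, and the struct and function names are invented
for the illustration, not taken from any driver or from the hardware
documentation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORDS_PER_LINE 8  /* assumed: 32-byte line of 32-bit words */
#define WORDS_PER_HALF 4  /* 16 bytes, one dirty bit per half */

struct cache_line {
    uint32_t data[WORDS_PER_LINE];
    int dirty[2];         /* one flag per half line */
};

/* CPU reads the line from memory into the cache. */
void line_fill(struct cache_line *c, const uint32_t *mem)
{
    memcpy(c->data, mem, sizeof c->data);
    c->dirty[0] = c->dirty[1] = 0;
}

/* CPU store: lands in the cache only and marks that half dirty. */
void cpu_write(struct cache_line *c, int word, uint32_t val)
{
    assert(word >= 0 && word < WORDS_PER_LINE);
    c->data[word] = val;
    c->dirty[word / WORDS_PER_HALF] = 1;
}

/* Bus-master store: goes straight to memory, the cache never sees it. */
void device_write(uint32_t *mem, int word, uint32_t val)
{
    assert(word >= 0 && word < WORDS_PER_LINE);
    mem[word] = val;
}

/* Write back the whole line if any part of it is dirty. */
void flush_whole_line(struct cache_line *c, uint32_t *mem)
{
    if (c->dirty[0] || c->dirty[1])
        memcpy(mem, c->data, sizeof c->data);
    c->dirty[0] = c->dirty[1] = 0;
}

/* Write back only the halves that are dirty. */
void flush_dirty_halves(struct cache_line *c, uint32_t *mem)
{
    for (int h = 0; h < 2; h++) {
        if (c->dirty[h])
            memcpy(mem + h * WORDS_PER_HALF,
                   c->data + h * WORDS_PER_HALF,
                   WORDS_PER_HALF * sizeof(uint32_t));
        c->dirty[h] = 0;
    }
}
```

In this model, if the CPU writes word 0 and the device then writes word 4,
flush_whole_line() puts the stale cached copy of word 4 back into memory,
while flush_dirty_halves() leaves it alone. If the device writes word 1
instead (same half as the CPU's word), both variants lose the device's
value, which is why descriptors smaller than a half line would still be
at risk.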

However, a better solution (and probably more efficient, especially for
the symbios driver) would be to allocate those shared regions in
non-cacheable space. Flushes and invalidates would still be required for
data buffers, but ring descriptors or SCSI controller scripts should be
in non-cacheable space.

I tried implementing a vmalloc_uncached (with the help of Russell) but
unfortunately, virt_to_bus can't get the physical address of a vmalloc'ed
area, and I didn't want to walk the page tables.
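
As far as I understand it, the reason is that virt_to_bus is plain offset
arithmetic that only holds for the linearly mapped part of kernel memory,
whereas a vmalloc'ed area is virtually contiguous but backed by scattered
pages. The toy model below sketches that difference. All addresses, the
page list and the function names are invented for the illustration and do
not describe the real EBSA285 memory map.

```c
#include <assert.h>
#include <stdint.h>

/* All constants below are invented for the illustration. */
#define PAGE_SIZE     4096u
#define LINEAR_VIRT   0xC0000000u  /* start of the linear kernel mapping */
#define LINEAR_BUS    0x00000000u  /* bus address of the first RAM byte */
#define VMALLOC_VIRT  0xD0000000u  /* start of the toy vmalloc area */
#define VMALLOC_PAGES 2u

/* Bus address of each page backing the toy vmalloc area: scattered. */
static const uint32_t vmalloc_page_bus[VMALLOC_PAGES] = {
    0x00345000u, 0x0012A000u
};

/* virt_to_bus-style translation: pure arithmetic, no table lookup. */
uint32_t linear_virt_to_bus(uint32_t virt)
{
    return virt - LINEAR_VIRT + LINEAR_BUS;
}

/* What a page-table walk would return for an address in the toy area. */
uint32_t walk_vmalloc_to_bus(uint32_t virt)
{
    uint32_t off = virt - VMALLOC_VIRT;

    assert(virt >= VMALLOC_VIRT && off / PAGE_SIZE < VMALLOC_PAGES);
    return vmalloc_page_bus[off / PAGE_SIZE] + off % PAGE_SIZE;
}
```

For an address in the linear mapping the arithmetic gives the right
answer; for 0xD0001010 in the toy vmalloc area it gives 0x10001010, while
the page actually backing that address sits at 0x0012A000.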

I had to stop hacking on this (more urgent work to do), but I believe
it is still possible to use the low-level _ioremap function to
create a second, uncacheable, mapping for a given kmalloc'ed region. If
you make sure to invalidate it once, and then only use the address
returned by _ioremap, this should work.

I'll do more experiments with this next week,

Benjamin.

