2010/5/31 luca ellero <[email protected]>:
> Pei Lin wrote:
>>
>> 2010/5/17 luca ellero <[email protected]>:
>>
>>>
>>> Hi list,
>>> I have some (maybe stupid) questions which I can't answer even after
>>> reading
>>> lots of documentation.
>>> Suppose I have a PCI device which has some I/O registers mapped to memory
>>> (here I mean access are made through memory, not I/O space).
>>> As far as I know the right way to access them is through functions such
>>> as
>>> iowrite8 and friends:
>>>
>>> spin_lock(Q);
>>> iowrite8(some_address, ADDR);
>>> iowrite8(some_data, DATA);
>>> spin_unlock(Q);
>>>
>>> My questions are:
>>>
>>> 1) Do I need a write memory barrier (wmb) between the two iowrite8?
>>> I think I need it because I've read the implementation of iowrite8 and
>>> (in
>>> kernel 2.6.30.6) this expands to:
>>>
>>> void iowrite8(u8 val, void *addr)
>>> {
>>>         do {
>>>                 unsigned long port = (unsigned long)addr;
>>>                 if (port >= 0x40000UL) {
>>>                         writeb(val, addr);
>>>                 } else if (port > 0x10000UL) {
>>>                         port &= 0x0ffffUL;
>>>                         outb(val, port);
>>>                 } else
>>>                         bad_io_access(port, "outb(val,port)");
>>>         } while (0);
>>> }
>>>
>>> where writeb is:
>>>
>>> static inline void writeb(unsigned char val, volatile void *addr)
>>> {
>>>         asm volatile("movb %0,%1"
>>>                      : : "q" (val), "m" (*(volatile unsigned char *)addr)
>>>                      : "memory");
>>> }
>>>
>>> which contains only a compiler barrier (the "memory" clobber in the asm
>>> statement) but no CPU barrier. So, without wmb(), the CPU can reorder the
>>> two iowrite8 calls, with disastrous effects. Am I right?
>>>
>>>
>>> 2) Do I need mmiowb() before spin_unlock()?
>>> The documentation about mmiowb() really confuses me, so any explanation
>>> of its use is very welcome.
>>>
>>
>> See the documentation which explains it clearly.
>> http://lxr.linux.no/linux+v2.6.27.46/Documentation/memory-barriers.txt
>>
>> LOCKS VS I/O ACCESSES
>> ---------------------
>>
>> Under certain circumstances (especially involving NUMA), I/O accesses
>> within two spinlocked sections on two different CPUs may be seen as
>> interleaved by the PCI bridge, because the PCI bridge does not necessarily
>> participate in the cache-coherence protocol, and is therefore incapable of
>> issuing the required read memory barriers.
>>
>> For example:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         writel(1, DATA);
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         writel(5, DATA);
>>                                         spin_unlock(Q);
>>
>> may be seen by the PCI bridge as follows:
>>
>>         STORE *ADDR = 0, STORE *ADDR = 4, STORE *DATA = 1, STORE *DATA = 5
>>
>> which would probably cause the hardware to malfunction.
>>
>> What is necessary here is to intervene with an mmiowb() before dropping
>> the spinlock, for example:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         writel(1, DATA);
>>         mmiowb();
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         writel(5, DATA);
>>                                         mmiowb();
>>                                         spin_unlock(Q);
>>
>> this will ensure that the two stores issued on CPU 1 appear at the PCI
>> bridge before either of the stores issued on CPU 2.
>>
>> Furthermore, following a store by a load from the same device obviates
>> the need for the mmiowb(), because the load forces the store to complete
>> before the load is performed:
>>
>>         CPU 1                           CPU 2
>>         =============================== ===============================
>>         spin_lock(Q)
>>         writel(0, ADDR)
>>         a = readl(DATA);
>>         spin_unlock(Q);
>>                                         spin_lock(Q);
>>                                         writel(4, ADDR);
>>                                         b = readl(DATA);
>>                                         spin_unlock(Q);
>>
>> See Documentation/DocBook/deviceiobook.tmpl for more information.
>>
>
> Thanks for your reply.
> I've already read the documentation; what surprises me is that mmiowb()
> (at least on x86) is defined as a compiler barrier (barrier()) and nothing
> else. I would expect it to do something more than that: some specific PCI
> command, or at least a dummy read from some PCI register (since a read
> forces the store to complete).
For MIPS, it is defined as:

/* Depends on MIPS II instruction set */
#define mmiowb() asm volatile ("sync" ::: "memory")

For x86, the barrier macros are:

#define mb()  asm volatile("mfence" ::: "memory")
#define rmb() asm volatile("lfence" ::: "memory")
#define wmb() asm volatile("sfence" ::: "memory")

so on x86 the mfence/lfence/sfence instructions provide the full, read,
and write CPU barriers respectively.
I found an old mailing-list discussion about mmiowb() usage:
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21056.html
http://www.gelato.unsw.edu.au/archives/linux-ia64/0708/21096.html
From: Nick Piggin <npiggin_at_suse.de>
Date: 2007-08-24 12:59:16
On Thu, Aug 23, 2007 at 09:16:42AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 23 Aug 2007, Nick Piggin wrote:
> >
> > Also, FWIW, there are some advantages of deferring the mmiowb thingy
> > until the point of unlock.
>
> And that is exactly what ppc64 does.
>
> But you're missing a big point: for 99.9% of all hardware, mmiowb() is a
> total no-op. So when you talk about "advantages", you're not talking about
> any *real* advantage, are you?
> Furthermore, a lot of PCI drivers seem to ignore it entirely.
> Can you explain that to me?
I only found one link, which may explain why many drivers removed mmiowb():
http://lwn.net/Articles/283776/
> Luca
>
>
>
--
Best Regards
Lin