2010/5/10 Blue Swirl <blauwir...@gmail.com>:
> On 5/10/10, Artyom Tarasenko <atar4q...@googlemail.com> wrote:
>> 2010/5/10 Blue Swirl <blauwir...@gmail.com>:
>>
>> > On 5/10/10, Artyom Tarasenko <atar4q...@googlemail.com> wrote:
>> >> 2010/5/9 Blue Swirl <blauwir...@gmail.com>:
>> >> > On 5/9/10, Artyom Tarasenko <atar4q...@googlemail.com> wrote:
>> >> >> 2010/5/9 Blue Swirl <blauwir...@gmail.com>:
>> >> >>
>> >> >> > On 5/8/10, Artyom Tarasenko <atar4q...@googlemail.com> wrote:
>> >> >> >> On the real hardware (SS-5, LX) the MMU is not padded, but aliased.
>> >> >> >> Software shouldn't use aliased addresses, nor should it crash
>> >> >> >> when it does (on the real hardware it wouldn't). Using empty_slot
>> >> >> >> instead of aliasing can help with debugging such accesses.
>> >> >> >
>> >> >> > The TurboSPARC Microprocessor User's Manual shows that there are
>> >> >> > additional pages after the main IOMMU for AFX registers. So this is
>> >> >> > not board specific, but depends on CPU/IOMMU versions.
>> >> >>
>> >> >>
>> >> >> I checked it on the real hw: on LX and SS-5 these are aliased MMU addresses.
>> >> >> SS-20 doesn't have any aliasing.
>> >> >
>> >> > But are your machines equipped with TurboSPARC or some other CPU?
>> >>
>> >>
>> >> Good point, I must confess: I missed the word "Turbo" in your first
>> >> answer. LX and SS-20 don't have one.
>> >> But SS-5 must have a TurboSPARC CPU:
>> >>
>> >> ok cd /FMI,MB86904
>> >> ok .attributes
>> >> context-table            00 00 00 00 03 ff f0 00 00 00 10 00
>> >> psr-implementation       00000000
>> >> psr-version              00000004
>> >> implementation           00000000
>> >> version                  00000004
>> >> cache-line-size          00000020
>> >> cache-nlines             00000200
>> >> page-size                00001000
>> >> dcache-line-size         00000010
>> >> dcache-nlines            00000200
>> >> dcache-associativity     00000001
>> >> icache-line-size         00000020
>> >> icache-nlines            00000200
>> >> icache-associativity     00000001
>> >> ncaches                  00000002
>> >> mmu-nctx                 00000100
>> >> sparc-version            00000008
>> >> mask_rev                 00000026
>> >> device_type              cpu
>> >> name                     FMI,MB86904
>> >>
>> >> and still it behaves the same as the TI,TMS390S10 from the LX. This is done
>> >> on SS-5:
>> >>
>> >> ok 10000000 20 spacel@ .
>> >> 4000009
>> >> ok 14000000 20 spacel@ .
>> >> 4000009
>> >> ok 14000004 20 spacel@ .
>> >> 23000
>> >> ok 1f000004 20 spacel@ .
>> >> 23000
>> >> ok 10000008 20 spacel@ .
>> >> 4000009
>> >> ok 14000028 20 spacel@ .
>> >> 4000009
>> >> ok 1000000c 20 spacel@ .
>> >> 23000
>> >> ok 10000010 20 spacel@ .
>> >> 4000009
>> >>
>> >> LX is the same except for the IOMMU-version:
>> >>
>> >> ok 10000000 20 spacel@ .
>> >> 4000005
>> >> ok 14000000 20 spacel@ .
>> >> 4000005
>> >> ok 18000000 20 spacel@ .
>> >> 4000005
>> >> ok 1f000000 20 spacel@ .
>> >> 4000005
>> >> ok 1ff00000 20 spacel@ .
>> >> 4000005
>> >> ok 1fff0004 20 spacel@ .
>> >> 1fe000
>> >> ok 10000004 20 spacel@ .
>> >> 1fe000
>> >> ok 10000108 20 spacel@ .
>> >> 41000005
>> >> ok 10000040 20 spacel@ .
>> >> 41000005
>> >> ok 1fff0040 20 spacel@ .
>> >> 41000005
>> >> ok 1fff0044 20 spacel@ .
>> >> 1fe000
>> >> ok 1fff0024 20 spacel@ .
>> >> 1fe000
>> >>
>> >> >> At what address are the additional AFX registers located?
>> >> >
>> >> > Here's the complete TurboSPARC IOMMU address map:
>> >> >
>> >> > PA[30:0]   Register                     Access
>> >> > 1000_0000  IOMMU Control                R/W
>> >> > 1000_0004  IOMMU Base Address           R/W
>> >> > 1000_0014  Flush All IOTLB Entries      W
>> >> > 1000_0018  Address Flush                W
>> >> > 1000_1000  Asynchronous Fault Status    R/W
>> >> > 1000_1004  Asynchronous Fault Address   R/W
>> >> > 1000_1010  SBus Slot Configuration 0    R/W
>> >> > 1000_1014  SBus Slot Configuration 1    R/W
>> >> > 1000_1018  SBus Slot Configuration 2    R/W
>> >> > 1000_101C  SBus Slot Configuration 3    R/W
>> >> > 1000_1020  SBus Slot Configuration 4    R/W
>> >> > 1000_1050  Memory Fault Status          R/W
>> >> > 1000_1054  Memory Fault Address         R/W
>> >> > 1000_2000  Module Identification        R/W
>> >> > 1000_3018  Mask Identification          R
>> >> > 1000_4000  AFX Queue Level              W
>> >> > 1000_6000  AFX Queue Level              R
>> >> > 1000_7000  AFX Queue Status             R
>> >>
>> >>
>> >> But if I read it correctly, 0x12fff294 (which makes SunOS crash with -m 32)
>> >> is well above this limit.
>> >
>> > Oh, so I also misread something. You are not talking about the
>> > adjacent pages, but about 16MB increments.
>> >
>> > Earlier I sent a patch for a generic address alias device, would it be
>> > useful for this?
>>
>>
>> Should do as well. But I thought empty_slot is less overhead and
>> easier to debug.
>>
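To make the comparison concrete, the empty_slot approach amounts to something like the minimal model below: a dummy device covering an otherwise unassigned physical range, where reads return zero, writes are dropped, and every access can be traced. All names here are illustrative, not QEMU's actual empty_slot API.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of an "empty slot": a placeholder device mapped
 * over an unassigned physical address range, so stray guest accesses
 * neither fault nor hit real registers. */
typedef struct EmptySlot {
    uint64_t base;   /* first covered physical address */
    uint64_t size;   /* length of the covered range    */
} EmptySlot;

static uint32_t empty_slot_readl(EmptySlot *s, uint64_t offset)
{
    /* trace the access, then return zero instead of faulting */
    fprintf(stderr, "empty_slot: read from %08llx\n",
            (unsigned long long)(s->base + offset));
    return 0;
}

static void empty_slot_writel(EmptySlot *s, uint64_t offset, uint32_t val)
{
    /* the write is traced and then silently dropped */
    fprintf(stderr, "empty_slot: write of %08x to %08llx\n",
            val, (unsigned long long)(s->base + offset));
}
```

Installed over the aliased window, this makes stray kernel accesses show up in the trace instead of crashing the guest, which is the debugging advantage mentioned above.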
Also, the aliasing patch would require one more parameter: the size of the
area which has to be aliased. Unless we implement stubs for all the missing
devices and alias the connected port ranges. And then again, SS-20 doesn't
have aliasing in this area at all. What do you think about this (empty_slot)
solution (apart from the fact that I missed the SoB line)? Meanwhile it's
tested with SunOS 4.1.3U1 too.

>>> Maybe we have a general design problem, perhaps unassigned access
>>> faults should only be triggered inside SBus slots and ignored
>>> elsewhere. If this is true, the generic Sparc32 unassigned access
>>> handler should just ignore the access, and special fault-generating
>>> slots should be installed for empty SBus address ranges.

Agreed that they should be special for SBus, because SS-20 OBP is not happy
with the fault we are currently generating. But otherwise I think qemu does
it correctly. On SS-5:

ok f7ff0000 2f spacel@ .
Data Access Error
ok sfar@ .
f7ff0000
ok 20000000 2f spacel@ .
Data Access Error
ok sfar@ .
20000000
ok 40000000 20 spacel@ .
Data Access Error
ok sfar@ .
40000000

Neither f7ff0000, nor 20000000, nor 40000000 is in SBus range, right?

>> My impression was that SS-5 and SS-20 do unassigned accesses a bit
>> differently.
>> The current IOMMU implementation fits SS-20, which has no aliasing.
>
> It's probably rather the board design than just the IOMMU.

Agreed. That's why I bound the patch to the machine hwdef and not to the
iommu.

>> >> >> > One approach would be that IOMMU_NREGS would be increased to cover
>> >> >> > these registers (with the bump in the savevm version field) and
>> >> >> > iommu_init1() should check the version field to see how much MMIO to
>> >> >> > provide.
>> >> >>
>> >> >>
>> >> >> The problem I see here is that we already have too many registers: we
>> >> >> emulate the SS-20 IOMMU (I guess), while SS-5 and LX seem to have only
>> >> >> 0x20 registers which are aliased all the way.
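That aliasing can be modeled as the address decoder simply ignoring the middle address bits, folding every access in the window onto one small register file. A sketch of this hypothesis follows; the 0x20-byte register-file size comes from the discussion above, everything else is assumption:

```c
#include <stdint.h>

#define IOMMU_BASE      0x10000000u  /* start of the IOMMU window (assumed)   */
#define IOMMU_REG_SIZE  0x20u        /* size of the real register file, power
                                      * of two, per the discussion above      */

/* Fold an arbitrary physical address inside the IOMMU window onto the
 * register it aliases to, by keeping only the low offset bits. */
static uint32_t iommu_alias_decode(uint32_t pa)
{
    return IOMMU_BASE + (pa & (IOMMU_REG_SIZE - 1));
}
```

Under this model 0x14000028 and 0x10000008 decode to the same register, which matches the identical values the SS-5 probes return for both addresses.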
>> >> >>
>> >> >>
>> >> >> > But in order to avoid the savevm version change, iommu_init1() could
>> >> >> > just install dummy MMIO (in the TurboSPARC case), if OBP does not care
>> >> >> > whether the read-back data matches what has been written earlier. Because
>> >> >> > from the OBP point of view this is identical to what your patch results
>> >> >> > in, I'd suppose this approach would also work.
>> >> >>
>> >> >>
>> >> >> OBP doesn't seem to care about these addresses at all. It's only the "MUNIX"
>> >> >> SunOS 4.1.4 kernel that does. The "MUNIX" kernel is the only kernel available
>> >> >> during the installation, so it is currently not possible to install 4.1.4.
>> >> >> Surprisingly, the "GENERIC" kernel which is on the disk after the
>> >> >> installation doesn't try to access these address ranges either, so a disk
>> >> >> image taken from a live system works.
>> >> >>
>> >> >> Actually, access to the non-connected/aliased addresses may also be a
>> >> >> consequence of the phys_page_find bug I mentioned before. When I run the
>> >> >> install with -m 64 and -m 256 it tries to access different
>> >> >> non-connected addresses. It may also be a SunOS bug, of course. 256m used
>> >> >> to be a lot back then.
>> >> >
>> >> > Perhaps with 256MB, memory probing advances blindly from memory to the
>> >> > IOMMU registers. Proll (used before OpenBIOS) did that once, with bad
>> >> > results :-). If this is true, 64M, 128M and 192M should show identical
>> >> > results, and only with close to or equal to 256M should the accesses happen.
>> >>
>> >>
>> >> 32m:  0x12fff294
>> >> 64m:  0x14fff294
>> >> 192m: 0x1cfff294
>> >> 256m: 0x20fff294
>> >>
>> >> Memory probing? It would be strange for the OS to do that itself. The OS
>> >> could just ask OBP how much it has.
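One pattern in those four addresses: each equals the configured RAM size plus the same constant, 0x10fff294, so the faulting access tracks memory size exactly. A quick arithmetic check of that observation (it only restates the numbers above, and makes no claim about how SunOS actually derives the address):

```c
#include <stdint.h>

/* Each observed faulting physical address equals the RAM size plus the
 * constant 0x10fff294. This is purely a restatement of the reported
 * numbers, useful for spotting the linear relationship. */
static uint32_t expected_fault_pa(uint32_t ram_bytes)
{
    return ram_bytes + 0x10fff294u;
}
```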
>> >> Here is the listing where it happens:
>> >>
>> >> _swift_vac_rgnflush:        rd    %psr, %g2
>> >> _swift_vac_rgnflush+4:      andn  %g2, 0x20, %g5
>> >> _swift_vac_rgnflush+8:      mov   %g5, %psr
>> >> _swift_vac_rgnflush+0xc:    nop
>> >> _swift_vac_rgnflush+0x10:   nop
>> >> _swift_vac_rgnflush+0x14:   mov   0x100, %g5
>> >> _swift_vac_rgnflush+0x18:   lda   [%g5] 0x4, %g5
>> >> _swift_vac_rgnflush+0x1c:   sll   %o2, 0x2, %g1
>> >> _swift_vac_rgnflush+0x20:   sll   %g5, 0x4, %g5
>> >> _swift_vac_rgnflush+0x24:   add   %g5, %g1, %g5
>> >> _swift_vac_rgnflush+0x28:   lda   [%g5] 0x20, %g5
>> >>
>> >> _swift_vac_rgnflush+0x28: is the fatal one.
>> >>
>> >> kadb> $c
>> >> _swift_vac_rgnflush(?)
>> >> _vac_rgnflush() + 4
>> >> _hat_setup_kas(0xc00,0xf0447000,0x43a000,0x400,0xf043a000,0x3c0) + 70
>> >> _startup(0xfe000000,0x10000000,0xfa000000,0xf00e2bfc,0x10,0xdbc00) + 1414
>> >> _main(0xf00e0fb4,0xf0007810,0x293ff49f,0xa805209c,0x200,0xf00d1d18) + 14
>> >>
>> >> Unfortunately (but not surprisingly) kadb doesn't allow debugging the
>> >> cache-flush code, so I can't check what is in
>> >> [%g5] (aka sfar) on the real machine when this happens.
>> >
>> > Linux code for the Swift/TurboSPARC VAC flush should be similar.

Do you have an idea why anyone would try reading a value referenced in
sfar? Especially during flushing? I can't imagine a case where it wouldn't
produce a fault.

>> >> But the bug in phys_page_find would explain these accesses: sfar gets
>> >> the wrong address, and then the secondary access happens on this wrong
>> >> address instead of the original one.
>> >
>> > I doubt phys_page_find can be buggy, it is so vital for all architectures.
>>
>>
>> But you've seen the example of buggy behaviour I posted last Friday, right?
>> If it's not phys_page_find, it's either cpu_physical_memory_rw (which
>> is also pretty generic), or
>> the way SS-20 registers devices. Can it be that all the pages must be
>> registered in the proper order?
>
> How about the unassigned access handler, could it be suspected?
Doesn't look like it: it gets a physical address as a parameter. How would
it know the address is wrong?

>> I think it's a pretty rare use case where you have a memory fault (not
>> a translation fault) on an unknown address. You may have such a fault
>> during device probing, but in that case you know what address you are
>> probing, so you don't care about the sync fault address register.
>>
>> Besides, do all architectures have a sync fault address register?
>
> No, I think system-level checks like that and IOMMU-like controls on
> most architectures are very poor compared to Sparc32. Server and
> mainframe systems may be a bit better.

And do we have any mainframe emulated well enough to have a user base and
hence bug reports?

>> >> fwiw the routine is called only once on the real hardware. It sort of
>> >> speaks for your hypothesis about the memory probing. Although it may
>> >> not necessarily probe for memory...

--
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/