Re: Startup IPI (was: Re: test13-pre3)
On Wed, 20 Dec 2000, Petr Vandrovec wrote:

> /usr/bin/time says that program runs for 3.40 - 3.56secs, so after dividing

 Well, the test looks reasonable if the system load is low.  Still the
performance is surprisingly low -- after changing the transfer width to 16
bits I ran the test on my dual P5MMX system equipped with an old ISA VGA
card and I achieved 10.74ms for VGA RAM accesses and 586.6us for uncached
main memory accesses.

> by 1000 I get 3.4ms... Maybe I should complain to VIA or to Matrox that
> it is piece of crap ?

 For VIA -- definitely.  I don't think Matrox is at fault, though.

> My order was simple: no rambus memory, dual PIII at least on 800MHz
> and UDMA66. Yes, maybe I should buy ServerWorks instead of VIA, but
> I hoped...

 At least ServerWorks claims they are willing to cooperate with us
although results seem to be questionable so far...

  Maciej

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available          +

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: Startup IPI (was: Re: test13-pre3)
On 20 Dec 00 at 19:52, Maciej W. Rozycki wrote:

> > it kills machine; only problem is that 0x1300 wr-rd cycles to VGA aperture
> > take 3.48ms, and this does not correspond with needed 200us udelay.
>
>  Hmm, how do you calculate the time?  Assuming AGP4x runs at 133MHz and a
> read or write cycle lasts for a single clock tick (I don't know exact AGP
> specs -- please correct me if I'm wrong), I find 0x1300 cycles to finish
> in about 73usecs.  The loop execution overhead may double the result and
> it will still fit within 300usecs.

It is easy:

    int mfd;
    volatile unsigned long* memory;
    int i;

    mfd = open("/dev/mem", O_RDWR);
    memory = mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, mfd, 0x000B8000);
    close(mfd);
    for (i = 0; i < 0x1300 * 1000; i++) {
        *memory = i;
        *memory;
    }
    munmap(memory, 4096);

/usr/bin/time says that program runs for 3.40 - 3.56secs, so after dividing
by 1000 I get 3.4ms... Maybe I should complain to VIA or to Matrox that
it is piece of crap ?

> > Without VIA datasheet I cannot try to disable some PCI features to find
> > which one is culprit, so I'm sorry.
>
>  But you may complain to the manufacturer and/or change hardware.  I'm
> still uncertain the delay should stay in...

My order was simple: no rambus memory, dual PIII at least on 800MHz
and UDMA66. Yes, maybe I should buy ServerWorks instead of VIA, but
I hoped...

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]
Re: Startup IPI (was: Re: test13-pre3)
On Tue, 19 Dec 2000, Petr Vandrovec wrote:

> I did... So it uses 'xchg %eax,APIC_ICR' instead of 'movl %eax,APIC_ICR',
> yes (as verified in generated code...)? No change, still dies, as expected
> (do not forget that before it dies, it can do ~0x1300 write-read cycles

 I've forgotten indeed...

> from videomemory (AGP4x), so secondary CPU just does some thinking before

 This might be the time needed to deliver the IPI.  Remember that the
inter-APIC bus is serial and not that fast.

> it kills machine; only problem is that 0x1300 wr-rd cycles to VGA aperture
> take 3.48ms, and this does not correspond with needed 200us udelay.

 Hmm, how do you calculate the time?  Assuming AGP4x runs at 133MHz and a
read or write cycle lasts for a single clock tick (I don't know exact AGP
specs -- please correct me if I'm wrong), I find 0x1300 cycles to finish
in about 73usecs.  The loop execution overhead may double the result and
it will still fit within 300usecs.

> Maybe chipset decides to do something when second CPU cannot obtain
> bus access in 10 pci cycles?).

 I guess a certain initial cycle from the AP confuses the chipset somehow.

> Do you (or anyone else) have code which can dump MTRR registers of each
> of CPU before mtrr driver takes over them? At least first CPU does not have
> any problem...

 A brief look at arch/i386/kernel/mtrr.c reveals the bootstrap CPU's
settings do not get changed.  As a result they may always be fetched from
the /proc filesystem.  For APs you probably need to tweak sources.

> I even placed 'wbinvd' and 'wbinvd; cpuid' before sending startup IPI,
> but it does not matter. Secondary CPU just does not finish even first
> instruction when first CPU reads from videoram again and again.

 Well, the CPU obeys the writeback and the invalidation, but does the
chipset?

> Without VIA datasheet I cannot try to disable some PCI features to find
> which one is culprit, so I'm sorry.

 But you may complain to the manufacturer and/or change hardware.  I'm
still uncertain the delay should stay in...

  Maciej

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available          +
Re: Startup IPI (was: Re: test13-pre3)
On Tue, 19 Dec 2000 [EMAIL PROTECTED] wrote:

[snip of Petr's system info]

> Okay. Mine, as far as I can tell, only depends on the L2 cache being set
> to '64MB' instead of '512MB' in the field 'L2 Cache Cacheable Size' under
> 'Chipset Features Setup' on my BIOS. This is unfortunately the latest BIOS
> for this motherboard available. It's a TD5TH version 1.1
>
> H. Have you tried booting with an hercmono (if you can get your paws
> on one, that is)?.
>
>
> Right after 'Freeing unused kernel memory...'
> I get a kernel BUG at buffer.c:821 with this setting at 256MB, -test12
> without fbcon. With fbcon it would appear to switch video mode and
> freeze with a black screen with cursor at the bottom, at that point.
>
> And then I get an oops dump in the swapper task. I'll try decoding it in a
> little while, since I'll have to manually input it.

Here we go: I DID have to copy it onto paper and type it in after rebooting.

>>EIP; c01354a6    <=====
Trace; c0217d72
Trace; c02180da
Trace; c0186e85
Trace; c019e238
Trace; c01a22b7
Trace; c019fb0e
Trace; c01a21d0
Trace; c010c2d1
Trace; c010c4b8
Trace; c0108d40
Trace; c010ac00
Trace; c0108d40
Trace; c0108dbc
Trace; c0108dd2
Trace; c0105000
Trace; c01001cf
Code;  c01354a6
00000000 <_EIP>:
Code;  c01354a6    <=====
   0:   0f 0b                 ud2a      <=====
Code;  c01354a8
   2:   83 c4 0c              add    $0xc,%esp
Code;  c01354ab
   5:   90                    nop
Code;  c01354ac
   6:   8d 74 26 00           lea    0x0(%esi,1),%esi
Code;  c01354b0
   a:   8d 5e 28              lea    0x28(%esi),%ebx
Code;  c01354b3
   d:   8d 46 2c              lea    0x2c(%esi),%eax
Code;  c01354b6
  10:   39 46 2c              cmp    %eax,0x2c(%esi)
Code;  c01354b9
  13:   74 00                 je     15 <_EIP+0x15> c01354bb
Re: Startup IPI (was: Re: test13-pre3)
On Tue, 19 Dec 2000, Petr Vandrovec wrote:

> On 18 Dec 00 at 21:59, [EMAIL PROTECTED] wrote:
> >
> > Pardon me for not fully grokking the issues here and possibly coming to a
> > wrong conclusion, but this has to do with SMP systems crashing at APIC
> > init time, just before penguin display (with fbcon at least)? If so, I
> > have a board that does this with certain cache settings made in the BIOS.
> > It's a 430HX chipset with two Pentium MMX 200s installed, *ancient* BIOS.
>
> I'm using BIOS dated 19/07/2000, last week it was latest BIOS on Gigabyte
> site for 6VXD7 (two PIII/800). I did not look for updates today yet.
>
> I tried to change C2P Concurrency & Master (en/dis), AGP Mode (1x/2x/4x),
> Power mgmt - Display Activity (monitor/ignore), PNP OS (yes/no)
> (24 combinations total), but any combination dies if there are read
> accesses to videoram during startup. Today I finally dug out some
> old ISA VGA (Realtek), plugged it in and - it dies too. So it does not
> depend on bus type.

Okay. Mine, as far as I can tell, only depends on the L2 cache being set
to '64MB' instead of '512MB' in the field 'L2 Cache Cacheable Size' under
'Chipset Features Setup' on my BIOS. This is unfortunately the latest BIOS
for this motherboard available. It's a TD5TH version 1.1

H. Have you tried booting with an hercmono (if you can get your paws
on one, that is)?.

Right after 'Freeing unused kernel memory...'
I get a kernel BUG at buffer.c:821 with this setting at 256MB, -test12
without fbcon. With fbcon it would appear to switch video mode and
freeze with a black screen with cursor at the bottom, at that point.

And then I get an oops dump in the swapper task. I'll try decoding it in a
little while, since I'll have to manually input it.
Re: Startup IPI (was: Re: test13-pre3)
On 19 Dec 00 at 19:30, Maciej W. Rozycki wrote:

> > When I replaced address with 0xC01B8000 (some cacheable memory), it worked
> > fine. When replaced with 0xC00C8000 (supposedly unused address, but maybe
> > it is just set as cacheable in chipset), it works too.
>
>  Hmm, a read from an uncached location could result in sending delayed
> APIC writes to the bus in case of an incorrect MTRR setting for the APIC
> space.  Could you please disable CONFIG_X86_GOOD_APIC?  This will result
> in using locked cycles for APIC writes, i.e. immediate bus accesses.

I did... So it uses 'xchg %eax,APIC_ICR' instead of 'movl %eax,APIC_ICR',
yes (as verified in generated code...)? No change, still dies, as expected
(do not forget that before it dies, it can do ~0x1300 write-read cycles
from videomemory (AGP4x), so secondary CPU just does some thinking before
it kills machine; only problem is that 0x1300 wr-rd cycles to VGA aperture
take 3.48ms, and this does not correspond with needed 200us udelay.
Maybe chipset decides to do something when second CPU cannot obtain
bus access in 10 pci cycles?).

>  Please also check MTRR settings, especially for the APIC range.  They
> might need fixing.

Do you (or anyone else) have code which can dump MTRR registers of each
of CPU before mtrr driver takes over them? At least first CPU does not have
any problem...

> > at the beginning of trampoline.S, and then boot with 'no-scroll', but
> > character in upper left corner did not change, so secondary CPU probably
> > even did not start code fetches. That's all I can say until
> > I put non-AGP card into the box (but I need AGP, so it is not real
> > option).
>
>  An easier way to check an application processor is alive could be
> enabling the speaker -- after setting it up by the bootstrap CPU it only
> takes three instructions to set bits 0 and 1 of port 0x61 and the result
> is not volatile.  A LED diagnostic display would be better, but typical
> PCs don't have one, unfortunately.

Fortunately secondary CPU starts with AL & 3 == 0, so it is just one
'outb %al,$0x61' instruction. When first CPU reads memory in loop, it
beeps and beeps and beeps. If first CPU does 'udelay(300);', it works fine
(I put mdelay(100) after enabling speaker, so I hear short 1000Hz beep
during boot).

So secondary CPU does not correctly execute even first instruction. But it
either locks bus forever (looks like that because ATX poweroff button
does not work anymore), or confuses first CPU so much that it also cannot
continue...

> > Yeah. Just do not read video memory when another CPU starts. I'll try
> > disabling cache on both CPUs, maybe it will make some difference, as
> > secondary CPU should start with caches disabled. But maybe that it is
> > just broken AGP bus, and nothing else. But until I find what's really
> > broken on my hardware, I'd like to leave 'udelay(300)' in.
>
>  If the problem is with write combining then disabling the cache won't
> help, I'm afraid.

Read loop reads one short from one constant address, so any write* should
not make any problem.

> > instead of string as soon as second CPU started (no, it did not race due
> > to missing console_lock; before first printk() secondary CPU should fill
> > whole screen with letter '2'. It did not).
                      ^ digit. I'm sorry ;-)
>
>  I would still verify (i.e. with the speaker) that's really the second CPU
> causing the corruption.

I even placed 'wbinvd' and 'wbinvd; cpuid' before sending startup IPI,
but it does not matter. Secondary CPU just does not finish even first
instruction when first CPU reads from videoram again and again.

Without VIA datasheet I cannot try to disable some PCI features to find
which one is culprit, so I'm sorry.

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]
Re: Startup IPI (was: Re: test13-pre3)
On Tue, 19 Dec 2000, Petr Vandrovec wrote:

> Uh. It took couple of hours to find it. Just place
>
> { int i; volatile unsigned short* p = 0xC00B8000; for (i = 0; i < 6553600; i++) { *p; } }(**)
>
> instead of udelay(300) and this loop does not finish. Same for
> unsigned long* p. inb/outb(0x3C0) are ok. Writes are OK too. Only
> simple fetches from videoram kill it.
>
> When I replaced address with 0xC01B8000 (some cacheable memory), it worked
> fine. When replaced with 0xC00C8000 (supposedly unused address, but maybe
> it is just set as cacheable in chipset), it works too.

 Hmm, a read from an uncached location could result in sending delayed
APIC writes to the bus in case of an incorrect MTRR setting for the APIC
space.  Could you please disable CONFIG_X86_GOOD_APIC?  This will result
in using locked cycles for APIC writes, i.e. immediate bus accesses.

 Please also check MTRR settings, especially for the APIC range.  They
might need fixing.

> at the beginning of trampoline.S, and then boot with 'no-scroll', but
> character in upper left corner did not change, so secondary CPU probably
> even did not start code fetches. That's all I can say until
> I put non-AGP card into the box (but I need AGP, so it is not real
> option).

 An easier way to check an application processor is alive could be
enabling the speaker -- after setting it up by the bootstrap CPU it only
takes three instructions to set bits 0 and 1 of port 0x61 and the result
is not volatile.  A LED diagnostic display would be better, but typical
PCs don't have one, unfortunately.

> Yeah. Just do not read video memory when another CPU starts. I'll try
> disabling cache on both CPUs, maybe it will make some difference, as
> secondary CPU should start with caches disabled. But maybe that it is
> just broken AGP bus, and nothing else. But until I find what's really
> broken on my hardware, I'd like to leave 'udelay(300)' in.

 If the problem is with write combining then disabling the cache won't
help, I'm afraid.

> (*) When I was calling directly
>       vt_console_print(NULL, "Message1\n", 9);
>       vt_console_print(NULL, "Message2\n", 9);
>     instead of printk, I got
>       Message1
>       Messag<0x..><0x..><0x00><0x80><0x..><0x80><0x..><0x80>...
>     - wrong text with wrong length, so it probably started fetching garbage
>     instead of string as soon as second CPU started (no, it did not race due
>     to missing console_lock; before first printk() secondary CPU should fill
>     whole screen with letter '2'. It did not).

 I would still verify (i.e. with the speaker) that's really the second CPU
causing the corruption.

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available          +
Re: Startup IPI (was: Re: test13-pre3)
> > In the case where it boots does it also report mismatched MTRRs ??
>
> Yes, it complains. But BIOS correctly reports x1/x2 depending on
> number of CPUs I plug into motherboard, so I believe that it did
> some initialization before it starts loading OS.

That may explain the hangs. Intel docs don't seem to guarantee what happens
if the MTRRs don't match across CPUs.
Re: Startup IPI (was: Re: test13-pre3)
On 18 Dec 00 at 21:59, [EMAIL PROTECTED] wrote:

>
> Pardon me for not fully grokking the issues here and possibly coming to a
> wrong conclusion, but this has to do with SMP systems crashing at APIC
> init time, just before penguin display (with fbcon at least)? If so, I
> have a board that does this with certain cache settings made in the BIOS.
> It's a 430HX chipset with two Pentium MMX 200s installed, *ancient* BIOS.

I'm using BIOS dated 19/07/2000, last week it was latest BIOS on Gigabyte
site for 6VXD7 (two PIII/800). I did not look for updates today yet.

I tried to change C2P Concurrency & Master (en/dis), AGP Mode (1x/2x/4x),
Power mgmt - Display Activity (monitor/ignore), PNP OS (yes/no)
(24 combinations total), but any combination dies if there are read
accesses to videoram during startup. Today I finally dug out some
old ISA VGA (Realtek), plugged it in and - it dies too. So it does not
depend on bus type.

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]
Re: Startup IPI (was: Re: test13-pre3)
On 18 Dec 00 at 23:51, Alan Cox wrote:

> > Yeah. Just do not read video memory when another CPU starts. I'll try
> > disabling cache on both CPUs, maybe it will make some difference, as
> > secondary CPU should start with caches disabled. But maybe that it is
> > just broken AGP bus, and nothing else. But until I find what's really
> > broken on my hardware, I'd like to leave 'udelay(300)' in.
>
> In the case where it boots does it also report mismatched MTRRs ??

Yes, it complains. But BIOS correctly reports x1/x2 depending on
number of CPUs I plug into motherboard, so I believe that it did
some initialization before it starts loading OS.

calibrating APIC timer ...
..... CPU clock speed is 797.0452 MHz.
..... host bus clock speed is 99.6305 MHz.
cpu: 0, clocks: 996305, slice: 332101
CPU0
cpu: 1, clocks: 996305, slice: 332101
CPU1
checking TSC synchronization across CPUs: passed.
Setting commenced=1, go go go
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: probably your BIOS does not setup all CPUs

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]
Re: Startup IPI (was: Re: test13-pre3)
Pardon me for not fully grokking the issues here and possibly coming to a
wrong conclusion, but this has to do with SMP systems crashing at APIC
init time, just before penguin display (with fbcon at least)? If so, I
have a board that does this with certain cache settings made in the BIOS.
It's a 430HX chipset with two Pentium MMX 200s installed, *ancient* BIOS.

-- Ferret
Re: Startup IPI (was: Re: test13-pre3)
> Yeah. Just do not read video memory when another CPU starts. I'll try
> disabling cache on both CPUs, maybe it will make some difference, as
> secondary CPU should start with caches disabled. But maybe that it is
> just broken AGP bus, and nothing else. But until I find what's really
> broken on my hardware, I'd like to leave 'udelay(300)' in.

In the case where it boots does it also report mismatched MTRRs ??
Re: Startup IPI (was: Re: test13-pre3)
On 18 Dec 00 at 19:44, Maciej W. Rozycki wrote:

> > No, I'll try. It occurred with either AGP (Matrox G200/G400/G450) or
> > PCI (S3, CL5434) VGA adapter. I did not try real ISA VGA...
>
>  Oops, I've forgotten there exist non-ISA display adapters. ;-)  Just try
> if accessing one bus or another changes the behaviour.

Uh. It took couple of hours to find it. Just place

    { int i;
      volatile unsigned short* p = 0xC00B8000;
      for (i = 0; i < 6553600; i++) { *p; }
    }(**)

instead of udelay(300) and this loop does not finish. Same for
unsigned long* p. inb/outb(0x3C0) are ok. Writes are OK too. Only
simple fetches from videoram kill it.

When I replaced address with 0xC01B8000 (some cacheable memory), it worked
fine. When replaced with 0xC00C8000 (supposedly unused address, but maybe
it is just set as cacheable in chipset), it works too.

Symptoms of lockup are same as hangup in printk() without udelay(300), only
problem is that 'vt_console_print' (*) does not do fetches from videoram,
it does stores only...

Placing this loop before sending startup IPI, or just below udelay(300) is
OK (modulo that this loop takes so long that secondary CPU complains about
no callin received).

I even tried to add:

    mov  $0xB800,%ax
    mov  %ax,%ds
    movw %ax,0

at the beginning of trampoline.S, and then boot with 'no-scroll', but
character in upper left corner did not change, so secondary CPU probably
even did not start code fetches. That's all I can say until
I put non-AGP card into the box (but I need AGP, so it is not real
option).

> > and VT82C686 (rev 22) ISA bridge. I tried to request documentation
> > of 694X from VIA, but I did not hear from them. They have probably
> > some secrets hidden in their hardware...
>
>  They want to keep the competition from being bug-compatible, it would
> seem...

Yeah. Just do not read video memory when another CPU starts. I'll try
disabling cache on both CPUs, maybe it will make some difference, as
secondary CPU should start with caches disabled. But maybe that it is
just broken AGP bus, and nothing else. But until I find what's really
broken on my hardware, I'd like to leave 'udelay(300)' in.

(*) When I was calling directly
      vt_console_print(NULL, "Message1\n", 9);
      vt_console_print(NULL, "Message2\n", 9);
    instead of printk, I got
      Message1
      Messag<0x..><0x..><0x00><0x80><0x..><0x80><0x..><0x80>...
    - wrong text with wrong length, so it probably started fetching garbage
    instead of string as soon as second CPU started (no, it did not race due
    to missing console_lock; before first printk() secondary CPU should fill
    whole screen with letter '2'. It did not).

(**) When I had '*p = i; *p' in loop, from visual inspection it was dying
     in range i=0x1380-0x13FF (blue background, cyan letter with
     diacritics). End of guessing.

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]
Re: Startup IPI (was: Re: test13-pre3)
On Mon, 18 Dec 2000, Petr Vandrovec wrote:

> It is possible. But it is hard to track, as it works with serial console,
> and it is not possible to paint characters to VGA screen, as vgacon uses
> hardware panning instead of scrolling :-( And if it dies, shift-pageup
> apparently does not work... And filling whole 32KB with some char
> does not work, as it changes timing too much...

 Just disable the problematic printk()s for making tests (you may just
undefine APIC_DEBUG in include/asm-i386/apic.h) -- we already know what is
going to be printed here. ;-)

> No, I'll try. It occurred with either AGP (Matrox G200/G400/G450) or
> PCI (S3, CL5434) VGA adapter. I did not try real ISA VGA...

 Oops, I've forgotten there exist non-ISA display adapters. ;-)  Just try
if accessing one bus or another changes the behaviour.

> Yes. I could understand if I had to place bigger udelay() after INIT IPI,
> as this can cause some specific PIII initialization and Intel says that
> there should not be any MESI traffic during this init (at least I understand

 Hmm, weird -- for integrated APICs an INIT IPI is about the same as
shutdown apart from the fact an NMI won't wake up a CPU (that might
actually be the local APIC not passing NMIs to the CPU in this case,
though).

> it that way). But after startup IPI it should just start executing code...
> I tried to put 'wbinvd' here and there, but it did not make any change,
> only udelay() between startup IPI cmd and first printk() did.

 Hmm, a startup IPI is rather fast so the code just after issuing it may
somehow interact with the application CPU's trampoline.  But try to
disable CONFIG_X86_GOOD_APIC, yet (you may configure for classic Pentium,
for example), and see if that changes anything (it shouldn't, but who
knows...).

> I have no idea. I know that board has VT82C694X (rev c4) host and PCI bridge,

 Just look at the board and search for an I/O APIC chip. ;-)

> and VT82C686 (rev 22) ISA bridge. I tried to request documentation
> of 694X from VIA, but I did not hear from them. They have probably
> some secrets hidden in their hardware...

 They want to keep the competition from being bug-compatible, it would
seem...

  Maciej

--
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: [EMAIL PROTECTED], PGP key available          +
Re: Startup IPI (was: Re: test13-pre3)
On 18 Dec 00 at 18:18, Maciej W. Rozycki wrote:

> On Mon, 18 Dec 2000, Petr Vandrovec wrote:
>
> > No. Without udelay() before first printk() it just does not boot on my
> > motherboard. There were two choices: either remove all printk() from
> > these loops (define Dprintk to null), or add udelay(x), where x >= 200,
> > before first printk. I sent patch twice to linux-kernel, and to
> > [EMAIL PROTECTED], and nobody said anything against it.
>
>  I see.  But are you sure this is the right fix?  You may be covering
> the real problem with this arbitrary delay.

It is possible. But it is hard to track, as it works with serial console,
and it is not possible to paint characters to VGA screen, as vgacon uses
hardware panning instead of scrolling :-( And if it dies, shift-pageup
apparently does not work... And filling whole 32KB with some char
does not work, as it changes timing too much...

> > analyzer (or if I should come with motherboard), I'm willing to continue
> > testing. But current idea is that inb/outb done by cursor positioning
> > code is incompatible with something else done in secondary CPU startup.
>
>  Have you tried putting explicit display adapter (other ISA) I/O accesses
> after sending the IPI to see if they trigger the problem?  IPIs are

No, I'll try. It occurred with either AGP (Matrox G200/G400/G450) or
PCI (S3, CL5434) VGA adapter. I did not try real ISA VGA...

> > Without delay() both CPU die, and board does not react to anything except
> > hard reset anymore (and sometimes it does not react even to hard reset; look
> > for my messages during last week).
>
>  Now THAT is weird.  It might mean a chipset bug.  Still no idea how an
> inter-APIC message might trigger it -- it completely bypasses MB

Yes. I could understand if I had to place bigger udelay() after INIT IPI,
as this can cause some specific PIII initialization and Intel says that
there should not be any MESI traffic during this init (at least I understand
it that way). But after startup IPI it should just start executing code...
I tried to put 'wbinvd' here and there, but it did not make any change,
only udelay() between startup IPI cmd and first printk() did.

> chipset...  Hmm, maybe not...  Is your I/O APIC discrete (like Intel's
> 82093AA) or integrated?  It appears there are vendors manufacturing I/O
> APIC clones and this may imply new problems, sigh...

I have no idea. I know that board has VT82C694X (rev c4) host and PCI bridge,
and VT82C686 (rev 22) ISA bridge. I tried to request documentation
of 694X from VIA, but I did not hear from them. They have probably
some secrets hidden in their hardware...

                                    Best regards,
                                        Petr Vandrovec
                                        [EMAIL PROTECTED]