> :I strongly doubt that this is a CAM isr problem- the error pattern isn't
> :entirely clear from what you said, but it looks more like a FIFO or CACHE
> :LINE sized type of problem- it looks to be < 16 bytes, but not a short
> :count. Because this isn't one of the wacky systems I spent most of my
> :career on at Sun where the first and usual suspect was a system memory
> :cache line because IO wasn't cache coherent on Suns between the Sun
> :3/{50,60,75,150} and the advent of the SuperSparc Viking Chipset, I'd
> :guess a FIFO somewhere in the I/O movement path.
> :
> :Justin- any changes lately where flushing a FIFO in the Adaptec at the end
> :of transfer might have been spoodged?
> :
> :-matt
>
> The problem is definitely aligned in some way. Here's a diff of
> a hexdump of one error. Sometimes I lose a whole page, sometimes two
> pages, sometimes 16 bytes, but the error is always page aligned.
>
> 1536c1536
> < 0005ff0 3333 2033 3434 3434 7c20 207c 3030 3030
> ---
> > 0005ff0 7365 3d20 3120 093b 2309 6720 6f6c 6162
>
> A cache-line problem would fit the symptoms. I know it isn't the
> hardware... this 1xCPU PPro/200 system has been with me for several
> years and this test didn't fail like this a month ago. When I updated
> the machine last (unfortunately w/ about a month's worth of changes),
> my buildworlds started failing with odd errors.
>
> I then switched away from the failing buildworlds (which take an hour)
> and started doing cp -r's and then diff -r's (takes only 20 min), and as
> you can see I'm still seeing the problem.
>
> Maybe this is DMA related. Perhaps the cache is not getting cleared?
> Maybe an MMU optimization someone threw in recently?

That's possible too- I'll admit I'm a bit hazy on i386 specifics. I/O there
has always "just worked", so for all I know there's a required I/O flush
command when you switch mappings.

Gawd I hate these kinds of problems.
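
To pin down the alignment pattern described above (16 bytes, one page, two pages, always ending on a page boundary), something like the following could be run over a source file and its bad copy rather than eyeballing hexdump diffs. This is only a rough sketch and not anything from the tree: the names mismatch_runs, describe and compare_files are made up for illustration, and the 4096-byte page size and the 16-byte compare granularity (one hexdump line) are assumptions.

```python
PAGE_SIZE = 4096  # assumption: i386 page size
CHUNK = 16        # assumption: compare one hexdump line (16 bytes) at a time


def mismatch_runs(a, b, chunk=CHUNK):
    """Return (start, end) byte ranges, end exclusive, where a and b differ.

    The data is compared chunk by chunk, and adjacent differing chunks
    are merged into a single run.
    """
    runs = []
    total = max(len(a), len(b))
    for off in range(0, total, chunk):
        if a[off:off + chunk] != b[off:off + chunk]:
            end = min(off + chunk, total)
            if runs and runs[-1][1] == off:
                # Extend the previous run instead of starting a new one.
                runs[-1] = (runs[-1][0], end)
            else:
                runs.append((off, end))
    return runs


def describe(run, page=PAGE_SIZE):
    """Summarize one run: where it starts, how long it is, and whether
    either end falls exactly on a page boundary."""
    start, end = run
    return {
        "start": start,
        "length": end - start,
        "start_page_aligned": start % page == 0,
        "end_page_aligned": end % page == 0,
    }


def compare_files(path_a, path_b):
    """Read both files whole and describe every differing run."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return [describe(run) for run in mismatch_runs(a, b)]
```

Calling compare_files() on an original file and its copy returns one entry per damaged region. If the damaged regions consistently start or end on a multiple of 4096, that points toward page mapping or DMA handling; if they cluster at some smaller size instead, a FIFO or cache line looks more likely.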