Re: pmac_zilog debugging ...
Benjamin Herrenschmidt wrote:
> On Thu, 2008-11-13 at 14:29 -0800, Kevin Diggs wrote:
>> Benjamin Herrenschmidt wrote:
>>> On Thu, 2008-11-13 at 03:38 -0800, Kevin Diggs wrote:
>>>> 12,206 PowerMac Zilog interrupts
>>>>
>>>> Interrupt load is higher without the DMA support. Is it possible that this hardware was not meant to be used without the DMA (i.e. it does not work quite right?)?
>>>
>>> Well, the HW Rx buffer is only 3 bytes so if you have high interrupt latencies you are more likely to lose data...
>>
>> These are not real 8530s any more, right? How certain are we of this? Is it possible that there is a larger buffer when used with the DMA capability ... somehow?
>
> Well, the main thing is that when using DMA, it doesn't need to wait for the kernel to come fetch the bytes, and thus the only latency that matters, if the DMA list is appropriately provisioned, is the bus latency, which is much less likely to be an issue even with a small buffer.

... if the DMA list is appropriately provisioned ... ???

> It's definitely not a basic 8530, it's an ESCC, but I don't think the base Rx buffer in polled mode is any bigger (I may be wrong).

Any idea how we might find out?

>> I tried to put some debug statements where the flow lines are managed. I could have goofed it up. They never produce any output. The latest attempt used nocrtscts, which should have disabled flow control. That, coupled with the fact that a 250 MHz 750GX is talking to a 486dx4 at 1200 - 9600 baud, I would have thought would reduce the chance the PowerMac would fall behind?
>
> Have you disabled flow control with both the old macserial -and- pmac_zilog, and are you still experiencing the same problems? (i.e. one works and the other one doesn't?)

I reinstalled the disk with 2.4.31 and reran the tests. With nocrtscts added to the pppd options macserial still works fine. Even at 115,200 it seemed fine (I expected to see some type of packet errors via pppstats or ifconfig). I did not, however, verify that pppd actually does something with this option.

I can't get pmac_zilog to work even at 1200 baud. That is pretty slow.

>> You would need to explain to me the advantage of doing DMA in this case???
>
> Well, if I set up, for example, 128 DMA descriptors for 1 byte each, then the chip will be able to DMA up to 128 bytes without CPU intervention, and is thus a -lot- less likely to overflow its FIFO. It's essentially a way to have the DMA engine operate as an external FIFO. As the CPU fetches the bytes, it can recycle the descriptors at the end of the list, effectively acting as some kind of ring buffer.
>
> We can more easily do DMA for Tx, but while this can improve performance and lower interrupt usage, it is not a correctness issue in the sense that Tx isn't -losing- data today; it's Rx that is a potential problem.

Ah! I think I get it. I was thinking that CPU intervention would be required after each byte. But the descriptors are more like linked commands for the DMA hardware.

So, I'm on board with this approach. Since I don't really know what I am doing, how do you recommend I proceed?

Would it be correct to say that DMA would also free the CPU from doing I/O accesses, which are MUCH slower than normal memory accesses?

> Cheers,
> Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
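The "one descriptor per byte, recycled as a ring" idea quoted above can be modelled in plain user-space C. This is only a sketch of the bookkeeping, not DBDMA code: the `rx_desc` layout, the `done` flag, and all function names are hypothetical, and a real DBDMA command descriptor looks different. The ring size of 128 is the figure from Ben's example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 128   /* one descriptor per received byte, as in the example */

/* Hypothetical descriptor; a real DBDMA command has a different layout. */
struct rx_desc {
    uint8_t data;       /* the one-byte buffer this descriptor covers */
    uint8_t done;       /* set by the "hardware" once the byte has landed */
};

struct rx_ring {
    struct rx_desc desc[RING_SIZE];
    size_t hw_next;     /* next descriptor the DMA engine will fill */
    size_t cpu_next;    /* next descriptor the CPU will drain */
    size_t overruns;    /* bytes dropped because no descriptor was free */
};

void ring_init(struct rx_ring *r)
{
    *r = (struct rx_ring){0};
}

/* Models the DMA engine storing one received byte without CPU help. */
void hw_receive(struct rx_ring *r, uint8_t byte)
{
    struct rx_desc *d = &r->desc[r->hw_next];

    if (d->done) {      /* the CPU has not recycled this slot yet */
        r->overruns++;
        return;
    }
    d->data = byte;
    d->done = 1;
    r->hw_next = (r->hw_next + 1) % RING_SIZE;
}

/* CPU side: copy out completed bytes and hand the descriptors back. */
size_t cpu_drain(struct rx_ring *r, uint8_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL || max == 0);
    while (n < max && r->desc[r->cpu_next].done) {
        out[n++] = r->desc[r->cpu_next].data;
        r->desc[r->cpu_next].done = 0;          /* recycle the descriptor */
        r->cpu_next = (r->cpu_next + 1) % RING_SIZE;
    }
    return n;
}
```

In this model data is lost only when the CPU falls a full 128 bytes behind, rather than 3 bytes behind as with the bare Rx buffer, which is the "external FIFO" effect described in the quoted explanation.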
Re: pmac_zilog debugging ...
>> Well, the main thing is that when using DMA, it doesn't need to wait for the kernel to come fetch the bytes, and thus the only latency that matters, if the DMA list is appropriately provisioned, is the bus latency, which is much less likely to be an issue even with a small buffer.
>
> ... if the DMA list is appropriately provisioned ... ???

Well, if you have some descriptors pointing to available buffers :-) If you let that run out too, then the chip will miss

>> It's definitely not a basic 8530, it's an ESCC, but I don't think the base Rx buffer in polled mode is any bigger (I may be wrong).
>
> Any idea how we might find out?

Nope. Trial and error? :-)

> I reinstalled the disk with 2.4.31 and reran the tests. With nocrtscts added to the pppd options macserial still works fine. Even at 115,200 it seemed fine (I expected to see some type of packet errors via pppstats or ifconfig). I did not, however, verify that pppd actually does something with this option.
>
> I can't get pmac_zilog to work even at 1200 baud. That is pretty slow.

That's definitely strange. I would expect the kernel to be able to get interrupts fast enough to service a 1200 baud serial port. Maybe there's something else wrong, or another driver causing undue interrupt latencies.

Out of curiosity, check that IDE properly unmasks interrupts (hdparm -u1 /dev/hda).

> Ah! I think I get it. I was thinking that CPU intervention would be required after each byte. But the descriptors are more like linked commands for the DMA hardware.

Yes.

> So, I'm on board with this approach. Since I don't really know what I am doing, how do you recommend I proceed?

Google for a document called MacTech.pdf, which contains various documentation for bits of the ancestor of the I/O chip in your machine, along with a description of the DBDMA engine :-)

Something else you can do is to look at how it's properly used by other drivers such as bmac, and look at some of the Darwin source code for reference on how the HW works.

> Would it be correct to say that DMA would also free the CPU from doing I/O accesses, which are MUCH slower than normal memory accesses?

To a certain extent, yes.

Cheers,
Ben.
Re: pmac_zilog debugging ...
Benjamin Herrenschmidt wrote:
> That's definitely strange. I would expect the kernel to be able to get interrupts fast enough to service a 1200 baud serial port. Maybe there's something else wrong, or another driver causing undue interrupt latencies.

As far as I can see the system is NOT busy. I see no evidence of excessive interrupt loading. It does have an Adaptec 2940 u2w SCSI card, an ATI video card, and a USB/firewire card. The SCSI card has some disks on it. The other two cards are unused. I guess, in theory, something in my 2.6.27 kernel could be causing one of the two unused cards to throw spurious interrupts? I still think the hardware is misbehaving.

> Out of curiosity, check that IDE properly unmasks interrupts (hdparm -u1 /dev/hda).

This is an 8600. It is SCSI only (the onboard controller is the MESH).

>> So, I'm on board with this approach. Since I don't really know what I am doing, how do you recommend I proceed?
>
> Google for a document called MacTech.pdf, which contains various documentation for bits of the ancestor of the I/O chip in your machine, along with a description of the DBDMA engine :-)
>
> Something else you can do is to look at how it's properly used by other drivers such as bmac, and look at some of the Darwin source code for reference on how the HW works.

Where might one find older Darwin source?

> Cheers,
> Ben.
Re: pmac_zilog debugging ...
On Mon, 2008-11-17 at 02:21 -0800, Kevin Diggs wrote:
> Benjamin Herrenschmidt wrote:
>> That's definitely strange. I would expect the kernel to be able to get interrupts fast enough to service a 1200 baud serial port. Maybe there's something else wrong, or another driver causing undue interrupt latencies.
>
> As far as I can see the system is NOT busy. I see no evidence of excessive interrupt loading. It does have an Adaptec 2940 u2w SCSI card, an ATI video card, and a USB/firewire card. The SCSI card has some disks on it. The other two cards are unused. I guess, in theory, something in my 2.6.27 kernel could be causing one of the two unused cards to throw spurious interrupts? I still think the hardware is misbehaving.

That's strange. Maybe one of the drivers is occasionally hogging interrupts. Well, there may also be a bug in the code :-)

One thing you can try is to disable DMA in macserial (shouldn't be hard to hack) and see if it degrades the same way.

>> Out of curiosity, check that IDE properly unmasks interrupts (hdparm -u1 /dev/hda).
>
> This is an 8600. It is SCSI only (the onboard controller is the MESH).

Ah yes.

>>> So, I'm on board with this approach. Since I don't really know what I am doing, how do you recommend I proceed?
>>
>> Google for a document called MacTech.pdf, which contains various documentation for bits of the ancestor of the I/O chip in your machine, along with a description of the DBDMA engine :-)
>>
>> Something else you can do is to look at how it's properly used by other drivers such as bmac, and look at some of the Darwin source code for reference on how the HW works.
>
> Where might one find older Darwin source?

Apple still has most of them back to 10.0, and even the recent ones still have an SCC serial driver afaik.

Ben.

>> Cheers,
>> Ben.
Re: pmac_zilog debugging ...
On Thu, 2008-11-13 at 03:38 -0800, Kevin Diggs wrote:
> 12,206 PowerMac Zilog interrupts
>
> Interrupt load is higher without the DMA support. Is it possible that this hardware was not meant to be used without the DMA (i.e. it does not work quite right?)?

Well, the HW Rx buffer is only 3 bytes so if you have high interrupt latencies you are more likely to lose data...

Now, as I said, have you looked at flow control? It's a likely cause of problems and it's possible that pmac_zilog doesn't do it the way macserial did...

Regarding DMA, it's possible to implement, though there were interesting issues with the way it was done in macserial; it should be done differently in pmac_zilog. I think the only approach that really works properly (though it's fugly) is what Apple does in OSX, I think, which is to have a DMA descriptor per input byte (no need for a huge DMA buffer anyway).

Cheers,
Ben.
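To put rough numbers on the 3-byte Rx buffer mentioned above, the following sketch computes how long the CPU can leave the port unserviced before the buffer overflows. It assumes 10 bits per character on the wire (8 data bits, no parity, 1 start and 1 stop bit), which the thread does not state, and it ignores the receive shift register; the function names are illustrative only.

```c
#include <assert.h>

/* Microseconds needed to receive one character at `baud`,
 * given the number of bits per character on the wire. */
double char_time_us(unsigned baud, unsigned bits_per_char)
{
    assert(baud > 0);
    return 1e6 * bits_per_char / baud;
}

/* Approximate time the CPU can ignore the port before a receive
 * buffer of `fifo_bytes` fills up and the next byte is lost. */
double latency_budget_us(unsigned baud, unsigned bits_per_char,
                         unsigned fifo_bytes)
{
    return fifo_bytes * char_time_us(baud, bits_per_char);
}
```

Under those assumptions, `latency_budget_us(1200, 10, 3)` comes to 25,000 us (25 ms), 9600 baud gives about 3.1 ms, and 115,200 baud gives about 260 us. A budget of 25 ms is very generous for interrupt servicing, so a failure at 1200 baud is hard to explain by interrupt latency alone.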
Re: pmac_zilog debugging ...
Benjamin Herrenschmidt wrote:
> On Thu, 2008-11-13 at 03:38 -0800, Kevin Diggs wrote:
>> 12,206 PowerMac Zilog interrupts
>>
>> Interrupt load is higher without the DMA support. Is it possible that this hardware was not meant to be used without the DMA (i.e. it does not work quite right?)?
>
> Well, the HW Rx buffer is only 3 bytes so if you have high interrupt latencies you are more likely to lose data...

These are not real 8530s any more, right? How certain are we of this? Is it possible that there is a larger buffer when used with the DMA capability ... somehow?

> Now, as I said, have you looked at flow control? It's a likely cause of problems and it's possible that pmac_zilog doesn't do it the way macserial did...

I tried to put some debug statements where the flow lines are managed. I could have goofed it up. They never produce any output. The latest attempt used nocrtscts, which should have disabled flow control. That, coupled with the fact that a 250 MHz 750GX is talking to a 486dx4 at 1200 - 9600 baud, I would have thought would reduce the chance the PowerMac would fall behind?

> Regarding DMA, it's possible to implement, though there were interesting issues with the way it was done in macserial; it should be done differently in pmac_zilog. I think the only approach that really works properly (though it's fugly) is what Apple does in OSX, I think, which is to have a DMA descriptor per input byte (no need for a huge DMA buffer anyway).

You would need to explain to me the advantage of doing DMA in this case???

> Cheers,
> Ben.
Re: pmac_zilog debugging ...
On Sat, 2008-11-08 at 16:52 +1100, Paul Mackerras wrote:
> Kevin Diggs writes:
>> pppd ttyS0 1200 satellites: netmask 255.255.255.0 lock crtscts mru 1064 noauth debug kdebug 7 logfile /tmp/pppd.log local
>>
>> to connect an 8600 to a laptop via ppp the link will lock up in short order from payloaded pings. Any advice on how to figure out where it is locking up? This command works fine to connect two x86 laptops. At 1200 it does take a while for an xterm to show up, though.
>
> Try it without the crtscts (actually, use nocrtscts instead of crtscts). That will tell us whether it is hardware flow control causing problems. IIRC, those mac serial ports didn't have all of RTS, CTS, CD and DTR, but I don't recall which one(s) were missing.
>
> Also make sure you don't have the xonxoff option.

iirc, RTS is missing and on MacOS we typically have an option to use DTR instead. There are also some issues with CTS polarity, which differs between the actual serial ports and some internal modems, or something along those lines.

Ben
pmac_zilog debugging ...
Hi,

If I use a command similar to:

    pppd ttyS0 1200 satellites: netmask 255.255.255.0 lock crtscts mru 1064 noauth debug kdebug 7 logfile /tmp/pppd.log local

to connect an 8600 to a laptop via ppp the link will lock up in short order from payloaded pings. Any advice on how to figure out where it is locking up? This command works fine to connect two x86 laptops. At 1200 it does take a while for an xterm to show up, though.

kevin
Re: pmac_zilog debugging ...
On Fri, 2008-11-07 at 13:38 -0800, Kevin Diggs wrote:
> to connect an 8600 to a laptop via ppp the link will lock up in short order from payloaded pings. Any advice on how to figure out where it is locking up? This command works fine to connect two x86 laptops. At 1200 it does take a while for an xterm to show up, though.

Flow control might be busted. You may need to invert the polarity of the CTS line sampling. Let me know what you find out.

Cheers,
Ben.
Re: pmac_zilog debugging ...
Kevin Diggs writes:
> pppd ttyS0 1200 satellites: netmask 255.255.255.0 lock crtscts mru 1064 noauth debug kdebug 7 logfile /tmp/pppd.log local
>
> to connect an 8600 to a laptop via ppp the link will lock up in short order from payloaded pings. Any advice on how to figure out where it is locking up? This command works fine to connect two x86 laptops. At 1200 it does take a while for an xterm to show up, though.

Try it without the crtscts (actually, use nocrtscts instead of crtscts). That will tell us whether it is hardware flow control causing problems. IIRC, those mac serial ports didn't have all of RTS, CTS, CD and DTR, but I don't recall which one(s) were missing.

Also make sure you don't have the xonxoff option.

Paul.
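One way to check whether hardware or software flow control is really off on the port is to read back the tty's termios flags while the link is up. The sketch below assumes that pppd's crtscts and xonxoff options end up as the CRTSCTS and IXON/IXOFF termios flags on Linux; the helper names and the FLOW_* constants are made up for illustration.

```c
#define _DEFAULT_SOURCE     /* exposes CRTSCTS on glibc */
#include <assert.h>
#include <stddef.h>
#include <termios.h>

enum { FLOW_HW = 1, FLOW_SW_OUT = 2, FLOW_SW_IN = 4 };

/* Reports which flow-control modes a termios setting has enabled. */
int flow_control_modes(const struct termios *t)
{
    int modes = 0;

    assert(t != NULL);
    if (t->c_cflag & CRTSCTS)
        modes |= FLOW_HW;       /* RTS/CTS hardware flow control */
    if (t->c_iflag & IXON)
        modes |= FLOW_SW_OUT;   /* XON/XOFF honoured on output */
    if (t->c_iflag & IXOFF)
        modes |= FLOW_SW_IN;    /* XON/XOFF sent for input */
    return modes;
}

/* Reads the settings of an already-open tty; -1 if tcgetattr() fails. */
int tty_flow_control_modes(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)
        return -1;
    return flow_control_modes(&t);
}
```

Calling `tty_flow_control_modes()` on a descriptor opened on the serial device should return 0 when neither kind of flow control is active. Running `stty -a -F /dev/ttyS0` and looking for `-crtscts`, `-ixon` and `-ixoff` should give the same information without writing any code.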