Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes
On Mon, Mar 20, 2017 at 08:19:16AM -0700, EricB wrote:
> On 3/20/2017 02:11, Julien Cigar wrote:
> > On Mon, Mar 20, 2017 at 01:54:02AM +0100, Freek Dijkstra wrote:
> >> On 20-03-2017 00:49, Jay Grizzard wrote:
> >>
> >>> I haven't fired up my 6501 to check the exact CPU model (...anyone want
> >>> to buy it? It's of no use to me since Soekris never released the
> >>> information needed to work with the FPGA), but I wonder if this could
> >>> possibly be what's going on with them:
> >>> https://www.theregister.co.uk/2017/02/06/cisco_intel_decline_to_link_product_warning_to_faulty_chip/
> >> That article seems only related to the Intel Atom C2000.
> >> The net6501 contains an older Intel Atom E640, single chip processor
> >> with EG20T companion chip.
> >>
> >> Of course, that doesn't exclude the possibility that something similar
> >> is the cause.
> >>
> >> At least the lesson I learned from Dries is that it really is related to
> >> the CPU, not to the power (as was previously suggested), and that it
> >> might reboot if cooled down for a sufficiently long period of time.
> >>
> >> Well, mine stopped working last December after 5 years of operation as
> >> my home router. I replaced it with something else (a Ubiquiti
> >> EdgeRouter), but I still had it lying around (mostly because I wanted to
> >> reuse the two mSATA disks, which seem uncommon these days).
> >>
> >> Well, 3 months of cooling down should be sufficient, so I couldn't help
> >> checking what happens if I fire it up again, and to my surprise, it did
> >> boot all the way just a few seconds ago!
> >>
> >> Not that it matters much, I guess it still is a bit too unreliable for my
> >> taste.
> > We bought 3 6501s 2 years ago (for a redundant firewall/router +
> > HAProxy). One of them died after a year, got replaced under warranty and
> > died again 3 months later, another died just a few weeks ago, and the
> > last one suffers from this puzzling "error LED" problem (works until
> > reboot, "sometimes" boots later, ...).
> >
> > I think there is definitely a hardware/BIOS issue with the board. I'm
> > confident that the problem doesn't come from the power supply or from
> > the CPU overheating.
> >
> > Like others, I'm disappointed that Soekris is so silent on this issue,
> > especially if the root cause doesn't come from a broken BIOS/PCB
> > design/...
> >
> > As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
> > from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)
> I'm curious if ADI Engineering has addressed the issue that Intel
> identified with the RCC-VE. Also curious if those issues with the Intel
> Avoton/C2000 chips are part of why the Soekris net6801 never went to
> production...

https://www.netgate.com/blog/clock-signal-component-issue.html

I really hope it could be fixed with a BIOS upgrade or so...

It's really strange: it looks like exactly the same issue as with the 6501,
but the chip is not listed as problematic in the Intel document...
(to be followed I guess ...)

> >> Regards,
> >> Freek

--
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

___
Soekris-tech mailing list
Soekris-tech@lists.soekris.com
http://lists.soekris.com/mailman/listinfo/soekris-tech
Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes
On 3/20/2017 02:11, Julien Cigar wrote:
> [...]
> As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
> from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)

I'm curious if ADI Engineering has addressed the issue that Intel
identified with the RCC-VE. Also curious if those issues with the Intel
Avoton/C2000 chips are part of why the Soekris net6801 never went to
production...
Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes
On 2017-03-19, Dries Verachtert wrote:
> I have a net6501 soekris device and it has a strange issue: when the
> device is working correctly and I reboot, then it doesn't start
> anymore:
[...]
> If I keep the device unplugged from a power source for +/- 8 minutes,
> then it does start again and everything works like it should.

Hardware failure. My 6501-70 had exactly the same problem. Since
it was still under warranty, I sent it to Soekris and they replaced
the board.

--
Christian "naddy" Weisgerber                          na...@mips.inka.de
Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes
Hi Dries,

This is a longstanding problem. In my experience, my board was rock solid
for a couple of years until I experienced a power failure, and then it
failed to boot. It was a brick for over two years, but the last time I
tried, it did boot, so now I'm keeping it as an emergency spare.

At the time I thought that the comBIOS might be stuck in a loop waiting for
some hardware to initialize, and on further reflection I suspect that's
probably the case. If the loop/jump were no-oped, one might be able to get
the board to boot sans the affected hardware. I'd thought about
disassembling the BIOS, looking for that loop and replacing the JMP with a
NOP, but didn't have time to mess with it back then and so bought a
PC Engines board instead.

I'm sure that if the Soekris folks were still developing comBIOS this would
be a trivial change to make, and I'm sure they have no shortage of dead
boards to test it with. :)

I've often thought that if a sick board would at least boot without the
network ports, it might still have limited use in some other capacity, as
opposed to being a complete brick. It might also be possible to write a
special kernel driver to 'whack' it into shape once up and running.

--Andrew

On 2017-03-20 5:40 AM, Dries Verachtert wrote:
> On Mon, Mar 20, 2017 at 10:11 AM, Julien Cigar wrote:
>> I think there is definitely a hardware/BIOS issue with the board. I'm
>> confident that the problem doesn't come from the power supply or from
>> the CPU overheating.
>
> Ed, Jay, Grizzard, Freek, Julien,
>
> Thank you for all the information. I bought the board about 2.5 years ago.
> The Soekris website mentions 3 years of limited warranty, so I'll try to do
> an RMA. But in the long run, I'll probably buy something else because I
> prefer to have a router/firewall that will still work after a reboot.
>
> I've actually spent a lot of time fiddling with different serial
> cables/ports and trying all combinations of serial speed/parity/stop bits
> before I figured out that it only started again if I kept it disconnected
> for some time. It would be nice if Soekris could document this somewhere on
> their wiki...
>
> Kind regards,
> Dries
Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes
On Mon, Mar 20, 2017 at 01:54:02AM +0100, Freek Dijkstra wrote:
> [...]
> Well, 3 months of cooling down should be sufficient, so I couldn't help
> checking what happens if I fire it up again, and to my surprise, it did
> boot all the way just a few seconds ago!
>
> Not that it matters much, I guess it still is a bit too unreliable for my
> taste.

We bought 3 6501s 2 years ago (for a redundant firewall/router +
HAProxy). One of them died after a year, got replaced under warranty and
died again 3 months later, another died just a few weeks ago, and the
last one suffers from this puzzling "error LED" problem (works until
reboot, "sometimes" boots later, ...).

I think there is definitely a hardware/BIOS issue with the board. I'm
confident that the problem doesn't come from the power supply or from
the CPU overheating.

Like others, I'm disappointed that Soekris is so silent on this issue,
especially if the root cause doesn't come from a broken BIOS/PCB
design/...

As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)

--
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)