Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes

2017-03-20 Thread Julien Cigar
On Mon, Mar 20, 2017 at 08:19:16AM -0700, EricB wrote:
> On 3/20/2017 02:11, Julien Cigar wrote:
> > On Mon, Mar 20, 2017 at 01:54:02AM +0100, Freek Dijkstra wrote:
> >> On 20-03-2017 00:49, Jay Grizzard wrote:
> >>
> >>> I haven't fired up my 6501 to check the exact CPU model (...anyone want
> >>> to buy it? It's of no use to me since Soekris never released the
> >>> information needed to work with the FPGA), but I wonder if this could
> >>> possibly be what's going on with them:
> >>> https://www.theregister.co.uk/2017/02/06/cisco_intel_decline_to_link_product_warning_to_faulty_chip/
> >> That article seems only related to the Intel Atom C2000.
> >> The net6501 contains an older Intel Atom E640, single chip processor
> >> with EG20T companion chip.
> >>
> >> Of course, that doesn't exclude the possibility that something similar
> >> is the cause.
> >>
> >> At least the lesson I learned from Dries is that it really is related to
> >> the CPU, not to the power (as was previously suggested), and that it
> >> might reboot if cooled down for a sufficiently long period of time.
> >>
> >> Well, mine stopped working last December after 5 years of operation as
> >> my home router. I replaced it with something else (a Ubiquiti
> >> EdgeRouter), but I still had it lying around (mostly because I wanted to
> >> reuse the two mSATA disks, which seem uncommon these days).
> >>
> >> Well, 3 months of cooling down should be sufficient, so I couldn't help
> >> checking what happens if I fire it up again, and to my surprise, it did
> >> boot all the way just a few seconds ago!
> >>
> >> Not that it matters much; I guess it is still a bit too unreliable for my
> >> taste.
> > We bought three 6501s 2 years ago (for a redundant firewall/router +
> > HAProxy). One of them died after a year, got replaced under warranty and
> > died again 3 months later, another died just a few weeks ago, and the
> > last one suffers from this puzzling "error LED" problem (works until
> > reboot, "sometimes" boots later, ...).
> >
> > I think there is definitely a hardware/BIOS issue with the board; I'm
> > confident that the problem doesn't come from the power supply or from
> > the CPU overheating.
> >
> > Like others, I'm disappointed that Soekris is so silent on this issue,
> > especially if the root cause doesn't come from a broken BIOS/PCB
> > design/...
> >
> > As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
> > from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)
> I'm curious if ADI Engineering has addressed the issue that Intel
> identified with the RCC-VE.  Also curious if those issues with the Intel
> Avoton/C2000 chips are part of why the Soekris net6801 never went to
> production...

https://www.netgate.com/blog/clock-signal-component-issue.html

I really hope it could be fixed with a BIOS upgrade or so...

It's really strange: it looks like exactly the same issue as with the 6501,
but the chip is not listed as problematic in the Intel document...

(to be continued, I guess ...)

> >
> >> Regards,
> >> Freek
> >>
> >> ___
> >> Soekris-tech mailing list
> >> Soekris-tech@lists.soekris.com
> >> http://lists.soekris.com/mailman/listinfo/soekris-tech


-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.




Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes

2017-03-20 Thread EricB
On 3/20/2017 02:11, Julien Cigar wrote:
> On Mon, Mar 20, 2017 at 01:54:02AM +0100, Freek Dijkstra wrote:
>> On 20-03-2017 00:49, Jay Grizzard wrote:
>>
>>> I haven't fired up my 6501 to check the exact CPU model (...anyone want
>>> to buy it? It's of no use to me since Soekris never released the
>>> information needed to work with the FPGA), but I wonder if this could
>>> possibly be what's going on with them:
>>> https://www.theregister.co.uk/2017/02/06/cisco_intel_decline_to_link_product_warning_to_faulty_chip/
>> That article seems only related to the Intel Atom C2000.
>> The net6501 contains an older Intel Atom E640, single chip processor
>> with EG20T companion chip.
>>
>> Of course, that doesn't exclude the possibility that something similar
>> is the cause.
>>
>> At least the lesson I learned from Dries is that it really is related to
>> the CPU, not to the power (as was previously suggested), and that it
>> might reboot if cooled down for a sufficiently long period of time.
>>
>> Well, mine stopped working last December after 5 years of operation as
>> my home router. I replaced it with something else (a Ubiquiti
>> EdgeRouter), but I still had it lying around (mostly because I wanted to
>> reuse the two mSATA disks, which seem uncommon these days).
>>
>> Well, 3 months of cooling down should be sufficient, so I couldn't help
>> checking what happens if I fire it up again, and to my surprise, it did
>> boot all the way just a few seconds ago!
>>
>> Not that it matters much; I guess it is still a bit too unreliable for my
>> taste.
> We bought three 6501s 2 years ago (for a redundant firewall/router +
> HAProxy). One of them died after a year, got replaced under warranty and
> died again 3 months later, another died just a few weeks ago, and the
> last one suffers from this puzzling "error LED" problem (works until
> reboot, "sometimes" boots later, ...).
>
> I think there is definitely a hardware/BIOS issue with the board; I'm
> confident that the problem doesn't come from the power supply or from
> the CPU overheating.
>
> Like others, I'm disappointed that Soekris is so silent on this issue,
> especially if the root cause doesn't come from a broken BIOS/PCB
> design/...
>
> As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
> from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)
I'm curious if ADI Engineering has addressed the issue that Intel
identified with the RCC-VE.  Also curious if those issues with the Intel
Avoton/C2000 chips are part of why the Soekris net6801 never went to
production...
>
>> Regards,
>> Freek


Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes

2017-03-20 Thread Christian Weisgerber
On 2017-03-19, Dries Verachtert  wrote:

> I have a net6501 soekris device and it has a strange issue: when the
> device is working correctly and I reboot, then it doesn't start
> anymore: [...]
> If I keep the device unplugged from a power source for +/- 8 minutes,
> then it does start again and everything works like it should.

Hardware failure.
My 6501-70 had exactly the same problem.  Since it was still under
warranty, I sent it to Soekris and they replaced the board.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de


Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes

2017-03-20 Thread Andrew Atrens

Hi Dries,

This is a longstanding problem. In my experience, my board was rock
solid for a couple of years until I experienced a power failure, and then
it failed to boot. It was a brick for over two years, but the last time
I tried, it did boot, so now I'm keeping it as an emergency spare.


At the time I thought that the comBIOS might be stuck in a loop waiting
for some hardware to initialize, and upon more reflection I suspect
that's probably the case.


If the loop/jump were NOPed out, one might be able to get the board to
boot without the affected hardware. I'd thought about disassembling the BIOS,
looking for that loop and replacing the JMP with a NOP, but didn't
have time at the time to mess with it and so bought a PC Engines board
instead.  I'm sure that if the Soekris folks were still developing comBIOS
this would be a trivial change to make, and I'm sure they have no
shortage of dead boards to test it with.  :)
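
To make the patching idea concrete, here is a minimal sketch of what
"replace the jump with NOPs" could look like on a raw image. Everything in
it is an assumption for illustration: the function name, the demo bytes and
the offset are made up, and nothing here is taken from the actual comBIOS.
It only relies on standard x86 encodings (0xEB is a short JMP, 0x70-0x7F
are the short conditional jumps, 0x90 is NOP). A real wait loop would more
likely end in a conditional jump than a plain JMP, so the sketch accepts
both.

```python
# Hypothetical sketch: overwrite a 2-byte short jump in a firmware image
# with NOPs so that execution falls through instead of looping back.
JMP_SHORT = 0xEB                    # x86 "JMP rel8"
JCC_SHORT = range(0x70, 0x80)       # x86 "Jcc rel8" (JZ, JNZ, ...)
NOP = 0x90                          # x86 single-byte NOP


def nop_out_short_jmp(image: bytes, offset: int) -> bytes:
    """Return a copy of image with the short jump at offset NOPed out."""
    if offset < 0 or offset + 2 > len(image):
        raise ValueError("offset outside image")
    opcode = image[offset]
    if opcode != JMP_SHORT and opcode not in JCC_SHORT:
        raise ValueError("no short jump at offset")
    patched = bytearray(image)
    # A short jump is opcode + 8-bit displacement, hence two NOPs.
    patched[offset:offset + 2] = bytes([NOP, NOP])
    return bytes(patched)


# Made-up polling loop: IN AL,DX ; TEST AL,1 ; JZ back to start ; RET
demo = bytes([0xEC, 0xA8, 0x01, 0x74, 0xFB, 0xC3])
print(nop_out_short_jmp(demo, 3).hex(" "))
```

The hard part, finding the right offset in a disassembly, is not covered
here; the sketch only shows that the byte-level change itself is small.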


I've often thought that if a sick board would at least boot without the
network ports, it might still have limited use in some other capacity as
opposed to being a complete brick.  It might also be possible to write a
special kernel driver to 'whack' it into shape once up and running.


--Andrew


On 2017-03-20 5:40 AM, Dries Verachtert wrote:

On Mon, Mar 20, 2017 at 10:11 AM, Julien Cigar  wrote:

I think there is definitively an hardware/BIOS issue with the board, I'm
confident that the problem doesn't come from the power supply or an
overheating from the CPU.

Ed, Jay, Grizzard, Freek, Julien,

Thank you for all the information. I bought the board about 2.5 years
ago. The Soekris website mentions 3 years of limited warranty so I'll
try to do an RMA. But in the long run, I'll probably buy something
else because I prefer to have a router/firewall that will still work
after a reboot.

I've actually spent a lot of time fiddling with different serial
cables/ports and trying all combinations of serial speed/parity/stop
bits before I figured out that it only started again if I kept it
disconnected for some time. It would be nice if Soekris could document
this somewhere on their wiki...

Kind regards,
Dries


Re: [Soekris] net6501: doesn't start if not first unplugged for 8 minutes

2017-03-20 Thread Julien Cigar
On Mon, Mar 20, 2017 at 01:54:02AM +0100, Freek Dijkstra wrote:
> On 20-03-2017 00:49, Jay Grizzard wrote:
> 
> > I haven't fired up my 6501 to check the exact CPU model (...anyone want
> > to buy it? It's of no use to me since Soekris never released the
> > information needed to work with the FPGA), but I wonder if this could
> > possibly be what's going on with them:
> > https://www.theregister.co.uk/2017/02/06/cisco_intel_decline_to_link_product_warning_to_faulty_chip/
> 
> That article seems only related to the Intel Atom C2000.
> The net6501 contains an older Intel Atom E640, single chip processor
> with EG20T companion chip.
> 
> Of course, that doesn't exclude the possibility that something similar
> is the cause.
> 
> At least the lesson I learned from Dries is that it really is related to
> the CPU, not to the power (as was previously suggested), and that it
> might reboot if cooled down for a sufficiently long period of time.
> 
> Well, mine stopped working last December after 5 years of operation as
> my home router. I replaced it with something else (a Ubiquiti
> EdgeRouter), but I still had it lying around (mostly because I wanted to
> reuse the two mSATA disks, which seem uncommon these days).
> 
> Well, 3 months of cooling down should be sufficient, so I couldn't help
> checking what happens if I fire it up again, and to my surprise, it did
> boot all the way just a few seconds ago!
> 
> Not that it matters much; I guess it is still a bit too unreliable for my
> taste.

We bought three 6501s 2 years ago (for a redundant firewall/router +
HAProxy). One of them died after a year, got replaced under warranty and
died again 3 months later, another died just a few weeks ago, and the
last one suffers from this puzzling "error LED" problem (works until
reboot, "sometimes" boots later, ...).

I think there is definitely a hardware/BIOS issue with the board; I'm
confident that the problem doesn't come from the power supply or from
the CPU overheating.

Like others, I'm disappointed that Soekris is so silent on this issue,
especially if the root cause doesn't come from a broken BIOS/PCB
design/...

As we run FreeBSD, we finally replaced those boxes with an RCC-VE 4860 1U
from Netgate (https://www.netgate.com/products/rcc-ve-4860-1u.html)

> 
> Regards,
> Freek
> 

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

