Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-30 Thread Tony Brian Albers
Hi Michelle,

I'd take a close look at:

1. SATA cable: try replacing it. Check that the connectors on the 
mainboard are OK (firmly seated, no obvious soldering issues). Check that 
the cable does not have any kinks or sharp bends (a bend radius smaller 
than a golf ball might give you issues, especially with faster drives).

2. Power supply: try using another connector from the PSU to the SSD and, 
if possible, another power supply. It might be unstable to a point where 
only the SSD suffers from it and not the rest of the system (this can 
actually happen).

3. Memory: try removing the DIMMs, cleaning the slots with a soft brush 
and reseating them.

4. Interference: do you have anything near the system that generates a 
magnetic field or other kinds of electromagnetic radiation? I once had 
a Sun E250 in a small business that failed with the weirdest errors from 
the most obscure components. After seeing another E250 having the exact 
same issues in the same location, we found out that it was caused by the 
power supply to their 12V halogen lights in the ceiling... After 
replacing the power supply to the lights, the errors went away.

Oh yeah, and upgrade the BIOS if possible. Just in case.

HTH

/tony



On 29/08/2021 14.51, Michelle wrote:
> AHCI was already enabled. The IDE mode was legacy, so I've turned that
> off and a few other settings, but there's only so far I can go.
> 
> So far, the OS is booting without the need for re-installation as it
> was already AHCI... but... well
> 
> Michelle.
> 
> 
> On Sun, 2021-08-29 at 14:56 +0300, Toomas Soome via openindiana-discuss
> wrote:
>>> On 29. Aug 2021, at 14:31, Michelle  wrote:
>>>
>>> I'm sat here, not knowing quite what I'm dealing with.
>>>
>>> OI recent build on HP N54L, as you may remember I hit a problem on the
>>> 8th August with rpool encountering an error on the SSD with the OS.
>>>
>>> I replaced the drive and had the same thing straight away. Replaced
>>> with another drive and took the unit down to basics including a RAM
>>> check and rebuilt it. It's been running until today... same thing.
>>>
>>> On the messages screen I see at just gone 00:00 this morning...
>>> Sense Key: aborted command
>>> Vendor Gen-ATA error code 0x3
>>> Rpool has encountered an uncorrectable IO failure and has been
>>> suspended. Zpool clear will be required before the pool can be written
>>> to
>>>
>>> The rpool message repeated at 00:05, 00:22, 02:54 and at 08:54 ...and
>>> it's now stopped responding.
>>>
>>> So... I've now got no idea where to take this. On the face of it, the
>>> hardware is fine... on the face of it. However, there's something wrong
>>> here and given the history, I'm not sure where to start.
>>>
>>> Part of me thinks I'm looking at hardware...
>>>
>>> Power? Hmmm... why does the SSD fail instead of the spinning rust
>>> drives?
>>>
>>> Yet another hard drive? This is number three now, throwing suspicion at
>>> other aspects.
>>>
>>> I am reticent to just replace with another hard drive and carry on. I
>>> think it may not be to blame here.
>>>
>>> Grateful for people's thoughts.
>>>
>>> Michelle.
>>>
>>
>> Apparently this can be multiple things… Does your system allow you to
>> set AHCI (SATA) mode from BIOS setup? This may help.
>>
>> also see
>> https://docs.oracle.com/cd/E19253-01/820-7273/ggmsj/index.html
>>
>> In general, you really want to keep away from IDE…
>>
>> rgds,
>> toomas
>> ___
>> openindiana-discuss mailing list
>> openindiana-discuss@openindiana.org
>> https://openindiana.org/mailman/listinfo/openindiana-discuss


-- 
Tony Albers - Systems Architect - Data Department, Royal Danish Library, 
Victor Albecks Vej 1, 8000 Aarhus C, Denmark
Tel: +45 2566 2383 - CVR/SE: 2898 8842 - EAN: 5798000792142


Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-29 Thread Michelle
AHCI was already enabled. The IDE mode was legacy, so I've turned that
off and a few other settings, but there's only so far I can go.

So far, the OS is booting without the need for re-installation as it
was already AHCI... but... well

Michelle.




Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-29 Thread Michelle
I've done workaround number 5 in the Oracle document Toomas linked.

The system has been working for years, and I'm not sure about the
AHCI... will have to look at that as that will require an OS rebuild, I
believe.

The only thing that changed before this started happening was the OS
upgrade. But obviously there's hardware degradation over time to deal
with.

I'll put a backup and then rebuild on the radar for the next few days,
and see what other suggestions come up.

Michelle.





Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-29 Thread Michelle
root@jaguar:/var# iostat -En
c6d1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Model: KINGSTON SA400S Revision:  Serial No: 50026B73801BCEA 
Size: 120.03GB <120033607680 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 
c4t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA  Product: WDC WD60EFRX-68L Revision: 0A82 Serial No: WD-WX21DA84EH0F
Size: 6001.18GB <6001175126016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c4t2d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA  Product: WDC WD60EFRX-68L Revision: 0A82 Serial No: WD-WX51DB880RJ4
Size: 6001.18GB <6001175126016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c4t3d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA  Product: WDC WD60EFAX-68J Revision: 0A83 Serial No: WD-WX22DC0AUC60
Size: 6001.18GB <6001175126016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
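With several drives it is easy to miss a single non-zero counter in output like the above, so here is a minimal Python sketch that parses `iostat -En` text and reports only the devices with a non-zero error counter. The function names are made up for illustration, and the parsing assumes the layout shown in this message (a device name such as c6d1 or c4t0d0 at the start of a record, followed by "Name: number" pairs). As far as I know these counters are kept since boot, so all zeros shortly after a reboot do not prove much.

```python
import re

# Counter names that appear as "Name: <integer>" pairs in the output above.
COUNTER_RE = re.compile(
    r"(Soft Errors|Hard Errors|Transport Errors|Media Error|"
    r"Device Not Ready|No Device|Recoverable|Illegal Request|"
    r"Predictive Failure Analysis): (\d+)"
)

# Assumption: a device record starts with a name like c6d1 or c4t0d0.
DEVICE_RE = re.compile(r"^(c\d+(?:t\w+)?d\d+)\s")


def parse_iostat_en(text):
    """Return {device: {counter_name: value}} parsed from `iostat -En` text."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = match.group(1)
            devices[current] = {}
        if current is not None:
            # Collect every counter on this line for the current device.
            for name, value in COUNTER_RE.findall(line):
                devices[current][name] = int(value)
    return devices


def nonzero_counters(devices):
    """Keep only devices that have at least one non-zero counter."""
    result = {}
    for device, counters in devices.items():
        bad = {name: value for name, value in counters.items() if value}
        if bad:
            result[device] = bad
    return result
```

To use it, save the output of `iostat -En` to a file, read the file into a string and pass it to `parse_iostat_en`, then pass the result to `nonzero_counters`. For the output in this message the result would be empty, since every counter is 0.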


The output from fmadm faulty shows only the error from when I brought
up the unit without the tank pool inserted... just in case.

That's as far as I know how to go.

Michelle.




Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-29 Thread Toomas Soome via openindiana-discuss




Apparently this can be multiple things… Does your system allow you to set 
AHCI (SATA) mode from BIOS setup? This may help.

also see https://docs.oracle.com/cd/E19253-01/820-7273/ggmsj/index.html

In general, you really want to keep away from IDE…

rgds,
toomas


Re: [OpenIndiana-discuss] Rpool uncorrectable error

2021-08-29 Thread Michelle
I have the machine up and running and issued a zpool clear against
rpool. At least as long as it lasts.

Reading things from ten years ago, I've disabled the microcode... it
seems to be all I can do. I'm not expecting that to be the issue but...
hey.

Nothing in /var/log/syslog gives any clue. The last message was at 23:05
from sendmail, and then one at 11:46 about a failure to gethostbyaddr
after a reboot.
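
When a log is too long to read by eye, a small filter helps. Below is a minimal Python sketch that pulls out lines resembling the console messages quoted in this thread ("Sense Key", "error code", "uncorrectable IO failure"). The function names and the pattern list are made up for illustration. Which file to scan is also an assumption: on illumos, kernel and driver messages usually end up in /var/adm/messages, not /var/log/syslog, but that depends on the syslog configuration of the machine.

```python
import re

# Substrings taken from the console messages quoted in this thread.
# "i/?o" accepts both the "IO" and "I/O" spellings.
PATTERNS = re.compile(
    r"sense key|error code|uncorrectable i/?o failure", re.IGNORECASE
)


def suspicious_lines(lines):
    """Yield (line_number, text) for every line that matches PATTERNS."""
    for number, line in enumerate(lines, start=1):
        if PATTERNS.search(line):
            yield number, line.rstrip("\n")


def scan_file(path):
    """Scan one log file and return the matching lines as a list."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return list(suspicious_lines(handle))
```

For example, `scan_file("/var/adm/messages")` would return a list of (line number, text) pairs, empty if nothing matched. The pattern list is deliberately short and can be extended with whatever else turns up on the console.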


