Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-23 Thread jason matthews



On 9/23/15 2:12 AM, Richard Patterson wrote:

Any ideas?



Two failed drives at the same time. Sigh.

Normally the odds would be in your favor but we are talking about WD.

What happens if you put one of those drives back in? You may have 
to cycle through them to find one that doesn't hang the bus/HBA, etc.


You could also try limiting the time the OS waits on an I/O operation. 
Put this in /etc/system and reboot.


* spend no more than 10 seconds on any one I/O
set sd:sd_io_time=10

Then see if you can get the pool to import.

j.


___
openindiana-discuss mailing list
openindiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-23 Thread jason matthews



On 9/23/15 12:10 PM, jason matthews wrote:



On 9/23/15 2:12 AM, Richard Patterson wrote:

Any ideas?



Consider the possibility that you have selected the wrong drive as the 
faulty one. I suspect this is the case, as three drives going bad 
simultaneously is extremely unlikely.


j.



Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-23 Thread Richard Patterson

Ok, I identified the possibly faulty drives, and took them out. (Doing
a full test on them in another box)

Trying to import the backups zpool without the 2 drives (raidz2 so
should be ok):

zpool import -f backups

Hangs the server after about 10 mins. (I had top running, and it's
stopped updating)

Any ideas?

Regards

Richard



Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-23 Thread Nikola M
If the drives were set up as JBOD (so the on-disk format does not depend 
on the disk controller), then the drives themselves should work on any 
machine, controller, and OS that uses OpenZFS.


Of course, it is best not to have SAS/SATA expanders, and to use a SAS 
controller for SAS drives and a SATA controller for SATA drives.


I would leave all the disks in and try to import the pool with another 
install: another OpenIndiana, another illumos distro, or even FreeBSD or 
Linux with ZFS on Linux.


You can install OI Hipster in VirtualBox and then zfs send/receive the 
system and user datasets to a newly made and emptied BE on the main 
system disk/rpool. That way you can also boot OI Hipster and see whether 
importing behaves differently with the newest illumos.
I bet it's a hardware failure in either the controller, the expanders, 
and/or the disk(s).




Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-23 Thread Nikola M

On 09/23/15 11:25 AM, Nikola M wrote:
> I bet it's some hardware failure to either controller, expanders 
> and/or disk(s).

Sorry for not posting below the quotes in my previous message; I got 
distracted.






Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-22 Thread Richard Patterson

Ok, I removed the controller again, booted it up fine, changed
time-out to 180.

Removed the backups zpool (zpool export), shut down and re-installed the
controller.

Booted up fine.

zpool import -f backups

^^Hangs at this point.

A while ago, i had a script running in cron to report on zpool status,
etc.

I must admit, my checking of the reports was a bit lax of late, and I
found the last few had this in the status:

NAME                       STATE     READ WRITE CKSUM
backups                    ONLINE       0     0     0
  raidz2-0                 ONLINE       0     0     0
    c3t50014EE6AEEE4A0Bd0  ONLINE       0     0     0
    c3t50014EE6AEEE5C8Bd0  ONLINE       0     0     0
    c3t50014EE0591526CAd0  ONLINE       0     0     0
    c3t50014EE659992C01d0  ONLINE       0     0     0
    c3t50014EE6599926FBd0  ONLINE       0     0    12
    c3t50014EE65999205Fd0  ONLINE       0     0     4
    c3t50014EE659992081d0  ONLINE       0     0     0
    c3t50014EE659993822d0  ONLINE       0     0     0

So, looks like I might have at least 1 failed disk... will need to try
to identify the failed one(s), and pull it / them out.
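
For a cron report like the one mentioned above, a small filter can flag
the rows worth reading. This is only a minimal sketch in Python: the
function name devices_with_errors is made up for illustration, and it
assumes the report keeps the whitespace-separated NAME / STATE / READ /
WRITE / CKSUM columns that zpool status prints. A nonzero CKSUM count
points at a device worth testing; it does not by itself prove the drive
(rather than cabling or the controller) is at fault.

```python
def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with any nonzero counter.

    Hypothetical helper: expects rows shaped like NAME STATE READ WRITE CKSUM.
    """
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a five-column row.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        try:
            read, write, cksum = (int(f) for f in fields[2:5])
        except ValueError:
            # Abbreviated counters (e.g. "1.2K") are ignored in this sketch.
            continue
        if read or write or cksum:
            flagged[fields[0]] = (read, write, cksum)
    return flagged


# Rows taken from the status report quoted in this message.
SAMPLE_STATUS = """\
NAME                       STATE     READ WRITE CKSUM
backups                    ONLINE       0     0     0
  raidz2-0                 ONLINE       0     0     0
    c3t50014EE6599926FBd0  ONLINE       0     0    12
    c3t50014EE65999205Fd0  ONLINE       0     0     4
    c3t50014EE659992081d0  ONLINE       0     0     0
"""

for name, (read, write, cksum) in devices_with_errors(SAMPLE_STATUS).items():
    print("%s read=%d write=%d cksum=%d" % (name, read, write, cksum))
```

Run against the report above, this would single out the two devices with
checksum errors, which are the ones to test first.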

Thanks so far, I'll update when I've had a chance to pull the faulty
disk(s).

-- 
Richard


Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-21 Thread Nikola M

On 09/21/15 07:10 AM, Richard Patterson wrote:

> Hi All,
>
> I have a server which will not boot... We had a power cut last week, but
> prior to this the system had been up for well over a year.
>
> Now, on bootup, I get messages saying
>
> svc:/system/boot-archive:default: Method or service exit timed out.
> Killing contract 16
> svc:/system/boot-archive:default: Method "/lib/svc/method/boot-archive"
> failed due to signal KILL
>
> It then drops me into maintenance mode.

I am in the dark here, but maybe this is your problem:
There is a bug in the GRUB shipped as part of illumos that breaks in 
STAGE2 if menu.lst is too big (e.g. you have many boot environments; 
how many is undetermined, maybe over 25 BEs?).
The problem is known to illumos but has not been addressed or fixed, and 
word is that the GRUB2 implementation in S11 has the same problem, only 
it uses a bigger buffer, so the list of BEs can be a little longer.

illumos people are also working on a new boot loader, etc.

It is reported that, if that is the problem, you can delete some BEs 
from /rpool/boot/grub/menu.lst

and then it will boot correctly (and later delete the ZFS datasets of 
the removed BEs).
This is done by booting from a LiveDVD/USB and importing the zpool to 
access menu.lst.

> Once logged in, I can't run bootadm update-archive, as it says another
> instance of bootadm is already running. (then console hangs)
>
> Most commands hang the console, such as zpool list, or zfs list.
>
> I also get a very short amount of time to do anything, as it seems to
> run out of memory (or swap) after a few minutes
>
> WARNING: /etc/svc/volatile: File system full, swap space limit exceeded
>
> # swap -l
> No swap devices configured
> # swap -s
> 17112k bytes allocated + 1623k reserved = 18744k used, 10439268k available
>
> If I run swap -s again, the figures are the same, except the available
> is reducing.
>
> OpenIndiana Build oi_151a9 64-bit (illumos 52e13e00ba)
>
> 32GB Ram, Quad-core AMD CPU
>
> rpool is on a single 250GB SATA drive.

It is not healthy to be left with no room on disk; it kills ZFS 
performance, etc.

You can free up some space on rpool by moving data to an external 
drive, for starters.




Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-21 Thread Richard Patterson

Thanks Nikola,

On 21/09/2015 07:37, Nikola M wrote:
> There is a bug in the GRUB that is shipped as a part of illumos,
> that breaks with STAGE2 if menu.lst is too big (e.g. you have many
> Boot environments , undetermined how much, maybe over 25 BEs?).
> 

I don't think I've encountered this issue, as I only have 4 BEs listed
in grub.

>> 
>> OpenIndiana Build oi_151a9 64-bit (illumos 52e13e00ba)
>> 
>> 32GB Ram, Quad-core AMD CPU
>> 
>> rpool is on a single 250GB SATA drive.
> 
> it is not healthy to be left with no room on disk, it kills ZFS 
> performance, etc. You can clear some space from rpool to some
> external drive, for starters.
> 

The rpool has plenty of space left, 56.1G in use, 173G free, so
doesn't look like that's the problem.

I also did a scrub on rpool, which shows no errors.

As a test, I removed the other 8 drives (took the controller out
completely), and the system boots fine... seems to be something to do
with the large "backups" zpool, as the system will not boot with it
attached.

To make things worse, I can only work on the server at the console,
which is in a remote office from where I work normally, so I'm unable
to try things until I get to that office on an evening.

Many thanks

Richard


Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-21 Thread jason matthews

Try the following:
Delete the zpool cache, /etc/zfs/zpool.cache.
Reboot the system. It should boot, since it won't try to mount the 
backups zpool.

Change the boot-archive timeout to two minutes.
Manually import the backups pool.
Clear the remaining services in maintenance mode.

That should get it going.

You probably have a failing disk. Reboot to see if the system can 
complete the boot procedure. If bootadm times out again, use iostat 
-nMxC to look for a disk that is 100% busy, has outrageous service 
times, or shows errors. You might even try looking for a disk with a 
solid activity light. If you find such a disk, pull it.
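
If eyeballing the iostat output is awkward on a hung console, the same
check can be sketched in Python against captured output. This is only an
illustration under stated assumptions: the function name
suspicious_disks and both thresholds are made up, the sample numbers
are invented rather than taken from this system, and the column names
(%b for percent busy, asvc_t for active service time in ms, device) are
assumed to match the header that iostat -xn prints.

```python
def suspicious_disks(iostat_text, busy_pct=90.0, svc_ms=1000.0):
    """Return [(device, %b, asvc_t)] for rows at or above either threshold.

    Hypothetical helper; thresholds are arbitrary illustration values.
    """
    header = None
    hits = []
    for line in iostat_text.splitlines():
        fields = line.split()
        # The column header is the line naming both "%b" and "device".
        if "device" in fields and "%b" in fields:
            header = fields
            continue
        # Ignore banner lines and rows that do not match the header width.
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            busy = float(row["%b"])
            asvc = float(row["asvc_t"])
        except (KeyError, ValueError):
            continue
        if busy >= busy_pct or asvc >= svc_ms:
            hits.append((row["device"], busy, asvc))
    return hits


# Invented sample in the assumed `iostat -xnM` layout.
SAMPLE_IOSTAT = """\
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.5    1.2    0.0    0.0  0.0  0.0    0.0    4.1   0   1 c2t0d0
    0.0    0.0    0.0    0.0  0.0  1.0    0.0 8500.0   0 100 c3t50014EE6599926FBd0
"""

for device, busy, asvc in suspicious_disks(SAMPLE_IOSTAT):
    print("%s busy=%.0f%% asvc_t=%.1fms" % (device, busy, asvc))
```

A disk that stays pinned at 100% busy with almost no throughput across
several samples is the one to pull first.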


Are you using "green" drives? Green drives tend to spin down between 6 
and 30 seconds, which is problematic on a zpool with a large number of 
disks.


j.



Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-21 Thread Alexander
truss can help to see where boot-archive stumbles

-- 
Alex




Re: [OpenIndiana-discuss] System will not boot, boot-archive fails

2015-09-21 Thread Watson, Dan
I have had this issue. My (bad) solution was to comment out bootadm from both 
boot-archive service scripts and run them manually as needed. It seems to be an 
indicator of too much latency. SVC only waits 20s by default and bootadm update 
can take up to 100s with sub-par disks.

I find that when the disks are "good enough" the first run will time out, but 
after you log in to maintenance mode and clear the service, it finishes 
instantly, because it completed while SVC was timing it out.

Dan