That's very strange, since SMF should have timed out the process if it
takes that long to start, and put the service into maintenance mode.

Seems like the ioctl() is somehow causing a hard-hang on the process which
can't then be killed...

Would be worth doing some dtrace on the process to see what exactly it's up
to, and whether it's coming out of the kernel at all.
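
As a rough starting point, something along these lines might help. It's
only a sketch: it just prints the commands to try rather than running
them, the PIDs (703 and 749) are taken from your ps output below, and the
function name and the 10 second sampling window are made up for
illustration. I'm also assuming svcprop, pstack, mdb and dtrace are all
available in the AI boot environment, which may not be the case.

```shell
#!/bin/sh
# Sketch only: prints (does not run) commands for inspecting a hung process.
# diag_cmds is a hypothetical helper name; 703 and 749 are the fmd PIDs
# shown in the ps output quoted later in this thread.

diag_cmds() {
    pid=$1
    # Start-method timeout SMF has configured for fmd
    echo "svcprop -p start/timeout_seconds svc:/system/fmd:default"
    # User-level stack (this is what showed the ioctl())
    echo "pstack $pid"
    # Kernel-side stacks for each thread of the process
    echo "echo '0t${pid}::pid2proc | ::walk thread | ::findstack -v' | mdb -k"
    # Count system calls entered over 10 seconds (10s is an arbitrary window)
    echo "dtrace -p $pid -n 'syscall:::entry /pid == \$target/ { @[probefunc] = count(); } tick-10s { exit(0); }'"
}

for pid in 703 749; do
    diag_cmds "$pid"
done
```

If the dtrace aggregation comes back empty, the process probably isn't
entering any new system calls, which would fit with it being stuck inside
the ioctl() in the kernel.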

Thanks,

Darren.

On 28/02/2012 07:58, Lucia Lai wrote:
> Hi Darren,
> 
> Thanks for your quick response. The fmd service has been like that for
> hours. So I guess either fmd itself has a problem or there is some problem
> on the nodes (it went wrong at the same time on all three nodes? Hmmm...).
> pstack of the fmd process shows it sitting in an ioctl(). I'll check with
> our lab folks, and see if I am lucky enough to find someone who works on
> fmd as well.
> 
> Thanks again,
> 
> - Lucia
> 
> 
> On 2/27/2012 11:32 PM, Darren Kenny wrote:
>> Hi Lucia,
>>
>> From what you've provided below, the fmd service is still in the process of
>> starting up, and hasn't completed startup (or, according to the http
>> link, reached a definite failure).
>>
>> If you wait a bit longer does it fail totally? Or does it remain in this
>> 'starting' phase for a long time?
>>
>> Until fmd starts, the auto installer won't start...
>>
>> Thanks,
>>
>> Darren.
>>
>> On 28/02/2012 06:21, Lucia Lai wrote:
>>> Hi,
>>>
>>> I did a few re-installs with AI on the same nodes with no problem,
>>> installing s11 GA. Then on the last try the installer did not even
>>> start. I tried a few times and it is the same thing on all three
>>> nodes. Any clues? Let me know if you need further info. Thanks.
>>>
>>> {b} ok  boot net:dhcp - install
>>>
>>> SC Alert: Host System has Reset
>>> \
>>>
>>> Sun Fire T200, No Keyboard
>>> Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved.
>>> OpenBoot 4.30.4.b, 16256 MB memory available, Serial #67260464.
>>> Ethernet address 0:14:4f:2:50:30, Host ID: 84025030.
>>>
>>>
>>>
>>> Boot device: /pci@780/pci@0/pci@1/network@0:dhcp  File and args: - install
>>> 1000 Mbps full duplex  Link up
>>> Timed out waiting for BOOTP/DHCP reply
>>> <time unavailable>  wanboot info: WAN boot messages->console
>>> <time unavailable>  wanboot info: configuring
>>> /pci@780/pci@0/pci@1/network@0:dhcp
>>>
>>> 1000 Mbps full duplex  Link up
>>> <time unavailable>  wanboot info: Starting DHCP configuration
>>> <time unavailable>  wanboot info: DHCP configuration succeeded
>>> <time unavailable>  wanboot progress: wanbootfs: Read 368 of 368 kB (100%)
>>> <time unavailable>  wanboot info: wanbootfs: Download complete
>>> Tue Feb 28 05:58:11 wanboot progress: miniroot: Read 213735 of 213735 kB
>>> (100%)
>>> Tue Feb 28 05:58:11 wanboot info: miniroot: Download complete
>>> SunOS Release 5.11 Version 11.0 64-bit
>>> Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
>>> Remounting root read/write
>>> Probing for device nodes ...
>>> Preparing network image for use
>>> Downloading solaris.zlib
>>> --2012-02-28 05:36:32--
>>> http://10.134.83.35:5555/rpool/ai/target/bank-sol-11-1111-ai-sparc//solaris.zlib
>>> Connecting to 10.134.83.35:5555... connected.
>>> HTTP request sent, awaiting response... 200 OK
>>> Length: 126752256 (121M) [text/plain]
>>> Saving to: `/tmp/solaris.zlib'
>>>
>>> 100%[======================================>] 126,752,256 21.5M/s   in
>>> 6.5s
>>>
>>> 2012-02-28 05:36:39 (18.7 MB/s) - `/tmp/solaris.zlib' saved
>>> [126752256/126752256]
>>>
>>> Downloading solarismisc.zlib
>>> --2012-02-28 05:36:39--
>>> http://10.134.83.35:5555/rpool/ai/target/bank-sol-11-1111-ai-sparc//solarismisc.zlib
>>> Connecting to 10.134.83.35:5555... connected.
>>> HTTP request sent, awaiting response... 200 OK
>>> Length: 20636672 (20M) [text/plain]
>>> Saving to: `/tmp/solarismisc.zlib'
>>>
>>> 100%[======================================>] 20,636,672  17.5M/s   in
>>> 1.1s
>>>
>>> 2012-02-28 05:36:40 (17.5 MB/s) - `/tmp/solarismisc.zlib' saved
>>> [20636672/20636672]
>>>
>>> Downloading .image_info
>>> --2012-02-28 05:36:40--
>>> http://10.134.83.35:5555/rpool/ai/target/bank-sol-11-1111-ai-sparc//.image_info
>>> Connecting to 10.134.83.35:5555... connected.
>>> HTTP request sent, awaiting response... 200 OK
>>> Length: 65 [text/plain]
>>> Saving to: `/tmp/.image_info'
>>>
>>> 100%[======================================>] 65          --.-K/s   in
>>> 0s
>>>
>>> 2012-02-28 05:36:40 (1.52 MB/s) - `/tmp/.image_info' saved [65/65]
>>>
>>> Done mounting image
>>> Configuring devices.
>>> Hostname: pbank13
>>> WARNING: ds@1: ds_handle_recv: invalid message length, received 24576
>>> bytes, expected 50384
>>> Service discovery phase initiated
>>> Service name to look up: bank-sol-11-1111-ai-sparc
>>> Service discovery over multicast DNS failed
>>> Service bank-sol-11-1111-ai-sparc located at 10.134.83.35:5555 will be used
>>> Service discovery finished successfully
>>> Process of obtaining install manifest initiated
>>> Using the install manifest obtained via service discovery
>>>
>>> pbank13 console login: root
>>> Password:
>>> Feb 28 05:39:02 pbank13 login: ROOT LOGIN /dev/console
>>> Oracle Corporation      SunOS 5.11      11.0    November 2011
>>> root@pbank13:~# svcs -xv
>>> svc:/system/fmd:default (Solaris Fault Manager)
>>>   State: offline since Tue Feb 28 05:37:49 2012
>>> Reason: Start method is running.
>>>     See: http://sun.com/msg/SMF-8000-C4
>>>     See: man -M /usr/share/man -s 1M fmd
>>>     See: /var/svc/log/system-fmd:default.log
>>> Impact: 3 dependent services are not running:
>>>          svc:/system/devchassis:daemon
>>>          svc:/application/auto-installer:default
>>>          svc:/system/fpsd:default
>>>
>>> svc:/network/ilomconfig-interconnect:default (ilomconfig-interconnect)
>>>   State: offline since Tue Feb 28 05:36:49 2012
>>> Reason: Start method is running.
>>>     See: http://sun.com/msg/SMF-8000-C4
>>>     See: man -M /usr/share/man -s 1M ilomconfig
>>>     See: /var/svc/log/network-ilomconfig-interconnect:default.log
>>> Impact: This service is not running.
>>>
>>> root@pbank13:~# more /var/svc/log/system-fmd:default.log
>>> [ Feb 28 05:37:50 Executing start method ("/lib/svc/method/svc-fmd"). ]
>>> root@pbank13:~# ps -ef| grep fmd
>>>      root   973   964   0 05:39:54 console     0:00 grep fmd
>>>      root   703     9   0 05:37:50 ?           0:00 /usr/lib/fm/fmd/fmd
>>>      root   749   703   0 05:37:51 ?           0:00 /usr/lib/fm/fmd/fmd
>>> root@pbank13:~# ls -l /var/fm
>>> total 2
>>> drwxr-xr-x   6 root     sys          512 Oct 20 23:11 fmd
>>>
>>> root@pbank13:~# ls -l /var/fm/fmd
>>> total 8
>>> drwxr-xr-x   2 root     sys          512 Oct 20 23:03 ckpt
>>> drwxr-xr-x   2 root     sys          512 Oct 20 23:03 rsrc
>>> drwxr-xr-x   2 root     sys          512 Oct 20 23:03 topo
>>> drwxr-xr-x   2 root     sys          512 Oct 20 23:03 xprt
>>>
>>>
>>> - Lucia
>>>
>>> _______________________________________________
>>> caiman-discuss mailing list
>>> [email protected]
>>> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
> 