Re: [OpenIndiana-discuss] Access to ZFS viz CIFS from windows regularly hangs.

2012-06-13 Thread Mike La Spina

> Does the suspend event only occur on SMB clients or does it impact the
other storage clients when triggered by the Windows clients?

It does not seem to affect the VMware-hosted machines via NFS. Next time
it hangs I will try an NFS transfer to it.
- If this is correct, it's a further indication of an AD/SMB issue, but
not verified at this point.

> Any domain controller event errors?

Yes there are, I will go and resolve this first before I go any further.

- Highly suspect this is where you need to focus.
- This error is suspicious and does look like an issue on the domain.
- Jun 12 11:26:07 ringwood smbd[6032]: [ID 702911 daemon.notice]
smbd_dc_update: stirling-dynamics.com: located red
- Jun 12 11:34:17 ringwood smbd[6032]: [ID 702911 daemon.error]
smbrdr_exchange[4]: failed (INVALID_HANDLE)
- I would look further back in time and see if it correlates with the
suspended access event. That would define a clear resolution path.
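
- A minimal Python sketch of that correlation, in case it helps. It assumes the smbd entries have been copied out of the system log one entry per line; the hang time and the 10-minute window are placeholders rather than values from this thread, and the function names are only illustrative.

```python
import re
from datetime import datetime, timedelta

# One syslog entry per line (the e-mail above wrapped each entry in two).
SAMPLE_LOG = """\
Jun 12 11:26:07 ringwood smbd[6032]: [ID 702911 daemon.notice] smbd_dc_update: stirling-dynamics.com: located red
Jun 12 11:34:17 ringwood smbd[6032]: [ID 702911 daemon.error] smbrdr_exchange[4]: failed (INVALID_HANDLE)
"""

# Timestamp, host, "smbd[pid]:", "[ID n daemon.<level>]", then the message.
ENTRY_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"smbd\[\d+\]:\s+\[ID\s+\d+\s+daemon\.(?P<level>\w+)\]\s+(?P<msg>.*)$"
)


def parse_smbd_entries(text, year):
    """Return (timestamp, level, message) tuples for smbd syslog entries."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # not an smbd entry
        # Syslog omits the year, so the caller has to supply it.
        stamp = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        entries.append((stamp, match["level"], match["msg"]))
    return entries


def errors_near(entries, hang_time, window_minutes=10):
    """Return daemon.error entries within +/- window_minutes of hang_time."""
    window = timedelta(minutes=window_minutes)
    return [
        entry for entry in entries
        if entry[1] == "error" and abs(entry[0] - hang_time) <= window
    ]


if __name__ == "__main__":
    entries = parse_smbd_entries(SAMPLE_LOG, year=2012)
    # Hypothetical hang time reported by a user; replace with a real one.
    hang = datetime(2012, 6, 12, 11, 35, 0)
    for stamp, level, msg in errors_near(entries, hang):
        print(stamp.isoformat(), level, msg)
```

  If every reported hang has an smbd error shortly before it, that supports the AD/SMB theory; if the errors show up at unrelated times, it does not.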

> dmesg output?

Attached - is this the correct etiquette?

- Jun 11 21:38:18 ringwood fmd: [ID 377184 daemon.error] SUNW-MSG-ID:
SMF-8000-YX, TYPE: defect, VER: 1, SEVERITY: major
- Jun 11 21:38:18 ringwood EVENT-TIME: Mon Jun 11 21:38:18 BST 2012
- Jun 11 21:38:18 ringwood PLATFORM: S5520HC, CSN: ,
HOSTNAME: ringwood
- Jun 11 21:38:18 ringwood SOURCE: software-diagnosis, REV: 0.1
- Jun 11 21:38:18 ringwood EVENT-ID:
cc9f2029-a779-cbd2-e425-8ffbaa19f639
- Jun 11 21:38:18 ringwood DESC: A service failed - a method is failing
in a retryable manner but too often.
- Jun 11 21:38:18 ringwood   Refer to http://sun.com/msg/SMF-8000-YX for
more information.
- Jun 11 21:38:18 ringwood AUTO-RESPONSE: The service has been placed
into the maintenance state.
- Jun 11 21:38:18 ringwood IMPACT: svc:/application/time-slider:default
is unavailable.

- The time slider snapshot service failed? Or was it stopped manually?

> fmdump -eV output?

Also attached.
- Nothing remarkable

> uname -a output?

SunOS ringwood 5.11 oi_148 i86pc i386 i86pc Solaris

> Have you attempted a packet capture of the event?
> snoop -o smb-client.cap 

Not yet. It could be capturing for 4 hours before it happens, so I will
resolve the AD domain issue first.
- Good approach. 4 hours of packet tracing is hard to digest! It would
certainly need to be truncated down to the trigger event.

- Mike



___

The contents of this e-mail and any attachment(s) are strictly
confidential and are solely for the person(s) at the e-mail address(es)
above. If you are not an addressee, you may not disclose, distribute,
copy or use this e-mail, and we request that you send an e-mail to
ad...@stirling-dynamics.com and delete this e-mail.  Stirling Dynamics
Ltd. accepts no legal liability for the contents of this e-mail
including any errors, interception or interference, as internet
communications are not secure.  Any views or opinions presented are
solely those of the author and do not necessarily represent those of
Stirling Dynamics Ltd. Registered In England No. 2092114 Registered
Office: 26 Regent Street, Clifton, Bristol. BS8 4HG VAT no. GB 464 6551
29
___

This e-mail has been scanned for all viruses MessageLabs.

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] Access to ZFS viz CIFS from windows regularly hangs.

2012-06-13 Thread Gordon Ross
Have you done a snoop (or wireshark) capture to see what the box is
doing during the pause?
It's possible that it's trying to talk to an AD server...

On Tue, Jun 12, 2012 at 9:52 AM, John McEntee wrote:
> I am having problems with an OpenIndiana storage server I have built, and I am
> trying to track down the cause to fix it. The current symptoms are seen from
> all Windows clients (both 7 and XP), which report an error stating:
>
> Path File is not accessible. The specified network name is no longer
> available.
>
> Another symptom is that Windows Explorer hangs and the user has to wait for it
> to come back.
>
> Just waiting a while (a few minutes) and the box comes back.
>
> I currently think the root cause is in OpenIndiana somewhere but am at a
> bit of a loss. I have tried many things and have still not fixed it. I think
> the box is lightly loaded for the hardware spec, but kernel load increases to
> 40% when a zfssnap is taking place.
>
> Hardware spec:
> 2 x Xeon E6520 cpus
> 48 GB RAM
> Intel HC5520 motherboard
> 3 x LSI SAS 9211-8i cards
>
> Currently on openindiana 148
>
> The box is joined to a Windows 2003 domain.
>
> Zpool tank is 7 x 3-way mirrors of 3TB Hitachi disks (21 disks in total,
> zpool size of 19 TB) with 2 x SSD 8GB ZIL on each and 140GB L2ARC on
> each, default checksum, no dedup and no compression.
>
> The server operates as a Windows home directory for 58 users (some are laptop
> users, so for them it is just a backup location) and as the main shared drive
> for the company of 120 users.
> It is also an NFS server to a VMware vSphere 4 server hosting 10 virtual
> machines.
>
> There are only 8 active production file systems, and 12 backup file systems
> from other hosts (done out of hours).
>
> zpool iostat peaks at about 35 MB for the pool, but mostly sits around the
> 0 to 7 MB level.
>
> Turning off time-sliderd does not stop the problem. (Backups run out of
> hours.)
>
> Running
>   dtrace -n 'sched:::off-cpu { @[execname]=count()}'
> used to give a sched count in the 6 to 7 figures over 3 seconds, but turning
> ACPI off with
>   # eeprom acpi-user-options=0x8
> reduced this to 5 figures.
>
> What can I do to identify the problem to be able to fix it?
>
> Thanks
>
> John
>
> Other information:
>
> dtrace -n 'sched:::off-cpu { @[execname]=count()}'
> dtrace: description 'sched:::off-cpu ' matched 3 probes
> ^C
>
>  gconfd-2                                                          2
>  idmapd                                                            2
>  inetd                                                             2
>  nscd                                                              2
>  sendmail                                                          2
>  svc.startd                                                        2
>  gnome-power-mana                                                  3
>  fmd                                                               4
>  sshd                                                              4
>  devfsadm                                                          6
>  fsflush                                                           7
>  nfsmapid                                                          7
>  ntpd                                                              7
>  dtrace                                                           13
>  Xorg                                                             17
>  gdm-simple-greet                                                 17
>  svc.configd                                                      71
>  smbd                                                            113
>  time-sliderd                                                    138
>  zpool-rpool                                                     597
>  nfsd                                                            918
>  zpool-tank                                                     1968
>  sched                                                         80542
>
> # echo hz/D | sudo mdb -k
> hz:
> hz:             100
>
> # echo apic_timer::print apic_timer_t | sudo mdb -k
> {
>    mode = 0
>    apic_timer_enable_ops = oneshot_timer_enable
>    apic_timer_disable_ops = oneshot_timer_disable
>    apic_timer_reprogram_ops = oneshot_timer_reprogram
> }
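
To put the off-cpu aggregation quoted above into proportion, the counts can be turned into shares of the total. The following is a rough Python sketch, assuming the dtrace output has been saved as plain text with any "> " quote prefix removed; only the six largest rows from the listing are reproduced here, and the function names are illustrative.

```python
# The six largest rows from the quoted listing, without the "> " prefix.
SAMPLE_OUTPUT = """\
 smbd                                                            113
 time-sliderd                                                    138
 zpool-rpool                                                     597
 nfsd                                                            918
 zpool-tank                                                     1968
 sched                                                         80542
"""


def parse_aggregation(text):
    """Map execname -> count from 'name   count' lines of dtrace output."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the '^C' marker and dtrace's own status lines.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def shares(counts):
    """Return (execname, count, fraction of total), largest count first."""
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, n, n / total) for name, n in ranked]


if __name__ == "__main__":
    for name, n, frac in shares(parse_aggregation(SAMPLE_OUTPUT)):
        print(f"{name:<16}{n:>8}{frac:>8.1%}")
```

Running the same script on samples taken before and during a hang would show whether the share attributed to smbd or to the zpool threads changes when the clients stall.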

Re: [OpenIndiana-discuss] Access to ZFS viz CIFS from windows regularly hangs.

2012-06-13 Thread John McEntee
> I had similar issues before I enabled TLER, and disabled the head parking
> on my WD Green drives. A quick Google shows some evidence of similar
> features on the 3TB Hitachis.

I chose Hitachi because I didn't think they had this "feature". I would have
thought ZFS would have reported errors and removed them from the array if it
was happening? I will go and search the internet.

John





Re: [OpenIndiana-discuss] Access to ZFS viz CIFS from windows regularly hangs.

2012-06-13 Thread John McEntee
> The first thing to verify is your network and network interface.  Run
> continuous traffic and see if there are any hiccups.
> You can use /usr/sbin/ping for testing with larger packets.
>
> Also check the log files under /var/adm and /var/log.  Also check output
> of 'fmadm -ev' and 'fmadm faulty'.

Nothing shows in dmesg or in the logs under /var/adm and /var/log, and no
hardware errors are logged by ZFS either.

'fmadm -ev' fails with an invalid-option error and 'fmadm faulty' is clear
(nothing to report).

The NFS share to the VMware server seems fine as well, but it may be more
tolerant of problems. Large file transfers initiated from a Windows
desktop run at 40 to 50 MB/s (the desktop is probably the bottleneck).
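
For the continuous-traffic test suggested above, one way to spot hiccups afterwards is to look for missing sequence numbers in saved ping output. This is only a Python sketch. It assumes each reply line carries an icmp_seq=N field, as `ping -s` output typically does, and that the sequence counter does not wrap during the run. The sample lines are invented for illustration, not taken from this server.

```python
import re

# Invented sample in the style of `ping -s` replies; 3 and 4 are missing.
SAMPLE_PING = """\
64 bytes from ringwood: icmp_seq=0. time=0.31 ms
64 bytes from ringwood: icmp_seq=1. time=0.29 ms
64 bytes from ringwood: icmp_seq=2. time=0.30 ms
64 bytes from ringwood: icmp_seq=5. time=0.33 ms
"""

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def missing_sequences(text):
    """Return the icmp_seq numbers that never got a reply, in order."""
    seen = {int(m.group(1)) for m in SEQ_RE.finditer(text)}
    if not seen:
        return []
    # Every number between the first and last reply should be present.
    return [seq for seq in range(min(seen), max(seen) + 1) if seq not in seen]


if __name__ == "__main__":
    gaps = missing_sequences(SAMPLE_PING)
    print("missing replies:", gaps if gaps else "none")
```

A run of consecutive missing numbers that lines up with a reported hang would point at the network path; no gaps during a hang would point back at the server or at AD.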

Thanks

John

