Re: [zfs-discuss] JBOD performance

2007-12-18 Thread Roch - PAE

Frank Penczek writes:
  Hi,
  
  On Dec 17, 2007 4:18 PM, Roch - PAE [EMAIL PROTECTED] wrote:

 The pool holds home directories so small sequential writes to one
 large file present one of a few interesting use cases.
  
   Can you be more specific here ?
  
    Do you have a body of applications that would do small
    sequential writes, or one in particular? Another
    interesting piece of info is whether we expect those to be
    allocating writes or overwrites (beware that some apps move
    the old file out, then run allocating writes, then unlink
    the original file).
  
  Sorry, I'll try to be more specific.
  The zpool contains home directories that are exported to client machines.
  It is hard to predict what exactly users are doing, but one thing users
  do for certain is check out software projects from our subversion server.
  The projects typically contain many source code files (thousands), and a
  build process accesses all of them in the worst case. That is what I meant
  by "many (small) files", like compiling projects, in my previous post.
  The performance for this case is ... hopefully improvable.
  

This we'll have to work on. But first, if this is going to
storage with NVRAM, I assume you checked that the storage
does not flush its caches:


http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
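
If the array can't be configured to ignore the flushes itself, that page
also describes a host-side tunable (zfs_nocacheflush, in recent Nevada
and S10U4 builds). A minimal sketch, and only appropriate when the
array's write cache really is non-volatile:

    * in /etc/system, then reboot:
    set zfs:zfs_nocacheflush = 1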


If that is not your problem and ZFS underperforms another
FS on the backend of NFS, then this needs investigation.

If ZFS/NFS underperforms a direct-attach FS, that might
just be an NFS issue not related to ZFS. Again, that needs
investigation.

Performance gains won't happen unless we find out what
doesn't work.

  Now for sequential writes:
  We don't have a specific application issuing sequential writes, but I can
  think of at least a few cases where these writes may occur, e.g. dumps of
  substantial amounts of measurement data or growing log files of
  applications.
  In either case these would be mainly allocating writes.
  

Right, but I'd hope the application would issue substantially
larger writes, especially if it needs to dump data at a high rate.
If the data rate is more modest, then the CPU lost to this
effect will itself be modest.

  Does this provide the information you're interested in?
  

I get a sense that it's more important we find out what
your build issue is. But the small writes will have to be
improved one day as well.


-r

  
  Cheers,
Frank

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Roadmap - thoughts on expanding raidz / restriping / defrag

2007-12-18 Thread Paul van der Zwan

On 17 Dec 2007, at 11:42, Jeff Bonwick wrote:

 In short, yes.  The enabling technology for all of this is something
 we call bp rewrite -- that is, the ability to rewrite an existing
 block pointer (bp) to a new location.  Since ZFS is COW, this would
 be trivial in the absence of snapshots -- just touch all the data.
 But because a block may appear in many snapshots, there's more to it.
 It's not impossible, just a bit tricky... and we're working on it.

 Once we have bp rewrite, many cool features will become available as
 trivial applications of it: on-line defrag, restripe, recompress, etc.


Does that include evacuating vdevs? Marking a vdev read-only and then
doing a rewrite pass would clear out the vdev, wouldn't it?

Paul

 Jeff

 On Mon, Dec 17, 2007 at 02:29:14AM -0800, Ross wrote:
 Hey folks,

 Does anybody know if any of these are on the roadmap for ZFS, or  
 have any idea how long it's likely to be before we see them (we're  
 in no rush - late 2008 would be fine with us, but it would be nice  
 to know they're being worked on)?

 I've seen many people ask for the ability to expand a raid-z pool  
 by adding devices.  I'm wondering if it would be useful to work on  
 a defrag / restriping tool to work hand in hand with this.

 I'm assuming that when the functionality is available, adding a  
 disk to a raid-z set will mean the existing data stays put, and  
 new data is written across a wider stripe.  That's great for  
 performance for new data, but not so good for the existing files.   
 Another problem is that you can't guarantee how much space will be  
 added.  That will have to be calculated based on how much data you  
 already have.

 i.e.:  If you have a simple raid-z of five 500GB drives, you would  
 expect adding another drive to add 500GB of space.  However, if  
 your pool is half full, you can only make use of 250GB of that space;  
 the other 250GB is going to be wasted.

 What I would propose to solve this is to implement a defrag /  
 restripe utility as part of the raid-z upgrade process, making it  
 a three step process:

  - New drive added to raid-z pool
  - Defrag tool begins restriping and defragmenting old data
  - Once restripe complete, pool reports the additional free space

 There are some limitations to this.  You would maybe want to  
 advise that expanding a raid-z pool should only be done with a  
 reasonable amount of free disk space, and that it may take some  
 time.  It may also be beneficial to add the ability to add  
 multiple disks in one go.

 However, if it works it would seem to add several benefits:
  - Raid-z pools can be expanded
  - ZFS gains a defrag tool
  - ZFS gains a restriping tool


 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] aoc-sat2-mv8 (was: LSI SAS3081E = unstable drive numbers?)

2007-12-18 Thread Kent Watsen


Kent Watsen wrote:
 So, I picked up an AOC-SAT2-MV8 off eBay for not too much and then I got 
 a 4xSATA to one SFF-8087 cable to connect it to one of my six 
 backplanes.  But, as fortune would have it, the cable I bought has SATA 
 connectors that are physically too big to plug into the AOC-SAT2-MV8 - 
 since the AOC-SAT2-MV8 stacks two SATA connectors on top of each other...
   


As a temporary solution, I hooked up the reverse breakout cable using 
ports 1, 3, 5, and 7 on the aoc-sat2-mv8 - the cables fit this way 
because it's using only one port from each stack.  Anyway, the good news 
is that the drives showed up in Solaris right away and their IDs are stable 
between hot-swaps and reboots.  So I'll be keeping the aoc-sat2-mv8 
(anybody want a SAS3081E?).

I've already ordered more cables for the aoc-sat2-mv8 and will report 
which ones work when I get them.


Thanks,
Kent

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [osol-code] /usr/bin and /usr/xpg4/bin differences

2007-12-18 Thread James Carlson
Sasidhar Kasturi writes:
  Is it that the /usr/bin binaries are more advanced than the /xpg4
 ones, or are they extensions of the /xpg4 ones?

No.  They're just different.

 If I want to make some modifications in the code, can I do it for the /xpg4/bin
 commands, or should I do it for the /usr/bin commands?

There's no simple answer to that question.

If your modifications affect things that are covered by the relevant
standards, and if your modifications are not in compliance with those
standards, then you should not be changing the /usr/xpg4/bin or
/usr/xpg6/bin versions of the utility.

If your modifications affect compatibility with historic Solaris or
SunOS behavior, then you'll need to look closely at how your changes
fit into the existing /usr/bin utility.

In general, I think we'd like to see new features added to both where
possible and where conflicts are not present.  But each proposal is
different.

I'd suggest doing one (or maybe more) of the following:

  - putting together a proposal for a change, getting a sponsor
through the usual process, and then bringing the issues up in an ARC
review.

  - finding an expert in the area you're planning to change to help
give you some advice.

  - getting a copy of the standards documents (most are on-line these
days; see www.opengroup.org) and figuring out what issues apply in
your case.

-- 
James Carlson, Solaris Networking  [EMAIL PROTECTED]
Sun Microsystems / 35 Network Drive71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] aoc-sat2-mv8

2007-12-18 Thread James C. McPherson
Kent Watsen wrote:
 
 Kent Watsen wrote:
 So, I picked up an AOC-SAT2-MV8 off eBay for not too much and then I got 
 a 4xSATA to one SFF-8087 cable to connect it to one of my six 
 backplanes.  But, as fortune would have it, the cable I bought has SATA 
 connectors that are physically too big to plug into the AOC-SAT2-MV8 - 
 since the AOC-SAT2-MV8 stacks two SATA connectors on top of each other...
 As a temporary solution, I hooked up the reverse breakout cable using 
 ports 1, 3, 5, and 7 on the aoc-sat2-mv8 - the cables fit this way 
 because it's using only one port from each stack.  Anyway, the good news 
 is that the drives showed up in Solaris right away and their IDs are stable 
 between hot-swaps and reboots.  So I'll be keeping the aoc-sat2-mv8 

This is progress, I'm glad to hear it.

 (anybody want a SAS3081E?)

  MEMEMEMEMEMEMEMEMEMEMEME !!! :-)

 I've already ordered more cables for the aoc-sat2-mv8 and will report 
 which ones work when I get them

That will be very good info to have - there's too little information
and personal experience surrounding the SAS cabling world as yet.


cheers,
James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
   http://www.jmcp.homeunix.com/blog
   http://blogs.sun.com/jmcp
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is round-robin I/O correct for ZFS?

2007-12-18 Thread Gary Mills
On Fri, Dec 14, 2007 at 10:55:10PM -0800, Jonathan Loran wrote:
 
 This is the same configuration we use on 4 separate servers (T2000, two 
 X4100, and a V215).  We do use a different iSCSI solution, but we have 
 the same multi path config setup with scsi_vhci.  Dual GigE switches on 
 separate NICs both server and iSCSI node side.  We suffered from the 
 e1000g interface flapping bug, on two of these systems, and one time a 
 SAN interface went down to stay (until reboot).  The vhci multi path 
 performed flawlessly.  I scrubbed the pools (one of them is 10TB) and no 
 errors were found, even though we had heavy IO at the time of the NIC 
 failure.  I think this configuration is a good one.

Thanks for the response.  I did a failover test by disconnecting
ethernet cables yesterday.  It didn't behave the way it was supposed
to.  Likely there's something wrong with my multipath configuration.
I'll have to review it, but that's why I have a test server.

I was concerned about simultaneous SCSI commands over the two paths
that might get executed out of order, but something must ensure that
that never happens.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [osol-code] /usr/bin and /usr/xpg4/bin differences

2007-12-18 Thread Bill Sommerfeld

On Sat, 2007-12-15 at 22:00 -0800, Sasidhar Kasturi wrote:
 If I want to make some modifications in the code, can I do it
 for the /xpg4/bin commands, or should I do it for the /usr/bin commands? 

If possible (if there's no inherent conflict with either the applicable
standards or existing practice) you should do it for both to minimize
the difference between the two variants of the commands.

I'm currently working with John Plocher to figure out why the opinion
for psarc 2005/683 (which sets precedent that divergence between command
variants should be minimized) hasn't been published, but there's a more
detailed explanation of the desired relationship
between /usr/bin, /usr/xpg4/bin, and /usr/xpg6/bin in that opinion.

- Bill

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HA-NFS AND HA-ZFS

2007-12-18 Thread Robert Milkowski
Hello Matthew,

Monday, December 17, 2007, 5:45:12 PM, you wrote:

MCA We are currently running sun cluster 3.2 on solaris 10u3. We are
MCA using ufs/vxvm 4.1 as our shared file systems. However, I would
MCA like to migrate to HA-NFS on ZFS. Since there is no conversion
MCA process from UFS to ZFS other than copy, I would like to migrate
MCA on my own time. To do this I am planning to add a new zpool
MCA HAStoragePlus resource to my existing HA-NFS resource group. This
MCA way I can migrate data from my existing UFS to ZFS on my own time
MCA and the clients will not know the difference.

MCA I made sure that the zpool was available on both nodes of the
MCA cluster. I then created a new HAStoragePlus resource for the
MCA zpool. I updated my NFS resource to depend on both HAStoragePlus
MCA resources. I added the two test file systems to the current dfstab.nfs-rs 
file.
MCA I manually ran the shares and I was able to mount the new zfs
MCA file system. However, once the monitor ran it re-shared I guess
MCA and now the ZFS based filesystems are not available.

MCA I read that you are not to add the ZFS based file systems to the
MCA FileSystemMountPoints property. Any ideas?

You just use the Zpools property, not FileSystemMountPoints, with ZFS.

Check 
http://milek.blogspot.com/2006/09/zfs-in-high-availability-environments.html
for a step-by-step example.
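
A minimal sketch of the SC 3.2 commands (the resource group, pool and
resource names below are just placeholders):

  clresourcetype register SUNW.HAStoragePlus
  clresource create -g nfs-rg -t SUNW.HAStoragePlus \
      -p Zpools=mypool hasp-zfs-rs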

btw: for testing, if you really have to test it on the production cluster,
I would first create a new test resource group with only two resources
- IP (logicalhostname) and HAStoragePlus with the zfs pool. Then check if
you can fail over, fail back, etc. Once you've proven you know how it
works, move the resource to the production RG.


-- 
Best regards,
 Robert Milkowski   mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [osol-code] /usr/bin and /usr/xpg4/bin differences

2007-12-18 Thread John Plocher
Sasidhar Kasturi wrote:
 Thank you,
  Is it that the /usr/bin binaries are more advanced than the 
  /xpg4 ones, or are they extensions of the /xpg4 ones?

They *should* be the same level of advancement, but each has a
different set of promises and expectations it needs to live up to...

 
 If I want to make some modifications in the code, can I do it for the 
 /xpg4/bin commands, or should I do it for the /usr/bin commands?


If you are doing this just for yourself, it doesn't matter - fix
the one you use.  If you intend to push these changes back into the
OS.o source base, you will need to make the changes to both (and,
possibly interact with the OpenSolaris ARC Community if your changes
affect the architecture/interfaces of the commands).

In the case of df, I'm not at all sure why the two commands are
different. (I'm sure someone else will chime in and educate me :-)

-John

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SAS cabling was Re: aoc-sat2-mv8

2007-12-18 Thread Al Hopper
On Tue, 18 Dec 2007, James C. McPherson wrote:

... snip .
 That will be very good info to have - there's too little information
 and personal experience surrounding the SAS cabling world as yet.

Here are a couple of resources:

SAS Integrators Guide:
wget http://www.lsi.com/documentation/storage/megaraid/SAS_IG.pdf

You can always get good advice from:
http://www.cs-electronics.com/

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Graduate from sugar-coating school?  Sorry - I never attended! :)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread David Runyon
Does anyone know this?

David Runyon
Disk Sales Specialist

Sun Microsystems, Inc.
4040 Palm Drive
Santa Clara, CA 95054 US
Mobile 925 323-1211
Email [EMAIL PROTECTED]




Russ Lai wrote:
 Dave;
 Does ZFS support Oracle RAC?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread Enda O'Connor ( Sun Micro Systems Ireland)
David Runyon wrote:
 Does anyone know this?
 
 David Runyon
 Disk Sales Specialist
 
 Sun Microsystems, Inc.
 4040 Palm Drive
 Santa Clara, CA 95054 US
 Mobile 925 323-1211
 Email [EMAIL PROTECTED]
 
 
 
 
 Russ Lai wrote:
 Dave;
 Does ZFS support Oracle RAC?
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
metalink doc 403202.1 appears to support this config, but it reads a 
little unclear to me.


{
Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.5 to 10.2.0.3
Solaris Operating System (SPARC 64-bit)
Goal
Is the Zeta File System (ZFS) of Solaris 10 certified/supported by 
ORACLE for:
- Database
- RAC

Solution
Oracle certifies and supports the RDBMS on the whole OS for non-RAC 
installations. However, if there is an exception, this should appear in 
the Release Notes or in the OS-specific Oracle documentation manual.

As you are not specific to cluster file systems for RAC installations, 
usually there is no problem installing Oracle on the file systems 
provided by the OS vendor. But if any underlying OS error is found, then 
it should be handled by the OS vendor.

Over the past few years Oracle has worked with all the leading system 
and storage vendors to validate their specialized storage products, 
under the Oracle Storage Compatibility Program (OSCP), to ensure these 
products were compatible for use with the Oracle database. Under the 
OSCP, Oracle and its partners worked together to validate specialized 
storage technology including NFS file servers, remote mirroring, and 
snapshot products.

At this time Oracle believes that these three specialized storage 
technologies are well understood by customers, are very mature, and 
the Oracle technology requirements are well known. As of January 2007, 
Oracle will no longer validate these products.

On a related note, many Oracle customers have embraced the concept of 
the resilient low-cost storage grid defined by Oracle's Resilient 
Low-Cost Storage Initiative (leveraging the Oracle Database 10g 
Automatic Storage Management (ASM) feature to make low-cost, modular 
storage arrays resilient), and many storage vendors continue to 
introduce new, low-cost, modular arrays for an Oracle storage grid 
environment. As of January, 2007, the Resilient Low-Cost Storage 
Initiative is discontinued.

For more information, please refer to the Oracle Storage Program 
Change Notice.

}
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] JBOD performance

2007-12-18 Thread Peter Schuller
 Sequential writing problem with process throttling - there's an open
 bug for it for quite a while. Try to lower txg_time to 1s - should
 help a little bit.

Yeah, my post was mostly to emphasize that on commodity hardware raidz2 does 
not even come close to being a CPU bottleneck. It wasn't a poke at the 
streaming performance. Very interesting to hear there's a bug open for it 
though.

 Can you also post iostat -xnz 1 while you're doing dd?
 and zpool status

This was FreeBSD, but I can provide iostat -x if you still want it for some 
reason. 

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



signature.asc
Description: This is a digitally signed message part.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread Mike Gerdts
On Dec 18, 2007 11:01 AM, David Runyon [EMAIL PROTECTED] wrote:
 Does anyone know this?

There are multiple file system usages involved in Oracle RAC:

1) Oracle Home - This is where the oracle software lives.  This can be
   on a file system shared among all nodes or a per-host file system.
   ZFS should work fine in the per-host configuration, but I don't
   know about an official support statement.  This is likely not very
   important because of...
2) Database files - I'll lump redo logs, etc. in with this.  In Oracle
   RAC these must live on a shared-rw (e.g. clustered VxFS, NFS) file
   system.  ZFS does not do this.

If you drink the Oracle kool-aid and are using 10g or later the
database files will go into ASM, which seems to share a number of
characteristics with (but is largely complementary to) ZFS.  That is,
it spreads writes among all allocated disks, provides redundancy
without an underlying volume manager or hardware RAID, is transaction
safe, etc.  I am pretty sure that ASM also supports per-block
checksums, space efficient snapshots, block level incremental backups,
etc.  Although ASM is a relatively new technology, I think it has many
more hours of runtime and likely more space in production use than
ZFS.

I think that ZFS holds a lot of promise for shared-nothing database
clusters, such as is being done by Greenplum with their extended
variant of Postgres.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is round-robin I/O correct for ZFS?

2007-12-18 Thread Jonathan Loran



Gary Mills wrote:

On Fri, Dec 14, 2007 at 10:55:10PM -0800, Jonathan Loran wrote:
  
This is the same configuration we use on 4 separate servers (T2000, two 
X4100, and a V215).  We do use a different iSCSI solution, but we have 
the same multi path config setup with scsi_vhci.  Dual GigE switches on 
separate NICs both server and iSCSI node side.  We suffered from the 
e1000g interface flapping bug, on two of these systems, and one time a 
SAN interface went down to stay (until reboot).  The vhci multi path 
performed flawlessly.  I scrubbed the pools (one of them is 10TB) and no 
errors were found, even though we had heavy IO at the time of the NIC 
failure.  I think this configuration is a good one.



Thanks for the response.  I did a failover test by disconnecting
ethernet cables yesterday.  It didn't behave the way it was supposed
to.  Likely there's something wrong with my multipath configuration.
I'll have to review it, but that's why I have a test server.

I was concerned about simultaneous SCSI commands over the two paths
that might get executed out of order, but something must ensure that
that never happens.

  


From the Sun side, the scsi_vhci is pretty simple.  There's a number of 
options you can tweak with mdadm, but I haven't ever needed to.  Perhaps 
you will however.  The iSCSI targets may be persisting on the failed 
path for some reason, I don't know.  Not familiar with the Netapp in 
iSCSI config.  Our targets simply respond on whatever path a SCSI 
command is sent on.  This means the initiator side (scsi_vhci) drives 
the path assignments for each iSCSI command. At least this is how I 
understand it.  The scsi_vhci will never send two of the same commands 
to both paths, as long as they are up.  If a path fails, then a retry 
of any pending operations will occur on the operational path, and that's 
it.  If a path returns to service, then it will be re-utilized.


Jon

--


- _/ _/  /   - Jonathan Loran -   -
-/  /   /IT Manager   -
-  _  /   _  / / Space Sciences Laboratory, UC Berkeley
-/  / /  (510) 643-5146 [EMAIL PROTECTED]
- __/__/__/   AST:7731^29u18e3




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is round-robin I/O correct for ZFS?

2007-12-18 Thread Jonathan Loran



Jonathan Loran wrote:



Gary Mills wrote:

On Fri, Dec 14, 2007 at 10:55:10PM -0800, Jonathan Loran wrote:
  
This is the same configuration we use on 4 separate servers (T2000, two 
X4100, and a V215).  We do use a different iSCSI solution, but we have 
the same multi path config setup with scsi_vhci.  Dual GigE switches on 
separate NICs both server and iSCSI node side.  We suffered from the 
e1000g interface flapping bug, on two of these systems, and one time a 
SAN interface went down to stay (until reboot).  The vhci multi path 
performed flawlessly.  I scrubbed the pools (one of them is 10TB) and no 
errors were found, even though we had heavy IO at the time of the NIC 
failure.  I think this configuration is a good one.



Thanks for the response.  I did a failover test by disconnecting
ethernet cables yesterday.  It didn't behave the way it was supposed
to.  Likely there's something wrong with my multipath configuration.
I'll have to review it, but that's why I have a test server.

I was concerned about simultaneous SCSI commands over the two paths
that might get executed out of order, but something must ensure that
that never happens.

  


From the Sun side, the scsi_vhci is pretty simple.  There's a number 
of options you can tweak with mdadm, 

I meant mpathadm.  Oops.  I need more coffee (or less).

Jon
but I haven't ever needed to.  Perhaps you will however.  The iSCSI 
targets may be persisting on the failed path for some reason, I don't 
know.  Not familiar with the Netapp in iSCSI config.  Our targets 
simply respond on whatever path a SCSI command is sent on.  This 
means the initiator side (scsi_vhci) drives the path assignments for 
each iSCSI command. At least this is how I understand it.  The 
scsi_vhci will never send two of the same commands to both paths, as 
long as they are up.  If a path fails, then a retry of any pending 
operations will occur on the operational path, and that's it.  If a 
path returns to service, then it will be re-utilized.
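
If it helps with the failover debugging, a couple of mpathadm checks
(just a sketch; the logical-unit names on your system will differ):

  mpathadm list lu
  mpathadm show lu <lu name from the list above>

The second command shows the per-path state for a given LUN.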


Jon
--


- _/ _/  /   - Jonathan Loran -   -
-/  /   /IT Manager   -
-  _  /   _  / / Space Sciences Laboratory, UC Berkeley
-/  / /  (510) 643-5146 [EMAIL PROTECTED]
- __/__/__/   AST:7731^29u18e3
 

  



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


--


- _/ _/  /   - Jonathan Loran -   -
-/  /   /IT Manager   -
-  _  /   _  / / Space Sciences Laboratory, UC Berkeley
-/  / /  (510) 643-5146 [EMAIL PROTECTED]
- __/__/__/   AST:7731^29u18e3




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] JBOD performance

2007-12-18 Thread Robert Milkowski
Hello Peter,

Tuesday, December 18, 2007, 5:12:48 PM, you wrote:

 Sequential writing problem with process throttling - there's an open
 bug for it for quite a while. Try to lower txg_time to 1s - should
 help a little bit.

PS Yeah, my post was mostly to emphasize that on commodity hardware raidz2 does
PS not even come close to being a CPU bottleneck. It wasn't a poke at the
PS streaming performance. Very interesting to hear there's a bug open for it
PS though.

 Can you also post iostat -xnz 1 while you're doing dd?
 and zpool status

PS This was FreeBSD, but I can provide iostat -x if you still want it for some
PS reason. 


I was just wondering whether maybe there's a problem with just one
disk...
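
For anyone who wants to try the txg_time change on Solaris, a rough
sketch (the tunable name varies between builds, so check it exists
first):

  echo 'txg_time/W 0t1' | mdb -kw        (live change; value in seconds)

or, to persist across reboots, in /etc/system:

  set zfs:txg_time = 1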



-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Michael Hale



Begin forwarded message:


From: Michael Hale [EMAIL PROTECTED]
Date: December 18, 2007 6:15:12 PM CST
To: zfs-discuss@opensolaris.org
Subject: zfs boot suddenly not working

We have a machine that is configured with ZFS boot, Nevada v67 - we
have two pools, rootpool and datapool.  It has been working OK since
June.  Today it kernel panicked, and now when we try to boot it up,
it gets to the grub screen, we select ZFS, and then there is a
kernel panic that flashes by too quickly for us to see, and then it
reboots.


If we boot from a Nevada v77 DVD, we can do a
zpool import and mount the zfs pools successfully.  We scrubbed them
and didn't find any errors.  From the Nevada v77 DVD we can see
everything OK.


Here is our grub menu.lst

title Solaris ZFS snv_67 X86
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive

First of all, is there a way to slow down that kernel panic so that
we can see what it is?  Also, we suspect that maybe
/platform/i86pc/boot_archive might have been damaged.  Is there a way
to regenerate it?

--
Michael Hale
[EMAIL PROTECTED]





--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Jason King
Edit the kernel$ line and add '-k' at the end.  That should drop you
into the kernel debugger after the panic (typing '$q' will exit the
debugger, and resume whatever it was doing -- in this case likely
rebooting).
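
With the menu.lst entry quoted below, the edited line would look roughly
like this (assuming nothing else on the line needs to change):

 kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k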


On Dec 18, 2007 6:26 PM, Michael Hale [EMAIL PROTECTED] wrote:


 Begin forwarded message:

 From: Michael Hale [EMAIL PROTECTED]
 Date: December 18, 2007 6:15:12 PM CST
 To: zfs-discuss@opensolaris.org
 Subject: zfs boot suddenly not working

  We have a machine that is configured with zfs boot , Nevada v67- we have
 two pools, rootpool and datapool.  It has been working ok since June.  Today
 it kernel panicked and now when we try to boot it up, it gets to the grub
 screen, we select ZFS, and then there is a kernel panic that flashes by too
 quickly for us to see and then it reboots.

 If we boot to a nevada v77 DVD and if we boot to that, we can do a zpool
 import and mount the zfs pools successfully.  We scrubbed them and didn't
 find any errors.  From the nevada v77 DVD we can see everything ok.

 Here is our grub menu.lst

 title Solaris ZFS snv_67 X86
 kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
 module$ /platform/i86pc/$ISADIR/boot_archive

 First of all, is there a way to slow down that kernel panic so that we can
 see what it is?  Also, we suspect that maybe /platform/i86pm/boot_archive
 might have been damaged.  Is there a way to regenerate it?
 --
 Michael Hale
 [EMAIL PROTECTED]





 --
 Michael Hale
 [EMAIL PROTECTED]
 http://www.gift-culture.org





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Michael Hale
After doing that, this is what we see:

panic[cpu0]/thread=fbc257a0: cannot mount root path /[EMAIL 
PROTECTED],0/ 
pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL PROTECTED]/pci1028,[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0:f

fbc46790 genunix:rootconf+112 ()
fbc467e0 genunix:vfs_mountroot+65 ()
fbc46810 genunix:main+ce ()
fbc46820 unix:_locore_start+92 ()

panic: entering debugger

/rootpool/rootfs/etc/zfs/zpool.cache was last updated Dec 13 at 15:25
and has a size of 3880 bytes.

If we boot off the DVD (snv_77), /etc/zfs/zpool.cache has a size of 1604 bytes.



On Dec 18, 2007, at 7:08 PM, Jason King wrote:

 Edit the kernel$ line and add '-k' at the end.  That should drop you
 into the kernel debugger after the panic (typing '$q' will exit the
 debugger, and resume whatever it was doing -- in this case likely
 rebooting).


 On Dec 18, 2007 6:26 PM, Michael Hale [EMAIL PROTECTED] wrote:


 Begin forwarded message:

 From: Michael Hale [EMAIL PROTECTED]
 Date: December 18, 2007 6:15:12 PM CST
 To: zfs-discuss@opensolaris.org
 Subject: zfs boot suddenly not working

 We have a machine that is configured with zfs boot , Nevada v67- we  
 have
 two pools, rootpool and datapool.  It has been working ok since  
 June.  Today
 it kernel panicked and now when we try to boot it up, it gets to  
 the grub
 screen, we select ZFS, and then there is a kernel panic that  
 flashes by too
 quickly for us to see and then it reboots.

 If we boot to a nevada v77 DVD and if we boot to that, we can do a  
 zpool
 import and mount the zfs pools successfully.  We scrubbed them and  
 didn't
 find any errors.  From the nevada v77 DVD we can see everything ok.

 Here is our grub menu.lst

 title Solaris ZFS snv_67 X86
 kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
 module$ /platform/i86pc/$ISADIR/boot_archive

 First of all, is there a way to slow down that kernel panic so that  
 we can
 see what it is?  Also, we suspect that maybe /platform/i86pm/ 
 boot_archive
 might have been damaged.  Is there a way to regenerate it?
 --
 Michael Hale
 [EMAIL PROTECTED]





 --
 Michael Hale
 [EMAIL PROTECTED]
 http://www.gift-culture.org





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Michael Hale
The root pool mounts fine - if I do:

  zpool import rootpool
  zpool get bootfs rootpool
  mkdir /mnt
  mount -F zfs rootpool/rootfs /mnt

it mounts fine.

/etc/zfs/zpool.cache exists

a zpool get all rootpool gets us:

size 19.9G
used 3.67G
available 16.2G
capacity 18%
altroot -
health ONLINE
guid 1573491433247481682
version 6
bootfs rootpool/rootfs
delegation on
autoreplace off
cachefile -
failmode wait

We've scrubbed the pool; the config is a rootpool mirrored across two
disks, c3t0d0s5 and c3t3d0s5.

On Dec 18, 2007, at 8:03 PM, Rob Logan wrote:


 panic[cpu0]/thread=fbc257a0: cannot mount root path /[EMAIL 
 PROTECTED],0/

 when booted from snv_77 type:

 zpool import rootpool
 zpool get bootfs rootpool
 mkdir /mnt
 mount -F zfs the bootfs string /mnt

 my guess is it will fail... so then do
 zfs list

 and find one that will mount, then

 zpool set bootfs=root/snv_77 rootpool
 grep zfs /mnt/etc/vfstab

 and verify it matches what you set bootfs to

 also take a peek at /rootpool/boot/grub/menu.lst
 but it will be fine..

   Rob


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Rob Logan

  bootfs rootpool/rootfs

does grep zfs /mnt/etc/vfstab look like:

rootpool/rootfs  -  /  zfs  -  no  -

(bet it doesn't... edit like above and reboot)

or second guess (well, third :-) is your theory that
can be checked with:

zpool import rootpool
zpool import datapool
mkdir /mnt
mount -F zfs rootpool/rootfs /mnt
tail /mnt/boot/solaris/filelist.ramdisk
echo look for (no leading /)   etc/zfs/zpool.cache
cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache
/usr/sbin/bootadm update-archive -R /mnt
reboot






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread David Magda

On Dec 18, 2007, at 12:23, Mike Gerdts wrote:

 2) Database files - I'll lump redo logs, etc. in with this.  In Oracle
RAC these must live on a shared-rw (e.g. clustered VxFS, NFS) file
system.  ZFS does not do this.

If you can use NFS, can't you put things on ZFS and then export?
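
Something along these lines, with made-up pool, dataset and host names,
assuming the database nodes then mount it over NFS:

  zfs create tank/oradata
  zfs set sharenfs='rw=dbnode1:dbnode2,root=dbnode1:dbnode2' tank/oradata
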
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs boot suddenly not working

2007-12-18 Thread Michael Hale

On Dec 18, 2007, at 6:15 PM, Michael Hale wrote:

 We have a machine that is configured with zfs boot , Nevada v67- we  
 have two pools, rootpool and datapool.  It has been working ok since  
 June.  Today it kernel panicked and now when we try to boot it up,  
 it gets to the grub screen, we select ZFS, and then there is a  
 kernel panic that flashes by too quickly for us to see and then it  
 reboots.

We tried importing the pools, exporting the pools, and then
reimporting the pools to generate new zpool.cache files - is the
file format the same between nv67 and nv77?
--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Michael Hale

On Dec 18, 2007, at 8:26 PM, Rob Logan wrote:


 bootfs rootpool/rootfs

 does grep zfs /mnt/etc/vfstab look like:

 rootpool/rootfs  -  /  zfs  -  no  -

 (bet it doesn't... edit like above and reboot)

That is exactly what it looks like :^(



 or second guess (well, third :-) is your theory that
 can be checked with:

 zpool import rootpool
 zpool import datapool
 mkdir /mnt
 mount -F zfs rootpool/rootfs /mnt
 tail /mnt/boot/solaris/filelist.ramdisk
 echo look for (no leading /)   etc/zfs/zpool.cache
 cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache
 /usr/sbin/bootadm update-archive -R /mnt
 reboot


We're trying this now. This seems to have worked! :^)  I guess the  
zpool.cache in the bootimage got corrupted?

--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Rob Logan

   I guess the zpool.cache in the bootimage got corrupted?
not on zfs :-)   perhaps a path to a drive changed?

Rob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Michael Hale

On Dec 18, 2007, at 9:15 PM, Rob Logan wrote:


 I guess the zpool.cache in the bootimage got corrupted?
 not on zfs :-)   perhaps a path to a drive changed?

heh  - probably.

This is off topic but now this brings us to another problem...

My fellow sysadmin here at work was trying to get Solaris 10 to talk to
our OpenLDAP server.  He ran ldapclient with a manual config to set up
password authentication.  Upon reboot, it displayed the hostname and
then said:

ldap nis domain name is

and then would hang.  At that point, it was taken into single user  
mode and he ran:

ldapclient uninit

which wiped out the LDAP configuration, but now upon boot, the machine  
says:

Hostname: mbox02
NIS domain name is

and just hangs.  If we try to boot into single user mode, it asks for
the root password - we type it in and then it just hangs.  We've
waited several minutes now and it just seems to be locked up there.

I know this is off topic but does anybody have any ideas?

--
Michael Hale
[EMAIL PROTECTED]
http://www.gift-culture.org




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: zfs boot suddenly not working

2007-12-18 Thread Mark J Musante

I can think of two things to check:

First, is there a 'bootfs' line in your grub entry?  I didn't see it  
in the original email; not sure if it was left out or it simply isn't  
present.  If it's not present, ensure the 'bootfs' property is set on  
your pool.


Secondly, ensure that there's a zpool.cache entry in the  
filelist.ramdisk (if not, add it and re-run 'bootadm update-archive')
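
A rough sketch of that second check, assuming the standard x86 layout
used elsewhere in this thread:

  grep zpool.cache /boot/solaris/filelist.ramdisk

and, only if the entry is missing:

  echo etc/zfs/zpool.cache >> /boot/solaris/filelist.ramdisk
  bootadm update-archive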


If that doesn't do the trick, take a look at this page and see if  
there's anything that'll help:

http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/

Regards,
markm

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread Louwtjie Burger
On 12/19/07, David Magda [EMAIL PROTECTED] wrote:

 On Dec 18, 2007, at 12:23, Mike Gerdts wrote:

  2) Database files - I'll lump redo logs, etc. in with this.  In Oracle
 RAC these must live on a shared-rw (e.g. clustered VxFS, NFS) file
 system.  ZFS does not do this.

 If you can use NFS, can't you put things on ZFS and then export?

Is it a good idea to put an Oracle database on the other end of an NFS
mount? (performance-wise)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SAS cabling was Re: aoc-sat2-mv8

2007-12-18 Thread James C. McPherson
Al Hopper wrote:
 On Tue, 18 Dec 2007, James C. McPherson wrote:
 
 ... snip .
 That will be very good info to have - there's too little information
 and personal experience surrounding the SAS cabling world as yet.
 
 Here's a couple of resources:
 
 SAS Integrators Guide:
 wget http://www.lsi.com/documentation/storage/megaraid/SAS_IG.pdf
 
 You can always get good advice from:
 http://www.cs-electronics.com/


thanks for the links Al, duly bookmarked.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss