Re: [zfs-discuss] mixing raidz1 and raidz2 in same pool

2007-12-07 Thread Anton B. Rang
There won't be a performance hit beyond that of RAIDZ2 vs. RAIDZ.

But you'll wind up with a pool that fundamentally has only single-disk-failure 
tolerance, so I'm not sure it's worth it (at least until there's a mechanism 
for replacing the remaining raidz1 vdevs with raidz2).
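
For anyone curious what such a mixed pool looks like, a rough sketch (pool and
device names are invented; if memory serves, -f is needed because the
replication levels of the two vdevs differ):

  zpool create tank raidz c0t0d0 c0t1d0 c0t2d0            # existing raidz1 vdev
  zpool add -f tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0    # add a raidz2 vdev alongside it
  zpool status tank                                       # one raidz1 + one raidz2 top-level vdev

Lose any two disks of the raidz1 vdev and the whole pool is gone, which is the
point above.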
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd prioritisation issues.

2007-12-07 Thread Anton B. Rang
 I was under the impression that real-time processes essentially trump all
 others, and I'm surprised by this behaviour; I had a dozen or so RT-processes
 sat waiting for disc for about 20s.

Process priorities on Solaris affect CPU scheduling, but not (currently) I/O 
scheduling nor memory usage.

 *  Is this a ZFS issue?  Would we be better using another filesystem?

It is a ZFS issue, though depending on your I/O patterns, you might be able to 
see similar starvation on other file systems.  In general, other file systems 
issue I/O independently, so on average each process will make roughly equal 
forward progress on a continuous basis.  You still don't have guaranteed I/O 
rates (in the sense that XFS on SGI, for instance, provides).

 *  Is there any way to mitigate against it?  Reduce the number of iops
 available for reading, say?
 Is there any way to disable or invert this behaviour?

I'll let the ZFS developers tackle this one 

---

Have you considered using two systems (or two virtual systems) to ensure that 
the writer isn't affected by reads? Some QFS customers use this configuration, 
with one system writing to disk and another system reading from the same disk. 
This requires the use of a SAN file system but it provides the potential for 
much greater (and controllable) throughput. If your I/O needs are modest (less 
than a few GB/second), this is overkill.

Anton
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd prioritisation issues.

2007-12-07 Thread Darren J Moffat
Dickon Hood wrote:
 We've got an interesting application which involves receiving lots of
 multicast groups, and writing the data to disc as a cache.  We're
 currently using ZFS for this cache, as we're potentially dealing with a
 couple of TB at a time.
 
 The threads writing to the filesystem have real-time SCHED_FIFO priorities
 set to 25.  The processes recovering data from the cache and moving it
 elsewhere are niced at +10.
 
 We're seeing the writes stall in favour of the reads.  For normal
 workloads I can understand the reasons, but I was under the impression
 that real-time processes essentially trump all others, and I'm surprised
 by this behaviour; I had a dozen or so RT-processes sat waiting for disc
 for about 20s.

Are the files opened with O_DSYNC or does the application call fsync ?
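
If it's easier to check from the outside than from the source, something like
the following should show it (the pid placeholder is whatever the writer
process is; pfiles on the same pid should also list the open flags on each
descriptor):

  truss -p <writer-pid> 2>&1 | egrep 'open|sync'   # look for O_DSYNC flags or fsync calls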

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Error in zpool man page?

2007-12-07 Thread jonathan soons
The man page gives this form:
 zpool create [-fn] [-R root] [-m mountpoint] pool vdev ...
however, lower down, there is this command:
# zpool create mirror c0t0d0 c0t1d0 mirror c1t0d0 c1t1d0
Isn't the pool element missing in the command?
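
For comparison, the form with the pool name included would read, e.g. with a
placeholder pool name 'tank':

  zpool create tank mirror c0t0d0 c0t1d0 mirror c1t0d0 c1t1d0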
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd prioritisation issues.

2007-12-07 Thread Dickon Hood
On Fri, Dec 07, 2007 at 12:38:11 +, Darren J Moffat wrote:
: Dickon Hood wrote:
: We've got an interesting application which involves receiving lots of
: multicast groups, and writing the data to disc as a cache.  We're
: currently using ZFS for this cache, as we're potentially dealing with a
: couple of TB at a time.

: The threads writing to the filesystem have real-time SCHED_FIFO priorities
: set to 25.  The processes recovering data from the cache and moving it
: elsewhere are niced at +10.

: We're seeing the writes stall in favour of the reads.  For normal
: workloads I can understand the reasons, but I was under the impression
: that real-time processes essentially trump all others, and I'm surprised
: by this behaviour; I had a dozen or so RT-processes sat waiting for disc
: for about 20s.

: Are the files opened with O_DSYNC or does the application call fsync ?

No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for the
inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Odd prioritisation issues.

2007-12-07 Thread Dickon Hood
We've got an interesting application which involves receiving lots of
multicast groups, and writing the data to disc as a cache.  We're
currently using ZFS for this cache, as we're potentially dealing with a
couple of TB at a time.

The threads writing to the filesystem have real-time SCHED_FIFO priorities
set to 25.  The processes recovering data from the cache and moving it
elsewhere are niced at +10.

We're seeing the writes stall in favour of the reads.  For normal
workloads I can understand the reasons, but I was under the impression
that real-time processes essentially trump all others, and I'm surprised
by this behaviour; I had a dozen or so RT-processes sat waiting for disc
for about 20s.

My questions:

  *  Is this a ZFS issue?  Would we be better using another filesystem?

  *  Is there any way to mitigate against it?  Reduce the number of iops
 available for reading, say?

  *  Is there any way to disable or invert this behaviour?

  *  Is this a bug, or should it be considered one?

Thanks.

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for the
inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread Cindy . Swearingen
Jonathan,

I think I remember seeing this error in an older Solaris release. The 
current zpool.1m man page doesn't have this error unless I'm missing it:

http://docs.sun.com/app/docs/doc/819-2240/zpool-1m

In a current Solaris release, this command fails as expected:

# zpool create mirror c0t2d0 c0t4d0
cannot create 'mirror': name is reserved
pool name may have been omitted

mirror is reserved as described on page 39 of the ZFS Admin
Guide, here:

http://opensolaris.org/os/community/zfs/docs/zfsadmin.pdf

Cindy

jonathan soons wrote:
 The man page gives this form:
  zpool create [-fn] [-R root] [-m mountpoint] pool vdev ...
 however, lower down, there is this command:
 # zpool create mirror c0t0d0 c0t1d0 mirror c1t0d0 c1t1d0
 Isn't the pool element missing in the command?
  
  
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd prioritisation issues.

2007-12-07 Thread Darren J Moffat
Dickon Hood wrote:
 On Fri, Dec 07, 2007 at 12:38:11 +, Darren J Moffat wrote:
 : Dickon Hood wrote:
 : We've got an interesting application which involves receiving lots of
 : multicast groups, and writing the data to disc as a cache.  We're
 : currently using ZFS for this cache, as we're potentially dealing with a
 : couple of TB at a time.
 
 : The threads writing to the filesystem have real-time SCHED_FIFO priorities
 : set to 25.  The processes recovering data from the cache and moving it
 : elsewhere are niced at +10.
 
 : We're seeing the writes stall in favour of the reads.  For normal
 : workloads I can understand the reasons, but I was under the impression
 : that real-time processes essentially trump all others, and I'm surprised
 : by this behaviour; I had a dozen or so RT-processes sat waiting for disc
 : for about 20s.
 
 : Are the files opened with O_DSYNC or does the application call fsync ?
 
 No.  O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND.  Would that help?

Don't know if it will help, but it will be different :-).  I suspected 
that since you put the processes in the RT class you would also be doing 
synchronous writes.

If you can test this it may be worth doing so for the sake of gathering 
another data point.

-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Moving ZFS file system to a different system

2007-12-07 Thread Robert Milkowski
Hello Walter,

Thursday, December 6, 2007, 7:05:54 PM, you wrote:

Hi All,
We are currently having a hardware issue with our zfs file server, hence the file system is unusable.
We are planning to move it to a different system.

The setup on the file server when it was running was

bash-3.00# zpool status
 pool: store1
state: ONLINE
scrub: none requested
config:

NAME        STATE   READ WRITE CKSUM
backup      ONLINE     0     0     0
  c1t2d1    ONLINE     0     0     0
  c1t2d2    ONLINE     0     0     0
  c1t2d3    ONLINE     0     0     0
  c1t2d4    ONLINE     0     0     0
  c1t2d5    ONLINE     0     0     0

errors: No known data errors

 pool: store2
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
 see: http://www.sun.com/msg/ZFS-8000-9P
scrub: none requested
config:

NAME        STATE   READ WRITE CKSUM
store       ONLINE     0     0     1
  c1t3d0    ONLINE     0     0     0
  c1t3d1    ONLINE     0     0     0
  c1t3d2    ONLINE     0     0     1
  c1t3d3    ONLINE     0     0     0
  c1t3d4    ONLINE     0     0     0

errors: No known data errors

The store1 was an external raid device with a slice configured to boot the system (plus swap) and the remaining disk space configured for use with zfs.

The store2 was a similar external raid device which had all slices configured for use with zfs.

Since both are scsi raid devices, we are thinking of booting up the former using a different Sun box.

Are there some precautions to be taken to avoid any data loss?

Thanks,
--W

Just make sure the external storage is not connected to both hosts at the same time.
Once you connect it to another host, simply import both pools with -f (as the pools weren't cleanly exported, I guess).
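
Roughly, on the new host (pool names taken from the zpool status output quoted above):

  zpool import -f store1      # -f since the pools weren't exported from the old host
  zpool import -f store2
  zpool status -x             # quick health check after the import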


Please also notice that you've encountered one uncorrectable error in the store2 pool.
Well, actually it looks like it was corrected, judging from the message.
IIRC it's a known bug (should have been fixed already) - a metadata cksum error propagates to the top-level vdev unnecessarily.

--
Best regards,
Robert Milkowski  mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread jonathan soons
SunOS 5.10  Last change: 25 Apr 2006 

Yes, I see that my other server is more up to date.

SunOS 5.10  Last change: 13 Feb 2007   
This one was recently installed.

Is there a patch that was not included with 10_Recommended?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Same device added twice

2007-12-07 Thread Tony Dalton
Hi,

I'm new to the list and fairly new to ZFS so hopefully this isn't a dumb 
question, but...

I just inadvertently added s0 of a disk to a zpool, and then added the 
entire device:

NAME                                   STATE     READ WRITE CKSUM
zfs-bo                                 ONLINE       0     0     0
  c4t60060E801419DC0119DC01A0d0        ONLINE       0     0     0
  c4t600A0B800026A5EC07FE47557497d0s0  ONLINE       0     0     0
  c4t600A0B800026A5EC07FE47557497d0    ONLINE       0     0     0

This strikes me as bad - should I be able to do this?
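
(For what it's worth, zpool add accepts -n for a dry run, which would have
shown the resulting layout before committing - the device name is shortened
here:

  zpool add -n zfs-bo c4t600A0B...d0    # print what the pool would look like without adding anything

As the status above shows, ZFS ended up treating the slice and the whole disk
as two separate top-level vdevs.)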

Thanks,

Tony


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs rollback without unmounting a file system

2007-12-07 Thread Robert Milkowski
Hello zfs-discuss,

  http://bugs.opensolaris.org/view_bug.do?bug_id=6421210

1. App opens and creates an empty file /pool/fs1/file1
2. zfs snapshot pool/[EMAIL PROTECTED]
3. App writes something to file and still keeps it open
4. zfs rollback pool/[EMAIL PROTECTED]

Now what happens to the fd the App is using? What are the file contents from the App's
point of view as long as it's still open?
New opens of that file will open an empty file, I guess.
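
A shell sketch of the sequence in the bug report (the snapshot name is a
placeholder; fd 3 stands in for the app's open descriptor):

  zfs create pool/fs1
  exec 3>/pool/fs1/file1          # 1: create the empty file and hold fd 3 open
  zfs snapshot pool/fs1@before    # 2: snapshot while the file is empty
  echo 'some data' >&3            # 3: write through the still-open fd
  zfs rollback pool/fs1@before    # 4: roll back while fd 3 is still open
  cat /pool/fs1/file1             # what does a fresh open see now?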


  

-- 
Best regards,
 Robert Milkowski  mailto:[EMAIL PROTECTED]
 http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help replacing dual identity disk in ZFS raidz and SVM mirror

2007-12-07 Thread Robert Milkowski
Hello Matt,

Monday, December 3, 2007, 8:36:28 PM, you wrote:

MB Hi,

MB We have a number of 4200's setup using a combination of an SVM 4
MB way mirror and a ZFS raidz stripe.

MB Each disk (of 4) is divided up like this

MB / 6GB UFS s0 
MB Swap 8GB s1
MB /var 6GB UFS s3
MB Metadb 50MB UFS s4
MB /data 48GB ZFS s5 

MB For SVM we do a 4 way mirror on /,swap, and /var
MB So we have 3 SVM mirrors
MB d0=root (sub mirrors d10, d20, d30, d40)
MB d1=swap (sub mirrors d11, d21,d31,d41)
MB d3=/var (sub mirrors d13,d23,d33,d43)

MB For ZFS we have a single Raidz set across all four disks s5

MB Everything has worked flawlessly for some time. This week we
MB discovered that one of our 4200's is reporting some level of
MB failure with regards to one of the disks

MB We see these recurring errors in the syslog
MB Dec  3 12:00:47 vfcustgfs02b scsi: [ID 107833 kern.notice] 
MB Vendor: FUJITSUSerial Number: 0616S02DD5
MB Dec  3 12:00:47 vfcustgfs02b scsi: [ID 107833 kern.notice]  Sense Key: Media Error
MB Dec  3 12:00:47 vfcustgfs02b scsi: [ID 107833 kern.notice] 
MB ASC: 0x15 (mechanical positioning error), ASCQ: 0x1, FRU: 0x0

MB When we run a metastat we see that 2 of the 3 SVM mirrors are
MB reporting that the failing disk's submirror needs maintenance.
MB Oddly enough, the third SVM mirror reports no issues, making me
MB think there is a media error on the disk that only happens to
MB affect 2 of the 3 SVM-mirrored slices on that disk.

MB Also zpool status reports read issues on the failing disk

MB config:

MB NAME          STATE     READ WRITE CKSUM
MB zpool         ONLINE       0     0     0
MB   raidz       ONLINE       0     0     0
MB     c0t0d0s5  ONLINE       0     0     0
MB     c0t1d0s5  ONLINE      50     0     0
MB     c0t2d0s5  ONLINE       0     0     0
MB     c0t3d0s5  ONLINE       0     0     0

MB So my question is what series of steps do we need to perform
MB given the fact that I have one disk out of four that hosts a zfs
MB raidz on one slice, and SVM mirrors on 3 other slices, but only 2
MB of the 3 SVM mirrors report requiring maintenance.

MB We want to keep the data integrity in place (obviously) 
MB The server is still operational, but we want to take this
MB opportunity to hammer out these steps.


If you can add another disk then do it, and replace the failing one with
the new one in both SVM and ZFS (one by one - should be faster).

I guess you can't add another disk.
Then detach the disk from SVM (no need for those metadevices which are already
in maintenance mode), detach (offline) it from the zfs pool, destroy the metadb
replica on that disk, write down the vtoc
(prtvtoc), use cfgadm -c unconfigure or -disconnect, remove the disk, put in the
new one, label it (fmthard) the same, put back the metadb, attach it to zfs
(online it, actually) first - as you are risking your data in your
config on the zfs side while still having a 3-way mirror on SVM - and once
that's done, attach it (replace) in SVM.
Probably install a bootblock too.
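
A rough command-level sketch of that sequence, assuming (as the metadevice
naming above suggests) that d20/d23 are the submirrors of d0/d3 on the failing
c0t1d0, and that the pool really is named 'zpool' as in the status output -
double-check everything against your own metastat and zpool status first:

  metadetach -f d0 d20                  # drop the failing submirrors (-f: they need maintenance)
  metadetach -f d3 d23
  metadb -d c0t1d0s4                    # remove the metadb replica on the bad disk
  zpool offline zpool c0t1d0s5          # take the slice out of the raidz
  prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/vtoc
  cfgadm -c unconfigure <ap_id>         # <ap_id> from cfgadm -al; then swap the disk
  fmthard -s /tmp/vtoc /dev/rdsk/c0t1d0s2
  metadb -a c0t1d0s4
  zpool online zpool c0t1d0s5           # or zpool replace zpool c0t1d0s5 if ZFS won't take the new disk as-is
  metattach d0 d20 ; metattach d3 d23   # then resync the SVM mirrors
  # finally restore the boot block (installboot on SPARC, installgrub on x86)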




-- 
Best regards,
 Robert Milkowskimailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Darren J Moffat
I believe the data dedup is also a feature of NTFS.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Trial x4500, zfs with NFS and quotas.

2007-12-07 Thread Robert Milkowski
Hello Jorgen,

Honestly - I don't think zfs is a good solution to your problem.

What you could try to do however when it comes to x4500 is:

1. Use SVM+UFS+user quotas
2. Use zfs and create several (like up to 20? so each stays below 1TB)
   ufs file systems on zvols and then apply user quotas at the ufs level -
   I would say you are risking a lot of possible strange interactions
   here - maybe it will work perfectly, maybe not. I would also be
   concerned about file system consistency. (Rough sketch below.)
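
   Something like the following for each volume - pool, volume, mount point
   and sizes are all made up, and the usual UFS quota setup applies on top:

     zfs create -V 900g tank/homevol01            # zvol backing one UFS file system
     newfs /dev/zvol/rdsk/tank/homevol01
     mkdir -p /export/home01
     mount /dev/zvol/dsk/tank/homevol01 /export/home01
     touch /export/home01/quotas                  # then UFS user quotas as usual
     edquota someuser
     quotaon /export/home01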

   

-- 
Best regards,
 Robert Milkowskimailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread Mike Dotson

On Fri, 2007-12-07 at 08:02 -0800, jonathan soons wrote:
 The man page gives this form:
  zpool create [-fn] [-R root] [-m mountpoint] pool vdev ...
 however, lower down, there is this command:
 # zpool create mirror c0t0d0 c0t1d0 mirror c1t0d0 c1t1d0
 Isn't the pool element missing in the command?

In the command you pasted above yes, however, looking at the man pages I
have, I see the correct command line.  What OS and rev was this from?

  
 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Mail system errors (On Topic).

2007-12-07 Thread Wade . Stuart

I keep getting ETOOMUCHTROLL errors thrown while reading this list,  is
there a list admin that can clean up the mess?   I would hope that repeated
personal attacks could be considered grounds for removal/blocking.

Wade Stuart
Fallon Worldwide
P: 612.758.2660
C: 612.877.0385

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread jonathan soons
mis _HOLD_ # cat /etc/release
   Solaris 10 6/06 s10s_u2wos_09a SPARC
   Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
 Assembled 09 June 2006
mis _HOLD_ #
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] /dev/zfs ioctl performance

2007-12-07 Thread Scott
Hello,

I have been trying to chase down some ZFS performance issues, and I was hoping 
someone with more ZFS experience might be able to comment.

When running a zfs list command, it often takes several minutes to complete.  
I see similar behavior when running most other ZFS commands, such as zfs set, 
or when creating a snapshot.  While it is running, the load of the server stays 
approximately the same (0.30), no process goes above 1% CPU usage, and the 
server does not utilize swap.

The server I am using for testing is running Solaris Express DE 9/07, based on 
B70.  It has 15 SATA drives in a RAID-Z2 pool, and the actual I/O performance 
is good as far as I can tell.  The pool contains 150 filesystems, including 
snapshots.  It is an active server, and generally has several users writing to 
the disk at any one time.

Below is a snippet of output from a truss -d zfs list:

...
 0.0283 open("/dev/zfs", O_RDWR) = 3
...
 1.7142 ioctl(3, ZFS_IOC_DATASET_LIST_NEXT, 0x08045D38) = 0
 2.6371 ioctl(3, ZFS_IOC_OBJSET_STATS, 0x08043E98) = 0
 2.7222 ioctl(3, ZFS_IOC_DATASET_LIST_NEXT, 0x08044D88) = 0
 4.6572 ioctl(3, ZFS_IOC_OBJSET_STATS, 0x08042EE8) = 0
...

As you can see, it spends quite a bit of time waiting for the return of ioctl 
calls on /dev/zfs.  Since the command has to make many of these calls, the 
cumulative delays add up to a significant wait.
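
(If it helps anyone trying to reproduce this, truss can report the time spent
inside each call directly; -E adds an elapsed-time column alongside -d's
timestamps:

  truss -d -E zfs list > /dev/null

the big numbers should then show up against the ZFS_IOC_* ioctls.)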

Any ideas?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Anton B. Rang
 NOTHING anton listed takes the place of ZFS

That's not surprising, since I didn't list any file systems.

Here's a few file systems, and some of their distinguishing features.  None of 
them do exactly what ZFS does.  ZFS doesn't do what they do, either.

QFS: Very, very fast.  Supports segregation of data from metadata, and classes 
of data.  Supports SAN access to data.

XFS: Also fast; works efficiently on multiprocessors (in part because 
allocation can proceed in parallel).  Supports SAN access to data (CXFS).  
Delayed allocation allows temporary files to stay in memory and never even be 
written to disk (and improves contiguity of data on disk).

JFS: Another very solid journaled file system.

GPFS: Yet another SAN file system, with tighter semantics than QFS or XFS; 
highly reliable.

StorNext: Hey, it's another SAN file system!  Guaranteed I/O rates (hmmm, which 
XFS has too, at least on Irix) -- a key for video use.

SAMFS: Integrated archiving -- got petabytes of data that you need virtually 
online?  SAM's your man!  (well, at least your file system)

AdvFS: A journaled file system with snapshots, integrated volume management, 
online defragmentation, etc.

VxFS: Everybody knows, right?  Journaling, snapshots (including writable 
snapshots), highly tuned features for databases, block-level change tracking 
for more efficient backups, etc.

There are many, many different needs.  There's a reason why there is no one 
true file system.

-- Anton

 Better yet, you get back to writing that file system
 that's going to fix all these horrible deficiencies
 in zfs.

Ever heard of RMS?

A file system which supports not only sequential access to files, or random 
access, but keyed access.  (e.g. update the record whose key is 123)?

A file system which allowed any program to read any file, without needing to 
know about its internal format?  (so such an indexed file could just be read as 
a sequence of ordered records by applications which processed ordinary text 
files.)

A file system which could be shared between two, or even more, running 
operating systems, with direct access from each system to the disks.

A file system with features like access control with alarms, MAC security on a 
per-file basis, multiple file versions, automatic deletion of temporary files, 
verify-after-write.

You probably wouldn't be interested; but others would. It solves a particular 
set of needs (primarily in the enterprise market).  It did it very well.  It 
did it some 30 years before ZFS.  It's very much worthwhile listening to those 
who built such a system, and their experiences, if your goal is to learn about 
file systems.  Even if they don't suffer fools gladly.



If you've got a problem for which ZFS is the best solution, great.  Use it.  
But don't think that it solves every problem, nor that it's perfect for 
everyone -- even you.

(One particular area to think about -- how do you back up your multi-terabyte 
pool?  And how do you restore an individual file from your backups?)
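
(For anyone wondering, the usual ZFS-native answer to that last question is
snapshot replication plus the .zfs/snapshot directory for per-file restores -
pool, dataset and host names below are placeholders:

  zfs snapshot -r tank@backup-2007-12-07
  zfs send tank/home@backup-2007-12-07 | ssh backuphost zfs receive backup/home
  ls /backup/home/.zfs/snapshot/backup-2007-12-07/   # browse and copy individual files back

whether that counts as a real backup strategy is, of course, exactly the sort
of question being raised.)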
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread can you guess?
  You have me at a disadvantage here, because I'm not
  even a Unix (let alone Solaris and Linux) aficionado.
  But don't Linux snapshots in conjunction with rsync
  (leaving aside other possibilities that I've never
  heard of) provide rather similar capabilities (e.g.,
  incremental backup or re-synching), especially when
  used in conjunction with scripts and cron?
 
 
 Which explains why you keep ranting without knowing
 what you're talking about.

Au contraire, cookie:  I present things in detail to make it possible for 
anyone capable of understanding the discussion to respond substantively if 
there's something that requires clarification or further debate.

You, by contrast, babble on without saying anything substantive at all - which 
makes you kind of amusing, but otherwise useless.  You could at least have 
tried to answer my question above, since you took the trouble to quote it - but 
of course you didn't, just babbled some more.

  Copy-on-write.  Even a
 bookworm with 0 real-life-experience should be able
 to apply this one to a working situation.  

As I may well have been designing and implementing file systems since before 
you were born (or not:  you just have a conspicuously callow air about you), my 
'real-life' experience with things like COW is rather extensive.  And while I 
don't have experience with Linux adjuncts like rsync, unlike some people I'm 
readily able to learn from the experience of others (who seem far more credible 
when describing their successful use of rsync and snapshots on Linux than 
anything I've seen you offer up here).

 
 There's a reason ZFS (and netapp) can take snapshots
 galore without destroying their filesystem
 performance.

Indeed:  it's because ZFS already sacrificed a significant portion of that 
performance by disregarding on-disk contiguity, so there's relatively little 
left to lose.  By contrast, systems that respect the effects of contiguity on 
performance (and WAFL does to a greater degree than ZFS) reap its benefits all 
the time (whether snapshots exist or not) while only paying a penalty when data 
is changed (and they don't have to change as much data as ZFS does because they 
don't have to propagate changes right back to the root superblock on every 
update).

It is possible to have nearly all of the best of both worlds, but unfortunately 
not with any current implementations that I know of.  ZFS could at least come 
considerably closer, though, if it reorganized opportunistically as discussed 
in the database thread.

(By the way, since we're talking about snapshots here rather than about clones 
it doesn't matter at all how many there are, so your 'snapshots galore' bluster 
above is just more evidence of your technical incompetence:  with any 
reasonable implementation the only run-time overhead occurs in keeping the most 
recent snapshot up to date, regardless of how many older snapshots may also be 
present.)

But let's see if you can, for once, actually step up to the plate and discuss 
something technically, rather than spout buzzwords that you apparently don't 
come even close to understanding:

Are you claiming that writing snapshot before-images of modified data (as, 
e.g., Linux LVM snapshots do) for the relatively brief period that it takes to 
transfer incremental updates to another system 'destroys' performance?  First 
of all, that's clearly dependent upon the update rate during that interval, so 
if it happens at a quiet time (which presumably would be arranged if its 
performance impact actually *was* a significant issue) your assertion is 
flat-out-wrong.  Even if the snapshot must be processed during normal 
operation, maintaining it still won't be any problem if the run-time workload 
is read-dominated.

And I suppose Sun must be lying in its documentation for fssnap (which Sun has 
offered since Solaris 8 with good old update-in-place UFS) where it says "While 
the snapshot is active, users of the file system might notice a slight 
performance impact [as contrasted with your contention that performance is 
'destroyed'] when the file system is written to, but they see no impact when 
the file system is read" 
(http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWaadm/SYSADV1/p185.html). 
You'd really better contact them right away and set them straight.
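
(For reference, fssnap usage looks roughly like this - mount point, backing
store and tape device are illustrative:

  fssnap -F ufs -o bs=/var/tmp/home.bs /export/home   # prints the snapshot device, e.g. /dev/fssnap/0
  ufsdump 0uf /dev/rmt/0 /dev/rfssnap/0               # back up the frozen image
  fssnap -d /export/home                              # delete the snapshot when done

the backing store is where the copy-on-write before-images land, which is what
the performance discussion here is about.)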

Normal system cache mechanisms should typically keep about-to-be-modified data 
around long enough to avoid the need to read it back in from disk to create the 
before-image for modified data used in a snapshot, and using a log-structured 
approach to storing these BIs in the snapshot file or volume (though I don't 
know what specific approaches are used in fssnap and LVM:  do you?) would be 
extremely efficient - resulting in minimal impact on normal system operation 
regardless of write activity.

C'mon, cookie:  surprise us for once - say something intelligent.  With 
guidance and practice, you might even be able to make a habit of it.

- bill
 
 

Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Anton B. Rang
 There is a category of errors that are 
 not caused by firmware, or any type of software. The
 hardware just doesn't write or read the correct bit value this time
 around. Without a checksum there's no way for the firmware to know, and
 next time it very well may write or read the correct bit value from the
 exact same spot on the disk, so scrubbing is not going to flag this
 sector as 'bad'.

There seems to be a lot of ignorance about how disks actually work in this 
thread.

Here's the data path, to a first approximation.

  Processor <=> RAM <=> controller RAM <=> disk cache RAM <=> read/write head <=> media

There are four buses in the above (which is a slight oversimplification): the 
processor/memory bus, the internal I/O bus (e.g. PCI), the external I/O bus 
(e.g. SATA), and the internal disk bus. (The last arrow isn't a bus, it's the 
magnetic field.)

Errors can be introduced at any point and there are corresponding error 
detection and correction mechanisms at each point.

Processor: Usually parity on internal registers & buses, ECC on larger cache.
Processor/memory bus: Usually ECC (SECDED).
RAM: Usually SECDED or better for better servers, parity for cheap servers, 
nothing @ low-end.
Internal I/O bus: Usually parity (PCI) or CRC (PCI-E).
Controller RAM: Usually parity for low-end controllers, rarely ECC for high-end 
controllers.
External I/O bus: Usually CRC.
Disk cache RAM: Usually parity for low-end disks, ECC for high-end disks.
Internal disk bus: Media ECC.
Read/write head: N/A, doesn't hold bits.
Media: Media ECC.

The disk, as it's transferring data from its cache to the media, adds a very 
large and complex error-correction coding to the data. This protects against a 
huge number of errors, 20 or more bits in a single 512-byte block.  This is 
because the media is very noisy.

So there is far *better* protection than a checksum for the data once it gets 
to the disk, and you can't possibly (well, not within any reasonable 
probability) return bad data from disk.  You'll get an I/O error (media error 
in SCSI parlance) instead.

ZFS protects against an error introduced between memory and the disk.  "Aha!", 
you say, "there's a lot of steps there, and we could get an error at any 
point!"  There are a lot of points there, but very few where the data isn't 
already protected by either CRC or parity.  (Why do controllers usually use 
parity internally?  The same reason the processor uses parity for L1; access is 
speed-critical, and the data is "live" in the cache/FIFO for such a small 
amount of time that the probability of a multi-bit error is negligible.)

 Now you may claim that this type of error happens so infrequently that 
 it's not worth it.

I do claim that the error you described -- a bit error on the disk, undetected 
by the disk's ECC -- is infrequent to the point of being negligible.  The much 
more frequent case, an error which is detected but not corrected by ECC, is 
handled by simple mirroring.

Anton
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Tim Cook
 You have me at a disadvantage here, because I'm not
 even a Unix (let alone Solaris and Linux) aficionado.
 But don't Linux snapshots in conjunction with rsync
 (leaving aside other possibilities that I've never
 heard of) provide rather similar capabilities (e.g.,
 incremental backup or re-synching), especially when
  used in conjunction with scripts and cron?
 


Which explains why you keep ranting without knowing what you're talking about.  
Copy-on-write.  Even a bookworm with 0 real-life-experience should be able to 
apply this one to a working situation.  

There's a reason ZFS (and netapp) can take snapshots galore without destroying 
their filesystem performance.  Hell this one even applies *IN THEORY*, so you 
might not even have to *slum* with any real-world usage to grasp the concept.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mail system errors (On Topic).

2007-12-07 Thread can you guess?
 
 I keep getting ETOOMUCHTROLL errors thrown while reading this list,  is
 there a list admin that can clean up the mess?   I would hope that
 repeated personal attacks could be considered grounds for removal/blocking.

Actually, most of your more unpleasant associates here seem to suffer primarily 
from blind and misguided loyalty and/or an excess of testosterone - so there's 
always hope that they'll grow up over time and become productive contributors.  
And if I'm not complaining about their attacks but just dealing with them in 
kind while carrying on more substantive conversations, it's not clear that they 
should pose a serious problem for others.

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread can you guess?
Once again, profuse apologies for having taken so long (well over 24 hours by 
now - though I'm not sure it actually appeared in the forum until a few hours 
after its timestamp) to respond to this.

 can you guess? wrote:
 
  Primarily its checksumming features, since other
 open source solutions support simple disk scrubbing
 (which given its ability to catch most deteriorating
 disk sectors before they become unreadable probably
 has a greater effect on reliability than checksums in
 any environment where the hardware hasn't been
 slapped together so sloppily that connections are
 flaky).

 From what I've read on the subject, that premise seems bad from the
 start.

Then you need to read more or understand it better.

  I don't believe that scrubbing will catch all
 the types of 
 errors that checksumming will.

That's absolutely correct, but it in no way contradicts what I said (and you 
quoted) above.  Perhaps you should read that again, more carefully:  it merely 
states that disk scrubbing probably has a *greater* effect on reliability than 
checksums do, not that it completely subsumes their features.

 There is a category of errors that are not caused by firmware, or any
 type of software. The hardware just doesn't write or read the correct
 bit value this time around. Without a checksum there's no way for the
 firmware to know, and next time it very well may write or read the
 correct bit value from the exact same spot on the disk, so scrubbing is
 not going to flag this sector as 'bad'.

It doesn't have to, because that's a *correctable* error that the disk's 
extensive correction codes (which correct *all* single-bit errors as well as 
most considerably longer error bursts) resolve automatically.

 
 Now you may claim that this type of error happens so
 infrequently

No, it's actually one of the most common forms, due to the desire to pack data 
on the platter as tightly as possible:  that's why those long correction codes 
were created.

Rather than comment on the rest of your confused presentation about disk error 
rates, I'll just present a capsule review of the various kinds:

1.  Correctable errors (which I just described above).  If a disk notices that 
a sector *consistently* requires correction it may deal with it as described in 
the next paragraph.

2.  Errors that can be corrected only with retries (i.e., the sector is not 
*consistently* readable even after the ECC codes have been applied, but can be 
successfully read after multiple attempts which can do things like fiddle 
slightly with the head position over the track and signal amplification to try 
to get a better response).  A disk may try to rewrite such a sector in place to 
see if its readability improves as a result, and if it doesn't will then 
transparently revector the data to a spare sector if one exists and mark the 
original sector as 'bad'.  Background scrubbing gives the disk an opportunity 
to discover such sectors *before* they become completely unreadable, thus 
significantly improving reliability even in non-redundant environments.

3.  Uncorrectable errors (bursts too long for the ECC codes to handle even 
after the kinds of retries described above, but which the ECC codes can still 
detect):  scrubbing catches these as well, and if suitable redundancy exists it 
can correct them by rewriting the offending sector (the disk may transparently 
revector it if necessary, or the LVM or file system can if the disk can't).  
Disk vendor specs nominally state that one such error may occur for every 10^14 
bits transferred for a contemporary commodity (ATA or SATA) drive (i.e., about 
once in every 12.5 TB), but studies suggest that in practice they're much rarer.

4.  Undetectable errors (errors which the ECC codes don't detect but which 
ZFS's checksums presumably would).  Disk vendors no longer provide specs for 
this reliability metric.  My recollection from a decade or more ago is that 
back when they used to it was three orders of magnitude lower than the 
uncorrectable error rate:  if that still obtained it would mean about once in 
every 12.5 petabytes transferred, but given that the real-world incidence of 
uncorrectable errors is so much lower than speced and that ECC codes keep 
increasing in length it might be far lower than that now.

...

  Aside from the problems that scrubbing handles (and
 you need scrubbing even if you have checksums,
 because scrubbing is what helps you *avoid* data loss
 rather than just discover it after it's too late to
 do anything about it), and aside from problems 
 Again I think you're wrong on the basis for your
 point.

No:  you're just confused again.

 The checksumming in ZFS (if I understand it correctly) isn't used for
 only detecting the problem. If the ZFS pool has any redundancy at all,
 those same checksums can be used to repair that same data, thus
 *avoiding* the data loss.

1.  Unlike things like disk ECC codes, ZFS's checksums can't 

Re: [zfs-discuss] ZFS on Freebsd 7.0

2007-12-07 Thread Peter Schuller
   NAME          STATE     READ WRITE CKSUM
   fatty         DEGRADED     0     0 3.71K
     raidz2      DEGRADED     0     0 3.71K
       da0       ONLINE       0     0     0
       da1       ONLINE       0     0     0
       da2       ONLINE       0     0     0
       da3       ONLINE       0     0   300
       da4       ONLINE       0     0     0
       da5       ONLINE       0     0     0
       da6       ONLINE       0     0   253
       da7       ONLINE       0     0     0
       da8       ONLINE       0     0     0
       spare     DEGRADED     0     0     0
         da9     OFFLINE      0     0     0
         da11    ONLINE       0     0     0
       da10      ONLINE       0     0     0
   spares
     da11        INUSE     currently in use

 errors: 801 data errors, use '-v' for a list


 After I detach the spare da11 and bring da9 back online all the errors
 go away.

Theory:

Suppose da3 and da6 are either bad drives, have cabling issues, or are on a 
controller suffering corruption (different from the other drives).

If you now were to replace da9 by da11, the resilver operation would be 
reading from these drives, thus triggering checksum issues. Once you bring 
da9 back in, it is either entirely up to date or very close to up to date, so 
the amount of I/O required to resilver it is very small and may not trigger 
problems.

If this theory is correct, a scrub (zpool scrub fatty) should encounter 
checksum errors on da3 and da6.
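
i.e. something like:

  zpool scrub fatty
  zpool status -v fatty     # after the scrub completes: do da3 and da6 show non-zero CKSUM?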

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]'
Key retrieval: Send an E-Mail to [EMAIL PROTECTED]
E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Tim Cook
 If you ever progress beyond counting on your fingers
 you might (with a lot of coaching from someone who
 actually cares about your intellectual development)
 be able to follow Anton's recent explanation of this
 (given that the higher-level overviews which I've
 provided apparently flew completely over your head).

Seriously, are you 14?  NOTHING anton listed takes the place of ZFS, and your 
pie in the sky theories do not a product make.  So yet again, your long winded 
insult ridden response can be translated to "My name is billtodd, I haven't a 
fucking clue, I'm wrong, so I'll defer to my usual defensive tactics of 
attempting to insult those who know more, and have more experience in the REAL 
WORLD than I."  

You do a GREAT job of spewing theory, you do a piss poor job of relating 
ANYTHING to the real world.

 I discussed that in detail elsewhere here yesterday
 (in more detail than previously in an effort to help
 the slower members of the class keep up).

No, no you didn't.  You listed off a couple of bullshit products that don't come 
anywhere near the features of ZFS.  Let's throw out a bunch of half-completed 
projects that require hours of research just to setup, much less integrate, and 
call it done.

MDADM, next up you'll tell us we really never needed to move beyond FAT, 
because hey, that really was *good enough* too!

But of course, your usual "well I haven't really used the product, but I have 
read up on it" excuse must be a get-out-of-jail-free card, AMIRITE?!
 
  and ease of
  use
 
 That actually may be a legitimate (though hardly
 decisive) ZFS advantage:  it's too bad its developers
 didn't extend it farther (e.g., by eliminating the
 vestiges of LVM redundancy management and supporting
 seamless expansion to multi-node server systems).
 
 - bill

Right, they definitely shouldn't have released zfs because every feature they 
ever plan on implementing wasn't there yet.  

Tell you what, why don't you try using some of these products you've read 
about, then you can come back and attempt to continue this discussion.  I 
don't care what your *THEORIES* are, I care about how things work here in the 
real world.

Better yet, you get back to writing that file system that's going to fix all 
these horrible deficiencies in zfs.  Then you can show the world just how 
superior you are. *RIGHT*
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Seperate ZIL

2007-12-07 Thread Vincent Fox
 On Wed, 5 Dec 2007, Brian Hechinger wrote:

 [1] Finally, someone built a flash SSD that rocks
 (and they know how 
 fast it is judging by the pricetag):
 http://www.tomshardware.com/2007/11/21/mtron_ssd_32_gb
 /
 http://www.anandtech.com/storage/showdoc.aspx?i=3167

Great now if only Sun would rebadge these and certify them for use in their 
products, I'd have some PO's waiting.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on Freebsd 7.0

2007-12-07 Thread Jason Morton
I am using ZFS on FreeBSD 7.0_beta3. This is the first time I have
used ZFS, and I have run into something that I am not sure is
normal, but am very concerned about.


SYSTEM INFO:
hp 320s (storage array)
12 disks (750GB each)
2GB RAM
1GB flash drive (running the OS)

When I take a disk offline and replace it with my spare, after the
spare rebuild it shows there are numerous errors. See below:

scrub: resilver completed with 946 errors on Thu Dec  6 15:15:32 2007
config:

NAME          STATE     READ WRITE CKSUM
fatty         DEGRADED     0     0 3.71K
  raidz2      DEGRADED     0     0 3.71K
    da0       ONLINE       0     0     0
    da1       ONLINE       0     0     0
    da2       ONLINE       0     0     0
    da3       ONLINE       0     0   300
    da4       ONLINE       0     0     0
    da5       ONLINE       0     0     0
    da6       ONLINE       0     0   253
    da7       ONLINE       0     0     0
    da8       ONLINE       0     0     0
    spare     DEGRADED     0     0     0
      da9     OFFLINE      0     0     0
      da11    ONLINE       0     0     0
    da10      ONLINE       0     0     0
spares
  da11        INUSE     currently in use

errors: 801 data errors, use '-v' for a list


After I detach the spare da11 and bring da9 back online all the errors  
go away.
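
Presumably that step was something like the following - shown only so the
sequence is clear:

  zpool detach fatty da11     # return the spare to the AVAIL state
  zpool online fatty da9      # bring the original disk back into the raidz2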


pool: fatty
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
 see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver completed with 0 errors on Thu Dec  6 15:57:23 2007
config:

NAME          STATE     READ WRITE CKSUM
fatty         ONLINE       0     0 3.71K
  raidz2      ONLINE       0     0 3.71K
    da0       ONLINE       0     0     0
    da1       ONLINE       0     0     0
    da2       ONLINE       0     0     0
    da3       ONLINE       0     0   300
    da4       ONLINE       0     0     0
    da5       ONLINE       0     0     0
    da6       ONLINE       0     0   253
    da7       ONLINE       0     0     0
    da8       ONLINE       0     0     0
    da9       ONLINE       0     0     0
    da10      ONLINE       0     0     0
spares
  da11        AVAIL

errors: No known data errors


Thanks
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs rollback without unmounting a file system

2007-12-07 Thread can you guess?
Allowing a filesystem to be rolled back without unmounting it sounds unwise, 
given the potentially confusing effect on any application with a file currently 
open there.

And if a user can't roll back their home directory filesystem, is that so bad?  
Presumably they can still access snapshot versions of individual files or even 
entire directory sub-trees and copy them to their current state if they want to 
- or whistle up someone else to perform a rollback of their home directory if 
they really need to.

I'm not normally one to advocate protecting users from themselves, but I do 
think that applications have some rights to believe that there are some 
guarantees about stability as long as they have a file accessed (and that the 
system should terminate that access if it can't sustain those guarantees).

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread Mike Dotson

On Fri, 2007-12-07 at 08:24 -0800, jonathan soons wrote:
 SunOS 5.10  Last change: 25 Apr 2006 
 
 Yes, I see that my other server is more up to date.
 
 SunOS 5.10  Last change: 13 Feb 2007   
 This one was recently installed.

What OS rev?  (more /etc/release)  

I don't have any systems later than update 3 patched to January 2007 and
have the correct man page.

Looks like perhaps bug 6419899, which was fixed in patch 119246-16;
119246-21 was released on 11-DEC-2006 and included in Solaris 10 11/06
(update 3).  Latest is rev 27 of patch 119246.

 
 Is there a patch that was not included with 10_Recommended?
  
 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mail system errors (On Topic).

2007-12-07 Thread Wade . Stuart
Please see below for an example.

-Wade

[EMAIL PROTECTED] wrote on 12/07/2007 03:07:29 PM:

 
  I keep getting ETOOMUCHTROLL errors thrown while reading this list,  is
  there a list admin that can clean up the mess?   I would hope that
  repeated personal attacks could be considered grounds for removal/blocking.

 Actually, most of your more unpleasant associates here seem to
 suffer primarily from blind and misguided loyalty and/or an excess
 of testosterone - so there's always hope that they'll grow up over
 time and become productive contributors.  And if I'm not complaining
 about their attacks but just dealing with them in kind while
 carrying on more substantive conversations, it's not clear that they
 should pose a serious problem for others.

 - bill


 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OT: NTFS Single Instance Storage (Re: Yager on ZFS

2007-12-07 Thread Wade . Stuart
Thanks Darren.

I found another link that goes into the 2003 implementation:

http://blogs.technet.com/filecab/archive/tags/Single+Instance+Store+_2800_SIS_2900_/default.aspx

It looks pretty nice,  although I am not sure about the userland dedup
service design -- I would like to see it implemented closer to the fs and
dealing with blocks instead of files.

[EMAIL PROTECTED] wrote on 12/07/2007 01:23:22 PM:

 [EMAIL PROTECTED] wrote:
  Darren,
 
Do you happen to have any links for this?  I have not seen
anything
  about NTFS and CAS/dedupe besides some of the third party apps/services
  that just use NTFS as their backing store.

 Single Instance Storage is what Microsoft uses to refer to this:

 http://research.microsoft.com/sn/Farsite/WSS2000.pdf


 --
 Darren J Moffat
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] OT: NTFS Single Instance Storage (Re: Yager on ZFS

2007-12-07 Thread Darren J Moffat
[EMAIL PROTECTED] wrote:
 Darren,
 
   Do you happen to have any links for this?  I have not seen anything
 about NTFS and CAS/dedupe besides some of the third party apps/services
 that just use NTFS as their backing store.

Single Instance Storage is what Microsoft uses to refer to this:

http://research.microsoft.com/sn/Farsite/WSS2000.pdf


-- 
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread can you guess?
 So name these mystery alternatives that come anywhere
 close to the protection,

If you ever progress beyond counting on your fingers you might (with a lot of 
coaching from someone who actually cares about your intellectual development) 
be able to follow Anton's recent explanation of this (given that the 
higher-level overviews which I've provided apparently flew completely over your 
head).

 functionality,

I discussed that in detail elsewhere here yesterday (in more detail than 
previously in an effort to help the slower members of the class keep up).

 and ease of
 use

That actually may be a legitimate (though hardly decisive) ZFS advantage:  it's 
too bad its developers didn't extend it farther (e.g., by eliminating the 
vestiges of LVM redundancy management and supporting seamless expansion to 
multi-node server systems).

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Memory Sticks

2007-12-07 Thread Robert Milkowski
Hello Paul,

Wednesday, December 5, 2007, 10:34:47 PM, you wrote:

PG Constantin Gonzalez wrote:
 Hi Paul,

 yes, ZFS is platform agnostic and I know it works in SANs.

 For the USB stick case, you may have run into labeling issues. Maybe
 Solaris SPARC did not recognize the x64 type label on the disk (which
 is strange, because it should...).

 Did you try making sure that ZFS creates an EFI label on the disk?
 You can check this by running zpool status and then the devices should
 look like c6t0d0 without the s0 part.

 If you want to force this, you can create an EFI label on the USB disk
 from hand by saying fdisk -E /dev/rdsk/cxtxdx.

 Hope this helps,
Constantin


PG OK, tried some things you said.

PG This is the Volume formatted on the PC (W2100z), the Volume is named 
PG Radical-Vol

PG # /usr/sbin/zpool import -f Radical-Vol
PG cannot import 'Radical-Vol': one or more devices is currently unavailable

PG # /usr/sbin/zpool import
PG   pool: Radical-Vol
PG id: 3051993120652382125
PG  state: FAULTED
PG status: One or more devices contains corrupted data.
PG action: The pool cannot be imported due to damaged devices or data.
PGsee: http://www.sun.com/msg/ZFS-8000-5E
PG config:

PG Radical-Vol  UNAVAIL   insufficient replicas
PG   c7t0d0s0  UNAVAIL   corrupted data



PG Here's the device:

PG $ rmformat
PG Looking for devices...
PG  1. Logical Node: /dev/rdsk/c1t2d0s2
PG Physical Node: /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
PG Connected Device: SONY DVD RW DRU-720A  JY02
PG Device Type: Unknown
PG  2. Logical Node: /dev/rdsk/c7t0d0s2
PG Physical Node:
PG /[EMAIL PROTECTED],70/[EMAIL PROTECTED],2/[EMAIL PROTECTED]/[EMAIL 
PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
PG Connected Device: USB 2.0  Flash Disk   1.00
PG Device Type: Removable

PG Following your command:

PG $ /opt/sfw/bin/sudo /usr/sbin/zpool status
PG   pool: Rad_Disk_1
PG  state: ONLINE
PG status: The pool is formatted using an older on-disk format.  The pool can
PG still be used, but some features are unavailable.
PG action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
PG pool will no longer be accessible on older software versions.
PG  scrub: none requested
PG config:

PG NAME          STATE     READ WRITE CKSUM
PG Rad_Disk_1    ONLINE       0     0     0
PG   c0t1d0      ONLINE       0     0     0

PG errors: No known data errors




PG It obviously doesn't show, not mounted.



PG And last the fdisk command:


PG # fdisk -E /dev/rdsk/c7t0d0
PG fdisk: Cannot stat device /dev/rdsk/c7t0d0




Looks like you have an SMI label, not EFI.
Can you re-create the pool again with the command:

zpool create Radical-Vol c7t0d0

then put some data on it, do zpool export Radical-Vol and try to import it
on another box. Should work.
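
i.e. on the box where the stick currently shows up as c7t0d0:

  zpool create Radical-Vol c7t0d0      # whole disk, so ZFS should write an EFI label
  cp -r /some/test/data /Radical-Vol   # the test data path is just a placeholder
  zpool export Radical-Vol

and then on the other box simply:

  zpool import Radical-Vol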


-- 
Best regards,
 Robert Milkowski   mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on Freebsd 7.0

2007-12-07 Thread Karl Pielorz


--On 07 December 2007 11:18 -0600 Jason Morton 
[EMAIL PROTECTED] wrote:

 I am using ZFS on FreeBSD 7.0_beta3. This is the first time i have used
 ZFS and I have run into something that I am not sure if this is normal,
 but am very concerned about.

 SYSTEM INFO:
 hp 320s (storage array)
 12 disks (750GB each)
 2GB RAM
 1GB flash drive (running the OS)

Hi There,

I've been running ZFS under FreeBSD 7.0 for a few months now, and we also 
have a lot of HP / Proliant Kit - and, touch wood, so far - we've not seen 
any issues.

The first thing I'd suggest is make sure you have the absolutely *latest* 
firmware on the BIOS, and RAID controller (P400 I think the 320S is) from 
HP's site. We've had a number of problems with drives 'disappearing', 
arrays locking, and errors with previous firmware in the past - which were 
all (finally) resolved by updated firmware. Even our latest delivered batch 
of 360's and 380's didn't have anything like 'current' firmware on.

 When I take a disk offline and replace it with my spare, after the spare
 rebuild it shows there are numerous errors. see below:
 scrub: resilver completed with 946 errors on Thu Dec  6 15:15:32 2007

Being as they're checksum errors - they probably won't be logged on the 
console (as ZFS detected them, and not necessarily the underlying CAM layers) - 
but worth checking in case something isn't happy.

With that in mind - you might also want to check if there's anything in 
common with da3 and da6 - either in the physical drives, or where they are 
on the DSL320's drive bay/box allocations, as shown by the RAID controller 
config (F8 at boot time when the RAID is init'ing).

-Kp
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs export/import problem in cluster env.

2007-12-07 Thread Robert Milkowski
Hello Marcin,

Saturday, December 1, 2007, 9:57:11 AM, you wrote:

MW I did some tests lately with zfs; the env is:
MW a 2 node veritas cluster 5.0 on solaris 8/07 with recommended
MW patches, 2 machines (v440 & v480), shared storage through a switch on a 6120 
array.
MW 2 luns from the array, on every zfs pool. The problem is, after
MW installing the oracle db on one of the luns, zpool import / export did
MW not work. The bad news for us is that when I run the oracle db on one
MW node, all is ok; then when I export the zpool from node one and import it on
MW node 2, the import does not work, and the solaris system panics and reboots.

MW thank you for some help

Maybe VCS set a SCSI reservation on those disks?
Does VCS support ZFS yet?



-- 
Best regards,
 Robert Milkowskimailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-07 Thread Wade . Stuart
Darren,

  Do you happen to have any links for this?  I have not seen anything
about NTFS and CAS/dedupe besides some of the third party apps/services
that just use NTFS as their backing store.

 Thanks!
Wade Stuart
Fallon Worldwide
P: 612.758.2660
C: 612.877.0385

[EMAIL PROTECTED] wrote on 12/07/2007 12:44:31 PM:

 I believe the data dedup is also a feature of NTFS.

 --
 Darren J Moffat
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss