Re: [zfs-discuss] zvol recordsize for backing a zpool over iSCSI

2010-08-02 Thread Bruno Sousa
On 2-8-2010 2:53, Richard Elling wrote:
> On Jul 30, 2010, at 11:35 AM, Andrew Gabriel wrote:
>
>> Just wondering if anyone has experimented with working out the best zvol
>> recordsize for a zvol which is backing a zpool over iSCSI?
>
> This is an interesting question.  Today, most ZFS implementations are done
> directly on devices with an effective, fixed recordsize of 512 bytes.  But that
> isn't very efficient and things like raidz don't quite work like you might expect.
> Next up is 4KB sectors, which is a better starting point, IMHO.
>
> That said, ultimately it is the size of the data to be written that dictates the
> best answer.
Up to now I have configured the recordsize of an iSCSI zvol to match the
block size used on the iSCSI initiator side.
So, for instance, if I'm creating an iSCSI zvol for an NTFS volume, I use
from 4KB up to 64KB depending on the size of the zvol, as described in
http://support.microsoft.com/?scid=kb%3Ben-us%3B140365x=20y=18 .
Likewise, if I'm going to use an EXT3 filesystem, I try to follow the same
rules, as described in http://en.wikipedia.org/wiki/Ext3#Size_limits .

This has been working well for me, but I'm probably not the best example,
since I refuse to provide iSCSI zvols on RAIDn implementations; I
only use ZFS mirrors for that.
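
A minimal sketch of what that looks like on the ZFS side (pool and volume
names are hypothetical; note that on a zvol the tunable is the volblocksize
property, which can only be set at creation time):

  # 4 KB blocks for a zvol backing an NTFS volume formatted with 4 KB clusters
  zfs create -V 100G -o volblocksize=4K tank/ntfs-lun

  # 64 KB blocks for a large NTFS volume formatted with 64 KB clusters
  zfs create -V 2T -o volblocksize=64K tank/ntfs-big-lun

The zvol is then exported over iSCSI as usual (e.g. via COMSTAR); the
block-size choice doesn't change that part.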

Bruno


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] pool scrub clean, filesystem broken

2010-08-02 Thread Brian
I have several filesystems in a raid-z pool.  All seem to be working correctly
except one, which is mounted but yields the following with ls -lh:

?-  ? ??  ? ? media

I just finished scrubbing the pool and there are no errors.  I can't seem to do
anything with the filesystem (cd, chown, etc.).
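
For reference, a hedged sketch of the checks described above (pool and
filesystem names are hypothetical):

  zpool scrub tank                       # completes with no errors
  zpool status -v tank                   # no permanent errors listed
  zfs get mounted,mountpoint tank/media  # the dataset still reports as mounted
  ls -lh /tank                           # yet the mountpoint entry shows only question marks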

-brian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] pool scrub clean, filesystem broken

2010-08-02 Thread Preston Connors
On Mon, 2010-08-02 at 08:48 -0700, Brian wrote:
> I have several filesystems in a raid-z pool.  All seem to be working
> correctly except one which is mounted but yields the following with ls -lh:
>
> ?-  ? ??  ? ? media
>
> I just finished scrubbing the pool and there are no errors.  I can't seem to
> do anything with the filesystem (cd, chown, etc)
>
> -brian

Brian,

File listings with question marks in them have happened to me on snv_134
when I had multiple systems mounting the same LUN via COMSTAR iSCSI
targets. The file system was actually formatted as ext2; ext2 is not
meant to be mounted over iSCSI on multiple systems in read-write mode,
and data can easily be corrupted because ext2 is not a clustered file
system. Luckily, unmounting and then mounting the file system again on the
affected hosts solved the problem and the LUN became readable.
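
A hedged sketch of that remount step; on the initiators it is an ordinary
umount/mount of the ext2 filesystem, and the equivalent for a locally
mounted ZFS dataset (dataset name hypothetical) would be:

  zfs unmount tank/media
  zfs mount tank/media

  # or mount every ZFS filesystem that isn't currently mounted
  zfs mount -a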

-- 
Thank you,
Preston Connors
Atlantic.Net

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] pool scrub clean, filesystem broken

2010-08-02 Thread Brian
Thanks Preston.  I am actually using ZFS locally, connected directly to 3 SATA
drives in a raid-z pool.  The filesystem is ZFS, it mounts without complaint,
and the pool is clean.  I am at a loss as to what is happening.

-brian
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using multiple logs on single SSD devices

2010-08-02 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jonathan Loran
>
> But here's what's keeping me up at night:  We're running zpool v15,
> which as I understand it means if an X25e log fails, the pool is toast.
> Obviously, the log devices are not mirrored.  My bad :(  I guess this
> begs the first question, which is:
>
> - if the machine is running, and the log device fails, AND the failure
> is detected as such, will the ZIL roll back into the main pool drives?
> If so, are we saved?

Because you're at pool v15, it doesn't matter whether the log device fails
while you're running, while you're offline and trying to come online, or
whatever: if an unmirrored log device fails and the pool version is less
than 19, the pool is lost.  There are supposedly techniques to recover, so
it's not necessarily an unrecoverable-data situation, but you certainly
couldn't recover without a server crash, or at least a shutdown.  And it
would certainly be a nightmare, at best.  The system will not fall back to
ZIL in the main pool; that capability was added in pool version 19.
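
If the hosts can run a build that supports pool version 19 or later, a hedged
sketch of checking and moving off v15 (pool name hypothetical):

  zpool upgrade -v        # lists the versions this system supports; v19 is "Log device removal"
  zpool get version tank  # shows the pool's current on-disk version (15 in this case)
  zpool upgrade tank      # upgrades to the newest supported version (a one-way operation)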


> - Second question, how about this: partition the two X25E drives into
> two, and then mirror each half of each drive as log devices for each
> pool.  Am I missing something with this scheme?  On boot, will the GUID
> for each pool get found by the system from the partitioned log drives?

I'm afraid it's too late for that, unless you're willing to destroy & recreate
your pool.  You cannot remove the existing log device.  You cannot shrink it.
You cannot replace it with a smaller one.  The only things you can do right
now are:

(a) Start mirroring that log device with another device of the same size or
larger (see the command sketch after this list).
or
(b) Buy another SSD which is larger than the first.  Create a slice on the
2nd which is equal to the size of the first.  Mirror the first onto the
slice of the 2nd.  After the resilver, detach the first drive and replace it
with another one of the larger drives.  Slice the 3rd drive just like the
2nd, and mirror the 2nd drive's slice onto it.  Now you've got a mirrored &
sliced log device without any downtime, but you had to buy two drives, each
twice as large, in order to do it.
or
(c) Destroy & recreate your whole pool, but learn from your mistake.  This
time, slice each SSD, and mirror the slices to form the log device.
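
A hedged sketch of options (a) and (c); the device names are hypothetical,
with c4t0d0 standing in for the existing X25-E log device:

  # (a) attach a second device to the existing log device, turning it into a mirror
  zpool attach tank c4t0d0 c4t1d0

  # (c) after destroying and recreating the pool, add the log as a mirror of SSD slices
  zpool add tank log mirror c4t0d0s0 c4t1d0s0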

BTW, how do I know this in such detail?  Because I made the same mistake
last year.  There was one interesting possibility we considered, but didn't
actually implement:

We are running a stripe of mirrors.  We considered breaking the mirrors and
creating a new pool out of the other halves, with the SSD properly sliced,
then using zfs send to replicate all the snapshots over to the new pool, up
to a very recent point in time.

Then we'd only need a very short service window instead of a long one:
shut down briefly, send that one final snapshot to the new pool, destroy the
old pool, rename the new pool to take the old name, and bring the system
back up again.  As soon as the system is up again, start mirroring and
resilvering (er ... initial silvering), and of course slice the SSD before
attaching the mirror.
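
A hedged sketch of the replication part of that plan (pool and snapshot
names are hypothetical):

  # bulk copy while the system is still live
  zfs snapshot -r tank@migrate-1
  zfs send -R tank@migrate-1 | zfs receive -F newtank

  # during the short service window: one final incremental, then swap the pools
  zfs snapshot -r tank@migrate-2
  zfs send -R -I tank@migrate-1 tank@migrate-2 | zfs receive -F newtank
  zpool destroy tank
  zpool export newtank
  zpool import newtank tank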

Naturally there is some risk, running un-mirrored long enough to send the
snaps... and so forth.

Anyway, just an option to consider.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss