[zfs-discuss] Re: zfs boot error recovery

2007-05-31 Thread Jakob Praher

hi Will,

thanks for your answer.
Will Murnane wrote:

On 5/31/07, Jakob Praher <[EMAIL PROTECTED]> wrote:

c) An si3224-related question: is it possible to simply hot swap a disk?
(I have the disks in special hot-swappable units, but have no experience
with hot swapping under Solaris, so I would like some feedback first.)

As it happens, I have just tried this, albeit on a different card, and
it went well.  I have a Marvell 88SX6081 controller, and removing a
disk caused no undue panic (as far as I can tell).  When I added a new
disk, the kernel detected it immediately and then I had to run
"cfgadm -c configure scsi0/1" or something like that.  Then it Just
Worked.  I don't know if this is recommended or not... but it worked
for me.
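
For reference, the cfgadm sequence on a SATA-framework controller looks
roughly like this; the attachment point names are only examples, and
whether the si3224 shows up under cfgadm at all depends on its driver:

cfgadm -al                       # list attachment points, find the bay
cfgadm -c unconfigure sata0/1    # before pulling the old disk
cfgadm -c configure sata0/1      # after inserting the new one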

What is the best way to simulate a disk error under ZFS?
Before I add real data to the system, I want to make sure it works.

My naive approach:

1) remove the disk from any pool membership (is this needed?)

zpool detach xxx <disk>
zpool detach yyy <disk>

2) the disk should now be free to be removed
3) pull the plug
4) see what happens

5) plug the disk back in
6) restore the zpool membership again

(1) and (6) should not really be needed, or do I see that incorrectly?
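
One way to rehearse this without touching the real pools is a throwaway
pool built on files; a minimal sketch (file and pool names invented):

mkfile 128m /var/tmp/d1 /var/tmp/d2
zpool create testpool mirror /var/tmp/d1 /var/tmp/d2
zpool offline testpool /var/tmp/d1    # pool degrades but stays usable
zpool status testpool
zpool online testpool /var/tmp/d1     # device comes back and resilvers
zpool scrub testpool
zpool destroy testpool

On the real disks, zpool offline / zpool online on the affected device
exercises the same failure handling without changing pool membership,
so steps (1) and (6) should indeed not be necessary.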

-- Jakob




Will


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: zfs boot error recovery

2007-05-31 Thread Jakob Praher

Jakob Praher wrote:

hi all,

I would like to ask some questions regarding best practices for ZFS
recovery when disk errors occur.

Currently I have ZFS boot (nv62) and the following setup:

2 si3224 controllers (4 SATA disks each)
8 SATA disks, same size, same type

I have two pools:
a) rootpool
b) datapool

The rootpool is a mirrored pool: every disk has a slice (s0, which is
about 5% of the whole disk) devoted to the rootpool, just for mirroring.

The rest of each disk (s1) is added to the datapool, which is raidz.

My idea is that if any disk fails, I am still able to boot.

Now I have some questions:

a) If I want to be able to boot from every disk in case of an error, I
have to set up GRUB on every disk, so that whichever disk the controller
selects as the boot device, the rootpool can be loaded from it.

b) What is the best way to replace a disk as quickly as possible?
Adding a disk as a hot spare for the raidz is a good idea, but I would
also like to be able to replace a disk at runtime as simply as possible.

The problem is that for the root pool the disks carry a specific label
(the slice layout). So I cannot simply detach the volumes, replace the
disk and attach them again; I first have to format the disk so that the
slices exist. Is there some clever way to automatically re-label a
replacement disk?



I found out that copying the label information from another disk
should work:


prtvtoc /dev/rdsk/<good-disk>s2 | fmthard -s - /dev/rdsk/<new-disk>s2

For instance, I could simply store the labels of all disks on the root
pool, which should be available as long as any of the 8 disks is still
available. So in case of a repair I simply have to run fmthard -s with
the saved label before attaching the replacement disk.
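
A rough sketch of that idea (device names and the label directory are
placeholders):

# save the VTOC of every disk to a directory that lives on the root pool
for d in c0t0d0 c0t1d0 c1t0d0 c1t1d0; do
    prtvtoc /dev/rdsk/${d}s2 > /rootpool/labels/${d}.vtoc
done

# later, stamp a saved label onto the replacement disk
fmthard -s /rootpool/labels/c0t0d0.vtoc /dev/rdsk/c2t0d0s2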



c) An si3224-related question: is it possible to simply hot swap a disk?
(I have the disks in special hot-swappable units, but have no experience
with hot swapping under Solaris, so I would like some feedback first.)

d) Do you have best practices for systems like the one above? What are
the best resources on the web for learning about monitoring the health
of a ZFS system (like email notifications in case of disk failures)?

Thanks in advance
-- Jakob


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs boot error recovery

2007-05-31 Thread Jakob Praher
hi all,

I would like to ask some questions regarding best practices for ZFS
recovery when disk errors occur.

Currently I have ZFS boot (nv62) and the following setup:

2 si3224 controllers (4 SATA disks each)
8 SATA disks, same size, same type

I have two pools:
a) rootpool
b) datapool

The rootpool is a mirrored pool: every disk has a slice (s0, which is
about 5% of the whole disk) devoted to the rootpool, just for mirroring.

The rest of each disk (s1) is added to the datapool, which is raidz.
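
(For reference, a layout like this could be created roughly as follows;
the device names are invented and only two mirror sides are shown:)

zpool create rootpool mirror c0t0d0s0 c0t1d0s0
zpool create datapool raidz c0t0d0s1 c0t1d0s1 c0t2d0s1 c0t3d0s1 \
    c1t0d0s1 c1t1d0s1 c1t2d0s1 c1t3d0s1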

My idea is that if any disk fails, I am still able to boot.

Now I have some questions:

a) If I want to be able to boot from every disk in case of an error, I
have to set up GRUB on every disk, so that whichever disk the controller
selects as the boot device, the rootpool can be loaded from it.
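
(A sketch of what that could look like, assuming the boot slice is s0 on
each disk; device names are placeholders:)

for d in c0t0d0 c0t1d0 c0t2d0 c0t3d0; do
    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/${d}s0
done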

b) What is the best way to replace a disk as quickly as possible?
Adding a disk as a hot spare for the raidz is a good idea, but I would
also like to be able to replace a disk at runtime as simply as possible.

The problem is that for the root pool the disks carry a specific label
(the slice layout). So I cannot simply detach the volumes, replace the
disk and attach them again; I first have to format the disk so that the
slices exist. Is there some clever way to automatically re-label a
replacement disk?
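
(Once a replacement disk has been labeled, e.g. with fmthard as in the
follow-up, the re-insert could look roughly like this; device names are
placeholders and the hot-spare line is optional:)

zpool replace rootpool c0t1d0s0      # failed slice, new disk in the same slot
zpool replace datapool c0t1d0s1
zpool status -v                      # watch the resilver

zpool add datapool spare c2t0d0s1    # keep a spare around for next time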

c) An si3224-related question: is it possible to simply hot swap a disk?
(I have the disks in special hot-swappable units, but have no experience
with hot swapping under Solaris, so I would like some feedback first.)

d) Do you have best practices for systems like the one above? What are
the best resources on the web for learning about monitoring the health
of a ZFS system (like email notifications in case of disk failures)?
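
(A very small monitoring hack, with a made-up script path; zpool status -x
prints "all pools are healthy" when nothing is wrong, so anything else
gets mailed:)

# crontab entry, every 10 minutes
0,10,20,30,40,50 * * * * /usr/local/bin/zpool-health.sh

# /usr/local/bin/zpool-health.sh
#!/bin/sh
STATUS=`zpool status -x`
if [ "$STATUS" != "all pools are healthy" ]; then
    echo "$STATUS" | mailx -s "zpool problem on `hostname`" root
fi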

Thanks in advance
-- Jakob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: opensolaris, zfs rootfs raidz

2007-04-05 Thread Jakob Praher
Erik Trimble wrote:
> On Thu, 2007-04-05 at 22:59 +0200, Jakob Praher wrote:
>> Hi Cyril,
>>  
>> So to get this right:
>> Nevada == Solaris Express?!
>>
> Yes, it's a bit confusing.  Think of "Nevada" as a distro name (in Linux
> terms), which uses the OpenSolaris source base.  There are (generally)
> weekly builds, which is what you will see referred to as "B61".  Solaris
> Express is the marketing name for periodic releases of specific builds
> of Nevada (so, every couple of months, a build of Nevada is released as
> "Solaris Express" - it's for people who want the latest technology,,
> with _some_ support options, while not living on the absolute bleeding
> edge like us folks).
> 
The thing is, I am creating a network-centered storage server, and for
that I'd like to have a somewhat stable OS. I would like to use ZFS and
then snapshot to another node (quite frequently), which should give me
some DRBD-like behavior.

Ideally I wanted to have just one giant raidz ZFS pool that can be booted
from and not bother with the rest. I suspected it would be rather hard to
get raidz supported as a root pool, but I gave it a try.

Maybe I should just forget the ZFS root stuff if I nonetheless have to
use two pools in order to have the rest use raidz, which is what I need
for robustness.

Maybe I will just take a hardware RAID approach for the root partition
(just to have failover support) and not pursue the one-giant-root
approach with ZFS.

One ZFS-related question: if I use a UFS partition to boot into the ZFS
partition (the "old rootfs" approach), should raidz then be technically
possible, since in that case GRUB uses UFS to load the platform kernel
and the initial ramdisk?

So maybe it should work to have a very small UFS partition, mirrored
manually on several disks, and then to boot into a raidz ZFS pool. My ZFS
partition FAULTED when I tried to boot via UFS on Solaris 10. Is the root
fs support mentioned in
http://blogs.sun.com/tabriz/#are_you_ready_to_rumble available in
Solaris 10?

Thanks, and sorry for so much noise on this filesystem-related list.

-- Jakob


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: opensolaris, zfs rootfs raidz

2007-04-05 Thread Jakob Praher
Hi Cyril,

thanks for your quick response!
Cyril Plisko wrote:
> On 4/5/07, Jakob Praher <[EMAIL PROTECTED]> wrote:
>> hi all,
> 
>>
>> I am new to solaris.
>> I am creating a zfs filestore which should boot via rootfs.
>> The version of the system is: SunOS store1 5.10 Generic_118855-33 i86pc
>> i386 i86pc.
>>
>> Now I have seen that there is a new rootfs support for solaris starting
>> with build: snv_62.
>> (http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/)
>>
>> Is it
>> a) possible to start from a raidz pool?
> 
> No. At this point raidz pool is not usable as a boot pool.
> 

Is it then possible to use a mirrored pool?

>> b) possible to update my version using a patch from the web to the above
>> version?
> 
> Generally speaking there is no patch to get your system from any 5.x
> release
> to 5.x+1 release. You may, however, upgrade your current system from
> SunOS 5.10 to Nevada using regular LiveUpgrade or DeadUpgrade(TM)
> procedure.
> 
> (I would just install from scratch - assuming you have all your valuable
> data stored externally or on exportable zpool)
> 
So, to get this right:
Nevada == Solaris Express?!

This is a little bit confusing.
I am very glad Ian Murdock joined Sun; hopefully system upgrades will
become as easy as apt-get dist-upgrade.
Is there any easy way to just get the latest Solaris kernel via the web?
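
(The Live Upgrade path Cyril mentions would look roughly like this; the
boot environment name, target slice and media path are made up:)

lucreate -n nevada -m /:/dev/dsk/c0t1d0s0:ufs   # copy the running system
luupgrade -u -n nevada -s /cdrom/cdrom0         # upgrade the copy from media
luactivate nevada                               # boot into it next time
init 6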

thanks
Jakob



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] opensolaris, zfs rootfs raidz

2007-04-05 Thread Jakob Praher

hi all,

I am new to Solaris.
I am creating a ZFS file store which should boot from a ZFS root
filesystem. The version of the system is: SunOS store1 5.10
Generic_118855-33 i86pc i386 i86pc.


Now I have seen that there is new ZFS root filesystem support for Solaris
starting with build snv_62
(http://www.opensolaris.org/os/community/zfs/boot/zfsboot-manual/).


Is it
a) possible to boot from a raidz pool?
b) possible to update my version to the above build using a patch from
the web?


thanks in advance
-- Jakob



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: drbd using zfs send/receive?

2006-09-21 Thread Jakob Praher

Frank Cusack wrote:

On September 18, 2006 5:45:08 PM +0200 Jakob Praher <[EMAIL PROTECTED]> wrote:



huh.  How do you create a SAN with NFS?
Sorry, okay, it would be network-attached storage, not the other way
around. I guess you are right.


But if we are discussing NFS for distributed storage: what performance
data do you have for NFSv4 as a storage node? How well does the current
Solaris NFSv4 stack interoperate with the Linux stack?

Would you go for that?

What about iSCSI on top of ZFS, is that an option? I did some research on
iSCSI vs. NFSv4 once and found that the overhead of transporting the fs
metadata (in the NFSv4 case) is not the real problem in many scenarios.
Especially the COMPOUND messages should help here.




I have been using DRBD on Linux before and now am asking whether some of
you have experience with on-demand network filesystem mirrors.



AFAIK, Solaris does not export file change notification to userland in
any way that would be useful for on-demand filesystem replication.  From
looking at drbd for 5 minutes, it looks like the kind of notification
that windows/linux/macos provides isn't what drbd uses; it does BLOCK
LEVEL replication, and part of the software is a kernel module to export
that data to userspace.  It sounds like that distinction doesn't matter
for what you are trying to achieve, and I believe that this block-by-block
duplication isn't a great idea for zfs anyway.  It might be neat if zfs
could inform userland of each new txg.

Yes, exactly. It is a block device driver that replicates, so it sits
right underneath Linux's VFS.
Okay, that is something I wanted to know. Are there any good heartbeat
control apps for Solaris out there? I mean, if I want to have failover
(even if it is a little bit cheap) it should detect failures and react
accordingly. Switching from sender to receiver should not be difficult,
given that all you need is to take ZFS snapshots (and that is really
cheap in ZFS).
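
(What the switch could look like with plain ZFS commands; pool, snapshot
and host names are invented:)

# on the standby, after the master dies: make sure the data is at the
# last received snapshot, then point the services at it
zfs rollback tank/data@latest

# when the old master comes back, send the accumulated changes the
# other way; recv -F rolls the old master back onto the common snapshot
zfs snapshot tank/data@failback
zfs send -i @latest tank/data@failback | ssh oldmaster zfs recv -F tank/data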




Is this merely a hack, or can it be used to create some sort of failover?

E.g. DRBD has a master/slave option which can be configured easily.
Something like this would be nice out of the box, so that in case of
failure another node becomes the master, and when the former master
comes back it simply becomes the slave, so that both have the current
data available again.


Any pointers to solutions in that area are greatly appreciated.


See if <http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_now_with>
comes close.

I have 2 setups, one using SC 3.2 with a SAN (both systems can access
the same filesystem; yes, it's not as redundant as a remote node and
remote filesystem, but it's for HA, not DR).  I could add another JBOD
to the SAN and configure zfs to mirror between the two enclosures to
get rid of the SPoF of the JBOD backplane/midplane, but it's not
worth it.


JBOD, SPoF - what are these things?


The other setup is using my own cron script (zfs send | zfs recv) to
send snapshots to a "remote" (just another server in the same rack)
host.  This is for a service that also has very high availability
requirements but where I can't afford shared storage.  I do a homegrown
heartbeat and failover thing.  I'm looking at replacing the cron script
with the SMF service linked above, but I'm in no rush since the cron job
works quite well.
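
(For what it's worth, such a cron script can be quite small; a sketch,
with filesystem, host and state-file names invented, and assuming an
initial full send/recv has already seeded the standby:)

#!/bin/sh
# incremental replication: snapshot, send the delta, remember the marker
FS=tank/data
PREV=`cat /var/run/zfs-repl.last`
NEW=`date +%Y%m%d%H%M`
zfs snapshot $FS@$NEW
zfs send -i $FS@$PREV $FS@$NEW | ssh standby /usr/sbin/zfs recv -F $FS \
    && echo $NEW > /var/run/zfs-repl.last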

If zfs is otherwise a good solution for you, you might want to consider
if you really need true on-demand replication.  Maybe 5-minute or even
1-minute recency is good enough.  I would imagine that you don't actually
get too much better than 30s with drbd anyway, since outside of fsync()
data doesn't actually make it to disk (and then replicated by drbd)
more frequently than that for some generic application.


Okay, I think ZFS is nice. I have been using XFS+LVM2 on my Linux boxes
so far; that works nicely too.


SMF is the init.d replacement in Solaris, right? What would that look
like? What would SMF do other than restart your app if it fails? Would
you rather have a background task running instead of kicking it off
with cron?


Thanks
-- Jakob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] drbd using zfs send/receive?

2006-09-18 Thread Jakob Praher

hi everyone,

I am planning on creating a local SAN via NFS(v4) and several redundant 
nodes.


I have been using DRBD on Linux before and now am asking whether some of
you have experience with on-demand network filesystem mirrors.


I have little Solaris sysadmin know-how yet, but I am interested in
whether there is on-demand support for sending snapshots, i.e. not via
a cron job but via some kind of filesystem change notification mechanism.


Is this merely a hack, or can it be used to create some sort of failover?

E.g. DRBD has a master/slave option which can be configured easily.
Something like this would be nice out of the box, so that in case of
failure another node becomes the master, and when the former master
comes back it simply becomes the slave, so that both have the current
data available again.


Any pointers to solutions in that area are greatly appreciated.

-- Jakob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss