Re: [zfs-discuss] Adaptec AAC driver
Thanks, it was what I had to do. Bruno

On 29-3-2010 19:12, Cyril Plisko wrote:
On Mon, Mar 29, 2010 at 4:57 PM, Bruno Sousa bso...@epinfante.com wrote:

pkg uninstall aac
Creating Plan
pkg: Cannot remove 'pkg://opensolaris.org/driver/storage/a...@0.5.11,5.11-0.134:20100302T021758Z' due to the following packages that depend on it: pkg://opensolaris.org/storage/storage-ser...@0.1,5.11-0.134:20100302T050950Z

pkg uninstall aac storage-server
pkg: Requested uninstall operation would affect files that cannot be modified in live image. Please retry this operation on an alternate boot environment.

So... how can I remove the driver from my environment?

beadm create newbe
beadm mount newbe /newbe
pkg -R /newbe uninstall aac
beadm umount newbe
beadm activate newbe
reboot

A bit long, but it ultimately gives you a fully recoverable setup in case something goes south.

Thanks in advance, Bruno

On 29-3-2010 15:50, Cyril Plisko wrote:
On Mon, Mar 29, 2010 at 4:25 PM, Bruno Sousa bso...@epinfante.com wrote:

Hello all, currently I'm evaluating a system with an Adaptec 52445 RAID HBA, and the driver supplied by OpenSolaris doesn't support JBOD drives. I'm running snv_134, but when I try to uninstall the SUNWaac driver I get the following error:

pkgrm SUNWaac
The following package is currently installed: SUNWaac Adaptec AdvanceRaid Controller SCSI HBA Driver (i386) 11.11,REV=2010.02.17.03.06
Do you want to remove this package? [y,n,?,q] y
pkgrm: ERROR: unable to change current working directory to /var/sadm/pkg/SUNWac/install
Removal of SUNWaac failed (internal error). No changes were made to the system.

Does anyone know how I can replace the OpenSolaris aac driver with the Adaptec aac driver?

On OpenSolaris you shouldn't use the SVR4 pkg* commands - it uses pkg(5) (aka IPS). Try 'pkg uninstall aac' or use packagemanager to remove the aac driver package.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
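For what it's worth, here is the whole alternate-BE sequence in one place (the BE name "newbe" is arbitrary, and the pkg list line is only an optional sanity check):

# beadm create newbe               # clone the current boot environment
# beadm mount newbe /newbe         # mount the clone on an alternate root
# pkg -R /newbe uninstall aac      # remove the driver from the clone instead of the live image
# pkg -R /newbe list aac           # optional: should now report no installed aac package
# beadm umount newbe
# beadm activate newbe             # boot from the clone by default
# reboot

If the new BE misbehaves, the old one is still selectable from the GRUB menu.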
[zfs-discuss] broken zfs root
I'm running Solaris 10 SPARC with rather updated patches (as of ~30 days ago?) on a Netra X1. I had set up ZFS root with two IDE 40 GB hard disks. All was fine until my secondary master died - no read/write errors, just dead. No matter what I try (booting with the dead drive in place, booting with an identical but working drive in its place, booting with no secondary master, booting in single-user mode), the system panics while trying to come up. What are we supposed to do when a ZFS root loses its disk? This behavior is just bizarre.

Boot device: disk File and args:
SunOS Release 5.10 Version Generic_142900-07 64-bit
Copyright 1983-2010 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
panic[cpu0]/thread=180e000: assertion failed: nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_TXG, txg) == 0, file: ../../common/fs/zfs/spa.c, line: 2218
0180b3f0 genunix:assfail+74 (130a570, 130a5b0, 8aa, 183fc00, 1265c00, 0)
  %l0-3: 0600106e6000 0004 4000 0600106ea000
  %l4-7: 01265c00 01888400
0180b4a0 zfs:spa_check_rootconf+54 (130a400, 130a400, 0, 180b610, 1137c00, 0)
  %l0-3: 0021 000a 0002 0020
  %l4-7: 01137df8 0008 0008 0130a400
0180b560 zfs:spa_get_rootconf+1d0 (2, 8, 180b718, 180b610, 180b620, 0)
  %l0-3: 0130a788 0001 0001 0130a400
  %l4-7: 0130a400
0180b660 zfs:spa_import_rootpool+10 (183dc30, 0, 18db400, 18c1800, 9, 0)
  %l0-3: 002c 01872400 002c 01815000
  %l4-7: 002b 0002 0180e000
0180b720 zfs:zfs_mountroot+6c (189b3a0, 0, 0, 708, 0, 33d6898)
  %l0-3: 018cf800 01877800 011b8c00 018cb000
  %l4-7: 01877800 011b8c00 018bd800 01877800
0180b7e0 swapgeneric:rootconf+1b0 (0, 183dc00, 1872a20, 0, 183dc30, 1872f68)
  %l0-3: 01872800 03005a60 0304b208
  %l4-7: 018c1800 018ca800
0180b890 unix:stubs_common_code+70 (324d000, 0, 4, 0, 324d000, 2fff)
  %l0-3: 0180b149 0180b211 03338dd0
  %l4-7: 01818a70 0001
0180b950 genunix:vfs_mountroot+60 (800, 200, 0, 1872800, 189b400, 18cac00)
  %l0-3: 010bf800 010bf920 01878360 011f2400
  %l4-7: 011f2400 018cd000 0600 0200
0180ba10 genunix:main+9c (0, 180c000, 1838130, 1815358, 181b738, 18bd800)
  %l0-3: 0180c000 0180c000 70002000
  %l4-7: 0183d400 0180c000 0001
skipping system dump - no dump device configured
rebooting... Resetting ...
LOM event: +0h1m59s host reset

-- Jeremy Kister http://jeremy.kister.net./ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Thanks for the reply. I didn't get very much further. Yes, ZFS loves raw devices. If I had two devices I wouldn't be in this mess: I would simply install OpenSolaris on the first disk and add the second SSD to the data pool with a "zpool add mpool cache cxtydz". Notice that no slices or partitions were used. But I don't have space for two devices, so I have to deal with slices and partitions. I did another clean install in a 12 GB partition, leaving 18 GB free. I tried parted to resize the partition, but it said that resizing (solaris2) partitions wasn't implemented. I tried fdisk, but no luck either. I tried send and receive: create new partition and slices, restore rpool into slice 0, run installgrub - but it wouldn't boot anymore. Can anybody give a summary of commands/steps on how to accomplish a bootable rpool and L2ARC on one SSD? Preferably for the x86 platform. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
F. Wessels wrote: Thanks for the reply. I didn't get very much further. Yes, ZFS loves raw devices. When I had two devices I wouldn't be in this mess. I would simply install opensolaris on the first disk and add the second ssd to the data pool with a zpool add mpool cache cxtydz Notice that no slices or partitions were used. But I don't have space for two devices. So I have to deal with slices and partitions. I did another clean install in 12Gb partition leaving 18Gb free. I tried parted to resize the partition, but it said that resizing (solaris2) partitions wasn't implemented. I tried fdisk but no luck either. I tried the send and receive, create new partition and slices, restore rpool in slice0, do installgrub but it wouldn't boot anymore. Can anybody give a summary of commands/steps howto accomplish a bootable rpool and l2arc on a ssd. Preferably for the x86 platform. Look up zvols, as this is what you want to use, NOT partitions (for the many reasons you've encountered). In essence, do a normal install, using the ENTIRE disk for your rpool. Then create a zvol in the rpool: # zfs create -V 8GB rpool/zvolname Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname the default block size for zvols is 8k, which I'd be interested in having someone test out other sizes, to see which would be best for an L2ARC device. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
On 30/03/2010 10:05, Erik Trimble wrote: F. Wessels wrote: Thanks for the reply. I didn't get very much further. Yes, ZFS loves raw devices. When I had two devices I wouldn't be in this mess. I would simply install opensolaris on the first disk and add the second ssd to the data pool with a zpool add mpool cache cxtydz Notice that no slices or partitions were used. But I don't have space for two devices. So I have to deal with slices and partitions. I did another clean install in 12Gb partition leaving 18Gb free. I tried parted to resize the partition, but it said that resizing (solaris2) partitions wasn't implemented. I tried fdisk but no luck either. I tried the send and receive, create new partition and slices, restore rpool in slice0, do installgrub but it wouldn't boot anymore. Can anybody give a summary of commands/steps howto accomplish a bootable rpool and l2arc on a ssd. Preferably for the x86 platform. Look up zvols, as this is what you want to use, NOT partitions (for the many reasons you've encountered). In this case partitions is the only way this will work. In essence, do a normal install, using the ENTIRE disk for your rpool. Then create a zvol in the rpool: # zfs create -V 8GB rpool/zvolname Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname That won't work L2ARC devices can not be a ZVOL of another pool, they can't be a file either. An L2ARC device must be a physical device. -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Darren J Moffat wrote: On 30/03/2010 10:05, Erik Trimble wrote: F. Wessels wrote: Thanks for the reply. I didn't get very much further. Yes, ZFS loves raw devices. When I had two devices I wouldn't be in this mess. I would simply install opensolaris on the first disk and add the second ssd to the data pool with a zpool add mpool cache cxtydz Notice that no slices or partitions were used. But I don't have space for two devices. So I have to deal with slices and partitions. I did another clean install in 12Gb partition leaving 18Gb free. I tried parted to resize the partition, but it said that resizing (solaris2) partitions wasn't implemented. I tried fdisk but no luck either. I tried the send and receive, create new partition and slices, restore rpool in slice0, do installgrub but it wouldn't boot anymore. Can anybody give a summary of commands/steps howto accomplish a bootable rpool and l2arc on a ssd. Preferably for the x86 platform. Look up zvols, as this is what you want to use, NOT partitions (for the many reasons you've encountered). In this case partitions is the only way this will work. In essence, do a normal install, using the ENTIRE disk for your rpool. Then create a zvol in the rpool: # zfs create -V 8GB rpool/zvolname Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname That won't work L2ARC devices can not be a ZVOL of another pool, they can't be a file either. An L2ARC device must be a physical device. I could have sworn I did this with a zvol awhile ago. Maybe that was for something else... -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
On 30/03/2010 10:13, Erik Trimble wrote: Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname That won't work L2ARC devices can not be a ZVOL of another pool, they can't be a file either. An L2ARC device must be a physical device. I could have sworn I did this with a zvol awhile ago. Maybe that was for something else... The check for the L2ARC device being a block device has always been there. -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Darren J Moffat wrote: On 30/03/2010 10:13, Erik Trimble wrote: Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname That won't work L2ARC devices can not be a ZVOL of another pool, they can't be a file either. An L2ARC device must be a physical device. I could have sworn I did this with a zvol awhile ago. Maybe that was for something else... The check for the L2ARC device being a block device has always been there. I just tried a couple of things on one of my test machines, and I think I know where I was mis-remembering it from: you can add a file or zvol as a ZIL device. Frankly, I'm a little confused by this. I would think that you would have consistent behavior between ZIL and L2ARC devices - either they both can be a file/zvol, or neither can. Not the current behavior. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
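For anyone trying to reproduce the asymmetry, a minimal sketch of the ZIL side (pool and zvol names invented; whether the equivalent cache add is accepted apparently varies by build, per the rest of this thread):

# zfs create -V 1G rpool/slogtest
# zpool add tank log /dev/zvol/dsk/rpool/slogtest   # a zvol (or even a plain file) is accepted as a log device
# zpool status tank                                 # the zvol shows up under "logs"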
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Thank you Erik for the reply. I misunderstood Dan's suggestion about the zvol in the first place; now you make the same suggestion as well. Doesn't ZFS prefer raw devices? When following this route, the zvol used as cache device for tank makes use of the ARC of rpool, which doesn't seem right - or is there some setting to prevent this? It's because of ZFS's preference for raw devices that I looked for a way to use a slice as cache. Regards, Frederik -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Sun Flash Accelerator F20 numbers
Hi, I did some tests on a Sun Fire X4540 with an external J4500 array (connected via two HBA ports), i.e. there are 96 disks in total, configured as seven 12-disk raidz2 vdevs (plus system, spares, unused disks) providing a ~63 TB pool with fletcher4 checksums. The system was recently equipped with a Sun Flash Accelerator F20 with 4 FMod modules to be used as log devices (ZIL). I was using the latest snv_134 software release. Here are some first performance numbers for the extraction of an uncompressed 50 MB tarball on a Linux (CentOS 5.4 x86_64) NFS client which mounted the test filesystem (no compression or dedup) via NFSv3 (rsize=wsize=32k,sync,tcp,hard).

standard ZIL:        7m40s  (ZFS default)
1x SSD ZIL:          4m07s  (Flash Accelerator F20)
2x SSD ZIL:          2m42s  (Flash Accelerator F20)
2x SSD mirrored ZIL: 3m59s  (Flash Accelerator F20)
3x SSD ZIL:          2m47s  (Flash Accelerator F20)
4x SSD ZIL:          2m57s  (Flash Accelerator F20)
disabled ZIL:        0m15s  (local extraction: 0m0.269s)

I was not so much interested in the absolute numbers but rather in the relative performance differences between the standard ZIL, the SSD ZIL and the disabled ZIL cases. Any opinions on the results? I wish the SSD ZIL performance was closer to the disabled ZIL case than it is right now. ATM I tend to use two F20 FMods for the log and the two other FMods as L2ARC cache devices (although the system has lots of system memory, i.e. the L2ARC is not really necessary). But the speedup of disabling the ZIL altogether is appealing (and would probably be acceptable in this environment). -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
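For reference, the tested slog layouts correspond to commands along these lines (the cXtYdZ names for the F20 FMods are made up, and "tank" stands for the data pool):

# zpool add tank log c5t0d0                  # 1x SSD ZIL (single FMod slog)
# zpool add tank log c5t0d0 c5t1d0           # 2x SSD ZIL (two FMods, striped)
# zpool add tank log mirror c5t0d0 c5t1d0    # 2x SSD mirrored ZIL

The "disabled ZIL" case is the old global tunable in /etc/system (it affects every pool on the box):

set zfs:zil_disable = 1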
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Thank you Darren. So no zvols as L2ARC cache device. That leaves partitions and slices. When I tried to add a second partition as cache device (the first partition contained the slices with the root pool), zpool refused; it reported that the device cXtYdZp2 (note p2) wasn't supported. Perhaps I did something wrong with fdisk, but both partitions were there, also in parted. The other option is to use a slice in the first partition as cache device, but I messed up my boot environment with that, and I'm not sure about the correct resizing procedure. Any suggestions? Regards, Frederik. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
F. Wessels wrote: Thank you Erik for the reply. I misunderstood Dan's suggestion about the zvol in the first place. Now you make the same suggestion also. Doesn't zfs prefer raw devices? When following this route the zvol used as cache device for tank makes use of the ARC of rpool what doesn't seem right. Or is there some setting to prevent this. It's zfs preference for raw devices that I looked for a way to use a slice as cache. Regards, Frederik As Darren pointed out, you can't use anything but a block device for the L2ARC device. So my suggestion doesn't work. You can use a file or zvol as a ZIL device, but not an L2ARC device. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs diff
On Mon, Mar 29, 2010 at 5:39 PM, Nicolas Williams nicolas.willi...@sun.com wrote: One really good use for zfs diff would be: as a way to index zfs send backups by contents. Or to generate the list of files for incremental backups via NetBackup or similar. This is especially important for file systems with millions of files and relatively few changes.

+1 The reason zfs send is so fast is not raw throughput; it's that it does not need any time to index, compare and analyze which files have changed since the last snapshot or increment. If the zfs diff command could generate the list of changed files and you fed that into tar or whatever, then these 3rd-party backup tools would suddenly become much more effective - able to rival the performance of zfs send. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
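Something like the following is what's being asked for (purely illustrative - zfs diff is not available yet at this point, so the dataset names and the output format are guesses; the point is only that the snapshot-to-snapshot file list comes from ZFS instead of a full tree walk):

# zfs diff tank/data@monday tank/data@tuesday > /tmp/changed.txt
# awk '$1 != "-" {print $2}' /tmp/changed.txt | gtar -czf /backup/data-incr.tgz -T -

(skip the "-" entries, i.e. removed files, and hand everything else to GNU tar as an include list)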
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Just clarifying Darren's comment - we got bitten by this pretty badly so I figure it's worth saying again here. ZFS will *allow* you to use a ZVOL of one pool as a ZDEV in another pool, but it results in race conditions and an unstable system. (At least on Solaris 10 update 8). We tried to use a ZVOL from rpool (on fast 15k rpm drives) as a cache device for another pool (on slower 7.2k rpm drives). It worked great up until it hit the race condition and hung the system. It would have been nice if zfs had issued a warning, or at least if this fact was better documented. Scott Duckworth, Systems Programmer II Clemson University School of Computing On Tue, Mar 30, 2010 at 5:09 AM, Darren J Moffat darr...@opensolaris.orgwrote: On 30/03/2010 10:05, Erik Trimble wrote: F. Wessels wrote: Thanks for the reply. I didn't get very much further. Yes, ZFS loves raw devices. When I had two devices I wouldn't be in this mess. I would simply install opensolaris on the first disk and add the second ssd to the data pool with a zpool add mpool cache cxtydz Notice that no slices or partitions were used. But I don't have space for two devices. So I have to deal with slices and partitions. I did another clean install in 12Gb partition leaving 18Gb free. I tried parted to resize the partition, but it said that resizing (solaris2) partitions wasn't implemented. I tried fdisk but no luck either. I tried the send and receive, create new partition and slices, restore rpool in slice0, do installgrub but it wouldn't boot anymore. Can anybody give a summary of commands/steps howto accomplish a bootable rpool and l2arc on a ssd. Preferably for the x86 platform. Look up zvols, as this is what you want to use, NOT partitions (for the many reasons you've encountered). In this case partitions is the only way this will work. In essence, do a normal install, using the ENTIRE disk for your rpool. Then create a zvol in the rpool: # zfs create -V 8GB rpool/zvolname Add this zvol as the cache device (L2arc) for your other pool # zpool create tank mirror c1t0d0 c1t1d0s0 cache rpool/zvolname That won't work L2ARC devices can not be a ZVOL of another pool, they can't be a file either. An L2ARC device must be a physical device. -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
you can't use anything but a block device for the L2ARC device.

Sure you can... http://mail.opensolaris.org/pipermail/zfs-discuss/2010-March/039228.html It even lives through a reboot (rpool is mounted before other pools):

zpool create -f test c9t3d0s0 c9t4d0s0
zfs create -V 3G rpool/cache
zpool add test cache /dev/zvol/dsk/rpool/cache
reboot

If you're asking for an L2ARC on rpool itself, well, yeah, it's not mounted soon enough, but the point is to put rpool, swap, and the L2ARC for your storage pool all on a single SSD. Rob ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool split problem?
OK, I see what the problem is: the /etc/zfs/zpool.cache file. When the pool was split, the zpool.cache file was also split - and the split happens prior to the config file being updated. So, after booting off the split side of the mirror, zfs attempts to mount rpool based on the information in the zpool.cache file (which still shows it as a mirror of c0t0d0s0 and c0t1d0s0). The fix would be to remove the appropriate entry from the split-off pool's zpool.cache file. Easy to say, not so easy to do. I have filed CR 6939334 to track this issue. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
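A sketch of the kind of manual fix implied, until that CR is addressed (entirely untested, and the pool/BE names are invented - the idea is only to remove the stale zpool.cache from the split-off root before booting from it):

# zpool split -R /a rpool rpool2          # split the mirror and import the new pool under /a
# zfs mount rpool2/ROOT/opensolaris       # mount the split-off boot environment (name will differ)
# rm /a/etc/zfs/zpool.cache               # drop the cache file inherited from the mirrored pool
# zpool export rpool2

The cache file should then be rebuilt the next time that side is booted and imported.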
Re: [zfs-discuss] zfs recreate questions
Thanks for the details Edward, that is good to know. Another quick question. In my test setup I created the pool using snv_134 because I wanted to see how things would run as the next release is supposed to be based off of snv_134 (from my understanding). However, I recently read that the 2010.03 release time is unknown and that things are kind of uncertain (is this true? Is there a link that provides info about the release schedule?). Anyway, my question is, I created the pool with snv_134 and since I need to get this hardware to production, I can't wait for the new release and have to move forward with 2009.06. However, as expected I can't import it because the pool was created with a newer version of ZFS. What options are there to import? Like I said, I don't need the data, so I can blow out the pool and start over. However, I was curious to see how ZFS could handle this situation. Thanks guys, I really appreciate all the info. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
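If the data really is expendable, the usual way out is to recreate the pool at an on-disk version the older release understands; 2009.06 reports its maximum with zpool upgrade -v (pool version 14, if I remember correctly). A sketch, with an invented vdev layout:

On the 2009.06 box:
# zpool upgrade -v            # shows the highest pool version this release supports

On the snv_134 box:
# zpool destroy tank
# zpool create -o version=14 tank raidz c1t0d0 c1t1d0 c1t2d0

A pool created that way can then be exported and imported on 2009.06.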
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
On Mar 29, 2010, at 1:10 PM, F. Wessels wrote: Hi, as Richard Elling wrote earlier: For more background, low-cost SSDs intended for the boot market are perfect candidates. Take a X-25V @ 40GB and use 15-20 GB for root and the rest for an L2ARC. For small form factor machines or machines with max capacity of 8GB of RAM (a typical home system) this can make a pleasant improvement over a HDD-only implementation. For the upcoming 2010.03 release and now testing with a b134. What is the most appropiate way to accomplish this? The most appropriate (supportable by Oracle) is to use the automated installer. An example of the manifest is: http://dlc.sun.com/osol/docs/content/dev/AIinstall/customai.html#ievtoc The caiman installer allows you to control the size of the partition on the boot disk but it doesn't allow you (at least I couldn't figure out how) to control the size of the slices. So you end with slice0 filling the entire partition. Now this leaves you with two options, create a second partition or start a complex process of backing up the root pool, reslicing the first partition, restore the root pool and pray that the system will boot again. I tried the first, knowing that multiple partitions isn't recommended. I couldn't get zfs to add the second partition as L2ARC. It simply said that it wasn't supported. Before I try the second option perhaps somebody can give some directions howto accomplish a shared rpool and l2arc on a (ss)disk. There are perhaps a half dozen ways to do this. As others have mentioned, using fdisk partitions can be done and is particularly easy when using the text-based installer. However, with that option you need a smarter partition editor than fdisk (eg. gparted) And, of course, you can fake out the installer altogether... or even change the source code... -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
F. Wessels wrote: Hi, as Richard Elling wrote earlier: For more background, low-cost SSDs intended for the boot market are perfect candidates. Take a X-25V @ 40GB and use 15-20 GB for root and the rest for an L2ARC. For small form factor machines or machines with max capacity of 8GB of RAM (a typical home system) this can make a pleasant improvement over a HDD-only implementation. For the upcoming 2010.03 release and now testing with a b134. What is the most appropiate way to accomplish this? The caiman installer allows you to control the size of the partition on the boot disk but it doesn't allow you (at least I couldn't figure out how) to control the size of the slices. So you end with slice0 filling the entire partition. Now this leaves you with two options, create a second partition or start a complex process of backing up the root pool, reslicing the first partition, restore the root pool and pray that the system will boot again. I tried the first, knowing that multiple partitions isn't recommended. I couldn't get zfs to add the second partition as L2ARC. It simply said that it wasn't supported. Before I try the second option perhaps somebody can give some directions howto accomplish a shared rpool and l2arc on a (ss)disk. Regards, Frederik As I think was possibly mentioned before on this thread, what you probably want to do is either: (a) create a zvol inside the existing rpool, then add the zvol as an L2ARC or (b) create a file in one of the rpool filesystems, and add that as the L2ARC Likely, (a) is the better option. So, go ahead and give the entire boot SSD to the installer to create a rpool of the entire disk, then zvol off a section to be used as the L2ARC. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
On 3/30/2010 2:44 PM, Adam Leventhal wrote: Hey Karsten, Very interesting data. Your test is inherently single-threaded so I'm not surprised that the benefits aren't more impressive -- the flash modules on the F20 card are optimized more for concurrent IOPS than single-threaded latency. Yes it would be interesting to see the Avg numbers for 10 or more clients (or jobs on one client) all performing that same test. -Kyle Adam On Mar 30, 2010, at 3:30 AM, Karsten Weiss wrote: Hi, I did some tests on a Sun Fire x4540 with an external J4500 array (connected via two HBA ports). I.e. there are 96 disks in total configured as seven 12-disk raidz2 vdevs (plus system, spares, unused disks) providing a ~ 63 TB pool with fletcher4 checksums. The system was recently equipped with a Sun Flash Accelerator F20 with 4 FMod modules to be used as log devices (ZIL). I was using the latest snv_134 software release. Here are some first performance numbers for the extraction of an uncompressed 50 MB tarball on a Linux (CentOS 5.4 x86_64) NFS-client which mounted the test filesystem (no compression or dedup) via NFSv3 (rsize=wsize=32k,sync,tcp,hard). standard ZIL: 7m40s (ZFS default) 1x SSD ZIL: 4m07s (Flash Accelerator F20) 2x SSD ZIL: 2m42s (Flash Accelerator F20) 2x SSD mirrored ZIL: 3m59s (Flash Accelerator F20) 3x SSD ZIL: 2m47s (Flash Accelerator F20) 4x SSD ZIL: 2m57s (Flash Accelerator F20) disabled ZIL: 0m15s (local extraction0m0.269s) I was not so much interested in the absolute numbers but rather in the relative performance differences between the standard ZIL, the SSD ZIL and the disabled ZIL cases. Any opinions on the results? I wish the SSD ZIL performance was closer to the disabled ZIL case than it is right now. ATM I tend to use two F20 FMods for the log and the two other FMods as L2ARC cache devices (although the system has lots of system memory i.e. the L2ARC is not really necessary). But the speedup of disabling the ZIL altogether is appealing (and would probably be acceptable in this environment). -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Adam Leventhal, Fishworkshttp://blogs.sun.com/ahl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot replace a replacing device
Thanks - have run it and returns pretty quickly. Given the output (attached) what action can I take? Thanks James -- This message posted from opensolaris.orgDirty time logs: tank outage [300718,301073] length 356 outage [301138,301139] length 2 outage [301149,301149] length 1 outage [301151,301153] length 3 outage [301155,301155] length 1 outage [301157,301158] length 2 outage [301182,301182] length 1 outage [301262,301262] length 1 outage [301911,301916] length 6 outage [304063,304063] length 1 outage [304791,304796] length 6 raidz outage [300718,301073] length 356 outage [301138,301139] length 2 outage [301149,301149] length 1 outage [301151,301153] length 3 outage [301155,301155] length 1 outage [301157,301158] length 2 outage [301182,301182] length 1 outage [301262,301262] length 1 outage [301911,301916] length 6 outage [304063,304063] length 1 outage [304791,304796] length 6 /dev/ad4 /dev/ad6 replacing outage [300718,301073] length 356 outage [301138,301139] length 2 outage [301149,301149] length 1 outage [301151,301153] length 3 outage [301155,301155] length 1 outage [301157,301158] length 2 outage [301182,301182] length 1 outage [301262,301262] length 1 outage [301911,301916] length 6 outage [304063,304063] length 1 outage [304791,304796] length 6 /dev/ad7/old outage [300718,301073] length 356 outage [301138,301139] length 2 outage [301149,301149] length 1 outage [301151,301153] length 3 outage [301155,301155] length 1 outage [301157,301158] length 2 outage [301182,301182] length 1 outage [301262,301262] length 1 outage [301911,301916] length 6 outage [304063,304063] length 1 outage [304791,304796] length 6 /dev/ad7 outage [300718,301073] length 356 outage [301138,301139] length 2 outage [301149,301149] length 1 outage [301151,301153] length 3 outage [301155,301155] length 1 outage [301157,301158] length 2 outage [301182,301182] length 1 outage [301262,301262] length 1 outage [301911,301916] length 6 outage [304063,304063] length 1 outage [304791,304796] length 6 Metaslabs: vdev 0 0 26 20.0M offset spacemapfree -- 4 52166M 8 56 2.66G c 65 12.4M 10 66 20.7M 14 69 29.1M 18 73 29.7M 1c 77 29.6M 20 81 79.2M 24 91 87.9M 28 92 63.2M 2c 94 94.2M 30 99123M 34 103523M 38 107 50.9M 3c 111117M 40 116 54.3M 44 119 60.2M 48 123 97.4M 4c 126 1.20G 50 129 48.5M 54 132106M 58 137 27.4M 5c 140 39.6M 60 146 45.3M 64 149 34.9M 68 151544M 6c 154 36.6M 70 156 19.4M 74 160 35.7M 78 162 41.2M 7c 166 23.1M 9c 74 14.1M a0 78 15.2M a4 88 28.1M a8 174 23.3M ac 178 24.2M b0 181 26.3M b4 100 43.4M b8 104 33.6M bc 108 30.6M c0 113 59.8M c4 115 53.9M c8 120 30.8M cc 124 82.2M d0 127 36.9M d4 130 76.2M d8 133 39.7M
Re: [zfs-discuss] Mounting a snapshot of an iSCSI volume using Windows
Hello, I wanted to know if there are any updates on this topic. Regards, Robert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Simultaneous failure recovery
I have a pool (on an X4540 running S10U8) in which a disk failed, and the hot spare kicked in. That's perfect. I'm happy. Then a second disk fails. Now, I've replaced the first failed disk, and it's resilvered and I have my hot spare back. But: why hasn't it used the spare to cover the other failed drive? And can I hotspare it manually? I could do a straight replace, but that isn't quite the same thing. -- -Peter Tribble http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] sharing a ssd between rpool and l2arc
Hi all, yes, it works with partitions. I think I made a typo during the initial testing of adding a partition as cache - probably swapped the 0 for an o. Tested with the b134 GUI and text installer on the x86 platform. So here it goes (a condensed recap follows below):

Install OpenSolaris into a partition and leave some space for the L2ARC. This will remove all partitions from the disk! After the installation, log in.

# fdisk /dev/rdsk/[boot disk]p0
- select option 1 to create a partition
- select option 1 for a SOLARIS2 partition type
- specify a size
- answer no, do not make this partition active
- when satisfied with the result, write your changes by choosing 6 and exit fdisk

Finally, add your cache device to your data pool:

# zpool add mpool cache /dev/rdsk/cXtYdZp2

That's it. Some notes:
- you CAN remove the cache device from the pool.
- you CAN import the pool with a missing cache device. (Remember you CAN'T import a pool with a missing slog!)

Open questions:
- What happens when one cache device fails in case of several striped cache devices? Will this disable the entire L2ARC, or will it continue to function minus the faulty device?
- The alignment mess. I know that the (Intel) SSDs are sensitive to misalignment. fdisk only allows you to enter cylinders, NOT LBA addresses. You can probably work around that with parted.
- Does anybody know more about the recently announced flash-aware sd driver? This was on storage-discuss a couple of days ago.

Does anybody have any tips to squeeze the most out of the SSDs? Thank you all for your time and interest. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
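Condensed into one sketch (assuming the SSD is c4t0d0 and the new partition comes out as p2 - adjust to taste):

# fdisk /dev/rdsk/c4t0d0p0
   (option 1 to create a partition, type SOLARIS2, give it the leftover size,
    answer "no" to making it active, then 6 to write the table and exit)
# zpool add mpool cache /dev/rdsk/c4t0d0p2
# zpool status mpool          # the p2 partition should now be listed under "cache"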
Re: [zfs-discuss] Simultaneous failure recovery
On 03/31/10 10:39 AM, Peter Tribble wrote: I have a pool (on an X4540 running S10U8) in which a disk failed, and the hot spare kicked in. That's perfect. I'm happy. Then a second disk fails. Now, I've replaced the first failed disk, and it's resilvered and I have my hot spare back. But: why hasn't it used the spare to cover the other failed drive? And can I hotspare it manually? I could do a straight replace, but that isn't quite the same thing.

Was the spare still available when the second drive failed? If not, I don't think it will get used. My understanding is that spares are added when the drive is faulted, so it's an event- rather than level-driven action. At least I'm not the only one seeing multiple drive failures this week! -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
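If the spare now shows as AVAIL again, it can be brought in by hand with a replace (device names invented):

# zpool status tank                    # note the FAULTED disk and the AVAIL spare
# zpool replace tank c2t5d0 c4t7d0     # attach spare c4t7d0 in place of failed disk c2t5d0

Once the failed disk is physically replaced and resilvered, a "zpool detach" of the spare returns it to the spare list.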
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
On Mar 30, 2010, at 2:50 PM, Jeroen Roodhart wrote: Hi Karsten. Adam, List, Adam Leventhal wrote: Very interesting data. Your test is inherently single-threaded so I'm not surprised that the benefits aren't more impressive -- the flash modules on the F20 card are optimized more for concurrent IOPS than single-threaded latency. Well, I actually wanted to do a bit more bottleneck searching, but let me weigh in with some measurements of our own :) We're om a single X4540 with quad-core CPUs so we're on the older hypertransport bus. Connected it up to two X2200-s running Centos 5, each on its own 1Gb link. Switched write caching off with the following addition to the /kernel/drv/sd.conf file (Karsten: if you didn't do this already, you _really_ want to :) ): # http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes # Add whitespace to make the vendor ID (VID) 8 ... and Product ID (PID) 16 characters long... sd-config-list = ATA MARVELL SD88SA02,cache-nonvolatile; cache-nonvolatile=1, 0x4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1; If you are going to trick the system into thinking a volatile cache is nonvolatile, you might as well disable the ZIL -- the data corruption potential is the same. As test we've found that untarring an eclipse sourcetar file is a good use case. So we use that. Called from a shell script that creates a directory, pushes directory and does the unpacking, for 40 times on each machine. Now for the interesting bit: When we use one vmod, both machines are finished in about 6min45, zilstat maxes out at about 4200 IOPS. Using four vmods it takes about 6min55, zilstat maxes out at 2200 IOPS. In both cases, probing the hyper transport bus seems to show no bottleneck there (although I'd like to see the biderectional flow, but I know we can't :) ). Network stays comfortably under the 400Mbits/s and that's peak load when using 1 vmod. Looking at the IO-connection architecture, it figures that in this set we traverse the different HT busses quite a lot. So we've also placed an Intel dual 1Gb NIC in another PCIE slot, so that ZIL traffic should only have to use 1 HT bus (not counting offloading intelligence). That helped a bit, but not much: Around 6min35 using one vmod and 6min45 using four vmod-s. It made looking at the HT-dtrace more telling though. Since the outgoing HT-bus to the F20 (and the e1000-s) is now, expectedly, a better indication of the ZIL traffic. We didn't do the 40 x 2 untar test whilst not using a SSD device. As an indication: unpacking a single tarbal then takes about 1min30. In case it means anything, single tarbal unpack no_zil, 1vmod, 1vmod_Intel, 4vmod-s, 4vmod_Intel measures around (decimals only used as indication!): 4s, 12s,11.2s, 12.5s, 11.6s Taking this all in account, I still don't see what's holding it up. Interestingly enough, the client side times are close within about 10 secs, but zilstat shows something different. Hypothesis: Zilstat shows only one vmod andwere capped in a layer above the ZIL? Can't rule out networking just yet, but my gut tells me we're not network bound here. That leaves the ZFS ZPL/VFS layer? The difference between writing to the ZIL and not writing to the ZIL is perhaps thousands of CPU cycles. For a latency-sensitive workload this will be noticed. -- richard I'm very open to suggestions on how to proceed... 
:) With kind regards, Jeroen -- Jeroen Roodhart ICT Consultant University of Amsterdam j.r.roodhart uva.nl Informatiseringscentrum Technical support/ATG -- See http://www.science.uva.nl/~jeroen for openPGP public key -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] VMware client solaris 10, RAW physical disk and zfs snapshots problem - all created snapshots are equal to zero.
I'm running Windows 7 64-bit and VMware Player 3 with Solaris 10 64-bit as a client. I have added an additional hard drive to the virtual Solaris 10 as a physical (raw) drive. Solaris 10 can see and use the already created zpool without problems. I could also create an additional zpool on the other mounted raw device. I can also synchronize a zfs file system to the other physical disk and the other zpool with zfs send/receive. All those physical disks are presented in Windows 7 and not initialized. The problem that I have now is that each created snapshot is always equal to zero... zfs is just not storing the changes that I have made to the file system before making a snapshot. Before each snapshot I added or deleted different files, so the snapshot should reflect those differences, but this is not the case. :-( My zfs list now looks like this:

r...@sl-node01:~# zfs list
NAME                            USED  AVAIL  REFER  MOUNTPOINT
mypool01                       91.9G   137G    23K  /mypool01
mypool01/storage01             91.9G   137G  91.7G  /mypool01/storage01
mypool01/storag...@30032010-1      0      -  91.9G  -
mypool01/storag...@30032010-2      0      -  91.9G  -
mypool01/storag...@30032010-3      0      -  91.7G  -
mypool02                       91.9G   137G    24K  /mypool02
mypool02/copies                  23K   137G    23K  /mypool02/copies
mypool02/storage01             91.9G   137G  91.9G  /mypool02/storage01
mypool02/storag...@30032010-1      0      -  91.9G  -
mypool02/storag...@30032010-2      0      -  91.9G  -

As you can see, each snapshot is equal to zero despite the changes that I have made to the filesystem content. I would like to understand what makes it impossible for the virtual Solaris 10 to track those changes when creating a snapshot. Is it a problem with Windows, VMware Player, or with the raw device mounted into the virtual machine? Does someone have experience with this issue? Regards, Vladimir ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
If you are going to trick the system into thinking a volatile cache is nonvolatile, you might as well disable the ZIL -- the data corruption potential is the same.

I'm sorry? I believe the F20 has a supercap or the like? The advice on http://wikis.sun.com/display/Performance/Tuning+ZFS+for+the+F5100#TuningZFSfortheF5100-ZFSF5100 is to disable write caching altogether. We opted not to do _that_ though... :) Are you sure that disabling the write cache on the F20 is a bad thing to do? With kind regards, Jeroen -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
On Mar 30, 2010, at 3:32 PM, Jeroen Roodhart wrote: If you are going to trick the system into thinking a volatile cache is nonvolatile, you might as well disable the ZIL -- the data corruption potential is the same. I'm sorry? I believe the F20 has a supercap or the like? The advise on: You are correct, I misread the Marvell (as in F20) and X4540 (as in not X4500) combination. http://wikis.sun.com/display/Performance/Tuning+ZFS+for+the+F5100#TuningZFSfortheF5100-ZFSF5100 Is to disable write caching altogether. We opted not to do _that_ though... :) Good idea. That recommendation is flawed for the general case and only applies when all devices have nonvolatile caches. Are you sure about disabling write cache on the F20 is a bad thing to do? I agree that it is a reasonable choice. For this case, what is the average latency to the F20? -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
Richard Elling wrote: On Mar 30, 2010, at 3:32 PM, Jeroen Roodhart wrote: If you are going to trick the system into thinking a volatile cache is nonvolatile, you might as well disable the ZIL -- the data corruption potential is the same. I'm sorry? I believe the F20 has a supercap or the like? The advise on: You are correct, I misread the Marvell (as in F20) and X4540 (as in not X4500) combination. http://wikis.sun.com/display/Performance/Tuning+ZFS+for+the+F5100#TuningZFSfortheF5100-ZFSF5100 Is to disable write caching altogether. We opted not to do _that_ though... :) Good idea. That recommendation is flawed for the general case and only applies when all devices have nonvolatile caches. Are you sure about disabling write cache on the F20 is a bad thing to do? I agree that it is a reasonable choice. For those following along at home, I'm pretty sure that the terminology being used is confusing at best, and just plain wrong at worst. The write cache is _not_ being disabled. The write cache is being marked as non-volatile. By marking the write cache as non-volatile, one is telling ZFS to not issue cache flush commands. BTW, why is a Sun/Oracle branded product not properly respecting the NV bit in the cache flush command? This seems remarkably broken, and leads to the amazingly bad advice given on the wiki referenced above. -- Carson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How to destroy iscsi dataset?
Our backup system has a couple of datasets used for iscsi that have somehow lost their baseline snapshots with the live system. In fact zfs list -t snapshots doesn't show any snapshots at all for them. We rotate backup and live every now and then, so these datasets have been shared at some time. Therefore an incremental zfs send/recv will fail for these datasets. The send script automatically uses a non-incremental send if the target dataset is missing, so all I need to do is somehow destroy them. # svcs -a | grep iscsi disabled 18:50:21 svc:/network/iscsi_initiator:default disabled 18:50:34 svc:/network/iscsi/target:default disabled 18:50:38 svc:/system/iscsitgt:default disabled 18:50:39 svc:/network/iscsi/initiator:default # zfs list space/os-vdisks/osolx86 NAME USED AVAIL REFER MOUNTPOINT space/os-vdisks/osolx8620G 657G 14.9G - # zfs get shareiscsi space/os-vdisks/osolx86 NAME PROPERTYVALUE SOURCE space/os-vdisks/osolx86 shareiscsi off local # zfs destroy -f space/os-vdisks/osolx86 cannot destroy 'space/os-vdisks/osolx86': dataset is busy AFAIK they aren't shared in any way now. How to delete these datasets, or find out why they are busy? Thanks ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
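A few things worth checking when a zvol claims to be busy even though all the iSCSI services are disabled (the GUID and dataset name below are placeholders):

# swap -l                                   # make sure the zvol isn't in use as a swap device
# dumpadm                                   # ...or as the dump device
# stmfadm list-lu -v                        # COMSTAR logical units, if COMSTAR is configured
# sbdadm list-lu                            # same, from the sbd side
# sbdadm delete-lu 600144F0...              # if an LU still maps to the zvol, delete it first
# zfs destroy space/os-vdisks/osolx86       # then retry the destroy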
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
But the speedup of disabling the ZIL altogether is appealing (and would probably be acceptable in this environment). Just to make sure you know ... if you disable the ZIL altogether, and you have a power interruption, failed cpu, or kernel halt, then you're likely to have a corrupt unusable zpool, or at least data corruption. If that is indeed acceptable to you, go nuts. ;-) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
standard ZIL: 7m40s (ZFS default) 1x SSD ZIL: 4m07s (Flash Accelerator F20) 2x SSD ZIL: 2m42s (Flash Accelerator F20) 2x SSD mirrored ZIL: 3m59s (Flash Accelerator F20) 3x SSD ZIL: 2m47s (Flash Accelerator F20) 4x SSD ZIL: 2m57s (Flash Accelerator F20) disabled ZIL: 0m15s (local extraction0m0.269s) I was not so much interested in the absolute numbers but rather in the relative performance differences between the standard ZIL, the SSD ZIL and the disabled ZIL cases. Oh, one more comment. If you don't mirror your ZIL, and your unmirrored SSD goes bad, you lose your whole pool. Or at least suffer data corruption. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] VMware client solaris 10, RAW physical disk and zfs snapshots problem - all created snapshots are equal to zero.
What size is the gz file if you do an incremental send to a file? Something like: zfs send -i sn...@vol sn...@vol | gzip > /someplace/somefile.gz -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] VMware client solaris 10, RAW physical disk and zfs snapshots problem - all created snapshots are equal to zero.
The problem that I have now is that each created snapshot is always equal to zero... zfs just not storing changes that I have made to the file system before making a snapshot. r...@sl-node01:~# zfs list NAME USED AVAIL REFER MOUNTPOINT mypool01 91.9G 137G 23K /mypool01 mypool01/storage01 91.9G 137G 91.7G /mypool01/storage01 mypool01/storag...@30032010-1 0 - 91.9G - mypool01/storag...@30032010-2 0 - 91.9G - mypool01/storag...@30032010-3 0 - 91.7G - mypool02 91.9G 137G 24K /mypool02 mypool02/copies 23K 137G 23K /mypool02/copies mypool02/storage01 91.9G 137G 91.9G /mypool02/storage01 mypool02/storag...@30032010-1 0 - 91.9G - mypool02/storag...@30032010-2 0 - 91.9G - Try this: zfs snapshot mypool01/storag...@30032010-4 dd if=/dev/urandom of=/mypool01/storage01/randomfile bs=1024k count=1024 zfs snapshot mypool01/storag...@30032010-5 rm /mypool01/storage01/randomfile zfs snapshot mypool01/storag...@30032010-6 zfs list And see what happens. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
On Tue, 30 Mar 2010, Edward Ned Harvey wrote: But the speedup of disabling the ZIL altogether is appealing (and would probably be acceptable in this environment). Just to make sure you know ... if you disable the ZIL altogether, and you have a power interruption, failed cpu, or kernel halt, then you're likely to have a corrupt unusable zpool, or at least data corruption. If that is indeed acceptable to you, go nuts. ;-) I believe that the above is wrong information as long as the devices involved do flush their caches when requested to. Zfs still writes data in order (at the TXG level) and advances to the next transaction group when the devices written to affirm that they have flushed their cache. Without the ZIL, data claimed to be synchronously written since the previous transaction group may be entirely lost. If the devices don't flush their caches appropriately, the ZIL is irrelevant to pool corruption. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
Again, we can't get a straight answer on this one.. (or at least not 1 straight answer...) Since the ZIL logs are committed atomically they are either committed in FULL, or NOT at all (by way of rollback of incomplete ZIL applies at zpool mount time / or transaction rollbacks if things go exceptionally bad), the only LOST data would be what hasn't been transferred from ZIL to the primary pool.. But the pool should be sane. If this is true ... Suppose you shutdown a system, remove the ZIL device, and power back on again. What will happen? I'm informed that with current versions of solaris, you simply can't remove a zil device once it's added to a pool. (That's changed in recent versions of opensolaris) ... but in any system where removing the zil isn't allowed, what happens if the zil is removed? I have to assume something which isn't quite sane happens. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
On Tue, 30 Mar 2010, Edward Ned Harvey wrote: If this is true ... Suppose you shutdown a system, remove the ZIL device, and power back on again. What will happen? I'm informed that with current versions of solaris, you simply can't remove a zil device once it's added to a pool. (That's changed in recent versions of opensolaris) ... but in any system where removing the zil isn't allowed, what happens if the zil is removed? If the ZIL device goes away then zfs might refuse to use the pool without user affirmation (due to potential loss of uncommitted transactions), but if the dedicated ZIL device is gone, zfs will use disks in the main pool for the ZIL. This has been clarified before on the list by top zfs developers. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs recreate questions
Anyway, my question is, [...] as expected I can't import it because the pool was created with a newer version of ZFS. What options are there to import? I'm quite sure there is no option to import or receive or downgrade a zfs filesystem from a later version. I'm pretty sure your only option is something like tar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
If the ZIL device goes away then zfs might refuse to use the pool without user affirmation (due to potential loss of uncommitted transactions), but if the dedicated ZIL device is gone, zfs will use disks in the main pool for the ZIL. This has been clarified before on the list by top zfs developers. Here's a snippet from man zpool. (Latest version available today in solaris) zpool remove pool device ... Removes the specified device from the pool. This command currently only supports removing hot spares and cache devices. Devices that are part of a mirrored configura- tion can be removed using the zpool detach command. Non-redundant and raidz devices cannot be removed from a pool. So you think it would be ok to shutdown, physically remove the log device, and then power back on again, and force import the pool? So although there may be no live way to remove a log device from a pool, it might still be possible if you offline the pool to ensure writes are all completed before removing the device? If it were really just that simple ... if zfs only needed to stop writing to the log device and ensure the cache were flushed, and then you could safely remove the log device ... doesn't it seem silly that there was ever a time when that wasn't implemented? Like ... Today. (Still not implemented in solaris, only opensolaris.) I know I am not going to put the health of my pool on the line, assuming this line of thought. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
if you disable the ZIL altogether, and you have a power interruption, failed cpu, or kernel halt, then you're likely to have a corrupt unusable zpool

The pool will always be fine, no matter what.

or at least data corruption.

Yeah, it's a good bet that data sent to your file or zvol will not be there when the box comes back, even though your program had finished seconds before the crash. Rob ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
So you think it would be ok to shutdown, physically remove the log device, and then power back on again, and force import the pool? So although there may be no live way to remove a log device from a pool, it might still be possible if you offline the pool to ensure writes are all completed before removing the device? If it were really just that simple ... if zfs only needed to stop writing to the log device and ensure the cache were flushed, and then you could safely remove the log device ... doesn't it seem silly that there was ever a time when that wasn't implemented? Like ... Today. (Still not implemented in solaris, only opensolaris.) Allow me to clarify a little further, why I care about this so much. I have a solaris file server, with all the company jewels on it. I had a pair of intel X.25 SSD mirrored log devices. One of them failed. The replacement device came with a newer version of firmware on it. Now, instead of appearing as 29.802 Gb, it appears at 29.801 Gb. I cannot zpool attach. New device is too small. So apparently I'm the first guy this happened to. Oracle is caught totally off guard. They're pulling their inventory of X25's from dispatch warehouses, and inventorying all the firmware versions, and trying to figure it all out. Meanwhile, I'm still degraded. Or at least, I think I am. Nobody knows any way for me to remove my unmirrored log device. Nobody knows any way for me to add a mirror to it (until they can locate a drive with the correct firmware.) All the support people I have on the phone are just as scared as I am. Well we could upgrade the firmware of your existing drive, but that'll reduce it by 0.001 Gb, and that might just create a time bomb to destroy your pool at a later date. So we don't do it. Nobody has suggested that I simply shutdown and remove my unmirrored SSD, and power back on. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Sun Flash Accelerator F20 numbers
Just to make sure you know ... if you disable the ZIL altogether, and you have a power interruption, failed cpu, or kernel halt, then you're likely to have a corrupt unusable zpool, or at least data corruption. If that is indeed acceptable to you, go nuts. ;-) I believe that the above is wrong information as long as the devices involved do flush their caches when requested to. Zfs still writes data in order (at the TXG level) and advances to the next transaction group when the devices written to affirm that they have flushed their cache. Without the ZIL, data claimed to be synchronously written since the previous transaction group may be entirely lost. If the devices don't flush their caches appropriately, the ZIL is irrelevant to pool corruption. I stand corrected. You don't lose your pool. You don't have corrupted filesystem. But you lose whatever writes were not yet completed, so if those writes happen to be things like database transactions, you could have corrupted databases or files, or missing files if you were creating them at the time, and stuff like that. AKA, data corruption. But not pool corruption, and not filesystem corruption. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss