Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-04-10 Thread Markus Kovero
...
 
 I have identified the culprit as the Western Digital drive WD2002FYPS-01U1B0. 
 It's not clear if they can fix it in firmware, but Western Digital is 
 replacing my drives.

 Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc
 Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info]
 /p...@0,0/pci15ad,7...@15/pci1000,3...@0 (mpt_sas0):
 Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
 Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc

Hi, do you have the disks connected at SATA1 or SATA2? With 
WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0 these 
timeouts are to be expected if the disk is in SATA2 mode. 
We got rid of these timeouts after forcing the disks into SATA1 mode with jumpers; 
now they only appear when a disk has real issues and needs to be replaced.


Yours 
Markus Kovero
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-10 Thread Erik Trimble

Tonmaus wrote:

Hi David,

why not just use a couple of SAS expanders? 


Regards,

Tonmaus
  

I would go this route (SAS expanders for an 8-port HBA).

E.g.:

http://www.supermicro.com/products/accessories/mobilerack/CSE-M28E2.cfm

is a great internal bay setup if you want a non-Supermicro case; $250 
or thereabouts.



You can get two of these, plus a pair of nice LSI HBAs.  As an added 
bonus, it's easy to put an SSD into one or more of the bays with no 
additional rewiring needed.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-04-10 Thread Maurice Volaski
Hi, do you have the disks connected at SATA1 or SATA2? With 
WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0 
these timeouts are to be expected if the disk is in SATA2 mode,


No, why are they to be expected with SATA2 mode? Is the defect 
specific to the SATA2 circuitry? I guess it could be a temporary 
workaround provided they would eventually fix the problem in 
firmware, but I'm getting new drives, so I guess I can't complain :-)

--

Maurice Volaski, maurice.vola...@einstein.yu.edu
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-04-10 Thread Markus Kovero

 No, why are they to be expected with SATA2 mode? Is the defect 
 specific to the SATA2 circuitry? I guess it could be a temporary 
 workaround provided they would eventually fix the problem in 
 firmware, but I'm getting new drives, so I guess I can't complain :-)

Your new disks will probably do this too. I really don't know what's wrong with the flaky 
SATA2 mode, but I'd be quite sure the SATA1 workaround would fix your issues.
The performance drop is not even noticeable, so it's worth a try.

Yours
Markus Kovero

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-10 Thread Richard Jahnel
Just as an FYI, not all drives like SAS expanders.

As an example, we had a lot of trouble with Indilinx MLC-based SSDs. The 
systems had Adaptec 52445 controllers and Chenbro SAS expanders. In the end we 
had to remove the SAS expanders and put a second 52445 in each system to get them 
to work properly.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about backup and mirrored pools

2010-04-10 Thread Harry Putnam
David Dyer-Bennet d...@dd-b.net writes:

[...]

 Am I way wrong on this, and further I'm curious if it would make more
 versatile use of the space if I were to put the mirrored pairs into
 one big pool containing 3 mirrored pairs (6 discs)

 Well, my own thinking doesn't consider that adequate for my own data;
 which is not identical to thinking you're actually wrong, of course.

 Issues I see include:  Flood, fire, foes, bugs, user error.  rm -rf /
 will destroy your data just as well on the mirror as on a single disk, as
 will hacker breakins.  OS and driver bugs can corrupt both sides of the
 mirror.  And burning your house down, or flooding it perhaps (depending on
 where your server is; mine's in the basement, so if we flood, it gets
 wet), will destroy your data.

Yeah, there is all that, but in my case the data also exists in bits and
pieces across several other machines.  Is yours only on the zfs server?

An example here might be my photo/music collection.  It resides on a
windows XP pro machine where I have the tools I use to tinker with it.

I back it up to the zfs server, but what is on the server is always a
bit older (between backups) than the current version on the windows
machine.

So if the zfs server were to be beamed up to another solar system, I'd
still have the latest greatest version on the windows machine.

Anyway losing my entire house and several machines would leave me with
much bigger problems than losing my photo/music collection.

The shelter I'd be living in wouldn't have room for several machines.
Nor would I have money to spend on such luxuries.

 I make and keep off-site backups, formerly on optical media, moving
 towards external disk drives.

I'd be interested to hear about that.  If you think it's OT here, feel
free to write me directly (reader AT newsguy DOT com).

I have something like a terabyte of data on the server.  Man, I'd
really hate to try to back that up to optical media.  Even backing up to an
external hard drive would be a major time sink.

Just backing up the 80 or so GB of photos/music to optical media is
a nasty undertaking.  I quit doing that when it grew past 15 GB or so.

[...] thanks for the other (snipped) input.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about backup and mirrored pools

2010-04-10 Thread Harry Putnam
Bob Friesenhahn bfrie...@simple.dallas.tx.us writes:

 On Fri, 9 Apr 2010, Harry Putnam wrote:

 Am I way wrong on this, and further I'm curious if it would make more
 versatile use of the space if I were to put the mirrored pairs into
 one big pool containing 3 mirrored pairs (6 discs)

 Besides more versatile use of the space, you would get 3X the
 performance.

That speed-up surprises me. Can you explain briefly how that works?

If that is a bit much to ask here, maybe a pointer to specific
documentation?

 Luckily, since you are using mirrors, you can easily migrate disks
 from your existing extra pools to the coalesced pool.  Just make sure
 to scrub first in order to have confidence that there won't be data
 loss.

Would that be done in pairs, or can you show a generalized outline of how
it would be done?  Again, a documentation pointer would be good too. 

Is it wise to have rpool as the migration destination, making it the
only pool?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about backup and mirrored pools

2010-04-10 Thread Harry Putnam
Richard Jahnel rich...@ellipseinc.com writes:

[...]

 Perhaps mirrored sets with daily snapshots and a knowledge of how to
 mount snapshots as clones so that you can pull a copy of that file
 you deleted 3 days ago. :)

I've been doing that with the default auto snapshot setup, but hadn't
noticed a need to mount a snapshot as a clone in order to be able to
pull a file.  I've just sought out the most recent snapshot containing the file
and copied it from there.

I'm not sure now if I've retrieved lost files, or just got an older
version that way... but I think both.

Can you explain a bit why you would need to mount a snapshot as a clone?
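
(For reference, the copy-from-snapshot approach looks roughly like this; the
dataset and snapshot names here are made up.  Snapshots are exposed read-only
under the dataset's .zfs/snapshot directory, so a plain copy is enough to
pull back a single file:

# ls /tank/docs/.zfs/snapshot/
# cp /tank/docs/.zfs/snapshot/daily-2010-04-07/lost.txt /tank/docs/

A clone is only needed when you want a writable, mountable copy of the whole
snapshot; destroy the clone when you're done with it:

# zfs clone tank/docs@daily-2010-04-07 tank/docs-recovered
# zfs destroy tank/docs-recovered
)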

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-10 Thread Tonmaus
As far as I have read, that problem has been reported to be a compatibility 
problem between the Adaptec controller and the expander chipset, e.g. the LSI SASx 
that is also on the mentioned Chenbro expander. There is no problem with the LSI 106x 
chipset and SAS expanders that I know of.

For people sceptical about expanders: quite a few of the Areca cards actually 
have expander chips on board. I don't know about the 1680 specifically, though.

Cheers,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about backup and mirrored pools

2010-04-10 Thread Bob Friesenhahn

On Sat, 10 Apr 2010, Harry Putnam wrote:


Am I way wrong on this, and further I'm curious if it would make more
versatile use of the space if I were to put the mirrored pairs into
one big pool containing 3 mirrored pairs (6 discs)


Besides more versatile use of the space, you would get 3X the
performance.


That speed up surprises me. Can you explain briefly how that works?


It is quite simple.  With three sets of mirrors in the pool, the data 
is distributed across the three mirrors.  There is 3X the hardware 
available for each write.  There is more than 3X the hardware for each 
read since either side of a mirror may be used to satisfy a read.
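
For example, a pool built from three mirrored pairs stripes every write
across all three vdevs (the device names below are just placeholders):

# zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
    mirror c1t4d0 c1t5d0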



If that is a bit much to ask here, maybe a pointer to specific
documentation?


That would be too much to ask since it is clear that you did not spend 
more than a few minutes reading the documentation.  Spend some more 
minutes.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-04-10 Thread Edward Ned Harvey
Due to recent experiences, and discussion on this list, my colleague and I
performed some tests:

 

Using Solaris 10, fully upgraded.  (zpool version 15 is the latest there, which
does not have the log device removal introduced in zpool version 19.)  If you
lose an unmirrored log device in any way possible, the OS will crash, and the
whole zpool is permanently gone, even after reboots.

 

Using OpenSolaris, upgraded to latest, which includes zpool version 22.  (Or
was it 23?  I forget now.)  Anyway, it's >= 19, so it has log device removal.

1.   Created a pool, with unmirrored log device.

2.   Started benchmark of sync writes, verified the log device getting
heavily used.

3.   Yank out the log device.

Behavior was good.  The pool became "degraded," which is to say it started
using the primary storage for the ZIL; performance presumably degraded, but
the system remained operational and error free.

I was able to restore perfect health by "zpool remove" of the failed log
device, and "zpool add" of a new log device.
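
Roughly, that recovery was (pool and failed-device names match the status
output further down; the replacement log device name is just an example):

# zpool remove junkpool c8t3d0
# zpool add junkpool log c8t6d0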

 

Next:

1.   Created a pool, with unmirrored log device.

2.   Started benchmark of sync writes, verified the log device getting
heavily used.

3.   Yank out both power cords.

4.   While the system is down, also remove the log device.

(OOoohhh, that's harsh.)  I created a situation where an unmirrored log
device is known to have unplayed records, there is an ungraceful shutdown,
*and* the device disappears.  That's the absolute worst case scenario
possible, other than the whole building burning down.  Anyway, the system
behaved as well as it possibly could.  During boot, the faulted pool did not
come up, but the OS came up fine.  My zpool status showed this:

 

# zpool status

 

  pool: junkpool

 state: FAULTED

status: An intent log record could not be read.

Waiting for adminstrator intervention to fix the faulted pool.

action: Either restore the affected device(s) and run 'zpool online',

or ignore the intent log records by running 'zpool clear'.

   see: http://www.sun.com/msg/ZFS-8000-K4

 scrub: none requested

config:

 

NAME        STATE     READ WRITE CKSUM

junkpool    FAULTED      0     0     0  bad intent log

  c8t4d0    ONLINE       0     0     0

  c8t5d0    ONLINE       0     0     0

logs

  c8t3d0    UNAVAIL      0     0     0  cannot open

 

(---)

I know the unplayed log device data is lost forever.  So I clear the error,
remove the faulted log device, and acknowledge that I have lost the last few
seconds of written data, up to the system crash:

 

# zpool clear junkpool

# zpool status

 

  pool: junkpool

 state: DEGRADED

status: One or more devices could not be opened.  Sufficient replicas exist
for

the pool to continue functioning in a degraded state.

action: Attach the missing device and online it using 'zpool online'.

   see: http://www.sun.com/msg/ZFS-8000-2Q

 scrub: none requested

config:

 

NAME        STATE     READ WRITE CKSUM

junkpool    DEGRADED     0     0     0

  c8t4d0    ONLINE       0     0     0

  c8t5d0    ONLINE       0     0     0

logs

  c8t3d0    UNAVAIL      0     0     0  cannot open

 

# zpool remove junkpool c8t3d0

# zpool status junkpool

 

  pool: junkpool

 state: ONLINE

 scrub: none requested

config:

 

NAME        STATE     READ WRITE CKSUM

junkpool    ONLINE       0     0     0

  c8t4d0    ONLINE       0     0     0

  c8t5d0    ONLINE       0     0     0

 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-04-10 Thread Bob Friesenhahn

On Sat, 10 Apr 2010, Edward Ned Harvey wrote:


Using solaris 10, fully upgraded.  (zpool 15 is latest, which does not have log 
device removal that was
introduced in zpool 19)  In any way possible, you lose an unmirrored log 
device, and the OS will crash, and
the whole zpool is permanently gone, even after reboots.


Is anyone willing to share what zfs version will be included with 
Solaris 10 U9?  Will graceful intent log removal be included?
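
(For what it's worth, the pool versions supported by whatever build you are
running can be listed with the command below; whether U9 ships with >= 19 is
exactly the open question.)

# zpool upgrade -v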


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?

2010-04-10 Thread Edward Ned Harvey
Neil or somebody?  Actual ZFS developers?  Taking feedback here?   ;-)

 

While I was putting my poor little server through cruel and unusual
punishment as described in my post a moment ago, I noticed something
unexpected:

 

I expected that while I'm stressing my log device by infinite sync writes,
my primary storage devices would also be busy(ish).  Not really busy, but
not totally idle either.  Since the primary storage is a stripe of spindle
mirrors, obviously it can handle much more sustainable throughput than the
individual log device, but the log device can respond with smaller latency.
What I noticed was this:

 

For several seconds, *only* the log device is busy.  Then it stops, and for
maybe 0.5 secs *only* the primary storage disks are busy.  Repeat, recycle.

 

I expected to see the log device busy nonstop, and the spindle disks
blinking lightly.  As long as the spindle disks are idle, why wait for a
larger TXG to be built?  Why not flush out smaller TXGs as long as the
disks are idle?  But worse yet ... during the 1-second (or 0.5-second) period that
the spindle disks are busy, why stop the log device?  (Presumably also
stopping my application that's doing all the writing.)

 

This means, if I'm doing zillions of *tiny* sync writes, I will get the best
performance with the dedicated log device present.  But if I'm doing large
sync writes, I would actually get better performance without the log device
at all.  Or else ... add just as many log devices as I have primary storage
devices.  Which seems kind of crazy.
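
Side note: the alternation is easy to watch with per-vdev I/O statistics,
e.g. on a hypothetical pool named tank:

# zpool iostat -v tank 1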

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-04-10 Thread Tim Cook
On Sat, Apr 10, 2010 at 10:08 AM, Edward Ned Harvey
solar...@nedharvey.com wrote:

[...]


Awesome!  Thanks for letting us know the results of your tests, Ed; that's
extremely helpful.  I was actually interested in grabbing some of the
cheaper Intel SSDs for home use, but didn't want to waste my money if it
wasn't going to handle the various failure modes gracefully.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?

2010-04-10 Thread Bob Friesenhahn

On Sat, 10 Apr 2010, Edward Ned Harvey wrote:


For several seconds, *only* the log device is busy.  Then it stops, 
and for maybe 0.5 secs *only* the primary storage disks are busy.  
Repeat, recycle.


I expected to see the log device busy nonstop.  And the spindle 
disks blinking lightly.  As long as the spindle disks are idle, why 
wait for a larger TXG to be built?  Why not flush out smaller TXG’s 
as long as the disks are idle?  But worse yet … During the 1-second 
(or 0.5 second) that the spindle disks are busy, why stop the log 
device?  (Presumably also stopping my application that’s doing all 
the writing.)


What you are seeing should be expected and is good.  The intent log 
allows synchronous writes to be turned into lazy ordinary writes (like 
async writes) in the next TXG cycle.  Since the intent log is on an 
SSD, the pressure is taken off the primary disks to serve that 
function, so you will not see so many IOPS to the primary disks.


This means, if I’m doing zillions of *tiny* sync writes, I will get 
the best performance with the dedicated log device present.  But if 
I’m doing large sync writes, I would actually get better performance 
without the log device at all.  Or else … add just as many log 
devices as I have primary storage devices.  Which seems kind of 
crazy.


If this is really a problem for you, then you should be able to 
somewhat resolve it by placing a smaller cap on the maximum size of a 
TXG.  Then the system will write more often.  However, the maximum 
synchronous bulk write rate will still be limited by the bandwidth of 
your intent log devices.  Huge synchronous bulk writes are pretty rare 
since usually the bottleneck is elsewhere, such as the ethernet.
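
For anyone wanting to experiment, a rough sketch of that tuning (the tunable
names below are the usual OpenSolaris-era ones; verify them against your
build before relying on them) is to cap the txg size or shorten the txg
interval in /etc/system and reboot:

set zfs:zfs_write_limit_override = 0x8000000
set zfs:zfs_txg_timeout = 5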


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully

2010-04-10 Thread matthew patton
Thanks for the testing. So FINALLY, with version >= 19, ZFS demonstrates 
production-ready status in my book. How long is it going to take Solaris to 
catch up?



  
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ recommendation

2010-04-10 Thread Tim Cook
On Fri, Apr 9, 2010 at 9:31 PM, Eric D. Mudama edmud...@bounceswoosh.org wrote:

 On Sat, Apr 10 at  7:22, Daniel Carosone wrote:

 On Fri, Apr 09, 2010 at 10:21:08AM -0700, Eric Andersen wrote:

  If I could find a reasonable backup method that avoided external
  enclosures altogether, I would take that route.


 I'm tending to like bare drives.

 If you have the chassis space, there are 5-in-3 bays that don't need
 extra drive carriers, they just slot a bare 3.5" drive.  For e.g.

 http://www.newegg.com/Product/Product.aspx?Item=N82E16817994077


 I have a few of the 3-in-2 versions of that same enclosure from the
 same manufacturer, and they installed in about 2 minutes in my tower
 case.

 The 5-in-3 doesn't have grooves in the sides like their 3-in-2 does,
 so some cases may not accept the 5-in-3 if your case has tabs to
 support devices like DVD drives in the 5.25" slots.

 The grooves are clearly visible in this picture:

 http://www.newegg.com/Product/Product.aspx?Item=N82E16817994075

 The doors are a bit light perhaps, but it works just fine for my
 needs and holds drives securely.  The small fans are a bit noisy, but
 since the box lives in the basement I don't really care.

 --eric


 --
 Eric D. Mudama
 edmud...@mail.bounceswoosh.org



At that price, for the 5-in-3 at least, I'd go with Supermicro.  For $20
more, you get what appears to be a far more solid enclosure.

--Tim



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Secure delete?

2010-04-10 Thread Roy Sigurd Karlsbakk
Hi all

Is it possible to securely delete a file from a zfs dataset/zpool once it's 
been snapshotted, meaning delete (and perhaps overwrite) all copies of this 
file?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about backup and mirrored pools

2010-04-10 Thread Harry Putnam
Bob Friesenhahn bfrie...@simple.dallas.tx.us writes:

 On Sat, 10 Apr 2010, Harry Putnam wrote:

 Am I way wrong on this, and further I'm curious if it would make more
 versatile use of the space if I were to put the mirrored pairs into
 one big pool containing 3 mirrored pairs (6 discs)

 Besides more versatile use of the space, you would get 3X the
 performance.

 That speed up surprises me. Can you explain briefly how that works?

 It is quite simple.  With three sets of mirrors in the pool, the data
 is distributed across the three mirrors.  There is 3X the hardware
 available for each write.  There is more than 3X the hardware for each
 read since either side of a mirror may be used to satisfy a read.

Thanks for your comments.  It's always so much easier when someone
explains it in plain English.

 If that is a bit much to ask here, maybe a pointer to specific
 documentation?

 That would be too much to ask since it is clear that you did not spend
 more than a few minutes reading the documentation.  Spend some more
 minutes.

Well now, that would only be true if you meant very recently.  In fact
I have spent quite hefty amounts of time reading ZFS and OpenSolaris
documentation, including hefty tracts of the `Bible' (that isn't even
close) book.

The trouble is that I understand about 1/10 of it, and that 1/10 soon
departs my pea brain when I don't use it daily.

You may be used to dealing with folks who have a basic understanding
of OSes and programming, rc files, etc.  Probably some amount of
formal higher education too.  I've come by that kind of info in a very
haphazard, hardscrabble way.

My education stopped in 9th grade; I went to work at that point, out
in the west of our country.

I started industrial work a few years later (1965) and eventually became a
field construction boilermaker, working around many of the midwestern,
western, and west coast states on refineries, power plants, steel mills,
and other big industrial plants (now retired).

None of that was very conducive to the finer points of operating
systems, programming, or admin chores.

So whatever I've learned about that kind of stuff in the last 10 years
or so has gaping holes that you could drive 18-wheelers through.

I've never found that reading documentation I barely understand is a
very good way of actually learning something.  

On the other hand, hearing about it from current practitioners and heavy
experimentation is (for me) the best way to learn about something.
With some of that going, the documentation may start to be a lot
more meaningful, once I have something to hang it on.

Egad... sorry about the rant but sometimes it seems to just need to be
said. 

Thanks again for what you have contributed, not just to this thread
but your many hundreds, maybe thousands of messages of help to others
as well.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Secure delete?

2010-04-10 Thread Andrey Kuzmin
No, not until all snapshots referencing the file in question are removed.

The simplest way to understand snapshots is to consider them as
references.  Any file-system object (say, a file or a block) is only
removed when its reference count drops to zero.
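
So to actually reclaim the blocks, you have to destroy every snapshot that
still references the file, e.g. (dataset and snapshot names are made up):

# zfs list -t snapshot -r tank/docs
# zfs destroy tank/docs@2010-04-01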

Regards,
Andrey




On Sat, Apr 10, 2010 at 10:20 PM, Roy Sigurd Karlsbakk
r...@karlsbakk.net wrote:
 Hi all

 Is it possible to securely delete a file from a zfs dataset/zpool once it's 
 been snapshotted, meaning delete (and perhaps overwrite) all copies of this 
 file?

 Best regards

 roy
 --
 Roy Sigurd Karlsbakk
 (+47) 97542685
 r...@karlsbakk.net
 http://blogg.karlsbakk.net/
 --
 In all pedagogy it is essential that the curriculum be presented intelligibly. It 
 is an elementary imperative for all pedagogues to avoid excessive use of idioms of 
 foreign origin. In most cases, adequate and relevant synonyms exist in 
 Norwegian.
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Secure delete?

2010-04-10 Thread Joerg Schilling
Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:

 I guess that's the way I thought it was. Perhaps it would be nice to add such 
 a feature? If something gets stuck in a truckload of snapshots, say a 40GB 
 file in the root fs, it'd be nice to just rm --killemall largefile

Let us first assume the simple case where the file is not part
of any snapshot.

For a secure delete, a file needs to be overwritten in place, and this 
cannot be done on a FS that is always COW.

The secure deletion of the data would be something that happens before
the file is actually unlinked (e.g. by rm). This secure deletion would
need to open the file in a non-COW mode.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?

2010-04-10 Thread Daniel Carosone
On Sat, Apr 10, 2010 at 11:50:05AM -0500, Bob Friesenhahn wrote:
 Huge synchronous bulk writes are pretty rare since usually the 
 bottleneck is elsewhere, such as the ethernet.

Also, large writes can go straight to the pool, and the zil only logs
the intent to commit those blocks (i.e., link them into the zfs data
structure).  I don't recall what the threshold for this is, but I
think it's one of those Evil Tunables.

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?

2010-04-10 Thread Richard Jahnel
Any hints as to where you read that? I'm working on another system design with 
LSI controllers and being able to use SAS expanders would be a big help.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS RaidZ recommendation

2010-04-10 Thread Daniel Carosone
On Sat, Apr 10, 2010 at 12:56:04PM -0500, Tim Cook wrote:
 At that price, for the 5-in-3 at least, I'd go with supermicro.  For $20
 more, you get what appears to be a far more solid enclosure.

My intent with that link was only to show an example, not to make a
recommendation.  I'm glad others actually have the unit I picked as the
easiest search result; I don't.  There are plenty of other choices.

Note also, the Supermicro ones are not trayless.  The example was
specifically of a trayless model.  Supermicro may be good for
permanent drives, but the trayless option is convenient for backup
bays, where you have 2 or more sets of drives that rotate through the
bays in the backup cycle.

Getting extra trays can be irritating, and at least some trays make
handling and storing the drives outside their slots rather cumbersome
(odd corners and edges and stacking difficulties).

Having bays that take bare drives is also great for recovering data
from disks taken from other machines.

Bare drives are also most easily interchangeable between racks from
different makers - say if a better trayless model became available
between the purchase times of different machines.

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Daniel Carosone
On Sat, Apr 10, 2010 at 02:51:45PM -0500, Harry Putnam wrote:
 [Note: This discussion started in another thread
 
 Subject: about backup and mirrored pools
 
 but the subject has been significantly changed so started a new
 thread]
 
 Bob Friesenhahn bfrie...@simple.dallas.tx.us writes:
 
  Luckily, since you are using mirrors, you can easily migrate disks
  from your existing extra pools to the coalesced pool.  Just make sure
  to scrub first in order to have confidence that there won't be data
  loss.

You know, when I saw those words, I worried someone, somewhere would
interpret them incorrectly.

The migration he's referring to is of disks, not of contents.  The
contents you'd have to migrate first (say with send|recv), before
destroying the emptied pool and adding the disks to the pool you want
to expand, as a new vdev.  There's an implicit requirement here for
free space (or staging space elsewhere) to enable the move.

Note, also, rpool can't have multiple vdevs, so the best you could
combine currently is z2 and z3.

 I'm getting a little (read horribly) confused how to go about doing
 something like creating a single zpool of 3 sets of currently mirrored
 disks that for each pair, constitute a zpool themselves.

This would be something like a zpool merge operation, which does not
exist.

 In the case I'm describing I guess rpool will be the only pool when
 its completed.  Is that a sensible thing to do?

No, as above.  You might consider new disks for a new rpool (say, ssd
with some zil or l2arc) and reusing the current disks for data if
they're the same as the other data disks.

 Or would it make more sense to leave rpool alone and make a single
 zpool of the other two mirrored pairs?

Yep.

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?

2010-04-10 Thread Neil Perrin

On 04/10/10 09:28, Edward Ned Harvey wrote:


Neil or somebody?  Actual ZFS developers?  Taking feedback here?   ;-)

 

While I was putting my poor little server through cruel and unusual 
punishment as described in my post a moment ago, I noticed something 
unexpected:


 

I expected that while I'm stressing my log device by infinite sync 
writes, my primary storage devices would also be busy(ish).  Not 
really busy, but not totally idle either.  Since the primary storage 
is a stripe of spindle mirrors, obviously it can handle much more 
sustainable throughput than the individual log device, but the log 
device can respond with smaller latency.  What I noticed was this:


 

For several seconds, **only** the log device is busy.  Then it stops, 
and for maybe 0.5 secs **only** the primary storage disks are busy.  
Repeat, recycle.




These are the txgs getting pushed out.


 

I expected to see the log device busy nonstop.  And the spindle disks 
blinking lightly.  As long as the spindle disks are idle, why wait for 
a larger TXG to be built?  Why not flush out smaller TXG's as long as 
the disks are idle?


Sometimes it's more efficient to batch up requests: fewer blocks are 
written.  As you mentioned, you weren't stressing the system heavily.
ZFS will perform differently when under pressure; it will shorten the 
time between txgs if the data arrives more quickly.


  But worse yet ... During the 1-second (or 0.5 second) that the 
spindle disks are busy, why stop the log device?  (Presumably also 
stopping my application that's doing all the writing.)


Yes, this has been observed by many people. There are two sides to this 
problem related to the CPU and IO used while pushing a txg:

6806882 need a less brutal I/O scheduler
6881015 ZFS write activity prevents other threads from running in a timely manner


The CPU side (6881015) was fixed relatively recently in snv_129.

 

This means, if I'm doing zillions of **tiny** sync writes, I will get 
the best performance with the dedicated log device present.  But if 
I'm doing large sync writes, I would actually get better performance 
without the log device at all.  Or else ... add just as many log 
devices as I have primary storage devices.  Which seems kind of crazy.


Yes you're right, there are times when it's better to bypass the slog 
and use the pool disks which can deliver better bandwidth.


The algorithm for where and what the ZIL writes has got quite complex:

- There was another change recently to bypass the slog if 1MB had been 
sent to it and 2MB were waiting to be sent.
- There's a new property, logbias, which when set to throughput directs 
the ZIL to send all of its writes to the main pool devices, thus freeing 
the slog for more latency-sensitive work (ideal for database data files).
- If synchronous writes are large (> 32K) and block-aligned, then the 
blocks are written directly to the pool and a small record is
written to the log. Later, when the txg commits, the blocks are 
just linked into the txg. However, this processing is not
done if there are any slogs, because I found it didn't perform as well. 
It probably ought to be re-evaluated.
- There are further tweaks being suggested and which might make it to a 
ZIL near you soon.


Neil.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?

2010-04-10 Thread Neil Perrin

On 04/10/10 14:55, Daniel Carosone wrote:

On Sat, Apr 10, 2010 at 11:50:05AM -0500, Bob Friesenhahn wrote:
  
Huge synchronous bulk writes are pretty rare since usually the 
bottleneck is elsewhere, such as the ethernet.



Also, large writes can go straight to the pool, and the zil only logs
the intent to commit those blocks (ie, link them into the zfs data
structure).   I don't recall what the threshold for this is, but I
think it's one of those Evil Tunables.
  

This is zfs_immediate_write_sz, which is 32K.  However, this currently only
happens if you don't have any slogs. If logbias is set to throughput, then
all writes go straight to the pool regardless of zfs_immediate_write_sz.
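
For the curious: logbias is a per-dataset property, e.g. (the dataset name
here is just an example):

# zfs set logbias=throughput tank/db
# zfs get logbias tank/db

while zfs_immediate_write_sz is a kernel tunable, which could be overridden
in /etc/system, e.g. to raise the threshold to 128K (treat the exact tunable
name as something to verify against your build):

set zfs:zfs_immediate_write_sz = 0x20000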

Neil.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Bob Friesenhahn

On Sun, 11 Apr 2010, Daniel Carosone wrote:


The migration he's referring to is of disks, not of contents.  The
contents you'd have to migrate first (say with send|recv), before
destroying the emptied pool and adding the disks to the pool you want
to expand, as a new vdev.There's an implicit requirement here for
free space (or staging space elsewhere) to enable the move.


Since he is already using mirrors, he already has enough free space, 
since he can move one disk from each mirror to the main pool (which, 
unfortunately, can't be the boot 'rpool' pool), send the data, and 
then move the second disks from the pools which are to be removed.  The 
main risk here is that there is only single redundancy for a while.



No, as above.  You might consider new disks for a new rpool (say, ssd
with some zil or l2arc) and reusing the current disks for data if
they're the same as the other data disks.


It is nice to have a tiny disk for the root pool.  An alternative is 
to create a small partition on two of the boot disks for use as the 
root pool, and use the remainder ('hog') partition for the main data 
pool.  It is usually most convenient in the long run for the root pool 
to be physically separate from the data storage though.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Bob Friesenhahn

On Sat, 10 Apr 2010, Bob Friesenhahn wrote:


Since he is already using mirrors, he already has enough free space since he 
can move one disk from each mirror to the main pool (which unfortunately, 
can't be the boot 'rpool' pool), send the data, and then move the second 
disks from the pools which are to be removed.  The main risk here is that there 
is only single redundancy for a while.


I should mention that since this is a bit risky, it would be wise for 
Harry to post the procedure and commands he plans to use so that other 
eyes can verify that a correct result will be obtained.


There is more risk of configuring the pool incorrectly (e.g. failing 
to end up with mirror vdevs) than there is of losing data.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Daniel Carosone
On Sat, Apr 10, 2010 at 06:20:54PM -0500, Bob Friesenhahn wrote:
 Since he is already using mirrors, he already has enough free space  
 since he can move one disk from each mirror to the main pool (which  
 unfortunately, can't be the boot 'rpool' pool), send the data, and then 
 move the second disks from the pools which are be removed. 

Ah, right you are. D'oh.

 The main risk here is that there is only single redundancy for a while.

You mean single copy, no redundancy, but otherwise yes... Perhaps
that's why I hadn't noticed this scheme, but if so it was a subconscious
oversight.  I'd rather consider and eliminate it consciously.

For Harry's benefit, the recipe we're talking about here is roughly as
follows.  We will merge your pools z2 and z3 into z2; diskx and
disky are the current members of z3.
 
Break the z3 mirror

# zpool detach z3 diskx

Add a new vdev to z2

# zpool add -f z2 diskx

The -f may be necessary, since you're adding a vdev with a different
redundancy profile to the existing vdev. 

Replicate the z3 data into z2

# zfs snapshot -r z3@move
# zfs create z2/z3
# zfs send -R z3@move | zfs recv -d z2/z3

Free up the second z3 disk and attach as a mirror

# zpool destroy z3
# zpool attach z2 diskx disky

Again, commands are approximate to illustrate the steps; in particular
you might choose a different replication structure.
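
It's also worth a final check that z2 ended up as two mirror vdevs, and not
with a stray single-disk vdev, before trusting it with data:

# zpool status z2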

--
Dan.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Harry Putnam
Daniel Carosone d...@geek.com.au writes:

Thanks for the input.. very helpful.

[...]

 No, as above.  You might consider new disks for a new rpool (say, ssd
 with some zil or l2arc) and reusing the current disks for data if
 they're the same as the other data disks.

Would you mind expanding the abbrevs: ssd, zil, l2arc?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Bob Friesenhahn

On Sat, 10 Apr 2010, Harry Putnam wrote:


Would you mind expanding the abbrevs: ssd, zil, l2arc?


SSD   = Solid State Device
ZIL   = ZFS Intent Log (log of pending synchronous writes)
L2ARC = Level 2 Adaptive Replacement Cache
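
On a pool they show up as dedicated log and cache vdevs, added like this
(device names are just examples):

# zpool add tank log c5t0d0
# zpool add tank cache c5t1d0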

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration

2010-04-10 Thread Ian Collins

On 04/11/10 11:55 AM, Harry Putnam wrote:

Would you mind expanding the abbrevs: ssd, zil, l2arc?

   

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss