Re: [zfs-discuss] Oracle DB sequential dump questions

2008-10-02 Thread Adrian Saul
I would look at what size I/Os you are doing in each case.

I have been playing with a T5240 and got 400MB/s read and 200MB/s write speeds
with iozone throughput tests on a 6-disk mirror pool, so the box and ZFS can
certainly push data around - but that was using 128k blocks.

You mention the disks are doing bursts of 50-60MB/s, which suggests they have
bandwidth to spare and are not flat out trying to prefetch data.

I suspect you might be IOPS bound - if you are doing a serial read-then-write
workload and only writing small blocks to the tape, that can mean high service
times on the tape device, which in turn slows down your overall read speed.

If it is LTO-4, try to push your block size as high as you can go - 256k, 512k
or higher - and maybe use truss on the process to see what read/write sizes it
is doing.  I also found the iosnoop tool from Brendan Gregg's DTrace Toolkit
very helpful in tracking down these sorts of issues.
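
For example (a rough sketch - the PID, paths and block size below are just
placeholders):

truss -t read,write -p <pid>         # watch the read/write sizes the dump process issues
iosnoop                              # per-I/O sizes and times, from the DTrace Toolkit
dd if=/dump/oradump.dmp of=/dev/rmt/0n bs=512k   # push bigger blocks at the tape drive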

HTH.

Cheers,
  Adrian


[zfs-discuss] query: why does zfs boot in 10/08 not support flash archive jumpstart

2008-10-01 Thread Adrian Saul
With much excitement I have been reading about the new features coming into
Solaris 10 in 10/08 and am eager to start playing with ZFS root.  However, one
thing which struck me as strange and somewhat annoying is that, according to
the FAQs and documentation, it is not possible to do a ZFS root install using
JumpStart and flash archives.

I predominantly do my installs using flash archives, as it saves massive amounts
of time in the install process and gives me consistency between builds.

Really I am just curious why it isn't supported, what the intention is for
supporting it, and when?
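
For reference, the ZFS root JumpStart support that is documented for 10/08 is a
standard initial install driven by a profile along these lines (the device names
and BE name are just examples):

install_type  initial_install
pool          rpool auto auto auto mirror c0t0d0s0 c0t1d0s0
bootenv       installbe bename s10u6

whereas my flash profiles use install_type flash_install with an
archive_location, which the documentation says cannot currently target a ZFS
root.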

Cheers,
  Adrian


[zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Adrian Saul
Howdy,
 I have had issues several times with consumer-grade PC hardware and ZFS not
getting along.  The problem is not the disks but the fact that I don't have ECC
memory or end-to-end checking on the data path.  What happens is that random
memory errors and bit flips get written out to disk, and when the data is read
back ZFS reports a checksum failure:

  pool: myth
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        myth        ONLINE       0     0    48
          raidz1    ONLINE       0     0    48
            c7t1d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /myth/tv/1504_20080216203700.mpg
        /myth/tv/1509_20080217192700.mpg
 
Note there are no errors against the individual disks, just checksum errors at
the raidz level.  I get the same thing on a mirror pool, where both sides of the
mirror show identical errors.  All I can assume is that the data was corrupted
after the checksum was calculated and was flushed to disk like that.  In one
past case the cause was a popped motherboard capacitor - but that was enough to
generate these errors under load.

At any rate ZFS is doing the right thing by telling me - what I don't like is
that from that point on I can't convince ZFS to ignore it.  The data in question
is video files - a bit flip here or there won't matter.  But if ZFS reads the
affected block it returns an I/O error, and until I restore the file I have no
option but to try to make the application skip over it.  If it were UFS, for
example, I would never have known; but ZFS makes a point of stopping anything
that uses the data - understandably, but annoyingly as well.

What I would like to see is an option to ZFS in the style of the 'onerror'
mount option for UFS, i.e. the ability to tell ZFS to join fight club - let
what doesn't matter truly slide.  For example:

zfs set erroraction=[iofail|log|ignore] <dataset>

This would default to the current behaviour (iofail), but if you wanted to try
to recover or repair data you could set it to log - say, generate an FMA event
noting the bad checksum - or to ignore, and just get on with your day.
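
For instance (entirely hypothetical syntax, since no such property exists
today):

zfs set erroraction=log myth/tv      # note the bad checksum via FMA but return the data anyway
zfs set erroraction=iofail myth/tv   # back to today's behaviour once the repair is done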

As mentioned, I see this as mostly an option to help repair data after the
issue is identified or fixed.  Of course it is data-specific, but if the
application can allow it or handle it, why should ZFS get in the way?
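
The closest workaround I know of today is to salvage what is still readable
around the bad blocks, e.g. something like (the output path is just an example):

dd if=/myth/tv/1504_20080216203700.mpg of=/myth/tv/1504_salvaged.mpg bs=128k conv=noerror,sync

conv=noerror keeps dd going past the blocks that return EIO, and sync pads them
with nulls, which an MPEG player will usually just glitch over.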

Just a thought.

Cheers,
  Adrian

PS: And yes, I am now buying some ECC memory.
 
 


[zfs-discuss] Re: suggestion: directory promotion to filesystem

2007-02-22 Thread Adrian Saul
Thanks for the replies - I imagined it would have been discussed, but I must
have been searching for the wrong terms :)

Any idea on the timeline or future of 'zfs split'?

Cheers,
 Adrian
 
 


[zfs-discuss] suggestion: directory promotion to filesystem

2007-02-21 Thread Adrian Saul
Not sure how technically feasible it is, but this is something I thought of
while shuffling some files around my home server.  My (admittedly poor)
understanding of ZFS internals is that the entire pool is effectively a tree
structure, with nodes being either data or metadata.  Given that, couldn't ZFS
just change a directory node into a filesystem node with little effort,
allowing me to do everything ZFS does with filesystems on a subset of my
filesystem? :)

Say you have some filesystems you created early on, before you had a good idea
of how they would be used.  For example, I made one large share filesystem and
started filling it up with photos, movies and assorted downloads.  A few months
later I realised it would be much nicer to be able to snapshot my movies and
photos separately for backups, instead of snapshotting the whole share.

It is not hard to work around - a zfs create and a mv/tar command and it is
done... some time later.  But if there were, say, a 'zfs graft <directory>
<new-filesystem>' command, you could just break the directory off as a new
filesystem and away you go - no copying, no risk of cleaning up the wrong
files, and so on.

Corollary: a 'zfs merge' that takes a filesystem and merges it into an existing
filesystem.
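
For example (hypothetical commands, and the dataset names are just
illustrative), instead of today's:

zfs create data/movies
mv /data/share/movies/* /data/movies/    # the slow part - every block gets copied

you could simply run:

zfs graft data/share/movies data/movies   # detach the directory as a new filesystem in place
zfs merge data/movies data/share          # or fold a filesystem back into another one later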

Just a thought - any comments welcome.
 
 


[zfs-discuss] Re: pools are zfs file systems?

2006-07-15 Thread Adrian Saul
When playing with ZFS to try to come up with some standards for using it in our
environment, I also disliked having the pool's top-level dataset mounted when
my intention was not to use it directly but to subdivide the space within it.

Simple fix:

zpool create data blah
zfs create data/share
zfs create data/oracle
zfs set mountpoint=/export/share data/share
zfs set mountpoint=/oracle data/oracle
zfs set mountpoint=none data

It is semi-clean - you have to remember to set mountpoints if you create any
more children under data, and zones tended not to like importing datasets that
had their mountpoint set to none.
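
On releases that have the canmount property, something similar can be done
without setting mountpoint=none at the top (a sketch - I have not tried this
with zones):

zfs set canmount=off data

The top-level dataset then never mounts, while the children can still inherit
or set their own mountpoints as usual.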

My 2c
 
 


[zfs-discuss] S10U2: zfs instant hang with dovecot imap server using mmap

2006-07-14 Thread Adrian Saul
Hi,
  I just upgraded my home box to Solaris 10 6/06 and converted my previous
filesystems over to ZFS, including /var/mail.  Previously, on S10 FCS, I was
running the dovecot mail server from blastwave.org without issue.  Since
upgrading to Update 2 I have found that the mail server hangs frequently.  The
imap process cannot be killed, dtraced, pstacked or trussed.  After a few goes
at dtrace I took a core dump and had a look at that.  The stack for the imap
process was simply:

> 0t1550::pid2proc|::walk thread|::findstack -v
stack pointer for thread d77f8600: d41e8e2c
  d41e8e78 0xd41e8e44(2, d41e8f44)
  d41e8ed8 zfs_write+0x59f(d6a273c0, d41e8f44, 0, d8adee10, 0)
  d41e8f0c fop_write+0x2d(d6a273c0, d41e8f44, 0, d8adee10, 0)
  d41e8f8c write+0x29a()
  d41e8fb4 sys_sysenter+0xdc()

Some digging in SunSolve turned up a few references to mmap and zfs_write
locks - so on a hunch (fuser had previously shown my mail file as mmapped) I
disabled mmap in the dovecot configuration, and I no longer get the deadlocks.
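
For anyone else hitting this, the change was just the one setting in
dovecot.conf (quoting from memory, so treat it as a sketch):

mmap_disable = yes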

I could not find an exact match for this bug in SunSolve - is it a known issue,
or is more work needed?  I can provide the core file or SSH access to the box
if more analysis would help.

Cheers,
  Adrian
 
 


[zfs-discuss] Re: S10U2: zfs instant hang with dovecot imap server using mmap

2006-07-14 Thread Adrian Saul
Oh - and by instant hang I mean I can reproduce it simply by rebooting,
enabling the dovecot service and then connecting to dovecot with Thunderbird.
 
 


[zfs-discuss] Re: S10U2: zfs instant hang with dovecot imap server using mmap

2006-07-14 Thread Adrian Saul
There are also a number of procmail processes with thread stacks like the
following:

> 0t1611::pid2proc|::walk thread|::findstack -v
stack pointer for thread d8ae1a00: d3bcbbac
[ d3bcbbac 0xfe826b37() ]
  d3bcbbc4 swtch+0x13e()
  d3bcbbe8 cv_wait_sig+0x119(da58fb4c, d46a8680)
  d3bcbc00 wait_for_lock+0x30(da58fac8)
  d3bcbc20 flk_wait_execute_request+0x156(da58fac8)
  d3bcbc64 flk_process_request+0x4c7(da58fac8)
  d3bcbd38 reclock+0x3a9(d6a273c0, d3bcbe64, 6, 10a, 202e92c, 0)
  d3bcbd8c fs_frlock+0x252(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbdc0 zfs_frlock+0x73(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbdf8 fop_frlock+0x2c(d6a273c0, 7, d3bcbe64, 10a, 202e92c, 0)
  d3bcbf8c fcntl+0x95d()
  d3bcbfb4 sys_sysenter+0xdc()
 
 