[zfs-discuss] ZFS write pauses and how TXG groups work

2008-08-08 Thread Tre Stylez
I currently have a Solaris u5 machine with 2 GB of memory and 6 disks in a RAIDZ
config. When I write across the network to an SMB/NFS share, I notice a pause in
the writes every 5 seconds.

From what I've read, every 5 seconds a TXG group goes into the quiesced state
and then gets synced to disk. What I would like to know is: can/should another
TXG group be opening at the same time and still accept writes while the sync is
taking place in the other TXG, and should this cause a write pause for my
SMB/NFS application?

It appears I have 1 GB of ARC cache, so I believe 500 MB of that can be used for
writes. I've done a few write tests across the network to SMB/NFS shares at
various speeds, from 40 MB/sec to 90 MB/sec, and I still get the pauses. If I
was doing 40 MB/sec when I hit the 5-second sync time, I would have about 200 MB
in cache, and I thought ZFS could happily open another TXG without causing a
pause/throttle to the application.

The RAIDZ pool I have can do around 140 MB/sec write speed (from a re-write test
when the cache is full), and iostat of the pool shows that every 5 seconds it has
no trouble getting the TXG group out to disk, which appears to take about 1-2
seconds.
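
If you want to see how long each sync phase actually takes while the writes are
pausing, one rough way is to time spa_sync() with DTrace. This is a hedged
observation sketch, assuming the fbt provider can probe spa_sync on your build;
it only measures, it doesn't change anything:

# print the duration of each TXG sync (spa_sync call) in milliseconds
dtrace -n '
fbt::spa_sync:entry { self->ts = timestamp; }
fbt::spa_sync:return /self->ts/
{
        printf("txg sync took %d ms\n", (timestamp - self->ts) / 1000000);
        self->ts = 0;
}'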

So I guess I'm asking: is this expected behaviour, and am I missing something?

Here is a zpool iostat snip taken while writing to the pool across the network at
around 35-40 MB/sec. I was still getting 1-second pauses every 5 seconds.

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
silvia      1.76T   987G      0    971      0   120M
silvia      1.76T   986G      0    276      0  20.1M
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0    984      0   122M
silvia      1.76T   986G      0    262      0  18.3M
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0      0      0      0
silvia      1.76T   986G      0    972      0   120M
silvia      1.76T   986G      0    308      0  19.6M

Thanks for your time :).
 
 


Re: [zfs-discuss] Poor ZFS performance when file system is close to full

2008-08-08 Thread Lance
> We had a situation where write speeds to a ZFS pool
> consisting of two 7 TB RAID5 LUNs came to a crawl.

Sounds like you've hit Bug# 6596237, "Stop looking and start ganging".  We ran
into the same problem on our X4500 Thumpers: write throughput dropped to 200
KB/s.  We now keep utilization under 90% to help combat the problem until a
patch is available.  It seems to be worse if you have millions of files within
the zpool.  We had over 25M files in a 16 TB zpool, and at 95% full it was
virtually unusable for writing files.  This was on a host running vanilla
S10U4 x86.
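
For anyone who wants an early warning before a pool gets into that state, a
small cron-driven check is enough. This is a minimal sketch, assuming the stock
zpool list output where the CAP column is the fifth field; the pool name, the
90% limit and the mail alert are all placeholders:

#!/bin/sh
# warn when a pool's capacity crosses the threshold (placeholder values)
POOL=tank
LIMIT=90

CAP=`zpool list "$POOL" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'`

if [ "$CAP" -ge "$LIMIT" ]; then
        echo "pool $POOL is ${CAP}% full; ZFS write performance may degrade" \
                | mailx -s "zpool capacity warning: $POOL" root
fi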
 
 


Re: [zfs-discuss] Booting from a USB HD

2008-08-08 Thread W. Wayne Liauh
> Your problem is almost certainly that your boot device order differs,
> probably due to the BIOS differences you mention.  Go to the grub command
> line, and do a "find /platform/i86pc/multiboot".  Pay attention to the
> hd(n,m) it prints (I hope!) and edit your boot entry to match.  Once
> you're up in multi-user, add a new boot entry to /boot/grub/menu.lst for
> your alternate device numbering.
> 

I did a "find /platform/i86pc/multiboot" at the grub> prompt, and it returned
(hd0,0,a), as I had expected.

But, nevertheless, I added (hd0,0,a) to the two grub boot lines (both the
kernel$ and module$ lines).  Same results: grub still won't boot from the USB
flash stick.  I suspect the USB driver may be missing during stage 2.
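
For reference, here is a rough sketch of what such an extra menu.lst entry might
look like; the title and the (hd0,0,a) tuple are illustrative, and the kernel$
and module$ lines should be copied unchanged from the entry that already works:

title Solaris (USB stick, alternate device numbering)
root (hd0,0,a)
# copy the kernel$ and module$ lines unchanged from your existing entry:
kernel$ /platform/i86pc/...
module$ /platform/i86pc/...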


> FYI, this is a generic x86 grub problem - Linux behaves the same way.
> 

Yes, I remember having a similar "won't boot from USB disk" problem with SuSE 
10.3.
 
 


Re: [zfs-discuss] RFE 4852783

2008-08-08 Thread Miles Nordin
> "t" == Tim  <[EMAIL PROTECTED]> writes:

 t> Why would you have to buy smaller disks?  You can replace the
 t> 320's with 1tb drives and after the last 320 is out of the
 t> raidgroup, it will grow automatically.

This does work for me to grow a mirrored vdev on Nevada b71.  The way I found
to view the size of an individual vdev was through 'zpool iostat -v':

before:
-8<-
terabithia:/# zpool iostat -v andaman 1
               capacity     operations    bandwidth
pool          used  avail   read  write   read  write
---  -  -  -  -  -  -
andaman  2.23T   928G  0  0  0  0
  mirror  926G  2.07G  0  0  0  0
c3t11d0  -  -  0  0  0  0
c3t9d0   -  -  0  0  0  0
  mirror  681G   247G  0  0  0  0
c3t14d0  -  -  0  0  0  0
c3t8d0   -  -  0  0  0  0
  mirror  231G   601M  0  0  0  0
c3t28d0  -  -  0  0  0  0
c3t26d0  -  -  0  0  0  0
  mirror  231G   540M  0  0  0  0
c3t29d0  -  -  0  0  0  0
c3t15d0  -  -  0  0  0  0
  mirror  109G   589G  0  0  0  0
c3t18d0  -  -  0  0  0  0
c3t13d0  -  -  0  0  0  0
  mirror  100G  89.0G  0  0  0  0
c3t25d0  -  -  0  0  0  0
c3t17d0  -  -  0  0  0  0
---  -  -  -  -  -  -
-8<-

terabithia:/# zpool replace andaman c3t25d0 c3t30d0

after resilver:
-8<-
terabithia:/# zpool iostat -v andaman 1
               capacity     operations    bandwidth
pool          used  avail   read  write   read  write
---  -  -  -  -  -  -
andaman  2.23T   928G  0  0  0  0
  mirror  926G  2.07G  0  0  0  0
c3t11d0  -  -  0  0  0  0
c3t9d0   -  -  0  0  0  0
  mirror  681G   247G  0  0  0  0
c3t14d0  -  -  0  0  0  0
c3t8d0   -  -  0  0  0  0
  mirror  231G   601M  0  0  0  0
c3t28d0  -  -  0  0  0  0
c3t26d0  -  -  0  0  0  0
  mirror  231G   539M  0  0  0  0
c3t29d0  -  -  0  0  0  0
c3t15d0  -  -  0  0  0  0
  mirror  109G   589G  0  0  0  0
c3t18d0  -  -  0  0  0  0
c3t13d0  -  -  0  0  0  0
  mirror  100G  89.0G  0  0  0  0
c3t30d0  -  -  0  0  0  0
c3t17d0  -  -  0  0  0  0
---  -  -  -  -  -  -
-8<-

terabithia:/# zpool export andaman
terabithia:/# zpool import andaman

after export/import:
-8<-
terabithia:/# zpool iostat -v andaman 1
               capacity     operations    bandwidth
pool          used  avail   read  write   read  write
---  -  -  -  -  -  -
andaman  2.23T   971G  0  0  0  0
  mirror  926G  2.07G  0  0  0  0
c3t11d0  -  -  0  0  0  0
c3t9d0   -  -  0  0  0  0
  mirror  681G   247G  0  0  0  0
c3t14d0  -  -  0  0  0  0
c3t8d0   -  -  0  0  0  0
  mirror  231G   601M  0  0  0  0
c3t28d0  -  -  0  0  0  0
c3t26d0  -  -  0  0  0  0
  mirror  231G   539M  0  0  0  0
c3t29d0  -  -  0  0  0  0
c3t15d0  -  -  0  0  0  0
  mirror  109G   589G  0  0  0  0
c3t18d0  -  -  0  0  0  0
c3t13d0  -  -  0  0  0  0
  mirror  100G   132G  0  0  0  0
c3t30d0  -  -  0  0  0  0
c3t17d0  -  -  0  0  0  0
---  -  -  -  -  -  -
-8<-
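
Condensed, the grow-by-replacement sequence above comes down to the commands
below; the pool and device names are the poster's, and the final zpool list is
just an easier way to confirm the extra space than reading the iostat output:

zpool replace andaman c3t25d0 c3t30d0   # swap in the larger disk
zpool status andaman                    # wait here until the resilver completes
zpool export andaman
zpool import andaman                    # the vdev's new size is picked up here
zpool list andaman                      # confirm the pool grew (avail went from 928G to 971G above)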







Re: [zfs-discuss] resilver in progress - which disk is inconsistent?

2008-08-08 Thread Marion Hakanson
[EMAIL PROTECTED] said:
> AFAIK there is no way to tell resilvering to pause, so I want to detach the
> inconsistent disk and attach it again tonight, when it won't affect users. To
> do that I need to know which disk is inconsistent, but zpool status does not
> show me any info in that regard.
>  
> Is there any way to identify which disk is inconsistent? 

I know this is too late to help you now, but...  Doesn't "zpool status -v"
do what you want?

Regards,

Marion




Re: [zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )

2008-08-08 Thread Dave
Tim Foster wrote:
> On Wed, 2008-08-06 at 15:58 -0700, Rob wrote:
>>> The other changes that will appear in 0.11 (which is
>>> nearly done) are:
>> Still looking forward to seeing .11 :)
> 
> Wow, there's one user out there at least!  Thanks!
> 

Keep up the good work, Tim. There are more users of your work out there 
than you might think  :)

--
Dave



Re: [zfs-discuss] ZFS on 32bit.

2008-08-08 Thread eric kustarz
I've filed this one specifically for ZFS:
6735425 some places where 64bit values are being incorrectly accessed  
on 32bit processors

eric

On Aug 6, 2008, at 1:59 PM, Brian D. Horn wrote:

> In the most recent code base (both OpenSolaris/Nevada and S10Ux with  
> patches)
> all the known marvell88sx problems have long ago been dealt with.
>
> However, I've said this before.  Solaris on 32-bit platforms has  
> problems and
> is not to be trusted.  There are far, far too many places in the  
> source
> code where a 64-bit object is either loaded or stored without any  
> atomic
> locking occurring which could result in any number of wrong and bad  
> behaviors.
> ZFS has some problems of this sort, but so does some of the low  
> level 32-bit
> x86 code.  The problem was reported long ago, but to the best of my  
> knowledge
> the issues have not been addressed.  Looking below it appears that  
> nothing
> has been done for about 9 months.
>
> Here is the top of the bug report:
>
> Bug ID 6634371
> Synopsis  Solaris ON is broken w.r.t. 64-bit operations on 32-bit  
> processors
> State 1-Dispatched (Default State)
> Category:Subcategory  kernel:other
> Keywords  32-bit | 64-bit | atomic
> Reported Against  
> Duplicate Of  
> Introduced In 
> Commit to Fix 
> Fixed In  
> Release Fixed 
> Related Bugs  
> Submit Date   27-NOV-2007
> Last Update Date  28-NOV-2007
>
>



Re: [zfs-discuss] Kernel panic at zpool import

2008-08-08 Thread Richard Elling
There is a chance that a bug fix or change has been made which will help
you to recover from this.  I suggest getting the latest SXCE DVD, booting
single user, and attempting an import.

Note: you may see a message indicating that you can upgrade the
pool.  Do not upgrade the pool if you intend to continue running
Solaris 10 in the near future.
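
A hedged sketch of that recovery attempt, using Borys's pool names: from the
DVD-booted single-user shell there is no zpool.cache from the original host, so
the pools have to be found and imported by name, and -f is only needed if a
pool is reported as being in use by another system:

zpool import                 # should list "public" and "private" as importable
zpool import -f private      # then try the same for "public" if that succeeds
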
 -- richard

Borys Saulyak wrote:
> Hi,
>
> I have a problem with Solaris 10. I know that this forum is for OpenSolaris,
> but maybe someone will have an idea.
> My box is crashing on any attempt to import a zfs pool. The first crash
> happened on an export operation, and since then I cannot import the pool
> anymore due to kernel panics. Is there any way of getting it imported or
> fixed? Removal of zpool.cache did not help.
>
> Here are details:
> SunOS omases11 5.10 Generic_137112-02 i86pc i386 i86pc
>
> [EMAIL PROTECTED]:~[8]#zpool import 
> pool: public 
> id: 10521132528798740070 
> state: ONLINE 
> action: The pool can be imported using its name or numeric identifier. 
> config: 
>
> public ONLINE 
> c7t60060160CBA21000A5D22553CA91DC11d0 ONLINE 
>
> pool: private 
> id: 3180576189687249855 
> state: ONLINE 
> action: The pool can be imported using its name or numeric identifier. 
> config: 
>
> private ONLINE 
> c7t60060160CBA21000A6D22553CA91DC11d0 ONLINE 
>
>
> [EMAIL PROTECTED]:~[8]#zpool import private 
>
> panic[cpu3]/thread=fe8001223c80: ZFS: bad checksum (read on  off 
> 0: zio a26b7680 
> [L0 packed nvlist] 4000L/600P DVA[0]=<0:10c000f400:600> 
> DVA[1]=<0:b40014e00:600> fletcher4 lzjb 
> LE contiguous birth=3640409 fill=1 
> cksum=6c8098535e:6150d1eeb30a:2f1f7efda48588:105955d437bb76e5): error 50 
>
> fe8001223ac0 zfs:zfsctl_ops_root+2ff1624c () 
> fe8001223ad0 zfs:zio_next_stage+65 () 
> fe8001223b00 zfs:zio_wait_for_children+49 () 
> fe8001223b10 zfs:zio_wait_children_done+15 () 
> fe8001223b20 zfs:zio_next_stage+65 () 
> fe8001223b60 zfs:zio_vdev_io_assess+84 () 
> fe8001223b70 zfs:zio_next_stage+65 () 
> fe8001223bd0 zfs:vdev_mirror_io_done+c1 () 
> fe8001223be0 zfs:zio_vdev_io_done+14 () 
> fe8001223c60 genunix:taskq_thread+bc () 
> fe8001223c70 unix:thread_start+8 () 
>
> syncing file systems... [2] 212 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 
> 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 
> [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 [2] 210 done (not all i/o completed) 
> dumping to /dev/dsk/c3t2d0s1, offset 65536, content: kernel
>  
>  
>   



[zfs-discuss] [install-discuss] lucreate into New ZFS pool

2008-08-08 Thread Johan Hartzenberg
Hello,

Since I've got my disk partitioning sorted out now, I want to move my BE
from the old disk to the new disk.

I created a new zpool, named RPOOL, to distinguish it from the existing
"rpool".
I then did lucreate -p RPOOL -n new95

This completed without error, the log is at the bottom of this mail.

I have not yet dared to run luactivate. I also have not yet dared to set the
ACTIVE flag on any partition on the new disk (I had some interesting times
with that previously).  Before I complete these steps to set the active
partition and run luactivate, I have a few questions:

1. I somehow doubt that the lucreate process installed a boot block on the
new disk...  How can I confirm this?  Or is luactivate supposed to do this?
(See the sketch after this list.)
2. There are a number of open issues still with ZFS root.  I saw some notes
about leaving the first cylinder of the disk out of the root pool slice.  What
is that all about?
3. I have a remnant of the lucreate process in my mounts ... (which prevents,
for example, lumount, and previously caused problems with luactivate).
4. I see the vdev for dump got created in the new pool, but not for swap?  Is
this to be expected?  (Also covered in the sketch after this list.)
5. There were notes about errors which were recorded in /tmp/lucopy.errors
... I've rebooted my machine since, so I can't review those any more.  I guess
I need to run the lucreate again to see if it happens again, and to be able to
read those logs before they get lost again.
6. Since SHARED is an entirely independent pool, and since the purpose of
this lucreate is to move root from one disk to another, I don't see why
lucreate needed to make snapshots of the zone!
7. Despite the messages that the grub menu has been distributed and populated
successfully, the new boot environment has not been added to the grub menu
list.  My experience, though, is that this happens during luactivate, so I'm
not concerned about this just yet.
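
Regarding questions 1 and 4, below is a hedged sketch of how both could be done
by hand; whether luactivate takes care of the boot block itself is exactly the
open question, installgrub(1M) is simply the manual way to do it, and the swap
size is a placeholder:

# question 1: put the GRUB stage1/stage2 boot blocks on the new disk's
# root-pool slice (the slice RPOOL was created on, shown later in this mail)
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0d0s0

# question 4: if lucreate only carried over the dump zvol, a swap volume
# can be added afterwards by hand
zfs create -V 2G RPOOL/swap
swap -a /dev/zvol/dsk/RPOOL/swap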


Below are some bits showing the current status of the system:

$ zfs list -r RPOOL
NAME               USED  AVAIL  REFER  MOUNTPOINT
RPOOL             7.97G  24.0G  26.5K  /RPOOL
RPOOL/ROOT        6.47G  24.0G    18K  /RPOOL/ROOT
RPOOL/ROOT/new95  6.47G  24.0G  6.47G  /.alt.new95
RPOOL/dump        1.50G  25.5G    16K  -
/RPOOL/boot/grub $
/RPOOL/boot/grub $
/RPOOL/boot/grub $ lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_94                     yes      no     no        yes    -
snv_95                     yes      yes    yes       no     -
new95                      yes      no     no        yes    -
/RPOOL/boot/grub $ luumount new95
ERROR: boot environment  is not mounted


$ zfs list -r RPOOL
NAME               USED  AVAIL  REFER  MOUNTPOINT
RPOOL             7.97G  24.0G  26.5K  /RPOOL
RPOOL/ROOT        6.47G  24.0G    18K  /RPOOL/ROOT
RPOOL/ROOT/new95  6.47G  24.0G  6.47G  /.alt.new95
RPOOL/dump        1.50G  25.5G    16K  -


$ lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_94                     yes      no     no        yes    -
snv_95                     yes      yes    yes       no     -
new95                      yes      no     no        yes    -

Thank you,
  _Johan


For what it is worth, below is the log of the lucreate session.
/dev/dsk $ zpool create -f RPOOL c0d0s0
/dev/dsk $ timex lucreate -p RPOOL -n new95
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
Comparing source boot environment  file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Creating file systems on boot environment .
Creating  file system for  in zone  on .
Populating file systems on boot environment .
Checking selection integrity.
Integrity check OK.
Populating contents of mount point .
Copying.
WARNING: The file  contains a list of <2>
potential problems (issues) that were encountered while populating boot
environment .
INFORMATION: You must review the issues listed in
 and determine if any must be resolved. In
general, you can ignore warnings about files that were skipped because
they did not exist or could not be opened. You cannot ignore errors such
as directories or files that could not be created, or file systems running
out of disk space. You must manually resolve any such problems before you
activate boot environment .
Creating shared file system mount points.
Creating snapshot for  on .
Creating clone for  on .
Creating compare databases for boot environment .
Creating compare datab

Re: [zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )

2008-08-08 Thread Tim Foster
On Wed, 2008-08-06 at 15:58 -0700, Rob wrote:
> > The other changes that will appear in 0.11 (which is
> > nearly done) are:
> 
> Still looking forward to seeing .11 :)

Wow, there's one user out there at least!  Thanks!

> Think we can expect a release soon? (or at least svn access so 
> that others can check out the trunk?)

Nearly. Here's what's going on:

I'm working closely with Niall Power of the Desktop team at Sun who is
(along with Erwann Chenede) tackling the requirement to have ZFS
snapshots managed out of the box for OpenSolaris 2008.11:
http://opensolaris.org/os/project/indiana/resources/problem_statement/#DSK-5

To this end, they've decided to use the existing zfs-auto-snapshot code, and
put a proper GUI on top of it, rather than start from scratch.
(I suck at writing GUIs, so this is way cool!)

So far, we've come across a few parts of the core service that make it a
little bit harder to have it "just work" from a GUI perspective, so I've been
working to add fixes for those into the existing core codebase.  These are:

  * checking for missed snapshots on service start

  * a new "zfs/fs-name" keyword, "##".  The existing keyword "//" is
    inclusive: it only snapshots filesystems marked with a given
    property.  The new keyword "##" adds exclusive support: it snapshots
    every filesystem except those set with a given property.  (See the
    sketch after this list.)

  * RBAC stuff - still haven't written this yet, but I think running the
service under a role with the "ZFS File System Management" profile
will be enough.

  * Collecting the default instances into a single group, giving us an
easier way to enable/disable all of the current default service
instances from the GUI. (we're still working out how best to do
this, and whether that big on/off switch should be elsewhere or not)
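
To make the "//" versus "##" distinction concrete, here is a hedged sketch of
how the two selection modes might be driven from the command line; it assumes
the service keeps using a user property along the lines of the existing
com.sun:auto-snapshot one, and the dataset names are placeholders:

# inclusive ("//"): only filesystems explicitly tagged true are snapshotted
zfs set com.sun:auto-snapshot=true tank/home

# exclusive ("##"): every filesystem is snapshotted except those tagged false
zfs set com.sun:auto-snapshot=false tank/scratch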


The new code will continue to be backwards compatible with previous
releases: manifests from 0.1 upwards will still work just fine with
0.11.  The plan is to split the code into two packages, one being the
core SMF service + canned instances, the other being the GUI code.


For now, we've got a new Mercurial repository at:

hg clone ssh://[EMAIL PROTECTED]/hg/jds/zfs-snapshot

It's just got most versions up to 0.10 in there at present, I'll commit
the 0.11 changes as soon as we're happy with them, hopefully in the next
week or two.

The code is under the JDS project for now; ultimately I'd like to get the core
service into ON at some point, but that'll need more free time than I've got
right now :-)

cheers,
tim




[zfs-discuss] Jive Forum--Stupid? (Was: Booting from a USB HD)

2008-08-08 Thread W. Wayne Liauh
...
> > However, I was unable to use this flash stick to boot an Athlon X2
> > machine.  Its MBR was read--and the GRUB options were shown on the
> > screen.  But when I selected an option (e.g., the rc3), the machine
> > would go into the restart mode, and the GRUB screen would be shown
> > again.  This process can be repeated again and again.  It seems that
> > the bootloader was not able to read the kernel from the flash stick.
> >
> > Also I noticed that with the Athlon X2 machine (which is about two
> > years old), the os08.05-installed flash stick was NOT treated as a
> > removable disc.  Instead, I had to move its boot priority up in the
> > __hard drive__ category in order for it to be recognized by the POST
> > process.  This contrasts with the C2D notebook, which recognizes the
> > flash stick as a removable medium.
> 
> First off, please get a real mail client that doesn't send huge lines.
> Thankfully Thunderbird has re-wrap that handles quotations.
> 
> Your problem is almost certainly that your boot device order differs,
> probably due to the BIOS differences you mention.  Go to the grub command
> line, and do a "find /platform/i86pc/multiboot".  Pay attention to the
> hd(n,m) it prints (I hope!) and edit your boot entry to match.  Once
> you're up in multi-user, add a new boot entry to /boot/grub/menu.lst for
> your alternate device numbering.
> 
> FYI, this is a generic x86 grub problem - Linux behaves the same way.
> 
> -- 
> Carson


Thanks for the reply.

For some reason, it took __several__ days for your reply to appear in the Jive
forum--and, as a result, I almost missed it.  It's already too late now (past
10 pm Hawaii time), so I will try your suggestion this weekend.

Regarding the "HUGE" lines, I have received similar complaints in the past but 
I really don't have any idea what's going on.  If I use Thunderbird, it would 
create a new thread every time I post a reply.  And when viewed from the Jive 
web forum, it would look like I was forking and graffitiing the entire 
place--making me look extremely stupid.

If, OTOH, I use the Jive reply button, as you and many others have pointed out, 
my reply may continue on a huge, never-ending long line.  I run os 08.05 and I 
use FF2&3.  Nothing exorbitant.  Everything looks perfectly normal to me.   The 
Jive forum has a reply button, and I really don't see why it should cause any 
problem.  (B/c of the large volume of mails, I prefer exclusively using the 
Jive forum to communicate.)

On this aspect (and probably only on this aspect), the Jive forum is the most
stupid forum on this planet, AFAIC.  I understand there are legitimate reasons
why it is so bad, but to those who are not aware of what's going on (perhaps
more than 99% of the forum participants), the Jive forum looks inexcusably
stupid.

Thanks again for your kind help, and please be assured that my rants are not
aimed at you.
 
 


Re: [zfs-discuss] zfs-auto-snapshot: Use at ? SMF prop caching?

2008-08-08 Thread Tim Foster
Hey Nils,

On Sun, 2008-08-03 at 05:57 -0700, Nils Goroll wrote:
> This does sound like a valid alternative solution for this requirement if you
> want to avoid using "at", though this will involve additional complexity for
> parsing timestamps of existing snapshots and calculating intervals, which
> I think is not that trivial in shells (consider timezone changes, leap years
> etc).

Right - the code I have is good enough: you're absolutely correct that
it drifts across unequal months and leap years, but I'm happy with it.
The code I have right now does basically this: get the last snapshot
taken for this schedule, then:


# all calculations done according to time since the epoch
LAST_SNAP_TIME=$(zfs get -H -p -o value creation $LAST_SNAPSHOT)
LAST_SNAP_TIME_HUMAN=$(zfs get -H -o value creation $LAST_SNAPSHOT)
NOW=$(perl -e 'print time;')

MINUTE_S=60
HOUR_S=$(( $MINUTE_S * 60 ))
DAY_S=$(( $HOUR_S * 24 ))
MONTH_S=$(( $DAY_S * 30 ))

case $INTERVAL in
"minutes")
        MULTIPLIER=$MINUTE_S
        ;;
"hours")
        MULTIPLIER=$HOUR_S
        ;;
"days")
        MULTIPLIER=$DAY_S
        ;;
"months")
        # uses the 30-day approximation above
        MULTIPLIER=$MONTH_S
        ;;
"none")
        return 0
        ;;
*)
        print_log "WARNING - unknown interval encountered in check_missed_snapshots!"
        return 1
        ;;
esac

PERIOD_S=$(( $MULTIPLIER * $PERIOD ))
AGO=$(( $NOW - $LAST_SNAP_TIME ))

if [ $AGO -gt $PERIOD_S ] ; then
        print_log "Last snapshot for $FMRI taken on $LAST_SNAP_TIME_HUMAN,"
        print_log "which was longer ago than the $PERIOD $INTERVAL schedule allows. Taking snapshot now."
        take_snapshot $FMRI
fi


Suggestions welcome?

> # Adding a cron job that runs exactly every x time-intervals is hard to do
> # properly.

Absolutely.

> > Hard to please everyone!  If you felt like it, it'd  be great to get the 
> > "offset" property working - that'd make the use of cron a lot more 
> > flexible for admins I think.
> 
> OK, I'll let you know when (if) I start working on it so we don't do double 
> work.

Thanks!

> > Would the conditional-snapshot-on-start-method solution work for you?
> 
> I think so, on the other hand I don't see why exactly you want to avoid 
> supporting
> "at" as well.

I'd like to avoid adding at(1) support because while I think it's a
pretty neat hack, I think it also duplicates functionality, causes more
maintenance and doesn't have a clean interface with the rest of the
service (imho).

If cron not being expressive enough is the problem, yet cron is an
already accepted way of running periodic system services, wouldn't it
make more sense to spend time getting cron up to spec on Solaris than
having to always hack around it?

> > I've attached some sample code - see what you think.
> 
> This is basically a simpler version of the same idea - put svcprops in
> variables. There are a couple of obstacles here:
> 
> - If you create variables with the names of svc properties, you run into the
>   issue that shell variables can't contain all characters valid for svc
>   properties, which you then need to work around (you are using sed to filter
>   out some characters, e.g. by mapping - to _, but this can map more than one
>   svcprop onto the same cache entry, which might work for zfs-auto-snapshot,
>   but is not a general solution).

I'm not sure it needs to be a general solution; it just needs to work for this
service.  I'm just filtering out the key values though, and I think this
should be safe.
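
For anyone following along, here is a rough sketch of the caching idea under
discussion (not the actual 0.11 code); it assumes ksh, where the last stage of
a pipeline runs in the current shell so the cached variables survive the loop,
and it shares the whitespace/quoting caveats Nils raises below:

# read all properties of the instance once and cache them in shell variables
FMRI=svc:/system/filesystem/zfs/auto-snapshot:frequent    # example instance

svcprop $FMRI | while read prop type value ; do
        # map characters that are not legal in variable names to "_"
        name=$(echo $prop | sed -e 's,[/:.-],_,g')
        eval "CACHE_${name}=\"${value}\""
done

# later lookups read $CACHE_zfs_interval etc. instead of forking svcprop again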

>   My suggested code uses associative arrays which don't have this limitation.
> 
> - For your solution, how do you invalidate the cache if a property is being
>   changed or deleted (this is trivial, but not yet implemented)?

Right - the cache gets created once at the beginning of a method call; if a
user changes an SMF property in the middle of that method running, the results
are undefined.

> - Does your solution handle white space, quotes etc. in svcprop values 
> properly
>   (I think there is an issue regarding white space, but I have not tested it)?

Very good point - I'll dig into this.

> - Does your solution impose a security risk? (consider the eval $NAME)

Not that I'm aware of - at least no more than the "zfs/backup-save-cmd"
property, which allows an administrator to set an arbitrary command to process
the zfs send stream.  The point is, if a user can set any SMF property for
this service, then they're already privileged (or should know better).

With the upcoming change to running this stuff under a restricted role,
this will be even less of a concern.

cheers,
tim



[zfs-discuss] resilver in progress - which disk is inconsistent?

2008-08-08 Thread Justin Vassallo
Hi,

 

I've got a 'resilver in progress'. Since resilvering is slowing my pool down
incredibly (db queries taking 20 times longer than normal), I want to pause
the resilvering.

 

AFAIK there is no way to tell resilvering to pause, so I want to detach the
inconsistent disk and attach it again tonight, when it won't affect users.
To do that I need to know which disk is inconsistent, but zpool status does
not show me any info in that regard.

 

Is there any way to identify which disk is inconsistent?

 

Thanks

justin
