Re: [zfs-discuss] ZFS send fails incremental snapshot

2009-01-04 Thread Carsten Aulbert
Hi Brent,

Brent Jones wrote:
> I am using 2008.11 with the Time Slider automatic snapshots, and using
> it to automatically send snapshots to a remote host every 15 minutes.
> Both sides are X4540s, with the remote filesystem mounted read-only,
> as I read earlier that leaving it writable would cause problems.
> The snapshots send fine for several days, I accumulate many snapshots
> at regular intervals, and they are sent without any problems.
> Then I will get the dreaded:
> "
> cannot receive incremental stream: most recent snapshot of pdxfilu02
> does not match incremental source
> "
> 

Which command line are you using?

Maybe you need to do a rollback first (zfs receive -F)?
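
For example, a minimal sketch (dataset, snapshot and host names below are
placeholders -- adjust to whatever your setup actually uses):

  # -F makes the receiving side roll back to its most recent snapshot,
  # discarding any local changes, before applying the incremental stream
  zfs send -i tank/fs@auto-0800 tank/fs@auto-0815 | \
      ssh pdxfilu02 zfs receive -F tank/fs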

Cheers

Carsten


[zfs-discuss] ZFS send fails incremental snapshot

2009-01-04 Thread Brent Jones
Hello all,
I am using 2008.11 with the Time Slider automatic snapshots, and using
it to automatically send snapshots to a remote host every 15 minutes.
Both sides are X4540s, with the remote filesystem mounted read-only,
as I read earlier that leaving it writable would cause problems.
The snapshots send fine for several days, I accumulate many snapshots
at regular intervals, and they are sent without any problems.
Then I will get the dreaded:
"
cannot receive incremental stream: most recent snapshot of pdxfilu02
does not match incremental source
"

Manually sending does not work, nor does destroying the snapshots on the
remote side and resending the batch again from the earliest point in time.
The only way I have found that works is to destroy the entire ZFS
filesystem on the remote side and begin anew.

Is there a way to force a ZFS receive, or to get more information
about what changed on the remote system to cause it not to accept
any more snapshots?
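
For reference, a minimal way to compare what each side currently has
(the dataset name is a placeholder):

  # run on both the sending box and on pdxfilu02
  zfs list -t snapshot -o name,creation -s creation -r tank/fs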

Thank you in advance

-- 
Brent Jones
br...@servuhome.net


Re: [zfs-discuss] ZFS iSCSI (For VirtualBox target) and SMB

2009-01-04 Thread Sanjeev Bagewadi
Kevin,

Kevin Pattison wrote:
> Hey all,
>
> I'm setting up a ZFS-based fileserver to use both as a shared network drive 
> and separately to have an iSCSI target to be used as the "Hard disk" of a 
> Windows-based VM running on another machine.
>
> I've built the machine, installed the OS, created the RAIDZ pool and now have 
> a couple of questions (I'm pretty much new to Solaris by the way but have 
> been using Linux for some time). In my attempt to create the iSCSI target to 
> be used and the VM disk I created (through the web frontend) a new dataset 
> under the main pool of type "Volume" and gave it 30GB of space and called it 
> iTunesVM. I then tried to run:
> zfs set shareiscsi=on tank/iTunesVM
> but got the error:
> cannot share 'tank/iTunesVM': iscsitgtd failed request to share
> cannot share 'tank/itune...@zfs-auto-snap:weekly-2009-01-02-15:02': iscsitgtd 
> failed request to share
>   
This needs to be investigated. From what I see, all the snapshots on that
volume inherit the shareiscsi property, and hence when we set it we try to
share the snapshots as well, and that is what is failing here.
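
A quick way to confirm this (a sketch, using the dataset name from Kevin's
mail):

  # see where shareiscsi is set or inherited under the volume
  zfs get -r shareiscsi tank/iTunesVM

  # if the target itself was created despite the error, it will show up here
  iscsitadm list target -v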

Thanks and regards,
Sanjeev.

> I've checked and my iSCSI target service is on and running.
>
> With regard to a network share accessible to Windows, Linux and Mac OS 
> machines on the network, what protocol would be best to use (NFS or SMB)? I 
> would then like to set up a locally hosted headless Windows VM to run a 
> Windows Media Player/iTunes share over the network for access to the music 
> from my Xbox/PS3.
>
> All help appreciated,
> Kevpatts
>   



Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2009-01-04 Thread Tim
On Sun, Jan 4, 2009 at 5:47 PM, Orvar Korvar  wrote:

> "ECC theory tells, that you need a minimum distance of 3
> to correct one error in a codeword, ergo neither RAID-5 or RAID-6
> are enough: you need RAID-2 (which nobody uses today)."
>
> What is "RAID-2"? Is it raidz2?
> --
>


Google is your friend ;)
http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel2-c.html

--Tim


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2009-01-04 Thread Orvar Korvar
"ECC theory tells, that you need a minimum distance of 3
to correct one error in a codeword, ergo neither RAID-5 or RAID-6
are enough: you need RAID-2 (which nobody uses today)."

What is "RAID-2"? Is it raidz2?


Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Richard Elling
Bernd Finger wrote:
> Hi,
>
> After I published a blog entry about installing OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue with wearing out 
> blocks on the USB stick after some time because ZFS overwrites its 
> uberblocks in place.
>
> I tried to get more information about how updating uberblocks works with 
> the following dtrace script:
>
> /* io:genunix::start */
> io:genunix:default_physio:start,
> io:genunix:bdev_strategy:start,
> io:genunix:biodone:done
> {
> printf ("%d %s %d %d", timestamp, execname, args[0]->b_blkno, 
> args[0]->b_bcount);
> }
>
> fbt:zfs:uberblock_update:entry
> {
> printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d", timestamp,
>   args[0]->ub_timestamp,
>   args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
>   args[0]->ub_rootbp.blk_birth, args[0]->ub_rootbp.blk_fill,
>   args[1]->vdev_id, args[1]->vdev_asize, args[1]->vdev_psize,
>   args[2]);
> }
>
> The output shows the following pattern after most of the 
> uberblock_update events:
>
>0  34404 uberblock_update:entry 244484736418912 (1231084189) 
> 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26747
>0   6668bdev_strategy:start 244485190035647 sched 502 1024
>0   6668bdev_strategy:start 244485190094304 sched 1014 1024
>0   6668bdev_strategy:start 244485190129133 sched 39005174 1024
>0   6668bdev_strategy:start 244485190163273 sched 39005686 1024
>0   6656  biodone:done 244485190745068 sched 502 1024
>0   6656  biodone:done 244485191239190 sched 1014 1024
>0   6656  biodone:done 244485191737766 sched 39005174 1024
>0   6656  biodone:done 244485192236988 sched 39005686 1024
>
> ...
>0  34404   uberblock_update:entry 244514710086249 
> (1231084219) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
> 0, 26748
>0  34404   uberblock_update:entry 244544710086804 
> (1231084249) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 
> 0, 26749
> ...
>0  34404   uberblock_update:entry 244574740885524 
> (1231084279) 9226475971064889345, 4541013553469450828, 26750, 159, 0, 0, 
> 0, 26750
>0   6668 bdev_strategy:start 244575189866189 sched 508 1024
>0   6668 bdev_strategy:start 244575189926518 sched 1020 1024
>0   6668 bdev_strategy:start 244575189961783 sched 39005180 1024
>0   6668 bdev_strategy:start 244575189995547 sched 39005692 1024
>0   6656   biodone:done 244575190584497 sched 508 1024
>0   6656   biodone:done 244575191077651 sched 1020 1024
>0   6656   biodone:done 244575191576723 sched 39005180 1024
>0   6656   biodone:done 244575192077070 sched 39005692 1024
>
> I am not a dtrace or zfs expert, but to me it looks like in many cases 
> an uberblock update is followed by a write of 1024 bytes to four 
> different disk blocks. I also found that the four block numbers are 
> always incremented by two (256, 258, 260, ...) 127 times and then the 
> first block is written again. Which would mean that for a txg of around 
> 50,000, the four uberblock copies would each have been written roughly 
> 50,000/127 = ~393 times (correct?).
>   

The uberblocks are stored in a circular queue: 128 entries @ 1k.  The method
is described in the on-disk specification document.  I applaud your 
effort to
reverse-engineer this :-)
http://www.opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

I've done some research in this area by measuring the actual I/O to each
block on the disk.  This can be done with TNF or dtrace -- for any
workload.  I'd be interested in hearing about your findings, especially if
you record block update counts for real workloads.
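
A rough sketch of the dtrace side of that (it counts all I/Os per block
number for whatever workload happens to be running; no read/write or
per-device filtering here):

  dtrace -n 'io:genunix:bdev_strategy:start { @blocks[args[0]->b_blkno] = count(); }'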

Note: wear leveling algorithms for specific devices do not seem to be
publicly available :-(  But the enterprise SSDs seem to be gravitating
towards using DRAM write caches anyway.
 -- richard



Re: [zfs-discuss] zfs & iscsi sustained write performance

2009-01-04 Thread milosz
thanks for your responses, guys...

the Nagle tweak is the first thing I did, actually.
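
(For anyone following along, the Solaris-side form of that tweak is usually
something like the following -- a sketch, double-check the tunable on your
release:

  # effectively disable Nagle's algorithm for TCP until the next reboot
  ndd -set /dev/tcp tcp_naglim_def 1
)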

not sure what the network limiting factors could be here... there's no switch, 
jumbo frames are on... maybe it's the e1000g driver?  it's been wonky since 
build 94 or so.  even during the write bursts I'm only getting 60% of gigabit 
on average.


Re: [zfs-discuss] Unable to add cache device

2009-01-04 Thread JZ
[no Sun folks replying to this?  ok, let me do more spam then...]

Scott, thank you so much for the testing spirit and sharing the result with 
the list! -- We architects can be talking all day long and still don't have 
any idea how the open things would work on "any box", not just the 
poster-boy kind of expensive boxes with tons of hardware.

However, I would just like to suggest that the SSD performance gain would be 
mostly in rates (IOPS), but not throughput (MB/s). If you measure the gain 
in light of rates, you might be (actually should be, by our architecting 
theory) much more impressed.
[well, only if you care about database applications, beyond just our 
personal digital media files on company network...   :-)   ]
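
[A minimal way to see the two kinds of numbers side by side on a Solaris box
-- just a sketch:

  # r/s and w/s are the rate (IOPS) columns; kr/s and kw/s are throughput
  iostat -xn 5
]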

Please see the testing below, done before the 10/2008 Sun official 7000 SSD 
availability announcement, as well as the tech talk by Brendan, a bit long 
(and less fun than my spam), but I am sure it is worth the time to study.
http://blogs.sun.com/brendan/entry/test

Best,
z


- Original Message - 
From: "Scott Laird" 
To: "Richard Elling" 
Cc: ; "Akhilesh Mritunjai" 

Sent: Saturday, January 03, 2009 12:02 AM
Subject: Re: [zfs-discuss] Unable to add cache device


> On Fri, Jan 2, 2009 at 8:54 PM, Richard Elling  
> wrote:
>> Scott Laird wrote:
>>>
>>> On Fri, Jan 2, 2009 at 4:52 PM, Akhilesh Mritunjai
>>>  wrote:
>>>

 As for source, here you go :)


 http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/zpool/zpool_vdev.c#650

>>>
>>> Thanks.  It's in the middle of get_replication, so I suspect it's a
>>> bug--zpool tries to check on the replication status of existing vdevs
>>> and croaks in the process.  As it turns out, I was able to add the
>>> cache devices just fine once the resilver completed.
>>>
>>
>> It is a bug because the assertion failed.  Please file one.
>> http://en.wikipedia.org/wiki/Assertion_(computing)
>> http://bugs.opensolaris.org
>>
>>> Out of curiosity, what's the easiest way to shove a file into the
>>> L2ARC?  Repeated reads with dd if=file of=/dev/null doesn't appear to
>>> do the trick.
>>>
>>
>> To put something in the L2ARC, it has to be purged from the ARC.
>> So until you run out of space in the ARC, nothing will be placed into
>> the L2ARC.
>
> I have a ~50G working set and 8 GB of RAM, so I'm out of space in my
> ARC.  My read rate is low enough for the disks to keep up, but I'd
> like to see lower latency.  Also, 30G SSDs were cheap last week :-).
>
> My big problem is that dd if=file of=/dev/null doesn't appear to
> actually read the whole file--I can loop over 50G of data in about 20
> seconds while doing under 100 MB/sec of disk I/O.  Does Solaris's dd
> have some sort of of=/dev/null optimization?  Adding conv=swab seems
> to be making it work better, but I'm still only seeing write rates of
> ~1 MB/sec per SSD, even though they're mostly empty.
>
>
> Scott



Re: [zfs-discuss] ZFS iSCSI (For VirtualBox target) and SMB

2009-01-04 Thread kristof
I've seen this error often, but in most cases the volume is shared anyway.

I think it happens as soon as the volume has snapshots.

To check if the volume is exposed or not, you can run:

iscsitadm list target -v

If the volume shows up, it's OK and you should ignore the message.

K


Re: [zfs-discuss] How to mount rpool and edit vfstab from LiveCD?

2009-01-04 Thread Patrik Greco
Problem Solved!

I mounted the wrong ZFS dataset.
wrong command: zfs set mountpoint=/b xpool/ROOT/opensolaris

right command: zfs set mountpoint=/b xpool/ROOT/opensolaris-1
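
For anyone hitting the same thing: listing the boot-environment datasets
first makes it obvious which one to mount (a sketch, using the pool name
from this thread):

  # each boot environment shows up as a child dataset of <pool>/ROOT
  zfs list -r xpool/ROOT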


Re: [zfs-discuss] How to mount rpool and edit vfstab from LiveCD?

2009-01-04 Thread kristof
If you have snapshots of the root filesystem, you can recover the file.

To check for snapshots run:

zfs list -t all

If you see something like

rpool/ROOT/opensolaris@<some snapshot name>

then you are lucky; you will find the original vfstab file in:

/b/.zfs/snapshot/<that snapshot name>/etc/vfstab
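
For example (a sketch -- substitute the real snapshot name):

  cp /b/.zfs/snapshot/<snapshot name>/etc/vfstab /b/etc/vfstab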

K


Re: [zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Andrew Gabriel
Bernd Finger wrote:
> Hi,
>
> After I published a blog entry about installing OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue with wearing out 
> blocks on the USB stick after some time because ZFS overwrites its 
> uberblocks in place.

The flash controllers used on solid state disks implement wear leveling, 
to ensure hot blocks don't prematurely wear out on the flash device. 
Wear leveling will move hot logical blocks around in the physical flash 
memory, so one part doesn't wear out much faster than the rest of it.

I would presume flash on USB sticks does something similar, but I don't 
know that for sure.

-- 
Andrew


[zfs-discuss] How to find out the zpool of an uberblock printed with the fbt:zfs:uberblock_update: probes?

2009-01-04 Thread Bernd Finger
Hi,

After I published a blog entry about installing OpenSolaris 2008.11 on a 
USB stick, I read a comment about a possible issue with wearing out 
blocks on the USB stick after some time because ZFS overwrites its 
uberblocks in place.

I tried to get more information about how updating uberblocks works with 
the following dtrace script:

/* io:genunix::start */
io:genunix:default_physio:start,
io:genunix:bdev_strategy:start,
io:genunix:biodone:done
{
printf ("%d %s %d %d", timestamp, execname, args[0]->b_blkno, 
args[0]->b_bcount);
}

fbt:zfs:uberblock_update:entry
{
printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d", timestamp,
  args[0]->ub_timestamp,
  args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
  args[0]->ub_rootbp.blk_birth, args[0]->ub_rootbp.blk_fill,
  args[1]->vdev_id, args[1]->vdev_asize, args[1]->vdev_psize,
  args[2]);
}

The output shows the following pattern after most of the 
uberblock_update events:

  0  34404   uberblock_update:entry 244484736418912 (1231084189) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26747
  0   6668      bdev_strategy:start 244485190035647 sched 502 1024
  0   6668      bdev_strategy:start 244485190094304 sched 1014 1024
  0   6668      bdev_strategy:start 244485190129133 sched 39005174 1024
  0   6668      bdev_strategy:start 244485190163273 sched 39005686 1024
  0   6656             biodone:done 244485190745068 sched 502 1024
  0   6656             biodone:done 244485191239190 sched 1014 1024
  0   6656             biodone:done 244485191737766 sched 39005174 1024
  0   6656             biodone:done 244485192236988 sched 39005686 1024

...
  0  34404   uberblock_update:entry 244514710086249 (1231084219) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26748
  0  34404   uberblock_update:entry 244544710086804 (1231084249) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26749
...
  0  34404   uberblock_update:entry 244574740885524 (1231084279) 9226475971064889345, 4541013553469450828, 26750, 159, 0, 0, 0, 26750
  0   6668      bdev_strategy:start 244575189866189 sched 508 1024
  0   6668      bdev_strategy:start 244575189926518 sched 1020 1024
  0   6668      bdev_strategy:start 244575189961783 sched 39005180 1024
  0   6668      bdev_strategy:start 244575189995547 sched 39005692 1024
  0   6656             biodone:done 244575190584497 sched 508 1024
  0   6656             biodone:done 244575191077651 sched 1020 1024
  0   6656             biodone:done 244575191576723 sched 39005180 1024
  0   6656             biodone:done 244575192077070 sched 39005692 1024

I am not a dtrace or zfs expert, but to me it looks like in many cases 
an uberblock update is followed by a write of 1024 bytes to four 
different disk blocks. I also found that the four block numbers are 
always incremented by two (256, 258, 260, ...) 127 times and then the 
first block is written again. Which would mean that for a txg of around 
50,000, the four uberblock copies would each have been written roughly 
50,000/127 = ~393 times (correct?).

What I would like to find out is how to access fields from arg1 (this is 
the data of type vdev in:

int uberblock_update(uberblock_t *ub, vdev_t *rvd, uint64_t txg)

). When using the fbt:zfs:uberblock_update:entry probe, its elements are 
always 0, as you can see in the above output. When using the 
fbt:zfs:uberblock_update:return probe, I am getting an error message 
like the following:

dtrace: failed to compile script zfs-uberblock-report-04.d: line 14: 
operator -> must be applied to a pointer

Any idea how to access the fields of vdev, or how to print out the pool 
name associated to an uberblock_update event?
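
One possible approach (an untested sketch -- the vdev_spa and spa_name
field names are assumptions from reading the ZFS source, not something
verified here) would be to go through the vdev's spa pointer:

  dtrace -n 'fbt:zfs:uberblock_update:entry
  { printf("pool %s txg %d", stringof(args[1]->vdev_spa->spa_name), args[2]); }'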

Regards,

Bernd


Re: [zfs-discuss] zfs & iscsi sustained write performance

2009-01-04 Thread Sean Alderman
> What is less clear is why Windows write performance drops to zero.

Perhaps the tweak for Nagle's algorithm in Windows would be in order?

http://blogs.sun.com/constantin/entry/x4500_solaris_zfs_iscsi_perfect


[zfs-discuss] How to mount rpool and edit vfstab from LiveCD?

2009-01-04 Thread Thommy M . Malmström
My son (15 years old) has installed OpenSolaris 2008.11 on disk on
his system and everything was OK until he made a newbie mistake and
edited the /etc/vfstab file incorrectly, which now prevents him from
booting. (I think he had done too much Linux...)
It just hangs on the splash screen.

My idea was to try to do a live CD boot, import the rpool on the disk,
mount the ZFS root filesystem somewhere, and then edit the
misconfigured vfstab, save it and reboot from disk.

Here are the commands he tried, after some ideas from me:

1. zpool import

   Shows the rpool with a long id number

2. zpool import -f  xpool

3. mkdir /b

4. zfs set mountpoint=/b xpool/ROOT/opensolaris

5. zfs mount xpool/ROOT/opensolaris

6. cat /b/etc/vfstab

But he can't see the edits that he made.

So, how does he get the "real" file mounted?
He made a _LOT_ of additions and upgrades after the initial install
and really does want to recover...