Re: [zfs-discuss] Fixing device names after disk shuffle
On 14 Oct 2012, at 20:56, "Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)" wrote:

>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Paul van der Zwan
>>
>> What was c5t2 is now c7t1 and what was c4t1 is now c5t2.
>> Everything seems to be working fine, it's just a bit confusing.
>
> That... doesn't make any sense. Did you reshuffle these while the system was powered on or something?

No hot-swappable devices, so it was just a SATA cable swap while the system was down.

> sudo devfsadm -Cv
> sudo zpool export datapool
> sudo zpool export homepool
> sudo zpool import -a
> sudo reboot -p

Hmm, I would have to try that in single-user mode, as those pools contain my home dirs and some shared FSs.

> The normal behavior is: during the import, or during the reboot when the filesystem gets mounted, ZFS searches the available devices in the system for components of a pool. I don't see any way the devices reported by "zpool status" wouldn't match the devices reported by "format". Unless, as you say, it's somehow overridden by the cache file.

It surprised me as well, as it seems to be working fine. I tried a scrub of rpool and that went without a problem.

Paul

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
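Edward's sequence above can be wrapped in a small script. This is only a sketch: the pool names come from this thread, and the `run`/`DRYRUN` wrapper is an illustrative addition (not anything from the original mails) so the destructive steps can be reviewed before they touch live pools. As Paul notes, it would need to be run from single-user mode if the pools hold home directories.

```shell
#!/bin/sh
# Refresh stale /dev links and let ZFS rediscover pool members by
# re-importing, which rewrites the device paths recorded in
# /etc/zfs/zpool.cache. With DRYRUN=1 (the default here) the script
# only prints the plan instead of executing it.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run devfsadm -Cv            # clean up dangling device links
run zpool export datapool   # pool names from this thread; rpool stays imported
run zpool export homepool
run zpool import -a         # rescan devices, re-import pools by their GUIDs
```

Running it once with `DRYRUN=1` and checking the printed plan before rerunning with `DRYRUN=0` is the whole point of the wrapper.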
[zfs-discuss] Fixing device names after disk shuffle
I moved some disks around on my OpenIndiana system and now the names shown by zpool status no longer match the names format shows:

$ zpool status
  pool: datapool
 state: ONLINE
  scan: scrub repaired 0 in 7h58m with 0 errors on Wed Oct  3 01:13:47 2012
config:

        NAME          STATE     READ WRITE CKSUM
        datapool      ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c4t1d0    ONLINE       0     0     0
            c5t3d0    ONLINE       0     0     0
        logs
          c5t2d0p5    ONLINE       0     0     0

errors: No known data errors

  pool: homepool
 state: ONLINE
  scan: scrub repaired 0 in 0h56m with 0 errors on Tue Oct  9 16:56:54 2012
config:

        NAME          STATE     READ WRITE CKSUM
        homepool      ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c5t0d0s0  ONLINE       0     0     0
            c5t1d0s0  ONLINE       0     0     0
        logs
          c5t2d0p6    ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Sun Oct 14 15:56:09 2012
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c5t2d0s0  ONLINE       0     0     0

errors: No known data errors

$ sudo format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c5t0d0 /pci@0,0/pci8086,28@1f,2/disk@0,0
       1. c5t1d0 /pci@0,0/pci8086,28@1f,2/disk@1,0
       2. c5t2d0 /pci@0,0/pci8086,28@1f,2/disk@2,0
       3. c5t3d0 /pci@0,0/pci8086,28@1f,2/disk@3,0
       4. c7t1d0 /pci@0,0/pci8086,3a40@1c/pci1095,7132@0/disk@1,0
       5. c8t0d0 /pci@0,0/pci8086,28@1d,7/storage@2/disk@0,0
Specify disk (enter its number): ^D

What was c5t2 is now c7t1, and what was c4t1 is now c5t2. Everything seems to be working fine, it's just a bit confusing. How can I 'fix' this? Delete /etc/zfs/zpool.cache and reboot?

TIA
Paul
[zfs-discuss] Block locations in a mirror vdev ?
I cannot find the answer in the on-disk specification or anywhere else. Are the devices in a mirror vdev block-by-block copies? I mean, is block 10013223 on one device the same as block 10013223 on the other devices in the mirror vdev? Of course only after that block has ever been used by ZFS; I know blocks that have never been used contain unspecified data.

What happens if a block on one device gets corrupted because of a media error? Will ZFS just allocate another block on that device and write the repaired data to that new block? Where can I find more information on how ZFS does this?

Paul
Re: [zfs-discuss] dedupratio riddle
On 18 Mar 2010, at 10:07, Henrik Johansson wrote:

> Hello,
>
> On 17 Mar 2010, at 16:22, Paul van der Zwan wrote:
>
>> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to have the same dedup ratio as the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is switched off for the entire pool, so it seems to be either a bug in the calculation of the dedupratio or a method that gives unexpected results.
>
> I wonder if the dedup ratio is calculated from the contents of the DDT or from all the data in the whole pool; I've only looked at the ratio for datasets which had dedup on for their whole lifetime. If the former, data added while it's switched off will never alter the ratio (until rewritten with dedup on). The source should have the answer, but I'm on mail only for a few weeks.
>
> It's probably for the whole dataset, that makes the most sense, just a thought.

It looks like the ratio only gets updated when dedup is switched on, and freezes if you switch dedup off for the entire pool, like I did. I tried to have a look at the source, but it was way too complex to figure out in the time I have had available so far.

Best regards,
Paul van der Zwan
Sun Microsystems Nederland

> Regards
>
> Henrik
> http://sparcv9.blogspot.com
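Henrik's DDT hypothesis fits the observed freeze: if the ratio is computed only from blocks tracked in the dedup table, then writes made with dedup off never move it. A minimal sketch of that arithmetic — the 1.68x figure matches the zpool list output in this thread, but the DDT split below is a hypothetical illustration, not real zdb output:

```shell
#!/bin/sh
# Hypothetical DDT totals: 'referenced' counts every logical block that
# went through dedup, 'allocated' counts the unique copies actually stored.
ddt_referenced_gb=84   # assumed: logical data written while dedup was on
ddt_allocated_gb=50    # assumed: unique blocks kept for that data

# dedupratio = referenced / allocated, printed the way zpool list shows it.
awk -v r="$ddt_referenced_gb" -v a="$ddt_allocated_gb" \
    'BEGIN { printf "%.2fx\n", r / a }'

# Copying 90 GB with dedup=off never touches the DDT, so neither total
# changes and the printed ratio stays exactly the same.
awk -v r="$ddt_referenced_gb" -v a="$ddt_allocated_gb" \
    'BEGIN { printf "%.2fx\n", r / a }'
```

Both invocations print 1.68x, which is the "stuck" behavior Paul reports: the denominator and numerator live in the DDT, and non-deduped writes bypass it entirely.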
Re: [zfs-discuss] dedupratio riddle
On 17 Mar 2010, at 10:56, zfs ml wrote:

> On 3/17/10 1:21 AM, Paul van der Zwan wrote:
>>
>> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to have the same dedup ratio as the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is switched off for the entire pool, so it seems to be either a bug in the calculation of the dedupratio or a method that gives unexpected results.
>>
>> Paul
>
> beadm list -a
> and/or other snapshots that were taken before turning off dedup?

Possibly, but that should not matter. If I triple the amount of data in the pool with dedup switched off, the dedupratio should IMHO change, because the amount of non-deduped data has changed.

Paul
Re: [zfs-discuss] dedupratio riddle
On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:

> Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to have the same dedup ratio as the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio.

The first copy of that data was unique, and dedup is switched off for the entire pool, so it seems to be either a bug in the calculation of the dedupratio or a method that gives unexpected results.

Paul

> --
> This message posted from opensolaris.org
[zfs-discuss] dedupratio riddle
On OpenSolaris build 134, upgraded from older versions, I have an rpool on which I had switched on dedup for a few weeks. After that I switched it back off. Now it seems the dedup ratio is stuck at a value of 1.68. Even when I copy more than 90 GB of data it still remains at 1.68. Any ideas?

Paul

Here is some evidence... Before the copy:

$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   132G   799G    14%  1.68x  ONLINE  -
$

After the copy:

$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   225G   706G    24%  1.68x  ONLINE  -
$

It had only been enabled for 11 days last month:

$ pfexec zpool history | grep dedup
2010-02-11.21:19:42 zfs set dedup=verify rpool
2010-02-22.21:38:15 zfs set dedup=off rpool

And it is off on all filesystems:

$ zfs get -r dedup rpool
NAME                                       PROPERTY  VALUE  SOURCE
rpool                                      dedup     off    local
rp...@20100227                             dedup     -      -
rpool/ROOT                                 dedup     off    inherited from rpool
rpool/r...@20100227                        dedup     -      -
rpool/ROOT/b131-zones                      dedup     off    inherited from rpool
rpool/ROOT/b131-zo...@20100227             dedup     -      -
rpool/ROOT/b132                            dedup     off    inherited from rpool
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b133                            dedup     off    inherited from rpool
rpool/ROOT/b134                            dedup     off    inherited from rpool
rpool/ROOT/b...@install                    dedup     -      -
rpool/ROOT/b...@2010-02-07-11:19:05        dedup     -      -
rpool/ROOT/b...@2010-02-20-15:59:22        dedup     -      -
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b...@2010-03-11-19:18:51        dedup     -      -
rpool/dump                                 dedup     off    inherited from rpool
rpool/d...@20100227                        dedup     -      -
rpool/export                               dedup     off    inherited from rpool
rpool/exp...@20100227                      dedup     -      -
rpool/export/home                          dedup     off    inherited from rpool
rpool/export/h...@20100227                 dedup     -      -
rpool/export/home/beheer                   dedup     off    inherited from rpool
rpool/export/home/beh...@20100227          dedup     -      -
rpool/export/home/paulz                    dedup     off    inherited from rpool
rpool/export/home/pa...@20100227           dedup     -      -
rpool/export/share                         dedup     off    inherited from rpool
rpool/export/sh...@20100227                dedup     -      -
rpool/local                                dedup     off    inherited from rpool
rpool/lo...@20100227                       dedup     -      -
rpool/paulzmail                            dedup     off    inherited from rpool
rpool/paulzm...@20100227                   dedup     -      -
rpool/pkg                                  dedup     off    inherited from rpool
rpool/p...@20100227                        dedup     -      -
rpool/swap                                 dedup     off    inherited from rpool
rpool/s...@20100227                        dedup     -      -
rpool/zones                                dedup     off    inherited from rpool
rpool/zo...@20100227                       dedup     -      -
rpool/zones/buildzone                      dedup     off    inherited from rpool
rpool/zones/buildz...@20100227             dedup     -      -
rpool/zones/buildzone/ROOT                 dedup     off    inherited from rpool
rpool/zones/buildzone/r...@20100227        dedup     -      -
rpool/zones/buildzone/ROOT/zbe-1           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool/zones/buildzone/ROOT/zbe-2           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool
Re: [zfs-discuss] Apple Removes Nearly All Reference To ZFS
On 11 Jun 2009, at 11:48, Sami Ketola wrote:

> On 11 Jun 2009, at 12:44, Paul van der Zwan wrote:
>
>> Strange thing I noticed in the keynote is that they claim the disk usage of Snow Leopard is 6 GB less than Leopard, mostly because of compression. Either they have implemented compressed binaries or they use filesystem compression. Neither feature is present in Leopard AFAIK.
>
> Filesystem compression is a ZFS feature, so I think this is because they are removing PowerPC support from the binaries.

I really doubt the PPC-specific code is 6 GB. A few hundred MB perhaps. Most of a fat binary or an .app folder is architecture-independent and will remain. And Phil Schiller specifically mentioned it was because of compression.

Paul
Re: [zfs-discuss] Apple Removes Nearly All Reference To ZFS
On 11 Jun 2009, at 10:48, Jerry K wrote:

> There is a pretty active Apple ZFS SourceForge group that provides RW bits for 10.5. Things are oddly quiet concerning 10.6. I am curious about how this will turn out myself.
>
> Jerry

Strange thing I noticed in the keynote is that they claim the disk usage of Snow Leopard is 6 GB less than Leopard, mostly because of compression. Either they have implemented compressed binaries or they use filesystem compression. Neither feature is present in Leopard AFAIK. Filesystem compression is a ZFS feature, so...

Paul

> Disclaimer: even though I work for Sun, I have no idea what's going on regarding Apple and ZFS.
>
> Rich Teer wrote:
>> It's not pertinent to this sub-thread, but ZFS (albeit read-only) is already in currently shipping MacOS 10.5. So presumably it'll be in MacOS 10.6...
Re: [zfs-discuss] Performance with Sun StorageTek 2540
> On Wed, 27 Feb 2008, Cyril Plisko wrote:
>>>
>>> http://www.simplesystems.org/users/bfriesen/zfs-discuss/2540-zfs-performance.pdf
>>
>> Nov 26, 2008??? May I borrow your time machine? ;-)
>
> Are there any stock prices you would like to know about? Perhaps you are interested in the outcome of the elections?

No need for a time machine, the US presidential election outcome is already known:
http://www.theonion.com/content/video/diebold_accidentally_leaks

Paul
Re: [zfs-discuss] ZFS Roadmap - thoughts on expanding raidz / restriping / defrag
On 17 Dec 2007, at 11:42, Jeff Bonwick wrote:

> In short, yes. The enabling technology for all of this is something we call bp rewrite -- that is, the ability to rewrite an existing block pointer (bp) to a new location. Since ZFS is COW, this would be trivial in the absence of snapshots -- just touch all the data. But because a block may appear in many snapshots, there's more to it. It's not impossible, just a bit tricky... and we're working on it.
>
> Once we have bp rewrite, many cool features will become available as trivial applications of it: on-line defrag, restripe, recompress, etc.

Does that include evacuating vdevs? Marking a vdev read-only and then doing a rewrite pass would clear out the vdev, wouldn't it?

Paul

> Jeff
>
> On Mon, Dec 17, 2007 at 02:29:14AM -0800, Ross wrote:
>> Hey folks,
>>
>> Does anybody know if any of these are on the roadmap for ZFS, or have any idea how long it's likely to be before we see them (we're in no rush - late 2008 would be fine with us, but it would be nice to know they're being worked on)?
>>
>> I've seen many people ask for the ability to expand a raid-z pool by adding devices. I'm wondering if it would be useful to work on a defrag / restriping tool to work hand in hand with this.
>>
>> I'm assuming that when the functionality is available, adding a disk to a raid-z set will mean the existing data stays put, and new data is written across a wider stripe. That's great for performance for new data, but not so good for the existing files. Another problem is that you can't guarantee how much space will be added. That will have to be calculated based on how much data you already have.
>>
>> ie: If you have a simple raid-z of five 500GB drives, you would expect adding another drive to add 500GB of space. However, if your pool is half full, you can only make use of 250GB of space; the other 250GB is going to be wasted.
>>
>> What I would propose to solve this is to implement a defrag / restripe utility as part of the raid-z upgrade process, making it a three-step process:
>>
>> - New drive added to raid-z pool
>> - Defrag tool begins restriping and defragmenting old data
>> - Once restripe complete, pool reports the additional free space
>>
>> There are some limitations to this. You would maybe want to advise that expanding a raid-z pool should only be done with a reasonable amount of free disk space, and that it may take some time. It may also be beneficial to add the ability to add multiple disks in one go.
>>
>> However, if it works it would seem to add several benefits:
>> - Raid-z pools can be expanded
>> - ZFS gains a defrag tool
>> - ZFS gains a restriping tool
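Ross's space example can be put in numbers. A quick sketch of the arithmetic (the 500 GB drive size and half-full pool come from his message; the script itself is only illustrative of his assumption that existing data keeps its old, narrower stripe width until restriped):

```shell
#!/bin/sh
# One 500 GB drive is added to a raid-z pool that is half full.
# Existing data stays in its old 5-wide stripes, so only the portion of
# the new drive matching the pool's free fraction is usable right away.
drive_gb=500
free_fraction=50   # pool is half full, so 50% free

awk -v d="$drive_gb" -v f="$free_fraction" 'BEGIN {
    usable = d * f / 100          # space usable immediately
    wasted = d - usable           # reclaimed only after a restripe pass
    printf "usable now: %dGB, reclaimed after restripe: %dGB\n", usable, wasted
}'
```

This prints "usable now: 250GB, reclaimed after restripe: 250GB", which is exactly why the proposal pairs expansion with a defrag/restripe pass: without it, half of the added drive sits idle.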
Re: [zfs-discuss] Write over read priority possible ?
On 25 Jun 2007, at 14:37, [EMAIL PROTECTED] wrote:

>> On 25 Jun 2007, at 14:00, [EMAIL PROTECTED] wrote:
>>>> I'm testing an X4500 where we need to send over 600MB/s over the network. This is no problem, I get about 700MB/s over a single 10G interface. Problem is the box also needs to accept incoming data at 100MB/s. If I do a simple test ftp-ing files into the same filesystem I see the FTP being limited to about 25-30MB/s.
>>>
>>> What was the speed when ftp'ing to /dev/null? (Depending on the exact Solaris version, ftpd may or may not be really slow)
>>
>> ftp'ing to the system while no read load was present maxed out the 1Gb interface at 100MB/s. Note, the ftp put load was a single stream while the read load was 224 concurrent streams.
>
> How many interfaces did you use, and was ftp confined to its own interface?

The outgoing load used a dedicated 10Gb interface; the incoming load used a dedicated e1000g interface.

Paul
Re: [zfs-discuss] Write over read priority possible ?
On 25 Jun 2007, at 14:00, [EMAIL PROTECTED] wrote:

>> I'm testing an X4500 where we need to send over 600MB/s over the network. This is no problem, I get about 700MB/s over a single 10G interface. Problem is the box also needs to accept incoming data at 100MB/s. If I do a simple test ftp-ing files into the same filesystem I see the FTP being limited to about 25-30MB/s.
>
> What was the speed when ftp'ing to /dev/null? (Depending on the exact Solaris version, ftpd may or may not be really slow)

ftp'ing to the system while no read load was present maxed out the 1Gb interface at 100MB/s. Note, the ftp put load was a single stream while the read load was 224 concurrent streams.

Paul
[zfs-discuss] Write over read priority possible ?
I'm testing an X4500 where we need to send over 600MB/s over the network. This is no problem; I get about 700MB/s over a single 10G interface. The problem is the box also needs to accept incoming data at 100MB/s. If I do a simple test ftp-ing files into the same filesystem, I see the FTP being limited to about 25-30MB/s.

Is there a way to increase the priority of writes over reads? Doing the writes to a separate pool is not an option, because in order to get the read speed I had to make the 4500 into one big pool with 23 mirror vdevs (I got about 25 MB/s per disk, so for 700 MB/s I need about 30 disks active concurrently).

Paul
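The sizing in the last paragraph is a quick back-of-envelope calculation. A sketch of it (the 25 MB/s per-disk and 700 MB/s figures come from the message; the helper is purely illustrative):

```shell
#!/bin/sh
# Rough spindle-count estimate: how many disks must be streaming
# concurrently to sustain a target aggregate read rate.
target_mbps=700      # desired aggregate read throughput, MB/s
per_disk_mbps=25     # measured per-disk streaming rate, MB/s

awk -v t="$target_mbps" -v d="$per_disk_mbps" 'BEGIN {
    disks = int((t + d - 1) / d)     # divide and round up
    printf "need about %d disks active concurrently\n", disks
}'
```

This prints 28, which is why the pool was laid out as 23 mirror vdevs (46 data spindles): enough headroom that roughly 30 disks can be kept busy at once.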
Re: [zfs-discuss] linux versus sol10
On 8 Nov 2006, at 16:16, Robert Milkowski wrote:

> Hello Paul,
>
> Wednesday, November 8, 2006, 3:23:35 PM, you wrote:
>
> PvdZ> On 7 Nov 2006, at 21:02, Michael Schuster wrote:
> PvdZ>> listman wrote:
> PvdZ>>> hi, i found a comment comparing linux and solaris but wasn't sure which version of solaris was being referred to. can the list confirm that this issue isn't a problem with solaris10/zfs?
> PvdZ>>> "Linux also supports asynchronous directory updates which can make a significant performance improvement when branching. On Solaris machines, inode creation is very slow and can result in very long iowait states."
> PvdZ>> I think this cannot be commented on in a useful fashion without more information about this supposed issue. AFAIK, neither ufs nor zfs "create inodes" (at run time), so this is somewhat hard to put into context. Get a complete description of what this is about, then maybe we can give you a useful answer.
> PvdZ>
> PvdZ> This could be related to Linux trading reliability for speed by doing async metadata updates.
> PvdZ> If your system crashes before your metadata is flushed to disk, your filesystem might be hosed and a restore from backups may be needed.
>
> You can achieve something similar with fastfs on UFS file systems and setting zil_disable to 1 on ZFS.

Sure, UFS and ZFS can be faster, but having fast, but possibly dangerous, defaults gives you nice benchmark figures ;-)

In real life I prefer the safe, but a bit slower, defaults, as should anybody who values his data.

Paul
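For reference, on the Solaris/OpenSolaris builds of that era the tunable Robert mentions was set like this (shown only to illustrate the trade-off under discussion; it discards synchronous-write guarantees, was never recommended for data you care about, and in later releases was superseded by the per-dataset `sync` property):

```
* /etc/system fragment -- boot-time form on old builds:
set zfs:zil_disable = 1

* runtime form via the kernel debugger, affecting filesystems
* mounted afterwards:
*   echo zil_disable/W0t1 | mdb -kw
```

With the ZIL disabled, ZFS acknowledges synchronous writes before they are on stable storage, which is exactly the "fast but possibly dangerous" default Paul is arguing against.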
Re: [zfs-discuss] linux versus sol10
On 7 Nov 2006, at 21:02, Michael Schuster wrote:

> listman wrote:
>> hi, i found a comment comparing linux and solaris but wasn't sure which version of solaris was being referred to. can the list confirm that this issue isn't a problem with solaris10/zfs?
>> "Linux also supports asynchronous directory updates which can make a significant performance improvement when branching. On Solaris machines, inode creation is very slow and can result in very long iowait states."
>
> I think this cannot be commented on in a useful fashion without more information about this supposed issue. AFAIK, neither ufs nor zfs "create inodes" (at run time), so this is somewhat hard to put into context. Get a complete description of what this is about, then maybe we can give you a useful answer.

This could be related to Linux trading reliability for speed by doing async metadata updates. If your system crashes before your metadata is flushed to disk, your filesystem might be hosed and a restore from backups may be needed.

Paul
Re: [zfs-discuss] Trying to replicate ZFS self-heal demo and not seeing fixed error
On 9 May 2006, at 11:35, Joerg Schilling wrote:

> Darren J Moffat <[EMAIL PROTECTED]> wrote:
>
>> Jeff Bonwick wrote:
>>> I personally hate this device naming semantic (/dev/rdsk/c-t-d not meaning what you'd logically expect it to). (It's a generic Solaris bug, not a ZFS thing.) I'll see if I can get it changed. Because almost everyone gets bitten by this.
>>
>> I've heard lots of people complain about this over the years. Some claim the SunOS model (or the slightly altered Linux one) was better; others hate what we have but don't know what to do to fix it.
>
> So what's your proposal?

I just booted up Minix 3.1.1 today in QEMU and noticed to my surprise that it has a disk naming scheme similar to what Solaris uses. It has c?d?p?s? (note that both p, PC FDISK I assume, and s are used); HP-UX uses the same scheme. I think any system descending from the old SysV branch has the c?t?d?s? naming convention. I don't remember which version first used it, but as far as I remember it was already in use in the mid-80s.

Paul