Re: [zfs-discuss] SPARC SATA, please.
On 25/06/2009, at 5:16 AM, Miles Nordin wrote: and mpt is the 1068 driver, proprietary, works on x86 and SPARC. then there is also itmpt, the third-party-downloadable closed-source driver from LSI Logic, dunno much about it but someone here used it. I'm confused. Why do you say the mpt driver is proprietary and the LSI provided tool is closed source? I thought they were both closed source and that the LSI chipset specifications were proprietary. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4500 resilvering spare taking forever?
Yep, it also suffers from the bug that restarts resilvers when you take a snapshot. This was fixed in b94. http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6343667 -- richard

Hats off to Richard for saving the day. This was exactly the issue. I shut off my automatic snapshots and 3 days later my resilver is done. Joe -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
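For anyone hitting this on a pre-b94 build, a rough sketch of what "shut off my automatic snapshots" amounts to with the SMF auto-snapshot service (assuming the standard svc:/system/filesystem/zfs/auto-snapshot:* instance names; only touch the intervals you actually have enabled):

  # see which auto-snapshot instances are running
  svcs | grep auto-snapshot

  # disable them for the duration of the resilver
  for i in frequent hourly daily weekly monthly; do
      svcadm disable svc:/system/filesystem/zfs/auto-snapshot:$i
  done

  # once 'zpool status' shows the resilver complete, re-enable them
  for i in frequent hourly daily weekly monthly; do
      svcadm enable svc:/system/filesystem/zfs/auto-snapshot:$i
  done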
Re: [zfs-discuss] SPARC SATA, please.
Miles Nordin wrote: There's also been talk of two tools, MegaCli and lsiutil, which are both binary only and exist for both Linux and Solaris, and I think are used only with the 1078 cards but maybe not. lsiutil works with LSI chips that use the Fusion-MPT interface (SCSI, SAS, and FC), including the 1068. I've used it with both the mpt and itmpt driver. MegaCLI appears to be for MegaRAID SAS and SATA II controllers (using the mega_sas driver), including the 1078. I've never used it. -- Carson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Increase size of ZFS mirror
Thanks very much everyone. Victor, I did think about using VirtualBox, but I have a real machine and a supply of hard drives for a short time, so I'll test it out using that if I can. Scott, of course, at work we use three mirrors and it works very well; it has saved us on occasions where we have detached the third mirror, upgraded, found the upgrade failed, and have been able to revert from the third mirror instead of having to go through backups. George, it will be great to see 'autoexpand' in the next release. I'm keeping my home server on stable releases for the time being :) -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool status -x output
It might be easier to look for the pool status thusly: zpool get health poolname

Correct me if I'm wrong, but zpool get is only available in more recent versions of OpenSolaris and Solaris 10 (on some boxes we are running older Solaris 10 releases). Nevertheless, IMO zpool status -x should work as it is described in the manual, and the current behavior does not match the description in the manual :) Tomasz -- Wydział Zarządzania i Ekonomii Politechnika Gdańska http://www.zie.pg.gda.pl/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
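For comparison, this is roughly what the two commands report on a healthy system (a sketch with a hypothetical pool named tank; exact wording varies by release):

  # zpool status -x
  all pools are healthy

  # zpool get health tank
  NAME  PROPERTY  VALUE   SOURCE
  tank  health    ONLINE  -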
[zfs-discuss] auto snapshots 0.12
Hi all, Just a quick plug: the latest version of the ZFS Automatic Snapshots SMF service hit the hg repository yesterday. If you're using 0.11 or older, it's well worth upgrading to get the few bugfixes (especially if you're using CIFS - we use '_' instead of ':' in snapshot names now). More at: http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_12 cheers, tim
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] how to convert zio-io_offset to disk block number?
I use the following dtrace script to trace the position of one file on ZFS:

#!/usr/sbin/dtrace -qs
zio_done:entry
/((zio_t *)(arg0))->io_vd/
{
    zio = (zio_t *)arg0;
    printf("Offset:%x and Size:%x\n", zio->io_offset, zio->io_size);
    printf("vd:%x\n", (unsigned long)(zio->io_vd));
    printf("process name:%s\n", execname);
    tracemem(zio->io_data, 40);
    stack();
}

and I run the dd command: dd if=/export/dsk1/test1 bs=512 count=1. The dtrace script generates the following output:

Offset:657800 and Size:200
vd:ff02d6a1a700
process name:sched
zfs`zio_execute+0xa0
genunix`taskq_thread+0x193
unix`thread_start+0x8
^C

The tracemem output is the right content of file test1, which is a 512-byte text file. zpool status has the following output:

  pool: tpool
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        tpool     ONLINE       0     0     0
          c2t0d0  ONLINE       0     0     0

errors: No known data errors

My question is how to translate the zio->io_offset (0x657800, equal to decimal 6649856) output by dtrace into a block number on disk c2t0d0? I tried dd if=/dev/dsk/c2t0d0 of=text iseek=6650112 bs=512 count=1 for a check, but the result is not right. Thanks Zhihui
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
I am not sure how zfs would know the rate of the underlying disk storage

Easy: is the buffer growing? :-) If the amount of data in the buffer is growing, you need to throttle back a bit until the disks catch up. Don't stop writes until the buffer is empty, just slow them down to match the rate at which you're clearing data from the buffer. In your case I'd expect to see ZFS buffer the early part of the write (so you'd see a very quick initial burst), but from then on you would want a continual stream of data to disk, at a steady rate. To the client it should respond just like storing to disk; the only difference is there's actually a small delay before the data hits the disk, which will be proportional to the buffer size. ZFS won't have so much opportunity to optimize writes, but you wouldn't get such stuttering performance.

However, reading through the other messages, if it's a known bug with ZFS blocking reads while writing, there may not be any need for this idea. But then, that bug has been open since 2006, is flagged as fix in progress, and was planned for snv_51 o_0. So it probably is worth having this discussion.

And I may be completely wrong here, but reading that bug, it sounds like ZFS issues a whole bunch of writes at once as it clears the buffer, which ties in with the experiences of stalling actually being caused by reads being blocked. I'm guessing that given ZFS's aims it made sense to code it that way - if you're going to queue a bunch of transactions to make them efficient on disk, you don't want to interrupt that batch with a bunch of other (less efficient) reads. But the unintended side effect is that ZFS's attempt to optimize writes will cause jerky read and write behaviour any time you have a large amount of writes going on, and when you should be pushing the disks to 100% usage you're never going to reach that, as it's always going to have 5s of inactivity followed by 5s of running the disks flat out. In fact, I wonder if it's as simple as the disks ending up doing 5s of reads, a delay for processing, 5s of writes, 5s of reads, etc...

It's probably efficient, but it's going to *feel* horrible; a 5s delay is easily noticeable by the end user, and is a deal breaker for many applications. In situations like that, 5s is a *huge* amount of time, especially so if you're writing to a disk or storage device which has its own caching!

Might it be possible to keep the 5s buffer for ordering transactions, but then commit that as a larger number of small transactions instead of one huge one? The number of transactions could even be based on how busy the system is - if there are a lot of reads coming in, I'd be quite happy to split that into 50 transactions. On 10GbE, 5s is potentially 6.25GB of data. Even split into 50 transactions you're writing 128MB at a time, and that sounds plenty big enough to me!

Either way, something needs to be done. If we move to ZFS our users are not going to be impressed with 5s delays on the storage system.

Finally, I do have one question for the ZFS guys: how does the L2ARC interact with this? Are reads from the L2ARC blocked, or will they happen in parallel with the writes to the main storage? I suspect that a large L2ARC (potentially made up of SSD disks) would eliminate this problem the majority of the time. -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] auto snapshots 0.12
Thanks Tim, do you know which build this is going to appear in? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?
Thank you ;) I mean: would reading compressed data be faster IF writing with compression is faster than non-compressed? Just like lzjb. But I can't understand why the read performance is generally unaffected by compression. Because decompression (lzjb, gzip) is algorithmically faster than compression, I think reading compressed data should need even less CPU time. So I don't agree with the blog's conclusion that read performance is generally unaffected by compression -- unless the ARC cached the data in the read test and there was no random read test?

My data set is text, about 320,000 text files or emails. The compression ratios are: lzjb 1.55x, gzip-1 2.54x, gzip-2 2.58x, gzip 2.72x, gzip-9 2.73x, for your curiosity :)

From: David Pacheco david.pach...@sun.com To: Chookiex hexcoo...@yahoo.com Cc: zfs-discuss@opensolaris.org Sent: Thursday, June 25, 2009 2:00:49 AM Subject: Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?

Chookiex wrote: Thank you for your reply. I had read the blog. The most interesting thing is WHY is there no performance improve when it set any compression?

There are many potential reasons, so I'd first try to identify what your current bandwidth limiter is. If you're running out of CPU on your current workload, for example, adding compression is not going to help performance. If this is over a network, you could be saturating the link. Or you might not have enough threads to drive the system to bandwidth. Compression will only help performance if you've got plenty of CPU and other resources but you're out of disk bandwidth. But even if that's the case, it's possible that compression doesn't save enough space that you actually decrease the number of disk I/Os that need to be done.

The compressed read I/O is less than uncompressed data, and decompress is faster than compress. Out of curiosity, what's the compression ratio? -- Dave

so if lzjb write is better than non-compressed, the lzjb read would be better than write? Is the ARC or L2ARC do any tricks? Thanks

*From:* David Pacheco david.pach...@sun.com *To:* Chookiex hexcoo...@yahoo.com *Cc:* zfs-disc...@opensolaris.org *Sent:* Wednesday, June 24, 2009 4:53:37 AM *Subject:* Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?

Chookiex wrote: Hi all. Because the property compression could decrease the file size, and the file IO will be decreased also. So, would it increase the ZFS I/O throughput with compression? for example: I turn on gzip-9, on a server with 2*4core Xeon, 8GB RAM. It could compress my files with compressratio 2.5x+. could it be? or I turn on lzjb, about 1.5x with the same files.

It's possible, but it depends on a lot of factors, including what your bottleneck is to begin with, how compressible your data is, and how hard you want the system to work compressing it. With gzip-9, I'd be shocked if you saw bandwidth improved. It seems more common with lzjb: http://blogs.sun.com/dap/entry/zfs_compression (skip down to the results) -- Dave

could it be? Is there anyone have a idea? thanks
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- David Pacheco, Sun Microsystems Fishworks. http://blogs.sun.com/dap/ -- David Pacheco, Sun Microsystems Fishworks. http://blogs.sun.com/dap/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
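As an aside, for anyone wanting to reproduce this kind of comparison, the per-dataset setting and the achieved ratio can be checked like this (a sketch; the dataset name is made up, and the 1.55x value is just the lzjb figure quoted above):

  zfs set compression=lzjb tank/mail
  # ... copy the data set in, then:
  zfs get compression,compressratio tank/mail
  NAME       PROPERTY       VALUE  SOURCE
  tank/mail  compression    lzjb   local
  tank/mail  compressratio  1.55x  -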
Re: [zfs-discuss] auto snapshots 0.12
Hi Ross, On Thu, 2009-06-25 at 04:24 -0700, Ross wrote: Thanks Tim, do you know which build this is going to appear in? I've actually no idea - SUNWzfs-auto-snapshot gets delivered by the Desktop consolidation, not me. I'm checking in with them to see what the story is. That said, it probably makes sense to wait till a build is available on pkg.opensolaris.org that includes the 'zfs list -d' support and I get a chance to do the (tiny bit of) work to start using that in the method script and push 0.12.1. - but any testing you feel like doing between now and then would be most welcome :-) cheers, tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS write I/O stalls
On Wed, 24 Jun 2009, Lejun Zhu wrote: There is a bug in the database about reads blocked by writes which may be related: http://bugs.opensolaris.org/view_bug.do?bug_id=6471212 The symptom is sometimes reducing queue depth makes read perform better.

This one certainly sounds promising. Since Matt Ahrens has been working on it for almost a year, it must be almost fixed by now. :-) I am not sure how the queue depth is managed, but it seems possible to detect when reads are blocked by bulk writes and make some automatic adjustments to improve the balance. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Regular panics: BAD TRAP: type=e
I'm having the same problems. Approximately every 1-9 hours it crashes, and the backtrace is exactly the same as posted here. The machine ran b98 rock-solid for a long time... Anyone have a clue where to start? -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
if those servers are on physical boxes right now i'd do some perfmon caps and add up the iops.

Using perfmon to get a sense of what is required is a good idea. Use the 95th percentile to be conservative. The counters I have used are in the Physical Disk object. Don't ignore the latency counters either; in my book, anything consistently over 20ms or so is excessive. I run 30+ VMs on an Equallogic array with 14 SATA disks, broken up as two striped 6-disk raid5 sets (raid 50) with 2 hot spares. That array is, on average, about 25% loaded from an IO standpoint. Obviously my VMs are pretty light. And the EQL gear is *fast*, which makes me feel better about spending all of that money :).

Regarding ZIL usage, from what I have read you will only see benefits if you are using NFS backed storage, but that it can be significant. link?

From the ZFS Evil Tuning Guide (http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide): ZIL stands for ZFS Intent Log. It is used during synchronous write operations.

further down: If you've noticed terrible NFS or database performance on a SAN storage array, the problem is not with ZFS, but with the way the disk drivers interact with the storage devices. ZFS is designed to work with storage devices that manage a disk-level cache. ZFS commonly asks the storage device to ensure that data is safely placed on stable storage by requesting a cache flush. For JBOD storage, this works as designed and without problems. For many NVRAM-based storage arrays, a problem might come up if the array takes the cache flush request and actually does something rather than ignoring it. Some storage will flush their caches despite the fact that the NVRAM protection makes those caches as good as stable storage. ZFS issues infrequent flushes (every 5 seconds or so) after the uberblock updates. The problem here is fairly inconsequential. No tuning is warranted here. ZFS also issues a flush every time an application requests a synchronous write (O_DSYNC, fsync, NFS commit, and so on). The completion of this type of flush is waited upon by the application and impacts performance. Greatly so, in fact. From a performance standpoint, this neutralizes the benefits of having an NVRAM-based storage.

When I was testing iSCSI vs. NFS, it was clear iSCSI was not doing sync, NFS was. Here are some zpool iostat numbers.

iSCSI testing using iometer with the RealLife workload (65% read, 60% random, 8k transfers - see the link in my previous post) - it is clear that writes are being cached in RAM, and then spun off to disk.
# zpool iostat data01 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      55.5G  20.4T    691      0  4.21M      0
data01      55.5G  20.4T    632      0  3.80M      0
data01      55.5G  20.4T    657      0  3.93M      0
data01      55.5G  20.4T    669      0  4.12M      0
data01      55.5G  20.4T    689      0  4.09M      0
data01      55.5G  20.4T    488  1.77K  2.94M  9.56M
data01      55.5G  20.4T     29  4.28K   176K  23.5M
data01      55.5G  20.4T     25  4.26K   165K  23.7M
data01      55.5G  20.4T     20  3.97K   133K  22.0M
data01      55.6G  20.4T    170  2.26K  1.01M  11.8M
data01      55.6G  20.4T    678      0  4.05M      0
data01      55.6G  20.4T    625      0  3.74M      0
data01      55.6G  20.4T    685      0  4.17M      0
data01      55.6G  20.4T    690      0  4.04M      0
data01      55.6G  20.4T    679      0  4.02M      0
data01      55.6G  20.4T    664      0  4.03M      0
data01      55.6G  20.4T    699      0  4.27M      0
data01      55.6G  20.4T    423  1.73K  2.66M  9.32M
data01      55.6G  20.4T     26  3.97K   151K  21.8M
data01      55.6G  20.4T     34  4.23K   223K  23.2M
data01      55.6G  20.4T     13  4.37K  87.1K  23.9M
data01      55.6G  20.4T     21  3.33K   136K  18.6M
data01      55.6G  20.4T    468    496  2.89M  1.82M
data01      55.6G  20.4T    687      0  4.13M      0

Testing against NFS shows writes to disk continuously.

NFS Testing
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      59.6G  20.4T      57216      352K  1.74M
data01      59.6G  20.4T     41     21   660K  2.74M
data01      59.6G  20.4T     44     24   655K  3.09M
data01      59.6G  20.4T     41     23   598K  2.97M
data01      59.6G  20.4T     34     33   552K  4.21M
data01      59.6G  20.4T     46     24   757K  3.09M
data01      59.6G  20.4T     39     24   593K  3.09M
data01      59.6G  20.4T     45     25   687K  3.22M
data01      59.6G  20.4T     45     23   683K  2.97M
data01      59.6G  20.4T     33     23   492K  2.97M
data01      59.6G  20.4T     16     41   214K  1.71M
data01      59.6G  20.4T      3  2.36K  53.4K  30.4M
data01      59.6G  20.4T      1  2.23K  20.3K  29.2M
data01
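A quick way to confirm what the iostat pattern suggests is to count how often writes actually reach the ZIL as synchronous commits while each test runs. A sketch using the fbt provider (zil_commit is the kernel entry point for fsync/O_DSYNC/NFS-commit style requests; per-second counts near zero during the iSCSI run and steadily non-zero during the NFS run would match the behaviour above):

  dtrace -n 'fbt::zil_commit:entry { @[execname] = count(); } tick-1s { printa(@); trunc(@); }'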
Re: [zfs-discuss] [storage-discuss] Backups
I think I am getting closer to ideas as to how to back this up. I will do as you said to back up the OS -- take an image or something of that nature. I will take a full backup of the virtual machines every one to three months; however, the data that the VMs are working with will be mounted separately, so that if a virtual machine goes down, all that is needed is to restore the last backup of the VM and mount the storage, and we should be up and running.

Now my only worry is how to back up the data that the VMs are accessing. I guess my question is this: say I take a full backup every x days -- say 7, so weekly backups -- and I then take snapshots throughout the week. Then something happens and there is a flood or something. Once I have all the hardware and that side of things going, can I restore from that full backup and then apply the snapshots to it? Will I then be up to yesterday, backup wise, or are those snapshots useless and I am only up to last week?

Thanks for helping! Greg -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
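The short answer is yes, as long as the snapshot data is replicated somewhere that survives the flood -- snapshots that only exist on the drowned pool are lost with it. A rough sketch of the usual send/receive pattern (pool, dataset, and file names here are made up for illustration):

  # weekly full, streamed off-site
  zfs snapshot tank/vmdata@full-week26
  zfs send tank/vmdata@full-week26 > /offsite/vmdata-full-week26.zfs

  # daily incrementals against that full
  zfs snapshot tank/vmdata@inc-tue
  zfs send -i tank/vmdata@full-week26 tank/vmdata@inc-tue > /offsite/vmdata-inc-tue.zfs

  # restore on the rebuilt box: the full first, then the increments in order
  zfs receive newpool/vmdata < /offsite/vmdata-full-week26.zfs
  zfs receive newpool/vmdata < /offsite/vmdata-inc-tue.zfs

With that in place you are back to the most recent increment you managed to send off-site, not just last week's full.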
Re: [zfs-discuss] ZFS write I/O stalls
On Thu, 25 Jun 2009, Ross wrote: But the unintended side effect of this is that ZFS's attempt to optimize writes will causes jerky read and write behaviour any time you have a large amount of writes going on, and when you should be pushing the disks to 100% usage you're never going to reach that as it's always going to have 5s of inactivity, followed by 5s of running the disks flat out. In fact, I wonder if it's a simple as the disks ending up doing 5s of reads, a delay for processing, 5s of writes, 5s of reads, etc... It's probably efficient, but it's going to *feel* horrible, a 5s delay is easily noticeable by the end user, and is a deal breaker for many applications. Yes, 5 seconds is a long time. For an application which mixes computation with I/O it is not really acceptable for read I/O to go away for up to 5 seconds. This represents time that the CPU is not being used, and a time that the application may be unresponsive to the user. When compression is used the impact is different, but the compression itself consumes considerable CPU (and quite abruptly) so that other applications (e.g. X11) stop responding during the compress/write cycle. The read problem is one of congestion. If I/O is congested with massive writes, then reads don't work. It does not really matter how fast your storage system is. If the 5 seconds of buffered writes are larger than what the device driver and storage system buffering allows for, then the I/O channel will be congested. As an example, my storage array is demonstrated to be able to write 359MB/second but ZFS will blast data from memory as fast as it can, and the storage path can not effectively absorb 1.8GB (359*5) of data since the StorageTek 2500's internal buffers are much smaller than that, and fiber channel device drivers are not allowed to consume much memory either. To make matters worse, I am using ZFS mirrors so the amount of data written to the array in those five seconds is doubled to 3.6GB. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
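For what it's worth, builds with the reworked write throttle expose a tunable that caps how much dirty data a single txg will accept, which shrinks each write burst at some cost in aggregate throughput. A sketch only -- the 384 MB figure is purely illustrative and the tunable is unsupported, so verify it exists on your build before relying on it:

  # /etc/system: cap dirty data per txg at ~384 MB (0x18000000 bytes)
  set zfs:zfs_write_limit_override = 0x18000000

  # or set it on a live system with mdb (decimal value, takes effect immediately)
  echo zfs_write_limit_override/W0t402653184 | mdb -kw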
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
On Wed, Jun 24 at 18:43, Bob Friesenhahn wrote: On Wed, 24 Jun 2009, Eric D. Mudama wrote: The main purpose for using SSDs with ZFS is to reduce latencies for synchronous writes required by network file service and databases. In the available 5 months ago category, the Intel X25-E will write sequentially at ~170MB/s according to the datasheets. That is faster than most, if not all, rotating media today.

Sounds good. Is that after the whole device has been re-written a few times, or just when you first use it?

Based on the various review sites, some tests experience a temporary performance decrease when performing sequential IO over the top of previously randomly written data, which resolves in some short time period. I am not convinced that simply writing the devices makes them slower. Actual performance will be workload specific, YMMV.

How many of these devices do you own and use?

I own two of them personally, and work with many every day.

Seagate Cheetah drives can now support a sustained data rate of 204MB/second. That is with 600GB capacity rather than 64GB and at a similar price point (i.e. 10X less cost per GB). Or you can just RAID-0 a few cheaper rotating rust drives and achieve a huge sequential data rate.

True. In $ per sequential GB/s, rotating rust still wins by far. However, your comment about all flash being slower than rotating media at sequential writes was mistaken. Even at 10x the price, if you're working with a dataset that needs random IO, the IOPS per dollar from flash can be significantly greater than any amount of rust, and typically with much lower power consumption to boot. Obviously the primary benefits of SSDs aren't in sequential reads/writes, but they're not necessarily complete dogs there either.

--eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?
Chookiex wrote: thank you ;) I mean that it would be faster in reading compressed data IF the write with compression is faster than non-compressed? Just like lzjb. Do you mean that it would be faster to read compressed data than uncompressed data, or it would be faster to read compressed data than to write it? But i can't understand why the read performance is generally unaffected by compression? Because the uncompression (lzjb, gzip) is faster than compression in algorithm, so I think reading the compressing data would need more less CPU time. So the conclusion in the blog that read performance is generally unaffected by compression, I'm not agreed with it. Except the ARC cached the data in the read test and there are no random read test? My comment was just an empirical observation: in my experiments, read time was basically unaffected. I don't believe this was a result of ARC caching because I constructed the experiments to avoid that altogether by using working sets larger than the ARC and streaming through the data. In my case the system's read bandwidth wasn't a performance limiter. We know this because the write bandwidth was much higher (see the graphs), and we were writing twice as much data as we were reading (because we were mirroring). So even if compression was decreasing the amount of I/O that was done on the read side, other factors (possibly the number of clients) limited the bandwidth we could achieve before we got to a point where compression would have made any difference. -- Dave My data is text data set, about 320,000 text files or emails. The compression ratio is: lzjb 1.55x gzip-1 2.54x gzip-2 2.58x gzip 2.72x gzip-9 2.73x for your curiosity :) *From:* David Pacheco david.pach...@sun.com *To:* Chookiex hexcoo...@yahoo.com *Cc:* zfs-discuss@opensolaris.org *Sent:* Thursday, June 25, 2009 2:00:49 AM *Subject:* Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput? Chookiex wrote: Thank you for your reply. I had read the blog. The most interesting thing is WHY is there no performance improve when it set any compression? There are many potential reasons, so I'd first try to identify what your current bandwidth limiter is. If you're running out of CPU on your current workload, for example, adding compression is not going to help performance. If this is over a network, you could be saturating the link. Or you might not have enough threads to drive the system to bandwidth. Compression will only help performance if you've got plenty of CPU and other resources but you're out of disk bandwidth. But even if that's the case, it's possible that compression doesn't save enough space that you actually decrease the number of disk I/Os that need to be done. The compressed read I/O is less than uncompressed data, and decompress is faster than compress. Out of curiosity, what's the compression ratio? -- Dave so if lzjb write is better than non-compressed, the lzjb read would be better than write? Is the ARC or L2ARC do any tricks? Thanks *From:* David Pacheco david.pach...@sun.com mailto:david.pach...@sun.com *To:* Chookiex hexcoo...@yahoo.com mailto:hexcoo...@yahoo.com *Cc:* zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org *Sent:* Wednesday, June 24, 2009 4:53:37 AM *Subject:* Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput? Chookiex wrote: Hi all. Because the property compression could decrease the file size, and the file IO will be decreased also. 
So, would it increase the ZFS I/O throughput with compression? for example: I turn on gzip-9,on a server with 2*4core Xeon, 8GB RAM. It could compress my files with compressratio 2.5x+. could it be? or I turn on lzjb, about 1.5x with the same files. It's possible, but it depends on a lot of factors, including what your bottleneck is to begin with, how compressible your data is, and how hard you want the system to work compressing it. With gzip-9, I'd be shocked if you saw bandwidth improved. It seems more common with lzjb: http://blogs.sun.com/dap/entry/zfs_compression (skip down to the results) -- Dave could it be? Is there anyone have a idea? thanks ___ zfs-discuss mailing list zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- David Pacheco, Sun Microsystems Fishworks. http://blogs.sun.com/dap/ -- David Pacheco, Sun Microsystems Fishworks.
[zfs-discuss] unable to import zfs pool
Hi, I had a zfs pool which I exported before our SAN maintenance and PowerPath upgrade, but now, after the PowerPath upgrade and maintenance, I'm unable to import the pool. It gives the following errors:

# zpool import
  pool: emcpool1
    id: 5596268873059055768
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-3C
config:

        emcpool1      UNAVAIL  insufficient replicas
          emcpower0c  UNAVAIL  cannot open

# zpool import -f emcpool1
cannot import 'emcpool1': invalid vdev configuration

Any idea what could be the reason for this? -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
Could it be possible that your path changed? Just run format, press CTRL+D, and look whether emcpower0c is now located somewhere else. regards daniel

Ketan no-re...@opensolaris.org writes: Hi, I had a zfs pool which I exported before our SAN maintenance and powerpath upgrade, but now after the powerpath upgrade and maintenance I'm unable to import the pool. It gives the following errors: # zpool import pool: emcpool1 id: 5596268873059055768 state: UNAVAIL status: One or more devices are missing from the system. action: The pool cannot be imported. Attach the missing devices and try again. see: http://www.sun.com/msg/ZFS-8000-3C config: emcpool1 UNAVAIL insufficient replicas emcpower0c UNAVAIL cannot open # zpool import -f emcpool1 cannot import 'emcpool1': invalid vdev configuration any idea what could be the reason for this?

-- disy Informationssysteme GmbH Daniel Priem Network and Systems Administrator Tel: +49 721 1 600 6000, Fax: -605, E-Mail: daniel.pr...@disy.net Discover smart solutions on our new website: www.disy.net Registered office: Erbprinzenstr. 4-12, 76133 Karlsruhe Commercial register: Amtsgericht Mannheim, HRB 107964 Managing director: Claus Hofmann - Environment . Reporting . GIS -
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
No idea whether the path changed or not, but the following is the output from my format, and nothing has changed:

AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /p...@0/p...@0/p...@2/s...@0/s...@0,0
       1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /p...@0/p...@0/p...@2/s...@0/s...@1,0
       2. c3t5006016841E0A08Dd0 <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016841e0a08d,0
       3. c3t5006016041E0A08Dd0 <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016041e0a08d,0
       4. c3t5006016041E0A08Dd1 <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016041e0a08d,1
       5. c3t5006016841E0A08Dd1 <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016841e0a08d,1
       6. c5t5006016141E0A08Dd0 <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016141e0a08d,0
       7. c5t5006016941E0A08Dd0 <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016941e0a08d,0
       8. c5t5006016141E0A08Dd1 <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016141e0a08d,1
       9. c5t5006016941E0A08Dd1 <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016941e0a08d,1
      10. emcpower0a <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /pseudo/e...@0
      11. emcpower1a <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /pseudo/e...@1
Specify disk (enter its number): Specify disk (enter its number): r...@essapl020-u006 #
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
Ketan no-re...@opensolaris.org writes: no idea path changed or not .. but following is output from my format .. and nothing has changed AVAILABLE DISK SELECTIONS: 0. c1t0d0 SUN146G cyl 14087 alt 2 hd 24 sec 848 /p...@0/p...@0/p...@2/s...@0/s...@0,0 1. c1t1d0 SUN146G cyl 14087 alt 2 hd 24 sec 848 /p...@0/p...@0/p...@2/s...@0/s...@1,0 2. c3t5006016841E0A08Dd0 DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016841e0a08d,0 3. c3t5006016041E0A08Dd0 DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016041e0a08d,0 4. c3t5006016041E0A08Dd1 DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016041e0a08d,1 5. c3t5006016841E0A08Dd1 DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /p...@0/p...@0/p...@8/p...@0/p...@2/SUNW,q...@0/f...@0,0/s...@w5006016841e0a08d,1 6. c5t5006016141E0A08Dd0 DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016141e0a08d,0 7. c5t5006016941E0A08Dd0 DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016941e0a08d,0 8. c5t5006016141E0A08Dd1 DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016141e0a08d,1 9. c5t5006016941E0A08Dd1 DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /p...@0/p...@0/p...@8/p...@0/p...@a/SUNW,q...@0/f...@0,0/s...@w5006016941e0a08d,1 10. emcpower0a DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /pseudo/e...@0 11. emcpower1a DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /pseudo/e...@1 Specify disk (enter its number): Specify disk (enter its number): r...@essapl020-u006 # reading your first post status: One or more devices are missing from the system. action: The pool cannot be imported. Attach the missing devices and try again. see: http://www.sun.com/msg/ZFS-8000-3Cconfig: emcpool1 UNAVAIL insufficient replicas emcpower0c UNAVAIL cannot open one or more devices are really missing. check your connection to the emc again regards daniel ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
That's the problem: this system has just 2 LUNs assigned, and both are present, as you can see from the format output:

      10. emcpower0a <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /pseudo/e...@0
      11. emcpower1a <DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16>
          /pseudo/e...@1

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
Ketan no-re...@opensolaris.org writes: thats the problem this system has just 2 LUNs assigned and both are present as you can see from format output 10. emcpower0a DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890 /pseudo/e...@0 11. emcpower1a DGC-RAID5-0326 cyl 51198 alt 2 hd 256 sec 16 /pseudo/e...@1

Ahhh, so the path has changed: your old path was emcpower0c, and now you have emcpower0a and emcpower1a. This config is cached somewhere. I am not sure, but IIRC you can clear the cache and then activate the pool. Can somebody else here jump in and point him to the right URL? regards daniel
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
jl == James Lever j...@jamver.id.au writes:

jl I thought they were both closed source

yes, both are closed source / proprietary. If you are really confused and not just trying to pick a dictionary fight, I can start saying ``closed source / proprietary'' on Solaris lists from now on. On Linux lists, ``proprietary'' is clear enough, but maybe the people around here are different.

jl and that the LSI chipset specifications were proprietary.

shrug I don't know about specifications, but I do know that Linux has an open source driver for the 1068, and Solaris has an open source driver for the 1078. Getting source without specifications is a problem, though, yes, if you want to track down a bug in the driver or write a driver for another OS.

The other problem is, with both chips but especially with the 1078, it sounds like these cards are very ``firmware'' heavy, and the firmware is proprietary. This causes the complaints here that 'hd' (smartctl equivalent) doesn't work, and that with PERC/1078 they have to make RAID0's of each disk with LSI labels on the disk, which blocks moving the disk from one controller to another---meaning a broken controller could potentially toast your whole zpool no matter what disk redundancy you had, unless you figure out some way to escape the trap. If not for the ``closed-source / proprietary'' firmware, these two problems could never persist.

so, there is still no SATA driver for Solaris that:

* is open-source. like a fully-open stack, not just ``here look! here is some source. is that a rabbit over there?'' open-source meaning I can add smartctl or DVD writer or NCQ support without bumping into some strange blob that stops me. open-source meaning I can swap out a disk without having to run any proprietary code to ``bless'' the disk first. no BIOS bluescreen garbage either.
* supports NCQ and hotplug
* performs well and doesn't have a lot of bugs, like ``freezes'' and so on
* works on x86 and SPARC
* comes in card form so it can achieve high port density

on Linux, both the Marvell and LSI 1068 drivers come close to or meet all these. (smartctl DOES work with Linux's open source 1068 driver.) Sun has more leverage with LSI than Linux, not less, because they are an actual customer of LSI's chips for the hardware they sell---they even ditched Marvell for LSI!---yet they do worse on driver openness negotiation and then try to blame LSI's whim, and tell some random schmuck user to ``go complain to LSI'' when we are not LSI's customer, Sun is. The issue gets more complicated, but not better, IMHO.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
sm == Scott Meilicke no-re...@opensolaris.org writes:

sm Some storage will flush their caches despite the fact that the
sm NVRAM protection makes those caches as good as stable
sm storage. [...] ZFS also issues a flush every time an
sm application requests a synchronous write (O_DSYNC, fsync, NFS
sm commit, and so on). [...] this neutralizes the benefits of
sm having an NVRAM-based storage.

if the external RAID array or the solaris driver is broken, yes. If not broken, the NVRAM should provide an extra-significant speed boost for exactly the case of frequent synchronous writes. Isn't that section of the evil tuning guide you're quoting actually about checking if the NVRAM/driver connection is working right or not?

sm When I was testing iSCSI vs. NFS, it was clear iSCSI was not
sm doing sync, NFS was.

I wonder if this is a bug in iSCSI, in either the VMWare initiator or the Sun target. With VMs there shouldn't be any opening and closing of files to provoke an extra sync on NFS, only read, write, and sync to the middle of big files, so I wouldn't think NFS should do any more or less syncing than iSCSI.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
The zpool cache is in /etc/zfs/zpool.cache, or it can be viewed with zdb -C, but in my case it's blank :-( -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
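When the cache is empty or stale, the usual next step is to point the import at a device directory explicitly so the labels get rescanned (a sketch; the PowerPath pseudo devices normally appear under /dev/dsk as emcpowerNx):

  # let zpool scan /dev/dsk itself and report what it can assemble
  zpool import -d /dev/dsk

  # then import the pool by name from that directory
  zpool import -d /dev/dsk emcpool1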
Re: [zfs-discuss] unable to import zfs pool
And regarding the path, my other system has the same one and it's working fine -- see the output below:

# zpool status
  pool: emcpool1
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        emcpool1      ONLINE       0     0     0
          emcpower0c  ONLINE       0     0     0

errors: No known data errors

      10. emcpower0a <DGC-RAID5-0326 cyl 65533 alt 2 hd 16 sec 890>
          /pseudo/e...@0
      11. emcpower1a <DGC-RAID 5-0326-300.00GB>
          /pseudo/e...@1
Specify disk (enter its number): Specify disk (enter its number): r...@essapl020-u008 # -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
The situation regarding lack of open source drivers for these LSI 1068/1078-based cards is quite scary. And did I understand you correctly when you say that these LSI 1068/1078 drivers write labels to drives, meaning you can't move drives from an LSI controlled array to another arbitrary array due to these labels? If this is the case then surely my best bet would be to go for the non-LSI controllers -- e.g. the AOC-SAT2-MV8 instead, which I presume does not write labels to the array drives? Please correct me if I have misunderstood. Cheers, Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] auto snapshots 0.12
Thanks Tim! -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
sb == Simon Breden no-re...@opensolaris.org writes:

sb The situation regarding lack of open source drivers for these
sb LSI 1068/1078-based cards is quite scary.

meh, I dunno. The amount of confusion is a little scary, I guess.

sb And did I understand you correctly when you say that these LSI
sb 1068/1078 drivers write labels to drives,

no, incorrect. I'm using a 1068 (``closed-source / proprietary driver''), and it doesn't write such labels. The firmware piece is big, so not all 1068s are necessarily the same: I think some are capable of RAID0/RAID1, but so far I've not heard of a 1068 demanding LSI labels, and mine doesn't.

The LSI 1078 (PERC) with the open-source x86-only driver is the one with the big ``closed-source / proprietary'' firmware blob running on the card itself. Others have reported this blob demands LSI labels on the disks. I don't have one. Who knows, maybe you can cross-flash some weird firmware from some strange variant of the card that doesn't need LSI labels on each disk, or maybe some binary blob config tool will flip a magic undocumented switch inside the card to make it JBOD-able. I don't like to deal in such circus-hoop messes unless someone else can do the work and tell me exactly how.

sb go for the non-LSI controllers -- e.g. the AOC-SAT2-MV8

no, you misunderstood, because there are two kinds of LSI card with two different drivers. Compared to Marvell, the LSI 1068 has a cheaper bus (PCIe), performs better, and seems to have fewer bugs (ex. 6787312 is a duplicate of a secret Marvell bug), and its proprietary driver includes a SPARC object. The Marvell controller is still a ``closed-source / proprietary'' driver (Linux driver for the same chip: open source), so you gain nothing there. The one thing Marvell might gain you is its use of the SATA framework, so smartctl/hd may be closer to working. On Linux both cards use the uniform SCSI framework, so smartctl works.

I have both the AOC-SAT2-MV8 and the AOC-USAS-L8i and suggest the latter. You have to unscrew the reverse-polarity card-edge bracket and buy some octopus cables from thenerds.net or adaptec or similar, is all. The AOC-USAS-L8i works with these cables among others: http://www.thenerds.net/3WARE.AMCC_Serial_Attached_SCSI_SAS_Internal_Cable.CBLSFF8087OCF10M.html
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
On Fri, Jun 26, 2009 at 4:11 AM, Eric D. Mudama edmud...@bounceswoosh.orgwrote: True. In $ per sequential GB/s, rotating rust still wins by far. However, your comment about all flash being slower than rotating at sequential writes was mistaken. Even at 10x the price, if you're working with a dataset that needs random IO, the $ per IOP from flash can be significantly greater than any amount of rust, and typically with much lower power consumption to boot. Obviously the primary benefits of SSDs aren't in sequential reads/writes, but they're not necessarilly complete dogs there either. It's all about iops. HDD can do about 300 iops, SSD can get up to 10k+ iops. On sequential writes obviously low iops is not a problem - 300 x 128kB is 40MB. But for small packet random sync NFS traffic 300 * 32kb is hardly a 1MB/s. Nicholas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
Miles Nordin wrote: sb == Simon Breden no-re...@opensolaris.org writes: sb The situation regarding lack of open source drivers for these sb LSI 1068/1078-based cards is quite scary. meh I dunno. The amount of confusion is a little scary, I guess. sb And did I understand you correctly when you say that these LSI sb 1068/1078 drivers write labels to drives, no incorrect. I'm using a 1068 (``closed-source / proprietary driver''), and it doesn't write such labels. I think the confusion is because the 1068 can do hardware RAID, it can and does write its own labels, as well as reserve space for replacements of disks with slightly different sizes. But that is only one mode of operation. Nit: the definition of proprietary is relating to ownership. One could argue that Linus still owns Linux since he has such strong control over what is accepted in the Linux kernel :-) Similarly, one could argue that a forker would own the fork. In other words, open source and proprietary are not mutually exclusive, nor is closed source a synonym for proprietary. You say tomato, I say 'mater. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] unable to import zfs pool
Thanks to all for the efforts, but I was able to import the zpool after disabling the first HBA card. I don't know the reason for this, but now the pool is imported and no disks were lost :-) -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
Miles, thanks for helping clear up the confusion surrounding this subject! My decision is now as above: for my existing NAS to leave the pool as-is, and seek a 2+ SATA port card for the 2-drive mirror for 2 x 30GB SATA boot SSDs that I want to add. For the next NAS build later on this summer, I will go for an LSI 1068-based SAS/SATA configuration based on a PCIe expansion slot, rather than the ageing PCI-X slots. Using PCIe instead of PCI-X also opens up a load more possible motherboards, although as I want ECC support this still limits choices for mobos. I was thinking of using something like a Xeon E5504 (Nehalem) in the new NAS, and I've been hunting for a good, highly compatible mobo that will give the least aggro (trouble) with OpenSolaris, and this one looks good as it's pretty much totally Intel chipsets, and it has an LSI SAS1068E, which I trust should be supported by Solaris, and it also has additional PCIe slots for additional future expansion, and basic onboard graphics chip, and dual Intel GbE NICs: SuperMicro X8STi-3F: http://www.supermicro.com/products/motherboard/Xeon3000/X58/X8STi-3F.cfm Any comments on this mobo welcome, plus suggestions for a possible PCIe-based 2+ port SATA card that is reliable and has a solid driver. Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
I think the confusion is because the 1068 can do hardware RAID, it can and does write its own labels, as well as reserve space for replacements of disks with slightly different sizes. But that is only one mode of operation. So, it sounds like if I use a 1068-based device, and I *don't* want it to write labels to the drives to allow easy portability of drives to a different controller, then I need to avoid the RAID mode of the device and instead force it to use JBOD mode. Is this easily selectable? I guess you just avoid the Use RAID mode option in the controller's BIOS or something? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
On Thu, 25 Jun 2009 15:43:17 -0700 (PDT) Simon Breden no-re...@opensolaris.org wrote: I think the confusion is because the 1068 can do hardware RAID, it can and does write its own labels, as well as reserve space for replacements of disks with slightly different sizes. But that is only one mode of operation. So, it sounds like if I use a 1068-based device, and I *don't* want it to write labels to the drives to allow easy portability of drives to a different controller, then I need to avoid the RAID mode of the device and instead force it to use JBOD mode. Is this easily selectable? I guess you just avoid the Use RAID mode option in the controller's BIOS or something? It's even simpler than that with the 1068 - just don't use raidctl or the bios to create raid volumes and you'll have a bunch of plain disks. No forcing required. James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog Kernel Conference Australia - http://au.sun.com/sunnews/events/2009/kernel ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
On Fri, Jun 26 at 8:55, James C. McPherson wrote: On Thu, 25 Jun 2009 15:43:17 -0700 (PDT) Simon Breden no-re...@opensolaris.org wrote: I think the confusion is because the 1068 can do hardware RAID, it can and does write its own labels, as well as reserve space for replacements of disks with slightly different sizes. But that is only one mode of operation. So, it sounds like if I use a 1068-based device, and I *don't* want it to write labels to the drives to allow easy portability of drives to a different controller, then I need to avoid the RAID mode of the device and instead force it to use JBOD mode. Is this easily selectable? I guess you just avoid the Use RAID mode option in the controller's BIOS or something? It's even simpler than that with the 1068 - just don't use raidctl or the bios to create raid volumes and you'll have a bunch of plain disks. No forcing required. Exactly. Worked as such out-of-the-box with no forcing of any kind for me. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS for iSCSI based SAN
Isn't that section of the evil tuning guide you're quoting actually about checking if the NVRAM/driver connection is working right or not?

Miles, yes, you are correct. I just thought it was interesting reading about how syncs and such work within ZFS. Regarding my NFS test, you remind me that my test was flawed, in that my iSCSI numbers were using the ESXi iSCSI SW initiator, while the NFS tests were performed from within the guest VM, not from ESX. I'll give ESX as the NFS client (vmdks on NFS) a go and get back to you. Thanks! Scott -- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
That sounds even better :) So what's the procedure to create a zpool using the 1068? Also, any special 'tricks /tips' / commands required for using a 1068-based SAS/SATA device? Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
On Thu, 25 Jun 2009 16:11:04 -0700 (PDT) Simon Breden no-re...@opensolaris.org wrote: That sounds even better :) So what's the procedure to create a zpool using the 1068? same as any other device: # zpool create poolname vdev vdev vdev Also, any special 'tricks /tips' / commands required for using a 1068-based SAS/SATA device? no James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog Kernel Conference Australia - http://au.sun.com/sunnews/events/2009/kernel ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
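Concretely, with the controller left in its default non-RAID state the disks just show up as ordinary targets, so it is nothing more exotic than this (device names are made up; use whatever format reports for your drives):

  # confirm no hardware RAID volumes are defined on the controller
  raidctl -l

  # then build the pool straight on the bare disks, e.g. a raidz of four
  zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0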
Re: [zfs-discuss] SPARC SATA, please.
OK, thanks James. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
Simon Breden wrote: I think the confusion is because the 1068 can do hardware RAID, it can and does write its own labels, as well as reserve space for replacements of disks with slightly different sizes. But that is only one mode of operation. So, it sounds like if I use a 1068-based device, and I *don't* want it to write labels to the drives to allow easy portability of drives to a different controller, then I need to avoid the RAID mode of the device and instead force it to use JBOD mode. Is this easily selectable? I guess you just avoid the Use RAID mode option in the controller's BIOS or something?

In the Sun onboard version of the 1068, JBOD mode is the default. I don't know about the add-in cards, but I suspect it's the same. Worst case, you press Ctrl-L (or whatever it prompts you for) at BIOS initialization and remove any RAID devices it has configured. With no RAID devices configured, it runs as a pure HBA (i.e. in JBOD mode). -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SPARC SATA, please.
Simon Breden wrote: Miles, thanks for helping clear up the confusion surrounding this subject! My decision is now as above: for my existing NAS to leave the pool as-is, and seek a 2+ SATA port card for the 2-drive mirror for 2 x 30GB SATA boot SSDs that I want to add. For the next NAS build later on this summer, I will go for an LSI 1068-based SAS/SATA configuration based on a PCIe expansion slot, rather than the ageing PCI-X slots. Using PCIe instead of PCI-X also opens up a load more possible motherboards, although as I want ECC support this still limits choices for mobos. I was thinking of using something like a Xeon E5504 (Nehalem) in the new NAS, and I've been hunting for a good, highly compatible mobo that will give the least aggro (trouble) with OpenSolaris, and this one looks good as it's pretty much totally Intel chipsets, and it has an LSI SAS1068E, which I trust should be supported by Solaris, and it also has additional PCIe slots for additional future expansion, and basic onboard graphics chip, and dual Intel GbE NICs: SuperMicro X8STi-3F: http://www.supermicro.com/products/motherboard/Xeon3000/X58/X8STi-3F.cfm Any comments on this mobo welcome, plus suggestions for a possible PCIe-based 2+ port SATA card that is reliable and has a solid driver. Simon Note that the X8STi-3F requires an L-bracket riser card to use both the PCI-E x16 and the x8 slot, which will be mounted horizontally (and, likely, limited to low-profile cards). You'd likely have to use a custom Supermicro case for this to work. Otherwise, you're limited to the PCI-E x16 slot, in a standard vertical orientation. The board does have an IPMI-based KVM ethernet port, but I have no idea if it's supported under Solaris. Also, remember, that you'll have to order a Xeon CPU with this, NOT the i7 CPU, in order to get ECC memory support. Personally, I'd go for an AMD-based system, which is about the same cost, and a much better board: http://www.supermicro.com/Aplus/motherboard/Opteron2000/MCP55/H8DM3-2.cfm (comes with a 1068E SAS controller, AND the nVidia MCP55-based 6-port SATA controller, no need for any more PCI-cards, and it supports the add-in card for remote KVM console; it's a dual-socket, Extended ATX size, though). The MCP55 is the chipset currently in use in the Sun X2200 M2 series of servers. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] how to convert zio-io_offset to disk block number?
I find that zio->io_offset is the absolute byte offset on the device, not a sector number. And if we need to use zdb -R to dump the block, we should use the offset (zio->io_offset - 0x400000).

2009/6/25 zhihui Chen zhch...@gmail.com

I use the following dtrace script to trace the position of one file on ZFS:

#!/usr/sbin/dtrace -qs
zio_done:entry
/((zio_t *)(arg0))->io_vd/
{
    zio = (zio_t *)arg0;
    printf("Offset:%x and Size:%x\n", zio->io_offset, zio->io_size);
    printf("vd:%x\n", (unsigned long)(zio->io_vd));
    printf("process name:%s\n", execname);
    tracemem(zio->io_data, 40);
    stack();
}

and I run the dd command: dd if=/export/dsk1/test1 bs=512 count=1. The dtrace script generates the following output:

Offset:657800 and Size:200
vd:ff02d6a1a700
process name:sched
zfs`zio_execute+0xa0
genunix`taskq_thread+0x193
unix`thread_start+0x8
^C

The tracemem output is the right content of file test1, which is a 512-byte text file. zpool status has the following output:

  pool: tpool
 state: ONLINE
 scrub: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        tpool     ONLINE       0     0     0
          c2t0d0  ONLINE       0     0     0

errors: No known data errors

My question is how to translate the zio->io_offset (0x657800, equal to decimal 6649856) output by dtrace into a block number on disk c2t0d0? I tried dd if=/dev/dsk/c2t0d0 of=text iseek=6650112 bs=512 count=1 for a check, but the result is not right. Thanks Zhihui
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
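To make the arithmetic concrete for the example above (a sketch, assuming the usual 4 MB of vdev labels plus boot block reserved at the front of each leaf vdev and 512-byte sectors; the exact zdb -R argument syntax differs between builds):

  # physical byte offset on the leaf vdev, as reported by the probe
  OFF=$((0x657800))                               # 6649856

  # offset to hand to zdb -R: strip the 4 MB label/boot reservation
  printf 'zdb offset: %x\n' $((OFF - 0x400000))   # 257800

  # equivalent 512-byte sector number on the device
  echo "sector: $((OFF / 512))"                   # 12988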