Re: [zfs-discuss] ZFS Performance Issue
William Fretts-Saxton wrote:
> Unfortunately, I don't know the record size of the writes. Is it as simple
> as looking at the size of a file, before and after a client request, and
> noting the difference in size? This is binary data, so I don't know if that
> makes a difference, but the average write size is a lot smaller than the
> file size.
>
> Should the recordsize be in place BEFORE data is written to the file
> system, or can it be changed after the fact? I might try a bunch of
> different settings for trial and error.
>
> The I/O is actually done by RRD4J, which is a round-robin database library.
> It is a Java version of 'rrdtool' which saves data into a binary format,
> but also "cleans up" the data according to its age, saving less of the
> older data as time goes on.

You should tune that at the application level; see https://rrd4j.dev.java.net/, down in the "performance issue" section. Try the "NIO" backend and use a smaller (2048?) record size.

--
This space was intended to be left blank.
Re: [zfs-discuss] Hardware RAID vs. ZFS RAID
Andy Lubel wrote:
> With my (COTS) LSI 1068 and 1078 based controllers I get consistently
> better performance when I export all disks as jbod (MegaCli -
> CfgEachDskRaid0).

Is that really 'all disks as JBOD', or is it 'each disk as a single-drive RAID0'?

It may not sound different on the surface, but I asked in another thread and others confirmed that if your RAID card has a battery-backed cache, giving ZFS many single-drive RAID0's is much better than JBOD (using the 'nocacheflush' option may even improve it more).

My understanding is that it's kind of like the best of both worlds. You get the higher number of spindles and vdevs for ZFS to manage, ZFS gets to do the redundancy, and the HW RAID cache gives virtually instant acknowledgement of writes, so that ZFS can be on its way. So I think many RAID0's is not always the same as JBOD.

That's not to say that even true JBOD doesn't still have an advantage over HW RAID. I don't know that for sure. But I think there is a use for HW RAID in ZFS configs, which wasn't always the theory I've heard.

> I have really learned not to do it this way with raidz and raidz2:
>
> #zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0
> c3t13d0 c3t14d0 c3t15d0

Why? I know creating raidz's with more than 9-12 devices is discouraged, but that doesn't cross that threshold. Is there a reason you'd split 8 disks up into 2 groups of 4? What experience led you to this? (Just so I don't have to repeat it. ;) )

-Kyle
Re: [zfs-discuss] ZFS Performance Issue
> -Setting zfs_nocacheflush, though, got me drastically increased
> throughput--client requests took, on average, less than 2 seconds each!
>
> So, in order to use this, I should have a storage array, w/battery backup,
> instead of using the internal drives, correct? I have the option of using
> a 6120 or 6140 array on this system so I might just try that out.

We use 3510 and 2540 arrays for Cyrus mail-stores which hold about 10K accounts each. I recommend going with dual controllers, though, for safety.

Our setups are really simple: put 2 array units on the SAN, make a pair of RAID-5 LUNs, then RAID-10 these LUNs together in ZFS.
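For illustration only, a minimal sketch of that layout with hypothetical LUN device names, pairing one RAID-5 LUN from each array in every mirror so each mirror spans both arrays:

#zpool create mailpool mirror c4t0d0 c5t0d0
#zpool add pool2 mirror c4t1d0 c5t1d0

(Substitute your real controller/LUN names; "mailpool" is made up.) ZFS then stripes across the mirror vdevs, which is the "RAID-10 on top of RAID-5 LUNs" layering described above.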
Re: [zfs-discuss] Hardware RAID vs. ZFS RAID
Actually the point is that there are situations in which the typical software stack will make the wrong decision, because it has no concept of the underlying hardware and no fault management structure that factors in more than just a single failed IO at a time...

Hardware RAID controllers are for the most part just an embedded system...the firmware is software...yes, there might be a few ASICs or FPGAs to help out with XOR calculations, but most use CPUs that have XOR engines nowadays. The difference is that the firmware is designed with intimate knowledge of how things are connected, silicon bugs, etc.

Running a Solaris box with ZFS to JBODs and exporting a file system to clients is approaching the same structure...with the exception that the Solaris box is usually a general-purpose box instead of an embedded controller. Once you add a second Solaris box and cluster them, you start rebuilding a redundant RAID array, but you would still be lacking the level of fault handling that is put into the typical hardware array...

I am sure this will be done at some point with the blade servers...it only makes sense...

-Joel

> You just made a great case for doing it in software.
>
> --Toby
Re: [zfs-discuss] ? Removing a disk from a ZFS Storage Pool
Dave Lowenstein wrote:
> Couldn't we move fixing "panic the system if it can't find a lun" up to
> the front of the line? That one really sucks.

That's controlled by the failmode property of the zpool, added in PSARC 2007/567, which was integrated in b77.

--
James Andrewartha
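For anyone looking for the knob, it's a normal pool property on builds that have it (pool name hypothetical):

#zpool set failmode=continue tank
#zpool get failmode tank

The supported values are wait (the default), continue, and panic.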
Re: [zfs-discuss] ZFS Performance Issue
> -Still playing with 'recsize' values but it doesn't seem to be doing
> much...I don't think I have a good understanding of what exactly is being
> written...I think the whole file might be overwritten each time because
> it's in binary format.

The other thing to keep in mind is that tunables like compression and recsize only affect newly written blocks. If you have a bunch of data that was already laid down on disk and then you change the tunable, only new blocks will get the new size. If you experiment with this, make sure all of your data has the same blocksize by copying it over again once you've changed the properties.

> -Setting zfs_nocacheflush, though, got me drastically increased
> throughput--client requests took, on average, less than 2 seconds each!
>
> So, in order to use this, I should have a storage array, w/battery
> backup, instead of using the internal drives, correct?

zfs_nocacheflush should only be used on arrays with a battery-backed cache. If you use this option on a disk and you lose power, there's no guarantee that your write successfully made it out of the cache.

A performance problem when flushing the cache of an individual disk implies that there's something wrong with the disk or its firmware. You can disable the write cache of an individual disk using format(1M). When you do this, ZFS won't lose any data, whereas enabling zfs_nocacheflush can lead to problems.

I'm attaching a DTrace script that will show the cache-flush times per vdev. Remove the zfs_nocacheflush tunable and re-run your test while using this DTrace script. If one particular disk takes longer than the rest to flush, this should show us. In that case, we can disable the write cache on that particular disk. Otherwise, we'll need to disable the write cache on all of the disks.

The script is attached as zfs_flushtime.d. Use format(1M) with the -e option to adjust the write_cache settings for SCSI disks.

-j

#!/usr/sbin/dtrace -Cs
/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright 2008 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.
 */

#define DKIOC                   (0x04 << 8)
#define DKIOCFLUSHWRITECACHE    (DKIOC|34)

fbt:zfs:vdev_disk_io_start:entry
/(args[0]->io_cmd == DKIOCFLUSHWRITECACHE) && (self->traced == 0)/
{
        self->traced = args[0];
        self->start = timestamp;
}

fbt:zfs:vdev_disk_ioctl_done:entry
/args[0] == self->traced/
{
        @a[stringof(self->traced->io_vd->vdev_path)] =
            quantize(timestamp - self->start);
        self->start = 0;
        self->traced = 0;
}
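Coming back to the recsize point above, a sketch of rewriting existing data so it picks up a new recordsize (dataset and paths are hypothetical):

#zfs set recordsize=8k tank/rrd
#cp -rp /tank/rrd/data /tank/rrd/data.new
#rm -rf /tank/rrd/data
#mv /tank/rrd/data.new /tank/rrd/data

Only the copied files get the new 8k records; anything left in place keeps whatever block size it was originally written with.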
Re: [zfs-discuss] Is swap still needed on c0d0s1 to get crash dumps?
Roman Morokutti wrote:
> Lori Alt writes in the netinstall README that a slice
> should be available for crash dumps. In order to get
> this done the following line should be defined within
> the profile:
>
> filesys c0[t0]d0s1 auto swap
>
> So my question is, is this still needed and how to
> access a crash dump if it happened?

The dumpadm command can be used to manage dump devices. The key is that today, you can't use ZFS for a dump device. So if you want to collect dumps, you'll need to use a non-ZFS device to do so. For many people, the historic use is a swap device on slice 1, which is also the default dump device. So yes, this will work, but no, it is not a requirement.

-- richard
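For example, to point the dump configuration at a swap device and check where dumps will land:

#dumpadm -d swap        (use a swap device, e.g. the c0[t0]d0s1 slice above)
#dumpadm                (show the current dump device and savecore directory)

After a panic, savecore(1M) writes the crash dump into the savecore directory shown by dumpadm, as unix.N and vmcore.N files.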
Re: [zfs-discuss] Hardware RAID vs. ZFS RAID
With my (COTS) LSI 1068 and 1078 based controllers I get consistently better performance when I export all disks as jbod (MegaCli - CfgEachDskRaid0).

I even went through all the loops and hoops with 6120's, 6130's and even some SGI storage, and the result was always the same: better performance exporting single disks than even the "ZFS" profiles within CAM.

---

'pool0':
#zpool create pool0 mirror c2t0d0 c2t1d0
#zpool add pool0 mirror c2t2d0 c2t3d0
#zpool add pool0 mirror c2t4d0 c2t5d0
#zpool add pool0 mirror c2t6d0 c2t7d0

'pool2':
#zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0
#zpool add pool2 raidz c3t12d0 c3t13d0 c3t14d0 c3t15d0

I have really learned not to do it this way with raidz and raidz2:

#zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0 c3t13d0 c3t14d0 c3t15d0

So when is thumper going to have an all-SAS option? :)

-Andy

On Feb 7, 2008, at 2:28 PM, Joel Miller wrote:
> Much of the complexity in hardware RAID is in the fault detection,
> isolation, and management. The fun part is trying to architect a
> fault-tolerant system when the suppliers of the components can not
> come close to enumerating most of the possible failure modes.
>
> What happens when a drive's performance slows down because it is
> having to go through internal retries more than others?
>
> What layer gets to declare a drive dead? What happens when you start
> declaring the drives dead one by one because they all seemed to
> stop responding but the problem is not really the drives?
>
> Hardware RAID systems attempt to deal with problems that are not
> always straightforward...Hopefully we will eventually get similar
> functionality in Solaris...
>
> Understand that I am a proponent of ZFS, but everything has its use.
>
> -Joel
Re: [zfs-discuss] nfs exporting nested zfs
On Thu, Feb 07, 2008 at 01:54:58PM -0800, Andrew Tefft wrote:
> Let's say I have a zfs called "pool/backups" and it contains two
> zfs'es, "pool/backups/server1" and "pool/backups/server2"
>
> I have sharenfs=on for pool/backups and it's inherited by the
> sub-zfs'es. I can then nfs mount pool/backups/server1 or
> pool/backups/server2, no problem.
>
> If I mount pool/backups on a system running Solaris Express build 81,

The NFSv3 client, and the NFSv4 client up to some older snv build (I forget which), will *not* follow the sub-mounts that exist on the server side. In recent snv builds the NFSv4 client will follow the sub-mounts that exist on the server side.

If you use the -hosts automount map (/net) then the NFSv3 client and older NFSv4 clients will mount the server-side sub-mounts, but only as they existed when the automount was made.

Nico
--
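On clients that don't follow the server-side sub-mounts, the usual workaround is to mount each child filesystem explicitly (server name and mount points hypothetical):

#mount -F nfs nfsserver:/pool/backups /backups
#mount -F nfs nfsserver:/pool/backups/server1 /backups/server1
#mount -F nfs nfsserver:/pool/backups/server2 /backups/server2

or to put the individual filesystems in an automounter map, which is essentially the "mount them individually" approach mentioned in the original post.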
Re: [zfs-discuss] nfs exporting nested zfs
Because of the mirror mount feature that was integrated into Solaris Express build 77. You can read about it on page 20 of the ZFS Admin Guide:

http://opensolaris.org/os/community/zfs/docs/zfsadmin.pdf

Cindy

Andrew Tefft wrote:
> Let's say I have a zfs called "pool/backups" and it contains two zfs'es,
> "pool/backups/server1" and "pool/backups/server2"
>
> I have sharenfs=on for pool/backups and it's inherited by the sub-zfs'es.
> I can then nfs mount pool/backups/server1 or pool/backups/server2, no
> problem.
>
> If I mount pool/backups on a system running Solaris Express build 81, I
> can see the contents of pool/backups/server1 and pool/backups/server2 as
> I'd expect. But when I mount pool/backups on Solaris 10 or Solaris 8, I
> just see empty directories for server1 and server2. And if I actually
> write there, the files go in /pool/backups (and they can be seen on the
> nfs server if I unmount the sub-zfs'es). And that's extra bad because if
> I reboot the nfs server, the sub-zfs'es fail to mount because their
> mountpoints are not empty, and so it won't come up in multi-user.
>
> (The whole idea here is that I really want just the one nfs mount, but I
> want to be able to separate the data into separate zfs'es.)
>
> So why does this work with the build 81 nfs client, and not others, and
> is it possible to make it work? Right now the number of sub-zfs'es is
> only a handful so I can mount them individually, but it's not the way I
> want it to work.
[zfs-discuss] nfs exporting nested zfs
Let's say I have a zfs called "pool/backups" and it contains two zfs'es, "pool/backups/server1" and "pool/backups/server2".

I have sharenfs=on for pool/backups and it's inherited by the sub-zfs'es. I can then nfs mount pool/backups/server1 or pool/backups/server2, no problem.

If I mount pool/backups on a system running Solaris Express build 81, I can see the contents of pool/backups/server1 and pool/backups/server2 as I'd expect. But when I mount pool/backups on Solaris 10 or Solaris 8, I just see empty directories for server1 and server2. And if I actually write there, the files go in /pool/backups (and they can be seen on the nfs server if I unmount the sub-zfs'es). And that's extra bad because if I reboot the nfs server, the sub-zfs'es fail to mount because their mountpoints are not empty, and so it won't come up in multi-user.

(The whole idea here is that I really want just the one nfs mount, but I want to be able to separate the data into separate zfs'es.)

So why does this work with the build 81 nfs client, and not others, and is it possible to make it work? Right now the number of sub-zfs'es is only a handful so I can mount them individually, but it's not the way I want it to work.
[zfs-discuss] UFS on zvol Cache Questions...
Hello,

I have a unique deployment scenario where the marriage of a ZFS zvol and UFS seems like a perfect match. Here is the list of feature requirements for my use case:

* snapshots
* rollback
* copy-on-write
* ZFS-level redundancy (mirroring, raidz, ...)
* compression
* filesystem cache control (control what's in and out)
* priming the filesystem cache (dd if=file of=/dev/null)
* control over the upper boundary of RAM consumed by the filesystem. This helps me to avoid contention between the filesystem cache and my application.

Before ZFS came along, I could achieve all but rollback, copy-on-write and compression through UFS plus some volume manager. I would like to use ZFS, but with ZFS I cannot prime the cache and I don't have the ability to control what is in the cache (e.g. like with the directio UFS option).

If I create a ZFS zvol and format it as a UFS filesystem, it seems like I get the best of both worlds. Can anyone poke holes in this strategy?

I think the biggest possible risk factor is if the ZFS zvol still uses the ARC cache. If this is the case, I may be double-dipping on the filesystem cache, e.g. the UFS filesystem uses some RAM and the ZFS zvol uses some RAM for filesystem cache. Is this a true statement, or does the zvol use a minimal amount of system RAM?

Lastly, if I were to try this scenario, does anyone know how to monitor the RAM consumed by the zvol and UFS? e.g. Is there a dtrace script for monitoring ZFS or UFS memory consumption?

Thanks in advance,
Brad
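For reference, the setup being described would look roughly like this (pool name, volume size and mount point are hypothetical):

#zfs create -V 50g tank/ufsvol
#newfs /dev/zvol/rdsk/tank/ufsvol
#mkdir /ufsvol
#mount -F ufs /dev/zvol/dsk/tank/ufsvol /ufsvol

ZFS properties such as compression and snapshots are then applied to tank/ufsvol, while UFS mount options (e.g. forcedirectio) control the caching behaviour of the filesystem layered on top.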
Re: [zfs-discuss] ZFS Performance Issue
To avoid making multiple posts, I'll just write everything here:

-Moving to nv_82 did not seem to do anything, so it doesn't look like fsync was the issue.

-Disabling the ZIL didn't do anything either.

-Still playing with 'recsize' values, but it doesn't seem to be doing much...I don't think I have a good understanding of what exactly is being written...I think the whole file might be overwritten each time because it's in binary format.

-Setting zfs_nocacheflush, though, got me drastically increased throughput--client requests took, on average, less than 2 seconds each!

So, in order to use this, I should have a storage array, w/battery backup, instead of using the internal drives, correct? I have the option of using a 6120 or 6140 array on this system so I might just try that out.
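For reference, the tunable in question goes in /etc/system and takes effect after a reboot -- and, as discussed elsewhere in this thread, should only be used when the pool sits entirely on battery-backed cache:

set zfs:zfs_nocacheflush = 1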
Re: [zfs-discuss] Hardware RAID vs. ZFS RAID
Much of the complexity in hardware RAID is in the fault detection, isolation, and management. The fun part is trying to architect a fault-tolerant system when the suppliers of the components can not come close to enumerating most of the possible failure modes.

What happens when a drive's performance slows down because it is having to go through internal retries more than others?

What layer gets to declare a drive dead? What happens when you start declaring the drives dead one by one because they all seemed to stop responding, but the problem is not really the drives?

Hardware RAID systems attempt to deal with problems that are not always straightforward...Hopefully we will eventually get similar functionality in Solaris...

Understand that I am a proponent of ZFS, but everything has its use.

-Joel
[zfs-discuss] NFS device IDs for snapshot filesystems
I notice that files within a snapshot show a different device ID to stat than the parent file does. But this is not true when mounted via NFS. Is this a limitation of the NFS client, or just what the ZFS fileserver is doing? Will this change in the future? With NFS4 mirror mounts?

--
Darren Dunham                 [EMAIL PROTECTED]
Senior Technical Consultant   TAOS    http://www.taos.com/
Got some Dr Pepper?           San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Re: [zfs-discuss] zfs send / receive between different opensolaris versions?
On Wed, 2008-02-06 at 13:42 -0600, Michael Hale wrote:
> Hello everybody,
>
> I'm thinking of building out a second machine as a backup for our mail
> spool where I push out regular filesystem snapshots, something like a
> warm/hot spare situation.
>
> Our mail spool is currently running snv_67 and the new machine would
> probably be running whatever the latest opensolaris version is (snv_77
> or later).
>
> My first question is whether or not zfs send / receive is portable
> between differing releases of opensolaris. My second question (kind
> of off topic for this list) is that I was wondering the difficulty
> involved in upgrading snv_67 to a later version of opensolaris given
> that we're running a zfs root boot configuration

For your first question, zfs(1) says:

  zfs upgrade [-r] [-V version] [-a | filesystem]

    Upgrades file systems to a new on-disk version. Once this is done,
    the file systems will no longer be accessible on systems running
    older versions of the software. "zfs send" streams generated from
    new snapshots of these file systems can not be accessed on systems
    running older versions of the software.

The format of the stream depends only on the zfs filesystem version at the time of the snapshot, and since streams are backwards compatible, a system with newer zfs bits can always receive an older snapshot. The current filesystem version is 3 (not to be confused with zpool, which is at 10), so it's unlikely to have changed recently.

The officially supported method for upgrading a zfs boot system is to BFU (which upgrades ON but breaks package support). However, you should be able to do an in-place upgrade with the zfs_ttinstall wrapper for ttinstall (the Solaris text installer). This means booting from CD/DVD (or netbooting) and then running the script:

http://opensolaris.org/jive/thread.jspa?threadID=46588&tstart=255

You will have to edit it to fit your zfs layout.

> --
> Michael Hale <[EMAIL PROTECTED]>
> Manager of Engineering Support, Enterprise Engineering Group
> Transcom Enhanced Services
> http://www.transcomus.com

-Albert
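A rough sketch of the send/receive side of that setup, with hypothetical pool, dataset and host names, assuming ssh between the two machines:

#zfs snapshot spool/mail@2008-02-07
#zfs send spool/mail@2008-02-07 | ssh backuphost zfs receive backup/mail

and for subsequent runs:

#zfs snapshot spool/mail@2008-02-08
#zfs send -i spool/mail@2008-02-07 spool/mail@2008-02-08 | ssh backuphost zfs receive backup/mail

As long as the receiving host runs the same or newer zfs bits, the stream from the older snv_67 sender should be accepted.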
[zfs-discuss] Lost intermediate snapshot; incremental backup still possible?
I keep my system synchronized to a USB disk from time to time. The script works by sending incremental snapshots to a pool on the USB disk, then deleting those snapshots from the source machine.

A botched script ended up deleting a snapshot that was not successfully received on the USB disk. Now I've lost the ability to send incrementally since the intermediate snapshot is lost. From what I gather, if I try to send a full snapshot, it will require deleting and replacing the dataset on the USB disk.

Is there any way around this?
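If both sides still share at least one common snapshot, incrementals can resume from that one. If not, the fallback is a full send to re-seed the USB pool; on bits where zfs receive supports -F, that should overwrite the existing dataset in place, otherwise destroy the destination dataset first (names hypothetical):

#zfs snapshot tank/data@resync
#zfs send tank/data@resync | zfs receive -F usbpool/data

Once the @resync snapshot exists on both sides, incremental (-i) sends work again.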
Re: [zfs-discuss] zpool destroy core dumps with unavailable iscsi device
Hi Ross,

On Thu, 2008-02-07 at 08:30 -0800, Ross wrote:
> While playing around with ZFS and iSCSI devices I've managed to remove
> an iscsi target before removing the zpool. Now any attempt to delete
> the pool (with or without -f) core dumps zpool.
>
> Any ideas how I get rid of this pool?

Yep, here's one way: zpool export the other pools on the system, then delete /etc/zfs/zpool.cache, reboot the machine, then do a zpool import for each of the other pools you want to keep.

cheers,
tim
--
Tim Foster, Sun Microsystems Inc, Solaris Engineering Ops
http://blogs.sun.com/timf
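Spelled out as commands (pool names hypothetical -- "badpool" being the one with the missing iSCSI target):

#zpool export goodpool1
#zpool export goodpool2
#rm /etc/zfs/zpool.cache
#reboot
...
#zpool import goodpool1
#zpool import goodpool2

After the reboot the damaged pool is simply never imported again, so there is nothing left to destroy.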
[zfs-discuss] zpool destroy core dumps with unavailable iscsi device
While playing around with ZFS and iSCSI devices I've managed to remove an iscsi target before removing the zpool. Now any attempt to delete the pool (with or without -f) core dumps zpool.

Any ideas how I get rid of this pool?
Re: [zfs-discuss] ZFS Performance Issue
Slight correction: 'recsize' must be a power of 2, so it would be 8192.
Re: [zfs-discuss] ZFS Performance Issue
RRD4J isn't a DB, per se, so it doesn't really have a "record" size. In fact, I don't even know whether, when data is written to the binary file, it is contiguous or not, so the amount written may not directly correlate to a proper record size.

I did run your command and found the size patterns you were talking about:

      462  java       409
     3320  java       409
     6819  java       409
        5  java      1227
        1  java      1692
       16  java      3243

"409" is the number of clients I tested, so I assume it means the largest write it makes is "6819". Is that bits or bytes? Does that mean I should try setting my recordsize equal to the lowest multiple of 512 GREATER than 6819? (14 x 512 = 7168)
Re: [zfs-discuss] ZFS Performance Issue
William,

It should be fairly easy to find the record size using DTrace. Take an aggregation of the writes happening (aggregate on size for all the write(2) system calls). This would give a fair idea of the IO size pattern.

Does RRD4J have a record size mentioned? Usually, if it is a database application, there is a record-size option when the DB is created (based on my limited knowledge about DBs).

Thanks and regards,
Sanjeev.

PS: Here is a simple script which just aggregates on the write size and executable name:

-- snip --
#!/usr/sbin/dtrace -s

syscall::write:entry
{
        wsize = (size_t) arg2;
        @write[wsize, execname] = count();
}
-- snip --

William Fretts-Saxton wrote:
> Unfortunately, I don't know the record size of the writes. Is it as simple
> as looking at the size of a file, before and after a client request, and
> noting the difference in size? This is binary data, so I don't know if
> that makes a difference, but the average write size is a lot smaller than
> the file size.
>
> Should the recordsize be in place BEFORE data is written to the file
> system, or can it be changed after the fact? I might try a bunch of
> different settings for trial and error.
>
> The I/O is actually done by RRD4J, which is a round-robin database
> library. It is a Java version of 'rrdtool' which saves data into a binary
> format, but also "cleans up" the data according to its age, saving less
> of the older data as time goes on.

--
Solaris Revenue Products Engineering,
India Engineering Center,
Sun Microsystems India Pvt Ltd.
Tel: x27521 / +91 80 669 27521
Re: [zfs-discuss] Hardware RAID vs. ZFS RAID
John-Paul Drawneek wrote:
| I guess a USB pendrive would be slower than a harddisk. Bad performance
| for the ZIL.

A "decent" pendrive of mine writes at 3-5MB/s. Sure, there are faster ones, but any desktop harddisk can write at 50MB/s. If you are *not* talking about consumer-grade pendrives, I can't comment.

--
Jesus Cea Avion                 [EMAIL PROTECTED]
http://www.argo.es/~jcea/       jabber / xmpp:[EMAIL PROTECTED]
Re: [zfs-discuss] ZFS Performance Issue
One thing I just observed is that the initial file size is 65796 bytes. When it gets an update, the file size remains at 65796. Is there a minimum file size?
Re: [zfs-discuss] ZFS Performance Issue
I just installed nv82, so we'll see how that goes. I'm going to try the recordsize idea above as well.

A note about UFS: I was told by our local admin guru that ZFS turns on write caching for disks, which is something that a UFS file system should not have turned on. So if I convert the ZFS file system to a UFS one, I could be giving UFS an unrealistic "boost" in performance, because it would still have the caching on.
Re: [zfs-discuss] ZFS Performance Issue
Unfortunately, I don't know the record size of the writes. Is it as simple as looking at the size of a file, before and after a client request, and noting the difference in size? This is binary data, so I don't know if that makes a difference, but the average write size is a lot smaller than the file size.

Should the recordsize be in place BEFORE data is written to the file system, or can it be changed after the fact? I might try a bunch of different settings for trial and error.

The I/O is actually done by RRD4J, which is a round-robin database library. It is a Java version of 'rrdtool' which saves data into a binary format, but also "cleans up" the data according to its age, saving less of the older data as time goes on.
[zfs-discuss] Is swap still needed on c0d0s1 to get crash dumps?
Lori Alt writes in the netinstall README that a slice should be available for crash dumps. In order to get this done, the following line should be defined within the profile:

filesys c0[t0]d0s1 auto swap

So my question is: is this still needed, and how do I access a crash dump if it happened?

Roman
[zfs-discuss] ZFS on Solaris and Mac Leopard
For some time now I have had a zfs pool, created (if I remember correctly) on my x86 OpenSolaris box with zfs version 6, which I can access on my Leopard Mac. I ran the ZFS beta on the Leopard beta with no problems at all. I've now installed the latest ZFS read/write build on Leopard and it works nicely read/write on my MacBook. It is a pool consisting of one whole disk.

The Leopard says:

# zpool import
  pool: space
    id: 123931456072276617
 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        space       ONLINE
          disk3     ONLINE

However, on Solaris the pool is found but considered damaged, on a 5.11 snv_64a sun4u sparc:

# zpool import
  pool: space
    id: 123931456072276617
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        space       UNAVAIL   insufficient replicas
          c4t0d0s0  UNAVAIL   corrupted data

and the disk is reported 'unknown' by format. Has anyone seen something similar?

The pool is still version 6, since I'd like to use it on standard Leopard, which has read-only ZFS limited to version 6. Could a cure be to upgrade (OS and/or zfs)?
Re: [zfs-discuss] MySQL, Lustre and ZFS
Not sure why you would want these 3 together, but Lustre and ZFS will work together in the Lustre 1.8 release. ZFS will be the backend filesystem for Lustre servers. See this:

http://wiki.lustre.org/index.php?title=Lustre_OSS/MDS_with_ZFS_DMU

Cheers,
-Atul

On Feb 7, 2008 8:39 AM, kilamanjaro <[EMAIL PROTECTED]> wrote:
> Hi all, Any thoughts on if and when ZFS, MySQL, and Lustre 1.8 (and
> beyond) will work together and be supported so by Sun?
>
> - Network Systems Architect
>   Advanced Digital Systems Internet

--
Atul Vidwansa
Cluster File Systems Inc.
http://www.clusterfs.com