Re: [zfs-discuss] Re: Re: Heavy writes freezing system
Rainer Heilke wrote:
>> If you plan on RAC, then ASM makes good sense. It is unclear (to me anyway)
>> if ASM over a zvol is better than ASM over a raw LUN.
>
> Hmm. I thought ASM was really the _only_ effective way to do RAC, but then, I'm not a DBA (and don't want to be ;-) We'll be just using raw LUNs. While the zvol idea is interesting, the DBAs are very particular about making sure the environment is set up in a way Oracle will support (and not hang up when we have a problem).

ASM is relatively new technology. Traditionally, OPS and RAC were built over raw devices, directly or as represented by cluster-aware logical volume managers. DBAs tend to not like raw, so Sun Cluster (Solaris Cluster) supports RAC over QFS, which is a very good solution. Some Sun Cluster customers run RAC over NFS, which also works surprisingly well. Meanwhile, Oracle continues to develop ASM to appease the DBAs who want filesystem-like solutions. IMHO, in the long run, Oracle will transition many customers to ASM, and this means that it probably isn't worth the effort to make a file system be the best for Oracle at the expense of other features and workloads.

-- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: Heavy writes freezing system
> If you plan on RAC, then ASM makes good sense. It is unclear (to me anyway)
> if ASM over a zvol is better than ASM over a raw LUN.

Hmm. I thought ASM was really the _only_ effective way to do RAC, but then, I'm not a DBA (and don't want to be ;-) We'll be just using raw LUNs. While the zvol idea is interesting, the DBAs are very particular about making sure the environment is set up in a way Oracle will support (and not hang up when we have a problem).

Rainer

This message posted from opensolaris.org
Re: [zfs-discuss] Re: Re: Heavy writes freezing system
If some aspect of the load is writing a large amount of data into the pool (through the memory cache, as opposed to the ZIL) and that leads to a frozen system, I think that a possible contributor is:

6429205 each zpool needs to monitor its throughput and throttle heavy writers

-r

Anantha N. Srirama writes:
> Bug 6413510 is the root cause. ZFS maestros please correct me if I'm quoting
> an incorrect bug.
Re: [zfs-discuss] Re: Re: Heavy writes freezing system
Rainer Heilke wrote On 01/17/07 15:44,:
> It turns out we're probably going to go the UFS/ZFS route, with 4 filesystems (the DB files on UFS with Directio). It seems that the pain of moving from a single-node ASM to a RAC'd ASM is great, and not worth it. The DBA group decided doing the migration to UFS for the DB files now, and then to a RAC'd ASM later, will end up being the easiest, safest route.
>
> Rainer
>
> Still curious as to if and when this bug will get fixed...

If you're referring to bug 6413510 that Anantha mentioned, then my earlier post today answered that:

> This problem was fixed in snv_48 last September and will be in S10_U4.

Neil
[zfs-discuss] Re: Re: Heavy writes freezing system
I did some straight up Oracle/ZFS testing but not on Zvols. I'll give it a shot and report back, next week is the earliest.
[zfs-discuss] Re: Re: Heavy writes freezing system
It turns out we're probably going to go the UFS/ZFS route, with 4 filesystems (the DB files on UFS with Directio). It seems that the pain of moving from a single-node ASM to a RAC'd ASM is great, and not worth it. The DBA group decided doing the migration to UFS for the DB files now, and then to a RAC'd ASM later, will end up being the easiest, safest route.

Rainer

Still curious as to if and when this bug will get fixed...
Re: [zfs-discuss] Re: Re: Heavy writes freezing system
> We had a 2TB filesystem. No matter what options I set explicitly, the
> UFS filesystem kept getting written with a 1 million file limit.
> Believe me, I tried a lot of options, and they kept getting set back
> on me.

The limit is documented as "1 million inodes per TB". So something must not have gone right. But many people have complained, and you could take the newfs source and fix the limitation.

The discontinuity when going from <1TB to over 1TB is appalling. (<1TB allows for 137 million inodes; >= 1TB allows for 1 million per TB.) The rationale is fsck time (but logging is forced anyway).

The 1 million limit is arbitrary and too low...

Casper
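The arithmetic behind "1 million inodes per TB" can be sketched as follows. This is only a back-of-the-envelope illustration, not newfs logic: the function name `approx_inodes` is made up for the example, and treating "1 TB" as 2^40 bytes and "1 million" as 2^20 is an assumption (in line with the remark elsewhere in the thread that the real numbers are powers of 2).

```python
# Assumption: binary units, so "1 million inodes per TB" means
# one inode per 2**20 bytes of filesystem space.
TIB = 1024 ** 4
MIB = 1024 ** 2


def approx_inodes(fs_bytes, bytes_per_inode):
    """Hypothetical helper: rough inode count for a given density."""
    return fs_bytes // bytes_per_inode


# A 2TB filesystem at one inode per MiB.
two_tb_inodes = approx_inodes(2 * TIB, MIB)
print(two_tb_inodes)
```

Under these assumptions a 2TB filesystem would be expected to end up with roughly 2 million inodes rather than 1 million, which matches the observation above that something must not have gone right.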
[zfs-discuss] Re: Re: Heavy writes freezing system
We had a 2TB filesystem. No matter what options I set explicitly, the UFS filesystem kept getting written with a 1 million file limit. Believe me, I tried a lot of options, and they kept getting set back on me. After a fair bit of poking around (Google, Sun's site, etc.) I found several other notes indicating that this was the limit for UFS file systems. (For the pedants, keep in mind we are talking computers, so the actual number will be some power of 2. "1 million" is an approximation.)

If someone has gotten around this under UFS, I'd be very interested--as an intellectual curiosity--in knowing what switches you passed to the mkfs/newfs command(s).

Rainer
Re: [zfs-discuss] Re: Re: Heavy writes freezing system
Anantha N. Srirama wrote On 01/17/07 08:32,:
> Bug 6413510 is the root cause. ZFS maestros please correct me if I'm quoting an incorrect bug.

Yes, Anantha is correct; that is the bug id, which could be responsible for more disk writes than expected.

Let me try to explain that bug. The ZIL, as described in http://blogs.sun.com/perrin, collects transactions in memory for all system calls until they are committed in a transaction group (txg) at the pool level. If a request arrives to force a particular file to stable storage (fsync or O_DSYNC), then the ZIL used to write out all in-memory transactions for the file system. This meant transactions unrelated to that file were written, including directory creations, renames, etc., which might be important in being able to re-create the file. However, it also pushed out user data for other files, which can be voluminous.

The problem was originally seen when a ksh history file was fsync-ed during a large data write. It would take many seconds to flush the large write through the log, just to ensure a "pwd" command typed was safely on disk! This inefficiency occurs only when a "mismatch" of applications use the same file system.

The fix was essentially to push out all metadata for the file system but only the file data related to the file being fsync-ed or O_DSYNC-ed.

This problem was fixed in snv_48 last September and will be in S10_U4.

Neil.
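The difference between the old and the fixed behaviour described above can be illustrated with a toy model. This is a minimal sketch, not ZFS code: the `Txn` record, the two `flush_*` functions, the paths and the sizes are all invented for illustration, and the real ZIL tracks far more than a flat list of transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    kind: str    # "meta" (create, rename, ...) or "data" (file contents)
    path: str    # file the transaction belongs to
    nbytes: int  # bytes that would be written to the intent log


def flush_before_fix(pending, fsync_path):
    """Old behaviour: fsync on one file pushes out every pending transaction."""
    return list(pending)


def flush_after_fix(pending, fsync_path):
    """Fixed behaviour: all metadata, but file data only for the fsync-ed file."""
    return [t for t in pending if t.kind == "meta" or t.path == fsync_path]


def log_bytes(txns):
    """Total bytes the selected transactions would push through the log."""
    return sum(t.nbytes for t in txns)


# Hypothetical workload: a large write in progress plus a tiny history file.
pending = [
    Txn("meta", "/export/big.dat", 256),
    Txn("data", "/export/big.dat", 512 * 1024 * 1024),
    Txn("meta", "/export/home/.sh_history", 256),
    Txn("data", "/export/home/.sh_history", 4),
]
target = "/export/home/.sh_history"

before = log_bytes(flush_before_fix(pending, target))
after = log_bytes(flush_after_fix(pending, target))
print(before, after)
```

In this model the fsync of the history file drags the whole 512 MB write through the log before the fix, while after the fix only the metadata and the few bytes of history data are written.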
[zfs-discuss] Re: Re: Heavy writes freezing system
> Also as a workaround you could disable zil if it's
> acceptable to you
> (in case of system panic or hard reset you can end up
> with
> unrecoverable database).

Again, not an option, but thanks for the pointer. I read a bit about this last week, and it sounds way too scary.

Rainer
[zfs-discuss] Re: Re: Heavy writes freezing system
Bug 6413510 is the root cause. ZFS maestros please correct me if I'm quoting an incorrect bug.
[zfs-discuss] Re: Re: Heavy writes freezing system
The DBA team isn't wanting to do another test. They have "made up their minds". We have a meeting with them tomorrow, though, and will try to convince them of one more test so that we can try the mdb and fsstat tools. (The admin doing the tests was using iostat, not fsstat.) I, at least, am interested in finding exactly where the failure is, rather than just saying "ZFS doesn't work". :-(

Rainer
[zfs-discuss] Re: Re: Heavy writes freezing system
> Rainer Heilke,
>
> You have 1/4 of the amount of memory that the 2900
> 0 system is capable of (192GBs : I think).

Yep. The server does not hold the application (three-tier architecture) so this is the standard build we bought. The memory has not indicated any problems. All errors point to write issues.

> Secondly, output from fsstat(1M) could be helpful.
>
> Run this command over time and check to see if the
> values change over time..

Thanks. I'll pass this along to the person doing the testing. He's been doing some measuring, but I'm not sure if fsstat was one of them.

Rainer