[zfs-discuss] zfs cache behaviour.
Hi all, I'm looking for a document that explains how the ZFS cache works, especially in S10 Update 6. I need to understand why cache utilization does not reach the ARC size we set. Given the server workload, we expect the data to be spread across the space allowed for the cache. Best regards, Pascal -- Pascal FORTIN Services Account Manager Sun Microsystems France 13 avenue Morane Saulnier 78140 Velizy Villacoublay Phone x30401 / +33 1 34 03 04 01 Mobile +33 6 85 83 10 01 Email pascal.for...@sun.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
On 02/23/09 20:24, Ilya Tatar wrote: Hello, I am building a home file server and am looking for an ATX motherboard that will be well supported by OpenSolaris (onboard SATA controller, network, graphics if any, audio, etc.). I decided to go for Intel-based boards (socket LGA 775) since it seems power management is better supported on Intel processors, and power efficiency is an important factor. After reading several posts about ZFS it looks like I want ECC memory as well. Does anyone have any recommendations? Any motherboard for the Core2 or Core i7 Intel processors with the ICH southbridge (desktop boards) or ESB2 southbridge (server boards) will be well supported. I recommend an actual Intel board, since they always use the Intel network chip (well supported and tuned). Many of the third-party boards from MSI, Gigabyte, Asus, DFI, ECS, and others also work, but for some (penny-pinching) reason they tend to use network chips like Marvell, which are not yet supported, or Realtek, for which only some models are supported. So an actual board from Intel Corp will be best supported right out of the box. For that matter, because of the work we do with Intel, almost any of their boards will be supported using the ICH6, 7, 8, 9, or ICH10 SATA ports in either legacy or AHCI mode. Again, almost any version of the Intel network (NIC) chips is supported across all their boards. If you find one that is not, I'd love to hear about it and add it to our work queue. In the most recent builds of Solaris Nevada (SXCE), the integrated Intel graphics found on many of the boards is well supported. On other boards, use a low-end VGA card. Again, if you find an Intel board where the graphics is not supported or not working, please let us know the specifics and we'll fix it. Cheers, Neal Here are a few that I found. Any comments about those? 
Supermicro C2SBX+ http://www.supermicro.com/products/motherboard/Core2Duo/X48/C2SBX+.cfm
Gigabyte GA-X48-DS4 http://www.gigabyte.com.tw/Products/Motherboard/Products_Overview.aspx?ProductID=2810
Intel S3200SHV http://www.intel.com/Products/Server/Motherboards/Entry-S3200SH/Entry-S3200SH-overview.htm
Thanks for any help, -Ilya
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Neal Pollack wrote: On 02/23/09 20:24, Ilya Tatar wrote: ... efficiency is an important factor. After reading several posts about ZFS it looks like I want ECC memory as well. ... Any motherboard for the Core2 or Core i7 Intel processors with the ICH Not so. Intel decided we don't need ECC memory on the Core i7 (one of the few truly idiotic things I can remember them doing lately). The OP specified ECC RAM, so Core i7 is a no-go. Thanks for nothing, Intel. -- Carson
Re: [zfs-discuss] zfs cache behaviour.
Pascal Fortin wrote: I need to understand why the utilization of the cache is not up to the ARC size we set. The ARC size is adjusted automatically based on usage: if you do not access data, it won't be in the cache. Other than the automatic adjustments driven by resource constraints (the ARC will shrink if memory is needed by applications), it works like most other caches. -- richard
Re: [zfs-discuss] zfs streams data corruption
On Mon, Feb 23, 2009 at 10:05:31AM -0800, Christopher Mera wrote: I recently read up on Scott Dickson's blog with his solution for jumpstart/flashless cloning of ZFS root filesystem boxes. I have to say that it initially looks to work cleanly, but of course there are kinks to be worked out, mostly around auto-mounting filesystems. The issue I'm having is that a few days after these cloned systems are brought up and reconfigured, they crash and svc.configd refuses to start. When you snapshot a ZFS filesystem you get just that -- a snapshot at the filesystem level. That does not mean you get a snapshot at the _application_ level. Now, svc.configd is a daemon that keeps a SQLite2 database. If you snapshot the filesystem in the middle of a SQLite2 transaction you won't get the behavior that you want. In other words: quiesce your system before you snapshot its root filesystem for the purpose of replicating that root on other systems. Nico --
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Not. Intel decided we don't need ECC memory on the Core i7 I thought that was Core i7 vs. Xeon E55xx for socket LGA-1366, and that's why this X58 MB claims ECC support: http://supermicro.com/products/motherboard/Xeon3000/X58/X8SAX.cfm
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 19:18, Nicolas Williams nicolas.willi...@sun.com wrote: When you snapshot a ZFS filesystem you get just that -- a snapshot at the filesystem level. That does not mean you get a snapshot at the _application_ level. Now, svc.configd is a daemon that keeps a SQLite2 database. If you snapshot the filesystem in the middle of a SQLite2 transaction you won't get the behavior that you want. In other words: quiesce your system before you snapshot its root filesystem for the purpose of replicating that root on other systems. That would be a bug in ZFS or SQLite2. A snapshot should be an atomic operation. The effect should be the same as a power failure in the middle of a transaction, and decent databases can cope with that.
Re: [zfs-discuss] zfs streams data corruption
Either way, it would be ideal to quiesce the system before a snapshot anyway, no? My next question is what particular steps would be recommended to quiesce a system for the clone/zfs stream that I'm looking to achieve... All your help is appreciated. Regards, Christopher Mera -Original Message- From: Mattias Pantzare [mailto:pantz...@gmail.com] Sent: Tuesday, February 24, 2009 1:38 PM To: Nicolas Williams Cc: Christopher Mera; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs streams data corruption That would be a bug in ZFS or SQLite2. A snapshot should be an atomic operation. The effect should be the same as a power failure in the middle of a transaction, and decent databases can cope with that.
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 10:41 AM, Christopher Mera cm...@reliantsec.net wrote: Either way - it would be ideal to quiesce the system before a snapshot anyway, no? My next question now is what particular steps would be recommended to quiesce a system for the clone/zfs stream that I'm looking to achieve... All your help is appreciated. Regards, Christopher Mera
If you are writing a script to handle ZFS snapshots/backups, you could issue an SMF command to stop the service before taking the snapshot. Or, at a minimum, perform an SQL dump of the DB so you at least have a consistent full copy of it as a flat file in case you can't stop the DB service. -- Brent Jones br...@servuhome.net
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 07:37:39PM +0100, Mattias Pantzare wrote: That would be a bug in ZFS or SQLite2. I suspect it's actually a bug in svc.configd. Nico --
Re: [zfs-discuss] zfs streams data corruption
Thanks for your responses. Brent: and I'd have to do that for every system that I want to clone? There must be a simpler way; perhaps I'm missing something. Regards, Chris
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 10:56:45AM -0800, Brent Jones wrote: If you are writing a script to handle ZFS snapshots/backups, you could issue an SMF command to stop the service before taking the snapshot. Or at the very minimum, perform an SQL dump of the DB so you at least have a consistent full copy of the DB as a flat file in case you can't stop the DB service. I don't think there's any way to ask svc.configd to pause.
Re: [zfs-discuss] ZFS: unreliable for professional usage?
Hello Joe, Monday, February 23, 2009, 7:23:39 PM, you wrote: MJ Mario Goebbels wrote: One thing I'd like to see is an _easy_ option to fall back onto older uberblocks when the zpool goes belly up for a silly reason. Something that doesn't involve esoteric parameters supplied to zdb. MJ Between uberblock updates, there may be many write operations to MJ a data file, each requiring a copy-on-write operation. Some of MJ those operations may reuse blocks that were metadata blocks MJ pointed to by the previous uberblock. MJ In which case the old uberblock points to a metadata tree full of garbage. MJ Jeff, you must have some idea of how to overcome this in your bugfix; would you care to share? As was suggested on the list before, ZFS could keep a list of blocks freed in the last N txgs and, as long as other blocks are available, not allocate from that list. -- Best regards, Robert Milkowski http://milek.blogspot.com
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 11:32 AM, Christopher Mera cm...@reliantsec.net wrote: And I'd have to do that for every system that I'd want to clone? There must be a simpler way.. perhaps I'm missing something. Well, unless the database software itself can notice a snapshot taking place, flush all data to disk, pause transactions until the snapshot is finished, and then properly resume, I don't know what to tell you. It's an issue for all databases -- Oracle, MSSQL, MySQL: how to do an atomic backup without stopping transactions while maintaining consistency. Replication is one possible solution; dumping to a file periodically is another; or you can simply tolerate that your database will not be consistent after a snapshot and replay logs / consistency-check it after bringing it up from a snapshot. Once you figure that out in a filesystem-agnostic way, you'll be a wealthy person indeed. -- Brent Jones br...@servuhome.net
Re: [zfs-discuss] Motherboard for home zfs/solaris file server -- ECC claims
rl == Rob Logan r...@logan.com writes: rl that's why this X58 MB claims ECC support: the claim is worth something. People always say ``AMD supports ECC because the memory controller is in the CPU, so they all support it; it cannot be taken away from you by lying idiot motherboard manufacturers or greedy marketers trying to segment users into different demand groups'' but you still need some motherboard BIOS to flip the ECC switch to ``wings stay on'' mode before you start down the runway. Here is a rather outdated and Linux-specific workaround for cheapo AMD desktop boards that don't have an ECC option in their BIOS: http://newsgroups.derkeiler.com/Archive/Alt/alt.comp.periphs.mainboard.asus/2005-10/msg00365.html http://hyvatti.iki.fi/~jaakko/sw/ The discussion about ECC-only vs. scrub-and-fix, about how to tell from PCI whether ECC errors are happening (though not necessarily which stick), and his 10-ohm testing method, is also interesting. I still don't understand what chip-kill means. I remember something about a memory-scrubbing kernel thread in Solaris. This sounds like the AMD chips have a hardware scrubber? Also, how are ECC errors reported in Solaris? I guess this is getting OT though. Anyway, ECC is not just a feature bullet to gather up and feel good about. You have to finish the job and actually interact with it.
Re: [zfs-discuss] zfs streams data corruption
cm == Christopher Mera cm...@reliantsec.net writes: cm it would be ideal to quiesce the system before a snapshot cm anyway, no? It would be more ideal to find the bug in SQLite2 or ZFS. Training everyone that ``you always have to quiesce the system before proceeding, because it's full of bugs'' is retrograde MS-DOS behavior. I think it is actually harmful.
Re: [zfs-discuss] zfs streams data corruption
bj == Brent Jones br...@servuhome.net writes: bj tolerating that your database will not be consistent after a bj snapshot and have to replay logs / consistency check it ``not be consistent'' != ``have to replay logs''
Re: [zfs-discuss] zfs streams data corruption
How is it that flash archives avoid these headaches? Ultimately I'm doing this to clone ZFS root systems because, at the moment, Flash Archives are UFS-only. -Original Message- From: Brent Jones [mailto:br...@servuhome.net] Sent: Tuesday, February 24, 2009 2:49 PM To: Christopher Mera Cc: Mattias Pantzare; Nicolas Williams; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs streams data corruption
Re: [zfs-discuss] zfs streams data corruption
On 02/24/09 12:57, Christopher Mera wrote: How is it that flash archives can avoid these headaches? Are we sure that they do avoid this headache? A flash archive (on a ufs root) is created by doing a cpio of the root file system. Couldn't a cpio end up archiving a file that was midway through an SQLite2 transaction? Lori
Re: [zfs-discuss] zfs streams data corruption
Here's what makes me say that: there are over 700 boxes deployed using Flash Archives on an S10 system with a UFS root. We've been working on basing our platform on a ZFS root and took Scott Dickson's suggestions (http://blogs.sun.com/scottdickson/entry/flashless_system_cloning_with_zfs) for doing a system clone. The process worked out well; the system came up and looked stable until, 24 hours later, kernel panics became incessant and svc.configd would no longer load its repository. Hope that explains where I'm coming from. Regards, Chris From: lori@sun.com [mailto:lori@sun.com] Sent: Tuesday, February 24, 2009 3:13 PM To: Christopher Mera Cc: Brent Jones; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs streams data corruption On 02/24/09 12:57, Christopher Mera wrote: How is it that flash archives can avoid these headaches? Are we sure that they do avoid this headache? A flash archive (on ufs root) is created by doing a cpio of the root file system. Could a cpio end up archiving a file that was mid-way through an SQLite2 transaction? Lori
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 01:17:47PM -0600, Nicolas Williams wrote: I don't think there's any way to ask svc.configd to pause. Well, IIRC that's not quite right. You can pstop svc.startd, gently kill (i.e., not with SIGKILL) svc.configd, take your snapshot, then prun svc.startd. Nico --
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 2:15 PM, Nicolas Williams nicolas.willi...@sun.com wrote: Well, IIRC that's not quite right. You can pstop svc.startd, gently kill (i.e., not with SIGKILL) svc.configd, take your snapshot, then prun svc.startd. Nico -- Hot backup?

# Connect to the database
sqlite3 db $dbfile
# Lock the database, copy, and commit or roll back
if {[catch {db transaction immediate {file copy $dbfile ${dbfile}.bak}} res]} {
    puts "Backup failed: $res"
} else {
    puts "Backup succeeded"
}
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 02:53:14PM -0500, Miles Nordin wrote: It would be more ideal to find the bug in SQLite2 or ZFS. It's NOT a bug in ZFS. It might be a bug in SQLite2; it might be a bug in svc.configd. More information would help, specifically: error/log messages from svc.configd, and /etc/svc/repository.db.
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 02:27:18PM -0600, Tim wrote: Hot Backup? [Tcl snippet snipped] SMF uses SQLite2. Sorry.
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 12:19:22PM -0800, Christopher Mera wrote: There are over 700 boxes deployed using Flash Archives on an S10 system with a UFS root. We've been working on basing our platform on a ZFS root and took Scott Dickson's suggestions (http://blogs.sun.com/scottdickson/entry/flashless_system_cloning_with_zfs) for doing a system clone. The process worked out well; the system came up and looked stable until 24 hours later, when kernel panics became incessant and svc.configd would no longer load its repository. OK, svc.configd cannot cause a panic, so perhaps there is a ZFS bug.
Re: [zfs-discuss] zfs streams data corruption
On 24-Feb-09, at 1:37 PM, Mattias Pantzare wrote: That would be a bug in ZFS or SQLite2. A snapshot should be an atomic operation. The effect should be the same as a power failure in the middle of a transaction, and decent databases can cope with that. In this special case, that is likely so. But Nicolas' point is salutary in general, especially in the increasingly common case of virtual machines whose disk images are on ZFS. Interacting bugs or bad configuration can produce novel failure modes. Quiescing a system with a complex mix of applications and service layers is no simple matter either, as many readers of this list well know... :) --Toby
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 2:37 PM, Nicolas Williams nicolas.willi...@sun.com wrote: SMF uses SQLite2. Sorry. I don't quite follow why it wouldn't work for sqlite2 as well...
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 03:25:53PM -0600, Tim wrote: I don't quite follow why it wouldn't work for sqlite2 as well... Because SQLite2 doesn't have that feature.
[zfs-discuss] hardware advice
I'm sorry this is going to be rather lame, since I can't even provide a `specs' page for the hardware I want to ask about. It's pretty discouraging trying to get information from Aopen; a specification page seems like the bare minimum to provide. I'm building up a home ZFS server and have specific hardware on hand already. So, here is hoping someone will `just know' about an: Aopen AK86-L motherboard with an Athlon64 2.2GHz 3400+. It has 3 GB of RAM currently, and that is the max. (I don't recall ever hearing anything about ECC with this RAM, but I'm not sure I even know what it is anyway; I do see it mentioned here often.) The mobo has only 2 SATA and 2 IDE ports. I'm thinking of adding a PCI-style 4-port SATA controller. So, can anyone vouch for that hardware being likely to work with Opensol-11?
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Rob Logan wrote: Not. Intel decided we don't need ECC memory on the Core i7 I thought that was a Core i7 vs. Xeon E55xx distinction for socket LGA-1366, and that's why this X58 MB claims ECC support: http://supermicro.com/products/motherboard/Xeon3000/X58/X8SAX.cfm They lie*. Read the Intel Core i7 specs - no ECC on any of them. * They claim future Nehalem processor families. These mysterious future CPUs may indeed support ECC. The Core i7-(920|940|965) do not. -- Carson
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
On Tue, Feb 24, 2009 at 4:04 PM, Carson Gaspar car...@taltos.org wrote: They lie*. Read the Intel Core i7 specs - no ECC on any of them. * They claim future Nehalem processor families. These mysterious future CPUs may indeed support ECC. The Core i7-(920|940|965) do not. Given the current state of AMD, I think we all know that's not likely. Why cut into the revenue of your server line chips when you don't have to? Right? --Tim
Re: [zfs-discuss] zfs streams data corruption
On Mon, Feb 23, 2009 at 02:36:07PM -0800, Christopher Mera wrote: panic[cpu0]/thread=dacac880: BAD TRAP: type=e (#pf Page fault) rp=d9f61850 addr=1048c0d occurred in module zfs due to an illegal access to a user address Can you describe what you're doing with your snapshot? Are you zfs send'ing your snapshots to new systems' rpools? Or something else? You're not using dd(1) or anything like that, right? Nico --
Re: [zfs-discuss] zfs streams data corruption
It's a zfs snapshot that's then sent to a file. On the new boxes I'm doing a JumpStart install with the SUNWCreq package, and using the finish script to mount an NFS filesystem that contains the *.zfs dump files. zfs receive then imports the data, and the boot environment boots fine. -Original Message- From: Nicolas Williams [mailto:nicolas.willi...@sun.com] Sent: Tuesday, February 24, 2009 5:43 PM To: Christopher Mera Cc: lori@sun.com; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs streams data corruption
Re: [zfs-discuss] zfs streams data corruption
On Tue, Feb 24, 2009 at 03:08:18PM -0800, Christopher Mera wrote: It's a zfs snapshot that's then sent to a file.. On the new boxes I'm doing a jumpstart install with the SUNWCreq package, and using the finish script to mount an NFS filesystem that contains the *.zfs dump files. Zfs receive is actually importing the data and the boot environment then boots fine. It's possible that your zfs send output files are getting corrupted when accessed via NFS. Try ssh. Also, when does the panic happen? I searched for CRs with parts of that panic string and found none.
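One way to test the NFS-corruption theory above is to record a checksum of each dump file on the sending host and verify the copy on the receiving host before feeding it to zfs receive. A minimal sketch in Python; the helper names and the idea of shipping a recorded digest alongside the dump are my assumptions, not something the thread prescribes.

```python
# Hypothetical integrity check for zfs send dump files transferred over
# NFS: compute SHA-256 on the sender, verify on the receiver.
import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    # Stream the file in 1 MiB chunks so large dumps don't need to fit
    # in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def verify_dump(path: str, expected_hex: str) -> bool:
    # True only if the transferred stream matches the original digest;
    # run this before "zfs receive < dump".
    return sha256_of(path) == expected_hex
```

If the digests differ, the stream was damaged in transit (or at rest) and zfs receive should not be attempted, which would localize the corruption to the NFS path rather than to zfs itself.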
Re: [zfs-discuss] zfs streams data corruption
Miles Nordin wrote: Hope this helps untangle some FUD. Snapshot backups of databases *are* safe, unless the database or application above it is broken in a way that makes cord-yanking unsafe too. Actually Miles, what they were asking for is generally referred to as a checkpoint, and checkpoints are used by all major databases when backing up files. Performing a checkpoint makes sure that all transactions recorded in the log but not yet written to the database are written out, and that the system is not in the middle of a write when you grab the data. Dragging database recovery into the discussion seems to me to only increase the FUD factor. Regards, Greg
Re: [zfs-discuss] zfs streams data corruption
gp == Greg Palmer gregorylpal...@netscape.net writes: gp Performing a checkpoint will perform such tasks as making sure gp that all transactions recorded in the log but not yet written gp to the database are written out and that the system is not in gp the middle of a write when you grab the data. great copying of buzzwords out of a glossary, but does it change my claim or not? My claim is: that SQLite2 should be equally as tolerant of snapshot backups as it is of cord-yanking. The special backup features of databases, including ``performing a checkpoint'' or whatever, are for systems incapable of snapshots, which is most of them. Snapshots are not writeable, so this ``in the middle of a write'' stuff just does not happen. gp Dragging the discussion of database recovery into the gp discussion seems to me to only be increasing the FUD factor. except that you need to draw a distinction between recovery from cord-yanking, which should be swift and absolutely certain, and recovery from a cpio-style backup done with the database still running, which requires some kind of ``consistency scanning'' and may involve ``corruption'' and has every right to simply not work at all. The FUD I'm talking about is mostly that people seem to think all kinds of recovery are of the second kind, which is flatly untrue! Backing up a snapshot of the database should involve the first category of recovery (after restore), the swift and certain kind, EVEN if you do not ``quiesce'' the database or take a ``checkpoint'' or whatever your particular vendor calls it, before taking the snapshot. You are entitled to just snap it, and expect that recovery works swiftly and certainly, just as it does if you yank the cord. If your database vendor considers it some major catastrophe to have the cord yanked, requiring special tools, training seminars, buzzwords, and hours of manual checking, then we have a separate problem, but I don't think SQLite2 is in that category! 
Of course Toby rightly pointed out this claim does not apply if you take a host snapshot of a virtual disk, inside which a database is running on the VM guest---that implicates several pieces of untrustworthy stacked software. But for snapshotting SQLite2 to clone the currently-running machine I think the claim does apply, no?