Kyle McDonald wrote:
> Garrett D'Amore wrote:
>> Edward Pilatowicz wrote:
>>> i'm not sure if this is an arc issue or not, but i'll raise it here
>>> anyway. (perhaps it's just lack of imagination on my part.)
>>>
>>> i'm really confused about the usage model for this device. first you
>>> recommend that this device is a good choice as a ZFS SLOG. ok. (i
>>> guess that assumes you've hooked the device up to an external ups so
>>> that the SLOG doesn't go away if power to the server fails?)
>>> presumably you'd never expect anyone to run "ddrdrivectl backup" when
>>> the device is used as an slog device on a running system, right?
>>>
>>> so presumably ddrdrivectl exists for some other usage? i'm trying to
>>> figure out what this use case could be. could you supply some
>>> examples?
>>>
>>
>> ddrdrivectl takes a snapshot. It's totally safe to do it on a running
>> system, but you don't want to do it if the device is in heavy use.
>>
>> However, if you're going to shut the system down for a while, you
>> might want to run the backup to flush the data to the disk.
>> This could be done automatically as part of a shutdown script, or it
>> could be done even nightly via cron (just recognize that you'll
>> "stall" all I/O to the device for 60 seconds while it's running.)
>>
> Except as I understand it, after a clean shutdown there is nothing
> worth backing up in the slog. If I understand it correctly the slog is
> only ever read when sync. transactions need to be replayed to the
> zpool after a power failure. If a clean shutdown is initiated, there
> shouldn't be anything in the slog that hasn't already been committed
> to the pool.

That *sounds* right. I'm not sure what happens if the SLOG is stale
though... can ZFS cope with this? I don't know... I don't think it was
designed for that scenario (where the SLOG data is more stale than the
actual drive data.)

>
> Also during a power failure - where you'd want to backup the RAM to
> flash before your UPS runs out - your system (if it's not also on the
> UPS, in which case the PCI power is present and the external power is
> pointless) is down and you can't run the command anyway. I imagine
> booting back up when power is restored would be problematic (if you
> have this slog on your root pool anyway) also since you can't really
> restore the contents of the RAM with the command before the pool is
> imported.

PCI power is not present during a full reboot cycle. You need the
external power to preserve DDR across reboot cycles (at least "prom"
reboots -- fast reboot preserves PCI power.)

>
>> When used for a regular ZFS pool, you can "export" the pool, and then
>> snapshot the state, and even port the unit to a different system and
>> reimport the pool. (Although you'll need to manually run
>> "ddrdrivectl load" in that case.)
> Again, any exported zpool should have all the contents of the slog
> already fully committed to the disks, and no need for a snapshot of
> the slog should be needed. Unless I'm missing something....

I was suggesting if you use the device as a zfs pool device, not an
SLOG. (In fact most of my testing of the device has been in this
fashion -- with ordinary ZFS filesystems on it.)

>> I do want to interject one thing here: I've made a number of
>> suggestions to the vendor for how this process can be improved -- IMO
>> the backup and restore operation should be handled automatically, and
>> ideally we'd like supercaps or a battery to do the operation on
>> demand. I think the vendor is looking at these for future revisions
>> of the product, but right now we have to make do with the way the
>> hardware works. (Unfortunately, this cannot be corrected by a
>> firmware update -- indeed the device has no firmware, as it is
>> basically implemented entirely in FPGA.)
>>
> These improvements seem to be required for use for ZFS (again unless
> I'm missing something ;) ) I wouldn't want an slog that needed manual
> intervention to move valid and needed data from RAM to flash. It seems
> like automating this copy when power loss is detected (and back when
> power is restored) is a requirement for an slog device. A battery (or
> supercap) that will last long enough to allow that to happen would be
> nice too.

I don't necessarily agree that it is "required", but it certainly seems
like it would be better to have this than not have it.

That said, the device *does* work. I've been using it quite a lot, and
haven't had any problems. Of course, if you encounter a problem that
leaves your system completely unbootable, and your power is down for
longer than your UPS can supply power to the DDRdrive (which seems like
it would last quite a long time -- no moving parts here, and no CPU!),
then you will probably ultimately suffer some data loss.

Now, my understanding is that DDRdrive's existing customers use these
things in controlled environments (machine rooms, etc.). In that
environment, there is always stable power (via UPS), so there is no
worry to the customer about data loss.

The device isn't perfect; there are some obvious things that could be
done to make it better. But for those folks who can use it and need a
cheap alternative for an SLOG device, this looks really attractive.

    - Garrett

>
> -Kyle
>
>> - Garrett
>>> ed
>>>
>>>
>>> On Sun, Nov 29, 2009 at 02:18:03PM -0800, Garrett D'Amore - sun
>>> microsystems wrote:
>>>
>>>> Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
>>>> This information is Copyright 2009 Sun Microsystems
>>>> 1. Introduction
>>>>     1.1. Project/Component Working Name:
>>>>          DDRdrive X1 driver
>>>>     1.2. Name of Document Author/Supplier:
>>>>          Author: Garrett D'Amore
>>>>     1.3 Date of This Document:
>>>>          29 November, 2009
>>>> 4. Technical Description
>>>>
>>>> This project aims to supply a new device driver, ddrdrive, to support
>>>> the DDRdrive X1 hybrid DDR RAM and NAND solid state storage adapter
>>>> (www.ddrdrive.com). The only model supported is the X1, which is
>>>> identified by PCI id pci19e3,dd52.
>>>>
>>>> We are seeking Patch binding for this case, although we have no
>>>> specific plans to backport at this time.
>>>>
>>>> The ddrdrive driver is dependent on the new "bd" block device driver,
>>>> specified in PSARC 2009/646.
>>>>
>>>> The physical device is a PCIe x1 card with 4GB of RAM and 4GB of NAND,
>>>> with an external power supply (suitable for connection to a UPS). It
>>>> offers excellent write latencies, and is a good choice for a ZFS SLOG
>>>> device.
>>>>
>>>>
>>>> The device has one unusual attribute, however, which is the way data
>>>> is moved between NAND flash and DDR RAM. The DDR RAM remains valid
>>>> as long as power is supplied either via the PCIe bus, or via the
>>>> external AC power adapter. If both power sources are lost, the DDR
>>>> resets and any data not backed up to NAND is lost.
>>>>
>>>> The data is only moved between NAND and RAM on demand, and can take up
>>>> to 60 seconds to copy (either for backup of DDR to NAND or for
>>>> restoring NAND to DDR.) The device only supports a complete backup or
>>>> restore operation.
>>>>
>>>> In order to support this functionality, a new command,
>>>> "ddrdrivectl", is provided by this case. It has the following syntax:
>>>>
>>>>     ddrdrivectl list [-v]
>>>>         list all adapters
>>>>     ddrdrivectl backup [-f] [-q] <adapter>
>>>>         backup adapter to flash
>>>>     ddrdrivectl restore [-f] [-q] <adapter>
>>>>         restore adapter from flash
>>>>     ddrdrivectl help
>>>>         show this help message
>>>>
>>>> The <adapter> field is a simple whole number, representing the
>>>> "instance" of the device as found in /etc/path_to_inst. (The "list"
>>>> command is offered so administrators won't need to grunge through
>>>> that file, though.)
>>>>
>>>> This command additionally has some protections, which are intended to
>>>> prevent unsafe use of the backup or restore operation. Specifically,
>>>> the "restore" operation includes some checks (implemented via
>>>> libdiskmgt) to ensure that a restore is not performed while the media
>>>> is currently mounted or otherwise in current use by the system. In
>>>> both cases, unless -f is specified, the user is prompted to confirm
>>>> the operation. Both operations effectively "suspend" the device for
>>>> about 60 seconds while the transfer completes.
>>>>
>>>> (The command also has internal "connect" and "disconnect"
>>>> subcommands, which are intended solely to facilitate development by
>>>> allowing the target devices to be "detached" and thereby allowing the
>>>> module to be unloaded. These are undocumented Project Private
>>>> interfaces, though.)
>>>>
>>>>
>>>> Imported Interfaces
>>>>
>>>>     bd block DDI    Consolidation Private   PSARC 2009/646
>>>>     libdiskmgt      Consolidation Private   Used by ddrdrivectl
>>>>
>>>> Exported Interfaces
>>>>
>>>>     /kernel/drv/ddrdrive    Committed     Device driver
>>>>     /usr/sbin/ddrdrivectl   Uncommitted   Command (and all
>>>>                                           subcommands).
>>>>
>>>>
>>>> 6. Resources and Schedule
>>>>     6.4. Steering Committee requested information
>>>>          6.4.1. Consolidation C-team Name:
>>>>                 ON
>>>>     6.5. ARC review type: FastTrack
>>>>     6.6. ARC Exposure: open
>>>>
>>
>> _______________________________________________
>> opensolaris-arc mailing list
>> opensolaris-arc at opensolaris.org
>
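
For anyone wanting to script the backup from a shutdown script or cron job as discussed earlier in the thread, here is a rough sketch of assembling the command line from the syntax quoted in the proposal. The helper name build_ddrdrivectl_argv is hypothetical and not part of the case. The proposal does not say what -q does (presumably "quiet"), so it is passed through as-is. The sketch only builds the argument list and does not run anything.

```python
from typing import List

# Subcommands that take an <adapter> argument, per the syntax in the proposal.
ADAPTER_SUBCOMMANDS = ("backup", "restore")


def build_ddrdrivectl_argv(subcommand: str, adapter: int,
                           force: bool = False,
                           quiet: bool = False) -> List[str]:
    """Build argv for 'ddrdrivectl backup|restore [-f] [-q] <adapter>'.

    Hypothetical helper: it mirrors the quoted syntax only and makes no
    claims about the real command's behavior.
    """
    if subcommand not in ADAPTER_SUBCOMMANDS:
        raise ValueError(f"unsupported subcommand: {subcommand!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(adapter, bool) or not isinstance(adapter, int) or adapter < 0:
        raise ValueError("adapter must be a non-negative whole number")

    argv = ["ddrdrivectl", subcommand]
    if force:
        # Per the proposal, -f skips the interactive confirmation prompt.
        argv.append("-f")
    if quiet:
        # Meaning of -q is not stated in the proposal (assumed: quiet).
        argv.append("-q")
    argv.append(str(adapter))
    return argv


if __name__ == "__main__":
    # Adapter instance 0 is only an example; "ddrdrivectl list" reports
    # the real instance numbers.
    print(" ".join(build_ddrdrivectl_argv("backup", 0, force=True, quiet=True)))
```

An unattended job would need -f, since otherwise the command prompts for confirmation, and it should allow for the roughly 60 seconds during which I/O to the device is suspended. The resulting list could be handed to subprocess.run on a system where the command is installed.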