Thank you Keith for your elaborate explanation about the inner workings of the cache and the options that are involved. It has made me realize again not to draw conclusions too quickly when one is not deeply familiar with the background of a problem.
Regards, Rob

-----Original Message-----
From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Keith Medcalf
Sent: Thursday, 26 March 2015 2:29
To: General Discussion of SQLite Database
Subject: Re: [sqlite] FW: Very poor SQLite performance when using Win8.1 + Intel RAID1

On Wednesday, 25 March, 2015 07:47, Rob van der Stel <RvanderStel at benelux.tokheim.com> said:

>The naming in WinXP and Win81 is such that a mistake is easily made; the
>corresponding options are:
>
>a1) WinXP: 'Optimize for performance' -- this corresponds to
>a2) Win81: 'Enable write cache on the device'
>
>b1) WinXP: 'Enable write cache on the disk' -- this corresponds to
>b2) Win81: 'Turn off Windows write-cache buffer flushing on the
>device'
>
>I wrongfully thought that b1 and a2 represented the same option. In my
>System 1 WinXP both options were 'checked' (a1 & b1).
>In my System 2 Win81 only option 'a2' was checked, assuming that it
>resembled 'b1' and that option 'b2' was a new feature in Win81.

Actually, a1 does not correspond to a2.

In Windows XP, a1 (Optimize for performance) enables the Windows Volume Write Cache in system RAM. This is because Windows XP sees your SSD as a "Flash" drive (ie, as if it were nothing more than a removable flash drive of the same type you would plug into a USB port and can yank out at any old time). By default, such devices have all write caching disabled in the OS so that the luser can "yank" the device out of the computer at any time without destroying the filesystem. If you are not the sort of person who engages in this behaviour, and you use "safe remove" for such devices, then this option enables the OS cache and allows putting not-so-primitive filesystems (such as NTFS instead of FAT) on the device.
Advanced filesystem performance is absolutely dreadful without OS write caching, scatter/gather, and write coalescing, because the amount of I/O without a cache is so huge.

A1: 'Optimize for performance' means turning a yankable device into a user-won't-yank-without-notice device, and it enables the OS cache.

A2 and B1 are the same: they set the device-level cache on the device to write-back (checked) or write-through (unchecked).

B2: Is always checked in XP (you cannot enable forced device buffer flush).

More recent versions of Windows recognize that an SSD is not a user-yankable device and therefore do not disable the OS (system) cache to protect the luser from hurting themselves. They see the SSD as a traditional fixed disk device (which it really is). If XP saw the SSD as a fixed disk rather than luser-yankable, it would have enabled the OS cache by default, and the option you would have seen would have been whether or not to enable the device-level write cache (ie, write-back rather than write-through). Thus ended the options available under XP.

With OS caching enabled on XP/Vista/2003 (or earlier) there was no way for an application to guarantee that a write *EVER* made it to permanent storage. In fact, there is a special "oooopsie" message for this -- "Cache Data Lost before it could be written to disk" -- that is quite common on any Windows client/server OS (XP/Vista/2003 and before) carrying even a moderate I/O load.

On later versions of Windows (in this case Win 8.1, though it applies to anything based on the 2008-or-later kernel base, which is basically anything later than 2003 Server/Windows Vista) the cache management is improved. Primarily this was done to support the "transactional filesystem".
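For SQLite specifically, the only lever an application has over this whole stack is how often it asks the OS to flush. A minimal Python sketch of that trade-off (the database path and settings here are illustrative, not taken from this thread):

```python
import os
import sqlite3
import tempfile

# Throwaway database; the path is illustrative only.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)

# synchronous=FULL: SQLite calls the OS flush primitive (FlushFileBuffers
# on Windows, fsync on POSIX) at every commit. Whether that flush really
# reaches persistent media depends on the cache options described above.
con.execute("PRAGMA synchronous=FULL")
con.execute("CREATE TABLE t(x)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()

# synchronous=OFF: SQLite hands data to the OS and never asks for a
# flush -- fast, but durability now depends entirely on the OS and
# device caches surviving until write-back completes.
con.execute("PRAGMA synchronous=OFF")
print(con.execute("SELECT count(*) FROM t").fetchone()[0])
con.close()
```

Note that even with synchronous=FULL, a device-level write-back cache that acknowledges before reaching the media (the "turn off buffer flushing" scenario) can still defeat the flush.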
This added the "Turn off Windows write-cache buffer flushing" option to fixed disks (actually, the default was always there and it was always checked -- you just now have the option), which is only effective if "Enable write cache" (ie, write-back cache at the device level) is turned on.

Now we get into the two different kinds of cache that can be available: block cache and filesystem cache.

A block cache is how one increases I/O performance to a block device, and it is based on common physics and the construction of spinning-disk media: you can read an entire track of data into a buffer in the same time as you can read a sector span of less than a full track, and you can write an entire track from a buffer to the media in less time than you can write a sector span of less than a full track (you can prove this mathematically based on rotational speed, settling time, and the average one-third-of-a-rotation delay). Therefore, all I/O to the physical media should always be done as a full track, read or written in a single stroke from wherever the head lands after settling. (The same theory applies to access to RAM and MLC Flash -- it is faster to read or write an entire "line" than it is to read/write a single byte -- though in the case of modern RAM/MLC Flash, it is practically impossible to access a subset of a "line" without a block cache.) In the case of RAM and MLC Flash, the controller logic and the L1/L2/L3 cache enforce this.

So, the cache located "on the drive" in modern hard disks (and other similar things) is a block cache. It works in entire "blocks" up and down between the actual storage media and the cache on the device. So if the OS sends a command to "read sector 485938", the drive might actually read sectors 485900 through 485999 into the block cache, then return just the wee bit requested to the OS.
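The rotational arithmetic behind the full-track rule can be checked on the back of an envelope; the figures below (7200 RPM, 100 sectors per track) are assumed for illustration, not taken from the mail:

```python
# Back-of-envelope check: per-sector cost of single-sector reads versus
# one full-track read, for an assumed 7200 RPM disk, 100 sectors/track.
rpm = 7200
sectors_per_track = 100

rev_ms = 60_000 / rpm        # one revolution: ~8.33 ms
avg_latency_ms = rev_ms / 2  # average wait for a target sector: ~4.17 ms

# Fetching one sector: average rotational latency plus one sector's transfer.
one_sector_ms = avg_latency_ms + rev_ms / sectors_per_track
# Fetching the whole track from wherever the head lands: one revolution.
full_track_ms = rev_ms

# Cost per sector delivered: the full-track read wins by about 51x here.
per_sector_ratio = one_sector_ms / (full_track_ms / sectors_per_track)
print(round(per_sector_ratio))
```

The exact ratio depends on the assumed geometry, but the shape of the result -- whole-track I/O is dramatically cheaper per sector -- holds for any spinning disk.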
Similarly, if the OS asks to "write sector 485938" and the I/O block from sectors 485900 through 485999 is not in the device cache, then the device must read the larger "line" block first, edit in the data from the OS, and then re-write the entire block back to the media.

This is why bitty-box operating systems have such pitiful I/O characteristics. Bitty-box OS's generally have what is called a filesystem cache rather than a block cache, in the hope that this will compensate for the inability to comprehend physics (and for the difficulty of implementing efficient I/O in non-coprocessed, resource-scarce environments). This means that the cache contains chained file data rather than I/O blocks, which is why all I/O from Windows (and other bitty-box OS's) is devolved into scatter/gather of sectors (or clusters) rather than whole blocks, and why I/O performance generally sucks.

Because of these issues it became necessary for bitty-box filesystem caches to be able to "flush" a file (since they were inherently very inefficient) -- a concept which does not exist and is not needed in high-performance I/O systems operating with "I/O blocks". A flush forces the OS to write down the data from the filesystem cache to the device, but the device must still read/edit/write by line, so waiting for this process to occur is slow.

When Windows write-cache buffer flushing is in effect (ie, the "Turn off Windows write-cache buffer flushing" option is unchecked), the OS requires that writes from the cache to the device be "written to the media" before being acknowledged by the device (and thus marked clean in the cache), preventing the aforementioned "ooopsie" errors. When the option is checked, there may be data sitting in the RAM buffers on the device that has not yet been written to the actual persistent storage, but it will be as fast as the drive can manage (and will be marked as clean in the filesystem cache). Thus there is a small window where a power failure may cause loss of data.
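The read/edit/re-write cycle described above can be sketched as a toy model; every name and size here is illustrative, not from any real driver:

```python
# Toy model of the read-modify-write cycle a block-cached device performs
# when the OS writes a single sector inside a larger cache line.
LINE_SIZE = 100  # sectors per cache "line" (assumed size)

media = {}       # sector number -> value: the "persistent" storage
cache = {}       # line number -> list of sector values: the device cache

def write_sector(sector, value):
    line = sector // LINE_SIZE
    if line not in cache:
        # Miss: the device must read the entire line from media first...
        cache[line] = [media.get(line * LINE_SIZE + i, 0)
                       for i in range(LINE_SIZE)]
    # ...then edit in the one sector's worth of data from the OS.
    cache[line][sector % LINE_SIZE] = value

def flush():
    # ...and eventually re-write whole lines back to the media.
    for line, data in cache.items():
        for i, value in enumerate(data):
            media[line * LINE_SIZE + i] = value
    cache.clear()

write_sector(485938, 42)   # the example sector from the text
flush()
print(media[485938])       # -> 42
```

Until flush() runs, the new value exists only in the cache dictionary; that window between write_sector and flush is exactly the power-failure exposure discussed above.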
Devices intended for a higher level of performance will have the write queue either independently battery-backed (if in static RAM, for example) or will have NVRAM and capacitors sufficient to "flush" the RAM write queue into non-volatile storage on a power failure.

There are companies which make proper "block-level" caches that run on Windows, sitting between the OS filesystem cache and the device, whose sole purpose is to collect up the bitty writes from the OS and send them to the actual storage device efficiently. Of course, because these use system RAM they can also cause data loss; however, they can provide huge I/O performance increases even when not operating in "write-back" mode. Examples range from products like SuperCache to the way the I/O subsystem on hypervisors such as ESX performs large I/O block operations to the media while dealing with bitty writes from the VMs (hypervisors can emulate high-performance co-processed I/O systems because they are able to dedicate storage and CPU to proper I/O management independently of the workload/VMs).

So, the answer to your question really is that it depends on your device. If you check "Turn off Windows write-cache buffer flushing on the device", and the device does not have the capability to preserve the pending chain of write-edits in non-volatile storage if the power fails, then in the event of a power failure you will lose the data that is in the edit chain waiting to be written. On the other hand, for SSDs it would be possible to provide sufficient power to write out the cache simply by using on-board capacitors, since the time to flush the write chain properly is very small compared to a spinning disk.

---
Theory is when you know everything but nothing works. Practice is when everything works but no one knows why. Sometimes theory and practice are combined: nothing works and no one knows why.
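From an application's point of view, "flushing" boils down to one call. A small Python sketch (file path is illustrative) of what a program can and cannot ask for:

```python
import os
import tempfile

# os.fsync maps to fsync() on POSIX and to the Windows flush primitive
# (FlushFileBuffers via _commit in CPython). It drains the OS filesystem
# cache down to the device -- but whether the device's own write-back
# cache reaches non-volatile storage before acknowledging is exactly
# what the "turn off buffer flushing" checkbox discussed above controls.
path = os.path.join(tempfile.mkdtemp(), "durable.txt")
with open(path, "wb") as f:
    f.write(b"committed")
    f.flush()             # drain the user-space buffer to the OS cache
    os.fsync(f.fileno())  # ask the OS to drain its cache to the device
print(open(path, "rb").read().decode())
```

No user-space call can reach past a device that lies about completion; that last hop is purely a property of the hardware and its cache settings.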
_______________________________________________ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users