Hello, [ I have not yet had the time to look at code again in response to some of the points raised raised by several people; but I wanted to follow-up somewhat still on other bits. ]
> > You would have to test for whether it's time to sleep much more often. > > Possibly before every ExecProcNode call would be enough. > > That would have overhead comparable to EXPLAIN ANALYZE, which is a lot. > > I'm fairly dubious about this whole proposal: it's not clear to me that > the vacuum delay stuff works very well at all, and to the extent that it > does work it's because vacuum has such stylized, predictable behavior. Well, it definitely works well enough to make a large difference in my use cases. In particular with respect to the amount of write activity generated which easily causes latency problems. That said, it remains to be seen how much of an issue heavy write activity will be once upgraded to 8.3 and after tweaking the Linux buffer cache. Right now, I do not expect the database to be even useful in one of my use cases, if it were not for delay points during vacuuming. So although I make no argument as to whether it works better due to the limited and understood nature of vacuuming, it is definitely an appreciate feature for my use cases. > The same can't be said of general SQL queries. For one thing, it's > not apparent that rate-limiting I/O would be sufficient, because > although vacuum is nearly always I/O bound, general queries often are > CPU bound; or their system impact might be due to other factors like > contention for shared-memory data structures. In my case I mostly care about I/O. I believe that this is going to be a fairly common fact with anyone whose primary concern is latency. The good part about CPU contention is that it is handled quite well by modern operating systems/hardware. Even on a single-core machine, a single CPU bound query should still only have a percentage-wise throughput impact on other traffic (normally; of course you might have some particularly bad contention on some resource, etc). If your database is very sensitive to latency, you are likely running it at far below full throughput, meaning that there should be quite a bit of margin in terms of CPU. This would be especially true on multi-core machines where the impact of a single backend is even less. The problem I have with I/O is that saturating I/O, in particular with writes, has all sorts of indirect effects that are difficult to predict, and are not at all guaranteed to translate into a simple percentage-wise slow-down. For example, I've seen stalls lasting several *minutes* due to a bulk DELETE of a million rows or so. With mixed random-access writes, streaming writes, and the PG buffer cache, the operating system buffer cache, and the RAID controller's cache, it is not at all unlikely that you will have significant latency problems when saturating the system with writes. So recognizing that I am not likely to ever have very good behavior while saturating the storage system with writes, I instead want to limit the write activity generated to a sensible amount (preferably such that individual bursts are small enough to e.g. fit in a RAID controller cache). This reduces the problem of ensuring good behavior with respect to short burst of writes and their interaction with checkpoints, which is a much easier problem than somehow ensuring "fairness" under write-saturated load. So that is where my motivation comes from; in more or less all my use cases, limiting disk I/O is massively more important than limiting CPU usage. On this topic, I have started thinking again about direct I/O. I asked about this on -performance a while back in a different context and it seemed there was definitely no clear concensus that one "should" have direct I/O. That seemed to be mostly due to a perceived lack of increase in throughput. However my gut feel is that by bypassing the OS buffer cache, you could significantly improve real-time/latency sensitive aspects in PostgreSQL in cases where throughput is not your primary concern. Perhaps something like that would be a more effective approach. > Priority inversion is > a pretty serious concern as well (ie, a sleeping "low priority" query > might be blocking other queries). I presume this is in reference to bulk modifications (rather than selects) blocking other transactions with conflicting updates? If so, yes I see that. The feature would be difficult to use reliably for writes except in very controlled situations (in my particular use-case that I am tunnel vision:ing on, it is guaranteed that there is no conflict due to the nature of the application). But is my understanding correct that there is no reason to believe there are such issues for read-only queries, or queries that do not actually conflict (at the SQL level) with concurrent transactions? (Ignoring the impact it might have on old transactions hanging around for a longer time.) -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller <[EMAIL PROTECTED]>' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org
pgpS145eUe9aH.pgp
Description: PGP signature