Re: [HACKERS] too much WAL volume
On Fri, 27 Apr 2007, Jim Nasby wrote:

> Yes, but how many data drives would you need to have to bottleneck on
> WAL? Even if the entire database is memory resident you'd still have to
> write all the pages out at some point, and it seems to me that you'd
> need a fair amount of disk capacity in the data directory before you
> got pegged by WAL.

Depends on the type of transactions. If you're doing something with lots of INSERT and perhaps some UPDATE volume that doesn't need to read heavily from the database to complete, because most of the important stuff is already in memory, you might run into the WAL limit without too much on the database disk side. I did say it was difficult...

> When I did some DBT2 testing a bit over a year ago I had a 20 drive
> RAID10 for data and a mirror for WAL and was nowhere close to pegged on
> WAL (this was on a Sun V40 connected to one of their storage arrays).

No doubt. The main reason I haven't used DBT2 more is that the WAL volume it produces before you run into database-limited bottlenecks isn't large, and certainly not in the same ratio as some of the apps I'm modeling, which lean more toward straightforward transaction logging in parts. I'm running on similar hardware (the V40 is very close; I think the EMC array I test against is a bit better than most of the Sun models), and I've seen some scenarios that produce 40MB/s average / 60MB/s peak of WAL volume. It sure seems like I'm rate limited by the RAID-1 WAL disk.

As you say, eventually all the data has to make it to disk, but since it's not too expensive nowadays to have gigabytes worth of memory and disk array cache, you can put off database writes for a surprisingly long period of time with the right system design. It's harder to buffer those pesky O_DIRECT WAL writes when they blow right through at least one level of cache.
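Some quick arithmetic shows why rates like that are uncomfortable for a mirrored pair. This is my own back-of-envelope sketch, not from the thread; the ~80 MB/s per-disk sequential write figure is an assumed number for a disk of that era, not a measurement:

```python
# Rough check of how much of a RAID-1 WAL disk's sequential bandwidth a
# sustained WAL stream consumes.  RAID-1 mirrors every write, so the pair
# is no faster than a single disk for writes.
SEQ_WRITE_MBPS = 80.0  # assumed per-disk sequential write rate (illustrative)

def wal_headroom(wal_mbps: float, disk_mbps: float = SEQ_WRITE_MBPS) -> float:
    """Fraction of the WAL disk's sequential write bandwidth consumed."""
    return wal_mbps / disk_mbps

print(f"average load: {wal_headroom(40.0):.0%}")  # 40 MB/s average -> 50%
print(f"peak load:    {wal_headroom(60.0):.0%}")  # 60 MB/s peak    -> 75%
```

At 75% of sequential bandwidth during peaks, any seek activity or fsync stall on that spindle pushes it over the edge, which is consistent with the rate-limiting described above.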
--
* Greg Smith  [EMAIL PROTECTED]  http://www.gregsmith.com  Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] too much WAL volume
On Apr 27, 2007, at 4:58 AM, Greg Smith wrote:

> On Thu, 26 Apr 2007, Zeugswetter Andreas ADI SD wrote:
>> I am not sure that shrinking per-WAL-record size (other than the full
>> page images), e.g. by only logging changed bytes and not whole tuples,
>> would have a huge impact on OLTP tx/sec, since the limiting factor is
>> I/Os per second and not MB per second.
>
> With the kind of caching controller that's necessary for any serious
> OLTP work with Postgres, the number of I/Os per second isn't really an
> important number. The total volume of writes to the WAL volume can be,
> though. It's difficult but not impossible to encounter a workload that
> becomes bottlenecked by WAL volume on a good OLTP server, particularly
> because that WAL is often going to a single or RAID-1 disk. Whether
> those workloads also have the appropriate properties such that their
> WAL could be shrunk usefully in real time is a good question.

Yes, but how many data drives would you need to have to bottleneck on WAL? Even if the entire database is memory resident you'd still have to write all the pages out at some point, and it seems to me that you'd need a fair amount of disk capacity in the data directory before you got pegged by WAL.

When I did some DBT2 testing a bit over a year ago I had a 20 drive RAID10 for data and a mirror for WAL and was nowhere close to pegged on WAL (this was on a Sun V40 connected to one of their storage arrays).

--
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [HACKERS] too much WAL volume
> > Writing to a different area was considered in pg, but there were more
> > negative issues than positive. So imho pg_compresslog is the correct
> > path forward. The current discussion is only about whether we want a
> > more complex pg_compresslog and no change to current WAL, or an
> > increased WAL size for a less complex implementation. Both would be
> > able to compress the WAL to the same archive log size.
>
> Huh? As conceived, pg_compresslog does nothing to lower log volume for
> general purposes, just on-disk storage size for archiving. It doesn't
> help us at all with the tremendous amount of log we put out for an OLTP
> server, for example.

Ok, but that is not related to the original discussion. I have thus changed the subject and removed [PATCHES].

You cannot directly compare the pg WAL size with other databases', since they write parts to other areas (e.g. the physical log in Informix). You would need to include those writes in a fair comparison.

It is definitely not true that writing to a different area has only advantages. The consensus was that writing the page images to the WAL has more pros, though we could revisit the pros and cons. The other options involve special OS and hardware (we already have that), or accepting a high risk of needing a restore after a power outage (we don't have that, because we use no mechanism to detect such a failure).

I am not sure that shrinking per-WAL-record size (other than the full page images), e.g. by only logging changed bytes and not whole tuples, would have a huge impact on OLTP tx/sec, since the limiting factor is I/Os per second and not MB per second. Recent developments like HOT seem a lot more promising in this regard, since they avoid the I/O entirely.

Andreas
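To make the archive-size distinction concrete, here is a toy model of why stripping full-page images shrinks archived WAL so dramatically. This is my own illustration, not pg_compresslog's actual record handling; the record sizes and mix are assumed numbers, and only the 8 kB page size comes from PostgreSQL itself:

```python
# Toy model: a WAL record is either a small logical change record or an
# 8 kB full-page image (taken for the first touch of a page after a
# checkpoint).  Replacing each page image with its logical equivalent
# in the archive -- the pg_compresslog idea -- leaves recovery intact
# on restore while shrinking the archived copy, not the live WAL rate.
PAGE_IMAGE_BYTES = 8192      # PostgreSQL block size
LOGICAL_RECORD_BYTES = 100   # assumed average plain record size

def archive_ratio(records: int, full_page_fraction: float) -> float:
    """Archived size / original size after page images are replaced."""
    fpi = int(records * full_page_fraction)
    plain = records - fpi
    original = fpi * PAGE_IMAGE_BYTES + plain * LOGICAL_RECORD_BYTES
    compressed = records * LOGICAL_RECORD_BYTES
    return compressed / original

# With half the records carrying page images, the archive shrinks ~40x:
print(f"{archive_ratio(10_000, 0.5):.3f}")
```

Note the model also shows the "Huh?" point above: the live server still writes the full `original` volume; only the archived copy gets smaller.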
Re: [HACKERS] too much WAL volume
On Thu, 26 Apr 2007, Zeugswetter Andreas ADI SD wrote:

> I am not sure that shrinking per-WAL-record size (other than the full
> page images), e.g. by only logging changed bytes and not whole tuples,
> would have a huge impact on OLTP tx/sec, since the limiting factor is
> I/Os per second and not MB per second.

With the kind of caching controller that's necessary for any serious OLTP work with Postgres, the number of I/Os per second isn't really an important number. The total volume of writes to the WAL volume can be, though. It's difficult but not impossible to encounter a workload that becomes bottlenecked by WAL volume on a good OLTP server, particularly because that WAL is often going to a single or RAID-1 disk. Whether those workloads also have the appropriate properties such that their WAL could be shrunk usefully in real time is a good question.

--
* Greg Smith  [EMAIL PROTECTED]  http://www.gregsmith.com  Baltimore, MD
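A rough sketch of why a battery-backed write cache shifts the bottleneck from I/Os per second to MB per second. All numbers here are illustrative assumptions of mine, not figures from the thread: without a cache each commit's fsync waits on roughly one disk rotation, while with a cache commits return immediately and the ceiling becomes the rate at which the cache can drain sequentially to the WAL disk:

```python
# Two ceilings for commit throughput on a single WAL disk.
RPM = 10_000                  # assumed WAL disk rotational speed
SEQ_WRITE_MBPS = 80.0         # assumed sequential write drain rate
WAL_BYTES_PER_COMMIT = 2_000  # assumed average WAL generated per commit

def tps_no_cache() -> float:
    """fsync-bound: one commit per rotation, at best."""
    return RPM / 60.0

def tps_with_cache() -> float:
    """bandwidth-bound: cache absorbs fsyncs, volume is the limit."""
    return SEQ_WRITE_MBPS * 1e6 / WAL_BYTES_PER_COMMIT

print(f"fsync-bound:     {tps_no_cache():.0f} tps")
print(f"bandwidth-bound: {tps_with_cache():.0f} tps")
```

Under these assumed numbers the cached ceiling is two orders of magnitude higher, which is why total write volume, not I/O count, is what eventually pegs a well-equipped WAL disk.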