Re: [HACKERS] Table and Index compression

2009-08-12 Thread Pierre Frédéric Caillaud
For future reference, and since this keeps appearing every few months: The license of LZO is not acceptable for inclusion or use with PostgreSQL. You need to find a different library if you want to pursue this further. Yes, I know about the license... I used LZO for tests, but since my

Re: [HACKERS] Table and Index compression

2009-08-12 Thread Peter Eisentraut
On Tuesday 11 August 2009 13:05:39 Pierre Frédéric Caillaud wrote: > Well, here is the patch. I've included a README, which I paste here. > If someone wants to play with it (after the CommitFest...) feel free to > do so. > While it was an interesting thing to try, I don't think it

Re: [HACKERS] Table and Index compression

2009-08-11 Thread Pierre Frédéric Caillaud
Well, here is the patch. I've included a README, which I paste here. If someone wants to play with it (after the CommitFest...) feel free to do so. While it was an interesting thing to try, I don't think it has enough potential to justify more effort... * How to test - apply the

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Josh Berkus
Pierre, > I added a field in PageHeader which contains : > - 0 to indicate a non-compressed page > - length of compressed data if compressed > > If compression gains nothing (ie gains less than 4K), the page is > stored raw. > > It seems that only pages having a PageHeader ar
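The quoted scheme (a header field holding 0 for a raw page, or the compressed length otherwise, with a fallback to raw storage when the gain is under 4K) can be sketched roughly as follows. This is only an illustration, not the patch's code: zlib stands in for the compressor, the 8 kB page size is assumed to be the PostgreSQL default, and `pack_page` / `unpack_page` are hypothetical names.

```python
import zlib
from typing import Tuple

PAGE_SIZE = 8192   # assumed: PostgreSQL's default block size
MIN_GAIN = 4096    # "gains less than 4K" threshold from the quoted message


def pack_page(page: bytes) -> Tuple[int, bytes]:
    """Return (header_field, payload); a header_field of 0 means stored raw."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one full page")
    compressed = zlib.compress(page)  # stand-in compressor, not the one in the patch
    if PAGE_SIZE - len(compressed) < MIN_GAIN:
        return 0, page                # not enough gain: keep the page uncompressed
    return len(compressed), compressed


def unpack_page(header_field: int, payload: bytes) -> bytes:
    """Invert pack_page using only the header field to pick the path."""
    if header_field == 0:
        return payload
    return zlib.decompress(payload[:header_field])
```

A page of zeros takes the compressed path, while a page of random bytes falls back to raw storage and comes back unchanged.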

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 04:17:18PM +0200, Pierre Frédéric Caillaud wrote: > I'm answering my own question : at the beginning of the run, postgres > creates an 800MB temporary file, then it fills the table, then deletes the > temp file. > Is this because I use generate_series to fill the test t

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
On Fri, 07 Aug 2009 15:42:35 +0200, Kevin Grittner wrote: Pierre Frédéric Caillaud wrote: tablespace is a RAID5 of 3 drives, xlog is on a RAID1 of 2 drives, but it does it too if I put the tablespace and data on the same volume. it starts out relatively fast : si so bi bo in

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Kevin Grittner
Pierre Frédéric Caillaud wrote: > tablespace is a RAID5 of 3 drives, xlog is on a RAID1 of 2 drives, > but it does it too if I put the tablespace and data on the same > volume. > it starts out relatively fast : > > si so bi bo in cs us sy id wa > 00 0 43680 2796 1

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 03:29:44PM +0200, Pierre Frédéric Caillaud wrote: > vmstat output : Sorry, I don't know enough of PG's internals to suggest anything here, but iostat may give you more details as to what's going on. -- Sam http://samason.me.uk/

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
Not strictly related to compression, but I've noticed something really strange... pg 8.4 (vanilla) is doing it, and my compressed version is doing it too. tablespace is a RAID5 of 3 drives, xlog is on a RAID1 of 2 drives, but it does it too if I put the tablespace and data on the same volume. T

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
For reference what I'm picturing is this: When a table is compressed it's marked read-only which bars any new tuples from being inserted or existing tuples being deleted. Then it's frozen and any pages which contain tuples which can't be frozen are waited on until they can be. When it's finished e

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Robert Haas
On Fri, Aug 7, 2009 at 8:18 AM, Greg Stark wrote: > For reference what I'm picturing is this: > > When a table is compressed it's marked read-only which bars any new > tuples from being inserted or existing tuples being deleted. Then it's > frozen and any pages which contain tuples which can't be fr

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 12:59:57PM +0100, Greg Stark wrote: > On Fri, Aug 7, 2009 at 12:48 PM, Sam Mason wrote: > >> Well most users want compression for the space savings. So running out > >> of space sooner than without compression when most of the space is > >> actually unused would disappoint t

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Greg Stark
For reference what I'm picturing is this: When a table is compressed it's marked read-only which bars any new tuples from being inserted or existing tuples being deleted. Then it's frozen and any pages which contain tuples which can't be frozen are waited on until they can be. When it's finished ev

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Greg Stark
On Fri, Aug 7, 2009 at 12:48 PM, Sam Mason wrote: >> Well most users want compression for the space savings. So running out >> of space sooner than without compression when most of the space is >> actually unused would disappoint them. > > Note, that as far as I can tell for a filesystem you only

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 11:49:46AM +0100, Greg Stark wrote: > On Fri, Aug 7, 2009 at 11:29 AM, Sam Mason wrote: > > When you choose a compression algorithm you know how much space a worst > > case compression will take (i.e. lzo takes up to 8% more for a 4kB block > > size). This space should be r

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
Also, I'm puzzled why the space increase would be proportional to the amount of data and be more than 300 bytes. There's no reason it wouldn't be a small fixed amount. The ideal is you set aside one bit -- if the bit is set the rest is compressed and has to save at least one bit. If the bi
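The "small fixed amount" argument can be illustrated with a flag-prefixed frame: whatever the compressor does to incompressible input, the stored block never grows by more than the flag itself. This is a minimal sketch under loud assumptions: a whole flag byte is used where the message speaks of a single bit, zlib is an arbitrary stand-in compressor, and `frame` / `unframe` are hypothetical names.

```python
import zlib

RAW = b"\x00"         # flag value: body is stored uncompressed
COMPRESSED = b"\x01"  # flag value: body is zlib-compressed


def frame(block: bytes) -> bytes:
    """Prefix the block with a flag; compress only when that actually saves space."""
    compressed = zlib.compress(block)
    if len(compressed) < len(block):
        return COMPRESSED + compressed
    return RAW + block   # worst case: original size plus the one flag byte


def unframe(framed: bytes) -> bytes:
    """Recover the original block from a framed one."""
    flag, body = framed[:1], framed[1:]
    if flag == COMPRESSED:
        return zlib.decompress(body)
    if flag == RAW:
        return body
    raise ValueError("unknown flag byte")
```

With this framing the worst-case overhead is one byte per block regardless of block size, rather than a percentage of the data.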

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Greg Stark
On Fri, Aug 7, 2009 at 11:29 AM, Sam Mason wrote: > When you choose a compression algorithm you know how much space a worst > case compression will take (i.e. lzo takes up to 8% more for a 4kB block > size).  This space should be reserved in case of situations like the > above and the filesystem sh

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 10:33:33AM +0100, Greg Stark wrote: > 2009/8/7 Pierre Frédéric Caillaud : > > Also, about compressed NTFS : it can give you disk-full errors on read(). > > I suspect it's unavoidable for similar reasons to the problems > Postgres faces. When you issue a read() you have to f

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Greg Stark
2009/8/7 Pierre Frédéric Caillaud : > > Also, about compressed NTFS : it can give you disk-full errors on read(). I suspect it's unavoidable for similar reasons to the problems Postgres faces. When you issue a read() you have to find space in the filesystem cache to hold the data. Some other data

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Sam Mason
On Fri, Aug 07, 2009 at 10:36:39AM +0200, Pierre Frédéric Caillaud wrote: > Also, about compressed NTFS : it can give you disk-full errors on read(). > While this may appear stupid, it is in fact very good. Is this not just because they've broken the semantics of read? > As a side note, I have

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
On Thu, Aug 6, 2009 at 4:03 PM, Greg Stark wrote: I like the idea too, but I think there are some major problems to solve. In particular I think we need a better solution to blocks growing than sparse files. How much benefit does this approach have over using TOAST compression more aggressively

Re: [HACKERS] Table and Index compression

2009-08-07 Thread Pierre Frédéric Caillaud
First, a few things that I forgot to mention in the previous message : I like the idea too, but I think there are some major problems to solve. In particular I think we need a better solution to blocks growing than sparse files. Sparse files allow something great : to test this concept in

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Ron Mayer
I'm curious what advantages there are in building compression into the database itself, rather than using filesystem-based compression. I see ZFS articles[1] discuss how enabling compression improves performance with ZFS; for Linux, Btrfs has compression features as well[2]; and on Windows NTFS se

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Josh Berkus
On 8/6/09 1:03 PM, Greg Stark wrote: > One possibility is to handle only read-only tables. That would make > things a *lot* simpler. But it sure would be inconvenient if it's only > useful on large static tables but requires you to rewrite the whole > table -- just what you don't want to do with la

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Kevin Grittner
Robert Haas wrote: > On Thu, Aug 6, 2009 at 4:03 PM, Greg Stark wrote: >> I like the idea too, but I think there are some major problems to >> solve. In particular I think we need a better solution to blocks >> growing than sparse files. > > How much benefit does this approach have over using TO

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Robert Haas
On Thu, Aug 6, 2009 at 4:03 PM, Greg Stark wrote: > I like the idea too, but I think there are some major problems to > solve. In particular I think we need a better solution to blocks > growing than sparse files. How much benefit does this approach have over using TOAST compression more aggressiv

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Greg Stark
I like the idea too, but I think there are some major problems to solve. In particular I think we need a better solution to blocks growing than sparse files. The main problem with using sparse files is that currently postgres is careful to allocate blocks early so it can fail if there's not enoug

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Guillaume Smet
Pierre, On Thu, Aug 6, 2009 at 11:39 AM, PFC wrote: > The best for this is lzo : very fast decompression, a good compression ratio > on a sample of postgres table and indexes, and a license that could work. The license of lzo doesn't allow us to include it in PostgreSQL without relicensing Postgr

Re: [HACKERS] Table and Index compression

2009-08-06 Thread Josh Berkus
On 8/6/09 2:39 AM, PFC wrote: > > > With the talk about adding compression to pg_dump lately, I've been > wondering if tables and indexes could be compressed too. > So I've implemented a quick on-the-fly compression patch for postgres I find this very interesting, and would like to test it f

[HACKERS] Table and Index compression

2009-08-06 Thread PFC
With the talk about adding compression to pg_dump lately, I've been wondering if tables and indexes could be compressed too. So I've implemented a quick on-the-fly compression patch for postgres. Sorry for the long email, but I hope you find this interesting. Why compress ? 1- To sa