Re: [HACKERS] TOAST compression
"Josh Berkus" wrote > > While I'm waiting to figure out how to get the size of the toast table, at > > least I can provide the speed of query with/without assumed compression on > > the 6K text columns. > > Check out the table_size view in the newsysviews project. Andrew computed the > regular, toast, and index sizes as a query. > Will pg_total_relation_size_oid()/pg_total_relation_size_name() do the job? Regards, Qingqing ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] TOAST compression
Luke,

> While I'm waiting to figure out how to get the size of the toast table, at
> least I can provide the speed of query with/without assumed compression on
> the 6K text columns.

Check out the table_size view in the newsysviews project. Andrew computed
the regular, toast, and index sizes as a query.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [HACKERS] TOAST compression
While I'm waiting to figure out how to get the size of the toast table, at
least I can provide the speed of query with/without assumed compression on
the 6K text columns.

To ensure that we're actually accessing the data in the rows, I do a regexp
query on the TOASTed rows:

mpptestdb=# select count(*) from bigtable1 where b ~ 'a';
 count
-------
 10
(1 row)

Time: 700.574 ms

That's 770MB of database data regexp'ed in 0.7 seconds, or about
1.1GB/second. Not bad, but slower by a good measure than the I/O subsystem.

Now we try the same table with PLAIN storage type:

mpptestdb=# select count(*) from bigtable1 where b ~ 'a';
 count
-------
 10
(1 row)

Time: 284.146 ms

That's the same 770MB in 0.28 seconds, or 2.8GB/s - much better.

- Luke
Re: [HACKERS] TOAST compression
Josh,

On 2/26/06 8:04 PM, "Josh Berkus" wrote:

> Check out SET STORAGE.

I just altered the MIVP data generator in Bizgres MPP to produce the usual
15-column table, but with a 6K row size. You'd only expect a few tens of
bytes variance around the 6K, and the data is randomly chosen words from a
seed file of about 14.5K.

I generated a file with 100,000 rows; total size is 600MB. My tests are
with Bizgres MPP 2.1 GA (based on Postgres 8.1.3) on 4 machines with 8
primary segments.

So, with a column storage type of PLAIN, here's the filesystem resident
size:

mpptestdb=# select relname,8*relpages/128 as MB from pg_class where
relname='bigtable1';
  relname  | mb
-----------+-----
 bigtable1 | 789

And here are a couple of selects (obviously coming from the 64GB of RAM):

mpptestdb=# select count(*) from bigtable1;
 count
-------
 10
(1 row)

Time: 45.685 ms

mpptestdb=# select count(*) from bigtable1;
 count
-------
 10
(1 row)

Time: 25.024 ms

Here it is with the default EXTENDED:

mpptestdb=# select relname,8*relpages/128 as MB from pg_class where
relname='bigtable1';
  relname  | mb
-----------+----
 bigtable1 | 20

Now I'm stuck - I can get the toast table OID, but my attempts to find out
its size are futile. Suggestions?

- Luke
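For a reader hitting the same wall: the TOAST table's size can be pulled
straight out of pg_class (a sketch; it joins pg_class to itself through
reltoastrelid, with 8K pages so relpages/128 gives MB - note that relpages
is only as fresh as the last VACUUM or ANALYZE, and that the TOAST table's
index is counted separately):

  -- heap and TOAST sizes side by side, in MB
  select c.relname,
         c.relpages / 128              as heap_mb,
         coalesce(t.relpages, 0) / 128 as toast_mb
  from pg_class c
       left join pg_class t on t.oid = c.reltoastrelid
  where c.relname = 'bigtable1';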
Re: [HACKERS] TOAST compression
Luke,

> As Jim pointed out, we would need a real test to confirm the behavior;
> I'm not yet acquainted with the toast compression, so it's harder for me
> to compose a real test.

Check out SET STORAGE.

--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
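For the archives, a minimal sketch of what that looks like in practice
(per column; PLAIN disables both compression and out-of-line TOAST storage,
while EXTENDED - the default for text - allows both; the setting only
applies to tuples stored after the change, so existing rows keep their old
representation):

  -- force uncompressed, in-line storage for column b
  alter table bigtable1 alter column b set storage plain;

  -- back to the default for text: compressible and movable out of line
  alter table bigtable1 alter column b set storage extended;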
Re: [HACKERS] TOAST compression
Hannu,

On 2/26/06 12:19 PM, "Hannu Krosing" <[EMAIL PROTECTED]> wrote:

>> On DBT-3 data, I've just run some tests meant to simulate the speed
>> differences of compression versus native I/O. My thought is that an
>> external use of gzip on a binary dump file should be close to the speed
>> of LZW on toasted fields,
>
> Your basic assumption is probably wrong :(
>
> gzip what? The "compression level" setting of gzip has a big effect on
> both compression speed and compression rate. And I suspect that even the
> fastest level (gzip -1) compresses slower and better than PostgreSQL's
> lzcompress.

I thought it might form a lower bound on speed. 75MB/s on an Opteron is
really super fast compared to what I expected from LZ. And since gzip uses
LZ compression, maybe they'll be in the same family of results,
behavior-wise.

I'd be more concerned about the implementation within the executor and how
the "get tuple, decompress tuple, use tuple" cycling might destroy the
efficiency of the scanning compared to decompressing large blocks of data.

As Jim pointed out, we would need a real test to confirm the behavior; I'm
not yet acquainted with the toast compression, so it's harder for me to
compose a real test.

> These are also somewhat bogus tests; if you want them to be comparable
> with the dd below, you should have used 'time gzip -c supplier.bin >
> /dev/null'.

Hmm - the file writing seems to matter on decompress, making decompression
even faster:

[EMAIL PROTECTED] tmp]# time gunzip tmp.bin.gz

real    0m2.254s
user    0m1.718s
sys     0m0.536s

[EMAIL PROTECTED] tmp]# time gzip tmp.bin

real    0m12.912s
user    0m12.555s
sys     0m0.355s

[EMAIL PROTECTED] tmp]# time gzip -c supplier.bin > /dev/null

real    0m12.582s
user    0m12.464s
sys     0m0.115s

[EMAIL PROTECTED] tmp]# time gunzip -c tmp.bin.gz > /dev/null

real    0m1.734s
user    0m1.690s
sys     0m0.043s

- Luke
Re: [HACKERS] TOAST compression
On Sunday, 2006-02-26 at 09:31, Luke Lonergan wrote:
> Jim,
>
> On 2/26/06 8:00 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
>
> > Any idea on how decompression time compares to IO bandwidth? In other
> > words, how long does it take to decompress 1MB vs read that 1MB vs read
> > whatever the uncompressed size is?
>
> On DBT-3 data, I've just run some tests meant to simulate the speed
> differences of compression versus native I/O. My thought is that an
> external use of gzip on a binary dump file should be close to the speed
> of LZW on toasted fields,

Your basic assumption is probably wrong :(

gzip what? The "compression level" setting of gzip has a big effect on both
compression speed and compression rate. And I suspect that even the fastest
level (gzip -1) compresses slower and better than PostgreSQL's lzcompress.

> so I just dumped the "supplier" table (see below) of size 202MB in data
> pages to disk, then ran gzip/gunzip on the binary file. Second test - an
> 8k block dd from that same file, meant to simulate a seq scan (it's
> faster by 25% than doing it in PG though):
>
> gzip/gunzip =
>
> [EMAIL PROTECTED] ls -l supplier.bin
> -rw-r--r-- 1 mppdemo1 mppdemo1 177494266 Feb 26 09:17 supplier.bin
>
> [EMAIL PROTECTED] time gzip supplier.bin
>
> real    0m12.979s
> user    0m12.558s
> sys     0m0.400s
>
> [EMAIL PROTECTED] time gunzip supplier.bin
>
> real    0m2.286s
> user    0m1.713s
> sys     0m0.573s

These are also somewhat bogus tests; if you want them to be comparable with
the dd below, you should have used 'time gzip -c supplier.bin > /dev/null'.

> [EMAIL PROTECTED] time dd if=supplier.bin of=/dev/null bs=8k
> 21666+1 records in
> 21666+1 records out
>
> real    0m0.138s
> user    0m0.003s
> sys     0m0.135s

Hannu
Re: [HACKERS] TOAST compression
Luke Lonergan wrote:
> Jim,
>
> On 2/26/06 10:37 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
>
> > So the cutover point (on your system with very fast IO) is 4:1
> > compression (is that 20 or 25%?).
>
> Actually the size of the gzipp'ed binary file on disk was 65MB, compared
> to 177.5MB uncompressed, so the compression ratio is 37% (?), or 2.73:1.

I doubt our algorithm would give the same compression (though I haven't
really measured it). The LZ implementation we use is supposed to have
lightning speed at the cost of a not-so-good compression ratio.

> No, unfortunately not. O'Reilly's jobs data have 65K rows, so that would
> work. How do we implement LZW compression on toasted fields? I've never
> done it!

See src/backend/utils/adt/pg_lzcompress.c

--
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
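It can be measured from SQL without touching the C code (a sketch;
pg_column_size() - assumed available here, as in 8.1 - reports the stored,
possibly compressed, size of a datum, while octet_length() reports the
logical size; bigtable1 and its text column b are borrowed from Luke's test
elsewhere in the thread):

  -- fraction of logical bytes actually stored after pglz compression
  select sum(pg_column_size(b))::numeric / sum(octet_length(b))
         as stored_fraction
  from bigtable1;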
Re: [HACKERS] TOAST compression
Jim,

On 2/26/06 10:37 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:

> So the cutover point (on your system with very fast IO) is 4:1
> compression (is that 20 or 25%?).

Actually the size of the gzipp'ed binary file on disk was 65MB, compared to
177.5MB uncompressed, so the compression ratio is 37% (?), or 2.73:1.

> But that's assuming that PostgreSQL can read data as fast as dd, which we
> all know isn't the case.

Actually, I had factored that in already. The filesystem delivered
1,200MB/s out of cache in this case - the 300MB/s is what we routinely do
from Postgres seqscan per instance on this system.

> That's also assuming a pretty top-notch IO subsystem.

True - actually I'm pretty impressed with 75MB/s gunzip speed.

> Based on that, I'd argue that 10% is probably a better setting, though it
> would be good to test an actual case (does dbt3 produce fields large
> enough to ensure that most of them will be toasted?)

No, unfortunately not. O'Reilly's jobs data have 65K rows, so that would
work. How do we implement LZW compression on toasted fields? I've never
done it!

> Given the variables involved, maybe it makes sense to add a GUC?

Dunno - I'm not sure how the current scheme works; this is new to me. We
had considered using a compression mechanism for table data, but had some
completely different ideas, more along the lines of a compressing heap
store. The main problem as I see it is the CPU required to get there at
reasonable performance, as you point out. However, the trend is inevitable
- we'll soon have more CPU than we could otherwise use to work with...

- Luke
Re: [HACKERS] TOAST compression
On Sun, Feb 26, 2006 at 09:31:05AM -0800, Luke Lonergan wrote:
> Note that this filesystem can do about 400MB/s, and we routinely see scan
> rates of 300MB/s within PG, so the real comparison is:
>
> Direct seqscan at 300MB/s versus gunzip at 77.5MB/s

So the cutover point (on your system with very fast IO) is 4:1 compression
(is that 20 or 25%?). But that's assuming that PostgreSQL can read data as
fast as dd, which we all know isn't the case. That's also assuming a pretty
top-notch IO subsystem. Based on that, I'd argue that 10% is probably a
better setting, though it would be good to test an actual case (does dbt3
produce fields large enough to ensure that most of them will be toasted?)

Given the variables involved, maybe it makes sense to add a GUC?

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
Re: [HACKERS] TOAST compression
Jim,

On 2/26/06 8:00 AM, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:

> Any idea on how decompression time compares to IO bandwidth? In other
> words, how long does it take to decompress 1MB vs read that 1MB vs read
> whatever the uncompressed size is?

On DBT-3 data, I've just run some tests meant to simulate the speed
differences of compression versus native I/O. My thought is that an
external use of gzip on a binary dump file should be close to the speed of
LZW on toasted fields, so I just dumped the "supplier" table (see below) of
size 202MB in data pages to disk, then ran gzip/gunzip on the binary file.
Second test - an 8k block dd from that same file, meant to simulate a seq
scan (it's faster by 25% than doing it in PG though):

gzip/gunzip =

[EMAIL PROTECTED] ls -l supplier.bin
-rw-r--r-- 1 mppdemo1 mppdemo1 177494266 Feb 26 09:17 supplier.bin

[EMAIL PROTECTED] time gzip supplier.bin

real    0m12.979s
user    0m12.558s
sys     0m0.400s

[EMAIL PROTECTED] time gunzip supplier.bin

real    0m2.286s
user    0m1.713s
sys     0m0.573s

[EMAIL PROTECTED] time dd if=supplier.bin of=/dev/null bs=8k
21666+1 records in
21666+1 records out

real    0m0.138s
user    0m0.003s
sys     0m0.135s

Interpretation =

Zip speed: 177.5 MB in 13 seconds, or 13.6 MB/s
Unzip speed: 177.5 MB in 2.29 seconds, or 77.5 MB/s
Direct access speed: 177.5 MB in 0.138 seconds, or 1,286 MB/s

Note that this filesystem can do about 400MB/s, and we routinely see scan
rates of 300MB/s within PG, so the real comparison is:

Direct seqscan at 300MB/s versus gunzip at 77.5MB/s

Background data =

demo=# \d supplier
          Table "public.supplier"
   Column    |          Type          | Modifiers
-------------+------------------------+-----------
 s_suppkey   | integer                | not null
 s_name      | character(25)          | not null
 s_address   | character varying(40) | not null
 s_nationkey | integer                | not null
 s_phone     | character(15)          | not null
 s_acctbal   | numeric(15,2)          | not null
 s_comment   | character varying(101) | not null

demo=# select relname,8*relpages/128 as MB from pg_class order by relpages
desc limit 6;
 relname  |   mb
----------+--------
 lineitem | 123434
 orders   |  24907
 partsupp |  14785
 part     |   3997
 customer |   3293
 supplier |    202
(6 rows)

Time: 2.024 ms

demo=# copy supplier to '/tmp/supplier.bin' with binary;
COPY
Time: 7328.186 ms
demo=# copy supplier to '/tmp/supplier.txt';
COPY
Time: 5503.168 ms

Note how the text file dumps faster than binary, and it's smaller at 148MB.

demo=# select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 8.1.3 (Bizgres MPP 2.1) on x86_64-unknown-linux-gnu, compiled
 by GCC gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) compiled on Feb 23 2006
 11:34:06
(1 row)
Re: [HACKERS] TOAST compression
Neil Conway <[EMAIL PROTECTED]> writes:
> toast_compress_datum() considers compression to be "successful" if the
> compressed version of the datum is smaller than the uncompressed
> version. I think this is overly generous: if compression reduces the
> size of the datum by, say, 0.01%, it is likely a net loss to use the
> compressed version of the datum since we'll need to pay for LZ
> decompression every time that we de-TOAST it. This situation can occur
> frequently when storing "mostly-uncompressible" data (compressed images,
> encrypted data, etc.) -- some parts of the data will compress well (e.g.
> metadata), but the vast majority will not.

Does it really occur frequently? When dealing with already-compressed or
encrypted data, the LZ transform actually makes the data larger by some
small percentage. This will outweigh any savings on compressible headers
or what have you, just because those are only a tiny part of the file to
begin with. (Else the format designers would have found a way to compress
them too.) So I'd expect the existing test to catch most of the real-world
cases you cite.

I'm not particularly inclined to worry about this without some hard
evidence that it's a problem. You'd need some numerical evidence anyway to
justify any specific threshold, else it's just as arbitrary as "is it
smaller" ... and the latter at least requires a few instructions less to
check.

			regards, tom lane
Re: [HACKERS] TOAST compression
On Sat, Feb 25, 2006 at 09:39:34PM -0500, Neil Conway wrote:
> It's true that LZ decompression is fast, so we should probably use the
> compressed version of the datum unless the reduction in size is very
> small. I'm not sure precisely what that threshold should be, however.

Any idea on how decompression time compares to IO bandwidth? In other
words, how long does it take to decompress 1MB vs read that 1MB vs read
whatever the uncompressed size is?

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
Re: [HACKERS] TOAST compression
Neil Conway wrote:
> toast_compress_datum() considers compression to be "successful" if the
> compressed version of the datum is smaller than the uncompressed
> version. I think this is overly generous: if compression reduces the
> size of the datum by, say, 0.01%, it is likely a net loss to use the
> compressed version of the datum since we'll need to pay for LZ
> decompression every time that we de-TOAST it. This situation can occur
> frequently when storing "mostly-uncompressible" data (compressed images,
> encrypted data, etc.) -- some parts of the data will compress well (e.g.
> metadata), but the vast majority will not.
>
> It's true that LZ decompression is fast, so we should probably use the
> compressed version of the datum unless the reduction in size is very
> small. I'm not sure precisely what that threshold should be, however.
>
> Comments?

20%? 25%?

--
Bruce Momjian                        http://candle.pha.pa.us
SRA OSS, Inc.                        http://www.sraoss.com

+ If your life is a hard drive, Christ can be your backup. +
[HACKERS] TOAST compression
toast_compress_datum() considers compression to be "successful" if the
compressed version of the datum is smaller than the uncompressed version. I
think this is overly generous: if compression reduces the size of the datum
by, say, 0.01%, it is likely a net loss to use the compressed version of
the datum since we'll need to pay for LZ decompression every time that we
de-TOAST it. This situation can occur frequently when storing
"mostly-uncompressible" data (compressed images, encrypted data, etc.) --
some parts of the data will compress well (e.g. metadata), but the vast
majority will not.

It's true that LZ decompression is fast, so we should probably use the
compressed version of the datum unless the reduction in size is very small.
I'm not sure precisely what that threshold should be, however.

Comments?

-Neil
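To put numbers on the 0.01% case (a worked illustration, not part of the
original message): a 1 MB datum is 1,048,576 bytes, and a 0.01% reduction
saves only about 105 of them, so the datum would be kept as roughly
1,048,471 compressed bytes - yet every de-TOAST still decompresses the full
megabyte. The proposed fix amounts to a rule of the form

  compressed_size <= (1 - f) * raw_size

with f somewhere in the 10-25% range suggested elsewhere in the thread,
rather than the current f = 0.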