Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-11-04 Thread Bruce Momjian
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Alvaro Herrera's message of vie oct 28 16:47:13 -0300 2011: BTW we had previous discussions about dropping pg_database's toast table. Maybe this is a good time to do it, even if there's no risk of this bug

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Wait a minute: I'm confused. What's at issue here, at least AIUI, is taking a TOAST pointer and fetching the corresponding value. But that must involve reading from the TOAST table, and our usual paradigm for reading from system catalogs is (1) take

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Alvaro Herrera
Excerpts from Tom Lane's message of vie oct 28 15:37:43 -0300 2011: The main concern I had about detoast before caching is the risk of circularity, ie, needing detoastable cache entries in order to figure out how to detoast. But I think it's probably okay. The current list of catalogs with

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Alvaro Herrera
Excerpts from Alvaro Herrera's message of vie oct 28 16:47:13 -0300 2011: BTW we had previous discussions about dropping pg_database's toast table. Maybe this is a good time to do it, even if there's no risk of this bug (or the hypothetical circularity detoasting problem) showing up there.

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: The risk factor comes in when we hold a syscache entry across transactions; then this guarantee is lost. (In both the original example and the pg_proc case above, the second backend is only at risk when it starts its transaction after the first one's

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: If we made the commit sequence number more generally available, incrementing it at the point of visibility change under cover of ProcArrayLock, and including the then-current value in a Snapshot object when built, would that help with this at

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Alvaro Herrera's message of vie oct 28 16:47:13 -0300 2011: BTW we had previous discussions about dropping pg_database's toast table. Maybe this is a good time to do it, even if there's no risk of this bug (or the hypothetical

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-28 Thread Merlin Moncure
On Fri, Oct 28, 2011 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Alvaro Herrera's message of vie oct 28 16:47:13 -0300 2011: BTW we had previous discussions about dropping pg_database's toast table.  Maybe this is a good time

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-26 Thread Simon Riggs
On Wed, Oct 26, 2011 at 4:24 AM, Tom Lane t...@sss.pgh.pa.us wrote: Even given your recent changes to reduce the overhead of checking for sinval messages, I'm not sure that it'd be practical to move the sinval message processing to just-before-we-look-up-a-cache-entry.  Right now, we do

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-26 Thread Robert Haas
On Tue, Oct 25, 2011 at 11:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: Even given your recent changes to reduce the overhead of checking for sinval messages, I'm not sure that it'd be practical to move the sinval message processing to just-before-we-look-up-a-cache-entry. Wait a minute: I'm

[HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-25 Thread Tom Lane
I believe I have reproduced the behavior described by Andrew Hammond in http://archives.postgresql.org/pgsql-general/2011-10/msg00928.php This is using the regression database: 1. In session 1, do set default_statistics_target TO 1; analyze tenk1; (We need the large stats

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-25 Thread Robert Haas
On Tue, Oct 25, 2011 at 9:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: What has happened here is that the second ANALYZE has marked itself committed in pg_clog and no longer running in the ProcArray, so VACUUM feels entitled to remove toast tuples that the ANALYZE deleted.  However, the ANALYZE

Re: [HACKERS] TOAST versus VACUUM, or missing chunk number 0 for toast value identified

2011-10-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Tue, Oct 25, 2011 at 9:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ordinarily, sending out sinval messages post-commit is okay because we don't release locks until after that, and we suppose that our locks prevent any other transactions from getting

Re: [HACKERS] toast tables on system catalogs

2011-09-05 Thread Bruce Momjian
Alvaro Herrera wrote: Excerpts from Tom Lane's message of mar mar 01 19:03:35 -0300 2011: Alvaro Herrera alvhe...@alvh.no-ip.org writes: Strangely, we made pg_database have a toast table, and the only reason for this is datacl. Should we create toast tables for the remaining catalogs?

Re: [HACKERS] toast tables on system catalogs

2011-09-05 Thread Robert Haas
On Mon, Sep 5, 2011 at 1:01 PM, Bruce Momjian br...@momjian.us wrote: Alvaro Herrera wrote: Excerpts from Tom Lane's message of mar mar 01 19:03:35 -0300 2011: Alvaro Herrera alvhe...@alvh.no-ip.org writes: Strangely, we made pg_database have a toast table, and the only reason for this

[HACKERS] toast error after pg_upgrade 9.0.4 - 9.1rc1

2011-08-31 Thread Lou Picciano
After running an essentially uneventful* pg_upgrade from 9.0.4 - 9.1rc1, we are seeing some toast errors logged on the new cluster: All are of this pattern: ERROR: missing chunk number 0 for toast value 130087 in pg_toast_34735 Have seen the same pattern for a few of the databases in the

Re: [HACKERS] toast error after pg_upgrade 9.0.4 - 9.1rc1

2011-08-31 Thread Bruce Momjian
Lou Picciano wrote: After running an essentially uneventful* pg_upgrade from 9.0.4 - 9.1rc1, we are seeing some toast errors logged on the new cluster: All are of this pattern: ERROR: missing chunk number 0 for toast value 130087 in pg_toast_34735 Have seen the same pattern for a few of

Re: [HACKERS] toast error after pg_upgrade 9.0.4 - 9.1rc1

2011-08-31 Thread Bruce Momjian
Bruce Momjian wrote: Lou Picciano wrote: After running an essentially uneventful* pg_upgrade from 9.0.4 - 9.1rc1, we are seeing some toast errors logged on the new cluster: All are of this pattern: ERROR: missing chunk number 0 for toast value 130087 in pg_toast_34735 Have seen

Re: [HACKERS] toast error after pg_upgrade 9.0.4 - 9.1rc1

2011-08-31 Thread Lou Picciano
- Original Message - From: Bruce Momjian br...@momjian.us To: Lou Picciano loupicci...@comcast.net Cc: pgsql-hackers@postgresql.org Sent: Wednesday, August 31, 2011 10:38:01 PM Subject: Re: [HACKERS] toast error after pg_upgrade 9.0.4 - 9.1rc1 Lou Picciano wrote: After running

[HACKERS] toast tables on system catalogs

2011-03-01 Thread Alvaro Herrera
Hi, Someone on IRC recently noticed that you can't grant USAGE privileges on a table to a large number of roles. (My experiment says 2466 works, 2467 doesn't). Of course, this is wrong and all that. I wrote a blog article about this:

Re: [HACKERS] toast tables on system catalogs

2011-03-01 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes: Strangely, we made pg_database have a toast table, and the only reason for this is datacl. Should we create toast tables for the remaining catalogs? As I commented on your blog, this is nonsense. pg_database has a TOAST table becase we thought

Re: [HACKERS] toast tables on system catalogs

2011-03-01 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar mar 01 19:03:35 -0300 2011: Alvaro Herrera alvhe...@alvh.no-ip.org writes: Strangely, we made pg_database have a toast table, and the only reason for this is datacl. Should we create toast tables for the remaining catalogs? As I commented on your

[HACKERS] TOAST issue on custom index access method

2010-06-24 Thread Carsten Kropf
Hi *, I have a strange issue using a custom built index structure. My index access method support document type composed of words (as tsvector) and points (1-dimensional arrays of them). For internal reasons, I have to save the documents as a whole inside my structure (for proper

Re: [HACKERS] TOAST issue on custom index access method

2010-06-24 Thread Tom Lane
Carsten Kropf ckro...@fh-hof.de writes: I have a strange issue using a custom built index structure. My index access method support document type composed of words (as tsvector) and points (1-dimensional arrays of them). For internal reasons, I have to save the documents as a whole inside

Re: [HACKERS] TOAST issue on custom index access method

2010-06-24 Thread Carsten Kropf
Oh, thanks. I didn't know about index tables not having access to associated toast values. The index access method is (at least in the textual part) similar to GIN, however, I needed to do some changes to it. Saving the whole document is actually only important for vacuum. I think, I will find

Re: [HACKERS] TOAST issue on custom index access method

2010-06-24 Thread Tom Lane
Carsten Kropf ckro...@fh-hof.de writes: However, it is a little bit strange, that I get toasted values (when inserting from another table) and untoasted values, if I insert items directly. Could anybody please explain this to me? Huh? An index will never ever get passed an

Re: [HACKERS] TOAST issue on custom index access method

2010-06-24 Thread Carsten Kropf
Hmm, k, then I don't know exactly what happens. I have some values in my composite type. The size of the index tuple (if coming from one table and being copied to the target table) is about 15x bytes. Otherwise, if I insert the same tuple directly, it is about 12kb. So, I don't know exactly

Re: [HACKERS] TOAST code ignores freespace (was Tweak TOAST code)

2010-05-04 Thread Simon Riggs
On Mon, 2010-05-03 at 23:36 -0400, Jan Wieck wrote: Suppose we had something similar to the NULL value bitmap, specifying plain or compressed values (not TOAST references), that are moved to a shadow tuple inside the toast table. Suppose further we had some statistics about how often

Re: [HACKERS] TOAST code ignores freespace (was Tweak TOAST code)

2010-05-03 Thread Jan Wieck
On 5/2/2010 10:34 AM, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Not commenting further on that patch, but I notice that when we UPDATE the toasting algorithm takes no account of the available freespace on the current block. If we are updating and the space available would make a

[HACKERS] TOAST code ignores freespace (was Tweak TOAST code)

2010-05-02 Thread Simon Riggs
On Wed, 2009-07-22 at 01:21 +, Tom Lane wrote: Tweak TOAST code so that columns marked with MAIN storage strategy are not forced out-of-line unless that is necessary to make the row fit on a page. Previously, they were forced out-of-line if needed to get the row down to the default

Re: [HACKERS] TOAST code ignores freespace (was Tweak TOAST code)

2010-05-02 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Not commenting further on that patch, but I notice that when we UPDATE the toasting algorithm takes no account of the available freespace on the current block. If we are updating and the space available would make a difference to the row length chosen,

Re: [HACKERS] TOAST code ignores freespace (was Tweak TOAST code)

2010-05-02 Thread Simon Riggs
On Sun, 2010-05-02 at 10:34 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Not commenting further on that patch, but I notice that when we UPDATE the toasting algorithm takes no account of the available freespace on the current block. If we are updating and the space

Re: [HACKERS] Toast rel options

2010-04-29 Thread Simon Riggs
On Wed, 2010-04-28 at 21:11 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: It's possible to set toast reloptions for tables that don't have toast tables at all. e.g. create table test (c1 integer); ALTER TABLE test SET (toast.autovacuum_enabled = off); Why? 1.

Re: [HACKERS] Toast rel options

2010-04-29 Thread Alvaro Herrera
Simon Riggs wrote: Also, this seems not to work? postgres=# alter table test set (my.expectation = on); ERROR: unrecognized parameter namespace my I thought we had enabled custom table options in this release? Or was that deferred, or even rejected completely? IIRC you can define your

Re: [HACKERS] Toast rel options

2010-04-29 Thread Alvaro Herrera
Simon Riggs wrote: Why do we protect against this? postgres=# alter table test set (nonexistent = on); ERROR: unrecognized parameter notexistent Eh? The parameter doesn't exist. It's not exactly the same as defining a value for a parameter that exists but is unused. -- Alvaro Herrera

[HACKERS] Toast rel options

2010-04-28 Thread Simon Riggs
It's possible to set toast reloptions for tables that don't have toast tables at all. e.g. create table test (c1 integer); ALTER TABLE test SET (toast.autovacuum_enabled = off); Why? -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list

Re: [HACKERS] Toast rel options

2010-04-28 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: It's possible to set toast reloptions for tables that don't have toast tables at all. e.g. create table test (c1 integer); ALTER TABLE test SET (toast.autovacuum_enabled = off); Why? 1. Why not? 2. They might have toast tables later, after an

[HACKERS] TOAST table (or index?) corruption on PITR replica

2009-03-06 Thread Alvaro Herrera
We have a customer running a very high-velocity database. We have set up a PITR replica that's restoring correctly using JD's PITRtools and pg_standby. However, upon stopping the recovery and starting it for production (for switchover) it promptly gets into these problems: 14895 2009-02-28

Re: [HACKERS] TOAST table (or index?) corruption on PITR replica

2009-03-06 Thread Gregory Stark
Alvaro Herrera alvhe...@commandprompt.com writes: 14903 2009-02-28 22:22:02 EST kangaroo app 49a9fe2e.3a37 ERROR: index pg_toast_42299_index contains unexpected zero page at block 23768 There were actually several different instances of the first error involving different toast values. Do

Re: [HACKERS] TOAST table (or index?) corruption on PITR replica

2009-03-06 Thread Zdenek Kotala
Alvaro Herrera píše v pá 06. 03. 2009 v 10:32 -0300: The servers are running 8.3.5. Does this ring a bell for anyone? look here: http://archives.postgresql.org/pgsql-bugs/2008-11/msg00113.php It look likes that there could be problem with btree wal. Zdenek -- Sent via

[HACKERS] toast by chunk-end (was Re: PG_PAGE_LAYOUT_VERSION 5 - time for change)

2008-11-17 Thread Alvaro Herrera
Zdenek Kotala wrote: Alvaro Herrera napsal(a): Heikki Linnakangas wrote: Hmm, you're right. I think it can be made to work by storing the *end* offset of each chunk. To find the chunk containing offset X, search for the first chunk with end_offset X. FWIW I'm trying to do this. So

[HACKERS] Toast bug in CVS HEAD

2008-11-05 Thread Heikki Linnakangas
A bug was introduced a while ago by this patch: commit 447f7364dd7227a32b58a2aff24f587dd7d7051a Author: Tom Lane [EMAIL PROTECTED] Date: Sat Apr 12 23:14:21 2008 + Create new routines systable_beginscan_ordered, systable_getnext_ordered, systable_endscan_ordered that have

Re: [HACKERS] Toast bug in CVS HEAD

2008-11-05 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Basically, this comment and code in genam.c: ... is wrong, because it assumes that there's only one scankey per index column, but that's not true for toast_fetch_datum_slice(), which uses two scankeys for the chunkid, to fetch a range. Attached

Re: [HACKERS] Toast bug in CVS HEAD

2008-11-05 Thread Heikki Linnakangas
Tom Lane wrote: It's kind of annoying to introduce a search when it's so seldom needed, though. How about something like /* fast path for common case */ if (key[i].sk_attno == irel-rd_index-indkey.values[i]) key[i].sk_attno = i + 1; else ...

[HACKERS] TOAST Threshold? Re: Status of 8.3 patches

2007-08-21 Thread Chris Browne
[EMAIL PROTECTED] (Joshua D. Drake) writes: Alvaro Herrera wrote: Joshua D. Drake wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Heikki Linnakangas wrote: Joshua D. Drake wrote: I guess my point is, if the patch looks good and does not appear to hurt anything, why not apply it? At

Re: [HACKERS] TOAST usage setting

2007-06-08 Thread Zeugswetter Andreas ADI SD
My next suggestion would be to leave EXTERN_TUPLES_PER_PAGE as is, but: Split data wider than a page into page sized chunks as long as they fill whole pages. Split the rest with EXTERN_TUPLES_PER_PAGE (4) as now. This would not waste more space than currently, but improve performance

Re: [HACKERS] TOAST usage setting

2007-06-07 Thread Bruce Momjian
Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: Well, it is summarized here: http://momjian.us/expire/TOAST/SUMMARY.html It made non-TOAST access 2x faster, but TOAST 7x slower, and that seemed like a good compromise. Is this still testing with all data fitting

Re: [HACKERS] TOAST usage setting

2007-06-07 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Is this still testing with all data fitting in RAM? Yes. Having things out of RAM is going to make access even slower, but it is going to allow the heap to be in RAM more often. It would let us measure the actual impact of

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Zeugswetter Andreas ADI SD
No, you misunderstood. Bruce was suggesting changing the target to 512. That means if a row is wider than ~2k, toaster will try to toast until the base row is ~512 bytes. I would not do that part for 8.3. OK, what do you suggest for 8.3? Attached are my suggestion to use 512 and a

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Bruce Momjian
Zeugswetter Andreas ADI SD wrote: No, you misunderstood. Bruce was suggesting changing the target to 512. That means if a row is wider than ~2k, toaster will try to toast until the base row is ~512 bytes. I would not do that part for 8.3. OK, what do you suggest for 8.3?

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: Well, it is summarized here: http://momjian.us/expire/TOAST/SUMMARY.html It made non-TOAST access 2x faster, but TOAST 7x slower, and that seemed like a good compromise. Is this still testing with all data fitting in RAM? -- Gregory Stark

Re: [HACKERS] TOAST usage setting

2007-06-06 Thread Zeugswetter Andreas ADI SD
While I agree, that 2 might be a good compromise with low risc for now, I think that toasting all rows down to ~512 bytes is too narrowly targeted at not reading wider columns. Well, it is summarized here: http://momjian.us/expire/TOAST/SUMMARY.html It made non-TOAST access

Re: [HACKERS] TOAST usage setting

2007-06-05 Thread Zeugswetter Andreas ADI SD
The big question is do we want to drop the target tuple size down to 512, and increase the chunk size to 8k for 8.3? Dropping the tuple size down to 512 is going to give us some smaller TOAST values to fill in free space created by the 8k chuck size, assuming you have both types

Re: [HACKERS] TOAST usage setting

2007-06-05 Thread Bruce Momjian
Zeugswetter Andreas ADI SD wrote: The big question is do we want to drop the target tuple size down to 512, and increase the chunk size to 8k for 8.3? Dropping the tuple size down to 512 is going to give us some smaller TOAST values to fill in free space created by the 8k

Re: [HACKERS] TOAST usage setting

2007-06-04 Thread Jim C. Nasby
On Fri, Jun 01, 2007 at 01:50:12PM -0400, Bruce Momjian wrote: I think the long-term solution is to go to a 2k/8k fragment/block model, but that isn't going to happen for 8.3. There might well have been lessons learned since UFS (anyone know what ZFS does in this regard?), but I agree that we

Re: [HACKERS] TOAST usage setting

2007-06-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes: On Fri, Jun 01, 2007 at 01:50:12PM -0400, Bruce Momjian wrote: The big question is do we want to drop the target tuple size down to 512, and increase the chunk size to 8k for 8.3? If we do that people could see their disk space usage increase by up to

Re: [HACKERS] TOAST usage setting

2007-06-01 Thread Bruce Momjian
Gregory Stark wrote: Gregory Stark [EMAIL PROTECTED] writes: Bruce Momjian [EMAIL PROTECTED] writes: shared_buffers again was 32MB so all the data was in memory. The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it

Re: [HACKERS] TOAST usage setting

2007-06-01 Thread Bruce Momjian
Zeugswetter Andreas ADI SD wrote: It is good. It shows, that we even see a small advantage in the everything cached case. What we don't have yet is numbers for whether EXTERN_TUPLES_PER_PAGE=1 substantially increases the toast table size for real life scenarios, what happens in the worst

Re: [HACKERS] TOAST usage setting

2007-05-31 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: shared_buffers again was 32MB so all the data was in memory. The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it causes. You seem to be intentionally avoiding testing the precise thing we're

Re: [HACKERS] TOAST usage setting

2007-05-31 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes: Bruce Momjian [EMAIL PROTECTED] writes: shared_buffers again was 32MB so all the data was in memory. The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it causes. You seem to be

Re: [HACKERS] TOAST usage setting

2007-05-31 Thread Zeugswetter Andreas ADI SD
I tested EXTERN_TUPLES_PER_PAGE for values 4(default), 2, and 1: 4 15.596 2 15.197 1 14.6 which is basically a 3% decrease from 4-2 and 2-1. The test script and result are here: http://momjian.us/expire/TOAST2/ shared_buffers again was

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Zeugswetter Andreas ADI SD
I reran the tests with hashtext(), and created a SUMMARY.HTML chart: http://momjian.us/expire/TOAST/ What you will see is that pushing TEXT to a TOAST column allows quick access to non-TOAST values and single-row TOAST values, but accessing all TOAST columns is slower than

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: My expectation would be, that we want to allow a toast tuple to fill a whole page (TOAST_TUPLES_PER_PAGE = 1), I've been wondering about that too. It certainly needs to be experimented with, independently of TOAST_TUPLE_THRESHOLD, now that

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: My expectation would be, that we want to allow a toast tuple to fill a whole page (TOAST_TUPLES_PER_PAGE = 1), I thought that previously but then I started thinking maybe that's not true. Picture

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes: Whereas if you set toast_tuples_per_page to 8k then the only option for Postgres will be to put each datum in its own page and waste 1-3k on every page. No, because actually the code is designed to make the toast chunk size just enough less than 8K that

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes: The shorter-than-normal tuples carrying the last chunk of any particular datum are going to result in wasted space to the extent that we can't pack them together on a page, but that's true now. Right now, if you have a large toasted datum, it mostly will

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Zeugswetter Andreas ADI SD
Whereas if you set toast_tuples_per_page to 8k then the only option for Postgres will be to put each datum in its own page and waste 1-3k on every page. No, because actually the code is designed to make the toast chunk size just enough less than 8K that the tuples fit. He

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Bruce Momjian
Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Whereas if you set toast_tuples_per_page to 8k then the only option for Postgres will be to put each datum in its own page and waste 1-3k on every page. No, because actually the code is designed to make the toast chunk size just

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Heikki Linnakangas
Bruce Momjian wrote: What you will see is that pushing TEXT to a TOAST column allows quick access to non-TOAST values and single-row TOAST values, but accessing all TOAST columns is slower than accessing them in the heap, by a factor of 3-18x. Looking at the chart, it seems 512 is the proper

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: Uh, am I supposed to be running more TOAST tests? Would someone explain what they want tested? If you want my opinion I would say we need two tests: 1) For TOAST_TUPLE_TARGET: We need to run the test scripts you have already for sizes that cause

Re: [HACKERS] TOAST usage setting

2007-05-30 Thread Bruce Momjian
I tested EXTERN_TUPLES_PER_PAGE for values 4(default), 2, and 1: 4 15.596 2 15.197 1 14.6 which is basically a 3% decrease from 4-2 and 2-1. The test script and result are here: http://momjian.us/expire/TOAST2/ shared_buffers again was 32MB

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I created backends with power-of-2 seetings for TOAST_TUPLES_PER_PAGE (4(default), 8, 16, 32, 64) which gives

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I created backends with power-of-2 seetings for TOAST_TUPLES_PER_PAGE (4(default), 8, 16, 32,

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I created backends with power-of-2 seetings for

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes: Is your database initialized with C locale? If so then length(text) is optimized to not have to detoast: if (pg_database_encoding_max_length() == 1) PG_RETURN_INT32(toast_raw_datum_size(str) - VARHDRSZ); Of course I got that wrong.

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I created backends with

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Alvaro Herrera
Bruce Momjian wrote: Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Alvaro Herrera wrote: Strangely, 128 bytes seems to be the break-even point for TOAST and non-TOAST, even for sequential scans of the entire heap touching all long row values. I am somewhat confused why TOAST has faster access than inline heap data. Is your database

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Is your database initialized with C locale? If so then length(text) is optimized to not have to detoast: if (pg_database_encoding_max_length() == 1) PG_RETURN_INT32(toast_raw_datum_size(str) - VARHDRSZ); Wow,

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Alvaro Herrera
Bruce Momjian wrote: My test uses random data, which I figured was a close to real-world as I could get, and I have a test that makes sure the data was pushed to the TOAST table. Should I still try EXTERNAL? My point is that you probably want to measure separately the effect of compression

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Alvaro Herrera wrote: Bruce Momjian wrote: My test uses random data, which I figured was a close to real-world as I could get, and I have a test that makes sure the data was pushed to the TOAST table. Should I still try EXTERNAL? My point is that you probably want to measure

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: Gregory Stark wrote: Is your database initialized with C locale? If so then length(text) is optimized to not have to detoast: if (pg_database_encoding_max_length() == 1)

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes: No, we did substring() too :) Uh, I looked at text_substring(), and while there is an optimization to do character counting for encoding length == 1, it is still accessing the data. Sure but it'll only access the first chunk. There are two chunks in

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: No, we did substring() too :) Uh, I looked at text_substring(), and while there is an optimization to do character counting for encoding length == 1, it is still accessing the data. Sure but it'll only access the first

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Bruce Momjian wrote: Gregory Stark wrote: Bruce Momjian [EMAIL PROTECTED] writes: No, we did substring() too :) Uh, I looked at text_substring(), and while there is an optimization to do character counting for encoding length == 1, it is still accessing the data. Sure but

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: I reran the tests with hashtext(), and created a SUMMARY.HTML chart: http://momjian.us/expire/TOAST/ I don't understand what the numbers in this chart are? regards, tom lane ---(end of

Re: [HACKERS] TOAST usage setting

2007-05-29 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I reran the tests with hashtext(), and created a SUMMARY.HTML chart: http://momjian.us/expire/TOAST/ I don't understand what the numbers in this chart are? They are taken from the test script and output files that are also in

[HACKERS] TOAST usage setting

2007-05-28 Thread Bruce Momjian
bruce wrote: * Re: [HACKERS] Modifying TOAST thresholds /Tom Lane/ At this point it seems nothing will be done about this issue for 8.3. I'm not sure anyone has an idea how to test it. TPCC isn't really useful because it has a fixed size (500 byte) string buffer. Perhaps if we

Re: [HACKERS] TOAST usage setting

2007-05-28 Thread Chris Browne
[EMAIL PROTECTED] (Bruce Momjian) writes: The results are here: http://momjian.us/expire/TOAST/ I'll take a look and see if there's anything further it makes sense for me to try testing. Thanks for following up so quickly; what with the cold I have had, I haven't yet gotten back to the

[HACKERS] TOAST table names

2006-12-13 Thread Simon Riggs
Is there a big reason why TOAST tables are called such cryptic names? e.g. pg_toast.pg_toast_16399 Wouldn't it be more pleasant to have them called the same thing as their parent e.g. pg_toast.name_toast This would be very convenient for most purposes, though it would mean we'd have to do

Re: [HACKERS] TOAST table names

2006-12-13 Thread Bernd Helmle
On Wed, 13 Dec 2006 11:29:44 +, Simon Riggs [EMAIL PROTECTED] wrote: Is there a big reason why TOAST tables are called such cryptic names? e.g. pg_toast.pg_toast_16399 Wouldn't it be more pleasant to have them called the same thing as their parent e.g. pg_toast.name_toast Wouldn't

Re: [HACKERS] TOAST compression

2006-02-27 Thread Qingqing Zhou
Josh Berkus josh@agliodbs.com wrote While I'm waiting to figure out how to get the size of the toast table, at least I can provide the speed of query with/without assumed compression on the 6K text columns. Check out the table_size view in the newsysviews project. Andrew computed the

Re: [HACKERS] TOAST compression

2006-02-26 Thread Jim C. Nasby
On Sat, Feb 25, 2006 at 09:39:34PM -0500, Neil Conway wrote: It's true that LZ decompression is fast, so we should probably use the compressed version of the datum unless the reduction in size is very small. I'm not sure precisely what that threshold should be, however. Any idea on how

Re: [HACKERS] TOAST compression

2006-02-26 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes: toast_compress_datum() considers compression to be successful if the compressed version of the datum is smaller than the uncompressed version. I think this is overly generous: if compression reduces the size of the datum by, say, 0.01%, it is likely a net

Re: [HACKERS] TOAST compression

2006-02-26 Thread Luke Lonergan
Jim, On 2/26/06 8:00 AM, Jim C. Nasby [EMAIL PROTECTED] wrote: Any idea on how decompression time compares to IO bandwidth? In other words, how long does it take to decompress 1MB vs read that 1MB vs read whatever the uncompressed size is? On DBT-3 data, I've just run some tests meant to

Re: [HACKERS] TOAST compression

2006-02-26 Thread Jim C. Nasby
On Sun, Feb 26, 2006 at 09:31:05AM -0800, Luke Lonergan wrote: Note that this filesystem can do about 400MB/s, and we routinely see scan rates of 300MB/s within PG, so the real comparision is: Direct seqscan at 300MB/s versus gunzip at 77.5MB/s So the cutover point (on your system with very

Re: [HACKERS] TOAST compression

2006-02-26 Thread Luke Lonergan
Jim, On 2/26/06 10:37 AM, Jim C. Nasby [EMAIL PROTECTED] wrote: So the cutover point (on your system with very fast IO) is 4:1 compression (is that 20 or 25%?). Actually the size of the gzipp'ed binary file on disk was 65MB, compared to 177.5MB uncompressed, so the compression ratio is 37%

Re: [HACKERS] TOAST compression

2006-02-26 Thread Alvaro Herrera
Luke Lonergan wrote: Jim, On 2/26/06 10:37 AM, Jim C. Nasby [EMAIL PROTECTED] wrote: So the cutover point (on your system with very fast IO) is 4:1 compression (is that 20 or 25%?). Actually the size of the gzipp'ed binary file on disk was 65MB, compared to 177.5MB uncompressed, so

Re: [HACKERS] TOAST compression

2006-02-26 Thread Hannu Krosing
Ühel kenal päeval, P, 2006-02-26 kell 09:31, kirjutas Luke Lonergan: Jim, On 2/26/06 8:00 AM, Jim C. Nasby [EMAIL PROTECTED] wrote: Any idea on how decompression time compares to IO bandwidth? In other words, how long does it take to decompress 1MB vs read that 1MB vs read whatever the

Re: [HACKERS] TOAST compression

2006-02-26 Thread Luke Lonergan
Hannu, On 2/26/06 12:19 PM, Hannu Krosing [EMAIL PROTECTED] wrote: On DBT-3 data, I've just run some tests meant to simulate the speed differences of compression versus native I/O. My thought is that an external use of gzip on a binary dump file should be close to the speed of LZW on

  1   2   >