[HACKERS] feature request: explain "with details" option

2016-09-08 Thread Roger Pack
My apologies if this was already requested before...

I think it would be fantastic if postgres had an "explain the explain" option:
Today's explain tells us what loops and scans were used, relative
costs, etc.  It doesn't seem to tell us *why* the planner elected to
do what it did.

For instance, in the case of a corrupted index, it doesn't say why
it's not using that index; it just doesn't use it, causing some
confusion to end users.  At least it caused confusion to me.

Or take the case where it iterates over an entire table (seq. scan)
instead of using an index, because the index range specified covers
"most of the table" (so the index wouldn't actually help).  The
choice is appropriate, but the reasoning why is not explicitly
mentioned, again causing some delay while you try to "decipher the
mind" of the planner.  Sometimes tables (ex: tables that have just
been populated) need an "analyze" run on them, but it's not clear
from the "explain" output that the analyze statistics are stale.  Not
even a hint.
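
To make that concrete, here's a minimal sketch (hypothetical table
and column names) of the planner silently choosing a seq scan for a
wide range and an index scan for a narrow one, with no rationale
shown in either case:

  CREATE TABLE t (id int, val text);
  INSERT INTO t SELECT g, md5(g::text) FROM generate_series(1, 100000) g;
  CREATE INDEX ON t (id);
  ANALYZE t;

  EXPLAIN SELECT * FROM t WHERE id > 100;
  -- Seq Scan on t ...            (range covers nearly the whole table)
  EXPLAIN SELECT * FROM t WHERE id > 99900;
  -- Index Scan using t_id_idx ... (narrow range, so the index wins)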

So this is a feature request for an "EXPLAIN DETAILS" option or
something like it: basically today's explain, but with more
"rationale" included.  This could be immensely useful to many
Postgres users.
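
Purely to illustrate the request (this option does not exist today;
both the syntax and the output below are made up):

  EXPLAIN (DETAILS) SELECT * FROM t WHERE id > 100;
  -- Seq Scan on t  (cost=0.00..2084.00 rows=99900 width=37)
  --   Rationale: index t_id_idx was considered but rejected; estimated
  --   selectivity ~99.9% makes a sequential scan cheaper than index
  --   access.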

I'd even be willing to chip in a couple hundred bucks to help grease
the wheels for somebody taking up the challenge, if that helps at all
:)

Thank you for your consideration in this regard.
-roger-




Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-09-01 Thread Roger Pack
> I don't think it would be horrifically hard to change the way toast OIDs
> are assigned (I'm thinking we'd basically switch to creating a sequence
> for every toast table), but I don't think anyone's ever tried to push
> toast hard enough to hit this kind of limit.

I'd be interested in promoting this effort, just to contribute back
to the community (scope from me: hundreds of dollars, before anybody
gets too excited).  If anybody's interested, give me a bid out of
band.  If anybody else would be interested in contributing some money
to this cause, let me know.
Thanks!




Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-04-27 Thread Roger Pack
On 4/27/15, Jim Nasby  wrote:
> On 4/25/15 1:19 PM, Bruce Momjian wrote:
>>  Note if you are storing a table with rows that exceed 2KB in size
>>  (aggregate size of each row) then the "Maximum number of rows in a
>>  table" may be limited to 4 Billion, see TOAST.
>
> That's not accurate though; you could be limited to far less than 4B
> rows. If each row has 10 fields that toast, you'd be limited to just
> 400M rows.

Good point.  I noted that on the TOAST wiki page now, at least (and
also mentioned that using partitioning is a "work around" for now).




[HACKERS] Fwd: ability to return number of rows inserted into child partition tables request

2014-11-17 Thread Roger Pack
Hello.

I was trying to get postgres to return the "correct" number of rows
inserted for batch inserts to a partitioned table [using the triggers
suggested at
http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
results in it always returning 0 by default].
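
For reference, a minimal sketch of that documented trigger approach
(table and column names hypothetical); because the trigger returns
NULL, the command tag always comes back as "INSERT 0 0", no matter
how many rows actually landed in the children:

  CREATE OR REPLACE FUNCTION parent_insert_trigger() RETURNS trigger AS $$
  BEGIN
    IF NEW.logdate >= DATE '2014-01-01' THEN
      INSERT INTO child_2014 VALUES (NEW.*);
    ELSE
      INSERT INTO child_older VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- suppresses the insert into the parent, and the row count
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER parent_insert BEFORE INSERT ON parent
    FOR EACH ROW EXECUTE PROCEDURE parent_insert_trigger();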

What I ideally wanted was to be able to insert into just the child
partitions and have it return the number of rows affected.

It seems the state of the art is to return the NEW row from the
insert trigger [which causes the row to also be saved to the parent
(master) table], then define an extra trigger to remove that row from
the parent table.  So that's two inserts and one delete per logical
insert. [1]
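
A sketch of that pattern (names hypothetical; the cleanup keys on the
primary key):

  -- before trigger: copy the row to the child, but RETURN NEW so the
  -- command tag counts it (the row also lands in the parent)
  CREATE OR REPLACE FUNCTION parent_redirect() RETURNS trigger AS $$
  BEGIN
    INSERT INTO child_2014 VALUES (NEW.*);
    RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  -- after trigger: scrub the duplicate row back out of the parent
  CREATE OR REPLACE FUNCTION parent_cleanup() RETURNS trigger AS $$
  BEGIN
    DELETE FROM ONLY parent WHERE id = NEW.id;
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER parent_redirect_trg BEFORE INSERT ON parent
    FOR EACH ROW EXECUTE PROCEDURE parent_redirect();
  CREATE TRIGGER parent_cleanup_trg AFTER INSERT ON parent
    FOR EACH ROW EXECUTE PROCEDURE parent_cleanup();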

Or you can use an unconditional rule, and it will return the number
of rows affected [however, in this case, since we're using
partitioning, I think we need multiple rules, one for each child
table].

It is possible for a view to use a trigger and still return the
number of rows affected, which provides another work-around (see the
bottom of [1]).
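
A sketch of the view work-around (names hypothetical): since an
INSTEAD OF trigger that returns NEW counts the row toward the command
tag, inserts through the view report the true count:

  CREATE VIEW parent_w AS SELECT * FROM parent;

  CREATE OR REPLACE FUNCTION parent_w_insert() RETURNS trigger AS $$
  BEGIN
    INSERT INTO child_2014 VALUES (NEW.*);
    RETURN NEW;  -- counts the row; nothing is stored in the view itself
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER parent_w_insert_trg INSTEAD OF INSERT ON parent_w
    FOR EACH ROW EXECUTE PROCEDURE parent_w_insert();

  -- INSERT INTO parent_w ... now reports "INSERT 0 N"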

Is there some more elegant way here?  It seems odd that partitioned
tables basically cannot, without a *lot* of massaging, return the
number of rows affected; am I missing something, or do I understand
it correctly?  [Today this forces people into lots of work-arounds,
like *not* checking the number of rows returned for batch inserts,
etc., which is potentially dangerous as well.]

Is there, for instance, some work-around, like a way to manually
increment the count of rows affected by the command?  Or could
conditional rules be made capable of returning the output string with
the number of rows affected (feature request)?

I guess this has come up before, FWIW.
http://grokbase.com/t/postgresql/pgsql-general/0863bjzths/insert-into-master-table-0-rows-affected-hibernate-problems


One way of fixing this would be to allow "do instead" rules on normal
tables, instead of only on views (otherwise we are forced to use a
trigger; correct me if I'm wrong).  I'd wager there would be other
viable options as well.

Thanks!
-roger-


[1]
http://stackoverflow.com/questions/83093/hibernate-insert-batch-with-partitioned-postgresql


[HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-01-30 Thread Roger Pack
>> On 1/29/15, Roger Pack  wrote:
>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>> tables that have BLOB's
>>
>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>> reasoning is the same...
>
> It only applies to large objects, not bytea or text.

OK, I think I figured out why the wiki says this.  I guess BYTEA
entries > 2KB will be automatically stored via TOAST, which uses an
OID in its backend, so BYTEA has the same limitation.  It appears
that disabling TOAST is not an option [1].
So I guess if the number of BYTEA entries > 2KB (summed across all
tables? does partitioning not help?) exceeds 4 billion, then there is
actually no option there?  If that occurred, it might cause "all
sorts of things to break"? [2]
Thanks!
-roger-

[1] 
http://www.postgresql.org/message-id/20130405140348.gc4...@awork2.anarazel.de
[2] 
http://www.postgresql.org/message-id/CAL1QdWfb-p5kE9DT2pMqBxohaKG=vxmdremsbjc+7tkboek...@mail.gmail.com




Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-01-31 Thread Roger Pack
Oops, forgot to forward this to the list (suggestion/feature request
to the admins of the various pg lists: make the default "reply-to" go
to the list, not the sender, if at all possible).

Response below:

On 1/30/15, Jim Nasby  wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack  wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOB's
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki says this.  I guess BYTEA
>> entries > 2KB will be automatically stored via TOAST, which uses an
>> OID in its backend, so BYTEA has the same limitation.  It appears
>> that disabling TOAST is not an option [1].
>> So I guess if the number of BYTEA entries > 2KB (summed across all
>> tables? does partitioning not help?) exceeds 4 billion, then there is
>> actually no option there?  If that occurred, it might cause "all
>> sorts of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK so "system stability" doesn't degrade per se when it wraps [since
they all use that GetNewOid method or similar [?] good to know.

So basically when it gets near 4B TOAST'ed rows it may have to wrap that
counter and search for "unused" number, and for each number it's
querying the TOAST table to see if it's already used, degrading
performance.

So I guess partitioning tables for now is an acceptable work around,
good to know.
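
For the archives, a rough sketch of that work-around (9.x
inheritance-style partitioning, hypothetical names): each child table
gets its own toast table, and therefore its own 4B-OID space:

  CREATE TABLE blobs (id bigint, payload bytea);
  CREATE TABLE blobs_p0 (CHECK (id < 2000000000)) INHERITS (blobs);
  CREATE TABLE blobs_p1 (CHECK (id >= 2000000000)) INHERITS (blobs);
  -- route inserts to the children with a trigger, as in the
  -- partitioning chapter of the docs; each child toasts independently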

Thanks much for your response, good to know the details before we dive
into postgres with our 8B row table with BYTEA's in it :)




Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-02-02 Thread Roger Pack
On 1/30/15, Jim Nasby  wrote:
> On 1/30/15 11:54 AM, Roger Pack wrote:
>>>> On 1/29/15, Roger Pack  wrote:
>>>>> Hello.  I see on this page a mention of basically a 4B row limit for
>>>>> tables that have BLOB's
>>>>
>>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>>> reasoning is the same...
>>>
>>> It only applies to large objects, not bytea or text.
>>
>> OK, I think I figured out why the wiki says this.  I guess BYTEA
>> entries > 2KB will be automatically stored via TOAST, which uses an
>> OID in its backend, so BYTEA has the same limitation.  It appears
>> that disabling TOAST is not an option [1].
>> So I guess if the number of BYTEA entries > 2KB (summed across all
>> tables? does partitioning not help?) exceeds 4 billion, then there is
>> actually no option there?  If that occurred, it might cause "all
>> sorts of things to break"? [2]
>
> It's a bit more complex than that. First, toast isn't limited to bytea;
> it holds for ALL varlena fields in a table that are allowed to store
> externally. Second, the limit is actually per-table: every table gets
> its own toast table, and each toast table is limited to 4B unique OIDs.
> Third, the OID counter is actually global, but the code should handle
> conflicts by trying to get another OID. See toast_save_datum(), which
> calls GetNewOidWithIndex().
>
> Now, the reality is that GetNewOidWithIndex() is going to keep
> incrementing the global OID counter until it finds an OID that isn't in
> the toast table. That means that if you actually get anywhere close to
> using 4B OIDs you're going to become extremely unhappy with the
> performance of toasting new data.

OK so "system stability" doesn't degrade per se when it wraps, good to know.

So basically when it gets near 4B rows it may have to wrap that
counter multiple times, and for each "entry" it's searching if it's
already used, etc.

So I guess partitioning tables for now is an acceptable work around,
good to know.

Thanks much for your response, good to know the details before we dive
into postgres with our 8B row table with BYTEA's in it :)




Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-02-02 Thread Roger Pack
On 2/2/15, José Luis Tallón  wrote:
> On 01/31/2015 12:25 AM, Jim Nasby wrote:
>> [snip]
>> It's a bit more complex than that. First, toast isn't limited to
>> bytea; it holds for ALL varlena fields in a table that are allowed to
>> store externally. Second, the limit is actually per-table: every table
>> gets its own toast table, and each toast table is limited to 4B
>> unique OIDs. Third, the OID counter is actually global, but the code
>> should handle conflicts by trying to get another OID. See
>> toast_save_datum(), which calls GetNewOidWithIndex().
>>
>> Now, the reality is that GetNewOidWithIndex() is going to keep
>> incrementing the global OID counter until it finds an OID that isn't
>> in the toast table. That means that if you actually get anywhere close
>> to using 4B OIDs you're going to become extremely unhappy with the
>> performance of toasting new data.
>
> Indeed ..
>
>> I don't think it would be horrifically hard to change the way toast
>> OIDs are assigned (I'm thinking we'd basically switch to creating a
>> sequence for every toast table), but I don't think anyone's ever tried
>> to push toast hard enough to hit this kind of limit.
>
> We did: The Billion Table Project, part 2 (a.k.a. "when does Postgres'
> OID allocator become a bottleneck").  The allocator becomes
> essentially unusable at about 2.1B OIDs, whereas it performed very well
> at "quite empty" (< 100M objects) levels.
>
> So yes, using one sequence per TOAST table should help.
> Combined with the new SequenceAMs / sequence implementation being
> proposed (specifically: one file for all sequences in a certain
> tablespace) this should scale much better.

But it wouldn't be perfect, right?  I mean, if you had multiple
deletions/insertions and passed 4B, then the "one sequence per TOAST
table" counter would still wrap [albeit more slowly], and performance
would start degrading the same way.  And there would still be the
hard 4B limit.  Perhaps the foreign key to the TOAST table could be
changed from oid (32 bits) to something else (64 bits) [as well as
the sequence] so that it never wraps?  What do you think?  And would
a more aggressive change like this have a chance of being accepted
into the code base?
Thanks.
-roger-

