Re: How to reference a DB with a period in its name ?

2024-03-30 Thread Jan Wieck

On 3/29/24 17:16, David Gauthier wrote:

I tried encapsulating the DB name in double quotes (no good), single 
quotes (still no good), escaping with '\' (no good), escaping with ".." 
(no good).


SELECT * FROM "thedb.v1".theschem.thetab;

Just as documented.





Re: Failed to parse new syntax

2023-08-05 Thread Jan Wieck
Are you aware that PostgreSQL has a built-in '~' operator for regular 
expressions?
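
For example, assuming column a is (or is cast to) text, something like 
this already works without any new grammar:

   select * from t2 where a::text ~ '1';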



Regards, Jan


On 8/5/23 11:56, jacktby jacktby wrote:

/*
*   similarity_search_expr is used for our multi-mode
*   similarity_search, and we just use this for multi
*   cols search.
*/
similarity_search_expr:
    sub_search_expr '<' AexprConst
        {
            $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "<", $1, $3, @2);
        }
    ;

sub_search_expr:
    '[' col_tuple_expr '~' AexprConst ']'
        {
            $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "~", $2, $4, @3);
        }
    ;

col_tuple_expr:
    columnref                    { $$ = $1; }
    | '(' col_list_expr ')'      { $$ = $2; }
    ;

col_list_expr:
    columnref
        {
            ResTarget *target = makeNode(ResTarget);

            target->name = NULL;
            target->indirection = NIL;
            target->val = (Node *) $1;
            target->location = @1;
            $$ = list_make1(target);
        }
    | col_list_expr ',' columnref    { $$ = lappend($1, $3); }
    ;
This is my new grammar.
But I get the error below:
   postgres=# select * from t2 where [a ~ 1] < 0;
   ERROR:  syntax error at or near "~"
   LINE 1: select * from t2 where [a ~ 1] < 0;
It's strange that it can't parse '~'; I added it to the parser.







Re: Active Active PostgreSQL Solution

2023-06-14 Thread Jan Wieck

On 6/9/23 06:38, Mohsin Kazmi wrote:

Hello Everyone,

I have been working on PostgreSQL databases for the last three years and 
have also migrated databases from Oracle to PostgreSQL. I have 
configured PostgreSQL for logical replication as well.


Now in order to deploy PostgreSQL in our production servers, I need to 
configure it in Active Active mode. Can anyone help me to do so?


pgEdge is working on a solution for active-active. It is not yet GA, so 
it is not ready for production. It will be available for free.


https://www.pgedge.com/


Best Regards,
Jan





Re: Profiling a function call

2023-05-21 Thread Jan Wieck

On 5/20/23 00:36, Tiffany Thang wrote:

Hi,
I have a function that has been executing for a long time and not 
returning any results. Wait event=NULL so it seems like it is still 
executing and not waiting on any specific resources. Is there a way to 
profile the function call to get an idea of what it is currently 
executing within the function? All I could find in pg_stat_activity is 
the function is running. I want to know what query/transaction within 
the function is running. Is there a way to obtain that information?


This may give you some idea. I haven't touched or used it in many years, 
but it might get you started:


https://github.com/wieck/plbacktrace


Best Regards, Jan



PostgreSQL 15.2 on Linux.

Thanks.

Tiff






Re: Purpose of DEFERRABLE _and_ INITIALLY DEFERRED foreign key constraint checking?

2022-08-04 Thread Jan Wieck

On 8/3/22 20:30, Ron wrote:

AWS RDS Postgresql 12.10

https://www.postgresql.org/docs/12/sql-createtable.html

[quote]
DEFERRABLE
NOT DEFERRABLE

    This controls whether the constraint can be deferred. A constraint
    that is not deferrable will be checked immediately after every
    command. Checking of constraints that are deferrable can be
    postponed until the end of the transaction.
[/quote]

And yet a DEFERRABLE FK constraint in a transaction immediately failed 
on a FK constraint violation.


[quote]
INITIALLY IMMEDIATE
INITIALLY DEFERRED

    If a constraint is deferrable, this clause specifies the default
    time to check the constraint. If the constraint is INITIALLY
    IMMEDIATE, it is checked after each statement. This is the default.
    If the constraint is INITIALLY DEFERRED, it is checked only
    at the end of the transaction.
[/quote]

INITIALLY DEFERRED solved my problem.  Why do both clauses exist?


This is as per the Standard.

The default is NOT DEFERRABLE, and when DEFERRABLE is specified the 
default check time is still INITIALLY IMMEDIATE. This can then be 
overridden inside a transaction with SET CONSTRAINTS so that one or more 
(or all) DEFERRABLE constraints will be deferred until the end of the 
transaction OR until they are explicitly set to IMMEDIATE again. Setting 
a previously DEFERRED constraint to IMMEDIATE will immediately run all 
the queued-up checks.


This gives the application fine-grained control over when constraints 
are checked.


The purpose of deferrable constraints is to do things that are normally 
impossible, like for example a circular constraint, where you want 
table A and table B to have guaranteed 1:1 content on their primary 
key. For every row in A there must be a corresponding row in B and vice 
versa. This is implemented with two constraints where A and B point at 
each other. Without deferring those constraints it would be impossible 
to ever get a single row into either of them.
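
A minimal sketch of that circular case (the column name id is made up):

CREATE TABLE a (id integer PRIMARY KEY);
CREATE TABLE b (id integer PRIMARY KEY);
ALTER TABLE a ADD CONSTRAINT a_needs_b
    FOREIGN KEY (id) REFERENCES b (id) DEFERRABLE INITIALLY DEFERRED;
ALTER TABLE b ADD CONSTRAINT b_needs_a
    FOREIGN KEY (id) REFERENCES a (id) DEFERRABLE INITIALLY DEFERRED;

BEGIN;
INSERT INTO a VALUES (1);   -- FK check against b is queued, not run yet
INSERT INTO b VALUES (1);   -- FK check against a is queued as well
COMMIT;                     -- both queued checks run here and succeed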



Regards, Jan




Re: External psql editor

2022-04-29 Thread Jan Wieck

On 4/29/22 16:17, Rich Shepard wrote:

On Fri, 29 Apr 2022, Jan Wieck wrote:


Did you hit Enter after \e ?


Jan,

Yes. For example, I put a previous command at the prompt to be modified. It
began with 'insert ...' so I added an initial \e to the command. psql told
me that \einsert is not a valid command after I pressed the [Enter] key.


It is the other way around, like in


postgres-# select now()\e


That puts "select now()" into vi(1) for me.


Regards, Jan




Re: External psql editor

2022-04-29 Thread Jan Wieck

On 4/29/22 15:50, Mladen Gogala wrote:
Is there a way to define the name of the temporary file created by the \e 
command? I'd like to name it "afiedt.buf", not for sentimental reasons. 
I already have a cron job that cleans afiedt.buf from my home directory 
every hour and having psql name the temporary file like that would simplify 
the cleaning process. The name comes from another database with the same 
editor construct as \e. I am actually quite used to that.


I honestly don't know. However, psql reads back that temporary file 
and removes it when you exit the external editor, so there is no use for 
that cron job here. This is similar to the behavior of other systems 
like "virsh edit DOM" for example.



Regards, Jan




Re: External psql editor

2022-04-29 Thread Jan Wieck

On 4/29/22 14:10, Rich Shepard wrote:

I tried, unsuccessfully, to use \e. Entering it while a command is displayed
does nothing. So I'm doing something wrong.


Did you hit Enter after \e ?


Regards, Jan




Re: External psql editor

2022-04-29 Thread Jan Wieck

On 4/29/22 13:13, Rich Shepard wrote:

While in psql, type \e and Enter. You will have the current query buffer
in the editor. You can do this at the end of a partial (not yet semicolon
terminated) query.


Can I set it before entering any command or better yet, when I invoke psql?


Not that I know of. \e starts the external editor and you have to save 
and exit that editor to get back to psql in order to execute it. IMHO 
the whole construct has very limited usability.



Regards, Jan




Re: External psql editor

2022-04-29 Thread Jan Wieck

On 4/29/22 11:55, Rich Shepard wrote:

I do all my postgres work using the psql shell. Editing a command requires
moving character-by-character and I'd like to use my small text editor (joe)
because it allows more control over line movement.

A web search found a stackexchange thread that suggested adding to
~/.bash_profile the line:
export PSQL_EDITOR=/usr/bin/joe
so I did this yesterday.

Today I've learned that the keyboard chords I use in joe in other
applications aren't working here. For example, C-w should delete the word to
the right of the point (cursor location). It doesn't. Instead, it deletes
from the cursor position to the head of the line. C-x doesn't move the cursor
one word to the right, but cancels the command.

Is there a way for me to specify use of joe at the psql command line? (I'd
use emacs but that's gross overkill.)


What you are missing is that even though the PSQL_EDITOR env variable is 
set, psql itself doesn't emulate that editor's behavior natively. You 
need to actually launch the editor (possibly while having a partial 
query in the buffer) with the \e command.


While in psql, type \e and Enter. You will have the current query buffer 
in the editor. You can do this at the end of a partial (not yet 
semicolon terminated) query.
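
Putting it together, a session might look like this (with PSQL_EDITOR 
exported as above; the query is just an example):

postgres=# select now()
postgres-# \e

At that point joe opens with "select now()" in the buffer. Edit it, save 
and exit, and psql reads the buffer back and runs it once it ends with a 
semicolon.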



Best Regards, Jan




Re: Help with large delete

2022-04-16 Thread Jan Wieck
Make your connection immune to disconnects by using something like the
screen utility.


Regards, Jan

On Sat, Apr 16, 2022, 09:26 Perry Smith  wrote:

> Currently I have one table that mimics a file system.  Each entry has a
> parent_id and a base name where parent_id is an id in the table that must
> exist in the table or be null with cascade on delete.
>
> I’ve started a delete of a root entry with about 300,000 descendants.  The
> table currently has about 22M entries and I’m adding about 1600 entries per
> minute still.  Eventually there will not be massive amounts of entries
> being added and the table will be mostly static.
>
> I started the delete before from a terminal that got detached.  So I
> killed that process and started it up again from a terminal less likely to
> get detached.
>
> My question is basically how can I make life easier for Postgres?  I
> believe (hope) the deletes will be few and far between but they will happen
> from time to time.  In this case, Dropbox — its a long story that isn’t
> really pertinent.  The point is that @#$% happens.
>
> “What can I do” includes starting completely over if necessary.  I’ve only
> got about a week invested in this and its just machine time at zero cost.
> I could stop the other processes that are adding entries and let the delete
> finish if that would help.  etc.
>
> Thank you for your time,
> Perry
>
>


Re: What have I done!?!?!? :-)

2022-04-08 Thread Jan Wieck

On 4/8/22 09:58, Jan Wieck wrote:

It only affects constraints that have been declared DEFERRABLE. Those
that are not are silently ignored (as per SQL standard).


I should have said "... silently ignored by this statement and still 
fire IMMEDIATE".



Just to be clear, Jan




Re: What have I done!?!?!? :-)

2022-04-08 Thread Jan Wieck

On 4/8/22 09:27, Magnus Hagander wrote:



On Fri, Apr 8, 2022 at 3:23 PM Perry Smith wrote:

It has been a long time since I’ve done Rails stuff.  What follows
is the best I can recall but please take it with a grain of salt.

The first problem is that generally Rails does not put constraints
in the database.  There were others like me who thought that was
insane and would put constraints in the database — this includes
foreign key constraints, check constraints, etc.  About the only
constraint that could be added into the DB using native Rails was
the “not null” constraint.

When foreign and other constraints were added, it broke something
they call “Fixtures” which are preset db states that are plopped 
into the DB during testing.  To “fix” that, I (and others) would add
this into our code base: (I’m adding this to see what you guys think
of it — is it safer / better or just as insane?)

       def disable_referential_integrity()
         transaction {
           begin
             execute "SET CONSTRAINTS ALL DEFERRED"
             yield
           ensure
             execute "SET CONSTRAINTS ALL IMMEDIATE"
           end
         }
       end


This is perfectly normal code and nothing wrong with it. DEFERRED 
constraints are how you are *supposed* to handle such things. It defers 
the check of the foreign key to the end of the transaction, but it will 
still fail to commit if the foreign key is broken *at that point*. But 
it lets you do things like modify multiple tables that refer to each 
other, and have the changes only checked when they're all done.


Indeed, especially because this code does not require any elevated 
permissions, guarantees referential integrity at commit time and 
guarantees that no inconsistent, intermediate state will ever be visible 
to another concurrent session.


It only affects constraints that have been declared DEFERRABLE. Those 
that are not are silently ignored (as per SQL standard).
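
Stripped of the Ruby wrapper, what that helper sends to the server boils 
down to something like this (the table names are made up):

BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- modify tables that reference each other; checks on DEFERRABLE
-- constraints are queued instead of failing right away
UPDATE parent SET id = id + 1000;
UPDATE child SET parent_id = parent_id + 1000;
SET CONSTRAINTS ALL IMMEDIATE;  -- queued checks run here
COMMIT;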


A lot of frameworks didn't support foreign keys because one of the most 
popular databases at that time didn't support them. Well, the SQL parser 
of that particular database would accept the syntax, but the engine 
would not enforce anything. Even the manual of that database stated that 
"foreign keys are mostly for documentation purposes and are not needed 
as long as the application does all operations in the correct order." 
They changed that part of the documentation when support for InnoDB was 
added. Therefore I would not put all blame on the Rails developers.



Best Regards, Jan




Re: What have I done!?!?!? :-)

2022-04-08 Thread Jan Wieck

On 4/8/22 08:58, Magnus Hagander wrote:
A side-note on this, which of course won't help the OP at this point, 
but if the general best practice of not running the application with a 
highly privileged account is followed, the problem won't occur (it will 
just fail early before breaking things). DISABLE TRIGGER ALL requires 
either ownership of the table or superuser permissions, neither of which 
it is recommended that the application run with. Doesn't help once the 
problem has occurred of course, but can help avoid it happening in the 
future.


It gets even better further down in that code, where it UPDATEs 
pg_constraint directly. That not only requires superuser but also catupd 
permissions (which are separate from superuser for a reason).



Regards, Jan




Re: What have I done!?!?!? :-)

2022-04-08 Thread Jan Wieck

On 4/8/22 01:57, Nikolay Samokhvalov wrote:
On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck wrote:


So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.


It's DISABLE TRIGGER ALL 
https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12 


Similar poison, same side effect.

Looking further at that code it also directly updates the PostgreSQL 
system catalog. This is a big, red flag.


Why do the Rails developers think they need a sledgehammer like that? It 
seems to be doing that for over 7 years, so it is hard to tell from the 
commit log why they need to disable RI at all.



Regards, Jan




Re: What have I done!?!?!? :-)

2022-04-07 Thread Jan Wieck

On 4/6/22 18:25, Perry Smith wrote:

Rather than explain how I got here, I’ll just explain the state I’m in.
...

I’m using Active Record with the psql adapter.  It has a 
disable_referential_integrity which takes a block of code.  When the 
block of code exits, the constraints are put back.  At least, that is 
what I thought.


I’m wondering if the disabled constraints are still disabled somehow. 
  If so, how would I check for that and how would I turn them back on? 
  Or am I way off in the weeds?


That depends on how exactly Active Record does this disabling of 
constraints. There is a GUC in PostgreSQL 'session_replication_role'. 
Setting that to value 'replica' will do precisely that as a side effect. 
Its primary purpose is for logical replication systems (like Londiste, 
Slony and logical decoding based ones) to disable user triggers and 
referential integrity actions (like on delete cascade) as well as 
integrity checking under the assumption that those actions have been 
performed on the origin database and will be replicated as well or are 
unnecessary.


Note that changing that setting requires PostgreSQL superuser privilege. 
Precisely because of the danger of getting your database into an 
inconsistent state.
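
If you want to check what a session is currently running under, and 
whether any triggers got disabled at the catalog level (a different 
mechanism, ALTER TABLE ... DISABLE TRIGGER), a quick probe could be:

SHOW session_replication_role;

SELECT tgrelid::regclass AS rel, tgname, tgenabled
  FROM pg_trigger
 WHERE tgenabled = 'D';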


So **IF** Active Record is using that feature, then it can dump any 
amount of garbage into your PostgreSQL database and PostgreSQL will 
happily accept it with zero integrity checking.



Best Regards, Jan




Re: reading this group other than thru mails

2021-11-16 Thread Jan Wieck

On 11/16/21 03:36, Sanjay Minni wrote:
Is there any way to read this group other than through mail, or to search old 
mails other than through the interface at 
https://www.postgresql.org/list/pgsql-general ? I don't seem to find 
these very friendly.


There are 11 pages of search results for "forum email" in pgsql-general. 
I believe the answer is somewhere in there.


https://www.postgresql.org/search/?m=1=forum+email+=2=-1=r


--
Jan Wieck




Re: bottom / top posting

2021-06-10 Thread Jan Wieck

On 6/10/21 12:08 PM, Basques, Bob (CI-StPaul) wrote:

Alternatively, where I deem appropriate I will do inline 
posting/clipping, but I always announce that at the top (posting) of the 
response.


"where you deem appropriate" you will do that, and where "I deem 
appropriate" I will stop reading whatever you thought was important.


Want to get a message to me? Think less about what you deem appropriate.


Best Regards, Jan

--
Jan Wieck
Postgres User since 1994




Re: bottom / top posting

2021-06-09 Thread Jan Wieck

On 6/7/21 5:40 PM, Joshua Drake wrote:
At least the students haven't discovered this list yet and are posting 
their homework assignments (like they do on chat channels).


You forgot to mention that they are posting their homework assignments 
as cellphone pictures taken from half broken implementations in whatever 
GUI they are using.



Regards, Jan

--
Jan Wieck
Postgres User since 1994




Re: pg_upgrade and wraparound

2021-05-03 Thread Jan Wieck

On 4/30/21 3:32 PM, Bruce Momjian wrote:

On Sat, Mar 13, 2021 at 08:43:54AM -0500, Jan Wieck wrote:

On 3/12/21 8:30 PM, Michael Paquier wrote:
> Hi Jan,
> 
> On Fri, Mar 12, 2021 at 06:13:33PM -0500, Jan Wieck wrote:

> > One of the things in my way is that when using pg_resetwal to put the
> > NextXID way into the future (to push the old cluster close to wraparound for
> > example), the postmaster won't start because it doesn't have the pg_xact
> > files for that around. Should pg_resetwal create the files in the gap
> > between the old NextXID and the new one?
> 
> I think that you should add this patch to the next commit fest to
> track it properly:
> https://commitfest.postgresql.org/33/
> --
> Michael
> 


Actually this is the wrong patch (this one is for PG-12, not for HEAD). Will
update later today.

But yes, putting it into the next commitfest after initial discussion is the
plan.


Uh, were either of these things done?



Not yet, but I will enter it so that we can get it into 15 for sure.


Regards, Jan

--
Jan Wieck
Postgres User since 1994




Re: The Amazon CloudFront distribution is configured to block access from your country.

2021-04-12 Thread Jan Wieck

On 4/12/21 5:13 PM, felix.quin...@yahoo.com wrote:

Unfortunately the official site for downloading the installer is blocked for me.
I only had that option and trust what I am downloading.


That wasn't your only option. Other members have suggested that you 
contact EnterpriseDB directly.


Note that CloudFront is just a content delivery service by Amazon Web 
Services. The owner of that content, in this case EnterpriseDB, controls 
the "GeoRestrictions" that you are having trouble with. See



https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/georestrictions.html


for details.

It is possible that this restriction is on purpose and that EDB really 
doesn't want people from your country to download those files. It is 
also possible that this particular download got that restriction by 
mistake, either on EDB's side or on the side of the third-party GeoIP 
service Amazon is using to determine the origin of your IP address. I 
know a few people at EDB (I did work there in the past). Asking them 
won't do any harm, and if all of this is actually a mistake, they will 
be glad you asked.



Regards, Jan

--
Jan Wieck
Postgres User since 1994




Re: The Amazon CloudFront distribution is configured to block access from your country.

2021-04-11 Thread Jan Wieck

On 4/11/21 11:16 AM, Ron wrote:
If my software requires zlib, and I link to https://zlib.net/ on my 
page, then your logic absolutely means that I'm responsible for 
supporting zlib.


Let's clarify "support" in this context. If you link to zlib.net, you 
are not responsible for zlib itself. But if for whatever reason that 
link breaks, you are to some degree responsible for giving a user (who 
tried to follow that link) a helping hand in finding what you linked 
to.


The OP did precisely that. Asking for help because a link didn't work. 
And I think it was perfectly appropriate to ask here.



--
Jan Wieck
Postgres User since 1994




Re: Unkillable processes creating millions of tiny temp files

2021-03-20 Thread Jan Wieck

On 3/5/21 6:57 PM, Tom Lane wrote:


Not sure how fast that is either.  If you need to do it again, you could
try manually rm'ing everything under the pgsql_tmp directory before
letting the postmaster start.

That is actually a strategy that works rather well. mv(1) the tmp 
directory to something date(1) based, then kick off a recursive rm(1) 
-rf on everything named pgsql_tmp_*. That won't miss anything in the 
case the whole server is restarted while the procedure is under way. It 
can cause multiple rm(1) processes trampling over each other, but that 
has no real ill side effects. They are just trying to unlink a file 
another one already did.


Under normal circumstances the rm(1) will clean up while the postmaster 
is already up and possibly created a new pgsql_tmp.



Regards, Jan

--
Jan Wieck
Principle Database Engineer
Amazon Web Services




Re: pg_upgrade and wraparound

2021-03-13 Thread Jan Wieck

On 3/12/21 8:30 PM, Michael Paquier wrote:

Hi Jan,

On Fri, Mar 12, 2021 at 06:13:33PM -0500, Jan Wieck wrote:

One of the things in my way is that when using pg_resetwal to put the
NextXID way into the future (to push the old cluster close to wraparound for
example), the postmaster won't start because it doesn't have the pg_xact
files for that around. Should pg_resetwal create the files in the gap
between the old NextXID and the new one?


I think that you should add this patch to the next commit fest to
track it properly:
https://commitfest.postgresql.org/33/
--
Michael



Actually this is the wrong patch (this one is for PG-12, not for HEAD). 
Will update later today.


But yes, putting it into the next commitfest after initial discussion is 
the plan.



Regards, Jan

--
Jan Wieck
Principle Database Engineer
Amazon Web Services




Re: pg_upgrade and wraparound

2021-03-12 Thread Jan Wieck

Resurrecting an old thread.

We (AWS) have seen this wraparound during pg_upgrade more often recently 
with customers who have millions of large objects in their databases.


On 6/11/18 1:14 PM, Tom Lane wrote:

Andres Freund  writes:

I suspect the issue is that pg_resetwal does:
if (set_xid != 0)
{
    ControlFile.checkPointCopy.nextXid = set_xid;

    /*
     * For the moment, just set oldestXid to a value that will force
     * immediate autovacuum-for-wraparound.  It's not clear whether adding
     * user control of this is useful, so let's just do something that's
     * reasonably safe.  The magic constant here corresponds to the
     * maximum allowed value of autovacuum_freeze_max_age.
     */
    ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000;
    if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId)
        ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId;
    ControlFile.checkPointCopy.oldestXidDB = InvalidOid;
}



but we have a codepath that doesn't check for oldestXidDB being
InvalidOid.  Not great.


Hm, I think I'd define the problem as "pg_resetwal is violating the
expectation that oldestXidDB be valid".

However, this just explains the basically-cosmetic issue that the
complaint message mentions OID 0.  It doesn't really get us to the
answer to why Alexander is seeing a failure.  It might be useful
to see pg_controldata output for the old cluster, as well as
"select datname, datfrozenxid from pg_database" output from the
old cluster.


Unfortunately I don't have pg_controldata output from the old clusters 
either. I would like to be able to artificially create an "old" cluster 
that fails during pg_upgrade in that way.


One of the things in my way is that when using pg_resetwal to put the 
NextXID way into the future (to push the old cluster close to wraparound 
for example), the postmaster won't start because it doesn't have the 
pg_xact files for that around. Should pg_resetwal create the files in 
the gap between the old NextXID and the new one?


One thing I do have is a patch that provides a workaround for the 
problem as well as a substantial speed improvement for the case at hand. 
This patch adds some options to pg_upgrade, pg_dump and pg_restore.


Option added to pg_dump:

--blob-in-parallel

This option requires --schema-only. It causes pg_dump to emit the BLOB 
metadata with SECTION_DATA instead of SECTION_PRE_DATA. This causes the 
statements for creating the large object metadata (lo_create(OID) and 
ALTER LARGE OBJECT) to move into the parallel phase of pg_restore, which 
means that their metadata will be created in parallel. In my tests a 
database containing only large objects is upgraded in 1/#cores of the time.


Option added to pg_restore:

--blob-batch-size=N

With this option pg_restore tries to put N BLOB TOC entries into one 
transaction. This is per parallel worker and it will commit those 
batches if there is a change in object type, so only BLOB TOC entries 
will ever be batched at all. With a sufficient 
'max_locks_per_transaction' a --blob-batch-size=1000 nicely reduces the 
number of XIDs consumed for upgrading 10M large objects from 20M to 10K.


Options added to pg_upgrade:

--blob-in-parallel   forwarded to pg_dump
--blob-batch-size=N  forwarded to pg_restore
--restore-jobs=N forwarded as --jobs=N to pg_restore


Patch is attached.


Regards, Jan


--
Jan Wieck
Principle Database Engineer
Amazon Web Services
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index c6059fc..fdcb5e7 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -865,6 +865,11 @@ RunWorker(ArchiveHandle *AH, ParallelSlot *slot)
 	WaitForCommands(AH, pipefd);
 
 	/*
+	 * Close an eventually open BLOB batch transaction.
+	 */
+	CommitBlobTransaction((Archive *)AH);
+
+	/*
 	 * Disconnect from database and clean up.
 	 */
 	set_cancel_slot_archive(slot, NULL);
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 97941fa..2bedd02 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -201,6 +201,8 @@ typedef struct Archive
 
 	int			numWorkers;		/* number of parallel processes */
 	char	   *sync_snapshot_id;	/* sync snapshot id for parallel operation */
+	int			blobBatchSize;	/* # of blob to restore per transaction */
+	bool		blobInParallel;	/* place "BLOB" TEs in SECTION_DATA */
 
 	/* info needed for string escaping */
 	int			encoding;		/* libpq code for client_encoding */
@@ -268,6 +270,7 @@ extern void WriteData(Archive *AH, const void *data, size_t dLen);
 extern int	StartBlob(Archive *AH, Oid oid);
 extern int	EndBlob(Archive *AH, Oid oid);
 
+extern void CommitBlobTransaction(Archive 

Re: Christopher Browne

2020-11-04 Thread Jan Wieck
Christopher Browne was my colleague and friend at Afilias and beyond, for more
than a decade. I have countless memories that connect us. He will be missed.


Jan

On Wed, Nov 4, 2020, 18:29 Steve Singer  wrote:

>
> It is with much sadness that I am letting the community know that Chris
> Browne passed away recently.
>
> Chris had been a long time community member and was active on various
> Postgresql mailing lists.  He was a member of the funds committee, the
> PgCon
> program committee and served on the board of the PostgreSQL Community
> Association of Canada. Chris was a maintainer of the Slony
> replication system and worked on various PG related tools.
>
> I worked with Chris for over 10 years and anytime someone had a problem he
> would jump at the chance to help and propose solutions. He
> always had time to listen to your problem and offer ideas or explain how
> something worked.
>
> I will miss Chris
>
> Steve
>
>
>
>


Re: unexpected chunk number 2 (expected 0) for toast value ... in pg_toast_18536

2020-03-15 Thread Jan Wieck
Have you tried to reindex the table? Toast internally forces an index scan,
so missing index tuples or an otherwise corrupted toast index would have
the same symptoms as toast chunks actually missing.
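
A first, low-risk thing to try along those lines (using the table named 
later in this thread) could be:

REINDEX TABLE blobs.doc_obj;

REINDEX TABLE also rebuilds the index of the table's TOAST table, which 
is the one that matters for the "unexpected chunk" symptom.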


Regards, Jan

On Sun, Mar 15, 2020, 16:21 Karsten Hilbert  wrote:

> On Sun, Mar 15, 2020 at 12:58:53PM -0700, Adrian Klaver wrote:
>
> > > > We then tried to DELETE the offending row
> > > >
> > > >   delete from blobs.doc_obj where pk = 82224;
> > > >
> > > > but that, again, shows the "unexpected chunk" problem.
> > >
> > > According to
> > >
> > >
> http://www.databasesoup.com/2013/10/de-corrupting-toast-tables.html
> > >
> > > an UPDATE of the row is recommended -- should that work
> > > better than a DELETE ?
> > >
> > > I can't find documentation pointing to a fundamental
> > > implementation difference that suggests so.
> >
> >
> https://www.postgresql.org/docs/12/storage-toast.html#STORAGE-TOAST-ONDISK
> >
> > "During an UPDATE operation, values of unchanged fields are normally
> > preserved as-is; so an UPDATE of a row with out-of-line values incurs no
> > TOAST costs if none of the out-of-line values change."
>
> However, where is the fault in my thinking ?
>
> -> An UPDATE actually *would* change the TOASTed BYTEA field (which is
> corrupt).
>
> I had hoped that the DELETE would NOT have to touch the TOAST
> table at all (and thereby not check the chunks) as "all it
> needs to do" is mark the row in the *primary* table as
> not-needed-anymore.
>
> I must be misunderstanding something.
>
> Karsten
> --
> GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
>
>
>


Re: ERROR: invalid memory alloc request size 1073741824

2018-01-30 Thread Jan Wieck
On Tue, Jan 30, 2018 at 12:35 PM, Stefan Blanke <
stefan.bla...@framestore.com> wrote:

> Hello,
>
> We've tripped over an error when doing a "COPY.. TO STDOUT WITH BINARY"
> query.
>
> "ERROR:  invalid memory alloc request size 1073741824"
> (exactly 1GB)
>

I have my money on a corrupted TOAST entry. Is this happening on
trustworthy hardware or a beige box with no ECC or RAID?


Regards, Jan





>
> So a palloc() call is failing on the AllocSizeIsValid() check.
>
> Does anyone know if this is a safety catch we are tripping with a bad query -
> or whether this check is something that should never be hit (i.e. a bug).
>
> This has been some discussion before about making a change before the
> 9.5beta1:
>
> https://www.postgresql.org/message-id/flat/9A28C8860F777E439
> AA12E8AEA7694F8010F6F3F%40BPXM15GP.gisp.nec.co.jp
>
> https://www.postgresql.org/message-id/flat/CAKJS1f_sxPyBHF6%
> 3DnEYwPJdKKDNjVDP91b_EJApfyGTei%3DViDw%40mail.gmail.com
>
> postgresql 9.5.4 (2PGDG.rhel6)
> work_mem = 256MB
>
> Thanks,
> Stefan
>
>


-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: session_replication_role meaning?

2018-01-30 Thread Jan Wieck
On Tue, Jan 30, 2018 at 3:36 AM, Laurenz Albe <laurenz.a...@cybertec.at>
wrote:

> Luca Ferrari wrote:
> > now this should be trivial, but I cannot udnerstand what is the
> > purpose of session_replication_role
> > or better, when I should use it in a way different from 'origin'.
>
> It is used to enable or disable triggers.
>
> By default, tables are created with all triggers enabled, which means
> that they fire with the default setting "session_replication_role =
> origin".
>
> You can change "session_replication_role" to "replica" to disable the
> firing
> of triggers (unless they are set ENABLE REPLICA or ENABLE ALWAYS).
> This is done by the logical replication apply worker, but you can also
> use it to bypass triggers, e.g. to speed up operation, if you know what
> you are doing.
>
> What is confusing is that there are three settings for
> "session_replication_role",
> but the two settings "local" and "origin" have the same meaning.
> Maybe that was meant to change at some point, but I see no explanation in
> the original discussion.
>

All of the above also applies to referential integrity triggers. That
means that under session_replication_role='replica' your replication system
can replicate things out of order with respect to foreign keys. It also
means that if you don't replicate the primary key table you can end up with
an inconsistent target database.

The setting of 'local' has indeed the same meaning for everything in stock
PostgreSQL. The Slony log and deny-access triggers react to it by
suppressing their actions. An application working under 'local' can modify
the origin without the changes being replicated and modify the replica
without the deny-access trigger aborting the transaction. The Slony engine
uses that mode when running SQL scripts through the EXECUTE DDL feature.
That way you can perform bulk operations like pruning without the
individual row changes being replicated.

The setting of 'replica' is very important if you have triggers that for
example do auditing or stuff like stamping created and last update
timestamps or session users. You certainly don't want to overwrite the real
last update timestamp or session user with the replication engine user and
time.
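
To make the firing rules concrete, a small sketch (table, trigger, and 
function names are made up; stamp_last_update() is assumed to be an 
existing trigger function):

-- default state: fires under 'origin' and 'local', is skipped under 'replica'
CREATE TRIGGER trg_stamp
    BEFORE UPDATE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE stamp_last_update();

-- the same trigger, changed to fire regardless of the setting
ALTER TABLE accounts ENABLE ALWAYS TRIGGER trg_stamp;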


Regards, Jan



>
> Yours,
> Laurenz Albe
>
>


-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: Does PostgreSQL check database integrity at startup?

2017-12-30 Thread Jan Wieck
On Sat, Dec 30, 2017 at 10:27 AM, Stephen Frost <sfr...@snowman.net> wrote:

>
> The checksums included in PG are page-level and therefore there simply
> isn't one to look at if the file is zero bytes.
>

And even if the file wasn't zero bytes, you can't tell from the per-page
CRCs whether you have all the pages you should have. You could have extra pages
that aren't supposed to be there or be missing some (or any mix of the two).
A per-page CRC is useless for those cases.


Regards, Jan





>
> Thanks!
>
> Stephen
>



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: Does PostgreSQL check database integrity at startup?

2017-12-29 Thread Jan Wieck
On Thu, Dec 28, 2017 at 1:26 PM, Stephen Frost <sfr...@snowman.net> wrote:

> Greetings Brent,
>
> * Brent Wood (pcr...@yahoo.com) wrote:
> > A tool to calculate a checksum of sorts based on the table (file)
> content would provide a better surety of duplication than simply checking
> file size - like differently vacuumed tables in each copy could have the
> same content but be different file sizes.
>
> PG has support for checksums and there are tools out there to validate
> that the checksum is correct for all pages which have one, but that
> wouldn't help in this case because the file is zero'd out (and a zero'd
> out file is actually a valid file in a PG data directory).
>
> Also, the files on the primary and the replica actually can be different
> when looked at with a complete-file checksum due to hint bits being set
> differently (at least, possibly other ways too).  That doesn't make them
> invalid or incorrect though.
>

In addition to what Stephen and everyone else said, it is impossible to get
a valid snapshot of the whole "file" on a running server without locking
the relation and reading it through the PostgreSQL buffer cache. On data
files such as heap and index, PostgreSQL does extensive write caching.
Preventing data loss from write caching is a primary purpose of WAL. Write
caching in the application (PostgreSQL in this case) prevents the OS from
actually knowing the correct "logical" state of the file at any given point
in time. This means that even an LVM snapshot will not give you consistent
data files of a running server, because the not yet written changes (in
shared buffers) waiting for a checkpoint to force them into OS buffers
won't be visible from outside PostgreSQL.


Regards, Jan



such a check to verify that the files backed up during a backup have the
> same checksum as the files being restored from that backup can be done,
> and that *is* done in at least some of the PG backup tools already
> (pgBackRest has an independent manifest that it stores for each backup
> which contains the checksum of each file as-backed-up, and it verifies
> that checksum when performing a restore to make sure that the backed up
> file wasn't corrupted in place, other tools hopefully have similar).
>
> > I do wonder though - given the order of records in a table (file) is not
> necessarily identical (or is it?) even this may be problematic. Perhaps a
> checksum based on the result of a query output ordered by primary key could
> work?
>
> The order of records in a *file* should be the same in the heap on the
> primary as they are on the replica, but that doesn't mean the contents
> of those files will be exactly the same (as mentioned above, hint bits
> can differ).  We used to have cases where the indexes could also be
> different, but I believe that was changed so they should match.
>
> I've used the approach of doing a checksum across the results of an
> ordered query to compare between systems and that generally does work,
> but it's a bit tricky if you're trying to compare a table that's heavily
> modified- you need to determine the point in the WAL stream that you're
> at on the primary when you run the query and then replay the replica to
> that point in the WAL and then run the query on the replica, otherwise
> you could end up with differences that are just because of the ongoing
> transactions being run to update the table.
>
> Thanks!
>
> Stephen
>



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info