Re: [SOLVED?] Re: Disk wait problem... not hardware...
On 2023-10-29 12:45:08 -0400, p...@pfortin.com wrote:
> On Sun, 29 Oct 2023 16:16:05 +0100 Peter J. Holzer wrote:
> > However, the table statistics contain an estimate for the number of rows:
> >
> > hjp=> select schemaname, relname, n_live_tup from pg_stat_user_tables order by 3 desc;
> > ╔════════════╤═══════════════════╤════════════╗
> > ║ schemaname │ relname           │ n_live_tup ║
> > ╟────────────┼───────────────────┼────────────╢
> > ║ public     │ ncvhis_2016_12_03 │        977 ║
> > ║ public     │ random_test       │        100 ║
> > ║ public     │ beislindex        │        351 ║
> > ║ public     │ delivery          │          6 ║
> > ...
>
> Are there prerequisites for this query?
> Every (264) n_live_tup returned was zero... though, VERY fast.. :)

You need to ANALYZE the tables. Autovacuum should do this automatically when
it detects that a table has changed "enough" since the last analyze, but for
slowly growing tables that can sometimes take a long time (indeed, I did an
"analyze beislindex" just before the query above because it showed only 25
tuples and I knew that wasn't right).

A count of 0 for a large table is weird, though. Unless ... did you do a
major version upgrade recently? That nukes the statistics and you have to
analyze everything again. Also, I'm not sure if pg_restore triggers an
analyze.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at         | -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |    challenge!"
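A minimal sketch of the fix described above, reusing the table names from the
example (run it against your own database):

```sql
-- Refresh the statistics, either per table or for the whole database:
ANALYZE VERBOSE beislindex;   -- one table
ANALYZE;                      -- all tables in the current database

-- n_live_tup should now be populated:
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
```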
Re: pg_checksums?
On 2023-10-30 09:56:31 +0900, Michael Paquier wrote:
> Hm? Page checksums are written when a page is flushed to disk, we
> don't set them for dirty buffers or full-page writes included in WAL,
> so it should be OK to do something like the following:
> - Stop cleanly a standby.
> - Run pg_checksums on the standby to enable them.
> - Restart the standby.
> - Catch up with the latest changes.
> - Stop cleanly the primary, letting the shutdown checkpoint be
>   replicated to the standby.
> - Promote the standby.
> - Enable checksums on the previous primary.
> - Start the previous primary to be a standby of the node you failed
>   over to.

I stand corrected.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at         | -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |    challenge!"
Re: pg_checksums?
On 2023-10-29 13:26:27 -0500, Ron wrote:
> On 10/29/23 12:57, Paul Förster wrote:
> > Safe in the sense that, if I enable checksums on a replica, switch
> > over and then enable checksums on the other side, whether this is ok,
> > or whether future mutations on the primary will corrupt the replica.
>
> Trying it would tell you something.
>
> > That's why I asked if I need to perform a patronictl reinit.
>
> Best to ask Percona.

Why Percona?

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at         | -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |    challenge!"
Fw: Server process exited with exit code 4
It doesn't seem like a hardware issue, because we have over 20 machines
running and 5 of them have been experiencing this issue recently. We have
encountered problems before with antivirus software causing the database to
hang, so we are also considering whether other software on the Windows
operating system is interfering.

-------- Forwarded Message --------
From: Justin Clift
Date: 10/30/2023 14:29
To: yangsr3411
Cc: pgsql-general
Subject: Re: Server process exited with exit code 4

On 2023-10-30 14:02, yangsr3411 wrote:
> Has anyone encountered a similar problem, or does anyone know a solution?

Just to rule out hardware problems, does the server hardware have some way
of showing things like ECC memory errors and similar?

Most official server hardware (HPE, Dell, etc.) has utilities that can show
a log of any recent weirdness that occurred at a hardware level.

If yours can, take a look for things like ECC errors or any other strange
stuff. :)

Regards and best wishes,
Justin Clift
Re: BRIN index maintenance on table without primary key
On Fri, 27 Oct 2023, Dimitrios Apostolou wrote:
> So the question is: how to maintain the physical order of the tuples?

Answering myself: there doesn't seem to be any way to run pg_repack on a
table without a UNIQUE key. To run CLUSTER, the only way I see is to:

1. Create a btree index on the same column that has the BRIN index
2. CLUSTER
3. Drop the index

This should take very long on my huge table, and it keeps the table
exclusively locked. The disk space needed would also be huge, if the table
isn't partitioned.

I wonder why CLUSTER can't run based on a BRIN-indexed column. Is it
theoretically impossible, or is it just not implemented yet? My understanding
so far is that CLUSTER only rewrites the table *according to the index order*
and does not touch the index itself. For a BRIN index, though, it would need
to rewrite the table *ignoring the index* and then rewrite the index too, in
order to keep the ranges fully optimized. So the logic is very different, and
maybe closer to what VACUUM FULL does.

Thanks,
Dimitris
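A sketch of the three steps above; the table and column names ("events",
"created_at") are made up for illustration:

```sql
-- 1. Build a btree on the BRIN-indexed column (CONCURRENTLY avoids
--    blocking writes during the build, at the cost of extra time):
CREATE INDEX CONCURRENTLY events_created_at_btree ON events (created_at);

-- 2. Rewrite the table in index order; this takes an ACCESS EXCLUSIVE
--    lock and needs roughly the table's size in free disk space:
CLUSTER events USING events_created_at_btree;

-- 3. The btree was only needed to define the sort order:
DROP INDEX events_created_at_btree;

-- Refresh the BRIN range summaries and statistics afterwards:
VACUUM ANALYZE events;
```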
xmax not zero?
Hi all,
I have a table that presents an xmax which is not zeroed outside of a
transaction block, and that does not look normal to me. I have no idea how
this happened; it is one of my "toy" PostgreSQL virtual machines. But how
can I dig in and better understand why there is an xmax with a non-zero
value? I've taken a snapshot for doing more tests.

testdb=> select version();
                               version
----------------------------------------------------------------------
 PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1
 20221121 (Red Hat 11.3.1-4), 64-bit
(1 row)

testdb=> select txid_current() as me, xmin, xmax, pk /* table real column */,
         pg_snapshot_xmin( pg_current_snapshot() ),
         pg_snapshot_xmax( pg_current_snapshot() ),
         pg_current_snapshot()
         from automobili limit 5;
   me    |  xmin   |  xmax   | pk  | pg_snapshot_xmin | pg_snapshot_xmax | pg_current_snapshot
---------+---------+---------+-----+------------------+------------------+---------------------
 1713451 | 1533610 | 1675700 | 501 |          1713451 |          1713451 | 1713451:1713451:
 1713451 | 1533610 | 1675700 | 503 |          1713451 |          1713451 | 1713451:1713451:
 1713451 | 1533610 | 1675700 | 505 |          1713451 |          1713451 | 1713451:1713451:
 1713451 | 1533610 | 1675700 | 507 |          1713451 |          1713451 | 1713451:1713451:
 1713451 | 1533610 | 1675700 | 509 |          1713451 |          1713451 | 1713451:1713451:

testdb=> vacuum verbose automobili;
INFO:  vacuuming "testdb.luca.automobili"
INFO:  launched 1 parallel vacuum worker for index cleanup (planned: 1)
INFO:  finished vacuuming "testdb.luca.automobili": index scans: 0
pages: 0 removed, 12738 remain, 12738 scanned (100.00% of total)
tuples: 0 removed, 100 remain, 0 are dead but not yet removable
removable cutoff: 1713454, which was 0 XIDs old when operation ended
new relfrozenxid: 1713454, which is 179844 XIDs ahead of previous value
frozen: 12738 pages from table (100.00% of total) had 100 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
I/O timings: read: 273.835 ms, write: 108.286 ms
avg read rate: 25.819 MB/s, avg write rate: 25.880 MB/s
buffer usage: 12776 hits, 12711 misses, 12741 dirtied
WAL usage: 38215 records, 12741 full page images, 60502693 bytes
system usage: CPU: user: 0.56 s, system: 0.21 s, elapsed: 3.84 s
INFO:  vacuuming "testdb.pg_toast.pg_toast_76512"
INFO:  finished vacuuming "testdb.pg_toast.pg_toast_76512": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total)
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 1713454, which was 0 XIDs old when operation ended
new relfrozenxid: 1713454, which is 186042 XIDs ahead of previous value
frozen: 0 pages from table (100.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item identifiers removed
I/O timings: read: 0.520 ms, write: 0.000 ms
avg read rate: 9.902 MB/s, avg write rate: 0.000 MB/s
buffer usage: 19 hits, 1 misses, 0 dirtied
WAL usage: 1 records, 0 full page images, 188 bytes
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM

testdb=> select txid_current(), xmin, xmax, pk from automobili limit 5;
 txid_current |  xmin   | xmax | pk
--------------+---------+------+-----
      1713454 | 1533610 |    0 | 501
      1713454 | 1533610 |    0 | 503
      1713454 | 1533610 |    0 | 505
      1713454 | 1533610 |    0 | 507
      1713454 | 1533610 |    0 | 509
(5 rows)
Re: xmax not zero?
Hi,

On Mon, Oct 30, 2023 at 13:45, Luca Ferrari wrote:
> Hi all,
> I have a table that presents an xmax which is not zeroed outside of a
> transaction block, and that does not look normal to me.
> I have no idea how this happened; it is one of my "toy"
> PostgreSQL virtual machines.
> But how can I dig in and better understand why there is an xmax with a
> non-zero value?

There are many reasons for a non-zero value: a row updated or deleted in a
rolled-back transaction, a row updated or deleted in the current transaction,
a row locked by a SELECT FOR UPDATE, and perhaps others I don't remember
right now.

> [...]

Regards.

--
Guillaume.
Re: pg_checksums?
Hi Michael,

> On Oct 30, 2023, at 01:56, Michael Paquier wrote:
>
> Hm? Page checksums are written when a page is flushed to disk, we
> don't set them for dirty buffers or full-page writes included in WAL,
> so it should be OK to do something like the following:
> - Stop cleanly a standby.
> - Run pg_checksums on the standby to enable them.
> - Restart the standby.
> - Catch up with the latest changes.
> - Stop cleanly the primary, letting the shutdown checkpoint be
>   replicated to the standby.
> - Promote the standby.
> - Enable checksums on the previous primary.
> - Start the previous primary to be a standby of the node you failed
>   over to.

That's exactly the reasoning behind my initial idea and question. Patroni
does the switchover job for me, including catching up on the latest changes,
etc.

Seems that opinions vary. Are there any hard facts?

It turns out that enabling checksums can take quite some time to complete,
i.e. downtime for the application, which is hard to do in a 24x7
environment.

Cheers
Paul
Re: pg_checksums?
Hi Peter,

> On Oct 30, 2023, at 11:03, Peter J. Holzer wrote:
> On 2023-10-29 13:26:27 -0500, Ron wrote:
>> Best to ask Percona.
>
> Why Percona?

Probably a typo. Patroni is used.

Cheers
Paul
Re: xmax not zero?
On Mon, 2023-10-30 at 13:53 +0100, Guillaume Lelarge wrote:
> On Mon, Oct 30, 2023 at 13:45, Luca Ferrari wrote:
> > I have a table that presents an xmax which is not zeroed outside of a
> > transaction block, and that does not look normal to me.
> > I have no idea how this happened; it is one of my "toy"
> > PostgreSQL virtual machines.
> > But how can I dig in and better understand why there is an xmax with a
> > non-zero value?
>
> There are many reasons for a non-zero value: a row updated or deleted in a
> rolled-back transaction, a row updated or deleted in the current
> transaction, a row locked by a SELECT FOR UPDATE, and perhaps others I
> don't remember right now.

INSERT ... ON CONFLICT could also cause this.

As long as the transaction corresponding to the "xid" is marked as
"aborted", that number is not valid and is treated like a 0.

Yours,
Laurenz Albe
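One of those cases is easy to reproduce; a sketch with a made-up table:

```sql
CREATE TABLE demo (id int PRIMARY KEY);
INSERT INTO demo VALUES (1);

BEGIN;
UPDATE demo SET id = 1 WHERE id = 1;   -- stamps xmax with this transaction's xid
ROLLBACK;

-- xmax still shows the aborted transaction's xid, but since that xid is
-- marked aborted in the commit log, it is ignored as if it were 0:
SELECT xmin, xmax, id FROM demo;
```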
Re: pg_checksums?
On 10/30/23 08:18, Paul Förster wrote:
> Hi Peter,
>
> > On Oct 30, 2023, at 11:03, Peter J. Holzer wrote:
> > On 2023-10-29 13:26:27 -0500, Ron wrote:
> > > Best to ask Percona.
> >
> > Why Percona?
>
> Probably a typo. Patroni is used.

Erroneously thinking that Percona develops Patroni. :D

--
Born in Arizona, moved to Babylonia.
Re: pg_checksums?
On Mon, 30 Oct 2023 at 14:46, Ron wrote:
> Erroneously thinking that Percona develops Patroni. :D

IIRC, they may have made one or two contributions, but very minor.

But anyway, Patroni is orthogonal to pg_checksums. As Michael already said,
the following workflow works just fine (I did it dozens of times):

1. enable checksums on the standby node
2. start the standby and let it catch up with the primary
3. switch over to the standby node
4. enable checksums on the former primary (now a replica)

Regards,
--
Alexander Kukushkin
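For step 1, the commands on the standby might look roughly like this (the
data directory path is a placeholder; pg_checksums requires a cleanly shut
down cluster):

```shell
PGDATA=/var/lib/postgresql/16/main    # placeholder path

pg_ctl -D "$PGDATA" stop -m fast               # clean shutdown of the standby
pg_checksums --enable --progress -D "$PGDATA"  # rewrites every block with a checksum
pg_ctl -D "$PGDATA" start                      # step 2: restart, let it catch up
```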
Help with a good mental model for estimating PostgreSQL throughput
Hello! Can someone help me develop a good mental model for estimating
PostgreSQL throughput? Here's what I mean. Suppose I have:

- 1000 connections
- typical query execution time of 1 ms
- but additional network latency of 100 ms

What, if at all, would be an estimate of the number of operations that can
be performed within 1 second? My initial guess would be ~10,000, but then
perhaps I'm overlooking something. I expect a more reliable figure would be
obtained through testing, but I'm looking for an a priori
back-of-the-envelope estimate. Thanks!

Best,
David
Re: Help with a good mental model for estimating PostgreSQL throughput
On Mon, 2023-10-30 at 08:05 -0700, David Ventimiglia wrote:
> Can someone help me develop a good mental model for estimating PostgreSQL
> throughput? Here's what I mean. Suppose I have:
> * 1000 connections
> * typical query execution time of 1 ms
> * but additional network latency of 100 ms
> What, if at all, would be an estimate of the number of operations that can
> be performed within 1 second? My initial guess would be ~10,000, but then
> perhaps I'm overlooking something. I expect a more reliable figure would
> be obtained through testing, but I'm looking for an a priori
> back-of-the-envelope estimate. Thanks!

If the workload is CPU bound, it depends on the number of cores.
If the workload is disk bound, look at the number of I/O requests a typical
query needs, and how many of them you can perform per second.

The network latency might well be a killer.

Use pgBouncer with transaction mode pooling.

Yours,
Laurenz Albe
Re: Help with a good mental model for estimating PostgreSQL throughput
Thanks! Let's say there are 10 cores, the workload is not CPU bound, and
there is a connection pooler like pgBouncer in place. Would the number of
operations more likely be:

- 1000 ms / total ms per operation * number of cores = 1000 ms / 101 ms * 10 = ~100
- 1000 ms / total ms per operation * number of connections = 1000 ms / 101 ms * 1000 = ~9,900
- something else
- impossible to determine without more information

Best,
David

On Mon, Oct 30, 2023 at 8:46 AM Laurenz Albe wrote:
> On Mon, 2023-10-30 at 08:05 -0700, David Ventimiglia wrote:
> > Can someone help me develop a good mental model for estimating
> > PostgreSQL throughput? [...]
>
> If the workload is CPU bound, it depends on the number of cores.
> If the workload is disk bound, look at the number of I/O requests a
> typical query needs, and how many of them you can perform per second.
>
> The network latency might well be a killer.
>
> Use pgBouncer with transaction mode pooling.
>
> Yours,
> Laurenz Albe
meaning of CIDR mask in pg_hba.conf
Hi,

I have Postgres version 12 running on CentOS 7. I found an entry in my
pg_hba.conf, as given below, under IPv4 connections:

host all all /0 md5

I could not understand the meaning of "/0" here. As I know, each IPv4
address has a total of 4 octets, and each octet is read based on the given
CIDR mask (/8, /16, /24 or /32), but I am seeing "/0" for the first time and
I don't understand it. So please help me by explaining its meaning and how
an IP address will be read with /0.

Regards,
Atul
Re: meaning of CIDR mask in pg_hba.conf
> So please help me by explaining its meaning and how an IP address will be
> read with /0.

It means a netmask of zero bits, so it will match all IP addresses.
Re: meaning of CIDR mask in pg_hba.conf
On 10/30/23 10:45 AM, Atul Kumar wrote:
> Hi,
> I have Postgres version 12 running on CentOS 7. I found an entry in my
> pg_hba.conf, as given below, under IPv4 connections:
>
> host all all /0 md5
>
> I could not understand the meaning of "/0" here. [...]

From here:

https://www.postgresql.org/docs/current/auth-pg-hba-conf.html

... 0.0.0.0/0 represents all IPv4 addresses, and ::0/0 represents all IPv6
addresses.

Is yours one of the above?
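The "match everything" reading of /0 can be checked with Python's ipaddress
module (the address below is an arbitrary example):

```python
import ipaddress

# A /0 prefix means zero leading bits have to match, so the rule
# covers the entire address family.
all_v4 = ipaddress.ip_network("0.0.0.0/0")
all_v6 = ipaddress.ip_network("::0/0")

print(all_v4.num_addresses)                          # 4294967296, i.e. 2**32
print(ipaddress.ip_address("203.0.113.7") in all_v4) # True
print(all_v6.num_addresses == 2**128)                # True
```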
capacity planning question
Hi,

I need an infrastructure setup for a data analytics / live video stream
analytics application using big data and analytics technology. The data is
currently stored as structured data (no video streaming) in a Postgres
database. (It's an emergency call handling solution; the database stores
caller info (address, mobile number, location coordinates), emergency
category metadata, dispatch information for rescue vehicles, and rescue
vehicle location updates (lat, long) every 30 seconds.)

Input 1: I have to do analytics on this data (about 600 GB, grown over the
last 2 years from the initial setup): developing an analytical application
(using Python and data analytics libraries) and displaying the results and
analytical predictions through a dashboard application.

Query 1: How much compute (GPU/CPU cores) and memory would be required for
this analytical application? Is any specific type of storage (in-memory,
like Redis) required, etc.? Which resources should I provision for this kind
of application processing? Any hints are most welcome. If any more input is
required, let me know and I can provide it if available.

Input 2: In addition to the above, I have to do video analytics from
body-worn cameras of police personnel, drone surveillance videos from
emergency sites, and live streaming of incident locations from patrol
vehicles (from a mobile tablet device over 5G) for a few minutes (say 3 to 5
minutes of live streaming per incident). There are 50 drones, 500 emergency
rescue service vehicles, and 300 body-worn camera personnel, and roughly
5000 emergency incidents per day, of which at least 1000 need video
streaming for 4 to 5 minutes each.

Query 2: What kind of (and how many) computing resources (GPU/CPU cores,
RAM) and which storage solutions (in-memory such as Redis, or any other
specific data storage mechanisms) should I deploy?

Any hints much appreciated.

Best,
Krishane
Re: pg_checksums?
Hi Alexander,

> On Oct 30, 2023, at 14:56, Alexander Kukushkin wrote:
> ...
> But anyway, Patroni is orthogonal to pg_checksums.
> ...

Just to be sure I understand you correctly: this does not work with Patroni?

Cheers
Paul
Re: pg_checksums?
On Mon, 30 Oct 2023, 19:34 Paul Förster, wrote:
> Just to be sure I understand you correctly: this does not work with
> Patroni?

That's not what I said. Patroni only manages Postgres. It is exactly the
same Postgres as you would run without Patroni. Everything will work.

Regards,
--
Alexander Kukushkin
Re: Help with a good mental model for estimating PostgreSQL throughput
On Mon, 2023-10-30 at 08:59 -0700, David Ventimiglia wrote:
> On Mon, Oct 30, 2023 at 8:46 AM Laurenz Albe wrote:
> > [...]
>
> Thanks! Let's say there are 10 cores, the workload is not CPU bound, and
> there is a connection pooler like pgBouncer in place. Would the number of
> operations more likely be:
>
> * 1000 ms / total ms per operation * number of cores = 1000 ms / 101 ms * 10 = ~100
> * 1000 ms / total ms per operation * number of connections = 1000 ms / 101 ms * 1000 = ~9,900
> * something else
> * impossible to determine without more information

If the workload is not CPU bound, it is probably disk bound, and you have to
look at the number of I/O requests.

If you look at the CPU, the second calculation should be more to the point.

However, if one request by the customer results in 10 database requests, the
request will already take 2 seconds due to the network latency, even though
it causes next to no load on the database.

Yours,
Laurenz Albe
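The two candidate formulas from the thread can be written down directly; all
inputs are the thread's assumptions, not measurements:

```python
# Back-of-the-envelope throughput estimate for the scenario in the thread.
query_ms = 1.0       # server-side execution time per operation
latency_ms = 100.0   # network round trip added per operation
cores = 10
connections = 1000

# Ops/s that a single connection can sustain, waiting out the latency:
per_connection_ops = 1000.0 / (query_ms + latency_ms)

# Latency-bound view: connections wait in parallel, so throughput scales
# with the number of connections until the server saturates:
latency_bound = per_connection_ops * connections   # ~9,900 ops/s

# CPU-bound ceiling: only the 1 ms of execution occupies a core:
cpu_ceiling = cores * 1000.0 / query_ms            # 10,000 ops/s

print(round(latency_bound), round(cpu_ceiling))
```

The smaller of the two figures is the binding constraint; here they happen to
be close, which is why the "second calculation" above is the better estimate
as long as disk I/O doesn't saturate first.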
Re: pg_checksums?
> On Oct 30, 2023 at 7:00 PM, Paul Förster wrote:
>
> Hi Michael,
>
> > On Oct 30, 2023, at 01:56, Michael Paquier wrote:
> >
> > - Enable checksums on the previous primary.
> > - Start the previous primary to be a standby of the node you failed
> >   over to.
>
> That's exactly the reasoning behind my initial idea and question. Patroni
> does the switchover job for me, including catching up on the latest
> changes, etc.
>
> Seems that opinions vary. Are there any hard facts?

The best hard facts are those generated in your environment.

> It turns out that enabling checksums can take quite some time to complete,
> i.e. downtime for the application, which is hard to do in a 24x7
> environment.

Yes. Try it first with a smaller sample.

> Cheers
> Paul