Re: [GENERAL] table versioning approach (not auditing)
Hey there. Thank you very much for that fix! That's why I'd like to have joint development and joint testing. It's way more convincing for users to go for a solution that is tested by some experts than just by a random developer :) I'm open to creating a new project and pushing the code there. I don't care about the name. Then we might figure out which parts are already good, which parts could be improved and where to go next. I think switching to JSONB, for example, will be easy, as it offers the same functions as JSON afaik.

Sent: Tuesday, 30 September 2014, 21:16
From: Adam Brusselback adambrusselb...@gmail.com
To: Felix Kunde felix-ku...@gmx.de
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] table versioning approach (not auditing)

Felix, I'd love to see a single, well-maintained project. For example, I just found yours, and gave it a shot today after seeing this post. I found a bug when an update command is issued but the old and new values are all the same: the trigger will blow up. I've got a fix for that, but if we had one project that more than a handful of people used, stuff like that would be quashed very quickly. I love the design of it, by the way. Any idea what it will take to move to JSONB for 9.4?

On Tue, Sep 30, 2014 at 7:22 AM, Felix Kunde felix-ku...@gmx.de wrote:

Hey, yes I'm adding an additional key to each of my tables. First I wanted to use the primary key as one column in my audit_log table, but in some of my tables the PK consists of more than one column. Plus it's nice to have one key that is called the same across all tables. To get a former state for one row at date x I need to join the latest delta BEFORE date x with each delta AFTER date x. If I logged complete rows, this joining part would not be necessary, but as I usually work with spatial databases that have complex geometries and also image files, this strategy consumes too much disk space.
If there are more users following a similar approach, I wonder why we don't throw all the good ideas together to have one solution that is tested, maintained and improved by more developers. This would be great. Felix

Sent: Monday, 29 September 2014, 23:25
From: Abelard Hoffman abelardhoff...@gmail.com
To: Felix Kunde felix-ku...@gmx.de
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] table versioning approach (not auditing)

Thank you Felix, Gavin, and Jonathan for your responses. Felix and Jonathan: both of you mention just storing deltas. But if you do that, how do you associate the delta record with the original row? Where's the PK stored, if it wasn't part of the delta? Felix, thank you very much for the example code. I took a look at your table schemas. I need to study it more, but it looks like the way you're handling the PK is by adding a separate synthetic key (audit_id) to each table that's being versioned, and then storing that key along with the delta. So then to find all the versions of a given row, you just need to join the audit row with the schema_name.table_name.audit_id column. Is that right? The only potential drawback there is that there's no referential integrity between audit_log.audit_id and the actual table. I do like that approach very much though, in that it eliminates the need to interrogate the JSON data in order to perform most queries. AH

On Mon, Sep 29, 2014 at 12:26 AM, Felix Kunde felix-ku...@gmx.de wrote:

Hey, I've also tried to implement database versioning using JSON to log changes in tables.
Here it is: https://github.com/fxku/audit

I've got two versioning tables, one storing information about all transactions that happened and one where I put the JSON logs of row changes of each table. I'm only logging old values and not complete rows. Then I've got a function that recreates a database state at a given time into a separate schema - either as VIEWs, MVIEWs or TABLEs. This database state could then be indexed in order to work with it. You can also reset the production state to the recreated past state. Unfortunately I've got no time to work further on it at the moment, plus I have not done tests with many changes in the database, so I can't say if the recreation process scales well. One downside I've realised is that the json_agg function has limits when I've got binary data - it gets too long. So I'm really looking forward to using JSONB. There are more plans in my mind. By having a transaction_log table it should be possible to revert only certain transactions. I'm also thinking of parallel versioning, e.g. different users are all working with their version of the database and commit their changes to the
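[Editor's sketch] The two-table design described above can be sketched roughly as below. All names (transaction_log, audit_log, audit_id, log_delta) are invented for illustration; the real schema and trigger logic live in the linked repository, and the real project logs only the changed columns rather than the whole OLD row as done here for brevity.

```sql
-- One row per transaction, one row per changed table row.
CREATE TABLE transaction_log (
    txid      bigint PRIMARY KEY,
    stmt_date timestamptz NOT NULL DEFAULT now(),
    db_user   text NOT NULL DEFAULT current_user
);

CREATE TABLE audit_log (
    id         bigserial PRIMARY KEY,
    txid       bigint REFERENCES transaction_log (txid),
    table_name text NOT NULL,
    audit_id   bigint NOT NULL,  -- the synthetic key added to every versioned table
    changes    json               -- old values only, not the complete row
);

-- A trigger function along these lines would log the pre-update state:
CREATE OR REPLACE FUNCTION log_delta() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (txid, table_name, audit_id, changes)
    VALUES (txid_current(), TG_TABLE_NAME, OLD.audit_id, row_to_json(OLD));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```

Recreating the state of a row at date x is then the join Felix describes: take the latest delta before x and overlay it on the deltas after x.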
[GENERAL] Extract especific text from a sql statement
Hi, I need help to extract fields and tables from an SQL statement. Example:

SELECT pc.cod, pc.val, pi.qtd,
       COALESCE(pc.name, 'empty') AS name,
       lower(coalesce(pc.email, 'empty')) AS email,
       status
FROM pc
INNER JOIN pi ON (pc.cod = pi.cod)
WHERE pc.startdate > CURRENT_DATE
ORDER BY 1 DESC;

I need it to return this: pc.cod pc.val pi.qtd pc.name pc.email status pc pi pc.startdate. Can you help me?

-- Best regards, Emanuel Araújo
[GENERAL] Best practices for Large import in empty database?
I am about to feed an empty database with a large import of data (the size of csv files is about 500 GB). Tables are created but I haven't added any constraints or indexes yet. I wonder whether the best practice is to add them before or after the import. Are there other good practices that would ease the import process? Cheers, Daniel
Re: [GENERAL] Best practices for Large import in empty database?
Daniel Begin jfd...@hotmail.com writes: I am about to feed an empty database with a large import of data (the size of csv files is about 500 GB). Tables are created but I haven't added any constraints or indexes yet. I wonder whether the best practice is to add them before or after the import. Are there other good practices that would ease the import process? http://www.postgresql.org/docs/9.3/static/populate.html regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Best practices for Large import in empty database?
On 10/1/2014 9:13 AM, Daniel Begin wrote: I am about to feed an empty database with a large import of data (the size of csv files is about 500 GB). Tables are created but I haven't added any constraints or indexes yet. I wonder whether the best practice is to add them before or after the import. Are there other good practices that would ease the import process? Cheers, Daniel

Create indexes after, as long as you don't need them for uniqueness checking. Then:

begin;
truncate table junk;
copy ... with freeze;
commit;

Even though the table is empty, the BEGIN/TRUNCATE/COPY sequence lets PG skip a bunch of extra WAL. -Andy
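[Editor's sketch] Spelled out with placeholder table and file names (junk, /path/to/data.csv - substitute your own), the pattern Andy describes looks like this; the FREEZE option requires 9.3 or later and a table created or truncated in the same transaction:

```sql
BEGIN;
TRUNCATE TABLE junk;
-- FREEZE skips the later anti-wraparound rewrite; valid because the
-- truncate happened in this same transaction.
COPY junk FROM '/path/to/data.csv' WITH (FORMAT csv, FREEZE);
COMMIT;

-- Build indexes and constraints only after the data is in:
CREATE INDEX junk_some_col_idx ON junk (some_col);
```

Building the index once over the loaded data is far cheaper than maintaining it row by row during a 500 GB load.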
[GENERAL] Postgres tcp_keepalive_xxxx parameters.
Hi. We have an issue with Postgres clients that are being disconnected from the database server after some time. The client is a web application which creates a pool of connections. Since client and server are on different VLANs, I think the problem is that the FW routing the traffic is dropping idle connections after some time. The OS is CentOS 5.10 and the kernel config is as follows:

root@itk-iv-mcs ~ # sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

The tcp_keepalive_xxx settings in postgresql.conf were all set to 0. Then we tried to set:

tcp_keepalives_idle = 60        # TCP_KEEPIDLE, in seconds
tcp_keepalives_interval = 60    # TCP_KEEPINTVL, in seconds
#tcp_keepalives_count = 0       # TCP_KEEPCNT

So the effective values (I guess) are 60, 60 and 9 (system default). Even with this configuration the clients are being disconnected as before. At this point I am not sure if I have understood properly the way keepalive works, because if PG sends a packet after 60 seconds of inactivity and the client responds OK, then it will not retry and will wait tcp_keepalives_idle seconds again... and so on. The interval and count are used only in case the client does not respond. If this is true and I am not wrong, then we can rule out the FW dropping idle connections as the cause... Is this the behaviour of the tcp_keepalives_xxx parameters? Is there anything more I have to set up or check to make keepalive work? Do you know how we can sort out this issue? Thanks in advance.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/Postgres-tcp-keepalive--parameters-tp5821282.html
Re: [GENERAL] Best practices for Large import in empty database?
Oups, right to the point! Thanks...

-----Original Message-----
From: Tom Lane
Sent: October-01-14 10:38
To: Daniel Begin
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Best practices for Large import in empty database?

[quoted question trimmed] http://www.postgresql.org/docs/9.3/static/populate.html regards, tom lane
Re: [GENERAL] Postgres tcp_keepalive_xxxx parameters.
jlrando jose.luis.rando.ca...@ericsson.com writes: We have an issue with Postgres clients that are being disconnected from the database server after some time. The client is a web application which creates a pool of connections. Since client and server are on different VLANs, I think the problem is that the FW routing the traffic is dropping idle connections after some time.

Probably. It might be asymmetric; have you tried enabling keepalives from the client end, rather than the server? If using libpq, you can set keepalive parameters in the connection string: http://www.postgresql.org/docs/9.3/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS If that doesn't fix it, you might want to get out Wireshark or a similar tool and verify that keepalive packets are actually getting sent. regards, tom lane
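[Editor's note] The client-side libpq parameters Tom refers to can be given directly in the connection string; for example (host, database and timing values are illustrative only):

```
psql "host=dbserver dbname=mydb keepalives=1 keepalives_idle=60 keepalives_interval=10 keepalives_count=3"
```

With these settings the client, not just the server, probes an idle connection every 60 seconds, which keeps state alive in a firewall that drops quiet flows.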
Re: [GENERAL] table versioning approach (not auditing)
I know we're kinda hijacking this thread, so sorry for that. If you'd like to do that, I'd be more than happy to use it and push any fixes/changes upstream. I don't have much of a preference on the name either, as long as it's something that makes sense. I would consider myself far from an expert though! Either way, more people using a single solution is a good thing.

As a side note, I did some benchmarking this morning and wanted to share the results:

pgbench -i -s 140 -U postgres pgbench

pgbench -c 4 -j 4 -T 600 -U postgres pgbench
  no auditing tps: 2854
  auditing tps: 1278 (accounts are audited)

pgbench -c 2 -j 2 -N -T 300 -U postgres pgbench
  no auditing tps: 2504
  auditing tps: 822 (accounts are audited)

pgbench -c 2 -j 2 -T 300 -U postgres pgbench
  no auditing tps: 1836
  auditing tps: 505 (branches and tellers are audited, accounts are not)

I'd love to see if there are some easy wins to boost the performance.

On Wed, Oct 1, 2014 at 5:19 AM, Felix Kunde felix-ku...@gmx.de wrote: [...]
Re: [GENERAL] Extract especific text from a sql statement
Emanuel Araújo wrote: Hi, I need help to extract fields and tables from an SQL statement. Example: SELECT pc.cod, pc.val, pi.qtd, COALESCE(pc.name, 'empty') AS name, lower(coalesce(pc.email, 'empty')) AS email, status FROM pc INNER JOIN pi ON (pc.cod = pi.cod) WHERE pc.startdate > CURRENT_DATE ORDER BY 1 DESC; I need it to return: pc.cod pc.val pi.qtd pc.name pc.email status pc pi pc.startdate. Can you help me?

Me, probably not. I would suggest you tell us what goal/use-case you have in mind that makes you think you need to do the above. The project is open source, so you are welcome to look at the parser code and see how it goes about pulling out identifiers and determining what they are. A big question is how you want to deal with aliases and views present in the query. Do you have a particular language you are restricting your search to? David J.
Re: [GENERAL] improving speed of query that uses a multi-column filter ?
On Oct 1, 2014, at 12:34 AM, Misa Simic wrote: Have you considered maybe partial indexes? http://www.postgresql.org/docs/9.3/static/indexes-partial.html I.e. idx1 on the PK column of the table, with a WHERE inside the index exactly the same as your first filter, and idx2 on the PK column with a WHERE inside the index the same as your second filter.

That was actually my first attempt, and I was hoping it would work. Unfortunately, there is always something in the queries that keeps Postgres trying to use other (slower) indexes or jumping to a sequential scan. I haven't been able to trick the planner into using the partial index, and most online resources suggested it wasn't possible.
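[Editor's sketch] For readers following along, the partial-index idea looks like this (table and column names are invented; they do not come from the thread):

```sql
-- Index only the rows matching a fixed predicate:
CREATE INDEX orders_open_idx ON orders (id) WHERE status = 'open';

-- The planner considers this index only when the query's WHERE clause
-- provably implies the index predicate, e.g.:
SELECT * FROM orders WHERE status = 'open' AND id = 42;
```

This predicate-implication requirement is exactly why the planner can be hard to "trick": if the query's filter is parameterized or phrased differently from the index predicate, the proof fails and the partial index is skipped.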
[GENERAL] Is there a way to get both the plan and the results?
Hi all! Building a small tool, and it would be super useful if I could get both the query results and the plan at the same time. At the moment, the tool accepts a SQL query, drops it into a file and calls psql --file=whatever.sql. I can change to use a lower-level API if necessary. Ideally, I’d like to do something like EXPLAIN (ANALYZE, RESULTS) SELECT ... I’m talking about multi-minute queries, so it’s less than ideal if I have to run the query twice: once for EXPLAIN ANALYZE, once for the actual results. Looking around, https://duckduckgo.com/?q=postgresql+get+query+results+and+query+plan+in+single+call doesn’t return anything useful. Google neither, for that matter. The PG mailing list archives seem to not have any results on that either. I’m specifically using 9.1. Not opposed to upgrading if necessary. Thanks! François Beausoleil
Re: [GENERAL] Is there a way to get both the plan and the results?
François Beausoleil wrote: ... it would be super useful if I could get both the query results and the plan at the same time ... I’m talking about multi-minute queries, so it’s less than ideal if I have to run the query twice: once for EXPLAIN ANALYZE, once for the actual results ... I’m specifically using 9.1.

See: http://www.postgresql.org/docs/9.1/static/auto-explain.html There is no way to do so from a client interface. David J.
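[Editor's sketch] With auto_explain the query runs once: the results go to the client as usual, while the plan (with actual timings) is written to the server log. A minimal per-session setup, assuming sufficient privileges to LOAD the module:

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log the plan of every statement
SET auto_explain.log_analyze = true;    -- include actual row counts and timings

-- Run the multi-minute query normally; its EXPLAIN ANALYZE output
-- appears in the server log rather than in the client result set.
SELECT ...;
```

Note that log_analyze adds per-node timing overhead, so it is best enabled only for the session being investigated.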
Re: [GENERAL] Extract especific text from a sql statement
You already have most of the result columns, so the following should do it:

SELECT pc.cod, pc.val, pi.qtd,
       COALESCE(pc.name, 'empty') AS name,
       lower(coalesce(pc.email, 'empty')) AS email,
       status,
       c1.relname, c2.relname, pc.startdate
FROM pc
INNER JOIN pi ON (pc.cod = pi.cod)
JOIN pg_class c1 ON (c1.relname = 'pc' AND c1.relkind = 'r')
JOIN pg_class c2 ON (c2.relname = 'pi' AND c2.relkind = 'r')
WHERE pc.startdate > CURRENT_DATE
ORDER BY 1 DESC;

Learn the catalogs and you will learn to be a good DBA. Melvin Davidson
Re: [GENERAL] Extract especific text from a sql statement
Melvin Davidson-5 wrote: You already have most of the result columns, so the following should do it: SELECT ..., c1.relname, c2.relname, pc.startdate FROM pc INNER JOIN pi ... JOIN pg_class c1 ON (c1.relname = 'pc' AND c1.relkind = 'r') JOIN pg_class c2 ON (c2.relname = 'pi' AND c2.relkind = 'r') ... Learn the catalogs and you will learn to be a good DBA.

I read the example answer as being a single column (or some other println(...) output) that generates a single row for each of the string-literal identifiers extracted from a parse of the raw query text - possibly after capturing raw identifiers and performing catalog lookups. Your particular answer also seems pointless in that the JOIN to pg_class is unnecessary, since the ON clause sets a constant for relname and then re-uses that in the select list. You'd be better off just writing: SELECT ..., 'pc' AS relname, 'pi' AS relname FROM ... and providing disambiguating aliases. David J.