[GENERAL] How to synchronize tables in remote (production) and local databases in case if the structure of local database's table has been modified
Hello, I have run into a complicated case of table synchronisation. The full description of the problem (with some figures) is posted here: http://stackoverflow.com/questions/26237661/postgresql-update-table-with-new-records-from-the-same-table-on-remote-server

Here is a partial repost of my Stack Overflow topic: We have a PostgreSQL server running in production and plenty of workstations with isolated development environments. Each one has its own local PostgreSQL server (with no replication with the production server). Developers need to receive the updates stored in the production server periodically. I am trying to figure out how to dump the contents of several selected tables from the server in order to update the tables on development workstations. The biggest challenge is that the tables I'm trying to synchronize may have diverged (developers may add - but not delete - new fields to the tables through the Django ORM, while the schema of the production database remains unchanged for a long time). Therefore the updated records and new fields of the tables stored on the workstations must be preserved against overwriting. I guess that direct dumps (e.g. pg_dump -U remote_user -h remote_server -t table_to_copy source_db | psql target_db) are not suitable here.

UPD: If possible I would also like to avoid the use of a third (intermediate) database while transferring the data from the production database to the workstations.

I have no idea how to work it out. Any help will be appreciated.

Sincerely,
-- 
Vitaly Isaev
Software engineer
Team112.ru
Re: [GENERAL] Converting char to varchar automatically
Hi!

Also, it generates statements which try to change all columns to one-character-length columns.

Andrus.

From: Andrus
Sent: Monday, October 06, 2014 8:11 PM
To: Melvin Davidson
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Converting char to varchar automatically

Hi!

SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) ||
       ' ALTER COLUMN ' || quote_ident(a.attname) || ' TYPE varchar;'
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typname = 'char'
  AND n.nspname <> 'pg_catalog';

It does not return any data.

Andrus.
[GENERAL] Convert raw fields into Datum
Hi, I have the raw fields of a particular row as char *. Now my requirement is to convert these values into Datum *. How should I go about this? Does converting into Datum depend on the type of the field too?

-- Harsha
Re: [GENERAL] psql connection issue
On 10/07/2014 09:10 PM, Stephen Davies wrote:
The permissions on the socket are 777, owner/group postgres. I installed 9.3 onto the CentOS 7 server using the repo at postgresql.org (http://yum.postgresql.org/9.3/redhat/rhel-$releasever-$basearch). There is no /var/run/postgresql and find cannot find another socket anywhere else.

Sounds similar to this:
Long version: http://serverfault.com/questions/609947/database-connection-to-postgresql-refused-for-flask-app-under-mod-wsgi-when-start
Short version: Disable SELinux

Cheers and thanks,
Stephen

On 08/10/14 14:32, Tom Lane wrote:
Stephen Davies sdav...@sdc.com.au writes: I am in the process of migrating a bunch of databases and associated CGI scripts from 9.1.4 to 9.3 (and from 32-bit to 64-bit). The database migration has been successful, but I have an issue with psql connections from CGI scripts. I can connect to the 9.3 server locally with psql from the command line, with psql from other boxes on the LAN via TCP, and via JDBC from programs and servlets, but cannot connect locally via CGI. If I run any of the CGI scripts from the command line they work, but when invoked by Apache they fail with the usual question as to whether anything is listening on socket /tmp/.s.PGSQL.5432.

Some Linux variants think it improves security to run daemons like apache in a context where what the daemon sees as /tmp has been mapped somewhere else. If you're running one of these platforms, the Postgres server and libpq distributed by the vendor will have been hacked to cope, typically by agreeing that the socket location is something like /var/run/postgresql/ rather than /tmp. I'm guessing your 9.3 installation was self-built and hasn't been configured that way.

regards, tom lane

-- 
Adrian Klaver
adrian.kla...@aklaver.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
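If SELinux does turn out to be the blocker, confirming the diagnosis before disabling it outright is usually safer. A sketch for a typical RHEL/CentOS setup (the boolean in the last step assumes the stock targeted policy and only matters if the CGI scripts connect over TCP):

```shell
# See whether SELinux is enforcing (a common cause of Apache-only connection failures):
getenforce

# Temporarily switch to permissive mode and retest the CGI script:
sudo setenforce 0

# If that fixed it, restore enforcement and prefer a targeted boolean
# over disabling SELinux entirely:
sudo setenforce 1
sudo setsebool -P httpd_can_network_connect_db 1
```

Note that Tom's diagnosis (a remapped private /tmp for the Apache daemon) is a separate mechanism; if permissive mode does not help, the socket-directory mismatch is the more likely cause.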
Re: [GENERAL] Convert raw fields into Datum
sri harsha sriharsha9...@gmail.com writes:
I have the raw fields of a particular row as char *. Now my requirement is to convert these values into Datum *. How should I go about this? Does converting into Datum depend on the type of the field too?

Yes. You are probably looking for BuildTupleFromCStrings().

regards, tom lane
[GENERAL] Sync production DB with development?
I am currently doing periodic syncs of one of my production databases to my development database using the command pg_dump -c -h <production host> <database name> | psql <database name>, run on my development server. This works well enough, but as the size of the production database grows, this command is, for obvious reasons, getting progressively slower (a full sync like this currently takes about 35 seconds). Is there a better way? Something that will only transfer records that are different on the production server, like rsync does for files, perhaps?

My main concern here is the time it takes to sync, given that the database size will only continue growing as time passes (unless I start implementing an archive at some point). The current database has two years' worth of records. I would assume that the time the sync takes would grow roughly linearly with the number of records, so I could easily be over a minute of sync time in another two years. I would really rather not have to wait several minutes every time I want to update my development data.

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---
Re: [GENERAL] Sync production DB with development?
On 08/10/14 at 14:01, Israel Brewster wrote:
I am currently doing periodic syncs of one of my production databases to my development database using the command pg_dump -c -h <production host> <database name> | psql <database name>, run on my development server. This works well enough, but as the size of the production database grows, this command is, for obvious reasons, getting progressively slower (a full sync like this currently takes about 35 seconds). Is there a better way? Something that will only transfer records that are different on the production server, like rsync does for files, perhaps?

You can set up a streaming server; however, I wouldn't recommend syncing from a production server. Usually there is no need to have *all* the data from production in development. Both environments should be isolated for security reasons.

The other option is to implement a QA server, streaming from the master or taking a nightly snapshot with pg_basebackup. I think that could be more than enough. Actually, doing pg_dump | psql could take more time than pg_basebackup.

My main concern here is the time it takes to sync, given that the database size will only continue growing as time passes (unless I start implementing an archive at some point). The current database has two years' worth of records. I would assume that the time the sync takes would grow roughly linearly with the number of records, so I could easily be over a minute of sync time in another two years. I would really rather not have to wait several minutes every time I want to update my development data.

What is the entire size of your production cluster?

-- 
Emanuel Calvo
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Bs. As., Argentina (GMT-3)
Re: [GENERAL] Processor usage/tuning question
On Tue, Oct 7, 2014 at 12:06 PM, israel isr...@eraalaska.net wrote:
Thank you all for the advice. It looks like the load is due to a query that is taking around 1300ms to complete - a query that is run by every client connected (probably half a dozen or so, although I don't have specific numbers), every fifteen seconds or so. As you can imagine, that keeps the server rather busy :-) Specifically, it looks like the time is due to a sort (PARTITION BY tail ORDER BY pointtime DESC) that operates on around 100,000 rows. The lovely details:

The query in question is the following:

SELECT * FROM
  (SELECT tail, to_char(pointtime,'MM/DD/ HH24:MI:SS'),
          lat, lng, altitude, heading, speed, source, pointtime,
          ROW_NUMBER() OVER (PARTITION BY tail ORDER BY pointtime DESC) as row
   FROM data
   WHERE tail IN (list of about 55 values or so)
     AND pointtime >= 'timestamp of 24 hours prior to current UTC time'::timestamp) s1
WHERE s1.row <= 5
ORDER BY tail, pointtime DESC

In English, it boils down to: get the five most recent data points for each listed tail number. I look at the last 24 hours of data because it is quite possible that a tail number may have no recent data points.

How many different tail numbers do you have in the last 24 hours? Based on the numbers you provide, it sounds like the list of 55 tail numbers is pretty much all of them that it could expect to find anyway.

One obvious optimization is to look at a smaller time range. This will definitely speed up the query, but at the risk of not getting any data points for one or more of the requested tail numbers (there is already this risk, but looking back 24 hours keeps it fairly small for us).

But then, do you care? If you haven't heard from an airplane in 24 hours, it seems like either you don't care, or you care very very much and don't need the database to remind you.
The table description:

tracking=# \d data
                                  Table "public.data"
  Column   |            Type             |                     Modifiers
-----------+-----------------------------+---------------------------------------------------
 id        | bigint                      | not null default nextval('data_id_seq'::regclass)
 tail      | character varying(16)       | not null
 timerecp  | timestamp without time zone | not null default now()
 altitude  | integer                     |
 pointtime | timestamp without time zone |
 lat       | numeric(7,5)                | not null
 lng       | numeric(8,5)                | not null
 speed     | integer                     |
 heading   | integer                     |
 source    | character varying(64)       |
 syncd     | boolean                     | default false
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "pointtime_idx" btree (pointtime)
    "syncd_idx" btree (syncd)
    "tail_idx" btree (tail)
    "tailtime_idx" btree (tail, pointtime DESC)
    "timerecp_idx" btree (timerecp)
tracking=#

Adding the two-column sorted index didn't seem to affect the query time much.

I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in an inner table and then a row <= 5 in the outer one being equivalent to a LIMIT query for which it could walk an index and then stop once it finds 5 of them.

Does this need to be issued as a single query? Why not issue 55 different queries? It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.

The table currently contains 1303951 rows, and any given 24-hour period has around 110,000 rows. The results of the explain analyze command can be seen here: http://explain.depesz.com/s/H5w (nice site, btw. I'll have to be sure to bookmark it), where it clearly shows that the sort on data.tail, data.pointtime is the largest timesink (if I am reading it right).

The sort does seem pretty slow. What is your encoding and collation? Could you use the C collation if you are not already?

Cheers,
Jeff
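The per-tail approach lets the planner walk the (tail, pointtime DESC) index backward and stop after five rows instead of sorting the whole window. A sketch of what one such query might look like (the tail number 'N123AB' is a made-up placeholder):

```sql
-- One query per tail number; with tailtime_idx this becomes a short
-- index scan that stops as soon as five matching rows are found.
SELECT tail, pointtime, lat, lng, altitude, heading, speed, source
FROM data
WHERE tail = 'N123AB'
  AND pointtime >= (now() AT TIME ZONE 'UTC') - interval '24 hours'
ORDER BY tail, pointtime DESC
LIMIT 5;
```

The client then issues one of these per selected tail number and assembles the results itself.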
Re: [GENERAL] Converting char to varchar automatically
Hi!

Using Tom's recommendation I added "not attisdropped" and now have the query:

SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) ||
       ' ALTER COLUMN ' || quote_ident(a.attname) || ' TYPE varchar;'
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typname = 'bpchar'
  AND c.relkind = 'r'
  AND n.nspname <> 'pg_catalog'
  AND NOT attisdropped;

Will this create commands which replace all user-defined char columns in the database with varchar?

TYPE varchar creates a single-character column, so most ALTER TABLE commands will fail. How can I change this so that the original char column width is kept? I looked into the tables used in this query but haven't found a column which holds the char column's defined width. How do I get it, or is it better to re-write this query using information_schema?

How can I change this query so that it creates a single ALTER TABLE command for every table (with multiple ALTER COLUMN clauses), to increase conversion speed?

Andrus.
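For what it's worth, the declared width of a char(n) column is stored in pg_attribute.atttypmod as the length plus a 4-byte header, so the width can be recovered without joining information_schema. A sketch along the lines of the query above (untested):

```sql
-- Sketch: for bpchar, atttypmod is the declared length + 4 (VARHDRSZ),
-- so (atttypmod - 4) recovers the original char(n) width.
SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) ||
       ' ALTER COLUMN ' || quote_ident(a.attname) ||
       ' TYPE varchar(' || (a.atttypmod - 4) || ');'
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typname = 'bpchar'
  AND c.relkind = 'r'
  AND n.nspname <> 'pg_catalog'
  AND NOT a.attisdropped;
```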
Re: [GENERAL] Processor usage/tuning question
---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

On Oct 8, 2014, at 11:34 AM, Jeff Janes jeff.ja...@gmail.com wrote:

How many different tail numbers do you have in the last 24 hours?
Based on the numbers you provide, it sounds like the list of 55 tail numbers is pretty much all of them that it could expect to find anyway.

In this case yes - that 55 or so is everything (worst case, but fairly common scenario). The system is set up such that the user can select which tail number(s) they want to see, thus the "tail IN (list)" construct. It's just that they often select all tail numbers.

One obvious optimization is to look at a smaller time range. This will definitely speed up the query, but at the risk of not getting any data points for one or more of the requested tail numbers (there is already this risk, but looking back 24 hours keeps it fairly small for us).

But then, do you care? If you haven't heard from an airplane in 24 hours, it seems like either you don't care, or you care very very much and don't need the database to remind you.

Except that the last data point received is still valid as the aircraft's current location, even if it came in several hours ago - which we may well care about. That is, some users may need to see where a given aircraft (or the entire fleet) is, even if an aircraft hasn't updated in a while. That said, I did discuss this with my higher-ups, and got the ok to take it down to four hours.
I don't think PostgreSQL is going to be able to reason very effectively about a ROW_NUMBER() in an inner table and then a row <= 5 in the outer one being equivalent to a LIMIT query for which it could walk an index and then stop once it finds 5 of them.

Does this need to be issued as a single query? Why not issue 55 different queries? It seems like the client is likely going to need to pick the returned list back out by tail number anyway, so both the client and the server might be happier with separate queries.

Good point. Doing that brought the entire execution time down to around 60ms. Just ever so slightly better than the ~1200ms I was getting before. :-) I just have an unreasonable aversion to making multiple queries. I guess this is a prime example of why I need to overcome that :-)

The table currently contains 1303951 rows, and
Re: [GENERAL] Sync production DB with development?
On Oct 8, 2014, at 9:30 AM, Emanuel Calvo emanuel.ca...@2ndquadrant.com wrote:
You can set up a streaming server; however, I wouldn't recommend syncing from a production server.

No, that wouldn't work well, because I need full access to my development server, and I need to be able to NOT have it in sync while I am working on it.

Usually there is no need to have *all* the data from production in development. Both environments should be isolated for security reasons.

Agreed. And no, I don't need all the data. But pg_dump doesn't give me an option to, say, only grab the last week of data.

The other option is to implement a QA server, streaming from the master or taking a nightly snapshot with pg_basebackup. I think that could be more than enough.

A QA server is great, and nightly snapshots are probably fine for that, however it doesn't help with my development server, I don't think.

Actually, doing pg_dump | psql could take more time than pg_basebackup.

What is the entire size of your production cluster?

At the moment, only about 538MB. Which I realize isn't all that large in the grand scheme of databases.
Re: [GENERAL] Sync production DB with development?
so I could easily be over a minute of sync time in another two years. I would really rather not have to wait several minutes every time I want to update my development data.

A minute is really not that long of a wait, especially given the tradeoff in complexity. Still, if the majority of the time is spent moving old data from one or more tables, you could exclude those from the pg_dump with -T, then copy over some small subset of the table with a pair of COPY commands from prod to dev.

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201410081635
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
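The -T-plus-COPY approach might look roughly like the following sketch, where "track_log" and its "recorded" timestamp column are made-up names, and "prodhost"/"mydb" stand in for the real host and database:

```shell
# Dump everything except the big history table from production into dev:
pg_dump -c -h prodhost -T track_log mydb | psql mydb

# Then pull just a recent slice of the big table with a pair of COPY commands:
psql -h prodhost -d mydb \
  -c "\copy (SELECT * FROM track_log WHERE recorded > now() - interval '7 days') TO STDOUT" \
  | psql -d mydb -c "\copy track_log FROM STDIN"
```

This keeps the slow part of the sync proportional to a week of data rather than the full history.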
Re: [GENERAL] Sync production DB with development?
On 08/10/14 at 17:24, Israel Brewster wrote:
On Oct 8, 2014, at 9:30 AM, Emanuel Calvo emanuel.ca...@2ndquadrant.com wrote:
You can set up a streaming server; however, I wouldn't recommend syncing from a production server.

No, that wouldn't work well, because I need full access to my development server, and I need to be able to NOT have it in sync while I am working on it.

That's the issue: streaming will leave your development server read-only, and I guess you don't want that.

Usually there is no need to have *all* the data from production in development. Both environments should be isolated for security reasons.

Agreed. And no, I don't need all the data. But pg_dump doesn't give me an option to, say, only grab the last week of data.

Yeah, at this point you will need to build your own sample data using production values.

The other option is to implement a QA server, streaming from the master or taking a nightly snapshot with pg_basebackup. I think that could be more than enough.

A QA server is great, and nightly snapshots are probably fine for that, however it doesn't help with my development server, I don't think.

Well, the nightly snapshots will allow you to sync your development server every night and write over it during the day (you can re-sync as many times as you want).
Actually, doing pg_dump | psql could take more time than pg_basebackup.

What is the entire size of your production cluster?

At the moment, only about 538MB. Which I realize isn't all that large in the grand scheme of databases.

I think that at this point you shouldn't be worried at all. Usually, you can build a sample-data DB and use it as a template. Then, any time you want to drop and recreate your database, you just do a *create database* with the template option.

-- 
Emanuel Calvo
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Bs. As., Argentina (GMT-3)
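The template approach mentioned above might look like this sketch, where "devdb" and "sample_template" are made-up names (note that no one may be connected to the template database while it is being cloned):

```sql
-- Keep a pristine sample-data database around, then clone it
-- whenever the development database needs a reset:
DROP DATABASE IF EXISTS devdb;
CREATE DATABASE devdb TEMPLATE sample_template;
```

Cloning a template this way is a file-level copy, so it is typically much faster than replaying a pg_dump.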
Re: [GENERAL] Sync production DB with development?
On 10/8/14, 2:24 PM, Israel Brewster wrote:
Agreed. And no, I don't need all the data. But pg_dump doesn't give me an option to, say, only grab the last week of data.

You might want to check out pg_sample. It lets you get a referentially consistent sample of a database. You specify the number of rows you want and it will do the rest, retrieving enough rows from other tables to satisfy any foreign key constraints.

https://github.com/mla/pg_sample

Jeff
Re: [GENERAL] Converting char to varchar automatically
This revised query should give you what you need:

SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) ||
       ' ALTER COLUMN ' || quote_ident(a.attname) ||
       ' TYPE varchar(' || i.character_maximum_length || ');'
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_type t ON t.oid = a.atttypid
JOIN information_schema.columns i ON (i.table_name = c.relname AND i.column_name = a.attname)
WHERE t.typname = 'bpchar'
  AND c.relkind = 'r'
  AND n.nspname <> 'pg_catalog'
  AND NOT attisdropped;

On Wed, Oct 8, 2014 at 3:34 PM, Andrus kobrule...@hot.ee wrote:
TYPE varchar creates a single-character column, so most ALTER TABLE commands will fail. How can I change this so that the original char column width is kept? I looked into the tables used in this query but haven't found a column which holds the char column's defined width. How do I get it, or is it better to re-write this query using information_schema?

-- 
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.
Re: [GENERAL] How to synchronize tables in remote (production) and local databases in case if the structure of local database's table has been modified
On Wed, Oct 8, 2014 at 12:49 AM, Vitaly Isaev vis...@team112.ru wrote:
I am trying to figure out how to dump the contents of several selected tables from the server in order to update the tables on development workstations. The biggest challenge is that the tables I'm trying to synchronize may have diverged (developers may add - but not delete - new fields to the tables through the Django ORM, while the schema of the production database remains unchanged for a long time).

The COPY trick will probably help you. Note that I specify a column list in the last COPY statement.

skonoplev@[local]:5432 ~=# create table t (i integer);
CREATE TABLE
skonoplev@[local]:5432 ~=# insert into t select * from generate_series(1, 5);
INSERT 0 5
skonoplev@[local]:5432 ~=# copy t to '/tmp/t.dump';
COPY 5
skonoplev@[local]:5432 ~=# truncate t;
TRUNCATE TABLE
skonoplev@[local]:5432 ~=# alter table t add s text;
ALTER TABLE
skonoplev@[local]:5432 ~=# copy t(i) from '/tmp/t.dump';
COPY 5
skonoplev@[local]:5432 ~=# select * from t;
 i | s
---+---
 1 |
 2 |
 3 |
 4 |
 5 |
(5 rows)

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA
http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (499) 346-7196, +7 (988) 888-1979
gray...@gmail.com
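Applied across servers, the same trick can run without an intermediate file or database by pairing psql's \copy on both ends. A sketch, where "prodhost", "source_db", "target_db", "tablename", and the column list are all placeholders for the real names:

```shell
# Stream only the columns that exist on BOTH sides, production -> workstation;
# columns added locally through the Django ORM are left untouched.
psql -h prodhost -d source_db \
  -c "\copy tablename (col1, col2) TO STDOUT" \
  | psql -d target_db -c "\copy tablename (col1, col2) FROM STDIN"
```

Because the column list is explicit on the receiving side as well, the workstation's extra fields simply keep their defaults.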
Re: [GENERAL] Converting char to varchar automatically
Hi!

Thank you.

> This revised query should give you what you need:
>
> SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
>        || ' ALTER COLUMN ' || quote_ident(a.attname)
>        || ' TYPE varchar(' || i.character_maximum_length || ');'
> FROM pg_class c
> JOIN pg_namespace n ON n.oid = c.relnamespace
> JOIN pg_attribute a ON a.attrelid = c.oid
> JOIN pg_type t ON t.oid = a.atttypid
> JOIN information_schema.columns i
>   ON (i.table_name = c.relname AND i.column_name = a.attname)
> WHERE t.typname = 'bpchar'
>   AND c.relkind = 'r'
>   AND n.nspname <> 'pg_catalog'
>   AND NOT a.attisdropped;

How can I create a single ALTER TABLE command for every table? Can we use a string concatenation aggregate function, window functions, plpgsql, or something else?

Andrus.
Re: [GENERAL] Converting char to varchar automatically
There really is no easy way to make a single ALTER for each table unless you use a programming language. However, adding a GROUP BY c.relname, a.attname would certainly simplify editing; then you can combine all the ALTER COLUMNs for each table.

On Wed, Oct 8, 2014 at 6:21 PM, Andrus kobrule...@hot.ee wrote:

> Hi!
>
> Thank you.
>
>> This revised query should give you what you need:
>>
>> SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
>>        || ' ALTER COLUMN ' || quote_ident(a.attname)
>>        || ' TYPE varchar(' || i.character_maximum_length || ');'
>> FROM pg_class c
>> JOIN pg_namespace n ON n.oid = c.relnamespace
>> JOIN pg_attribute a ON a.attrelid = c.oid
>> JOIN pg_type t ON t.oid = a.atttypid
>> JOIN information_schema.columns i
>>   ON (i.table_name = c.relname AND i.column_name = a.attname)
>> WHERE t.typname = 'bpchar'
>>   AND c.relkind = 'r'
>>   AND n.nspname <> 'pg_catalog'
>>   AND NOT a.attisdropped;
>
> How can I create a single ALTER TABLE command for every table? Can we use a
> string concatenation aggregate function, window functions, plpgsql, or
> something else?
>
> Andrus.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you wish to share my fantasy is entirely up to you.
[GENERAL] Importing large object (with lo_import) to table in separate tablespaces takes up lots of space in $PGDATA
Hi all.

I have a database called apeland which at first (when created) was in the default tablespace (in $PGDATA); then I moved it with these commands:

# create tablespace apeland location '/home/andreak/programs/postgresql/9.3.5/tablespaces/apeland';
CREATE TABLESPACE

Check space usage before moving:

$ du -hs data/ tablespaces/
59M     data/
27M     tablespaces/

Move the database to the new tablespace:

# alter database apeland set tablespace apeland;
ALTER DATABASE

Check space usage after moving:

$ du -hs data/ tablespaces/
52M     data/
34M     tablespaces/

Then I created this table:

# create table files(data oid);
CREATE TABLE

and inserted this file:

$ du -hs origo-war-01-14-01.20.war
130M    origo-war-01-14-01.20.war

# insert into files(data) values(lo_import('/home/andreak/data/origo-war-01-14-01.20.war'));
INSERT 0 1

Check space usage:

$ du -hs data/ tablespaces/
164M    data/
208M    tablespaces/

Now - why is so much extra space used in $PGDATA? Is there a way to reclaim it? Was the apeland database moved completely, or is there a lot dangling left in $PGDATA?

Thanks.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
Re: [GENERAL] psql connection issue
This is not the same issue; I had already disabled SELinux for other reasons. The actual cause of my issue was the new private /tmp facility in the systemd startup of httpd, which makes the PostgreSQL socket invisible to CGI scripts. We have survived for many years without this before migrating to CentOS 7, so I simply disabled this too and all came good.

Cheers and thanks,
Stephen

On 08/10/14 23:49, Adrian Klaver wrote:
> On 10/07/2014 09:10 PM, Stephen Davies wrote:
>> The permissions on the socket are 777, owner/group postgres. I installed
>> 9.3 onto the CentOS 7 server using the repo at postgresql.org
>> (http://yum.postgresql.org/9.3/redhat/rhel-$releasever-$basearch).
>> There is no /var/run/postgresql, and find cannot find another socket
>> anywhere else.
>
> Sounds similar to this:
>
> Long version:
> http://serverfault.com/questions/609947/database-connection-to-postgresql-refused-for-flask-app-under-mod-wsgi-when-start
>
> Short version: Disable SELinux

On 08/10/14 14:32, Tom Lane wrote:
> Stephen Davies sdav...@sdc.com.au writes:
>> I am in the process of migrating a bunch of databases and associated CGI
>> scripts from 9.1.4 to 9.3 (and from 32-bit to 64-bit). The database
>> migration has been successful, but I have an issue with psql connections
>> from CGI scripts. I can connect to the 9.3 server locally with psql from
>> the command line, with psql from other boxes on the LAN via TCP, and via
>> JDBC from programs and servlets, but cannot connect locally via CGI. If I
>> run any of the CGI scripts from the command line they work, but when
>> invoked by Apache they fail with the usual question as to whether anything
>> is listening on socket /tmp/.s.PGSQL.5432.
>
> Some Linux variants think it improves security to run daemons like apache
> in a context where what the daemon sees as /tmp has been mapped somewhere
> else. If you're running one of these platforms, the Postgres server and
> libpq distributed by the vendor will have been hacked to cope, typically
> by agreeing that the socket location is something like /var/run/postgresql/
> rather than /tmp. I'm guessing your 9.3 installation was self-built and
> hasn't been configured that way.
>
> regards, tom lane

--
=
Stephen Davies Consulting P/L          Phone: 08-8177 1595
Adelaide, South Australia.             Mobile: 040 304 0583
Records Collections Management.
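For reference, the private-/tmp behaviour Stephen describes can be disabled for httpd alone rather than system-wide, via a systemd drop-in; a minimal sketch (file path follows the usual drop-in convention on stock CentOS 7 - verify on your system):

```
# /etc/systemd/system/httpd.service.d/no-private-tmp.conf
[Service]
PrivateTmp=false
```

After creating the file, run `systemctl daemon-reload` followed by `systemctl restart httpd`. An alternative that keeps PrivateTmp enabled is to move the socket out of /tmp (unix_socket_directories in postgresql.conf) and point the CGI scripts at it, e.g. via PGHOST.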
Re: [GENERAL] Importing large object (with lo_import) to table in separate tablespaces takes up lots of space in $PGDATA
Hi,

On 9 Oct 2014 01:31, Andreas Joseph Krogh andr...@visena.com wrote:

> [...]
>
> Check space usage:
>
> $ du -hs data/ tablespaces/
> 164M    data/
> 208M    tablespaces/
>
> Now - why is so much extra space used in $PGDATA? Is there a way to
> reclaim it? Was the apeland database moved completely, or is there a lot
> dangling left in $PGDATA?

Everything is moved if you used ALTER DATABASE. I'd guess what you have in the data folder is mostly WAL files. You should run du on data/base to get the size of the relations' files, not on everything else including configuration files.
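To check Guillaume's guess, it helps to see where the large-object data actually landed before worrying about anything dangling; a hedged sketch, run in the apeland database (your sizes will differ):

```sql
-- Total size of the database, counted in whichever tablespace it lives in.
SELECT pg_size_pretty(pg_database_size('apeland'));

-- pg_largeobject is the per-database catalog that stores lo_import'ed data,
-- so this shows where the 130 MB file went.
SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject'));
```

At the shell, `du -hs data/base data/pg_xlog` (pg_xlog is the WAL directory in 9.3) separates relation files from WAL; a burst of WAL after a 130 MB lo_import is expected, and those segments are recycled automatically rather than needing manual reclamation.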
Re: [GENERAL] Converting char to varchar automatically
Hi!

> There really is no easy way to make a single ALTER for each table unless
> you use a programming language.

Isn't SQL a programming language?

> However, adding a GROUP BY c.relname, a.attname would certainly simplify
> editing; then you can combine all the ALTER COLUMNs for each table.

I wrote:

with stem as (
    SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname) as prefix,
           string_agg(' ALTER COLUMN ' || quote_ident(a.attname)
                      || ' TYPE varchar(' || i.character_maximum_length || ')', ',') as body
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_attribute a ON a.attrelid = c.oid
    JOIN pg_type t ON t.oid = a.atttypid
    JOIN information_schema.columns i
      ON (i.table_name = c.relname AND i.column_name = a.attname)
    WHERE t.typname = 'bpchar'
      AND c.relkind = 'r'
      AND n.nspname <> 'pg_catalog'
      AND NOT attisdropped
    GROUP BY 1
)
select prefix || ' ' || body || ';' as statement
from stem

Is this perfect?

Andrus.
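If the goal is to apply the conversion directly rather than edit a generated script, the same aggregation can feed EXECUTE in a plpgsql DO block. A hedged sketch along the lines of Andrus's query; note it also matches the schema on the information_schema join, since joining on table and column name alone can cross-match same-named tables in different schemas:

```sql
-- Hedged sketch: build one ALTER TABLE per table and run it immediately.
DO $$
DECLARE
    stmt text;
BEGIN
    FOR stmt IN
        SELECT 'ALTER TABLE ' || quote_ident(n.nspname) || '.' || quote_ident(c.relname)
               || string_agg(' ALTER COLUMN ' || quote_ident(a.attname)
                             || ' TYPE varchar(' || i.character_maximum_length || ')', ',')
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        JOIN pg_attribute a ON a.attrelid = c.oid
        JOIN pg_type t ON t.oid = a.atttypid
        JOIN information_schema.columns i
          ON i.table_schema = n.nspname      -- match the schema, not just names
         AND i.table_name = c.relname
         AND i.column_name = a.attname
        WHERE t.typname = 'bpchar'
          AND c.relkind = 'r'
          AND n.nspname <> 'pg_catalog'
          AND NOT a.attisdropped
        GROUP BY n.nspname, c.relname
    LOOP
        EXECUTE stmt;
    END LOOP;
END $$;
```

One ALTER TABLE with many ALTER COLUMN clauses rewrites each table once instead of once per column, which is where the speedup comes from.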