Re: [GENERAL] Avoiding a deadlock
Paul Jungwirth wrote: Out of curiosity: any reason the ORDER BY should be in the subquery? It seems like it ought to be in the UPDATE (if that's allowed). Hmm, it's not allowed. :-) It's still surprising that you can guarantee the order of a multi-row UPDATE by ordering a subquery. To be honest, I don't think that there is any guarantee for this to work reliably in all comparable cases, as PostgreSQL does not guarantee in which order it performs the UPDATEs. It just happens to work with certain plans (use EXPLAIN to see what will happen). Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
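For readers of the archive, a minimal sketch of the pattern under discussion (table and column names are hypothetical). What prevents the deadlock is taking the row locks in a fixed order via SELECT ... ORDER BY ... FOR UPDATE in the subquery; as noted above, the order in which the UPDATE itself then touches the rows is not guaranteed.

-- Hypothetical example: two sessions updating overlapping sets of rows.
-- The FOR UPDATE in the ordered subquery acquires the row locks in id order,
-- so concurrent sessions cannot lock the same rows in opposite order.
UPDATE accounts a
   SET balance = a.balance + 10
  FROM (SELECT id
          FROM accounts
         WHERE group_id = 1
         ORDER BY id
         FOR UPDATE) locked
 WHERE a.id = locked.id;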
[GENERAL] Problems with PostgreSQL Replication (Log Shipping)
Hello, everybody. I have a problem and need some help. My environment: one master and one slave (PostgreSQL 9.2.2). My cluster is about 160GB and I use pg_basebackup to synchronize them (master and slave). The syntax is below:

pg_basebackup -h productionaddress -p productionport -U productionuser -D datadirectory -P -v

My recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=productionaddress port=productionport user=productionuser'
archive_cleanup_command = 'pg_archivecleanup /slave/transactionlogs %r'

My postgresql.conf (master):
wal_level = hot_standby
checkpoint_segments = 10
archive_mode = on
archive_command = 'rsync -Crap %p postgres@slaveaddress:/slave/transactionlogs/%f'
max_wal_senders = 1
wal_keep_segments = 50

My postgresql.conf (slave):
checkpoint_segments = 10
hot_standby = on

On the slave I get the following errors. My first attempt:

2013-03-07 15:58:21 BRT [11817]: [1-1] user=,db= LOG: database system was interrupted; last known up at 2013-03-07 15:55:43 BRT
2013-03-07 15:58:21 BRT [11817]: [2-1] user=,db= LOG: entering standby mode
2013-03-07 15:58:21 BRT [11818]: [1-1] user=,db= LOG: streaming replication successfully connected to primary
2013-03-07 15:58:25 BRT [11817]: [3-1] user=,db= LOG: consistent recovery state reached at 141/8FBB5F0
2013-03-07 15:58:25 BRT [11817]: [4-1] user=,db= LOG: redo starts at 141/2251F90
2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL: could not access status of transaction 30622931
2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL: Could not read from file pg_clog/001D at offset 49152: Success.
2013-03-07 15:58:25 BRT [11817]: [7-1] user=,db= CONTEXT: xlog redo commit: 2013-03-07 15:55:40.673623-03
2013-03-07 15:58:25 BRT [11767]: [1-1] user=,db= LOG: startup process (PID 11817) exited with exit code 1
2013-03-07 15:58:25 BRT [11767]: [2-1] user=,db= LOG: terminating any other active server processes

My second attempt:

2013-03-11 12:07:49 BRT [5862]: [1-1] user=,db= LOG: database system was interrupted; last known up at 2013-03-11 12:06:31 BRT
2013-03-11 12:07:49 BRT [5862]: [2-1] user=,db= LOG: entering standby mode
2013-03-11 12:07:49 BRT [5864]: [1-1] user=,db= LOG: streaming replication successfully connected to primary
2013-03-11 12:07:53 BRT [5862]: [3-1] user=,db= LOG: consistent recovery state reached at 168/816AE10
2013-03-11 12:07:53 BRT [5862]: [4-1] user=,db= LOG: redo starts at 167/FEC3D828
2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL: could not access status of transaction 36529670
2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL: Could not read from file pg_clog/0022 at offset 212992: Success.
2013-03-11 12:07:53 BRT [5862]: [7-1] user=,db= CONTEXT: xlog redo commit: 2013-03-11 12:05:35.069759-03
2013-03-11 12:07:53 BRT [5762]: [1-1] user=,db= LOG: startup process (PID 5862) exited with exit code 1
2013-03-11 12:07:53 BRT [5762]: [2-1] user=,db= LOG: terminating any other active server processes

Both attempts fail the same way, but in different pg_clog files.

First attempt:
2013-03-07 15:58:25 BRT [11817]: [5-1] user=,db= FATAL: could not access status of transaction 30622931
2013-03-07 15:58:25 BRT [11817]: [6-1] user=,db= DETAIL: Could not read from file pg_clog/001D at offset 49152: Success.

Second attempt:
2013-03-11 12:07:53 BRT [5862]: [5-1] user=,db= FATAL: could not access status of transaction 36529670
2013-03-11 12:07:53 BRT [5862]: [6-1] user=,db= DETAIL: Could not read from file pg_clog/0022 at offset 212992: Success.
When I first created the cluster and ran this test, it was OK. Now my cluster is about 160GB, and when I try to start the replication I get these problems. Any ideas? Thank you. Best Regards João Paulo -- JotaComm http://jotacomm.wordpress.com
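One detail worth noting for the archive (an observation, not a confirmed diagnosis): the master's archive_command ships WAL to /slave/transactionlogs, but the recovery.conf shown has no restore_command, so the standby can only replay what arrives over the streaming connection. A sketch of a recovery.conf that also reads the shipped archives, assuming the same directory layout as above:

standby_mode = 'on'
primary_conninfo = 'host=productionaddress port=productionport user=productionuser'
restore_command = 'cp /slave/transactionlogs/%f %p'    # read the WAL segments shipped by archive_command
archive_cleanup_command = 'pg_archivecleanup /slave/transactionlogs %r'

On 9.2, taking the base backup with pg_basebackup -x (or --xlog) so the backup itself includes the WAL needed to reach a consistent state is also an option.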
[GENERAL] indexing elements of a csv ?
Hi: v9.0.1 on linux. I have a table with a column that is a csv. Users will select records based upon the existence of an element of the csv. There is an index on that column but I'm thinking that it won't be of much use in this situation. Is there a way to facilitate these queries? Example: create table foo (col0 text, col1 text); create index foo_col1 on foo (col1); insert into foo (col0,col1) values ('moe','aa,bbb,c'),('larry','x,bbb,yyy'),('curly','m,,oo'); now... select col0 from foo where the csv element 'bbb' exists as a csv element of col1 Some attempts, which get the right answers, but which probably won't be very efficient... select col0 from foo where string_to_array('bbb',',') <@ string_to_array(col1,','); select col0 from foo where ','||col1||',' like '%,bbb,%'; select col0 from foo where ((col1 like 'bbb,%') or (col1 like '%,bbb,%') or (col1 like '%,bbb')); Long shot, but I thought I'd ask anyway. Thanks in Advance !
Re: [GENERAL] indexing elements of a csv ?
2013/3/12 Gauthier, Dave dave.gauth...@intel.com: Hi: v9.0.1 on linux. I have a table with a column that is a csv. Users will select records based upon the existence of an element of the csv. There is an index on that column but I'm thinking that it won't be of much use in this situation. Is there a way to facilitate these queries? Example: create table foo (col0 text, col1 text); create index foo_col1 on foo (col1); insert into foo (col0,col1) values ('moe','aa,bbb,c'),('larry','x,bbb,yyy'),('curly','m,,oo'); now... select col0 from foo where the csv element 'bbb' exists as a csv element of col1 Some attempts, which get the right answers, but which probably won't be very efficient... select col0 from foo where string_to_array('bbb','') @ string_to_array(col1); select col0 from foo where ','||col1||',' like '%,bbb,%'; select col0 from foo where ((col1 like 'bbb,%') or (col1 like '%,bbb,%') or (col1 like '%,bbb')); Long shot, but I thought I'd ask anyway. A GIN index might do the trick: CREATE INDEX ix_col1_ix ON foo USING GIN(string_to_array(col1,',')); (This is assuming the CSV values can be cleanly converted to an array using string_to_array()). You could then query it with: SELECT col0 FROM foo WHERE string_to_array(col1,',') @> '{bbb}'::text[]; HTH Ian Barwick -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] indexing elements of a csv ?
An option would be to create a column of type tsvector. That way you could do text searches using partial words or words and get results including those containing forms of the word. From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Gauthier, Dave [dave.gauth...@intel.com] Sent: Tuesday, March 12, 2013 8:50 AM To: pgsql-general@postgresql.org Subject: [GENERAL] indexing elements of a csv ? Hi: v9.0.1 on linux. I have a table with a column that is a csv. Users will select records based upon the existence of an element of the csv. There is an index on that column but I'm thinking that it won't be of much use in this situation. Is there a way to facilitate these queries? Example: create table foo (col0 text, col1 text); create index foo_col1 on foo (col1); insert into foo (col0,col1) values ('moe','aa,bbb,c'),('larry','x,bbb,yyy'),('curly','m,,oo'); now... select col0 from foo where the csv element 'bbb' exists as a csv element of col1 Some attempts, which get the right answers, but which probably won't be very efficient... select col0 from foo where string_to_array('bbb','') @ string_to_array(col1); select col0 from foo where ','||col1||',' like '%,bbb,%'; select col0 from foo where ((col1 like 'bbb,%') or (col1 like '%,bbb,%') or (col1 like '%,bbb')); Long shot, but I thought I'd ask anyway. Thanks in Advance !
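A hedged sketch of that full-text-search approach, reusing the hypothetical foo table from the original post (the 'simple' configuration is an assumption; pick one that fits the data):

-- Index an expression that turns the CSV text into a tsvector,
-- then query it with full-text operators.
CREATE INDEX foo_col1_fts ON foo
    USING GIN (to_tsvector('simple', replace(col1, ',', ' ')));

SELECT col0
  FROM foo
 WHERE to_tsvector('simple', replace(col1, ',', ' ')) @@ to_tsquery('simple', 'bbb');

Note that full-text search tokenizes and normalizes the values, so exact matching of arbitrary CSV elements is generally better served by the array/GIN approach shown in the other reply.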
Re: [GENERAL] indexing elements of a csv ?
2013/3/13 Ian Lawrence Barwick barw...@gmail.com: 2013/3/12 Gauthier, Dave dave.gauth...@intel.com: Hi: v9.0.1 on linux. I have a table with a column that is a csv. Users will select records based upon the existence of an element of the csv. There is an index on that column but I'm thinking that it won't be of much use in this situation. Is there a way to facilitate these queries? Example: create table foo (col0 text, col1 text); create index foo_col1 on foo (col1); insert into foo (col0,col1) values ('moe','aa,bbb,c'),('larry','x,bbb,yyy'),('curly','m,,oo'); now... select col0 from foo where the csv element 'bbb' exists as a csv element of col1 Some attempts, which get the right answers, but which probably won't be very efficient... select col0 from foo where string_to_array('bbb','') @ string_to_array(col1); select col0 from foo where ','||col1||',' like '%,bbb,%'; select col0 from foo where ((col1 like 'bbb,%') or (col1 like '%,bbb,%') or (col1 like '%,bbb')); Long shot, but I thought I'd ask anyway. A GIN index might do the trick: CREATE INDEX ix_col1_ix ON foo USING GIN(string_to_array(col1,',')); (This is assuming the CSV values can be cleanly converted to an array using string_to_array()). You could then query it with: SELECT col0 FROM foo WHERE string_to_array(col1,',') @> '{bbb}'::text[]; Just out of interest, I populated the table with around 1,000,000 rows of randomly generated data (three items of random upper case characters in col1), results with and without index below (using an untuned 9.2 installation on a laptop with a slow hard drive). Note that adding the index doubled the total table size, which might be something to watch out for if the table is very big and you have a lot of unique values in the CSV column. Regards Ian Barwick

testdb=# SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[];
  col0  |       col1
--------+-------------------
 ARWC   | JIJ,MBDVU,PKRY
 FUNWOA | JKEK,PKRY,MQFUQTJ
 PJGTD  | KSO,HSTB,PKRY
(3 rows)

Time: 1325.536 ms

testdb=# explain SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[];
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..28400.42 rows=5021 width=76)
   Filter: (string_to_array(col1, ','::text) @> '{PKRY}'::text[])
(2 rows)

testdb=# CREATE INDEX ix_col1_ix ON foo using gin(string_to_array(col1,','));
CREATE INDEX
Time: 170533.158 ms

testdb=# ANALYZE foo;
ANALYZE
Time: 1431.665 ms

testdb=# SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[];
  col0  |       col1
--------+-------------------
 ARWC   | JIJ,MBDVU,PKRY
 FUNWOA | JKEK,PKRY,MQFUQTJ
 PJGTD  | KSO,HSTB,PKRY
(3 rows)

Time: 0.906 ms

testdb=# explain SELECT * from foo where string_to_array(col1,',') @> '{PKRY}'::text[];
                                  QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on foo  (cost=20.79..389.58 rows=101 width=24)
   Recheck Cond: (string_to_array(col1, ','::text) @> '{PKRY}'::text[])
   ->  Bitmap Index Scan on ix_col1_ix  (cost=0.00..20.76 rows=101 width=0)
         Index Cond: (string_to_array(col1, ','::text) @> '{PKRY}'::text[])
(4 rows)

Time: 0.377 ms

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Testing Technique when using a DB
I tried posting this from Google Groups but I did not see it come through after an hour so this may be a duplicate message for some. The current testing technique for things like Ruby On Rails has three choices, but none of the choices will work in my case. The first choice is truncate which starts a transaction before each test. If the testing is within the same process, this works and a rollback restores the DB. But if the testing involves two processes, then the test data entered by the test rig cannot be seen by the system under test (SUT). With Rails, there are times when this is needed. The test rig drives a browser which calls into a Rails application. There are dangerous ways to still use this method but each has various downfalls or risks. The other two choices are delete and truncate, which both end up with an empty database just after each test. This prevents any test data that is already in the database from being used after the first test. Note that a test run will run through a sequence of tests (usually quite a few). All of these are fairly fast, with each one being faster under different conditions (according to users). Generally, this pushes the Rails community to have either fixtures or factories. Both are OK solutions but both also have problems. In my case, I have a dozen or so tables all with very tight constraints, and creating either fixtures or factories is very troublesome. Also, I have a real database with real data in production and it seems foolish not to take advantage of the knowledge contained within that database. By knowledge I mean the particular values and weirdness within the data that a factory or a fixture might not realize.

One choice would be to create the database, use it, and then drop it for each test. I would create the database from a template that already has data taken from the production database (and probably trimmed down to a small subset of it). This requires some crafty dancing in the Rails setup since it likes to just attach to a database and run, but it could be done. At first blush, this sounds like it would be really slow, but maybe not.

The other choice would be to somehow copy the data to temporary tables before the test run and then copy it back. The advantage to this is it is not very PostgreSQL specific. Indeed, if the template database is already set up, then only one copy would be needed at the start of the test.

The other thought I had is if there is some type of leaky transaction: a transaction where another process can see the data but the rollback would still work and be effective. Essentially I'm asking if all the protections a database offers could be dropped... but I thought I'd see if that was possible.

The other thought is perhaps there is a snapshot type concept. I don't see it in the list of SQL commands. A snapshot would do exactly what it sounds like: it would take a snapshot and save it somehow. Then a restore to snapshot would restore the DB back to that state.

I thought this group might suggest other ideas and either nuke the really bad ideas or promote the plausible ideas I've mentioned above. Sorry for the long post. I appreciate your thoughts. Perry -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Testing Technique when using a DB
On 03/12/2013 08:41 AM, Perry Smith wrote: One choice would be to create the database, use it, and then drop it for each test. I would create the database from a template that already has data taken from the production database (and probably trimmed down to a small subset of it). This requires some crafty dancing in the Rails set up since it likes to just attach to a database and run but it could be done. From first blush, this sounds like it would be really slow but may be not. It depends on your environment (i.e. do you have isolated dev, test and production or are you testing your code on production machines) and the nature of your tests (functionality and bugs only or load and performance). The speed of CREATE DATABASE foo TEMPLATE bar; depends, of course on the size of your template but I've found it to be fast enough for test databases (a few hundred MB takes well under a minute on an old desktop). Try it and see. The other thought is perhaps there is a snap shot type concept. I don't see it in the list of SQL commands. A snap shot would do exactly what it sounds like. It would take a snap shot and save it somehow. Then a restore to snap shot would restore the DB back to that state. I may be missing something here but pg_dump/pg_restore do exactly that. Or you could get more complicated and use point-in-time recovery, external tools like pg_barman or even, perhaps, a file-level snapshot of the database files (when the PostgreSQL is shut down, of course) to handle your base test starting point. Of all the options mentioned my first inclination would be to create your test db from a known template prior to each test run. Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
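For reference, a minimal sketch of the dump/restore route mentioned above (database and file names are hypothetical; the custom dump format keeps the restore reasonably fast):

# Take a one-off snapshot of the trimmed-down test data (custom format).
pg_dump -Fc -f test_seed.dump test_template

# Reset the test database before a run.
dropdb --if-exists test_run    # --if-exists requires 9.2+; on older versions just ignore the error
createdb test_run
pg_restore -d test_run test_seed.dump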
Re: [GENERAL] Testing Technique when using a DB
On Mar 12, 2013, at 8:41 AM, Perry Smith pedz...@gmail.com wrote: One choice would be to create the database, use it, and then drop it for each test. I would create the database from a template that already has data taken from the production database (and probably trimmed down to a small subset of it). This requires some crafty dancing in the Rails set up since it likes to just attach to a database and run but it could be done. From first blush, this sounds like it would be really slow but may be not. I do this. It's not blindingly fast, but plenty fast enough for automated testing as long as your tests aren't too fine-grained and your test database isn't too big. It takes no more than two or three seconds on my (slow, IO-starved) QA VMs. By parameterizing the database name you can parallelize tests - each test creates it's own copy of the template database, runs whatever it needs to run, then drops the database. That lets you hide a lot of the setup/teardown latency. The other choice would be to somehow copy the data to temporary tables before the test run and then copy it back. The advantage to this is it is not very PostgreSQL specific. Indeed, if the template database is already set up, then only one copy would be needed at the start of the test. You'd also need to undo any other state that was changed. Sequences, for instance, if they can affect the meaning of your test or expected results in any way. The other thought I had is if there is some type of leaky transaction. A transaction where another process can see the data but the roll back would still work and be effective. Essentially I'm asking if all the protections a database offers could be dropped... but I thought I'd see if that was possible. That - or anything else involving rolling back transactions - would only work if you were testing an app that didn't use transactions. The other thought is perhaps there is a snap shot type concept. I don't see it in the list of SQL commands. A snap shot would do exactly what it sounds like. It would take a snap shot and save it somehow. Then a restore to snap shot would restore the DB back to that state. That's pretty much what creating a database from a template does, other than the need to have everybody disconnect from the database before doing the drop+create. Dump / restore will do that too - somewhat slower, but doesn't require disconnecting from the DB. File-system level snapshots are another option, but I'm pretty sure you'd need to shut down and restart the database server, which'd cost far more than you'd save. Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
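A sketch of that per-test cycle with the database name parameterized so runs can go in parallel (names are hypothetical; note that CREATE DATABASE ... TEMPLATE requires that no other session is connected to the template database):

-- Executed once per test (or per test file), e.g. from the test harness:
CREATE DATABASE test_run_42 TEMPLATE test_template;
-- ... point the application under test at test_run_42 and run the test ...
DROP DATABASE test_run_42;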
[GENERAL] Using psql to feed a file line by line to a table column
Hello, I have a list of 40 non-english words, each on a separate line and in UTF8 format, which I'd like to put in the word column of the following table (also in UTF8 and 8.4.13): create table good_words ( word varchar(64) primary key, verified boolean not null default false, stamp timestamp default current_timestamp ); Is there maybe a psql trick for that (the psql --help doesn't mention an input field separator option)? Or do I have to write a Perl-script for that task? Thank you Alex -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using psql to feed a file line by line to a table column
2013/3/13 Alexander Farber alexander.far...@gmail.com: Hello, I have a list of 40 non-english words, each on a separate line and in UTF8 format, which I'd like to put in the word column of the following table (also in UTF8 and 8.4.13): create table good_words ( word varchar(64) primary key, verified boolean not null default false, stamp timestamp default current_timestamp ); Is there maybe a psql trick for that (the psql --help doesn't mention an input field separator option)? Or do I have to write a Perl-script for that task? This should work from psql: \copy good_words(word) from '/path/to/file.txt' HTH Ian Barwick -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using psql to feed a file line by line to a table column
Unfortunately it doesn't work - On Tue, Mar 12, 2013 at 5:53 PM, Ian Lawrence Barwick barw...@gmail.com wrote: 2013/3/13 Alexander Farber alexander.far...@gmail.com: I have a list of 40 non-english words, each on a separate line and in UTF8 format, which I'd like to put in the word column of the following table (also in UTF8 and 8.4.13): create table good_words ( word varchar(64) primary key, verified boolean not null default false, stamp timestamp default current_timestamp ); This should work from psql: \copy good_words(word) from '/path/to/file.txt' I try:

bukvy=> \copy bukvy_good_words(word) from WORDS ;
\copy: parse error at ";"
bukvy=> \copy bukvy_good_words(word) from 'WORDS' ;
\copy: parse error at ";"
bukvy=> \copy bukvy_good_words(word) from WORDS ;
\copy: parse error at ";"
bukvy=> \copy bukvy_good_words(word) from '/home/afarber/WORDS' ;
\copy: parse error at ";"
bukvy=> \copy bukvy_good_words(word) from /home/afarber/WORDS ;
\copy: parse error at ";"

(sorry, I lied about the table name :-) The file is in the home dir and readable:

# ls -al WORDS
-rw-rw-r-- 1 afarber afarber 8263539 Mar 12 2013 WORDS

Any ideas what is wrong there for 8.4.13? Thank you Alex -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using psql to feed a file line by line to a table column
2013/3/13 Alexander Farber alexander.far...@gmail.com: Unfortunately doesn't work - On Tue, Mar 12, 2013 at 5:53 PM, Ian Lawrence Barwick barw...@gmail.com wrote: 2013/3/13 Alexander Farber alexander.far...@gmail.com: I have a list of 40 non-english words, each on a separate line and in UTF8 format, which I'd like to put in the word column of the following table (also in UTF8 and 8.4.13): create table good_words ( word varchar(64) primary key, verified boolean not null default false, stamp timestamp default current_timestamp ); This should work from psql: \copy good_words(word) from '/path/to/file.txt' I try: bukvy= \copy bukvy_good_words(word) from WORDS ; \copy: parse error at ; bukvy= \copy bukvy_good_words(word) from 'WORDS' ; \copy: parse error at ; bukvy= \copy bukvy_good_words(word) from WORDS ; \copy: parse error at ; bukvy= \copy bukvy_good_words(word) from '/home/afarber/WORDS' ; \copy: parse error at ; bukvy= \copy bukvy_good_words(word) from /home/afarber/WORDS ; \copy: parse error at ; (sorry, lied you about the table name :-) The file is in the home dir and readable: # ls -al WORDS -rw-rw-r-- 1 afarber afarber 8263539 Mar 12 2013 WORDS Any ideas, what is wrong there for 8.4.13 ? Yup: \copy: parse error at ; which is psql telling you it doesn't like the semicolon. Try leaving it out... -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
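For the archive, the working invocation should therefore look something like the following (using the path given earlier in the thread). The trailing semicolon is what \copy chokes on here, because \copy is a psql meta-command that parses its own arguments rather than being sent to the server as SQL:

\copy bukvy_good_words(word) from '/home/afarber/WORDS'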
Re: [GENERAL] PostgreSQL 9.2.3 performance problem caused Exclusive locks
2013-03-08 13:27:16 +0200 Emre Hasegeli emre.haseg...@tart.com.tr: PostgreSQL writes several log lines like the following during the problem, which I never saw before 9.2.3:

LOG: process 4793 acquired ExclusiveLock on extension of relation 305605 of database 16396 after 2348.675 ms

I tried
* to downgrade to 9.2.2
* to disable autovacuum
* to disable synchronous commit
* to write less on the big tables
* to increase checkpoint segments
* to increase max connections
* to move pg_xlog to sepe

None of them helps to avoid the downtimes. I could not find anything related to it. Do you have any idea? Have you ever experienced something like this? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
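A small aside for anyone hitting the same log line: the numbers in it are the relation and database OIDs, so the object being extended can be identified with a query along these lines (run while connected to the database whose OID is 16396):

-- Map the OID from the log message to a table or index name.
SELECT 305605::regclass;
-- or, equivalently:
SELECT relname, relkind FROM pg_class WHERE oid = 305605;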
Re: [GENERAL] table spaces
OK, so by that token (the more drives the better), I should have RAID 5 (or whichever will work) with all 6 drives in it? I was thinking about splitting it up like this. I have 6 drives (and one spare). Combine them into 3 separate logical drives in mirrored configuration (for some hardware redundancy), and use one for the base system and some less frequently read tables, the second one for WAL, and the third one for whatever tables/indexes happen to need separate space (subject to the characterisation outcome). I was basically under the impression that separating WAL is a big plus. On top of that, having a separate partition to hold some other data would help too. But it sounds - from what you said - like having it all on a single logical drive will work, because the RAID card will spread the load amongst the drives. Am I understanding that correctly?
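If the characterisation does end up pointing at moving particular tables or indexes off the main volume, the mechanism for that is a tablespace. A minimal sketch with hypothetical mount points and object names (the directory must exist and be owned by the postgres OS user, and moving a relation rewrites it and takes an exclusive lock):

CREATE TABLESPACE fast_space LOCATION '/mnt/raid1_b/pgdata';

-- Move a hot table and its index onto the new volume.
ALTER TABLE orders SET TABLESPACE fast_space;
ALTER INDEX orders_pkey SET TABLESPACE fast_space;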
[GENERAL] Age of the WAL?
What's the best way to determine the age of the current WAL? Not the current segment, but the whole thing. Put another way: is there a way to determine a timestamp for the oldest available transaction in the WAL? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] table spaces
On 3/12/2013 2:31 PM, Gregg Jaskiewicz wrote: I was basically under impression that separating WAL is a big plus. On top of that, having separate partition to hold some other data - will do too. But it sounds - from what you said - like having all in single logical drive will work, because raid card will spread the load amongst number of drives. Am I understanding that correctly ? both those models have merits. doing a single raid 10 should fairly evenly distribute the IO workload given adequate concurrency, and suitable stripe size and alignment. there are scenarios where a hand tuned spindle layout can be more efficient, but there's also the possibility of getting write bound on any one of those 3 separate raid1's, and having other disks sitting idle. -- john r pierce 37N 122W somewhere on the middle of the left coast -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] table spaces
On 12 March 2013 21:59, John R Pierce pie...@hogranch.com wrote: On 3/12/2013 2:31 PM, Gregg Jaskiewicz wrote: I was basically under impression that separating WAL is a big plus. On top of that, having separate partition to hold some other data - will do too. But it sounds - from what you said - like having all in single logical drive will work, because raid card will spread the load amongst number of drives. Am I understanding that correctly ? both those models have merits. doing a single raid 10 should fairly evenly distribute the IO workload given adequate concurrency, and suitable stripe size and alignment. there are scenarios where a hand tuned spindle layout can be more efficient, but there's also the possibility of getting write bound on any one of those 3 seperate raid1's, and having other disks sitting idle. I'm trying to get an understanding of all options. So out of 6 disks then having 4 in Raid 1+0 configuration and other two in mirror for WAL. That's another option then for me to test.
Re: [GENERAL] Age of the WAL?
Erik Jones ejo...@engineyard.com writes: What's the best way to determine the age of the current WAL? Not the current segment, but the whole thing. Put another way: is there a way to determine a timestamp for the oldest available transaction in the WAL? Transaction commit and abort records carry timestamps, so you could figure this out with something like pg_xlogdump. I don't know of any canned solution though. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
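A rough illustration of that approach (pg_xlogdump ships with PostgreSQL 9.3 and later; the segment name and paths below are made up). Since commit and abort records print their timestamps, dumping only transaction records from the oldest segment still present in pg_xlog gives an approximate age of the WAL:

# List the WAL segments, oldest first, then dump only transaction-manager
# records from the oldest one and read the first COMMIT/ABORT timestamp shown.
ls -1 $PGDATA/pg_xlog | sort | head -1
pg_xlogdump --rmgr=Transaction $PGDATA/pg_xlog/000000010000014100000002 | head   # hypothetical segment name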
Re: [GENERAL] Testing Technique when using a DB
On Mar 12, 2013, at 8:42 AM, Perry Smith pedz...@gmail.com wrote: I tried posting this from Google Groups but I did not see it come through after an hour so this may be a duplicate message for some. The current testing technique for things like Ruby On Rails has three choices but all of the choices will not work in my case. The first choice is truncate which starts a transaction before each test. I think you mean transactional tests here, not truncate. While the test database is truncated once at the start of all the tests, each test runs inside its own transaction which will never be committed. If the testing is within the same process, this works and a roll back restores the DB. But if the testing involves two processes, then the test data entered by the test rig can not be seen by the system under test (SUT). With Rails, there are times when this is needed. The test rig drives a browser which calls into a Rails application. There are dangerous ways to still use this method but each has various down falls or risks. IMO, you don't want to use transactional tests. When you do, ActiveRecord will use savepoints to mimic transactions. This means now() will never change during the tests, deferred constraints/triggers won't work, other processes can't see the data, etc. The system is behaving differently in the test environment than in the real one, which is bad. If you are treating the database as a really dumb store of data, then you may want to use transactional tests. But be aware of the caveats. The other two choices are delete and truncate which both end up with an empty database just after each test. This prevents any test data that is already in the database from being used after the first test. Note that a test run will run through a sequence of tests (usually quite a few). All of these are fairly fast with each one being faster under different conditions (according to users). Generally, this pushes the Rails community to have either fixtures or factories. Both are ok solutions but both also have problems. In my case, I have a dozen or so tables all with very tight constraints and creating either fixtures or factories is very troublesome. Also, I have a real database with real data in production and it seems foolish not to take advantage of the knowledge contained within that database. By knowledge I mean the particular values and weirdness within the data that a factory or a fixture might not realize. One choice would be to create the database, use it, and then drop it for each test. I would create the database from a template that already has data taken from the production database (and probably trimmed down to a small subset of it). This requires some crafty dancing in the Rails set up since it likes to just attach to a database and run but it could be done. From first blush, this sounds like it would be really slow but may be not. The other choice would be to somehow copy the data to temporary tables before the test run and then copy it back. The advantage to this is it is not very PostgreSQL specific. Indeed, if the template database is already set up, then only one copy would be needed at the start of the test. The other thought I had is if there is some type of leaky transaction. A transaction where another process can see the data but the roll back would still work and be effective. Essentially I'm asking if all the protections a database offers could be dropped... but I thought I'd see if that was possible. The other thought is perhaps there is a snap shot type concept. 
I don't see it in the list of SQL commands. A snap shot would do exactly what it sounds like. It would take a snap shot and save it somehow. Then a restore to snap shot would restore the DB back to that state. This would be super super super awesome, but it doesn't exist as far as I know. This would be a permanent snapshot that could be easily and quickly restored. I wonder if it would be possible to make an extension that made this easy to do. I thought this group might suggest other ideas and either nuke the really bad ideas or promote the plausible ideas I've mentioned above. Make sure to look at database_cleaner if you haven't yet. Also this may be interesting: https://github.com/bmabey/database_cleaner/issues/80 Personally, I use database_cleaner's delete method, plus I load a SQL file at the beginning of each test. Sorry for the long post. I appreciate your thoughts. Perry -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Testing Technique when using a DB
To all who replied: Thank you. I did typo. I meant transaction instead of truncate. I had not seriously considered pg_dump / pg_restore because I assumed it would be fairly slow but I will experiment with pg_restore and template techniques this weekend and see which ones prove viable. I know about and use database cleaner and intend on offering it to them. I posted a comment on the bug report provided. Loading a separate SQL file seems like you are 99% of the way there. How is that as far as performance? pg_restore should be faster. The other advantage to the pg_restore or loading an SQL file is you can have various files for different tests. On Mar 12, 2013, at 10:09 PM, Joe Van Dyk wrote: On Mar 12, 2013, at 8:42 AM, Perry Smith pedz...@gmail.com wrote: I tried posting this from Google Groups but I did not see it come through after an hour so this may be a duplicate message for some. The current testing technique for things like Ruby On Rails has three choices but all of the choices will not work in my case. The first choice is truncate which starts a transaction before each test. I think you mean transactional tests here, not truncate. While the test database is truncated once at the start of all the tests, each test runs inside its own transaction which will never be committed. If the testing is within the same process, this works and a roll back restores the DB. But if the testing involves two processes, then the test data entered by the test rig can not be seen by the system under test (SUT). With Rails, there are times when this is needed. The test rig drives a browser which calls into a Rails application. There are dangerous ways to still use this method but each has various down falls or risks. IMO, you don't want to use transactional tests. When you do, ActiveRecord will use savepoints to mimic transactions. This means now() will never change during the tests, deferred constraints/triggers won't work, other processes can't see the data, etc. The system is behaving differently in the test environment than in the real one, which is bad. If you are treating the database as a really dumb store of data, then you may want to use transactional tests. But be aware of the caveats. The other two choices are delete and truncate which both end up with an empty database just after each test. This prevents any test data that is already in the database from being used after the first test. Note that a test run will run through a sequence of tests (usually quite a few). All of these are fairly fast with each one being faster under different conditions (according to users). Generally, this pushes the Rails community to have either fixtures or factories. Both are ok solutions but both also have problems. In my case, I have a dozen or so tables all with very tight constraints and creating either fixtures or factories is very troublesome. Also, I have a real database with real data in production and it seems foolish not to take advantage of the knowledge contained within that database. By knowledge I mean the particular values and weirdness within the data that a factory or a fixture might not realize. One choice would be to create the database, use it, and then drop it for each test. I would create the database from a template that already has data taken from the production database (and probably trimmed down to a small subset of it). This requires some crafty dancing in the Rails set up since it likes to just attach to a database and run but it could be done. 
From first blush, this sounds like it would be really slow but may be not. The other choice would be to somehow copy the data to temporary tables before the test run and then copy it back. The advantage to this is it is not very PostgreSQL specific. Indeed, if the template database is already set up, then only one copy would be needed at the start of the test. The other thought I had is if there is some type of leaky transaction. A transaction where another process can see the data but the roll back would still work and be effective. Essentially I'm asking if all the protections a database offers could be dropped... but I thought I'd see if that was possible. The other thought is perhaps there is a snap shot type concept. I don't see it in the list of SQL commands. A snap shot would do exactly what it sounds like. It would take a snap shot and save it somehow. Then a restore to snap shot would restore the DB back to that state. This would be super super super awesome, but it doesn't exist as far as I know. This would be a permanent snapshot that could be easily and quickly restored. I wonder if it would be possible to make an extension that made this easy to do. I thought this group might suggest other ideas and either nuke the really bad ideas or promote the