[GENERAL] Using a trigger to change status in one table from a modification in another table
Hello guys, I'm trying to use a trigger on the two tables below in order to do the following: when the customer cancels the booking of an accommodation (Status = 'Cancelled' in the book_rooms table), change the state of the accommodation in the Accommodations table.

CREATE TABLE book_rooms (
    Idbookroom serial NOT NULL,
    IdHost int NOT NULL,
    IdAccommodation int NOT NULL,
    booking_date date NOT NULL,
    InitialDate date NOT NULL,
    EndDate date NOT NULL,
    Status varchar(10) NOT NULL CHECK (Status IN ('active', 'Cancelled')),
    PRIMARY KEY (Idbookroom),
    FOREIGN KEY (IdHost) REFERENCES Hosts(IdHost),
    FOREIGN KEY (IdAccommodation) REFERENCES Accommodations(IdAccommodation),
    UNIQUE (IdHost, IdAccommodation, booking_date)
);

CREATE TABLE Accommodations (
    IdAccommodation serial NOT NULL,
    name varchar(20) NOT NULL,
    high_season_price numeric(5,2) NOT NULL,
    low_season_price numeric(5,2) NOT NULL,
    Status varchar(12) NOT NULL CHECK (Status IN ('occupied', 'available', 'maintenance')),
    PRIMARY KEY (IdAccommodation),
    UNIQUE (name)
);

Can someone help? It will be appreciated.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
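A minimal sketch of such a trigger, assuming the tables in the posted schema, and assuming (my guess, adjust as needed) that a cancelled booking should mark the accommodation 'available' again:

```sql
-- Trigger function: when a booking's status changes to 'Cancelled',
-- update the referenced accommodation's status.
CREATE OR REPLACE FUNCTION release_accommodation() RETURNS trigger AS $$
BEGIN
    IF NEW.Status = 'Cancelled' AND OLD.Status <> 'Cancelled' THEN
        UPDATE Accommodations
           SET Status = 'available'
         WHERE IdAccommodation = NEW.IdAccommodation;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER book_rooms_cancel
AFTER UPDATE OF Status ON book_rooms
FOR EACH ROW
EXECUTE PROCEDURE release_accommodation();
```

If you also want a freshly inserted booking to mark the accommodation occupied, a second AFTER INSERT trigger on book_rooms can do the corresponding UPDATE.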
Re: [GENERAL] Problem with the 9.1 one-click installer Windows7 64bit
Craig Ringer, 14.09.2011 06:20:
I forwarded your message to Dave Page in case the EDB folks needed to look into this. He commented that:

Aside from the fact that icacls is hanging for reasons unknown, it appears to be doing what it is designed to do - it traverses up the path from the data directory to the root directory, granting read/execute permissions on each dir as it goes. It's a non-recursive grant (the default - /T makes it recurse), and is done because a great deal of the failures we used to see were caused by the user not having read or execute permissions on one of the parents of the data directory (or installation directory).

... which to me explains why:

Then when I kill that process, the installer starts a new one with icacls C:\Daten /grant tkellerer:RX

As Dave noted, it's a non-recursive grant. It isn't changing the permissions for C:\ and everything under it; it's only changing the permissions for C:\ itself. It's then stepping down the path of parents to the datadir and doing the same thing, to make sure you have the required rights on every folder all the way down. See http://technet.microsoft.com/en-us/library/cc753525(WS.10).aspx and note the lack of the /t flag when icacls is called. Given that, it's not clear why it's taking so long. A lock of some kind, maybe? Anyway, it's *not* recursively changing the permissions on your HDD.

Thanks for the feedback. Those steps do make sense - and I appreciate the installer doing that, as there were a lot of problems in various forums regarding permissions ;) But I have no idea why it's hanging (or taking that long). I ran the installer on a Windows XP box (32-bit) that is not part of a domain, and there everything worked just fine. I currently don't have time to re-test this and wait until it finishes (or check what icacls.exe is waiting for), but maybe next week I can try to find the real reason for it.
Regards,
Thomas
Re: [GENERAL] Query performance help with 'shadow table' approach.
On 13 Sep 2011, at 23:44, Brian Fehrle wrote:

These queries basically do a 'select max(primary_key_column) from table group by column1, column2'. Because of the group by, we end up with a sequential scan of the entire table, which proves to be costly.

That seems to suggest that a row where the primary key has the max value is special in some way. Making such rows more easily distinguishable from normal rows seems like a good idea here.

Since the table has a ton of columns, I set up a smaller table that will house a copy of some of the data that the query uses: the primary key column and the two columns I do my 'group by' on.

That's one way to distinguish these special rows from the rest. You could also mark them as special using an extra column and/or create an expression-based index over just those rows. However, especially with the below section in mind, it would appear your data could be normalised a bit more (your splitting off that shadow table is in fact a step in doing so). I'm also wondering: does your primary key have actual meaning? It would appear to just indicate the order in which the records were created (I'm assuming it's a serial-type surrogate PK, and not a natural one).

This shadow table will also only contain one row for every column1 and column2 combination (due to the group by), and for those rows it will have the max of the primary key. Even with this, the 'shadow' table will have about 14 million rows, compared to the 15 million in the main table.

Don't (column1, column2) make up a key then? I get the feeling you should split your table into 3 sections:

Table 1: main lookup (PK: pkey_sid)
Table 2: variation lookup (PK: (column1, column2), FK: pkey_sid)
Table 3: data (FK: the above)

So the issue here comes in retrieving the needed data from my main table. The resulting row count is estimated at 409,600, and retrieving the primary keys associated with those rows is actually really easy. However, when we take those 409,600 rows back to the main table to retrieve the other columns I need, the planner just does a sequential scan, as it's most likely going to be faster than hitting the index and then retrieving the columns I need for all 400K+ rows.

Is that estimate accurate? If not, see Ondrej's suggestions. That is only about 1/30th of your table; I don't think a seqscan makes sense here unless your data is distributed badly.

Things to note: 1. If I reduce my where clause's range, then the sequential scan turns into an index scan, but sadly this can't always be done.

Does it make sense to CLUSTER your data in some sense? That would improve the data distribution issue and would probably push the threshold for a seqscan up some.

Cheers,

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
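One common idiom for the max-per-group query itself (a sketch with hypothetical table and column names; whether the planner actually uses the index depends on your data and version) is a composite index plus DISTINCT ON:

```sql
-- Composite index ordered so each (column1, column2) group's
-- highest pkey comes first.
CREATE INDEX main_grp_idx ON main_table (column1, column2, pkey_sid DESC);

-- One row per (column1, column2) combination, carrying the max pkey,
-- without a separate shadow table.
SELECT DISTINCT ON (column1, column2)
       column1, column2, pkey_sid
  FROM main_table
 ORDER BY column1, column2, pkey_sid DESC;
```

This avoids maintaining a second copy of 14 million rows, at the cost of the extra index.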
Re: [GENERAL] PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
On 14/09/11 12:56, Andy Colson wrote:
On 09/13/2011 08:15 PM, Toby Corkindale wrote:
Hi, Some months ago, I ran some (probably naive) benchmarks looking at how pgbench performed on an identical system with differing filesystems. (on Linux). [snip]

Did you test unplugging the power cable in the middle of a test to see which would come back up?

Heh, no, but it'd be interesting to see... I wonder if turning a virtual machine off has the same effect?
[GENERAL] Stability of 9.1.0
Hi. I installed the PostgreSQL 9.1.0 Win32 version from EnterpriseDB (WinXP + SP3). About every 0.5 - 6 hours the server stops. What's going on? Below is an example log file:

2011-09-14 08:49:37 CEST FATAL: the database system is starting up
2011-09-14 08:49:38 CEST LOG: checkpoint complete: wrote 2179 buffers (26.6%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=1.766 s, sync=22.640 s, total=24.468 s; sync files=67, longest=20.957 s, average=0.337 s
2011-09-14 08:49:38 CEST LOG: database system is ready to accept connections
2011-09-14 08:49:38 CEST LOG: autovacuum launcher started
2011-09-14 08:49:51 CEST FATAL: no pg_hba.conf entry for host 126.100.42.8, user Admin, database www, SSL off
2011-09-14 08:49:52 CEST ERROR: relation msysconf does not exist at character 63
2011-09-14 08:49:52 CEST STATEMENT: declare SQL_CUR08F966C8 cursor for SELECT Config, nValue FROM MSysConf
2011-09-14 08:54:14 CEST LOG: duration: 171981.000 ms statement: VACUUM VERBOSE
2011-09-14 08:56:52 CEST LOG: duration: 5760.674 ms plan:
	Query Text: SELECT to_char(DataPliku,'DD-MM'), count(*)::text, count(case when Pierwszy zwrot is null then 1 end)::text, count(case when Pierwszy zwrot is null and Darmowe is null then 1 end)::text, rpad(min(Data),8) AS Biling od, rpad(max(Data),8) AS Biling do FROM ONLY Bladpol2 WHERE (DataPliku=current_date-21) GROUP BY DataPliku ORDER BY DataPliku
	Sort (cost=25428.40..25428.45 rows=22 width=31) (actual time=5760.584..5760.616 rows=16 loops=1)
	  Sort Key: DataPliku
	  Sort Method: quicksort Memory: 18kB
	  ->  HashAggregate (cost=25427.14..25427.91 rows=22 width=31) (actual time=5760.339..5760.484 rows=16 loops=1)
	        ->  Bitmap Heap Scan on Bladpol2 (cost=1971.04..23850.38 rows=105117 width=31) (actual time=324.139..5108.444 rows=106392 loops=1)
	              Recheck Cond: (DataPliku = (('now'::text)::date - 21))
	              ->  Bitmap Index Scan on Bladpol2_DataPliku (cost=0.00..1944.76 rows=105117 width=0) (actual time=323.527..323.527 rows=106392 loops=1)
	                    Index Cond: (DataPliku = (('now'::text)::date - 21))
2011-09-14 08:56:52 CEST CONTEXT: SQL function c12 statement 1
2011-09-14 08:56:52 CEST LOG: duration: 6031.000 ms statement: BEGIN;declare SQL_CUR018F9058 cursor with hold for SELECT * FROM www.c12();fetch 30 in SQL_CUR018F9058
2011-09-14 08:56:52 CEST LOG: duration: 6038.352 ms plan:
	Query Text: BEGIN;declare SQL_CUR018F9058 cursor with hold for SELECT * FROM www.c12();fetch 30 in SQL_CUR018F9058
	Function Scan on c12 (cost=25.00..35.00 rows=1000 width=192) (actual time=6038.269..6038.303 rows=16 loops=1)
2011-09-14 08:56:59 CEST LOG: could not receive data from client: No connection could be made because the target machine actively refused it.
2011-09-14 08:56:59 CEST LOG: unexpected EOF on client connection
2011-09-14 09:02:07 CEST LOG: server process (PID 3160) was terminated by exception 0xC005
2011-09-14 09:02:07 CEST HINT: See C include file ntstatus.h for a description of the hexadecimal value.
2011-09-14 09:02:07 CEST LOG: terminating any other active server processes
2011-09-14 09:02:07 CEST WARNING: terminating connection because of crash of another server process
2011-09-14 09:02:07 CEST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-09-14 09:02:07 CEST HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-09-14 09:02:07 CEST WARNING: terminating connection because of crash of another server process
2011-09-14 09:02:07 CEST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-09-14 09:02:07 CEST HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-09-14 09:02:07 CEST WARNING: terminating connection because of crash of another server process
2011-09-14 09:02:07 CEST DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2011-09-14 09:02:07 CEST HINT: In a moment you should be able to reconnect to the database and repeat your command.
2011-09-14 09:02:07 CEST LOG: all server processes terminated; reinitializing
2011-09-14 09:02:17 CEST FATAL: pre-existing shared memory block is still in use
2011-09-14 09:02:17 CEST HINT: Check if there are any old server processes still running, and terminate them.

Here are the non-commented settings:
listen_addresses = '*'		# what IP address(es) to listen on;
port
Re: [GENERAL] Stability of 9.1.0
I think that using any function in the plperlu language stops the server. The Perl version is 5.14 from ActiveState.

2011/9/14, pasman pasmański pasma...@gmail.com:
Hi. I install postgresql 9.1.0 version Win32 from enterprisedb (WinXp+sp3). Every about 0,5 - 6 hours server stops. Whats going on ?? Below a example log file: [snip]
Re: [GENERAL] pg_dump with select command
Any update on the below issue? Thanks.

Adarsh Sharma wrote:
Dear all, today I need to move some part (a subset) of some tables to another database on a remote server. I need to take a backup of the tables satisfying a select query. Is there any option to specify a query in the pg_dump command? I searched the manual but was not able to find one. Please let me know if it is possible, as we can specify one in the mysqldump command. Thanks
Re: [GENERAL] pg_dump with select command
On 14/09/2011 10:31, Adarsh Sharma wrote:
Any update on below issue. Thanks
Adarsh Sharma wrote:
Dear all, Today I need some part ( subset ) of some tables to another database to a remote server. I need to take backup of tables after satisfying a select query. Is there any option to specify query in pg_dump command. I researched in the manual but not able to find that.

No, there isn't. Instead, you could use \copy from within psql (NB: very different from the SQL command COPY).

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie
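For example (hypothetical table, column, and file names), from within psql connected to the source database:

```sql
-- \copy runs the query on the server but writes the file on the client,
-- so this works against a remote server too.
\copy (SELECT * FROM mytable WHERE created_at >= '2011-01-01') TO 'subset.csv' CSV HEADER

-- Then, connected to the target database, load the subset into a
-- matching table:
\copy mytable FROM 'subset.csv' CSV HEADER
```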
Re: [GENERAL] pg_dump with select command
On Wed, Sep 14, 2011 at 5:31 AM, Adarsh Sharma adarsh.sha...@orkash.com wrote:
Any update on below issue.

Someone already responded to you with the answer; don't top-post.

[snip]
[GENERAL] Bit datatype performance?
Hi all, One of my entities 'E' may be 'tagged' with an arbitrary set of 256 tags 'T'. A first approach could be to add a M:N relationship between 'E' and 'T'. A second way to do this could be to add a BIT(256) datatype to 'E', setting bits to '1' if the entity is tagged with each one of the 256 tags (i.e. using a 'bitmask' on the set of tags). Since querying entities 'E' with a certain set of tags 'T' must be very fast I was wondering if the second approach would be faster. What do you think? Thanks for any hints, Antonio
Re: [GENERAL] Bit datatype performance?
On Sep 14, 2011, at 6:00, Antonio Vieiro anto...@antonioshome.net wrote:
[snip]

Dealing with 256 arbitrary ones and zeros instead of meaningful named tags seems to be asking for mega-confusion. If performance is really that important, do both and run some performance tests. If the tag set ever changes, a schema change will be needed for the bit version but not for the two-table version.

David J.
Re: [GENERAL] Bit datatype performance?
On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote:
[snip]

I assume each entity may have one or more different tags. Actually, performing a test like

... where (t.bits & :mymask) = :mymask

should be quite fast, and faster than creating additional relations, but only if it's highly probable that your query will almost always scan the whole table. The advantage of indexes is that the index is used first, and the later (slower) parts of the query always get a subset of the table; with a bitset, you will probably scan the whole table. So I think you should do some performance tests with a large amount of data and compare both ways. I think the bitset will be fast for really small data, but M:N relations may be faster for really large data sets. You need to measure the size of your database too; in the M:N case with 256 tags it may be quite large.
Re: [GENERAL] Bit datatype performance?
Another option is to use an array of int2 instead of bit(256); it can be indexed.

2011/9/14, Radosław Smogura rsmog...@softperience.eu:
[snip]

--
pasman
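A sketch of that array approach (hypothetical table and tag numbers; the built-in GIN operator class for arrays supports the containment operator @>):

```sql
-- Store tag ids in an integer array and index it with GIN.
CREATE TABLE entities (
    id   serial PRIMARY KEY,
    tags int2[] NOT NULL DEFAULT '{}'
);
CREATE INDEX entities_tags_gin ON entities USING gin (tags);

-- Find entities tagged with both tag 3 and tag 17; the @> containment
-- test can use the GIN index instead of scanning the whole table.
-- (The cast is needed so both sides have the same element type.)
SELECT id FROM entities WHERE tags @> ARRAY[3, 17]::int2[];
```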
Re: [GENERAL] PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
On 14/09/2011 09:30, Toby Corkindale wrote:
On 14/09/11 12:56, Andy Colson wrote:
On 09/13/2011 08:15 PM, Toby Corkindale wrote:
Hi, Some months ago, I ran some (probably naive) benchmarks looking at how pgbench performed on an identical system with differing filesystems. (on Linux). [snip]
Did you test unplugging the power cable in the middle of a test to see which would come back up?
Heh, no, but it'd be interesting to see... I wonder if turning a virtual machine off has the same effect?

No, never the exact same effect. There are two things to consider: if/when/how the OS flushes the data to the hardware, and if/when/how the hardware flushes the data to physical storage. You can simulate only the failure of the first part with a virtual machine, but not the second (unless you bring down the VM host...).
Re: [GENERAL] Bit datatype performance?
Hi again, thanks for the tip. In fact I was thinking of creating an index on the bitmask, so I could use

... where t.bits = :mymask

directly, avoiding a full table scan. I assume this is possible (indexing bit and comparing bits), isn't it?

Thanks,
Antonio

On 14/09/11 15:58, Radosław Smogura wrote:
[snip]
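A quick sketch of the distinction (hypothetical table entities(bits bit(256)), with :mymask standing in for a bit(256) parameter): a plain b-tree index can serve exact equality on the bit column, but not a subset test through the & operator, since the index is ordered by the whole value.

```sql
CREATE INDEX entities_bits_idx ON entities (bits);

-- Exact-mask equality: can use the b-tree index.
SELECT * FROM entities WHERE bits = :mymask;

-- "Has at least these tags" subset test: cannot use that index,
-- so this still scans the whole table.
SELECT * FROM entities WHERE (bits & :mymask) = :mymask;
```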
[GENERAL] Identifying Reason for Column Name Returned by SELECT
I run this SELECT statement on a table:

select distinct(site_id) from chemistry order by site_id;

and in the returned set I see:

 GW-21
 GW-22
 GW-22 +
 GW-24

I want to find the row returning 'GW-22 +' because I believe it should be 'GW-23'. However, my attempts to retrieve that row keep failing. I've tried these statements:

select count(*) from chemistry where site_id = 'GW-22 +';
 count
-------
     0
(1 row)

yet,

select count(*) from chemistry where site_id = 'GW-22';
 count
-------
   803
(1 row)

Looking for the blank row also fails when I try to specify site_id as is null, = ' ', or = ''. Please point me to the proper way of finding this rogue row so I can correct the value in the site_id column.

TIA,

Rich
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
I suspect you have a multi-line entry and the '+' is just indicating that the field continues. Try ... where site_id ~ 'GW-22' ... (this may take a while if the table is very large).

Cheers,
Steve

On 09/14/2011 09:35 AM, Rich Shepard wrote:
[snip]
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
Your example suggests that 'GW-22' is a substring of the field, followed by trailing spaces, so you'll want something that searches substrings, whereas = will always just test on matching the entire field. Try LIKE by default, such as: where site_id like 'GW-22 %'. I added the space between the 22 and the wildcard % so that the field containing just 'GW-22' isn't also matched. If you need something more specific than a simple substring match, you can use a regular expression, which I think is spelled like: where site_id ~ '...', but check the manual to be sure about regexp syntax. LIKE will probably do for you here.

--
Darren Duncan

Rich Shepard wrote:
[snip]
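If the stray row does contain an embedded newline (which psql renders with a trailing +), a sketch of how to locate and repair it (assuming 'GW-23' really is the intended value):

```sql
-- Show every site_id that starts with 'GW-22' but isn't exactly 'GW-22';
-- the length() makes trailing spaces or newlines visible.
SELECT site_id, length(site_id)
  FROM chemistry
 WHERE site_id LIKE 'GW-22%' AND site_id <> 'GW-22';

-- Specifically look for a newline embedded in the value:
SELECT count(*) FROM chemistry WHERE site_id ~ E'\\n';

-- And correct it (E'...\n...' is an escape string containing a real
-- newline character):
UPDATE chemistry SET site_id = 'GW-23' WHERE site_id LIKE E'GW-22\n%';
```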
Re: [GENERAL] PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
Can you go into some more detail on how you set up ZFS on these systems? Robert Treat conjecture: xzilla.net consulting: omniti.com On Tue, Sep 13, 2011 at 10:56 PM, Andy Colson a...@squeakycode.net wrote: On 09/13/2011 08:15 PM, Toby Corkindale wrote: Hi, Some months ago, I ran some (probably naive) benchmarks looking at how pgbench performed on an identical system with differing filesystems. (on Linux). Since then the kernel-level version of ZFS became usable, and there have been improvements to btrfs, and no doubt various updates in the Linux kernel and PostgreSQL that should help performance. I ran the tests on Ubuntu 11.04 with Pg 9.0 first, then upgraded the system to Ubuntu 11.10 (beta) with Pg 9.1 and ran them again. The latter combination showed a considerable performance improvement overall - although I didn't investigate to find out whether this was due to kernel improvements, postgres improvements, or virtio improvements. The results are measured in transactions-per-second, with higher numbers being better. Results: ext4 (data=writeback,relatime): natty: 248 oneiric: 297 ext4 (data=writeback,relatime,nobarrier): natty: didn't test oneiric: 1409 XFS (relatime): natty: didn't test oneiric: 171 btrfs (relatime): natty: 61.5 oneiric: 91 btrfs (relatime,nodatacow): natty: didn't test oneiric: 128 ZFS (defaults): natty: 171 oneiric: 996 Conclusion: Last time I ran these tests, xfs and ext4 pulled very similar results, and both were miles ahead of btrfs. This time around, ext4 has managed to get a significantly faster result than xfs. However we have a new contender - ZFS performed *extremely* well on the latest Ubuntu setup - achieving triple the performance of regular ext4! I'm not sure how it achieved this, and whether we're losing some kind of data protection (eg. like the barrier options in XFS and ext4). If ext4 has barriers disabled, it surpasses even ZFSs high score. Oddly, ZFS performed wildly differently on ubuntu 11.04 vs 11.10b. 
I can't explain this. Any ideas? Cheers, Toby Did you test unplugging the power cable in the middle of a test to see which would come back up? -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, Sep 14, 2011 at 9:42 AM, Steve Crawford scrawf...@pinpointresearch.com wrote: I suspect you have a multi-line entry and the '+' is just indicating that the field continues. Try ...where site_id ~ 'GW-22'... (this may take a while if the table is very large). You might be able to get an index scan if you include a few more criteria: AND site_id BETWEEN 'GW-22' AND 'GW-23' Also, WHERE site_id ~ '^GW-22' may use an index scan as well. -- Regards, Richard Broersma Jr. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
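Spelled out, the index-friendly form suggested above might look like this (a sketch): the BETWEEN range can be satisfied by a btree index on site_id, and the regex then filters within that range:

```sql
SELECT count(*)
FROM   chemistry
WHERE  site_id BETWEEN 'GW-22' AND 'GW-23'
  AND  site_id ~ '^GW-22';
```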
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
Darren Duncan wrote: Try like by default, such as where site_id like 'GW-22 %'. I added the space between the 22 and the wildcard % so that the field containing just 'GW-22' isn't also matched. Sorry, I should have said where site_id like 'GW-22%' and site_id != 'GW-22' (no explicit space) as a better way to exclude 'GW-22' from the results. -- Darren Duncan -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, 14 Sep 2011, Steve Crawford wrote: I suspect you have a multi-line entry and the '+' is just indicating that the field continues. Steve, et al.: It's not multi-line, but malformed. Try ...where site_id ~ 'GW-22'... (this may take a while if the table is very large). This found the appropriate rows. Now, my question is DDL-related: What is the appropriate syntax to change 'GW-22 +' to GW-22? Can I use 'like' or '~' in an ALTER TABLE RENAME column ... statement? Thanks guys, Rich -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, Sep 14, 2011 at 10:04 AM, Rich Shepard rshep...@appl-ecosys.com wrote: This found the appropriate rows. Now, my question is DDL-related: What is the appropriate syntax to change 'GW-22 +' to GW-22? Can I use 'like' or '~' in an ALTER TABLE RENAME column ... statement? I'm confused. Do you want to UPDATE the affected records to 'GW-22'? Or do you want to ALTER the table to add a column constraint to prevent malformed site_ids in the future? -- Regards, Richard Broersma Jr. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, 14 Sep 2011, Richard Broersma wrote: I'm confused. Richard, Apparently, I am also confused. Doing too many things simultaneously. Do you want to UPDATE the affected records to 'GW-22'? Or do you want to ALTER the table to add a column constraint to prevent malformed site_ids in the future? Update. That makes my question moot. Thanks, Rich -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
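For the archives, the resulting cleanup might look like this (a sketch; the regex and replacement value are assumptions based on the thread, so preview the matches before updating):

```sql
-- Preview which site_id values would be touched:
SELECT site_id, count(*)
FROM   chemistry
WHERE  site_id ~ '^GW-22' AND site_id <> 'GW-22'
GROUP  BY site_id;

-- Normalize them to the intended value:
UPDATE chemistry
SET    site_id = 'GW-22'
WHERE  site_id ~ '^GW-22' AND site_id <> 'GW-22';
```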
Re: [GENERAL] Bit datatype performance?
Hi, I think it's not a bad approach if performance is important. I don't know how a b-tree index will work with the bit datatype, but I assume it should treat it as a 256-bit number (maybe someone more competent in the internals will answer this). Please bear in mind that this approach will work well only on the exact query you have written. Since you ask about performance, one more thing you may want to test and think about: PG by default uses text transfer mode, so if you transfer your data from/to the server, those values will be transferred as 256-character strings of 0s and 1s. You may want to think about storing the tags as e.g. 4 long (64-bit) fields, or 2 type-4 UUIDs (128 bits each), and using a composite index. If you are able to use binary transfer, and on your client side the bitset maps to some reasonable type, then you have won; otherwise you should get a nice boost by storing those values in the types I described. Of course these are only concepts; personally I have never built such a thing. Regards, Radek On Wed, 14 Sep 2011 17:58:58 +0200, Antonio Vieiro wrote: Hi again, Thanks for the tip. In fact I was thinking of creating an index on the bitmask, so I could use: ... where t.bits = :mymask directly, avoiding a full table scan. I assume this is possible (indexing bit and comparing bits), isn't it? Thanks, Antonio On 14/09/11 15:58, Radosław Smogura wrote: On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote: Hi all, One of my entities 'E' may be 'tagged' with an arbitrary set of 256 tags 'T'. A first approach could be to add an M:N relationship between 'E' and 'T'. A second way to do this could be to add a BIT(256) datatype to 'E', setting bits to '1' if the entity is tagged with each one of the 256 tags (i.e. using a 'bitmask' on the set of tags). Since querying entities 'E' with a certain set of tags 'T' must be very fast I was wondering if the second approach would be faster. What do you think?
Thanks for any hints, Antonio I assume each entity may have one or more different tags. Actually, performing a test like ... where (t.bits & :mymask) = :mymask should be quite fast, and faster than creating additional relations, but only if it's highly probable that your query will almost always scan the whole table. The advantage of indexes is that the index is used first, and the later (slower) parts of the query always operate on a subset of the table; with a bitset, you will probably scan the whole table. So I think you should do some performance tests with a large amount of data and compare both ways. I think the bitset will be fast for really small data sets, but M:N relations may be faster for really large ones. You need to measure the size of your database too; in the M:N case with 256 tags it may be quite large. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
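For reference, the two designs being compared in this thread might be sketched like this (table and column names are hypothetical):

```sql
-- Option 1: M:N relationship between entities and tags
CREATE TABLE entity_tag (
    entity_id integer  NOT NULL,
    tag_id    smallint NOT NULL,   -- 0..255
    PRIMARY KEY (entity_id, tag_id)
);

-- Option 2: a single bitmask column per entity
CREATE TABLE entity (
    id   serial PRIMARY KEY,
    bits bit(256) NOT NULL
);

-- "entity has every tag set in :mymask":
-- SELECT id FROM entity WHERE (bits & :mymask) = :mymask;
```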
Re: [GENERAL] Bit datatype performance?
Hi, if you are thinking of accessing the data in that manner, what's the point of bits (or tags)? The idea behind having a value and then using a bitmask is to be able to test the value against different bitmasks, each bitmask corresponding to a different tag or tag combination. The *where* statement you are suggesting differs in nothing from a regular id or, in this case, a category id (instead of a combination of tags): you will be fetching all records that only have a specific mask. I think you are a little bit confused. Let's say you have several tags, each with an identifier: TAG_NATURE = 1 TAG_ANIMALS = 2 TAG_CARS = 4 TAG_SPORTS = 8 then you have a record ... the idea of using bits is to be able to assign that record a single value, formed by the combination of the different tags. For example, if an element corresponds to TAG_NATURE and TAG_ANIMALS, you would want to have that element with a value of TAG_NATURE + TAG_ANIMALS, resulting in a tag value of 3. Then if you want to extract all ANIMALS you just do: *... where value & TAG_ANIMALS = TAG_ANIMALS;* because if you just do: *... where value = TAG_ANIMALS* you will only get the elements that *exclusively* have the tag TAG_ANIMALS. You will miss, for instance, those that have NATURE *and* ANIMALS (or any other tag). So your simple index on value will not be of any help, since you won't be doing *... where value = ANY_SPECIFIC_TAG* because of the latter. Now, if you are going to have a different TAG for every TAG COMBINATION, well, you can do that, but that would be no different from any regular id; in this case, it would be more of a CATEGORY, and elements will only be able to have one single category.
One alternative would be to try to make some helping indexes on expressions, maybe with the help of a function like: create or replace function hasTag(data integer, tag integer) returns boolean as $$ begin return (data & tag) = tag; end; $$ language plpgsql immutable; -- this function would return select hasTag(1, 1); -- true select hasTag(3, 1); -- true select hasTag(4, 1); -- false This way, you could reformulate your query in the following fashion: ... where hasTag(value, TAG_NATURE); and you could now build an index on yourtable based on that expression, like: create index idx_yourtable_hasTag_1 on yourtable (hasTag(value, 1 /* TAG_NATURE */)); If you would like to fetch a combination of tags, you could do: ... where hasTag(value, TAG_NATURE) and hasTag(value, TAG_ANIMALS) requiring an extra index on (hasTag(value, TAG_ANIMALS)). In this way, you will end up requiring up to 256 indexes :) (which can be anywhere from acceptable to *ridiculous*, depending on how often the indexes must be updated, volume, etc.); it's up to you. I'm not actually suggesting you use this approach; it's just a raw idea, the conclusion of one line of thought that may or may not have crossed your mind. Maybe with some refinement you can get to something more practical. But nonetheless (if I'm not missing something huge), the *where* statement you provided is just the wrong approach to tags. hope it helps, regards, eduardo On Wed, Sep 14, 2011 at 12:58 PM, Antonio Vieiro anto...@antonioshome.net wrote: Hi again, Thanks for the tip. In fact I was thinking of creating an index on the bitmask, so I could use: ... where t.bits = :mymask directly, avoiding a full table scan. I assume this is possible (indexing bit and comparing bits), isn't it? Thanks, Antonio On 14/09/11 15:58, Radosław Smogura wrote: On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote: Hi all, One of my entities 'E' may be 'tagged' with an arbitrary set of 256 tags 'T'.
A first approach could be to add an M:N relationship between 'E' and 'T'. A second way to do this could be to add a BIT(256) datatype to 'E', setting bits to '1' if the entity is tagged with each one of the 256 tags (i.e. using a 'bitmask' on the set of tags). Since querying entities 'E' with a certain set of tags 'T' must be very fast I was wondering if the second approach would be faster. What do you think? Thanks for any hints, Antonio I assume each entity may have one or more different tags. Actually, performing a test like ... where (t.bits & :mymask) = :mymask should be quite fast, and faster than creating additional relations, but only if it's highly probable that your query will almost always scan the whole table. The advantage of indexes is that the index is used first, and the later (slower) parts of the query always operate on a subset of the table; with a bitset, you will probably scan the whole table. So I think you should do some performance tests with a large amount of data and compare both ways. I think the bitset will be fast for really small data sets, but M:N relations may be faster for really large ones. You need to measure the size of your database too; in the M:N case with 256 tags it may be quite large. -- Sent via pgsql-general mailing list
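The tag arithmetic in Eduardo's example is easy to sanity-check directly in psql (integer bitmasks, using his values TAG_NATURE = 1, TAG_ANIMALS = 2, TAG_CARS = 4):

```sql
SELECT 1 | 2       AS nature_plus_animals,  -- 3
       (3 & 2) = 2 AS three_has_animals,    -- true
       (4 & 2) = 2 AS cars_has_animals;     -- false
```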
[GENERAL] Arrays
Hi Is there a method of counting the number of elements in an array?? Bob
Re: [GENERAL] Arrays
2011/9/14 Bob Pawley rjpaw...@shaw.ca Hi Is there a method of counting the number of elements in an array?? Yes... Use function array_upper [1]. See an example: postgres@bdteste=# SELECT array_upper(ARRAY['foo', 'bar'], 1); array_upper - 2 (1 row) [1] http://www.postgresql.org/docs/current/interactive/functions-array.html -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
Re: [GENERAL] Arrays
On Wed, Sep 14, 2011 at 1:05 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: 2011/9/14 Bob Pawley rjpaw...@shaw.ca Hi Is there a method of counting the number of elements in an array?? Yes... Use function array_upper [1]. See an example: postgres@bdteste=# SELECT array_upper(ARRAY['foo', 'bar'], 1); array_upper - 2 that only gives you one dimension's worth of elements, and is only correct if the array is 1-based. select count(*) from unnest(_array_); will give you an exact count. another way to do it which doesn't require expanding the array would be to parse and calculate the number of elements from the output of array_dims() (which is unfortunately returned as text). merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
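Side by side, the approaches mentioned in this thread look like this (a sketch; array_length is available from PostgreSQL 8.4 on):

```sql
SELECT array_upper(ARRAY['foo','bar'], 1);        -- 2 (upper bound, not a count)
SELECT count(*) FROM unnest(ARRAY[[1,2],[3,4]]);  -- 4 (exact, any number of dimensions)
SELECT array_length(ARRAY['foo','bar'], 1);       -- 2 (length of the given dimension)
```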
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, Sep 14, 2011 at 12:04 PM, Rich Shepard rshep...@appl-ecosys.com wrote: On Wed, 14 Sep 2011, Steve Crawford wrote: I suspect you have a multi-line entry and the '+' is just indicating that the field continues. Steve, et al.: It's not multi-line, but malformed. It *is* multi-line. psql uses a '+' to show line breaks: postgres=# select E'a\nb'; ?column? -- a + b (1 row) To fix your data, I'd consider using the replace() function to knock out newlines: postgres=# select replace(E'a\nb', E'\n', ''); replace - ab (1 row) Also consider adjusting the app and/or the database to block them in the future. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
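Applied to the table from this thread, the replace() fix could become an UPDATE (a sketch; inspect the matching rows first):

```sql
-- Inspect the rows that contain embedded newlines:
SELECT site_id FROM chemistry WHERE site_id ~ E'\n';

-- Strip the newlines in place:
UPDATE chemistry
SET    site_id = replace(site_id, E'\n', '')
WHERE  site_id ~ E'\n';
```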
Re: [GENERAL] Arrays
On 09/14/2011 11:01 AM, Bob Pawley wrote: Hi Is there a method of counting the number of elements in an array?? Bob Look at array_dims, array_upper and array_lower. But note that PostgreSQL allows multi-dimensional arrays. The array_dims function gives you all the dimensions. If you have a one-dimensional array you can use array_upper(yourarray, 1) and array_lower(yourarray, 1). Also be aware that the first element in a PostgreSQL array typically has an index of 1 but does not have to. In fact it is legal to have arrays that start at a negative index and proceed to a positive one. Cheers, Steve
Re: [GENERAL] Bit datatype performance?
Hi, Thanks for the tip. Maybe two UUIDs are a best approach. I'll see which is more performant. Kind regards, Antonio El 14/09/11 19:32, Radosław Smogura escribió: Hi, I think it's not bad approach if performance is important. I don't know how b-tree index will work with bitset datatype, but I assume it should treat is as 256bit number (maybe someone more competive in internals will answer this). Please bear in mind, that this approach will work well only on query You have written. Because You ask on performance, I will add this topic You may want to test and think about it PG by default uses text transfer mode, so if you transfer your data from/to server those will be transferred as 256 0/1 character string. You may to think about storing tags as e.g. 4 long (64bit) fields, or 2 type 4 UUIDs (128bit) and use composite index. If you have ability to use binary transfer and on your client side bitest will be mapped to some reasonable type, then You won, otherwise (in binary mode) you should get nice boost when you will store, those values in types I have wrote. Of course those are only some concepts, personally I have never made such things. Regards, Radek On Wed, 14 Sep 2011 17:58:58 +0200, Antonio Vieiro wrote: Hi again, Thanks for the tip. In fact I was thinking of creating an index on the bitmask, so I could use: ... where t.bits = :mymask directly, avoiding a full table scan. I assume this is possible (indexing bit and comparing bits), isn't it? Thanks, Antonio El 14/09/11 15:58, Radosław Smogura escribió: On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote: Hi all, One of my entities 'E' may be 'tagged' with an arbitrary set of 256 tags 'T'. A first approach could be to add a M:N relationship between 'E' and 'T'. A second way to do this could be to add a BIT(256) datatype to 'E', setting bits to '1' if the entity is tagged with each one of the 256 tags (i.e. using a 'bitmask' on the set of tags). 
Since querying entities 'E' with a certain set of tags 'T' must be very fast I was wondering if the second approach would be faster. What do you think? Thanks for any hints, Antonio I assume each entity may have one or more different tags. Actually, performing a test like ... where (t.bits & :mymask) = :mymask should be quite fast, and faster than creating additional relations, but only if it's highly probable that your query will almost always scan the whole table. The advantage of indexes is that the index is used first, and the later (slower) parts of the query always operate on a subset of the table; with a bitset, you will probably scan the whole table. So I think you should do some performance tests with a large amount of data and compare both ways. I think the bitset will be fast for really small data sets, but M:N relations may be faster for really large ones. You need to measure the size of your database too; in the M:N case with 256 tags it may be quite large. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Query performance help with 'shadow table' approach.
On 09/14/2011 01:10 AM, Alban Hertroys wrote: On 13 Sep 2011, at 23:44, Brian Fehrle wrote: These queries basically do a 'select max(primary_key_column) from table group by column1, column2'. Because of the group by, we would result in a sequential scan of the entire table which proves to be costly. That seems to suggest a row where the primary key that has the max value is special in some way. Making them more easily distinguishable from normal rows seems like a good idea here. Since the table has a ton of columns, I set up a smaller table that will house a copy of some of the data that the query uses, the primary key column, and the two columns I do my 'group by' on. That's one way to distinguish these special rows from the rest. You could also mark them as special using an extra column and/or create an expression-based index over just those rows. However, especially with the below section in mind, it would appear your data could be normalised a bit more (your splitting off that shadow table is a step in doing so, in fact). I'm also wondering, does your primary key have actual meaning? It would appear to just indicate the order in which the records were created (I'm assuming it's a serial type surrogate PK, and not a natural one). It isn't a serial type, and the id increment is handled by the application. This shadow table will also only contain one row for every column1 and column2 combination (due to the group by), and for those rows, will have the max of the primary key. Even with this, the 'shadow' table will have about 14 million rows, compared to the 15 million in the main table. Don't (column1, column2) make up a key then? I get the feeling you should split your table in 3 sections: Table 1: main lookup (PK: pkey_sid) Table 2: Variation lookup (PK: (column1, column2), FK: pkey_sid) Table 3: Data (FK: the above) (column1, column2) could possibly have multiple occurrences of the combination.
Such as, 4 rows where column1 = 54 and column2 = 86, in these cases with multiple rows, I just want the one with the max(primary_key). I'm looking into options like this, but at this moment changing the base table structure is out of the question, but adding tables along the side to try to speed things up is ok. Im trying to not cause changes in the application. So the issue here comes in retrieving the needed data from my main table. The resulting rows is estimated to be 409,600, and the retrieving of the primary key's that are associated with those rows is actually really easy. However, when we take those 409,600 rows back to the main table to retrieve the other columns I need, the planner is just doing a sequential scan as it's most likely going to be faster than hitting the index then retrieving the columns I need for all 400K+ rows. Is that estimate accurate? If not, see Ondrej's suggestions. That is only about 1/30th of your table. I don't think a seqscan makes sense here unless your data is distributed badly. Yeah the more I look at it, the more I think it's postgres _thinking_ that it's faster to do a seqential scan. I'll be playing with the random_page_cost that Ondrej suggested, and schedule a time where I can do some explain analyzes (production server and all). Things to note: 1. If I reduce my where clause's range, then the sequential scan turns into an index scan, but sadly this can't always be done. Does it make sense to CLUSTER your data in some sense? That would improve the data distribution issue and would probably push the threshold for a seqscan up some. Cheers, Alban Hertroys Thanks, I'll be reporting back in with my next findings. - Brian F -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
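As an aside, the "row with max(primary_key) per (column1, column2)" pattern this thread keeps returning to is often written with PostgreSQL's DISTINCT ON (table and column names here are hypothetical, following the thread's description):

```sql
SELECT DISTINCT ON (column1, column2)
       pkey_sid, column1, column2
FROM   main_table
ORDER  BY column1, column2, pkey_sid DESC;  -- keeps the highest pkey_sid per pair
```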
Re: [GENERAL] PostgreSQL 9.1, replica and unlogged tables
Writing to any table on the standby is strictly forbidden so you can forget having your own volatile copy. Hello, It should be possible to declare a -non postgres- SQL/MED table pointing e.g. to a CSV on localhost, shouldn't it? best regards, Marc Mamin
Re: [GENERAL] Problem with the 9.1 one-click installer Windows7 64bit
Thomas Kellerer wrote: So I killed the icacls.exe and the script proceeded, just to hang at the next call to icacls.exe when it tried to set the privileges on the directory for the postgres user despite the fact that that user already was the owner and had full control over it. So I killed icacls.exe again and then the script finally finished without problems. The service was registered and successfully started. Before killing hanging processes that I can't explain I usually go out and get myself a coffee. In this particular case it fixed the problem because icacls.exe didn't actually hang but only took some time to complete. Regards, Brar -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Identifying Reason for Column Name Returned by SELECT
On Wed, 14 Sep 2011, Merlin Moncure wrote: It *is* multi-line. psql uses a '+' to show line breaks: Merlin, Yep. I discovered this when I dumped the table as an ASCII text file and saw the '\n' after the site_id string on some rows. I've no idea how it got there. Thanks, Rich -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
Dne 14.9.2011 03:15, Toby Corkindale napsal(a): Hi, Some months ago, I ran some (probably naive) benchmarks looking at how pgbench performed on an identical system with differing filesystems. (on Linux). Since then the kernel-level version of ZFS became usable, and there have been improvements to btrfs, and no doubt various updates in the Linux kernel and PostgreSQL that should help performance. I ran the tests on Ubuntu 11.04 with Pg 9.0 first, then upgraded the system to Ubuntu 11.10 (beta) with Pg 9.1 and ran them again. The latter combination showed a considerable performance improvement overall - although I didn't investigate to find out whether this was due to kernel improvements, postgres improvements, or virtio improvements. The results are measured in transactions-per-second, with higher numbers being better. Results: ext4 (data=writeback,relatime): natty: 248 oneiric: 297 ext4 (data=writeback,relatime,nobarrier): natty: didn't test oneiric: 1409 XFS (relatime): natty: didn't test oneiric: 171 btrfs (relatime): natty: 61.5 oneiric: 91 btrfs (relatime,nodatacow): natty: didn't test oneiric: 128 ZFS (defaults): natty: 171 oneiric: 996 Conclusion: Last time I ran these tests, xfs and ext4 pulled very similar results, and both were miles ahead of btrfs. This time around, ext4 has managed to get a significantly faster result than xfs. However we have a new contender - ZFS performed *extremely* well on the latest Ubuntu setup - achieving triple the performance of regular ext4! I'm not sure how it achieved this, and whether we're losing some kind of data protection (eg. like the barrier options in XFS and ext4). If ext4 has barriers disabled, it surpasses even ZFSs high score. Oddly, ZFS performed wildly differently on ubuntu 11.04 vs 11.10b. I can't explain this. Any ideas? Those results look very suspisicous I guess. Unless we know all the details about the fs options, pgbench runs and system (drives, cache, controller) it's very difficult to say anything. 
For example the ZFS results seem really strange - I've done a quite thorough benchmark of file systems last month (http://bit.ly/oEdkSH) and I really don't think the differences can be that huge under the same conditions - increase from 171 to 996 is ridiculous. The only explanation I can come up with is that oneiric uses significantly weaker ZFS options (e.g. disabled write barriers) and there's a controller with write cache. But that also means you should not compare ZFS to plain ext4 (297 tps) but to ext4 with write barriers disabled (1409 tps). That changes the results a bit, because instead of being 3x faster, ZFS is about 30% slower. Tomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Noob help for charting on web site, need assistance
I am new to the mailing list and to PostgreSQL. I am currently running PostgreSQL 9.0.4 on a Windows XP laptop. I am successfully logging sensor data to a database (temperature, humidity, voltage, etc…) and I would like to display the data in a chart on a web site. Currently I have everything on the same Windows system: PostgreSQL, the data acquisition system, the Apache web server, pgAdmin III, and pgAgent. I am updating the database with a CSV file (generated by the data acquisition system) using the COPY command in an SQL script (updates every minute via SQL script and pgAgent). Not the best way, but it works for now. I would like to use something simple (if there is such a thing) to show a chart of sensor output over time based on an SQL query, e.g. hourly average for the past 24 hrs, daily average, weekly average, etc… The system is on a local network that should not be accessible from the outside network, behind a firewall with a net10 address. Because some users have Macs and iPads, I can't use a Flash-based product, which eliminates some options. I have looked at a number of charting products, mostly based on a Google search and this page: http://webtecker.com/2008/06/12/10-free-chart-scripts/ The charting products include the following: Flot, Open Flash Chart, AmCharts, Emprise JavaScript Charts, PlotKit, Flotr, PHP/SWF Charts, Visifire, FusionCharts, and JFreeChart. I appreciate any help in pointing me toward a solution and/or recommendations for how to graphically display sensor data over time. Regards, Greg Howard -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Query performance help with 'shadow table' approach.
On 14 Sep 2011, at 20:45, Brian Fehrle wrote: That is only about 1/30th of your table. I don't think a seqscan makes sense here unless your data is distributed badly. Yeah the more I look at it, the more I think it's postgres _thinking_ that it's faster to do a seqential scan. I'll be playing with the random_page_cost that Ondrej suggested, and schedule a time where I can do some explain analyzes (production server and all). Before you do that, turn off seqscans (there's a session option for that) and see if index scans are actually faster. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
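The session option Alban refers to is enable_seqscan; a quick experiment might look like this (the query and object names are hypothetical, following the thread's description):

```sql
SET enable_seqscan = off;   -- affects the current session only
EXPLAIN ANALYZE
SELECT max(pkey_sid) FROM main_table GROUP BY column1, column2;
RESET enable_seqscan;
```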
Re: [GENERAL] Noob help for charting on web site, need assistance
On 09/14/2011 02:56 PM, Greg Howard wrote: I am new to the mailing list and to PostgreSQL. I am currently running PostgreSQL 9.0.4 on a Windows XP laptop. I am successfully logging sensor data to a database (temperature, humidity, voltage, etc…), I would like to display the data in a chart on a web site. Currently I am have everything on the same Windows system, PostgreSQL, data acquisition system, Apache Web server, pgAdmin III, and pgAgent If you are running PHP you can check out JpGraph. Not free but not too expensive either and there are lots of chart options. Sparklines might be just the ticket - especially for mobile devices. One implementation is at http://sparkline.org/ Both the above are in PHP so you would need that on your system. You could do something in JavaScript. gRaphaël is pretty cool: http://g.raphaeljs.com/ And RRDtool makes nice graphs of sensor-type data: http://oss.oetiker.ch/rrdtool/ Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Noob help for charting on web site, need assistance
On 09/14/2011 04:56 PM, Greg Howard wrote: I am new to the mailing list and to PostgreSQL. I am currently running PostgreSQL 9.0.4 on a Windows XP laptop. I am successfully logging sensor data to a database (temperature, humidity, voltage, etc…), I would like to display the data in a chart on a web site. Currently I am have everything on the same Windows system, PostgreSQL, data acquisition system, Apache Web server, pgAdmin III, and pgAgent. I am updating the database with a csv file (generated by the data acquisition system) with the COPY command in an SQL script (updates every minute via SQL script and pgAgent). Not the best way, but it works for now. I would like to use something simple (if there is such a thing) to show a chart of sensor output over time base on a SQL query, e.g. hourly average past 24 hrs, daily average, weekly average, etc… The system is on a local network that should not be accessible from the outside network, behind a firewall with net10 address. Because some use Macs and iPads, I can't use a Flash based product, eliminating some options. I have looked a number of charting products, mostly based off of a google search and this page: http://webtecker.com/2008/06/12/10-free-chart-scripts/ The charting products include the following: Flot, Open Flash Chart, AmCharts, Emprise JavaScript Charts, PlotKit, Flotr, PHP/SWF Charts, Visifire, FusionCharts, and JFreeChart. I appreciate any help in pointing me toward a solution and/or recommendations for how to graphically display sensor data over time. Regards, Greg Howard I use cron, perl and gnuplot. Simple (if you know perl) to get the data in any format you want, and graph. Cron so its up to date, but wont bring the box down if 50 people hit refresh as fast as they can. -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Cryptic Error Message Importing Table Dump
Now that I fixed the rows that had the inadvertent newlines in one column, I'm trying to read in the fixed table from the .sql file produced by pg_dump. I know there are duplicate rows now that I removed the newlines, and those are easily fixed (although the reported line numbers don't match what I see in emacs). There is, however, one problem that I don't understand, so I can't find the row and fix it. Here's what psql reports:

  \i /full-path-to-file/chemistry.sql
  SET
  SET
  SET
  SET
  SET
  SET
  SET
  SET
  SET
  CREATE TABLE
  ALTER TABLE
  psql:/full-path-to-file/chemistry.sql:47475: ERROR: invalid input syntax for type real:
  CONTEXT: COPY chemistry, line 47363, column quant:

Line 47475 is below the last line with content in the file. Line 47363 contains:

  96-A000890 SC 1996-04-23 Conductance, Specific 394 uS/cm t \N \N \N

(which is wrapped here, but not in the emacs buffer). There are 10 columns, which is how many there should be. When I go to the end of the line there's no space or other extraneous character. The column 'quant' contains the number 394. While that's an integer, the column domain is real, and psql doesn't complain about other whole numbers in that column.

Please help me understand what the error message and context are telling me, because I just don't see it.

Rich
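One way to narrow this down mechanically: pg_dump's COPY text format is tab-separated with \N as the NULL marker, and an error message that ends with "for type real: " followed by nothing often means the field was an empty string or contained an invisible character rather than a visibly bad number. A small sketch (the column index and sample line layout are taken from the message above; everything else is an assumption) that checks one field of a COPY data line:

```python
# Sketch: validate the "quant" field of a COPY text-format line.
# COPY text format is tab-separated, with \N marking NULL. An empty string
# or a field with stray invisible characters will fail float() just as it
# fails PostgreSQL's real input parser.
def check_real_field(line, column_index):
    """Return None if the field parses as a float (or is NULL), else a message."""
    fields = line.rstrip("\n").split("\t")
    value = fields[column_index]
    if value == r"\N":          # COPY's NULL marker
        return None
    try:
        float(value)
        return None
    except ValueError:
        return "bad real value %r (length %d)" % (value, len(value))

good = "96-A000890\tSC\t1996-04-23\tConductance, Specific\t394\tuS/cm\tt\t\\N\t\\N\t\\N"
bad = good.replace("\t394\t", "\t\t")   # simulate an empty quant field

print(check_real_field(good, 4))   # None: 394 parses fine as a real
print(check_real_field(bad, 4))    # flags the empty string, with its length
```

Run over every data line of the dump, this reports the repr and length of any offending value, which makes invisible characters show up that emacs may render as nothing.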
Re: [GENERAL] Noob help for charting on web site, need assistance
On 09/14/2011 02:56 PM, Greg Howard wrote:
> I am new to the mailing list and to PostgreSQL. I am currently running PostgreSQL 9.0.4 on a Windows XP laptop. I am successfully logging sensor data (temperature, humidity, voltage, etc.) to a database, and I would like to display the data in a chart on a web site. [...]

I would suggest importing it into Google Docs, charting it as you would a spreadsheet, and then sharing that.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
The PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579
Re: [GENERAL] PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4
On 15/09/11 02:46, Robert Treat wrote:
> Can you go into some more detail on how you set up ZFS on these systems?

I'm afraid my knowledge of ZFS is rather weak compared to the other filesystems; all I really did was zpool create followed by zfs create, using all the defaults. The zpool was created against an LVM logical volume (the same one used for all the filesystems measured in the tests). That LV was itself part of a volume group that was striped over three disks (Western Digital WD1003FBYX).

I'm happy to re-run the benchmark with different ZFS options if you can suggest some. Unfortunately I can't easily bypass the LVM layer here, as the disks involved are fully committed to the volume group and I don't want to rebuild the machine.

Cheers,
Toby

On Tue, Sep 13, 2011 at 10:56 PM, Andy Colson (a...@squeakycode.net) wrote:

On 09/13/2011 08:15 PM, Toby Corkindale wrote:

Hi,
Some months ago, I ran some (probably naive) benchmarks looking at how pgbench performed on an identical system with differing filesystems (on Linux). Since then the kernel-level version of ZFS became usable, and there have been improvements to btrfs, and no doubt various updates in the Linux kernel and PostgreSQL that should help performance.

I ran the tests on Ubuntu 11.04 with Pg 9.0 first, then upgraded the system to Ubuntu 11.10 (beta) with Pg 9.1 and ran them again. The latter combination showed a considerable performance improvement overall, although I didn't investigate to find out whether this was due to kernel improvements, Postgres improvements, or virtio improvements. The results are measured in transactions per second, with higher numbers being better.
Results (transactions per second; "-" = didn't test):

  Filesystem (options)                        natty   oneiric
  ext4 (data=writeback,relatime)                248       297
  ext4 (data=writeback,relatime,nobarrier)        -      1409
  XFS (relatime)                                  -       171
  btrfs (relatime)                             61.5        91
  btrfs (relatime,nodatacow)                      -       128
  ZFS (defaults)                                171       996

Conclusion: Last time I ran these tests, XFS and ext4 pulled very similar results, and both were miles ahead of btrfs. This time around, ext4 has managed a significantly faster result than XFS. However, we have a new contender: ZFS performed *extremely* well on the latest Ubuntu setup, achieving triple the performance of regular ext4! I'm not sure how it achieved this, and whether we're losing some kind of data protection (e.g. like the barrier options in XFS and ext4). If ext4 has barriers disabled, it surpasses even ZFS's high score.

Oddly, ZFS performed wildly differently on Ubuntu 11.04 vs 11.10b. I can't explain this. Any ideas?
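The relative claims in the conclusion can be checked directly against the quoted oneiric figures; a quick sketch of the arithmetic:

```python
# Arithmetic on the oneiric tps figures quoted above, to make the relative
# claims in the conclusion ("triple the performance", "surpasses even ZFS")
# concrete. Numbers are copied from the posted results, nothing else.
tps = {
    "ext4": 297,
    "ext4_nobarrier": 1409,
    "xfs": 171,
    "btrfs": 91,
    "btrfs_nodatacow": 128,
    "zfs": 996,
}
print("ZFS vs ext4: %.2fx" % (tps["zfs"] / tps["ext4"]))                    # ~3.35x
print("ext4 nobarrier vs ZFS: %.2fx" % (tps["ext4_nobarrier"] / tps["zfs"]))
```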
Re: [GENERAL] pg_dump with select command
I am sorry, but I am still not able to solve the issue. I followed the steps below:

1. Create a table from the main tables with a CREATE TABLE ... AS SELECT command.
2. Take a backup of that table and restore it on the remote machine.
3. After this, change the table name back to the original one on the remote server.

Whereas in MySQL we have the -X option to specify a query while taking backups and then restore them, I think PostgreSQL does not support this.

Thanks

Alban Hertroys wrote:
> On 14 September 2011 11:31, Adarsh Sharma (adarsh.sha...@orkash.com) wrote:
>> Any update on below issue.
>
> What do you expect an update on? You got your answer, which included the solution.
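The three-step workaround described above can be scripted so it is less painful to repeat. A minimal sketch that only builds the SQL statements and the pg_dump command line (all identifiers and file names here are hypothetical; nothing is executed):

```python
# Sketch of the subset-dump workaround: materialise the wanted rows with
# CREATE TABLE ... AS SELECT, dump just that table with pg_dump -t, then
# rename it back on the remote side. Commands are built, not run.
def subset_dump_commands(source_table, subset_table, where_clause, dump_file):
    create_sql = "CREATE TABLE %s AS SELECT * FROM %s WHERE %s;" % (
        subset_table, source_table, where_clause)
    dump_cmd = ["pg_dump", "-t", subset_table, "-f", dump_file]
    rename_sql = "ALTER TABLE %s RENAME TO %s;" % (subset_table, source_table)
    return create_sql, dump_cmd, rename_sql

create_sql, dump_cmd, rename_sql = subset_dump_commands(
    "measurements", "measurements_subset", "ts > '2011-01-01'", "subset.sql")
print(create_sql)
print(" ".join(dump_cmd))   # run on the source host
print(rename_sql)           # run on the remote host after restoring
```

The pg_dump -t option restricts the dump to the named table, which is what makes step 2 work without dumping the whole database.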
[GENERAL] how to use all the options of EXPLAIN?
Could anyone please tell me how to use all the options of EXPLAIN in PostgreSQL 9? http://www.postgresql.org/docs/9.0/static/sql-explain.html

I can use EXPLAIN ANALYZE.

FORMAT:

  explain (format yaml) select * from tab1;
         QUERY PLAN
  -------------------------
   - Plan:                +
       Node Type: Seq Scan+
       Relation Name: tab1+
       Alias: tab1        +
       Startup Cost: 0.00 +
       Total Cost: 21558.94 +
       Plan Rows: 1688234 +
       Plan Width: 333
  (1 row)

But:

  explain (format yaml) analyze select * from tab1;
  ERROR: syntax error at or near "analyze"
  LINE 1: explain (format yaml) analyze select * from tab1

BUFFERS:

  explain (buffers true) select * from tab1;
  ERROR: EXPLAIN option BUFFERS requires ANALYZE

But:

  explain (buffers true) analyze select * from tab1;
  ERROR: syntax error at or near "analyze"
  LINE 1: explain (buffers true) analyze select * from tab1...

Same for COSTS. Does any of our experts use these options? If yes, please tell me how.

Thanks.
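All of the syntax errors above follow one pattern: ANALYZE is written after the closing parenthesis, but in the PostgreSQL 9.0 EXPLAIN syntax ANALYZE is itself one of the parenthesized options, so it belongs inside the list, e.g. `EXPLAIN (ANALYZE, BUFFERS true, FORMAT yaml) SELECT ...`. A small sketch of a helper that assembles the statement the accepted way (the helper itself is hypothetical; the option placement is from the EXPLAIN syntax):

```python
# Assemble a 9.0-style EXPLAIN statement with options inside the parentheses.
# ANALYZE is an option like BUFFERS and FORMAT, not a trailing keyword, which
# is why "explain (format yaml) analyze ..." fails with a syntax error.
def explain(query, **options):
    if not options:
        return "EXPLAIN " + query
    parts = []
    for name, value in options.items():
        if value is True and name.upper() == "ANALYZE":
            parts.append("ANALYZE")              # bare boolean option form
        else:
            parts.append("%s %s" % (name.upper(), value))
    return "EXPLAIN (%s) %s" % (", ".join(parts), query)

print(explain("SELECT * FROM tab1", analyze=True, buffers="true", format="yaml"))
# -> EXPLAIN (ANALYZE, BUFFERS true, FORMAT yaml) SELECT * FROM tab1
```

The same placement fixes the BUFFERS and COSTS cases: since BUFFERS requires ANALYZE, both go inside the parentheses together.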