[PERFORM] INSERT times - same storage space but more fields - much slower inserts
Hi, I've been doing some testing for the Bacula project, which uses PostgreSQL as one of the databases in which it stores backup catalogs. Insert times are critical in this environment, as the app may insert millions of records a day.

I've been evaluating a schema change for Bacula that takes a field currently stored as a gruesome-to-work-with base64-encoded representation of a binary blob, and expands it into a set of integer fields that can be searched, indexed, etc. The table size of the expanded form is marginally smaller than for the base64-encoded string version. However, INSERT times are *CONSIDERABLY* greater for the version with more fields: 1011 seconds to insert the base64 version vs 1290 seconds for the expanded-fields version, a difference of 279 seconds, or 27%, even though the final table sizes are essentially the same. The SQL dump for the base64 version is 1734MB and the expanded one is 2189MB, about a 25% increase.

Given that the final table sizes are the same, is the slowdown likely to be just the cost of parsing the extra SQL, converting the textual representations of the numbers, etc.?

If I use tab-separated input and COPY, the original-format file is 1300MB and the expanded-structure file is 1618MB. The performance hit on COPY-based inserts is not as bad, at 161s vs 182s (13%), but still quite significant.

Any ideas about what I might be able to do to improve the efficiency of inserting records with many integer fields?
In case it's of interest, the base64 and expanded schemas are:

CREATE TABLE file (
    fileid     bigint NOT NULL,
    fileindex  integer DEFAULT 0 NOT NULL,
    jobid      integer NOT NULL,
    pathid     integer NOT NULL,
    filenameid integer NOT NULL,
    markid     integer DEFAULT 0 NOT NULL,
    lstat      text NOT NULL,
    md5        text NOT NULL
);

CREATE TABLE file (
    fileid     bigint,
    fileindex  integer,
    jobid      integer,
    pathid     integer,
    filenameid integer,
    markid     integer,
    st_dev     integer,
    st_ino     integer,
    st_mod     integer,
    st_nlink   integer,
    st_uid     integer,
    st_gid     integer,
    st_rdev    bigint,
    st_size    integer,
    st_blksize integer,
    st_blocks  integer,
    st_atime   integer,
    st_mtime   integer,
    st_ctime   integer,
    linkfi     integer,
    md5        text
);

(Yes, those are the fields of a `struct lstat'.)

-- Craig Ringer

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
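For reference, the COPY-based load path discussed above would look something like this for the expanded schema; a minimal sketch (the file path is hypothetical), loading tab-separated rows in one parse pass with no per-statement overhead:

```sql
COPY file (fileid, fileindex, jobid, pathid, filenameid, markid,
           st_dev, st_ino, st_mod, st_nlink, st_uid, st_gid, st_rdev,
           st_size, st_blksize, st_blocks, st_atime, st_mtime, st_ctime,
           linkfi, md5)
FROM '/path/to/batch.tsv';
```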
Re: [PERFORM] difficulties with time based queries
What can I do to prevent the index from getting bloated, or in whatever state it was in? What else can I do to further improve queries on this table? Someone suggested posting details of my conf file. Which settings are most likely to be useful for this?

If you often do range queries on date, consider partitioning your table by date (something like one partition per month). Of course, if you also often do range queries on something other than date, and uncorrelated with it, forget it.

If you make a lot of big aggregate queries, consider materialized views. For things like how many games player X won this week, etc.:
- create helper tables which contain the query results
- every night, recompute the results taking into account the most recent data
- don't recompute results based on old data that never changes

This is only interesting if the aggregation reduces the data volume by an appreciable amount. For instance, if you run a supermarket with 1000 distinct products in stock and you sell 100,000 items a day, keeping a cached count of product X sold each day will reduce your data load by a factor of about 100 for the query "count of product X sold this month".

The two suggestions above are not mutually exclusive.

You could try Bizgres also. Or even MySQL! MySQL's query engine is slower than pg, but its tables take much less space than Postgres tables, and it can do index-only queries, so you can fit more in the cache. This is only valid for MyISAM (InnoDB is a bloated hog). Of course, no one would want to use MyISAM as the safe storage, but it's pretty good as read-only storage. You can even use the Archive format for even more compactness and better use of cache. Of course you'd have to devise a way to dump from pg and load into MySQL, but that's not hard. MySQL can be good if you target a table with lots of small rows and a few ints, all of them in a multicolumn index, so it doesn't need to hit the table itself.
Note that no one in his right mind would ever run aggregate queries on a live R/W MyISAM table, since the long queries would block all writes and blow up the reaction time. But for a read-only cache updated at night, or a replication slave, it's okay.
Re: [PERFORM] Nested query performance issue
On Thu, 9 Apr 2009, Glenn Maynard wrote:
On Thu, Apr 9, 2009 at 7:29 AM, Heikki Linnakangas wrote:

SELECT s.* FROM score s, game g
WHERE s.game_id = g.id
AND s.id IN (
    SELECT s2.id FROM score s2
    WHERE s2.game_id = g.id
    ORDER BY s2.score DESC LIMIT 1
);

You don't really need the join with game here, simplifying this into:

SELECT s.* FROM score s
WHERE s.id IN (
    SELECT s2.id FROM score s2
    WHERE s2.game_id = s.game_id
    ORDER BY s2.score DESC LIMIT 1
);

I don't think it makes it any faster, though.

It's about 10% faster for me. I'm surprised the planner can't figure out that this join is redundant.

Because the join isn't redundant? You're making the assumption that for every score.game_id there is exactly one game.id that matches. Of course, you may have a unique constraint and foreign key/trigger that ensures this.

Matthew

-- The third years are wandering about all worried at the moment because they have to hand in their final projects. Please be sympathetic to them, say things like ha-ha-ha, but in a sympathetic tone of voice -- Computer Science Lecturer
Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?
On Fri, 10 Apr 2009, Tom Lane wrote:
Josh Berkus j...@agliodbs.com writes:
Not as far as I can tell. It looks to me like the planner is assuming that a forwards index scan and a reverse index scan will have the same cost.

Right, because they do. If you think otherwise, demonstrate it.

They do when the correlation of the indexed value with position in the table is low, resulting in random access. However, when the correlation is near 1, the index scan approximates sequential access to disc. In that case, scan direction would be important.

Of course, there's the separate issue that correlation isn't actually that good a measure of the cost of an index scan, but I'm not sure what is better, and feasible.

Matthew

-- Our riverbanks and seashores have a beauty all can share, provided there's at least one boot, three treadless tyres, a half-eaten pork pie, some oil drums, an old felt hat, a lorry-load of tar blocks, and a broken bedstead there. -- Flanders and Swann
Re: [PERFORM] Nested query performance issue
On Tue, Apr 14, 2009 at 5:33 AM, Matthew Wakeling matt...@flymine.org wrote:
It's about 10% faster for me. I'm surprised the planner can't figure out that this join is redundant.
Because the join isn't redundant? You're making the assumption that for every score.game_id there is exactly one game.id that matches. Of course, you may have a unique constraint and foreign key/trigger that ensures this.

That's the definition of the tables I gave.

CREATE TABLE game (id SERIAL NOT NULL PRIMARY KEY); -- pk implies unique
CREATE TABLE score (id SERIAL NOT NULL PRIMARY KEY, score REAL,
                    game_id INTEGER REFERENCES game (id));

(I don't think it makes any difference to whether this can be optimized, but adding NOT NULL back to game_id doesn't change it, either.)

-- Glenn Maynard
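As an aside not raised in the thread: on PostgreSQL, this top-score-per-game query shape can also be written with DISTINCT ON, which avoids the correlated LIMIT 1 subquery entirely. A sketch against the score table defined above:

```sql
-- One row per game_id: the first row in (game_id, score DESC) order,
-- i.e. the highest-scoring row for each game.
SELECT DISTINCT ON (s.game_id) s.*
FROM score s
ORDER BY s.game_id, s.score DESC;
```

Whether this plans better than the IN (... LIMIT 1) form depends on the available indexes; an index on (game_id, score DESC) helps both.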
Re: [PERFORM] Nested query performance issue
2009/4/9 Віталій Тимчишин tiv...@gmail.com:
OK, got to my postgres. Here you are:

create or replace function explode_array(in_array anyarray)
returns setof anyelement as
$$
    select ($1)[s] from generate_series(1, array_upper($1, 1)) as s;
$$ language sql immutable;

In 8.4, this will be replaced by the built-in 'unnest'. Also we have array_agg.

merlin
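For comparison, the 8.4 built-ins mentioned above do the same job for one-dimensional arrays, in both directions:

```sql
SELECT unnest(ARRAY[1, 2, 3]);                        -- three rows: 1, 2, 3
SELECT array_agg(s) FROM generate_series(1, 3) AS s;  -- one row: {1,2,3}
```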
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
Craig,

* Craig Ringer (cr...@postnewspapers.com.au) wrote: I've been doing some testing for the Bacula project, which uses PostgreSQL as one of the databases in which it stores backup catalogs.

We also use Bacula with a PostgreSQL backend.

I've been evaluating a schema change for Bacula that takes a field that's currently stored as a gruesome-to-work-with base64-encoded representation of a binary blob, and expands it into a set of integer fields that can be searched, indexed, etc.

This would be extremely nice.

The table size of the expanded form is marginally smaller than for the base64-encoded string version. However, INSERT times are *CONSIDERABLY* greater for the version with more fields. It takes 1011 seconds to insert the base64 version, vs 1290 seconds for the expanded-fields version. That's a difference of 279 seconds, or 27%. Despite that, the final table sizes are the same. If I use tab-separated input and COPY, the original-format file is 1300MB and the expanded-structure format is 1618MB. The performance hit on COPY-based insert is not as bad, at 161s vs 182s (13%), but still quite significant. Any ideas about what I might be able to do to improve the efficiency of inserting records with many integer fields?

Bacula should be using COPY for the batch data loads, so hopefully it won't suffer too much from having the fields split out. I think it would be interesting to try doing PQexecPrepared with binary-format data instead of using COPY though. I'd be happy to help you implement a test setup for doing that, if you'd like.

Thanks, Stephen
Re: [PERFORM] difficulties with time based queries
If you often do range queries on date, consider partitioning your table by date (something like one partition per month). Of course, if you also often do range queries on something other than date, and uncorrelated, forget it.

If you pick your partitions to line up with your queries then you can probably do away with the date index. Even if it doesn't always line up perfectly, it's worth considering.

If you make a lot of big aggregate queries, consider materialized views: like how many games player X won this week, etc.
- create helper tables which contain the query results
- every night, recompute the results taking into account the most recent data
- don't recompute results based on old data that never changes

This is only interesting if the aggregation reduces the data volume by an appreciable amount. For instance, if you run a supermarket with 1000 distinct products in stock and you sell 100,000 items a day, keeping a cache of the count of product X sold each day will reduce your data load by a factor of about 100 for the query "count of product X sold this month".

This obviously creates some administration overhead. So long as that is manageable for you, this is a great solution. You might also want to look at Mondrian at http://mondrian.pentaho.org/ . It takes some tinkering, but buys you some neat views into your data and automatically uses those aggregate tables.

Nik Everett
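A sketch of the two suggestions combined, using the 8.x-era inheritance-plus-CHECK-constraint style of partitioning (all table and column names here are hypothetical, not from the original poster's schema):

```sql
-- One child table per month; with constraint_exclusion enabled,
-- a date-range query scans only the matching children.
CREATE TABLE games_2009_04 (
    CHECK (played_on >= DATE '2009-04-01' AND played_on < DATE '2009-05-01')
) INHERITS (games);

-- Nightly-refreshed helper table holding pre-aggregated results.
CREATE TABLE wins_per_player_day (
    player_id integer,
    day       date,
    wins      bigint
);

-- Recompute only yesterday's slice; older rows never change.
INSERT INTO wins_per_player_day
SELECT winner_id, played_on, count(*)
FROM games
WHERE played_on = CURRENT_DATE - 1
GROUP BY winner_id, played_on;
```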
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
On Tue, 14 Apr 2009, Stephen Frost wrote:
Bacula should be using COPY for the batch data loads, so hopefully won't suffer too much from having the fields split out. I think it would be interesting to try doing PQexecPrepared with binary-format data instead of using COPY though. I'd be happy to help you implement a test setup for doing that, if you'd like.

You can always do binary-format COPY.

Matthew

-- An ant doesn't have a lot of processing power available to it. I'm not trying to be speciesist - I wouldn't want to detract you from such a wonderful creature, but, well, there isn't a lot there, is there? -- Computer Science Lecturer
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
* Matthew Wakeling (matt...@flymine.org) wrote:
On Tue, 14 Apr 2009, Stephen Frost wrote: Bacula should be using COPY for the batch data loads, so hopefully won't suffer too much from having the fields split out. I think it would be interesting to try doing PQexecPrepared with binary-format data instead of using COPY though. I'd be happy to help you implement a test setup for doing that, if you'd like.

You can always do binary-format COPY.

I've never played with binary-format COPY actually. I'd be happy to help test that too, though.

Thanks, Stephen
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
Stephen Frost wrote:
* Matthew Wakeling (matt...@flymine.org) wrote: On Tue, 14 Apr 2009, Stephen Frost wrote: Bacula should be using COPY for the batch data loads, so hopefully won't suffer too much from having the fields split out. I think it would be interesting to try doing PQexecPrepared with binary-format data instead of using COPY though. I'd be happy to help you implement a test setup for doing that, if you'd like.
You can always do binary-format COPY.
I've never played with binary-format COPY actually. I'd be happy to help test that too though.

I'd have to check the source/a protocol dump to be sure, but I think PQexecPrepared(...), while it takes binary arguments, actually sends them over the wire in text form. PostgreSQL does have a binary protocol as well, but it suffers from the same issues as binary-format COPY:

Unlike PQexecPrepared(...), binary-format COPY doesn't handle endian and type-size issues for you. You need to convert the data to the database server's endianness and type sizes, but I don't think the PostgreSQL protocol provides any way to find those out. It's OK if we're connected via a UNIX socket (and thus are on the same host), though I guess a sufficiently perverse individual could install a 32-bit bacula+libpq and run a 64-bit PostgreSQL server, or even vice versa. It should also be OK when connected to `localhost' (127.0.0.0/8). In other cases, binary-format COPY would be unsafe without some way to determine remote endianness and sizeof(various types).

-- Craig Ringer
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
Craig Ringer cr...@postnewspapers.com.au writes:
Unlike PQexecPrepared(...), binary-format COPY doesn't handle endian and type size issues for you. You need to convert the data to the database server's endianness and type sizes, but I don't think the PostgreSQL protocol provides any way to find those out.

The on-the-wire binary format is much better specified than you think. (The documentation of it sucks, however.) It's big-endian in all cases and the datatype sizes are well defined.

regards, tom lane
Re: [PERFORM] INSERT times - same storage space but more fields - much slower inserts
Craig,

* Craig Ringer (cr...@postnewspapers.com.au) wrote: In other cases, binary-format COPY would be unsafe without some way to determine remote endianness and sizeof(various types).

As Tom mentioned already, the binary protocol is actually pretty well defined, and it's in network byte order, aka big-endian. The only thing I can think of off-hand that you need to know about the server is whether it's using 64-bit integers or floats for date-times. That's a simple check to do, however, specifically with:

show integer_datetimes;

It's also a lot cheaper to do the necessary byte-flipping to go from whatever-endian to network byte order than to do the whole printf/atoi conversion. Handling timestamps takes a bit more magic, but you can just pull the appropriate code/#defines from the server backend, and I don't think that's even an issue for this particular set of fields.

What does your test harness currently look like, and what would you like to see to test binary-format COPY? I'd be happy to write up the code necessary to implement binary-format COPY for this.

Thanks, Stephen
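For what it's worth, the binary COPY stream Tom describes is straightforward to generate by hand: a fixed header, then for each tuple an int16 field count followed by, per field, an int32 byte length and the big-endian payload, and finally an int16 -1 trailer. A sketch in Python (standard library only; it treats every column as int4, which is an assumption for illustration, not Bacula's real schema):

```python
import struct

# 11-byte signature, int32 flags (0), int32 header-extension length (0).
PGCOPY_HEADER = b"PGCOPY\n\xff\r\n\x00" + struct.pack("!ii", 0, 0)
PGCOPY_TRAILER = struct.pack("!h", -1)  # int16 -1 marks end of data

def encode_row(values):
    """Encode one tuple: int16 field count, then for each field an
    int32 byte length and the value as a big-endian int4 (None -> NULL)."""
    parts = [struct.pack("!h", len(values))]
    for v in values:
        if v is None:
            parts.append(struct.pack("!i", -1))     # NULL: length -1, no data
        else:
            parts.append(struct.pack("!ii", 4, v))  # length 4, then int4 value
    return b"".join(parts)

# A complete stream suitable for COPY tbl (a, b) FROM STDIN WITH BINARY:
stream = (PGCOPY_HEADER
          + encode_row([1000, 100])
          + encode_row([7, None])
          + PGCOPY_TRAILER)
```

Feeding such a stream to the server would go through PQputCopyData()/libpq or an equivalent driver call; the point is that the byte layout itself is fixed and big-endian, so no server-side size probing is needed for the integer types.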