Re: [PERFORM] Performance die when COPYing to table with bigint PK
Yes, you are right. Performance become even more awful. Can some techniques from pg_bulkload be implemented in postgres core? Current performance is not suitable for any enterprise-wide production system. 2011/8/5 Віталій Тимчишин tiv...@gmail.com: In my tests it greatly depends on if index writes are random or sequential. My test time goes down from few hours to seconds if I add to the end of index. As for me, best comparision would be to make two equal int4 columns with same data as in int8, two indexes, then perform the test. My bet it will be slower than int8. Четвер, 4 серпня 2011 р. користувач Robert Ayrapetyan robert.ayrapet...@comodo.com написав: All you are saying disproves following: in experiment I replaces bigint index: CREATE INDEX ix_t_big ON test.t USING btree (id_big) TABLESPACE tblsp_ix; with 4 (!) other indexes: If you look at the rest of my mail - you would notice 50 times difference in performance. What you would say? That accessing a page from RAM is more than 50 times as fast as a random access of that page from disk. -Kevin -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Best regards, Vitalii Tymchyshyn -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance die when COPYing to table with bigint PK
+525k 0+0io 0pf+0w COPY 10 0.000u 0.039s 0:29.10 0.1% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:36.29 0.0% 538+1164k 0+0io 0pf+0w COPY 10 0.000u 0.037s 0:43.77 0.0% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.036s 1:01.94 0.0% 538+1164k 0+0io 0pf+0w COPY 10 0.007u 0.029s 0:13.99 0.1% 808+2074k 0+0io 0pf+0w COPY 10 0.003u 0.005s 0:46.02 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.031s 0:45.58 0.0% 316+836k 0+0io 0pf+0w COPY 10 0.000u 0.038s 1:00.39 0.0% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:24.38 0.1% 538+1164k 0+0io 0pf+0w COPY 10 0.000u 0.037s 0:41.32 0.0% 538+1382k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:46.13 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.005u 0.000s 0:43.15 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:45.59 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.005s 1:54.92 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.037s 2:22.47 0.0% 538+1382k 0+0io 0pf+0w COPY 10 0.000u 0.005s 1:40.65 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.006u 0.020s 1:43.52 0.0% 650+1328k 0+0io 0pf+0w COPY 10 0.000u 0.036s 1:43.33 0.0% 538+1164k 0+0io 0pf+0w COPY 10 0.000u 0.036s 1:47.00 0.0% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.036s 2:18.94 0.0% 538+1164k 0+0io 0pf+0w from that moment all iterations went for more then 1 min and I interrupted test. DROP INDEX test.ix_t_big; time copy_data.csh /aaa/app.data 100 COPY 10 0.000u 0.005s 0:02.42 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.007u 0.029s 0:01.88 1.0% 808+2074k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:01.83 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:01.75 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:01.82 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.037s 0:01.81 1.6% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:01.84 1.6% 538+1164k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:01.86 1.6% 421+1114k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:01.77 1.6% 538+1164k 0+0io 0pf+0w ... Everything returned back to good perfomance state. With number of rows 50 mln all numbers in test with index on bigint column are multiplied on 20, while without index even on 200 mln rows speed remains constant (1-2 sec per 100k rows file). P.S. tried same with 2 columns (bigint and int) - it didn't produced such effect probably because data volume has critical effect. On Tue, Aug 2, 2011 at 8:41 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Ayrapetyan robert.ayrapet...@comodo.com wrote: So I'm still convinced - this bug relates to FreeBSD 64-bit + UFS + bigint column index (some of these may be superfluous, but I have no resources to check on different platforms with different filesystems). Linux 64 bit XFS bigint column index only shows a slightly longer run time for bigint versus int here. What timings do you get for the insert statements if you run the following in your environment? create table bi (big bigint not null, medium int not null); insert into bi with x(n) as (select generate_series(1, 100) select n + 50, n from x; \timing on truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; create unique index bi_medium on bi (medium); truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; drop index bi_medium; create unique index bi_big on bi (big); truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; truncate table bi; insert into bi with x(n) as (select generate_series(1, 100)) select n + 50, n from x; \timing off drop table bi; Here's what I get: Time: 1629.141 ms Time: 1638.060 ms Time: 1711.833 ms Time: 4151.953 ms Time: 4602.679 ms Time: 5107.259 ms Time: 4654.060 ms Time: 5158.157 ms Time: 5101.110 ms -Kevin -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance die when COPYing to table with bigint PK
If you look at the rest of my mail - you would notice 50 times difference in performance. What you would say? On Thu, Aug 4, 2011 at 7:11 PM, Vitalii Tymchyshyn tiv...@gmail.com wrote: 04.08.11 18:59, Kevin Grittner написав(ла): Robert Ayrapetyanrobert.ayrapet...@comodo.com wrote: Kevin Grittnerkevin.gritt...@wicourts.gov wrote: [regarding tests which do show the problem] tried same with 2 columns (bigint and int) - it didn't produced such effect probably because data volume has critical effect. Based on what you're showing, this is almost certainly just a matter of pushing your volume of active data above the threshold of what your cache holds, forcing it to do disk access rather than RAM access for a significant portion of the reads. -Kevin Yep. Seems so. Plus famous you'd better insert data, then create indexes. On my database it takes twice the time for int8 then for int4 to insert data. Also it takes ~twice a time (2 hours) to add 200K of rows to 200M of rows than to make an index over 200M of rows (1 hour). Best regards, Vitalii Tymchyshyn -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance die when COPYing to table with bigint PK
All you are saying disproves following: in experiment I replaces bigint index: CREATE INDEX ix_t_big ON test.t USING btree (id_big) TABLESPACE tblsp_ix; with 4 (!) other indexes: CREATE INDEX ix_t2 ON test.t USING btree (ip) TABLESPACE tblsp_ix; CREATE INDEX ix_t3 ON test.t USING btree (id_small) TABLESPACE tblsp_ix; CREATE INDEX ix_t4 ON test.t USING btree (id_smalll) TABLESPACE tblsp_ix; CREATE INDEX ix_t5 ON test.t USING btree (ts) TABLESPACE tblsp_ix; which are definitely larger then one bigint index. 0.000u 0.005s 0:13.23 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.035s 0:05.08 0.5% 421+1114k 0+0io 0pf+0w COPY 10 0.000u 0.036s 0:19.28 0.1% 526+1393k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:05.56 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.006u 0.012s 0:05.57 0.1% 984+1820k 0+0io 0pf+0w COPY 10 0.007u 0.029s 0:05.20 0.3% 808+1746k 0+0io 0pf+0w COPY 10 0.005u 0.000s 0:05.35 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.011s 0:05.92 0.1% 316+836k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:12.08 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.029s 0:05.46 0.3% 808+2074k 0+0io 0pf+0w COPY 10 0.002u 0.002s 0:05.35 0.0% 0+0k 0+0io 0pf+0w COPY 10 0.000u 0.005s 0:06.52 0.0% 0+0k 0+0io 0pf+0w Insertions became slower 4-5 times, which is ok. Nothing is closer to even half of minute, while one bigint index constantly gives more then minute and even 2 for 100k records. On Thu, Aug 4, 2011 at 8:22 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Ayrapetyan robert.ayrapet...@comodo.com wrote: If you look at the rest of my mail - you would notice 50 times difference in performance. What you would say? That accessing a page from RAM is more than 50 times as fast as a random access of that page from disk. -Kevin -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance die when COPYing to table with bigint PK
Quite possible. But anyway - I don't think performance degradation must be so huge in case of using UNIQUE indexes. On Mon, Aug 1, 2011 at 12:06 PM, Vitalii Tymchyshyn tiv...@gmail.com wrote: 31.07.11 16:51, Robert Ayrapetyan написав(ла): Hello. I've found strange behavior of my pg installation (tested both 8.4 and 9.0 - they behave same) on FreeBSD platform. In short - when some table have PK on bigint field - COPY to that table from file becomes slower and slower as table grows. When table reaches ~5GB - COPY of 100k records may take up to 20 mins. I've experimented with all params in configs, moved indexes to separate hdd etc - nothing made any improvement. However, once I'm dropping 64 bit PK - COPY of 100k records passes in seconds. Interesting thing - same table has other indexes, including composite ones, but none of them include bigint fields, that's why I reached decision that bug connected with indexes on bigint fields only. I did see this behavior, but as for me it occurs for UNIQUE indexes only (including PK), not dependent on field type. You can check this by dropping PK and creating it as a regular non-unique index. Best regards, Vitalii Tymchyshyn -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance die when COPYing to table with bigint PK
Seems this assumption is not right. Just created simple index on bigint column - situation with huge performance degradation repeated. Dropping this index solved COPY issues on the fly. So I'm still convinced - this bug relates to FreeBSD 64-bit + UFS + bigint column index (some of these may be superfluous, but I have no resources to check on different platforms with different filesystems). On Mon, Aug 1, 2011 at 12:15 PM, Robert Ayrapetyan robert.ayrapet...@comodo.com wrote: Quite possible. But anyway - I don't think performance degradation must be so huge in case of using UNIQUE indexes. On Mon, Aug 1, 2011 at 12:06 PM, Vitalii Tymchyshyn tiv...@gmail.com wrote: 31.07.11 16:51, Robert Ayrapetyan написав(ла): Hello. I've found strange behavior of my pg installation (tested both 8.4 and 9.0 - they behave same) on FreeBSD platform. In short - when some table have PK on bigint field - COPY to that table from file becomes slower and slower as table grows. When table reaches ~5GB - COPY of 100k records may take up to 20 mins. I've experimented with all params in configs, moved indexes to separate hdd etc - nothing made any improvement. However, once I'm dropping 64 bit PK - COPY of 100k records passes in seconds. Interesting thing - same table has other indexes, including composite ones, but none of them include bigint fields, that's why I reached decision that bug connected with indexes on bigint fields only. I did see this behavior, but as for me it occurs for UNIQUE indexes only (including PK), not dependent on field type. You can check this by dropping PK and creating it as a regular non-unique index. Best regards, Vitalii Tymchyshyn -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Performance die when COPYing to table with bigint PK
Hello. I've found strange behavior of my pg installation (tested both 8.4 and 9.0 - they behave same) on FreeBSD platform. In short - when some table have PK on bigint field - COPY to that table from file becomes slower and slower as table grows. When table reaches ~5GB - COPY of 100k records may take up to 20 mins. I've experimented with all params in configs, moved indexes to separate hdd etc - nothing made any improvement. However, once I'm dropping 64 bit PK - COPY of 100k records passes in seconds. Interesting thing - same table has other indexes, including composite ones, but none of them include bigint fields, that's why I reached decision that bug connected with indexes on bigint fields only. In terms of IO picture is following: after copy started gstat shows 100% load on index partition (as I mentioned above - I've tried separate hdd to keep index tablespace), large queue (over 2k elements), and constant slow write on speed of ~2MB\s. Hdd becomes completely unresponsive, even ls on empty folder hangs for minute or so. To avoid thoughts like your hdd is slow, you haven't tuned postgresql.conf etc - all slowness dissapears with drop of bigint PK, same time other indexes on same table remain alive. And yes - I've tried drop PK \ recreate PK, vacuum full analyze and all other things - nothing helped, only drop helps. Is this known and expected behavior? -- Ayrapetyan Robert, Comodo Anti-Malware Data Processing Analysis and Management System (CAMDPAMS) http://repo-qa.camdpams.odessa.office.comodo.net/mediawiki/index.php -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance