Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
There is a switch-like SQL CASE: 39.6.2.4. Simple CASE

CASE search-expression
    WHEN expression [, expression [ ... ]] THEN
        statements
    [ WHEN expression [, expression [ ... ]] THEN
        statements
      ... ]
    [ ELSE
        statements ]
END CASE;

It should work like a C switch statement.

Also, for bulk insert, have you tried "for each statement" triggers instead of "for each row"? This would look like a lot of inserts and would not be fast in the single-row-insert case, but can give you a benefit for huge inserts. It should look like

insert into quotes_2012_09_10 select * from new
    where cast(new.received_time as date) = '2012-09-10';
insert into quotes_2012_09_11 select * from new
    where cast(new.received_time as date) = '2012-09-11';
...

2012/12/27 Stephen Frost

> * Jeff Janes (jeff.ja...@gmail.com) wrote:
> > If the main goal is to make it faster, I'd rather see all of plpgsql get
> > faster, rather than just a special case of partitioning triggers. For
> > example, right now a CASE statement with 100 branches is about
> > the same speed as an equivalent list of 100 elsif. So it seems to be doing
> > a linear search, when it could be doing a hash that should be a lot faster.
>
> That's a nice thought, but I'm not sure that it'd really be practical.
> CASE statements in plpgsql are completely general and really behave more
> like an if/elsif tree than a C-style switch() statement or similar. For
> one thing, the expressions need not use the same variables, could be
> complex multi-variable conditionals, etc.
>
> Figuring out that you could build a dispatch table for a given CASE
> statement and then building it, storing it, and remembering to use it,
> wouldn't be cheap.
>
> On the other hand, I've actually *wanted* a simpler syntax on occasion.
> I have no idea if there'd be a way to make it work, but this would be
> kind of nice:
>
> CASE OF x -- or whatever
>   WHEN 1 THEN blah blah
>   WHEN 2 THEN blah blah
>   WHEN 3 THEN blah blah
> END
>
> which would be possible to build into a dispatch table by looking at the
> type of x and the literals used in the overall CASE statement. Even so,
> there would likely be some number of WHEN conditions required before
> it'd actually be more efficient to use, though perhaps getting rid of
> the expression evaluation (if that'd be possible) would make up for it.
>
> Thanks,
>
> Stephen

--
Best regards,
Vitalii Tymchyshyn
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
BTW: If "select count(*) from new" is fast, you can even choose the strategy in the trigger depending on the insert size.

2012/12/28 Vitalii Tymchyshyn

> There is a switch-like SQL CASE: 39.6.2.4. Simple CASE
>
> CASE search-expression
>     WHEN expression [, expression [ ... ]] THEN
>         statements
>     [ WHEN expression [, expression [ ... ]] THEN
>         statements
>       ... ]
>     [ ELSE
>         statements ]
> END CASE;
>
> It should work like a C switch statement.
>
> Also, for bulk insert, have you tried "for each statement" triggers
> instead of "for each row"?
> This would look like a lot of inserts and would not be fast in the
> single-row-insert case, but can give you a benefit for huge inserts.
> It should look like
>
> insert into quotes_2012_09_10 select * from new
>     where cast(new.received_time as date) = '2012-09-10';
> insert into quotes_2012_09_11 select * from new
>     where cast(new.received_time as date) = '2012-09-11';
> ...
>
> 2012/12/27 Stephen Frost
>
>> [...]

--
Best regards,
Vitalii Tymchyshyn
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Vitalii,

* Vitalii Tymchyshyn (tiv...@gmail.com) wrote:
> There is a switch-like SQL CASE:
[...]
> It should work like a C switch statement.

It does and it doesn't. It behaves generally like a C switch statement, but is much more flexible and therefore can't be optimized the way a C switch statement can be.

Thanks,

Stephen
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Hello

> Also, for bulk insert, have you tried "for each statement" triggers instead
> of "for each row"?
> This would look like a lot of inserts and would not be fast in the
> single-row-insert case, but can give you a benefit for huge inserts.
> It should look like
> insert into quotes_2012_09_10 select * from new
>     where cast(new.received_time as date) = '2012-09-10';
> insert into quotes_2012_09_11 select * from new
>     where cast(new.received_time as date) = '2012-09-11';
> ...

It has only one problem - PostgreSQL does not have the NEW and OLD relations for statement triggers.

Regards

Pavel

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
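For readers finding this thread later: the limitation Pavel describes was real when this was written, but PostgreSQL 10 added transition tables, which give a statement-level trigger exactly the NEW relation Vitalii's sketch assumes. A minimal sketch, under the assumptions that the table and column names follow the thread's example and that the parent is used as a staging table (an AFTER trigger cannot suppress the original insert into the table it fires on):

```sql
-- Sketch only: requires PostgreSQL 10+ (transition tables).
-- Table/column names follow the thread's example; the partition set
-- and the staging-table pattern are assumptions.
CREATE OR REPLACE FUNCTION quotes_redirect() RETURNS trigger AS $$
BEGIN
    -- "new_rows" is the statement-level equivalent of NEW: all rows
    -- inserted by the triggering statement, visible as one relation.
    INSERT INTO quotes_2012_09_10
        SELECT * FROM new_rows WHERE received_time::date = DATE '2012-09-10';
    INSERT INTO quotes_2012_09_11
        SELECT * FROM new_rows WHERE received_time::date = DATE '2012-09-11';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER quotes_redirect_trg
    AFTER INSERT ON quotes_staging
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT
    EXECUTE PROCEDURE quotes_redirect();
```

This fires once per INSERT statement instead of once per row, which is exactly the bulk-insert win discussed above.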
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Why so? The basic form "case lvalue when rvalue then out ... end" is much like a switch.
The "case when condition then out ... end" form is a different, more complex beast, but the first one is essentially a switch. If it is now transformed into
"case when lvalue = rvalue1 then out1 when lvalue = rvalue2 then out2 ... end",
then this can be optimized, and this would benefit many users, not only the ones that use partitioning.

2012/12/28 Stephen Frost

> Vitalii,
>
> * Vitalii Tymchyshyn (tiv...@gmail.com) wrote:
> > There is a switch-like SQL CASE:
> [...]
> > It should work like a C switch statement.
>
> It does and it doesn't. It behaves generally like a C switch statement,
> but is much more flexible and therefore can't be optimized the way a C
> switch statement can be.
>
> Thanks,
>
> Stephen

--
Best regards,
Vitalii Tymchyshyn
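The optimization Vitalii is proposing can be sketched outside PostgreSQL. Here is a toy model in Python (not PostgreSQL internals - the routing example and all names are invented for illustration): a simple CASE whose arms are all constants of one known type can be compiled into a single hash lookup, while the general searched form must test each arm in order.

```python
# Toy model of the proposed optimization; nothing here is PostgreSQL code.

def route_linear(day):
    """Linear scan: one comparison per WHEN arm, like today's PL/pgSQL CASE."""
    if day == "2012-09-10":
        return "quotes_2012_09_10"
    elif day == "2012-09-11":
        return "quotes_2012_09_11"
    else:
        return "quotes_overflow"

# Hash dispatch: one lookup regardless of the number of arms.  This is
# only valid because every arm is a constant of the same known type.
DISPATCH = {
    "2012-09-10": "quotes_2012_09_10",
    "2012-09-11": "quotes_2012_09_11",
}

def route_hashed(day):
    """Equivalent routing via a precomputed dispatch table."""
    return DISPATCH.get(day, "quotes_overflow")
```

With 100 arms, the linear form does up to 100 comparisons per call while the hash form still does one lookup, which is the gap Jeff's benchmark observed.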
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
It's a pity. Why is it not listed in the "Compatibility" section of the CREATE TRIGGER documentation? I think this makes "for each statement" triggers not compatible with SQL99.

2012/12/28 Pavel Stehule

> Hello
>
> > Also, for bulk insert, have you tried "for each statement" triggers instead
> > of "for each row"?
> > This would look like a lot of inserts and would not be fast in the
> > single-row-insert case, but can give you a benefit for huge inserts.
> > It should look like
> > insert into quotes_2012_09_10 select * from new
> >     where cast(new.received_time as date) = '2012-09-10';
> > insert into quotes_2012_09_11 select * from new
> >     where cast(new.received_time as date) = '2012-09-11';
> > ...
>
> It has only one problem - PostgreSQL does not have the NEW and OLD
> relations for statement triggers.
>
> Regards
>
> Pavel

--
Best regards,
Vitalii Tymchyshyn
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
2012/12/28 Vitalii Tymchyshyn :
> Why so? The basic form "case lvalue when rvalue then out ... end" is much
> like a switch.
> The "case when condition then out ... end" form is a different, more complex
> beast, but the first one is essentially a switch. If it is now transformed into
> "case when lvalue = rvalue1 then out1 when lvalue = rvalue2 then out2 ... end",
> then this can be optimized, and this would benefit many users, not only the
> ones that use partitioning.

Please look at the plpgsql source code. PL/pgSQL is very simple and has no arithmetic unit of its own - everything is transformed into SELECTs, with no optimization. But it is really short and maintainable.

These SELECTs are evaluated only when necessary - but they are evaluated by the PostgreSQL expression executor, not by PL/pgSQL directly - PL/pgSQL cannot process constants by itself.

So any enhancement needs a PL/pgSQL redesign, and I am not sure that this use case has a clear benefit, because the expression bottleneck is only one part of the partitioning-trigger bottleneck. Moreover, if you need really fast code, you can write your own code in C - and it will be 10x faster than any optimized PL/pgSQL code. And writing C triggers for PostgreSQL is not terrible work.

Using plpgsql row triggers for partitioning is not a good idea - it is just a workaround from my perspective, and we should solve the source of the problem - missing native support.

Regards

Pavel Stehule

> 2012/12/28 Stephen Frost
>> [...]
>
> --
> Best regards,
> Vitalii Tymchyshyn
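The "missing native support" Pavel asks for did eventually arrive: declarative partitioning, added in PostgreSQL 10, routes rows inside the server with no trigger at all. A sketch under the thread's schema - only received_time is taken from the thread; the extra columns and the range bounds are assumptions:

```sql
-- Sketch only: requires PostgreSQL 10+.  Columns beyond received_time
-- are invented for illustration.
CREATE TABLE quotes (
    received_time timestamptz NOT NULL,
    symbol        text,
    price         numeric
) PARTITION BY RANGE (received_time);

CREATE TABLE quotes_2012_09_10 PARTITION OF quotes
    FOR VALUES FROM ('2012-09-10') TO ('2012-09-11');
CREATE TABLE quotes_2012_09_11 PARTITION OF quotes
    FOR VALUES FROM ('2012-09-11') TO ('2012-09-12');

-- Inserts against the parent are routed by the server, not by PL/pgSQL:
INSERT INTO quotes VALUES ('2012-09-10 14:30:00+00', 'ABC', 12.34);
```

This sidesteps the whole CASE-dispatch question for the partitioning use case, exactly as Pavel suggests.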
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
On Friday, December 28, 2012, Vitalii Tymchyshyn wrote:

> There is a switch-like SQL CASE: 39.6.2.4. Simple CASE
>
> CASE search-expression
>     WHEN expression [, expression [ ... ]] THEN
>         statements
>     [ WHEN expression [, expression [ ... ]] THEN
>         statements
>       ... ]
>     [ ELSE
>         statements ]
> END CASE;
>
> It should work like a C switch statement.

I had thought that too, but the catch is that the target expressions do not need to be constants when the function is created. Indeed, they can even be volatile.

CREATE OR REPLACE FUNCTION foo(x integer) RETURNS integer AS $$
BEGIN
case x
    when 0 then return -5;
    when (random()*10)::integer then return 1;
    when (random()*10)::integer then return 2;
    when (random()*10)::integer then return 3;
    when (random()*10)::integer then return 4;
    when (random()*10)::integer then return 5;
    when (random()*10)::integer then return 6;
    when (random()*10)::integer then return 7;
    when (random()*10)::integer then return 8;
    when (random()*10)::integer then return 9;
    when (random()*10)::integer then return 10;
    else return -6;
end case;
END;
$$ LANGUAGE plpgsql;

Cheers,

Jeff
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
2012/12/28 Vitalii Tymchyshyn :
> Why so? The basic form "case lvalue when rvalue then out ... end" is much
> like a switch.

Sorry, to be honest, I missed that distinction and didn't expect that to work as-is, yet apparently it does. Does it currently perform the same as an if/elsif tree, or is it implemented to actually use a table lookup?

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
> Please look at the plpgsql source code. PL/pgSQL is very simple and has
> no arithmetic unit of its own - everything is transformed into SELECTs,
> with no optimization. But it is really short and maintainable.

I was thinking we'd actually do this for all CASE statements, those in plpgsql and those in regular SQL, if it's possible to do. Hopefully it'd be possible to do easily in plpgsql once the SQL-level CASE is done.

> These SELECTs are evaluated only when necessary - but they are evaluated
> by the PostgreSQL expression executor, not by PL/pgSQL directly -
> PL/pgSQL cannot process constants by itself.

Right, but I wonder if we could pass the entire CASE tree to the executor, with essentially pointers to the code blocks which will be executed, and get back a function which we can call over and over that takes whatever the parameter is and returns the 'right' pointer?

> So any enhancement needs a PL/pgSQL redesign, and I am not sure that this
> use case has a clear benefit, because the expression bottleneck is only
> one part of the partitioning-trigger bottleneck. Moreover, if you need
> really fast code, you can write your own code in C - and it will be 10x
> faster than any optimized PL/pgSQL code. And writing C triggers for
> PostgreSQL is not terrible work.

It's quite a bit of work for people who don't know C or are (understandably) concerned about writing things which can easily segfault the entire backend.

> Using plpgsql row triggers for partitioning is not a good idea - it is
> just a workaround from my perspective, and we should solve the source of
> the problem - missing native support.
I agree that native partitioning would certainly be nice. I was really hoping that was going to happen for 9.3, but it seems unlikely now (unless I've missed something).

Thanks,

Stephen
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
* Jeff Janes (jeff.ja...@gmail.com) wrote:
> I had thought that too, but the catch is that the target expressions do not
> need to be constants when the function is created. Indeed, they can even
> be volatile.

Right, any optimization in this regard would only work in certain instances - e.g. when the 'WHEN' components are all constants and the data type is something we can manage, etc.

Thanks,

Stephen
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
2012/12/28 Stephen Frost :
> 2012/12/28 Vitalii Tymchyshyn :
>> Why so? The basic form "case lvalue when rvalue then out ... end" is much
>> like a switch.
>
> Sorry, to be honest, I missed that distinction and didn't expect that to
> work as-is, yet apparently it does. Does it currently perform the same
> as an if/elsif tree or is it implemented to actually use a table lookup?

Both IF and CASE have very similar implementations - no table lookup is used - there is no special path for searching constants.

> * Pavel Stehule (pavel.steh...@gmail.com) wrote:
>> Please look at the plpgsql source code. PL/pgSQL is very simple and has
>> no arithmetic unit of its own - everything is transformed into SELECTs,
>> with no optimization. But it is really short and maintainable.
>
> I was thinking we'd actually do this for all CASE statements, those in
> plpgsql and those in regular SQL, if it's possible to do. Hopefully
> it'd be possible to do easily in plpgsql once the SQL-level CASE is
> done.

I am not sure - SQL CASE is not specially optimized either :( I see only one possible way: do most of the work when the CASE statement is parsed and bypass the executor. This can work, but I am afraid it could slow down the first start, and also some use cases where there are not too many paths, because we would have to check all paths before execution.
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
On Thursday, December 20, 2012, Scott Marlowe wrote:

> 3: Someone above mentioned rules being faster than triggers. In my
> experience they're WAY slower than triggers but maybe that was just on
> the older pg versions (8.3 and lower) we were doing this on. I'd be
> interested in seeing some benchmarks if rules have gotten faster or I
> was just doing it wrong.

It apparently depends on how you use them.

To load 1e6 rows into the parent, redistributing to 100 partitions (rows evenly distributed over partitions) using RULEs, it took 14.5 seconds using an "insert into foo select * from foo_tmp" (not counting the time it took to prepopulate foo_tmp via \copy).

This is about 25% faster than the 18.4 seconds it took to load the same data via \copy using a plpgsql trigger structured as nested IF ... ELSE ... END IF blocks that do a binary search over the partitions.

However, if I didn't use \copy or "insert into ... select", but rather used a Perl loop invoking normal single-row inserts (all in a single transaction) with DBD::Pg, then the RULEs took 596 seconds, an astonishing seven times slower than the 83 seconds it took the previously mentioned plpgsql trigger to do the same thing.

This was under 9.1.7.

In 9.2.2, it seems to get 3 times worse yet for RULEs in the insert loop. But that result seems hard to believe, so I am repeating it.

Cheers,

Jeff
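For context, the RULE approach Jeff benchmarked looks roughly like this (one conditional rule per partition; table and column names follow the thread's earlier example, and the rule bodies are a sketch, not Jeff's actual test code). It also suggests why single-row inserts fare so badly: rules rewrite each INSERT statement into one conditional INSERT per rule at query-rewrite time, so a bulk "insert ... select" amortizes that rewrite across a million rows, while a loop of single-row inserts pays for all the rewritten statements on every row.

```sql
-- Sketch of RULE-based partition routing; predicates and partition
-- names are assumptions based on the thread's example schema.
CREATE RULE quotes_insert_2012_09_10 AS
    ON INSERT TO quotes
    WHERE NEW.received_time::date = DATE '2012-09-10'
    DO INSTEAD
        INSERT INTO quotes_2012_09_10 VALUES (NEW.*);

CREATE RULE quotes_insert_2012_09_11 AS
    ON INSERT TO quotes
    WHERE NEW.received_time::date = DATE '2012-09-11'
    DO INSTEAD
        INSERT INTO quotes_2012_09_11 VALUES (NEW.*);

-- Caveat: with conditional DO INSTEAD rules, a row matching no rule's
-- WHERE clause falls through into the parent table.
```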
Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
Hello

2012/12/28 Luciano Ernesto da Silva :
> UNSUBSCRIBE

> On Thursday, December 20, 2012, Scott Marlowe wrote:
>
> 3: Someone above mentioned rules being faster than triggers. In my
> experience they're WAY slower than triggers but maybe that was just on
> the older pg versions (8.3 and lower) we were doing this on. I'd be
> interested in seeing some benchmarks if rules have gotten faster or I
> was just doing it wrong.

I am not sure, but I expect that the speed or slowness of rules depends primarily on the number of partitions - more significantly than for triggers.

Regards

Pavel

> [...]
> Cheers
>
> Jeff