Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Marti Raudsepp
On Fri, Jan 27, 2012 at 06:31, sridhar bamandlapally
sridhar@gmail.com wrote:
 --
 | Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
 --
 |   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
 |   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
 --

Sorry to take this off topic, but... Seriously, over 3 minutes to read
944 MB of data? That's less than 5 MB/s, what's wrong with your
database? :)

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Vitalii Tymchyshyn

27.01.12 11:06, Marti Raudsepp написав(ла):

On Fri, Jan 27, 2012 at 06:31, sridhar bamandlapally
sridhar@gmail.com  wrote:

--
| Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
--
|   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
|   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
--

Sorry to take this off topic, but... Seriously, over 3 minutes to read
944 MB of data? That's less than 5 MB/s, what's wrong with your
database? :)
Actually I'd ask how parallel CPU may help table sequence scan? Usually 
sequence scan does not take large amount of cpu time, so I see no point 
in parallelism.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Tomas Vondra
On 27 Leden 2012, 10:06, Marti Raudsepp wrote:
 On Fri, Jan 27, 2012 at 06:31, sridhar bamandlapally
 sridhar@gmail.com wrote:
 --
 | Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time
 |
 --
 |   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13
 |
 |   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13
 |
 --

 Sorry to take this off topic, but... Seriously, over 3 minutes to read
 944 MB of data? That's less than 5 MB/s, what's wrong with your
 database? :)

Yes, those results are quite suspicious. There's probably something
interfering with the queries (other queries, different processes, block
cleanout, ...) or maybe this is purely due to caching.

sridhar, run the queries repeatedly and my quess is the difference will
disappear (and the fist query will be a bit faster I guess).

Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Thomas Kellerer

sridhar bamandlapally, 27.01.2012 05:31:

SQL explain plan for select * from hr.emp ;
Explained.
PLAN
--
| Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
--
|   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
|   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
--
*WITH PARALLEL Option*
SQL explain plan for select /*+parallel(emp,4)*/ * from hr.emp ;
Explained.
PLAN
-
| Id  | Operation| Name | Rows  | Bytes | Cost (%CPU)| Time 
|
-
|   0 | SELECT STATEMENT |  |  7444K|   944M|  4442   (3)| 00:00:54 
|
|   1 |  PX COORDINATOR  |  |   |   ||  
|
|   2 |   PX SEND QC (RANDOM)| :TQ1 |  7444K|   944M|  4442   (3)| 00:00:54 
|
|   3 |PX BLOCK ITERATOR |  |  7444K|   944M|  4442   (3)| 00:00:54 
|
|   4 | TABLE ACCESS FULL| EMP  |  7444K|   944M|  4442   (3)| 00:00:54 
|
-

In the above plan ( WITH PARALLEL Option )
1. Cost has been nearly reduced to 1/4th
2. CPU has been reduced
3. Time has been nearly reduced to 1/3rd


I have *never* seen the time column in the explain plan output come anywhere 
near the actual execution time in Oracle.




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread sridhar bamandlapally
For security reasons, I cannot put real-time senario into loop,

the one which I gave is an example, if I have solution for this then
same will be applied for real-time senario

We have implemented PARALLEL technology into our database and same
expecting after migration to PostgreSQL,

The real-time SQL-Query is hiting 18000 times per day, and PARALLEL
option gave us great performance and big window for all other process

Concept is, we need window for every process on database and all
together should fit in our window and time-line.

We think PostgreSQL should also upgrade PARALLEL technology at SQL-Query level





On 1/27/12, Thomas Kellerer spam_ea...@gmx.net wrote:
 sridhar bamandlapally, 27.01.2012 05:31:
 SQL explain plan for select * from hr.emp ;
 Explained.
 PLAN
 --
 | Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
 --
 |   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
 |   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
 --
 *WITH PARALLEL Option*
 SQL explain plan for select /*+parallel(emp,4)*/ * from hr.emp ;
 Explained.
 PLAN
 -
 | Id  | Operation| Name | Rows  | Bytes | Cost (%CPU)|
 Time |
 -
 |   0 | SELECT STATEMENT |  |  7444K|   944M|  4442   (3)|
 00:00:54 |
 |   1 |  PX COORDINATOR  |  |   |   ||
  |
 |   2 |   PX SEND QC (RANDOM)| :TQ1 |  7444K|   944M|  4442   (3)|
 00:00:54 |
 |   3 |PX BLOCK ITERATOR |  |  7444K|   944M|  4442   (3)|
 00:00:54 |
 |   4 | TABLE ACCESS FULL| EMP  |  7444K|   944M|  4442   (3)|
 00:00:54 |
 -

 In the above plan ( WITH PARALLEL Option )
 1. Cost has been nearly reduced to 1/4th
 2. CPU has been reduced
 3. Time has been nearly reduced to 1/3rd

 I have *never* seen the time column in the explain plan output come
 anywhere near the actual execution time in Oracle.




 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Tomas Vondra
On 27 Leden 2012, 13:16, sridhar bamandlapally wrote:
 For security reasons, I cannot put real-time senario into loop,

The point of Marti's comment was that the estimates (as presented by
EXPLAIN PLAN FOR in Oracle) are inherently imprecise. Don't trust what
Oracle is telling you about the expected runtime of the queries. Run the
queries repeatedly to see.

 the one which I gave is an example, if I have solution for this then
 same will be applied for real-time senario

There's no way to execute a single query in a parallel manner and it won't
be available anytime soon.

This is not an issue unless you have a CPU bound query and you have unused
CPUs. That's not the case of your example, because the sequential scan is
likely to be I/O bound, thus executing it in parallel won't fix the issue.

 We have implemented PARALLEL technology into our database and same
 expecting after migration to PostgreSQL,

Why? Have you tried to run the query on PostgreSQL?

 The real-time SQL-Query is hiting 18000 times per day, and PARALLEL
 option gave us great performance and big window for all other process

Are we still discussing the example you've posted? Because this 18k hits
per day means running the query every 5 seconds. And if the query takes
more than a few seconds, there will be multiple queries running
concurrently, thus eating CPUs.

 Concept is, we need window for every process on database and all
 together should fit in our window and time-line.

Not sure what you mean by window or time-line?

 We think PostgreSQL should also upgrade PARALLEL technology at SQL-Query
 level

That is currently discussed in other threads, but it won't happen any time
soon (a few years in the future, maybe).

Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-27 Thread Cédric Villemain
 the one which I gave is an example, if I have solution for this then
 same will be applied for real-time senario

 There's no way to execute a single query in a parallel manner and it won't
 be available anytime soon.

 This is not an issue unless you have a CPU bound query and you have unused
 CPUs. That's not the case of your example, because the sequential scan is
 likely to be I/O bound, thus executing it in parallel won't fix the issue.

it is possible to emulate with plproxy for example.


 We have implemented PARALLEL technology into our database and same
 expecting after migration to PostgreSQL,

 Why? Have you tried to run the query on PostgreSQL?

premature optimization ...


 The real-time SQL-Query is hiting 18000 times per day, and PARALLEL
 option gave us great performance and big window for all other process

 Are we still discussing the example you've posted? Because this 18k hits
 per day means running the query every 5 seconds. And if the query takes
 more than a few seconds, there will be multiple queries running
 concurrently, thus eating CPUs.

agreed.


 Concept is, we need window for every process on database and all
 together should fit in our window and time-line.

 Not sure what you mean by window or time-line?

 We think PostgreSQL should also upgrade PARALLEL technology at SQL-Query
 level

 That is currently discussed in other threads, but it won't happen any time
 soon (a few years in the future, maybe).

at the SQL level, I don't see the immediate benefit given that the
feature is not implemented: SQL level stuff  (planner hint) are here
to workaround what the server can not handle on its own. And
PostgreSQL policiy is not to allow planner hint, but to fix/improve
the server.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-26 Thread sridhar bamandlapally
*Hi ALL*
**
*Please have a look into this,*
*this may help us to think on PARALLEL option*
**
*WITHOUT PARALLEL Option*
SQL explain plan for select * from hr.emp ;
Explained.
PLAN
--
| Id  | Operation | Name | Rows  | Bytes | Cost (%CPU)| Time |
--
|   0 | SELECT STATEMENT  |  |  7444K|   944M| 16077   (4)| 00:03:13 |
|   1 |  TABLE ACCESS FULL| EMP  |  7444K|   944M| 16077   (4)| 00:03:13 |
--


*WITH PARALLEL Option*
SQL explain plan for select /*+parallel(emp,4)*/ * from hr.emp ;
Explained.
PLAN
-
| Id  | Operation| Name | Rows  | Bytes | Cost (%CPU)|
Time |
-
|   0 | SELECT STATEMENT |  |  7444K|   944M|  4442   (3)|
00:00:54 |
|   1 |  PX COORDINATOR  |  |   |   |
|  |
|   2 |   PX SEND QC (RANDOM)| :TQ1 |  7444K|   944M|  4442   (3)|
00:00:54 |
|   3 |PX BLOCK ITERATOR |  |  7444K|   944M|  4442   (3)|
00:00:54 |
|   4 | TABLE ACCESS FULL| EMP  |  7444K|   944M|  4442   (3)|
00:00:54 |
-

In the above plan ( WITH PARALLEL Option )
1. Cost has been nearly reduced to 1/4th
2. CPU has been reduced
3. Time has been nearly reduced to 1/3rd




On Thu, Jan 26, 2012 at 2:24 AM, Claudio Freire klaussfre...@gmail.comwrote:

 On Wed, Jan 25, 2012 at 5:16 PM, Merlin Moncure mmonc...@gmail.com
 wrote:
  On Wed, Jan 25, 2012 at 7:43 AM, Claudio Freire klaussfre...@gmail.com
 wrote:
  I know squat about how to implement this, but I've been considering
  picking the low hanging fruit on that tree and patching up PG to try
  the concept. Many of the items above would require a thread-safe
  execution engine, which may be quite hard to get and have a
  significant performance hit. Some don't, like parallel sort.
 
  This was just discussed on -hackers yesterday -- see thread
  'multithreaded query planner'.  In short, judging by the comments of
  some of the smartest people working on this project, it sounds like
  using threads to attack this is not going to happen, ever.  Note you
  can probably still get parallel execution in other ways, using
  processes, shared memory, etc, so I'd consider researching in that
  direction.

 If you mean this[0] thread, it doesn't show anything conclusive
 against, say, parallel sort or pipelining.

 But I agree, checking the code, it would be really tough to get any
 more than parallel sorting by primitive types with threads.

 Processes, however, show promise.

 [0] http://archives.postgresql.org/pgsql-hackers/2012-01/msg00734.php



Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-25 Thread sridhar bamandlapally
Hi Everyone

I just want to illustrate an idea may possible for bringing up
parallel process in PostgreSQL at SQL-Query level

The PARALLEL option in Oracle really give great improvment in
performance, multi-thread concept has great possibilities

In Oracle we have hints ( see below ) :
SELECT /*+PARALLEL( e, 2 )*/ e.* FROM EMP e ;

PostgreSQL ( may if possible in future ) :
SELECT e.* FROM EMP PARALLEL ( e, 2) ;


*Note: The below syntax does not work with any PostgreSQL versions
PostgreSQL Syntax for SELECT ( with PARALLEL )

[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
* | expression [ [ AS ] output_name ] [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY expression [, ...] ]
[ HAVING condition [, ...] ]
[ WINDOW window_name AS ( window_definition ) [, ...] ]
[ PARALLEL (alias | table | index |segment ,  no. of threads ) ]
[ { UNION | INTERSECT | EXCEPT } [ ALL ] select ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS {
FIRST | LAST } ] [, ...] ]
[ LIMIT { count | ALL } ]
[ OFFSET start [ ROW | ROWS ] ]
[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
[ FOR { UPDATE | SHARE } [ OF table_name [, ...] ] [ NOWAIT ] [...] ]




On 1/24/12, ashish nauriyal anauri...@gmail.com wrote:
 Thoughts by Bruce Momjian on Parallel execution in PostgreSQL...

 http://momjian.us/main/blogs/pgblog/2011.html#December_5_2011

 You can give your thoughts on the blog itself

 Thanks,
 Ashish Nauriyal


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-25 Thread Claudio Freire
On Wed, Jan 25, 2012 at 6:18 AM, sridhar bamandlapally
sridhar@gmail.com wrote:
 I just want to illustrate an idea may possible for bringing up
 parallel process in PostgreSQL at SQL-Query level

 The PARALLEL option in Oracle really give great improvment in
 performance, multi-thread concept has great possibilities

 In Oracle we have hints ( see below ) :
 SELECT /*+PARALLEL( e, 2 )*/ e.* FROM EMP e ;

 PostgreSQL ( may if possible in future ) :
 SELECT e.* FROM EMP PARALLEL ( e, 2) ;

It makes little sense (and is contrary to pg policy of no hinting) to
do it like that.

In fact, I've been musing for a long time on leveraging pg's
sophisticated planner to do the parallelization:
 * Synchroscan means whenever a table has to be scanned twice, it can
be done with two threads.
 * Knowing whether a scan will hit mostly disk or memory can help in
deciding whether to do them in parallel or not (memory can be
parallelized, interleaved memory access isn't so bad, but interleaved
disk access is disastrous)
 * Big sorts can be parallelized quite easily
 * Number of threads to use can be a tunable or automatically set to
the number of processors on the system
 * Pipelining is another useful plan transformation: parallelize
I/O-bound nodes with CPU-bound ones.

I know squat about how to implement this, but I've been considering
picking the low hanging fruit on that tree and patching up PG to try
the concept. Many of the items above would require a thread-safe
execution engine, which may be quite hard to get and have a
significant performance hit. Some don't, like parallel sort.

Also, it is necessary to notice that parallelization will create some
priority inversion issues. Simple, non-parallelizable queries will
suffer from resource starvation when contending against more complex,
parallelizable ones.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-25 Thread sridhar bamandlapally
Yes

Hint method is an alternative solution which does not appear to be
exclusive parallelism solution as it is included in comment block and have
no error handling,
and this could be one of the reason against PG policy

Parameter method ( which we are thinking about ) can be very exclusive
parallelism solution
with proper error handling as it is part of SQL-Query syntax

On Wed, Jan 25, 2012 at 7:13 PM, Claudio Freire klaussfre...@gmail.comwrote:

 On Wed, Jan 25, 2012 at 6:18 AM, sridhar bamandlapally
 sridhar@gmail.com wrote:
  I just want to illustrate an idea may possible for bringing up
  parallel process in PostgreSQL at SQL-Query level
 
  The PARALLEL option in Oracle really give great improvment in
  performance, multi-thread concept has great possibilities
 
  In Oracle we have hints ( see below ) :
  SELECT /*+PARALLEL( e, 2 )*/ e.* FROM EMP e ;
 
  PostgreSQL ( may if possible in future ) :
  SELECT e.* FROM EMP PARALLEL ( e, 2) ;

 It makes little sense (and is contrary to pg policy of no hinting) to
 do it like that.

 In fact, I've been musing for a long time on leveraging pg's
 sophisticated planner to do the parallelization:
  * Synchroscan means whenever a table has to be scanned twice, it can
 be done with two threads.
  * Knowing whether a scan will hit mostly disk or memory can help in
 deciding whether to do them in parallel or not (memory can be
 parallelized, interleaved memory access isn't so bad, but interleaved
 disk access is disastrous)
  * Big sorts can be parallelized quite easily
  * Number of threads to use can be a tunable or automatically set to
 the number of processors on the system
  * Pipelining is another useful plan transformation: parallelize
 I/O-bound nodes with CPU-bound ones.

 I know squat about how to implement this, but I've been considering
 picking the low hanging fruit on that tree and patching up PG to try
 the concept. Many of the items above would require a thread-safe
 execution engine, which may be quite hard to get and have a
 significant performance hit. Some don't, like parallel sort.

 Also, it is necessary to notice that parallelization will create some
 priority inversion issues. Simple, non-parallelizable queries will
 suffer from resource starvation when contending against more complex,
 parallelizable ones.



Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-25 Thread Merlin Moncure
On Wed, Jan 25, 2012 at 7:43 AM, Claudio Freire klaussfre...@gmail.com wrote:
 I know squat about how to implement this, but I've been considering
 picking the low hanging fruit on that tree and patching up PG to try
 the concept. Many of the items above would require a thread-safe
 execution engine, which may be quite hard to get and have a
 significant performance hit. Some don't, like parallel sort.

This was just discussed on -hackers yesterday -- see thread
'multithreaded query planner'.  In short, judging by the comments of
some of the smartest people working on this project, it sounds like
using threads to attack this is not going to happen, ever.  Note you
can probably still get parallel execution in other ways, using
processes, shared memory, etc, so I'd consider researching in that
direction.

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PostgreSQL Parallel Processing !

2012-01-25 Thread Claudio Freire
On Wed, Jan 25, 2012 at 5:16 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Jan 25, 2012 at 7:43 AM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 I know squat about how to implement this, but I've been considering
 picking the low hanging fruit on that tree and patching up PG to try
 the concept. Many of the items above would require a thread-safe
 execution engine, which may be quite hard to get and have a
 significant performance hit. Some don't, like parallel sort.

 This was just discussed on -hackers yesterday -- see thread
 'multithreaded query planner'.  In short, judging by the comments of
 some of the smartest people working on this project, it sounds like
 using threads to attack this is not going to happen, ever.  Note you
 can probably still get parallel execution in other ways, using
 processes, shared memory, etc, so I'd consider researching in that
 direction.

If you mean this[0] thread, it doesn't show anything conclusive
against, say, parallel sort or pipelining.

But I agree, checking the code, it would be really tough to get any
more than parallel sorting by primitive types with threads.

Processes, however, show promise.

[0] http://archives.postgresql.org/pgsql-hackers/2012-01/msg00734.php

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance