Re: [HACKERS] Parallel Seq Scan

Jim Nasby Tue, 27 Jan 2015 15:44:57 -0800

On 1/26/15 11:11 PM, Amit Kapila wrote:

On Tue, Jan 27, 2015 at 3:18 AM, Jim Nasby wrote:
 >
 > On 1/23/15 10:16 PM, Amit Kapila wrote:
 >>
 >> Further, if we want to just get the benefit of parallel I/O, then
 >> I think we can get that by parallelising partition scan where different
 >> table partitions reside on different disk partitions, however that is
 >> a matter of separate patch.
 >
 >
 > I don't think we even have to go that far.
 >
 >
 > We'd be a lot less sensitive to IO latency.
 >
 > I wonder what kind of gains we would see if every SeqScan in a query spawned
 > a worker just to read tuples and shove them in a queue (or shove a pointer to
 > a buffer in the queue).
 >


IIUC, you are suggesting that a single parallel worker does only the read,
while all expression evaluation (qualification and target list) stays in
the main backend. It seems to me that, done that way, the benefit of
parallelisation would be lost to tuple communication overhead. (The
overhead might be lower if we just pass a pointer to the buffer, but that
brings problems of its own, such as holding buffer pins for a longer
period of time.)

I can see the advantage of testing along the lines suggested by Tom Lane,
but that does not seem directly related to what we want to achieve with
this patch (parallel seq scan). If you think otherwise, let me know.


There's some low-hanging fruit when it comes to improving our IO performance 
(or more specifically, decreasing our sensitivity to IO latency). Perhaps the 
way to do that is with the parallel infrastructure, perhaps not. But I think 
it's premature to look at parallelism for increasing IO performance, or to 
worry about things like how many IO threads we should have, before we at 
least look at simpler things we could do. We shouldn't assume there's nothing 
to be gained short of a full parallelization implementation.

That's not to say there's nothing else we could use parallelism for. Sort, 
merge and hash operations come to mind.
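
To make the "worker that just reads and shoves things in a queue" idea from
upthread a bit more concrete, here is a rough sketch. It is plain Python, not
backend code, and every name in it (read_pages, prefetching_scan, depth) is
made up for illustration; nothing like this exists in the tree. It only shows
the shape: one helper reads fixed-size pages ahead of the consumer, and a
bounded queue limits how far ahead it can get. Whole pages are queued rather
than individual tuples, which is one way to keep the per-item communication
cost down.

```python
import queue
import threading

_DONE = object()  # sentinel: the reader has reached end of file


def read_pages(path, page_size, out_q):
    """Reader worker: pull fixed-size pages off disk and queue them."""
    try:
        with open(path, "rb") as f:
            while True:
                page = f.read(page_size)
                if not page:
                    break
                # put() blocks while the queue is full, so the reader
                # never gets more than `depth` pages ahead of the consumer.
                out_q.put(page)
        out_q.put(_DONE)
    except OSError as exc:
        # Hand I/O errors to the consumer instead of losing them.
        out_q.put(exc)


def prefetching_scan(path, page_size=8192, depth=16):
    """Yield pages in file order while a helper thread reads ahead."""
    q = queue.Queue(maxsize=depth)
    reader = threading.Thread(
        target=read_pages, args=(path, page_size, q), daemon=True
    )
    reader.start()
    while True:
        item = q.get()
        if item is _DONE:
            break
        if isinstance(item, BaseException):
            raise item
        yield item  # the consumer does all "expression evaluation" here
    reader.join()
```

The consumer would just iterate, e.g. `for page in prefetching_scan(path):`,
doing its own processing while the helper waits on the next read. Whether the
queue handoff costs more than the latency it hides is exactly the open
question, and this sketch does nothing to answer it.
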
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


