Hello Gregory, Gregory Stark wrote:
Oracle is using Direct I/O so they need the reader and writer threads to avoid blocking on i/o all the time. We count on the OS doing readahead and buffering our writes so we don't have to. Direct I/O and needing some way to do asynchronous writes and reads are directly tied.
Yeah, except in cases where we can tell ahead non-sequential reads. Which admittedly doesn't come up too frequently and can probably be handled with posix_fadvice - as you are currently testing.
Where Parallel query is useful is when you have queries that involve a substantial amount of cpu resources, especially if you have a very fast I/O system which can saturate the bandwidth to a single cpu.
Full ACK, the very same applies to parallel querying on shared-nothing clusters. Those can help if the bandwidth to all processing cores together becomes the bottleneck (and the resulting data is relatively small compared to the input data).
For example, Sun's UltraSparc T2 features only 8 PCIe lanes for those 8 cores, so you end up with 250 MiB/sec per core or about 32 MiB/sec per thread on average. To be fair: their 10 Gig Ethernet ports don't go via PCIe, so you get an additional 2x 1 GiB/sec for the complete chip. And memory bandwidth looks a lot better: Sun claims 60+ GiB/sec, leaving almost 8 GiB/sec per core or 1 GiB/sec per thread.
If my calculations for Intel are correct, a Quad Xeon with a 1.33 GHz FSB has around 21 GiB/sec throughput to main memory, giving 5 GiB/sec per core. (Why are these numbers so hard to find? It looks like Intel deliberately obfuscates them with FSB MHz or Giga-transactions per sec and the like.)
Regards Markus ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend