> Does that include increasing the size of read/write blocks? I've
> noticed that with a large enough table it takes a while to do a
> sequential scan, even if it's cached; I wonder if the fact that it
> takes a million read(2) calls to get through an 8G table is part of
> that.
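
For scale, the "million read(2) calls" figure is just 8GB divided into 8K blocks. Here is a minimal sketch of that arithmetic, assuming one read(2) call per request and no short reads. The names read_calls and blocks_per_read are made up for illustration and are not PostgreSQL settings:

```python
# Rough count of read(2) calls needed to scan a table sequentially.
# Assumption: every call returns exactly the number of bytes requested.

GIB = 1024 ** 3


def read_calls(table_bytes: int, block_bytes: int = 8192,
               blocks_per_read: int = 1) -> int:
    """Return how many read calls cover table_bytes (rounded up)."""
    request_bytes = block_bytes * blocks_per_read
    return -(-table_bytes // request_bytes)  # ceiling division


if __name__ == "__main__":
    table = 8 * GIB
    # 1 block = 8K per call, 16 blocks = 128K per call, 128 blocks = 1M per call
    for blocks in (1, 16, 128):
        print(blocks, read_calls(table, blocks_per_read=blocks))
```

Going from one 8K block per call to 1MB requests cuts the count from 1,048,576 calls to 8,192. Whether that makes a scan of already cached data noticeably faster is a separate question that would need measuring.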

Actually, the OS already does some of that (readahead, etc.) if it does some sort of throttling or clubbing of reads and writes. But it's not enough for these types of workloads.

Here is what I think will help:

* Support for a different block size per TABLESPACE without recompiling the code. (At least support a different block size for the whole database without recompiling the code.)

* Support for WAL files bigger than 16MB WITHOUT recompiling the code. This should be a tunable if you ask me: with checkpoint_segments at 256 you have too many 16MB files in the log directory. (This will help OLTP benchmarks more, since they would no longer spend time rotating log files.)

* Introduce a multiblock or extent tunable where you can define a multiple of 8K (or of the block size tunable) to read a bigger chunk and store it in the buffer pool. (Maybe for writes too.) (Most devices now support chunks of up to 1MB for reads and writes.)

* There should be a way to preallocate files for TABLES in TABLESPACES. Otherwise, writing multiple tables in the same filesystem ends up with fragmented files, which causes poor "READS" from those files.

* With 64-bit, the 1GB file chunk size is also moot. Maybe it should be tunable too, to something like 100GB, without recompiling the code.

Why is recompiling bad? Most companies that will support Postgres will support their own binaries, and they won't want different versions of binaries for different block sizes, different WAL file sizes, etc. Hence getting more functionality out of the same set of binaries is more desirable in enterprise environments.

Regards,
Jignesh