On 10/11/2010 08:27 AM, Joshua Tolley wrote:

One thing a test program would have to take into account is multiple
concurrent users. What speeds up the single user case may well hurt the
multi user case, and the behaviors that hurt single user cases may have been
put in place on purpose to allow decent multi-user performance. Of course, all
of that is "might" and "maybe", and I can't prove any assertions about block
size either. But the fact of multiple users needs to be kept in mind.

Agreed. I've put together a simple test program to measure the effect of I/O chunk sizes. It only tests single-user performance, but it'd be pretty trivial to adapt it to spawn a couple of worker children or run several threads, each starting after a suitable delay, since it's rather uncommon to have a bunch of seqscans all fire off at once.
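
For illustration only, a multi-reader variant might look roughly like the sketch below. It is not what produced the numbers in this mail. The names read_whole_file() and run_staggered_readers() are made up, the children use plain buffered reads, and the stagger is a fixed number of seconds per worker, which is an assumption rather than anything tuned.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: read the whole file sequentially in chunk_bytes pieces.
// Returns the total number of bytes read, or -1 on error.
static long long read_whole_file(const char *path, size_t chunk_bytes) {
        int fd = open(path, O_RDONLY);
        if (fd == -1)
                return -1;
        char *buf = malloc(chunk_bytes);
        if (buf == NULL) {
                close(fd);
                return -1;
        }
        long long total = 0;
        ssize_t ret;
        while ((ret = read(fd, buf, chunk_bytes)) > 0)
                total += ret;
        free(buf);
        close(fd);
        return ret < 0 ? -1 : total;
}

// Hypothetical helper: fork nworkers children; child i sleeps
// i * stagger_secs seconds before scanning, so the scans don't all start at once.
// Returns how many children finished their scan successfully,
// or -1 if not all children could be started.
static int run_staggered_readers(const char *path, size_t chunk_bytes,
                                 int nworkers, unsigned stagger_secs) {
        int started = 0;
        for (int i = 0; i < nworkers; i++) {
                pid_t pid = fork();
                if (pid == -1)
                        break;
                if (pid == 0) {
                        // Child: wait for its turn, scan, report via exit status.
                        sleep((unsigned)i * stagger_secs);
                        _exit(read_whole_file(path, chunk_bytes) >= 0 ? 0 : 1);
                }
                started++;
        }
        int ok = 0;
        for (int i = 0; i < started; i++) {
                int status;
                if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
                        ok++;
        }
        return started == nworkers ? ok : -1;
}
```

Something like run_staggered_readers("test_file", 64 * 1024, 3, 2) would start three scans two seconds apart. Timing the whole call would then give a rough multi-user figure to compare against the single-reader runs.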

From this test it's pretty clear that with buffered I/O of an uncached 700 MB file under Linux, the I/O chunk size makes very little difference: every chunk size took between 9.8s and 10.1s to read the test file, with near-identical CPU utilization. Caches were dropped between each test run.
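
The runs here dropped caches system-wide through /proc/sys/vm/drop_caches, which needs root. As an alternative (not what I used), a per-file approach with posix_fadvise() could be sketched as below. The helper name drop_file_cache() is made up, and POSIX_FADV_DONTNEED is only advice to the kernel, so treat this as a best-effort sketch rather than a guaranteed cache flush.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of the attached test program): ask the kernel
// to drop the cached pages of a single file.
// Returns 0 if the advice was accepted, -1 on failure.
static int drop_file_cache(const char *path) {
        int fd = open(path, O_RDONLY);
        if (fd == -1)
                return -1;
        // Dirty pages generally can't be dropped, so try to flush them first.
        // The result is ignored on purpose: this step is best effort.
        (void) fsync(fd);
        // offset 0 with len 0 means "from the start to the end of the file".
        // posix_fadvise() returns an error number directly instead of setting errno.
        int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
        close(fd);
        return rc == 0 ? 0 : -1;
}
```

Calling drop_file_cache("test_file") before each timed run would avoid touching the rest of the page cache, at the cost of relying on the kernel to honour the hint.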

For direct I/O (by ORing the O_DIRECT flag into the open() flags), chunk size is *hugely* significant: 4k chunk reads of the test file took 38s, 8k 22s, 16k 14s, 32k 10.8s, and 64k through 1024k about 9.8s, with times rising a little again (to around 11-12s) above 1024k.
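
To make the buffered/direct switch concrete, here is a minimal sketch of a reader that takes the mode as a parameter and counts read() calls, since the number of calls is what grows as the chunk size shrinks. The name read_chunked() is made up, posix_memalign() is swapped in for the manual pointer alignment in the attached program, and the 4096-byte alignment is an assumption that happens to match the 4k alignment used there.

```c
#define _GNU_SOURCE   // needed for O_DIRECT with glibc on Linux
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: sequentially read `path` in chunk_bytes pieces.
// If use_direct is non-zero, O_DIRECT is ORed into the open() flags.
// On success, stores the number of read() calls that returned data in *calls
// (if calls is not NULL) and returns the total bytes read. Returns -1 on error.
static long long read_chunked(const char *path, size_t chunk_bytes,
                              int use_direct, long *calls) {
        int flags = O_RDONLY;
        if (use_direct)
                flags |= O_DIRECT;
        int fd = open(path, flags);
        if (fd == -1)
                return -1;
        // O_DIRECT generally wants the buffer address and transfer size aligned
        // to the device's block size; 4096 bytes is assumed here.
        void *buf = NULL;
        if (posix_memalign(&buf, 4096, chunk_bytes) != 0) {
                close(fd);
                return -1;
        }
        long long total = 0;
        long ncalls = 0;
        ssize_t ret;
        while ((ret = read(fd, buf, chunk_bytes)) > 0) {
                total += ret;
                ncalls++;
        }
        free(buf);
        close(fd);
        if (ret < 0)
                return -1;
        if (calls != NULL)
                *calls = ncalls;
        return total;
}
```

With 4k chunks a file needs sixteen times as many read() calls as with 64k chunks. With buffered I/O the kernel's readahead hides that, while with O_DIRECT every call turns into its own request to the device.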

Apparently Oracle is almost always configured to use direct I/O, so it would benefit massively from large chunk sizes. PostgreSQL is almost never used with direct I/O, and at least in terms of the low-level costs of syscalls and file system activity, shouldn't care at all about read chunk sizes.

Bumping readahead from 256 to 8192 made no significant difference for either case. Of course, I'm on a crappy laptop disk...

I'm guessing this is the origin of the OP's focus on I/O chunk sizes.

Anyway, for the single-seqscan case, I see little evidence here that using a bigger read chunk size would help PostgreSQL reduce overheads or improve performance.

OP: Is your Oracle instance using direct I/O?

--
Craig Ringer
#define _GNU_SOURCE

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

// 4k alignment
static const unsigned int ALIGN_SHIFT = 12;

static void usage() {
        printf("Usage: test chunksize_kb filename\n");
        exit(1);
}

int main(int argc, char * argv[]) {
        if (argc != 3) {
                usage();
        }
        char * end;
        long chunksize_kb = strtol(argv[1], &end, 10);
        if (end == argv[1] || *end != '\0' || chunksize_kb <= 0) {
                printf("Cannot parse chunk size as a positive number\n");
                usage();
        }
        long chunksize_bytes = chunksize_kb * 1024;
        int fd = open(argv[2], O_RDONLY|O_NOATIME|O_DIRECT);
        if (fd == -1) {
                perror("Unable to open input file");
                usage();
        }
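        // Over-allocate by one alignment unit, then round the pointer UP to the
        // next 4k boundary. Rounding down could land before the start of the
        // allocation and let read() write outside the buffer.
        unsigned long align = 1UL << ALIGN_SHIFT;
        void * buf = malloc(chunksize_bytes + align);
        if (buf == NULL) {
                perror("malloc");
                exit(1);
        }
        void * aligned_buf = (void*) ((((unsigned long)buf + align - 1)>>ALIGN_SHIFT)<<ALIGN_SHIFT);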
        ssize_t ret;
        do {
                ret = read(fd, aligned_buf, chunksize_bytes);
        } while (ret > 0);
        // On a clean EOF errno is normally still 0, so this prints "read: Success".
        perror("read");
}
[cr...@ayaki tmp]$ echo -n "direct=f Readahead: "; sudo blockdev --getra /dev/sda; for (( x = 2; x < 14; x++ )); do echo $((1<<x))kb blocks; echo 1 | sudo tee -a /proc/sys/vm/drop_caches >/dev/null; time ./io $((1<<x)) test_file; done
direct=f Readahead: 256
4kb blocks
read: Success

real    0m9.786s
user    0m0.034s
sys     0m1.840s
8kb blocks
read: Success

real    0m9.868s
user    0m0.020s
sys     0m1.698s
16kb blocks
read: Success

real    0m9.836s
user    0m0.009s
sys     0m1.591s
32kb blocks
read: Success

real    0m10.078s
user    0m0.007s
sys     0m1.691s
64kb blocks
read: Success

real    0m9.819s
user    0m0.008s
sys     0m1.688s
128kb blocks
read: Success

real    0m9.876s
user    0m0.002s
sys     0m1.685s
256kb blocks
read: Success

real    0m9.811s
user    0m0.001s
sys     0m1.643s
512kb blocks
read: Success

real    0m9.861s
user    0m0.001s
sys     0m1.671s
1024kb blocks
read: Success

real    0m9.811s
user    0m0.002s
sys     0m1.728s
2048kb blocks
read: Success

real    0m9.985s
user    0m0.001s
sys     0m1.820s
4096kb blocks
read: Success

real    0m9.845s
user    0m0.002s
sys     0m1.832s
8192kb blocks
read: Success

real    0m9.789s
user    0m0.000s
sys     0m1.921s
[cr...@ayaki tmp]$ echo -n "direct=f Readahead: "; sudo blockdev --getra /dev/sda; for (( x = 2; x < 14; x++ )); do echo $((1<<x))kb blocks; echo 1 | sudo tee -a /proc/sys/vm/drop_caches >/dev/null; time ./io $((1<<x)) test_file; done
direct=f Readahead: 4096
4kb blocks
read: Success

real    0m10.178s
user    0m0.015s
sys     0m1.167s
8kb blocks
read: Success

real    0m10.037s
user    0m0.006s
sys     0m1.180s
16kb blocks
read: Success

real    0m10.303s
user    0m0.004s
sys     0m1.230s
32kb blocks
read: Success

real    0m10.870s
user    0m0.003s
sys     0m1.148s
64kb blocks
read: Success

real    0m10.194s
user    0m0.004s
sys     0m1.080s
128kb blocks
read: Success

real    0m10.401s
user    0m0.004s
sys     0m1.233s
256kb blocks
read: Success

real    0m10.087s
user    0m0.000s
sys     0m1.202s
512kb blocks
read: Success

real    0m10.061s
user    0m0.001s
sys     0m1.280s
1024kb blocks
read: Success

real    0m10.161s
user    0m0.003s
sys     0m1.301s
2048kb blocks
read: Success

real    0m10.246s
user    0m0.000s
sys     0m1.291s
4096kb blocks
read: Success

real    0m9.996s
user    0m0.002s
sys     0m1.366s
8192kb blocks
read: Success

real    0m9.997s
user    0m0.001s
sys     0m1.599s
[cr...@ayaki tmp]$ echo -n "Readahead: "; sudo blockdev --getra /dev/sda; for (( x = 2; x < 14; x++ )); do echo $((1<<x))kb blocks; time ./io $((1<<x)) test_file; done
Readahead: 256
4kb blocks
read: Success

real    0m38.252s
user    0m0.045s
sys     0m12.866s
8kb blocks
read: Success

real    0m22.145s
user    0m0.021s
sys     0m8.383s
16kb blocks
read: Success

real    0m14.221s
user    0m0.012s
sys     0m5.489s
32kb blocks
read: Success

real    0m10.835s
user    0m0.004s
sys     0m4.500s
64kb blocks
read: Success

real    0m9.752s
user    0m0.003s
sys     0m3.658s
128kb blocks
read: Success

real    0m9.980s
user    0m0.000s
sys     0m3.513s
256kb blocks
read: Success

real    0m9.872s
user    0m0.001s
sys     0m3.245s
512kb blocks
read: Success

real    0m9.814s
user    0m0.000s
sys     0m3.173s
1024kb blocks
read: Success

real    0m9.880s
user    0m0.001s
sys     0m3.170s
2048kb blocks
read: Success

real    0m11.961s
user    0m0.000s
sys     0m3.380s
4096kb blocks
read: Success

real    0m11.374s
user    0m0.002s
sys     0m3.348s
8192kb blocks
read: Success

real    0m11.970s
user    0m0.000s
sys     0m3.312s
[cr...@ayaki tmp]$ echo -n "Readahead: "; sudo blockdev --getra /dev/sda; for (( x = 2; x < 14; x++ )); do echo $((1<<x))kb blocks; time ./io $((1<<x)) test_file; done
Readahead: 4096
4kb blocks
read: Success

real    0m37.465s
user    0m0.037s
sys     0m12.810s
8kb blocks
read: Success

real    0m21.776s
user    0m0.024s
sys     0m8.362s
16kb blocks
read: Success

real    0m14.085s
user    0m0.012s
sys     0m5.463s
32kb blocks
read: Success

real    0m10.661s
user    0m0.006s
sys     0m4.496s
64kb blocks
read: Success

real    0m9.778s
user    0m0.005s
sys     0m3.768s
128kb blocks
read: Success

real    0m9.730s
user    0m0.000s
sys     0m3.373s
256kb blocks
read: Success

real    0m10.371s
user    0m0.000s
sys     0m3.140s
512kb blocks
read: Success

real    0m9.847s
user    0m0.000s
sys     0m3.169s
1024kb blocks
read: Success

real    0m9.880s
user    0m0.000s
sys     0m3.098s
2048kb blocks
read: Success

real    0m11.124s
user    0m0.000s
sys     0m3.001s
4096kb blocks
read: Success

real    0m11.653s
user    0m0.000s
sys     0m2.765s
8192kb blocks
read: Success

real    0m11.503s
user    0m0.001s
sys     0m2.680s

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
