Re: [HACKERS] Memory Alignment in Postgres
On Wed, Sep 10, 2014 at 12:43 PM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Sep 9, 2014 at 10:08 AM, Arthur Silva arthur...@gmail.com wrote:
>> I'm continuously studying the Postgres codebase. Hopefully I'll be able to make some contributions in the future.
>> For now I'm intrigued by the extensive use of memory alignment. I'm sure there's some legacy reasoning behind it, and some architectures that require it.
>> That aside, since it wastes space (a lot of space in some cases) there must be a tipping point somewhere. I'm sure one can prove aligned access is faster in a micro-benchmark, but I'm not sure that's the case in a DBMS like Postgres, especially in the page/rows area.
>> Just for the sake of comparison, MySQL COMPACT storage (default and recommended since 5.5) doesn't align data at all. MySQL NDB uses a fixed 4-byte alignment. Not sure about Oracle and others.
>> Is it worth the extra space on newer architectures (especially Intel)? Do you guys think this is something worth looking at?
>
> Yes. At least in my opinion, though, it's not a good project for a beginner. If you get your changes to take effect, you'll find that a lot of things will break in places that are not easy to find or fix. You're getting into really low-level areas of the system that get touched infrequently and require a lot of expertise in how things work today to adjust.

I thought all memory alignment (or at least the bulk of it) was handled using some codebase-wide macros/settings; otherwise how could different parts of the code interoperate? Poking at this area might suffice for some initial testing to check whether it's worth any more attention.

Unaligned memory access received a lot of attention in the Intel post-Nehalem era, so it may very well pay off on Intel servers. You might find this blog post and its comments/external links interesting: http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/

I'm a newbie in the codebase, so please let me know if I'm saying anything nonsensical.

> The idea I've had before is to try to reduce the widest alignment we ever require from 8 bytes to 4 bytes. That is, look for types with typalign = 'd', and rewrite them to have typalign = 'i' by having them use two 4-byte loads to load an eight-byte value. In practice, I think this would probably save a high percentage of what can be saved, because 8-byte alignment implies a maximum of 7 bytes of wasted space, while 4-byte alignment implies a maximum of 3 bytes of wasted space. And it would probably be pretty cheap, too, because any type with less than 8 byte alignment wouldn't be affected at all, and even those types that were affected would only be slightly slowed down by doing two loads instead of one.
> In contrast, getting rid of alignment requirements completely would save a little more space, but probably at the cost of a lot more slowdown: any type with alignment requirements would have to fetch the value byte-by-byte instead of pulling the whole thing out at once.

Does byte-by-byte access still hold true nowadays? I thought modern processors would fetch memory at the very least in word-sized chunks (4/8 bytes) and then merge/slice.

> But there are a couple of obvious problems with this idea, too, such as:
> 1. It's really complicated and a ton of work.
> 2. It would break pg_upgrade pretty darn badly unless we employed some even-more-complex strategy to mitigate that.
> 3. The savings might not be enough to justify the effort.

Very true.

> It might be interesting for someone to develop a tool measuring the number of bytes of alignment padding we lose per tuple or per page and gather some statistics on it on various databases. That would give us some sense as to the possible savings.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
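
The estimate quoted above (at most 7 bytes lost per value with 8-byte alignment, at most 3 with 4-byte alignment) can be sketched in a few lines of C++. This is only an illustration: `Column`, `align_up` and `padding_bytes` are hypothetical names, not PostgreSQL code, and the model ignores tuple headers, NULL bitmaps and variable-length fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one fixed-width column: its size in bytes
// and the alignment it requests (1, 2, 4 or 8).
struct Column {
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align (a power of two).
inline std::size_t align_up(std::size_t offset, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

// Padding bytes inserted between columns laid out in the given order,
// with every requested alignment capped at max_align.
inline std::size_t padding_bytes(const std::vector<Column>& cols,
                                 std::size_t max_align) {
    std::size_t offset = 0;
    std::size_t padding = 0;
    for (const Column& c : cols) {
        const std::size_t a = c.align < max_align ? c.align : max_align;
        const std::size_t aligned = align_up(offset, a);
        padding += aligned - offset;  // bytes wasted before this column
        offset = aligned + c.size;
    }
    return padding;
}
```

For a made-up row of a 1-byte, an 8-byte, a 4-byte and another 8-byte column, `padding_bytes(cols, 8)` counts 11 padding bytes while `padding_bytes(cols, 4)` counts 3. Real savings would have to be measured on real tables, as suggested in the message.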
Re: [HACKERS] Memory Alignment in Postgres
On Thu, Sep 11, 2014 at 9:32 AM, Arthur Silva arthur...@gmail.com wrote:
> I thought all memory alignment was (or at least the bulk of it) handled using some codebase wide macros/settings, otherwise how could different parts of the code inter-op? Poking this area might suffice for some initial testing to check if it's worth any more attention.

Well, sure, but the issues aren't too simple. For example, I think there are cases where we rely on the alignment bytes being zero to distinguish between an aligned value following and an unaligned toasted value. That stuff can make your head explode, or at least mine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
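
One way to read the "alignment bytes being zero" remark is sketched below. It is a deliberately simplified model, not PostgreSQL's actual code: it assumes a variable-length value either starts immediately with a nonzero header byte (unaligned) or starts at the next 4-byte boundary after zero-filled padding. The name `next_varlen_start` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model: decide where the next variable-length field begins.
// Assumption: padding bytes are always written as zero, and an unaligned
// (short-header) value always begins with a nonzero byte.
inline std::size_t next_varlen_start(const unsigned char* tuple,
                                     std::size_t offset) {
    assert(tuple != nullptr);
    if (tuple[offset] != 0) {
        // Nonzero byte: an unaligned value starts right here.
        return offset;
    }
    // Zero byte: treat it as padding and skip to the next 4-byte boundary.
    return (offset + 3) & ~static_cast<std::size_t>(3);
}
```

The sketch shows why the padding bytes cannot simply be dropped or left uninitialized under such a scheme: the reader uses their value to decide how to interpret what follows.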
Re: [HACKERS] Memory Alignment in Postgres
On 2014-09-11 10:32:24 -0300, Arthur Silva wrote:
> Unaligned memory access received a lot of attention in the Intel post-Nehalem era, so it may very well pay off on Intel servers. You might find this blog post and its comments/external links interesting: http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/

FWIW, the reported results are imo pretty meaningless for postgres. It's sequential access over a larger amount of memory, i.e. a perfectly prefetchable workload where it doesn't matter if superfluous cachelines are fetched, because they're going to be needed next round anyway.

In many production workloads one of the busiest accesses to individual datums is the binary search on individual pages during index lookups. That's pretty much exactly the opposite of the above.

Not saying that it's not going to be a benefit in many scenarios, but it's far from being as simple as saying that unaligned accesses on their own aren't penalized anymore.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Memory Alignment in Postgres
On Thu, Sep 11, 2014 at 8:32 AM, Arthur Silva arthur...@gmail.com wrote:
> On Wed, Sep 10, 2014 at 12:43 PM, Robert Haas robertmh...@gmail.com wrote:
>> On Tue, Sep 9, 2014 at 10:08 AM, Arthur Silva arthur...@gmail.com wrote:
>>> I'm continuously studying the Postgres codebase. Hopefully I'll be able to make some contributions in the future.
>>> For now I'm intrigued by the extensive use of memory alignment. I'm sure there's some legacy reasoning behind it, and some architectures that require it.
>>> That aside, since it wastes space (a lot of space in some cases) there must be a tipping point somewhere. I'm sure one can prove aligned access is faster in a micro-benchmark, but I'm not sure that's the case in a DBMS like Postgres, especially in the page/rows area.
>>> Just for the sake of comparison, MySQL COMPACT storage (default and recommended since 5.5) doesn't align data at all. MySQL NDB uses a fixed 4-byte alignment. Not sure about Oracle and others.
>>> Is it worth the extra space on newer architectures (especially Intel)? Do you guys think this is something worth looking at?
>>
>> Yes. At least in my opinion, though, it's not a good project for a beginner. If you get your changes to take effect, you'll find that a lot of things will break in places that are not easy to find or fix. You're getting into really low-level areas of the system that get touched infrequently and require a lot of expertise in how things work today to adjust.
>
> I thought all memory alignment (or at least the bulk of it) was handled using some codebase-wide macros/settings; otherwise how could different parts of the code interoperate? Poking at this area might suffice for some initial testing to check whether it's worth any more attention.
>
> Unaligned memory access received a lot of attention in the Intel post-Nehalem era, so it may very well pay off on Intel servers. You might find this blog post and its comments/external links interesting: http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
>
> I'm a newbie in the codebase, so please let me know if I'm saying anything nonsensical.

Be advised of the difficulties you are going to face here. Assuming for a second there is no reason not to go unaligned on Intel and there are material benefits to justify the effort, that doesn't necessarily hold for other platforms like arm/power. Even though Intel accounts for the vast majority of installations, it's not gonna fly to optimize for that platform at the expense of others, so there'd have to be some kind of compile-time setting to control alignment behavior. That being said, if you could pull this off cleanly, it'd be pretty neat.

merlin
Re: [HACKERS] Memory Alignment in Postgres
Merlin Moncure mmonc...@gmail.com writes:
> Be advised of the difficulties you are going to face here. Assuming for a second there is no reason not to go unaligned on Intel and there are material benefits to justify the effort, that doesn't necessarily hold for other platforms like arm/power.

Note that on many (most?) non-Intel architectures, unaligned access is simply not an option. The chips themselves will throw SIGBUS or equivalent if you try it. Some kernels provide signal handlers that emulate the unaligned access in software rather than killing the process; but the performance consequences of hitting such traps more than very occasionally would be catastrophic.

Even on Intel, I'd wonder what unaligned accesses do to atomicity guarantees and suchlike. This is not a big deal for row data storage, but we'd have to be careful about it if we were to back off alignment requirements for in-memory data structures such as latches and buffer headers.

Another fun thing you'd need to deal with is ensuring that the C structs we overlay onto catalog data rows still match up with the data layout rules.

On the whole, I'm pretty darn skeptical that such an effort would repay itself. There are lots of more promising things to hack on.

regards, tom lane
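
For readers unfamiliar with the SIGBUS problem described above: dereferencing a misaligned pointer directly is what traps on strict-alignment hardware (and is undefined behavior in C and C++ regardless of the CPU). The portable alternative is to copy the bytes out. A minimal C++ sketch follows; `load_unaligned` is a hypothetical helper name, not a PostgreSQL function.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Read a T from an address with no alignment guarantee.
// memcpy has no alignment requirement on its arguments, so this is safe on
// every architecture; compilers typically lower it to a single load where
// the hardware tolerates unaligned access, and to byte loads where it
// does not.
template <typename T>
T load_unaligned(const void* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain data can be copied byte-wise");
    assert(p != nullptr);
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}
```

A call such as `load_unaligned<std::int32_t>(buf + 1)` works where `*reinterpret_cast<const std::int32_t*>(buf + 1)` may trap. The cost on strict-alignment chips is the byte-wise access discussed elsewhere in the thread.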
Re: [HACKERS] Memory Alignment in Postgres
On 2014-09-11 11:39:12 -0400, Tom Lane wrote:
> Even on Intel, I'd wonder what unaligned accesses do to atomicity guarantees and suchlike.

They pretty much kill atomicity guarantees. Atomicity is guaranteed while you're inside a cacheline, but not once you span them.

> This is not a big deal for row data storage, but we'd have to be careful about it if we were to back off alignment requirements for in-memory data structures such as latches and buffer headers.

Right. I don't think that's an option.

> Another fun thing you'd need to deal with is ensuring that the C structs we overlay onto catalog data rows still match up with the data layout rules.

Yea, this would require some nastiness in the bki generation, but it'd probably be doable to have different alignment for system catalogs.

> On the whole, I'm pretty darn skeptical that such an effort would repay itself. There are lots of more promising things to hack on.

I have no desire to hack on it, but I can understand the desire to reduce the space overhead...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
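
The cacheline point above can be made concrete with a small check. This is a sketch under a stated assumption: a 64-byte cache line, which is common on current x86 parts but not universal. `kCacheLine` and `spans_cache_lines` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes; adjust for the target CPU.
constexpr std::size_t kCacheLine = 64;

// True if an access of `size` bytes starting at `addr` touches more than
// one cache line, i.e. its first and last bytes fall in different lines.
inline bool spans_cache_lines(std::uintptr_t addr, std::size_t size) {
    assert(size > 0);
    return (addr / kCacheLine) != ((addr + size - 1) / kCacheLine);
}
```

A naturally aligned 8-byte value never spans two lines, because 64 is a multiple of 8. An 8-byte value at offset 60 within a line does, and that is the case where the atomicity guarantee is lost.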
Re: [HACKERS] Memory Alignment in Postgres
On Thu, Sep 11, 2014 at 11:27 AM, Andres Freund and...@2ndquadrant.com wrote:
> On 2014-09-11 10:32:24 -0300, Arthur Silva wrote:
>> Unaligned memory access received a lot of attention in the Intel post-Nehalem era, so it may very well pay off on Intel servers. You might find this blog post and its comments/external links interesting: http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
>
> FWIW, the reported results are imo pretty meaningless for postgres. It's sequential access over a larger amount of memory, i.e. a perfectly prefetchable workload where it doesn't matter if superfluous cachelines are fetched, because they're going to be needed next round anyway.
>
> In many production workloads one of the busiest accesses to individual datums is the binary search on individual pages during index lookups. That's pretty much exactly the opposite of the above.
>
> Not saying that it's not going to be a benefit in many scenarios, but it's far from being as simple as saying that unaligned accesses on their own aren't penalized anymore.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

I modified the test code to use a completely random scan pattern, to test something that completely thrashes the cache. It's not realistic, but it still confirms the hypothesis that the overhead is minimal on modern Intel.
-- test results compiling for 32bit --
processing word of size 2
offset = 0
 average time for offset 0 is 422.7
offset = 1
 average time for offset 1 is 422.85
processing word of size 4
offset = 0
 average time for offset 0 is 436.6
offset = 1
 average time for offset 1 is 451
offset = 2
 average time for offset 2 is 444.3
offset = 3
 average time for offset 3 is 441.9
processing word of size 8
offset = 0
 average time for offset 0 is 630.15
offset = 1
 average time for offset 1 is 653
offset = 2
 average time for offset 2 is 655.5
offset = 3
 average time for offset 3 is 660.85
offset = 4
 average time for offset 4 is 650.1
offset = 5
 average time for offset 5 is 656.9
offset = 6
 average time for offset 6 is 656.6
offset = 7
 average time for offset 7 is 656.9

-- test results compiling for 64bit --
processing word of size 2
offset = 0
 average time for offset 0 is 402.55
offset = 1
 average time for offset 1 is 406.9
processing word of size 4
offset = 0
 average time for offset 0 is 424.05
offset = 1
 average time for offset 1 is 436.55
offset = 2
 average time for offset 2 is 435.1
offset = 3
 average time for offset 3 is 435.3
processing word of size 8
offset = 0
 average time for offset 0 is 444.9
offset = 1
 average time for offset 1 is 470.25
offset = 2
 average time for offset 2 is 468.95
offset = 3
 average time for offset 3 is 476.75
offset = 4
 average time for offset 4 is 474.9
offset = 5
 average time for offset 5 is 468.25
offset = 6
 average time for offset 6 is 469.8
offset = 7
 average time for offset 7 is 469.1

// g++ -O2 -o test test.cpp && ./test

#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <iostream>
#include <cassert>
#include <vector>
#include <inttypes.h>

using namespace std;

class WallClockTimer {
public:
    struct timeval t1, t2;
    WallClockTimer() : t1(), t2() {
        gettimeofday(&t1, 0);
        t2 = t1;
    }
    void reset() {
        gettimeofday(&t1, 0);
        t2 = t1;
    }
    int elapsed() {
        return (t2.tv_sec * 1000 + t2.tv_usec / 1000) - (t1.tv_sec * 1000 + t1.tv_usec / 1000);
    }
    int split() {
        gettimeofday(&t2, 0);
        return elapsed();
    }
};

// xor shift
uint32_t xor128(void) {
    static uint32_t x = 123456789;
    static uint32_t y = 362436069;
    static uint32_t z = 521288629;
    static uint32_t w = 88675123;
    uint32_t t;
    t = x ^ (x << 11);
    x = y; y = z; z = w;
    return w = w ^ (w >> 19) ^ (t ^ (t >> 8));
}

template <class T>
void runtest() {
    size_t N = 10 * 1000 * 1000;
    int repeat = 20;
    WallClockTimer timer;
    const bool paranoid = false;
    cout << "processing word of size " << sizeof(T) << endl;
    for (unsigned int offset = 0; offset < sizeof(T); ++offset) {
        vector<T> bigarray(N + 2);
        cout << "offset = " << offset << endl;
        T * const begin = reinterpret_cast<T *>(reinterpret_cast<uintptr_t>(&bigarray[0]) + offset);
        assert(offset + reinterpret_cast<uintptr_t>(&bigarray[0]) == reinterpret_cast<uintptr_t>(begin));
        T * const end = begin + N;
        if (paranoid) assert(reinterpret_cast<uintptr_t>(end) < reinterpret_cast<uintptr_t>(&bigarray.back()));
        int sumt = 0;
        //cout << " ignore this: ";
        for (int k = 0; k < repeat; ++k) {
            timer.reset();
            for (size_t i = 0; i < N; ++i) {
                int ri = xor128() % N;
                begin[ri] = static_cast<T>(i);
            }
            volatile T val = 1;
Re: [HACKERS] Memory Alignment in Postgres
On Thu, Sep 11, 2014 at 12:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Merlin Moncure mmonc...@gmail.com writes:
>> Be advised of the difficulties you are going to face here. Assuming for a second there is no reason not to go unaligned on Intel and there are material benefits to justify the effort, that doesn't necessarily hold for other platforms like arm/power.
>
> Note that on many (most?) non-Intel architectures, unaligned access is simply not an option. The chips themselves will throw SIGBUS or equivalent if you try it. Some kernels provide signal handlers that emulate the unaligned access in software rather than killing the process; but the performance consequences of hitting such traps more than very occasionally would be catastrophic.
>
> Even on Intel, I'd wonder what unaligned accesses do to atomicity guarantees and suchlike. This is not a big deal for row data storage, but we'd have to be careful about it if we were to back off alignment requirements for in-memory data structures such as latches and buffer headers.
>
> Another fun thing you'd need to deal with is ensuring that the C structs we overlay onto catalog data rows still match up with the data layout rules.
>
> On the whole, I'm pretty darn skeptical that such an effort would repay itself. There are lots of more promising things to hack on.
>
> regards, tom lane

Indeed, I don't know of any other architectures where this would be an option. So if this ever moves forward, it must be turned on at compile time for x86-64 only. I wonder how MySQL handles its rows on those architectures, since its storage format is completely packed.

If we just reduced the alignment requirements when laying out columns in rows and indexes, by reducing/removing the padding implied by typalign, that would be enough of a gain in my (humble) opinion.

If you assume alignment is not an issue, you can see savings everywhere, which is kinda insane...

I'm unsure how this translates into patch complexity, but judging by the reactions so far I'm assuming it's a lot.
Re: [HACKERS] Memory Alignment in Postgres
On Thu, Sep 11, 2014 at 02:54:36PM -0300, Arthur Silva wrote:
> Indeed, I don't know of any other architectures where this would be an option. So if this ever moves forward, it must be turned on at compile time for x86-64 only. I wonder how MySQL handles its rows on those architectures, since its storage format is completely packed.
>
> If we just reduced the alignment requirements when laying out columns in rows and indexes, by reducing/removing the padding implied by typalign, that would be enough of a gain in my (humble) opinion.
>
> If you assume alignment is not an issue, you can see savings everywhere, which is kinda insane...
>
> I'm unsure how this translates into patch complexity, but judging by the reactions so far I'm assuming it's a lot.

If the column order in the table were independent of the physical layout, it would be possible to order columns to reduce the padding needed. Not my suggestion, just repeating a valid comment from earlier in the thread.

Regards,
Ken
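
The effect of reordering can be illustrated with a toy model. The sketch below assumes fixed-width columns only and sorts them by descending alignment for storage; `Attr`, `layout_padding` and `physical_order` are hypothetical names, and nothing here reflects how PostgreSQL actually stores tuples.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-width attribute: size and required alignment in bytes.
struct Attr {
    std::size_t size;
    std::size_t align;
};

// Padding bytes needed when attributes are stored in the given order.
inline std::size_t layout_padding(const std::vector<Attr>& attrs) {
    std::size_t offset = 0;
    std::size_t padding = 0;
    for (const Attr& a : attrs) {
        assert(a.align != 0);
        const std::size_t aligned = (offset + a.align - 1) / a.align * a.align;
        padding += aligned - offset;
        offset = aligned + a.size;
    }
    return padding;
}

// One possible physical order: widest alignment first. stable_sort keeps
// the declared order among attributes with equal alignment.
inline std::vector<Attr> physical_order(std::vector<Attr> attrs) {
    std::stable_sort(attrs.begin(), attrs.end(),
                     [](const Attr& x, const Attr& y) { return x.align > y.align; });
    return attrs;
}
```

For a made-up row declared as 1-, 8-, 2-, 8- and 4-byte columns, `layout_padding` counts 13 bytes in declared order and 0 after `physical_order`. When every size is a multiple of its alignment, descending order removes inter-column padding entirely, without touching the alignment rules themselves.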
Re: [HACKERS] Memory Alignment in Postgres
On Tue, Sep 9, 2014 at 10:08 AM, Arthur Silva arthur...@gmail.com wrote:
> I'm continuously studying the Postgres codebase. Hopefully I'll be able to make some contributions in the future.
> For now I'm intrigued by the extensive use of memory alignment. I'm sure there's some legacy reasoning behind it, and some architectures that require it.
> That aside, since it wastes space (a lot of space in some cases) there must be a tipping point somewhere. I'm sure one can prove aligned access is faster in a micro-benchmark, but I'm not sure that's the case in a DBMS like Postgres, especially in the page/rows area.
> Just for the sake of comparison, MySQL COMPACT storage (default and recommended since 5.5) doesn't align data at all. MySQL NDB uses a fixed 4-byte alignment. Not sure about Oracle and others.
> Is it worth the extra space on newer architectures (especially Intel)? Do you guys think this is something worth looking at?

Yes. At least in my opinion, though, it's not a good project for a beginner. If you get your changes to take effect, you'll find that a lot of things will break in places that are not easy to find or fix. You're getting into really low-level areas of the system that get touched infrequently and require a lot of expertise in how things work today to adjust.

The idea I've had before is to try to reduce the widest alignment we ever require from 8 bytes to 4 bytes. That is, look for types with typalign = 'd', and rewrite them to have typalign = 'i' by having them use two 4-byte loads to load an eight-byte value. In practice, I think this would probably save a high percentage of what can be saved, because 8-byte alignment implies a maximum of 7 bytes of wasted space, while 4-byte alignment implies a maximum of 3 bytes of wasted space. And it would probably be pretty cheap, too, because any type with less than 8 byte alignment wouldn't be affected at all, and even those types that were affected would only be slightly slowed down by doing two loads instead of one.

In contrast, getting rid of alignment requirements completely would save a little more space, but probably at the cost of a lot more slowdown: any type with alignment requirements would have to fetch the value byte-by-byte instead of pulling the whole thing out at once.

But there are a couple of obvious problems with this idea, too, such as:

1. It's really complicated and a ton of work.
2. It would break pg_upgrade pretty darn badly unless we employed some even-more-complex strategy to mitigate that.
3. The savings might not be enough to justify the effort.

It might be interesting for someone to develop a tool measuring the number of bytes of alignment padding we lose per tuple or per page and gather some statistics on it on various databases. That would give us some sense as to the possible savings.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
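
The "two 4-byte loads" idea from this message might look roughly like the following C++ sketch. It is an illustration only: `load_u64_from_4byte_aligned` is a hypothetical name, and the code assumes the stored bytes are in the machine's native byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load an 8-byte value from an address that is only guaranteed to be
// 4-byte aligned, by reading its two 4-byte halves separately.
inline std::uint64_t load_u64_from_4byte_aligned(const void* p) {
    assert(reinterpret_cast<std::uintptr_t>(p) % 4 == 0);
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    std::uint32_t halves[2];
    std::memcpy(&halves[0], bytes, 4);      // first aligned 4-byte load
    std::memcpy(&halves[1], bytes + 4, 4);  // second aligned 4-byte load
    std::uint64_t value;
    // Reassemble in memory order, which preserves native byte order.
    std::memcpy(&value, halves, sizeof value);
    return value;
}
```

Each half is read at an address that satisfies 4-byte alignment, so no access is misaligned even on strict-alignment hardware. The extra cost is the second load, which matches the "only slightly slowed down" expectation in the message.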
Re: [HACKERS] Memory Alignment in Postgres
On Wed, Sep 10, 2014 at 11:43:52AM -0400, Robert Haas wrote:
> But there are a couple of obvious problems with this idea, too, such as:
>
> 1. It's really complicated and a ton of work.
> 2. It would break pg_upgrade pretty darn badly unless we employed some even-more-complex strategy to mitigate that.
> 3. The savings might not be enough to justify the effort.
>
> It might be interesting for someone to develop a tool measuring the number of bytes of alignment padding we lose per tuple or per page and gather some statistics on it on various databases. That would give us some sense as to the possible savings.

And will we ever implement a logical attribute system so we can reorder the stored attributes to minimize wasted space?

--
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB   http://enterprisedb.com

  + Everyone has their own god. +
Re: [HACKERS] Memory Alignment in Postgres
On Wed, Sep 10, 2014 at 4:29 PM, Bruce Momjian br...@momjian.us wrote:
> On Wed, Sep 10, 2014 at 11:43:52AM -0400, Robert Haas wrote:
>> But there are a couple of obvious problems with this idea, too, such as:
>>
>> 1. It's really complicated and a ton of work.
>> 2. It would break pg_upgrade pretty darn badly unless we employed some even-more-complex strategy to mitigate that.
>> 3. The savings might not be enough to justify the effort.
>>
>> It might be interesting for someone to develop a tool measuring the number of bytes of alignment padding we lose per tuple or per page and gather some statistics on it on various databases. That would give us some sense as to the possible savings.
>
> And will we ever implement a logical attribute system so we can reorder the stored attributes to minimize wasted space?

You forgot to attach the patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Memory Alignment in Postgres
I'm continuously studying the Postgres codebase. Hopefully I'll be able to make some contributions in the future.

For now I'm intrigued by the extensive use of memory alignment. I'm sure there's some legacy reasoning behind it, and some architectures that require it.

That aside, since it wastes space (a lot of space in some cases) there must be a tipping point somewhere. I'm sure one can prove aligned access is faster in a micro-benchmark, but I'm not sure that's the case in a DBMS like Postgres, especially in the page/rows area.

Just for the sake of comparison, MySQL COMPACT storage (default and recommended since 5.5) doesn't align data at all. MySQL NDB uses a fixed 4-byte alignment. Not sure about Oracle and others.

Is it worth the extra space on newer architectures (especially Intel)? Do you guys think this is something worth looking at?

I've been messing with the *ALIGN macros, but so far I haven't been able to get any conclusive results. My guess is that I'm missing something in the code, or that pgbench doesn't stress the difference enough.

--
Arthur Silva
Re: [HACKERS] Memory Alignment in Postgres
On Tue, Sep 9, 2014 at 11:08:05AM -0300, Arthur Silva wrote:
> I'm continuously studying the Postgres codebase. Hopefully I'll be able to make some contributions in the future.
> For now I'm intrigued by the extensive use of memory alignment. I'm sure there's some legacy reasoning behind it, and some architectures that require it.
> That aside, since it wastes space (a lot of space in some cases) there must be a tipping point somewhere. I'm sure one can prove aligned access is faster in a micro-benchmark, but I'm not sure that's the case in a DBMS like Postgres, especially in the page/rows area.
> Just for the sake of comparison, MySQL COMPACT storage (default and recommended since 5.5) doesn't align data at all. MySQL NDB uses a fixed 4-byte alignment. Not sure about Oracle and others.
> Is it worth the extra space on newer architectures (especially Intel)? Do you guys think this is something worth looking at?
> I've been messing with the *ALIGN macros, but so far I haven't been able to get any conclusive results. My guess is that I'm missing something in the code, or that pgbench doesn't stress the difference enough.

Postgres reads data blocks from disk and puts them in shared memory; then the CPU accesses those values, like floats and integers, as though they were in allocated memory, i.e. we make no adjustments to the data all the way from disk to CPU. I don't think anyone has measured the overhead of doing less alignment, but I would be interested to see any test results produced.

--
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB   http://enterprisedb.com

  + Everyone has their own god. +