[HACKERS] [PATCH v4] [libpq] Try to avoid manually masking SIGPIPEs on every send()

2009-07-23 Thread Jeremy Kerr
(SO_NOSIGPIPE) and the MSG_NOSIGNAL flag to send(). This change attempts to use these if they're available at compile- and run-time. If not, we drop back to manipulating the signal mask as before. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- v4: roll into one patch, use macros --- src/interfaces

Re: [HACKERS] [PATCH 1/2 v3] [libpq] rework sigpipe-handling macros

2009-07-20 Thread Jeremy Kerr
Hi Tom, That code is not broken as it stands, and doesn't appear to really gain anything from the proposed change. Why should we risk any portability questions when the code isn't going to get either simpler or shorter? OK, after attempting a macro version of this, we end up with: #define

Re: [HACKERS] [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-07-20 Thread Jeremy Kerr
Hi Greg, Thanks for the benchmark app, thought I'd pitch in with some ppc results: clz 1.530s lookup table 1.720s float hack 4.424s unrolled 5.280s normal 5.369s POWER5+: clz 2.046s lookup table 2.188s

Re: [HACKERS] [PATCH 1/2 v3] [libpq] rework sigpipe-handling macros

2009-07-19 Thread Jeremy Kerr
Hi Robert, Perhaps we should use macros. I was trying to avoid macros, as this means we lose type- and syntax- checking at the call-site, and end up with slightly messier code. However, I understand that this is probably personal preference for me :) How about just 'static' functions? (ie,

Re: [HACKERS] [PATCH v3] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-07-19 Thread Jeremy Kerr
Hi Robert, That having been said, Jeremy, you probably want to take a look at those comments and I have a few responses to them as well. OK, thanks for the heads-up. following comment: Applied and built cleanly. Regress passes. Trying to hunt down ppc box to see if performance

Re: [HACKERS] [PATCH 1/2 v3] [libpq] rework sigpipe-handling macros

2009-07-19 Thread Jeremy Kerr
Tom, However, I think the whole patch is pretty useless. That code is not broken as it stands, and doesn't appear to really gain anything from the proposed change. Why should we risk any portability questions when the code isn't going to get either simpler or shorter? This patch clears the

[HACKERS] [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-07-19 Thread Jeremy Kerr
Rather than testing single bits in a loop, change AllocSetFreeIndex to use the __builtin_clz() function to calculate the chunk index. This requires a new check for __builtin_clz in the configure script. Results in a ~2% performance increase on sysbench on PowerPC. Signed-off-by: Jeremy Kerr j

Re: [HACKERS] [PATCH 1/2 v3] [libpq] rework sigpipe-handling macros

2009-07-14 Thread Jeremy Kerr
Hi, Alvaro, Does this work in compilers other than GCC? I think we use some kludges to protect against them ... see pg_list.h for the canonical example. As I understand it, we're not using static inlines in pg_list.h to prevent multiple objects from exporting the same symbols if the

[HACKERS] [PATCH 0/2 v3] SIGPIPE masking in local socket connections

2009-06-30 Thread Jeremy Kerr
testing though, is there a machine that allows both SO_NOSIGPIPE and MSG_NOSIGNAL? v3: respin as context diffs Again, comments most welcome, Jeremy --- Jeremy Kerr (2): [libpq] rework sigpipe-handling macros [libpq] Try to avoid manually masking SIGPIPEs on every send() -- Sent

[HACKERS] [PATCH 2/2 v3] [libpq] Try to avoid manually masking SIGPIPEs on every send()

2009-06-30 Thread Jeremy Kerr
(SO_NOSIGPIPE) and the MSG_NOSIGNAL flag to send(). This change attempts to use these if they're available at compile- and run-time. If not, we drop back to manipulating the signal mask as before. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/interfaces/libpq/fe-connect.c | 40

[HACKERS] [PATCH 1/2 v3] [libpq] rework sigpipe-handling macros

2009-06-30 Thread Jeremy Kerr
change the code to be implemented as static inlines. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/interfaces/libpq/fe-secure.c | 88 --- 1 file changed, 55 insertions(+), 33 deletions(-) *** a/src/interfaces/libpq/fe-secure.c --- b/src/interfaces/libpq/fe

[HACKERS] [PATCH v3] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-06-30 Thread Jeremy Kerr
Move the shift-and-test login into a separate fls() function, which can use __builtin_clz() if it's available. This requires a new check for __builtin_clz in the configure script. Results in a ~2% performance increase on PowerPC. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- v3: respin

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Jeremy Kerr
Stephen, Is it just the size that matters, or is it when there are few spaces at the end? It's the number of spaces at the end. If we knew this number, then we wouldn't have to do any comparisons at all :) Cheers, Jeremy -- Sent via pgsql-hackers mailing list

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-26 Thread Jeremy Kerr
Stephen, If the updated function is always faster when the overall string is at least, say, 16 characters long, But that's not the case - the cost of the function (and the speedup from the previous version) depends on the number of spaces that there are at the end. For the new function to

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-25 Thread Jeremy Kerr
Hi Stephen, What would be really useful would be best case and worst case scenarios. I've put together some data from a microbenchmark of the bcTrulen function, patched and unpatched. As for best-case, if you have a long string of trailing spaces, we can go through them at theoretically one

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-25 Thread Jeremy Kerr
Tom, I've put together some data from a microbenchmark of the bcTrulen function, patched and unpatched. Uh, where's the data? If you're after the raw data for a run, I've put it up: http://ozlabs.org/~jk/projects/db/data/bctruelen.csv I've also packaged up the quick-and-dirty benchmark

[HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-23 Thread Jeremy Kerr
the number of compares where possible. Word-size compares are performed on aligned sections of the string. Results in a small performance increase; around 1-2% on my POWER6 test box. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- Resend: context diff this time --- src/backend/utils/adt/varchar.c | 24

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-23 Thread Jeremy Kerr
Robert, I'd still like to know the workload and exact numbers. From up-thread: http://ozlabs.org/~jk/projects/db/data/postgres.bcTruelen/ - or were there other details you were looking for? Cheers, Jeremy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-17 Thread Jeremy Kerr
Hi all, Speaking of which, what about some performance numbers? OK, benchmarks done: http://ozlabs.org/~jk/projects/db/data/postgres.bcTruelen/ Summary: small increase in performance (~1-2% on my machine), at about 1.5 standard deviations from the mean. Profiles show a decent drop in hits

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-16 Thread Jeremy Kerr
Hi all, That's correct. To check the alignment you would have to look at the actual pointer. I would suggest using the existing macros to handle alignment. Hm, though the only one I see offhand which is relevant is the moderately silly PointerIsAligned(). Still it would make the code clearer

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-16 Thread Jeremy Kerr
Hi Tom, Speaking of which, what about some performance numbers? Like Heikki, I'm quite suspicious of whether there is any real-world gain to be had from this approach. Will send numbers tomorrow, with the reworked patch. Cheers, Jeremy -- Sent via pgsql-hackers mailing list

Re: [HACKERS] char() overhead on read-only workloads not so insignifcant as the docs claim it is...

2009-06-15 Thread Jeremy Kerr
Alvaro, Maybe bcTruelen could be optimized to step on one word at a time (perhaps by using XOR against a precomputed word filled with ' '), instead of one byte at a time ... I have a patch for this, will send soon. Regards, Jeremy -- Sent via pgsql-hackers mailing list

[HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-15 Thread Jeremy Kerr
Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/backend/utils/adt/varchar.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c index 5f3c658..6889dff 100644 --- a/src/backend/utils

Re: [HACKERS] [PATCH] backend: compare word-at-a-time in bcTruelen

2009-06-15 Thread Jeremy Kerr
Robert, This looks very non-portable to me. Unsurprisingly, I'm new to postgres hacking and the large number of supported platforms :) I was considering something like: unsigned int spaces; const unsigned int wordsize = sizeof(unsigned int); memset(spaces, ' ',

[HACKERS] [PATCH 0/2] SIGPIPE masking in local socket connections, v2

2009-06-10 Thread Jeremy Kerr
testing though, is there a machine that allows both SO_NOSIGPIPE and MSG_NOSIGNAL? Again, comments most welcome, Jeremy --- Jeremy Kerr (2): [libpq] rework sigpipe-handling macros [libpq] Try to avoid manually masking SIGPIPEs on every send() -- Sent via pgsql-hackers mailing

[HACKERS] [PATCH 1/2] [libpq] rework sigpipe-handling macros

2009-06-10 Thread Jeremy Kerr
change the code to be implemented as static inlines. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/interfaces/libpq/fe-secure.c | 88 --- 1 file changed, 55 insertions(+), 33 deletions(-) diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq

[HACKERS] [PATCH 2/2] [libpq] Try to avoid manually masking SIGPIPEs on every send()

2009-06-10 Thread Jeremy Kerr
(SO_NOSIGPIPE) and the MSG_NOSIGNAL flag to send(). This change attempts to use these if they're available at compile- and run-time. If not, we drop back to manipulating the signal mask as before. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/interfaces/libpq/fe-connect.c | 39

Re: [HACKERS] [PATCH 2/2] [libpq] Try to avoid manually masking SIGPIPEs on every send()

2009-06-10 Thread Jeremy Kerr
Marko, optval seems used without initialization. Dang, I was checking for the SO_NOSIGPIPE flag, and didn't check for optval. New patch coming... Cheers, Jeremy -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:

[HACKERS] [PATCH v2] [libpq] Try to avoid manually masking SIGPIPEs on every send()

2009-06-10 Thread Jeremy Kerr
(SO_NOSIGPIPE) and the MSG_NOSIGNAL flag to send(). This change attempts to use these if they're available at compile- and run-time. If not, we drop back to manipulating the signal mask as before. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- v2: initialise optval --- src/interfaces/libpq/fe

Re: [HACKERS] [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-06-04 Thread Jeremy Kerr
Hi Atsushi, If size = 8, fls((size - 1) ALLOC_MINBITS) is fls(0). The result of fls(0) is undefined. Yep, got caught out by this because my previous fls() supported zero. I think we have to never call fls(0) from AllocSetFreeIndex(). My proposal code: if (size (1 ALLOC_MINBITS))

[HACKERS] [PATCH v2] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-06-04 Thread Jeremy Kerr
Move the shift-and-test login into a separate fls() function, which can use __builtin_clz() if it's available. This requires a new check for __builtin_clz in the configure script. Results in a ~2% performance increase on PowerPC. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- v2: prevent fls(0

Re: [HACKERS] [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-06-04 Thread Jeremy Kerr
Hi Atsushi, In my benchmark program, it is a little different performance in fls implementation and inline asm implementation. However, the result of a pgbench is almost the same improvement. Here is the result of my benchmark. Excellent, thank you for getting this extra set of numbers.

Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Jeremy Kerr
Florian, +#if defined(__GNUC__) \ + (defined(__ppc__) || defined(__powerpc__) || \ +defined(__ppc64__) || defined (__powerpc64__)) If you require GCC anyway, you can use __builtin_clz instead. (It's been available since GCC 4.1 at least.) Because now we have to test the compiler

Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Jeremy Kerr
Hi Tom, The other thing I didn't like about the patch was the assumption that it's okay to have a static inline function in a header. You can get away with that in gcc but *not* in other compilers. Gee, you user-space guys have it tough! :D Point taken, will rework. Look at the existing

[HACKERS] [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex

2009-06-03 Thread Jeremy Kerr
Move the shift-and-test login into a separate fls() function, which can use __builtin_clz() if it's available. This requires a new check for __builtin_clz in the configure script. Results in a ~2% performance increase on PowerPC. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- configure.in

Re: [HACKERS] [RFC,PATCH] SIGPIPE masking in local socket connections

2009-06-02 Thread Jeremy Kerr
Tom, The consideration is that the application fails completely on server disconnect (because it gets SIGPIPE'd). This was long ago deemed unacceptable, and we aren't likely to change our opinion on that. OK, understood. I'm guessing MSG_NOSIGNAL on the send() isn't portable enough here?

Re: [HACKERS] [RFC,PATCH] SIGPIPE masking in local socket connections

2009-06-02 Thread Jeremy Kerr
Tom, A feature that is exercised via setsockopt is probably fairly safe, since you can check for failure of the setsockopt call and then do it the old way. MSG_NOSIGNAL is a recv() flag, no? It's a flag to send(). The question is whether you could expect that the recv() would fail if it

Re: [HACKERS] faster version of AllocSetFreeIndex for x86 architecture

2009-06-02 Thread Jeremy Kerr
Hi, I made a faster version of AllocSetFreeIndex for x86 architecture. Neat, I have a version for PowerPC too. In order to prevent writing multiple copies of AllocSetFreeIndex, I propose that we add a fls() function (find last set); this can be defined in an architecture-independent manner

[HACKERS] [PATCH 1/2] Add bit operations util header

2009-06-02 Thread Jeremy Kerr
Add a utility header for simple bit operatios - bitops.h. At present, just contains the fls() (find last set bit) function. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/include/utils/bitops.h | 52 + 1 file changed, 52 insertions(+) diff

[HACKERS] [PATCH 2/2] Use fls() to find chunk set

2009-06-02 Thread Jeremy Kerr
Results in a ~2% performance increase by using the powerpc fls() implementation. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/backend/utils/mmgr/aset.c |8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr

[HACKERS] [PATCH v2] Add bit operations util header

2009-06-02 Thread Jeremy Kerr
Add a utility header for simple bit operatios - bitops.h. At present, just contains the fls() (find last set bit) function. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- v2: only use inline asm with gcc --- src/include/utils/bitops.h | 53 + 1

[HACKERS] [RFC,PATCH] Only disable sigpipe during SSL write

2009-06-01 Thread Jeremy Kerr
If the connection isn't over SSL, there's no need to do the disable. This effectively halves the number of syscalls performed by libpq when SSL is not in use. Signed-off-by: Jeremy Kerr j...@ozlabs.org --- src/interfaces/libpq/fe-secure.c |7 +++ 1 file changed, 3 insertions(+), 4

[HACKERS] [RFC,PATCH] SIGPIPE masking in local socket connections

2009-06-01 Thread Jeremy Kerr
the following performance improvement during a sysbench OLTP run: http://ozlabs.org/~jk/projects/db/data/sigpipe-perf.png load: sysbench --test=oltp --oltp-read-only=on, connecting locally, machine: POWER6, 64-way, 4.2GHz Comments most welcome, Jeremy --- Jeremy Kerr (1): Only disable sigpipe