On Fri, Dec 09, 2005 at 11:32:48AM -0500, Bruce Momjian wrote:
Tom Lane wrote:
Bruce Momjian pgman@candle.pha.pa.us writes:
I can see that being useful for a single-user application that doesn't
have locking or I/O bottlenecks, and doesn't have a multi-stage design
like a database. Do
Simon Riggs [EMAIL PROTECTED] wrote
You may be trying to use the memory too early. Prefetched memory takes
time to arrive in cache, so you may need to issue prefetch calls for N
+2, N+3 etc rather than simply N+1.
p.6-11 covers this.
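(A minimal sketch of that advice, assuming a GCC-style build: __builtin_prefetch is a real GCC builtin, but the helper name, the array walk, and the PREFETCH_DIST value are illustrative, not from the manual.)

#include <stddef.h>

/* Hedged sketch: issue the prefetch several iterations ahead
 * (N+2, N+3, ...) rather than N+1, so the line has time to
 * arrive in cache.  PREFETCH_DIST is an assumed tuning knob. */
#define PREFETCH_DIST 4

long
sum_with_prefetch(const long *a, size_t n)
{
    long    sum = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i + PREFETCH_DIST < n)
            __builtin_prefetch(&a[i + PREFETCH_DIST], 0, 0);
        sum += a[i];
    }
    return sum;
}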
I actually tried it and no improvements have been
This technique and others are discussed in detail in the Intel
Optimization Manual:
http://apps.intel.com/scripts-util/download.asp?url=/design/PentiumII/manuals/24512701.pdf&title=Intel%AE+Architecture+Optimization+Reference+Manual&fullpg=3&site=Developer
Similar manuals exist for AMD and other
Do these optimizations have any effect on database software? I know
call overhead shows up as a performance bottleneck, but do these other
optimizations also measurably improve performance?
---
Simon Riggs wrote:
This
Kenneth Marshall wrote:
The main benefit of pre-fetching optimization is to allow just-
in-time data delivery to the processor. There are numerous papers
illustrating the dramatic increase in data throughput by using
data structures designed to take advantage of prefetching. Factors
of 3-7 can
Bruce Momjian pgman@candle.pha.pa.us writes:
I can see that being useful for a single-user application that doesn't
have locking or I/O bottlenecks, and doesn't have a multi-stage design
like a database. Do we do enough of such processing that we will _see_
an improvement, or will our code
On Fri, Dec 09, 2005 at 10:37:25AM -0500, Bruce Momjian wrote:
Kenneth Marshall wrote:
The main benefit of pre-fetching optimization is to allow just-
in-time data delivery to the processor. There are numerous papers
illustrating the dramatic increase in data throughput by using
data structures designed to take advantage of prefetching. Factors
of 3-7 can be realized and this can
Tom Lane wrote:
Bruce Momjian pgman@candle.pha.pa.us writes:
I can see that being useful for a single-user application that doesn't
have locking or I/O bottlenecks, and doesn't have a multi-stage design
like a database. Do we do enough of such processing that we will _see_
an
On Fri, 2005-12-09 at 09:43 -0500, Bruce Momjian wrote:
Do these optimizations have any effect on database software? I know
call overhead shows up as a performance bottleneck, but do these other
optimizations also measurably improve performance?
Many of them can, but nowhere near as much
On Thu, 8 Dec 2005, Min Xu (Hsu) wrote:
Perhaps because P4 is already doing H/W prefetching?
http://www.tomshardware.com/2000/11/20/intel/page5.html
I ran the test program on an Opteron 2.2G:
Got slowdown.
I ran it on a PIII 650M:
looks like some speedup to me.
Ok, I see ... so this
-Original Message-
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: Simon Riggs [EMAIL PROTECTED]
CC: Qingqing Zhou [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
Sent: Fri Dec 09 09:43:33 2005
Subject: Re: [HACKERS] Warm-cache prefetching
Do
Luke Lonergan wrote:
Bruce,
It (the compute intensity optimization) is what we did for copy parsing,
and it sped up by a factor of 100+.
The rest of the copy path could use some work too.
The virtual tuples in 8.1 are another example of grouping operations
into more compact chunks
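(A rough, hedged illustration of that kind of restructuring; the names and shape are invented for this sketch and are not PostgreSQL's COPY code.)

#include <stddef.h>

/* Hypothetical batch scan: find all delimiter offsets in one tight
 * pass so per-field overhead is amortized over the whole buffer,
 * instead of making a call per field. */
static size_t
scan_delims(const char *buf, size_t len, char delim,
            size_t *out, size_t out_cap)
{
    size_t  nfound = 0;

    for (size_t i = 0; i < len && nfound < out_cap; i++)
        if (buf[i] == delim)
            out[nfound++] = i;
    return nfound;
}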
Luke Lonergan wrote:
Bruce,
It (the compute intensity optimization) is what we did for copy parsing, and it
sped up by a factor of 100+.
The changes made to COPY were portable, though.
cheers
andrew
Andrew Dunstan [EMAIL PROTECTED] writes:
Luke Lonergan wrote:
It (the compute intensity optimization) is what we did for copy parsing, and
it sped up by a factor of 100+.
The changes made to COPY were portable, though.
In fact, the changes made to COPY had absolutely nada to do with any of
Tom,
On 12/9/05 2:14 PM, Tom Lane [EMAIL PROTECTED] wrote:
Andrew Dunstan [EMAIL PROTECTED] writes:
Luke Lonergan wrote:
It (the compute intensity optimization) is what we did for copy parsing, and
it sped up by a factor of 100+.
The changes made to COPY were portable, though.
In fact,
I found an interesting paper on improving index speed by prefetching memory
data into the L1/L2 cache (there was a discussion about prefetching disk
data into memory several days ago, in the ice-breaker thread):
http://www.cs.cmu.edu/~chensm/papers/index_pf_final.pdf
Also, a related technique is used to speed up
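(To make the paper's core idea concrete, a hedged sketch: prefetch every cache line of a wide B+-tree node before binary-searching it, so later lines stream in while earlier ones are still being examined. The node layout, names, and 64-byte line size are assumptions, not taken from the paper or from PostgreSQL.)

#include <stddef.h>

#define CACHE_LINE 64            /* assumed line size */

/* Illustrative wide node spanning several cache lines. */
typedef struct BTNode
{
    int     nkeys;
    long    keys[30];
    struct BTNode *child[31];
} BTNode;

static inline void
prefetch_node(const BTNode *node)
{
    const char *p = (const char *) node;

    /* touch each line of the node up front */
    for (size_t off = 0; off < sizeof(BTNode); off += CACHE_LINE)
        __builtin_prefetch(p + off, 0, 3);
}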
Qingqing,
On 12/8/05 8:07 PM, Qingqing Zhou [EMAIL PROTECTED] wrote:
/* prefetch ahead */
__asm__ __volatile__ (
    "1: prefetchnta 128(%0)\n"
Luke Lonergan [EMAIL PROTECTED] wrote
/* prefetch ahead */
__asm__ __volatile__ (
    "1: prefetchnta 128(%0)\n"
    : : "r" (s) :
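(For reference, a hedged completion of that fragment so it actually compiles; only the prefetchnta 128(%0) instruction comes from the thread, while the wrapper function and the portable fallback are assumptions.)

/* prefetch the cache line 128 bytes past s */
static inline void
prefetch_ahead(const char *s)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__ (
        "prefetchnta 128(%0)\n"
        : : "r" (s));
#else
    __builtin_prefetch(s + 128, 0, 0);   /* non-temporal hint */
#endif
}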
Perhaps because P4 is already doing H/W prefetching?
http://www.tomshardware.com/2000/11/20/intel/page5.html
I ran the test program on an Opteron 2.2G:
% ./a.out 10 16
Sum: -951304192: with prefetch on - duration: 81.166 ms
Sum: -951304192: with prefetch off - duration: 79.769 ms
Sum:
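(For anyone wanting to reproduce this, a minimal hedged sketch of such a harness; the array size, prefetch distance, and loop shape are assumptions rather than the original a.out, and the poster's "10 16" arguments are not modeled.)

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static double
now_ms(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

int
main(void)
{
    size_t  n = 1 << 24;        /* assumed array size */
    int    *a = malloc(n * sizeof(int));
    double  t0;
    long    sum;
    size_t  i;

    if (a == NULL)
        return 1;
    for (i = 0; i < n; i++)
        a[i] = (int) i;

    /* pass 1: sum with software prefetch 16 elements ahead */
    t0 = now_ms();
    sum = 0;
    for (i = 0; i < n; i++)
    {
        if (i + 16 < n)
            __builtin_prefetch(a + i + 16, 0, 0);
        sum += a[i];
    }
    printf("Sum: %ld: with prefetch on - duration: %.3f ms\n",
           sum, now_ms() - t0);

    /* pass 2: plain sum, no prefetch */
    t0 = now_ms();
    sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    printf("Sum: %ld: with prefetch off - duration: %.3f ms\n",
           sum, now_ms() - t0);

    free(a);
    return 0;
}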