Re: [HACKERS] GPUSort project

2006-04-13 Thread Benny Amorsen
>>>>> "MvO" == Martijn van Oosterhout <kleptog@svana.org> writes:

MvO> Is this of practical use for run-of-the-mill video cards?

The article suggests that using the GPU is a win even on a $100 64MB
card. The built-in card in most servers is probably not worth
bothering with, but many servers offer PCI Express these days.


/Benny



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] GPUSort project

2006-04-13 Thread mark
On Wed, Apr 12, 2006 at 09:31:52AM +0200, Benny Amorsen wrote:
> >>>>> "MvO" == Martijn van Oosterhout <kleptog@svana.org> writes:
> MvO> Is this of practical use for run-of-the-mill video cards?
> The article suggests that using the GPU is a win even on a $100 64MB
> card. The built-in card in most servers is probably not worth
> bothering with, but many servers offer PCI Express these days.

I would imagine that one of the major benefits of modern video
cards (not cheap ones, though) is the faster memory they use.
GDDR3... :-)

My minimal research on the subject suggests that this memory delivers double
or triple the performance of regular DDR or DDR2.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] GPUSort project

2006-04-12 Thread Mischa Sandberg

Martijn van Oosterhout wrote:

On Tue, Apr 11, 2006 at 04:02:07PM -0700, Mischa Sandberg wrote:

Anybody on this list hear/opine anything of the GPUSort project for 
postgresql? I'm working on a radix-sort subcase for tuplesort, and there 
are similarities.


http://www.andrew.cmu.edu/user/ngm/15-823/project/


I've heard it mentioned, didn't know they'd got it working. However,
none of my database servers have a 3D graphics card anywhere near the power
they suggest in the article.

Is this of practical use for run-of-the-mill video cards?


Short answer: maybe.

Long answer: we're shipping a server (appliance) product built on stock 
rackmount hardware that includes an ATI Rage (8MB) with nothing to do. Much of 
what the box does is a single CPU-bound process, sorting maillog extracts. The 
GPU is an asset, even at 8MB; the headwork is in mapping/truncating sort keys 
down to dense ~32-bit prefixes, and in making smooth judgements as to when to 
give the job to (a) the GPU, (b) quicksort, or (c) a tiny bitonic sort in the 
SSE2 registers.


Any of this would apply to postgres, if tuplesort.c can tolerate a preprocessing 
step that looks for special cases, and degrades gracefully into the standard 
case. I'm guessing that there are enough internal sorts (on oid, for example) 
having only small, memcmp-able sort keys, that this is worth adding in.
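To make the key-prep idea concrete, here is a hypothetical C sketch (not code from tuplesort.c; the names are made up for illustration): pack the leading bytes of each key into a big-endian 32-bit prefix, so that unsigned integer comparison of prefixes agrees with memcmp() on the original keys, falling back to a full comparison only when the prefixes tie.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: build a big-endian 32-bit prefix from the first
 * bytes of a sort key.  Short keys are zero-padded, so an unsigned
 * compare of prefixes orders keys the same way memcmp() would. */
static uint32_t key_prefix32(const unsigned char *key, size_t len)
{
    uint32_t p = 0;
    for (size_t i = 0; i < 4; i++)
        p = (p << 8) | (i < len ? key[i] : 0);   /* pad short keys with 0 */
    return p;
}

/* Compare by prefix first; equal prefixes need a real memcmp tiebreak. */
static int prefix_cmp(const unsigned char *a, size_t alen,
                      const unsigned char *b, size_t blen)
{
    uint32_t pa = key_prefix32(a, alen), pb = key_prefix32(b, blen);
    if (pa != pb)
        return (pa < pb) ? -1 : 1;
    int r = memcmp(a, b, alen < blen ? alen : blen);
    if (r != 0)
        return r;
    return (alen < blen) ? -1 : (alen > blen);   /* shorter key sorts first */
}
```

Most comparisons then touch only the dense prefixes, which is what makes the keys cheap enough to ship to a GPU or an integer sort.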


--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.



Re: [HACKERS] GPUSort project

2006-04-12 Thread Simon Riggs
On Wed, 2006-04-12 at 10:00 -0700, Mischa Sandberg wrote:
 Martijn van Oosterhout wrote:
  On Tue, Apr 11, 2006 at 04:02:07PM -0700, Mischa Sandberg wrote:
  
 Anybody on this list hear/opine anything of the GPUSort project for 
 postgresql? I'm working on a radix-sort subcase for tuplesort, and there 
 are similarities.
 
 http://www.andrew.cmu.edu/user/ngm/15-823/project/
  
  I've heard it mentioned, didn't know they'd got it working. However,
  none of my database servers have a 3D graphics card anywhere near the power
  they suggest in the article.
  
  Is this of practical use for run-of-the-mill video cards?
 
 Short answer: maybe.
 
 Long answer: we're shipping a server (appliance) product built on stock 
 rackmount hardware that includes an ATI Rage (8MB) with nothing to do. Much of 
 what the box does is a single CPU-bound process, sorting maillog extracts. The 
 GPU is an asset, even at 8MB; the headwork is in mapping/truncating sort keys 
 down to dense ~32-bit prefixes, and in making smooth judgements as to when to 
 give the job to (a) the GPU, (b) quicksort, or (c) a tiny bitonic sort in the 
 SSE2 registers.

There's been talk for the last few years in academic circles about
trying to use graphics APIs and/or specialised hardware to improve
various aspects of database technology.

It sounds like it's possible, but it would have to give incredible gains
before it's worth the effort to make it happen. 8MB of video RAM doesn't
score much against 256MB of normal RAM, which is pretty cheap these
days.

The hardware dependency would make this extremely sensitive to change,
so effort in this area might not give lasting benefit. As it happens,
I'm in favour of making code changes to exploit hardware, but this one
is too far for me to encourage anybody to pursue it further.

 Any of this would apply to postgres, if tuplesort.c can tolerate a
 preprocessing step that looks for special cases, and degrades gracefully
 into the standard case.

For other techniques, I think it can, depending upon the cost of the
preprocessing step. But the overall improvement from improving small
sorts could well be lost in the noise...so maybe not worth it.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com/




Re: [HACKERS] GPUSort project

2006-04-12 Thread Mischa Sandberg

[short]
This probably would be an uneasy fit into generic backend code.
Was hoping the GPUSort project might have fleshed out/sorted out some issues.

[long]
Simon Riggs wrote:

On Wed, 2006-04-12 at 10:00 -0700, Mischa Sandberg wrote:

...
Long answer: we're shipping a server (appliance) product built on stock 
rackmount hardware that includes an ATI Rage (8MB) with nothing to do. Much of 
what the box does is a single CPU-bound process, sorting maillog extracts. The 
GPU is an asset, even at 8MB; the headwork is in mapping/truncating sort keys 
down to dense ~32-bit prefixes, and in making smooth judgements as to when to 
give the job to (a) the GPU, (b) quicksort, or (c) a tiny bitonic sort in the 
SSE2 registers.



It sounds like it's possible, but it would have to give incredible gains
before it's worth the effort to make it happen. 8MB of video RAM doesn't
score much against 256MB of normal RAM, which is pretty cheap these
days.


A better comparison is 8MB of video RAM vs. 512K of L2 cache. GPUs also have 
faster access (32GB/s) to their RAM than the CPU has to main memory, over 
AGP/PCI with no contention. Our product uses Xeons instead of Opterons; the 
3GHz CPUs are just slogging, waiting on RAM fetches 70% of the time.



The hardware dependency would make this extremely sensitive to change,
so effort in this area might not give lasting benefit. As it happens,
I'm in favour of making code changes to exploit hardware, but this one
is too far for me to encourage anybody to pursue it further.


Fair comment. I'm using OpenGL, and looking at Glift, so it's not as 
hardware-specific as you might think. Other projects at gpgpu.org seem to be 
able to switch among GPUs.


That being said, I humbly admit that targeting specific hardware tends to give 
one tunnel vision. Coding "if all these conditions are true, use the fast 
algorithm, else do it the normal way" is also messier to extend than a nice 
clean interface layer :-(


Any of this would apply to postgres, if tuplesort.c can tolerate a preprocessing 
step that looks for special cases, and degrades gracefully into the standard 
case. 



For other techniques, I think it can, depending upon the cost of the
preprocessing step. But the overall improvement from improving small
sorts could well be lost in the noise...so maybe not worth it.


Agreed. GPU setup overhead makes sorts under 1MB not worth it.

Small sorts get a boost from bitonic sort in SSE2, which wires into the bottom 
of a special-case quicksort, where any subrange of 9..16 elements gets done in 
xmm registers.
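For the curious, a scalar sketch of such a bitonic network (hypothetical code, not the actual SSE2 version): each compare-exchange is branch-free, which is what lets the same network map onto packed min/max operations in xmm registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch-free compare-exchange: min lands in *a, max in *b. */
static void cswap(uint32_t *a, uint32_t *b)
{
    uint32_t lo = (*a < *b) ? *a : *b;
    uint32_t hi = (*a < *b) ? *b : *a;
    *a = lo;
    *b = hi;
}

/* Textbook in-place bitonic sort; n must be a power of two.  The
 * fixed, data-independent comparison pattern is what a vectorized
 * version exploits: the same network runs on 4 lanes at once. */
static void bitonic_sort(uint32_t *v, size_t n)
{
    for (size_t k = 2; k <= n; k <<= 1)           /* bitonic run size  */
        for (size_t j = k >> 1; j > 0; j >>= 1)   /* compare distance  */
            for (size_t i = 0; i < n; i++) {
                size_t ix = i ^ j;
                if (ix > i) {
                    if ((i & k) == 0)
                        cswap(&v[i], &v[ix]);     /* ascending half    */
                    else
                        cswap(&v[ix], &v[i]);     /* descending half   */
                }
            }
}
```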


I think the preprocessing to test and format keys for such sorts
is useful anyway. I was trying to make radix sort usable, and that requires the 
same key prep. Even if the key prep hits its space limit and says "the input 
is unsuitable for radix sort", it still makes the normal quicksort 
faster, since some key prefixes are shorter.
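A rough sketch of the radix path the key prep enables (hypothetical code, not mine or tuplesort.c's): once keys are mapped to dense 32-bit prefixes, four 8-bit LSD counting-sort passes sort them with no comparisons at all, and a failed key prep just falls back to quicksort.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical LSD radix sort over 32-bit key prefixes: four passes,
 * one byte each, counting sort per pass (stable, so earlier bytes
 * survive later passes). */
static void radix_sort_u32(uint32_t *v, size_t n)
{
    uint32_t *buf = malloc(n * sizeof *v);
    if (!buf)
        return;                                  /* caller would fall back */
    for (int shift = 0; shift < 32; shift += 8) {
        size_t count[257] = {0};
        for (size_t i = 0; i < n; i++)           /* histogram this byte    */
            count[((v[i] >> shift) & 0xFF) + 1]++;
        for (int b = 1; b < 257; b++)            /* prefix sums -> offsets */
            count[b] += count[b - 1];
        for (size_t i = 0; i < n; i++)           /* stable scatter         */
            buf[count[(v[i] >> shift) & 0xFF]++] = v[i];
        memcpy(v, buf, n * sizeof *v);
    }
    free(buf);
}
```

Each pass is a linear scan, so the whole sort is O(n) in the number of prefixes, which is where the win over an O(n log n) comparison sort comes from.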


--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.



Re: [HACKERS] GPUSort project

2006-04-11 Thread Martijn van Oosterhout
On Tue, Apr 11, 2006 at 04:02:07PM -0700, Mischa Sandberg wrote:
 Anybody on this list hear/opine anything of the GPUSort project for 
 postgresql? I'm working on a radix-sort subcase for tuplesort, and there 
 are similarities.
 
 http://www.andrew.cmu.edu/user/ngm/15-823/project/

I've heard it mentioned, didn't know they'd got it working. However,
none of my database servers have a 3D graphics card anywhere near the power
they suggest in the article.

Is this of practical use for run-of-the-mill video cards?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.

