Re: [HACKERS] CUDA Sorting
On 02/11/2012 08:14 PM, Gaetano Mendola wrote:
> The trend is to have servers capable of running CUDA by providing the GPU
> via external hardware (a PCI Express interface with PCI Express switches);
> look for example at the PowerEdge C410x PCIe Expansion Chassis from DELL.

The C410X adds 16 PCIe slots to a server, housed inside a separate 3U enclosure. That's a completely sensible purchase if your goal is to build a computing cluster, where a lot of work is handed off to a set of GPUs. I think that's even less likely to be a cost-effective option for a database server. Adding a single dedicated GPU installed in a server to accelerate sorting is something that might be justifiable, based on your benchmarks. This is a much more expensive option than that though. Details at http://www.dell.com/us/enterprise/p/poweredge-c410x/pd for anyone who wants to see just how big this external box is.

> I did some experiments timing the sort done with CUDA and the sort done
> with pg_qsort:
>
>                      CUDA       pg_qsort
> 33 million integers: ~ 900 ms   ~ 6000 ms
> 1 million integers:  ~ 21 ms    ~ 162 ms
> 100k integers:       ~ 2 ms     ~ 13 ms
>
> The CUDA time already includes the copy operations (host->device,
> device->host). As the GPU I was using a C2050, and the CPU doing the
> pg_qsort was an Intel(R) Xeon(R) CPU X5650 @ 2.67GHz.

That's really interesting, and the X5650 is by no means a slow CPU. So this benchmark is providing a lot of CPU power yet still seeing over a 6X speedup in sort times. It sounds like the PCI Express bus has gotten fast enough that the time to hand data over and get it back again can easily be justified for medium to large sized sorts.

It would be helpful to take this patch and confirm whether it scales when used in parallel. The easiest way to do that would be the pgbench "-f" feature, which allows running an arbitrary query any number of times at once. Seeing whether this acceleration continues to hold as the number of clients increases would be a useful data point.
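For the CPU side, the shape of that comparison is easy to reproduce in miniature. Below is a toy Python harness (illustrative only -- the thread's numbers came from pg_qsort in C versus a CUDA sort on a C2050, so absolute times will differ wildly) showing the same measure-at-several-sizes methodology:

```python
# Toy harness mirroring the benchmark methodology above: time a sort at
# several input sizes and report the scaling.  Illustrative only -- the
# figures in this thread came from pg_qsort (C) vs. a CUDA sort on a C2050.
import random
import time

def time_sort_ms(n, seed=42):
    """Sort n random integers and return (sorted_list, elapsed_ms)."""
    rng = random.Random(seed)
    data = [rng.randrange(1 << 31) for _ in range(n)]
    start = time.perf_counter()
    data.sort()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return data, elapsed_ms

if __name__ == "__main__":
    for n in (100_000, 1_000_000):
        _, ms = time_sort_ms(n)
        print(f"{n:>9} integers: ~{ms:.0f} ms")
```

Running this at 33M/1M/100k elements gives a rough feel for how sort time scales with input size, which is the interesting axis when deciding where GPU offload starts to pay for the bus transfer.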
Is it possible for you to break down where the time is being spent? For example, how much of this time is consumed in the GPU itself, compared to time spent transferring data between CPU and GPU? I'm also curious where the bottleneck is with this approach. If it's the speed of the PCI-E bus for smaller data sets, adding more GPUs may never be practical. If the bus can handle quite a few of these at once before it saturates, it might be possible to overload a single GPU instead. That seems like it would be really hard to reach for database sorting, though; I can't really justify my gut feel for that being true.

> I've never seen a PostgreSQL server capable of running CUDA, and I
> don't expect that to change.
>
> That sounds like:
>
> "I think there is a world market for maybe five computers."
> - IBM Chairman Thomas Watson, 1943

Yes, and "640K will be enough for everyone", ha ha. (Having said the 640K thing is flat-out denied by Gates, BTW, and no one has come up with proof otherwise).

I think you've made an interesting case for this sort of acceleration now being useful for systems doing what's typically considered a data warehouse task. I regularly see servers waiting for far more than 13M integers to sort. And I am seeing a clear trend toward providing more PCI-E slots in servers now. Dell's R810 is the most popular single server model my customers have deployed in the last year, and it has 5 X8 slots in it. It's rare that all 5 of those are filled. As long as a dedicated GPU works fine when dropped to X8 speeds, I know a fair number of systems where one of those could be added now.

There's another data point in your favor I didn't notice before your last e-mail. Amazon has a "Cluster GPU Quadruple Extra Large" node type that runs with NVIDIA Tesla hardware. That means the installed base of people who could consider CUDA is larger than I expected.
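As a rough way to frame the transfer-versus-kernel question above, here is a back-of-the-envelope model in Python. The bandwidth figure and the both-directions assumption are mine, not measured values from the thread:

```python
# Back-of-the-envelope split of GPU sort time into bus transfer vs. compute.
# Assumed numbers (not measured): PCIe 2.0 x16 sustains ~8 GB/s, integers
# are 4 bytes, and data crosses the bus twice (host->device, device->host).
PCIE_GBPS = 8.0  # GB/s, assumed sustained PCIe bandwidth

def transfer_ms(n_ints, gbps=PCIE_GBPS):
    """Milliseconds spent moving n 4-byte ints across the bus, both ways."""
    bytes_moved = 2 * 4 * n_ints
    return bytes_moved / (gbps * 1e9) * 1000.0

def compute_share(total_ms, n_ints):
    """Fraction of the total GPU time left for the sort kernel itself."""
    t = transfer_ms(n_ints)
    return max(0.0, (total_ms - t) / total_ms)

if __name__ == "__main__":
    # Using the 33M-integer figure from the thread (~900 ms total):
    print(f"transfer: ~{transfer_ms(33_000_000):.0f} ms of 900 ms")
    print(f"kernel share: ~{compute_share(900.0, 33_000_000):.0%}")
```

Under those assumptions the bus accounts for only a few percent of the 33M-integer case, which would suggest the kernel, not PCI-E, dominates at large sizes; an actual profiled breakdown from the patch author would settle it.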
To demonstrate how much that costs, provisioning a GPU-enabled reserved instance from Amazon for one year costs $2410 at "Light Utilization", giving a system with 22GB of RAM and 1690GB of storage. (I find the reserved prices easier to compare with dedicated hardware than the hourly ones.) That's halfway between the High-Memory Double Extra Large Instance (34GB RAM/850GB disk) at $1100 and the High-Memory Quadruple Extra Large Instance (64GB RAM/1690GB disk) at $2200. If someone could prove sorting was a bottleneck on their server, that isn't an unreasonable option to consider on a cloud-based database deployment.

I still think that an approach based on OpenCL is more likely to be suitable for PostgreSQL, which was part of why I gave CUDA low odds here. The points in favor of OpenCL are:

- Since you last posted, OpenCL compiling has switched to using LLVM as their standard compiler. Good PostgreSQL support for LLVM isn't far away. It looks to me like the compiler situation for CUDA requires their PathScale based
Re: [HACKERS] [COMMITTERS] pgsql: Add new keywords SNAPSHOT and TYPES to the keyword list in gram.
On 11.02.2012 11:57, Alex Hunsaker wrote:
> On Thu, Feb 9, 2012 at 11:30, Tom Lane wrote:
>> Alvaro Herrera writes:
>>> Excerpts from Tom Lane's message of jue feb 09 12:17:59 -0300 2012:
>>> FWIW that script is throwing a warning here:
>>> Use of assignment to $[ is deprecated at /pgsql/source/HEAD/src/tools/check_keywords.pl line 19.
>>
>> Any Perl hackers want to improve that script?
>
> The attached 2-liner does it for me.

Thanks, applied!

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of pg_archivecleanup -x option patch
On 02/01/2012 07:53 AM, Alex Shulgin wrote:
> One problem I've found while trying the example workflow is this:
>
> ~/local/postgresql/HEAD$ ./bin/pg_archivecleanup -d -x .gz data/archive/ 00010002.0020.backup.gz
> pg_archivecleanup: invalid filename input
> Try "pg_archivecleanup --help" for more information.
>
> I think we could accept the suffixed WAL filename and strip the suffix; another idea is to actually detect the suffix (if any) from the last command line argument, thus rendering the -x option unnecessary.

Going full-on automatic with detecting the suffix seems like it might do the wrong thing when presented with a strange extension. I know that a lot of the time, I use pg_archivecleanup inside scripts that tend to get set up and then ignored. If one of those has a bug, or unexpected data appears in the archive, I'd rather see this sort of major, obvious failure--not an attempt to sort things out that might make a bad decision.

The smaller step of automatically stripping the specified suffix from the input file name, when it matches the one we've told the program to expect, seems like a usability improvement unlikely to lead to bad things though. I've certainly made the mistake you've shown when using the patched version of the program myself; I just didn't think about catching and correcting it before. We can rev this to add that feature and resubmit easily enough, and will turn that around soon.

-- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
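The suffix-stripping behavior proposed above is small enough to sketch. This Python fragment mirrors the intended logic (it is not pg_archivecleanup's actual C code, and the helper name is made up): strip the -x suffix from the restartfilename argument only when it actually matches, then let the normal WAL-name validation run.

```python
# Sketch of the proposed usability fix: if the restartfilename argument
# ends with the compression suffix given via -x, trim it before the
# usual WAL-name validation.  Hypothetical logic, not the program's code.
def strip_known_suffix(fname: str, suffix: str) -> str:
    """Return fname with suffix removed iff it is present at the end."""
    if suffix and fname.endswith(suffix) and len(fname) > len(suffix):
        return fname[: -len(suffix)]
    return fname

if __name__ == "__main__":
    # Example with a made-up backup-history file name:
    print(strip_known_suffix("000000010000000000000002.00000020.backup.gz", ".gz"))
```

A name with a strange, unexpected extension is left untouched, so it still fails the later validation loudly, which preserves the "major, obvious failure" property argued for above.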
[HACKERS] Re: [COMMITTERS] pgsql: Correctly initialise shared recoveryLastRecPtr in recovery.
On Mon, Jan 16, 2012 at 10:44 AM, Fujii Masao wrote:
> On Fri, Jan 13, 2012 at 10:05 PM, Simon Riggs wrote:
>> Correctly initialise shared recoveryLastRecPtr in recovery.
>> Previously we used ReadRecPtr rather than EndRecPtr, which was
>> not a serious error but caused pg_stat_replication to report
>> incorrect replay_location until at least one WAL record is replayed.
>
> Thanks for the commit!
>
> This needs to be backported to 9.1 and 9.0 because they also have
> the same problem.

http://archives.postgresql.org/message-id/14373.1328890248%40sss.pgh.pa.us

Minor version updates are scheduled to be released at the end of this month. Before that, can you backport this commit to 9.1 and 9.0?

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] random_page_cost vs seq_page_cost
On 02/11/2012 07:53 PM, Jeff Janes wrote:
> Has it ever been well-characterized what the problem is with >8GB?

I've used shared buffers above that size for testing purposes and could never provoke a problem with it. If anyone ever manages to characterize it well, we might actually make progress on isolating and fixing it. All we have so far are a couple of application-level test results suggesting a higher value caused performance to drop. The first public one I remember was from Jignesh; http://archives.postgresql.org/pgsql-performance/2008-02/msg00184.php has him quoted on where he found the Solaris roll-off to be. What we really need to stomp this one down is for someone to reproduce the same thing, then show profiler output in each case. Note that Jignesh's report included a significant amount of filesystem-level tuning, using things like more direct I/O, and that might be a necessary requirement to run into the exact variant of this limitation he mentioned.

I haven't spent a lot of time looking for this problem myself. What I've heard second-hand from more than one person now is a) settings larger than 8GB can still be an improvement for some people, and b) simple benchmarks don't always have this problem. I have noted that the few public and private reports I've gotten all suggest problems show up on benchmarks of more complicated workloads. I think Jignesh mentioned this being obvious in the more complicated TPC-derived benchmarks, not in simple things like pgbench. I may be misquoting him though. And given that one of the possible causes for this was an excess of some lock contention, it's quite possible this one is already gone from 9.2, given the large number of lock-related issues that have been squashed so far in this release. All of those disclaimers are why I think no one has pushed to put a note about this in the official docs.
Right now the only suggested limit is this one: "The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB."

The most common practical limit I've run into with large shared_buffers settings hits earlier than 8GB: running into checkpoint spike issues. I have installs that started with shared_buffers in the 4 to 8GB range, where we saw badly spiking I/O at checkpoint sync time. Lowering the database's cache can result in smarter writing decisions within the OS, improving latency--even though total writes are actually higher if you measure what flows from the database to the OS. That latency vs. throughput trade-off is one of the main reasons I haven't gone chasing after problems with really large shared_buffers settings.

-- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work outside WALInsertLock)
On Thu, Feb 9, 2012 at 3:02 AM, Fujii Masao wrote:
> On Thu, Feb 9, 2012 at 7:25 PM, Fujii Masao wrote:
>> On Thu, Feb 9, 2012 at 3:32 AM, Jeff Janes wrote:
>>>
>>> After applying this patch and then forcing crashes, upon recovery the
>>> database is not correct.
>>>
>>> If I make a table with 10,000 rows and then after that intensively
>>> update it using a unique key:
>>>
>>> update foo set count=count+1 where foobar=?
>>>
>>> Then after the crash there are fewer than 10,000 visible rows:
>>>
>>> select count(*) from foo
>>>
>>> This is not a subtle thing, it happens every time. I get counts of
>>> between 1973 and 8827. Without this patch I always get exactly
>>> 10,000.
>>>
>>> I don't really know where to start on tracking this down.
>>
>> A similar problem happened in my test. When I executed CREATE TABLE and
>> shut down the server with immediate mode, after recovery I could not see
>> the created table. Here is the server log of recovery with wal_debug = on:
>>
>> LOG: database system was interrupted; last known up at 2012-02-09 19:18:50 JST
>> LOG: database system was not properly shut down; automatic recovery in progress
>> LOG: redo starts at 0/179CC90
>> LOG: REDO @ 0/179CC90; LSN 0/179CCB8: prev 0/179CC30; xid 0; len 4: XLOG - nextOid: 24576
>> LOG: REDO @ 0/179CCB8; LSN 0/179CCE8: prev 0/179CC90; xid 0; len 16: Storage - file create: base/12277/16384
>> LOG: REDO @ 0/179CCE8; LSN 0/179DDE0: prev 0/179CCB8; xid 998; len 21; bkpb1: Heap - insert: rel 1663/12277/12014; tid 7/22
>> LOG: there is no contrecord flag in log file 0, segment 1, offset 7987200
>> LOG: redo done at 0/179CCE8
>>
>> According to the log "there is no contrecord flag", ISTM the patch treats
>> the contrecord of the backup block incorrectly, and that causes the problem.
>
> Yep, as far as I read the patch, it seems to have forgotten to set the
> XLP_FIRST_IS_CONTRECORD flag.

Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
I have no idea if I did it correctly, in particular whether calling GetXLogBuffer(CurrPos) twice is OK, or if GetXLogBuffer has side effects that make that a bad thing to do. I'm not proposing it as the real fix; I just wanted to get around this problem in order to do more testing.

It does get rid of the "there is no contrecord flag" errors, but recovery still does not work. Now the count of tuples in the table is always correct (I never provoke a crash during the initial table load), but sometimes updates to those tuples that were reported to have been committed are lost. This is more subtle; it does not happen on every crash. It seems that when recovery ends on "record with zero length at...", the recovery is correct. But when it ends on "invalid magic number in log file..." then the recovery is screwed up.

Cheers, Jeff

xloginsert_fix.patch Description: Binary data
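For readers following along: the flag at issue marks a WAL page whose first bytes are the continuation of a record begun on the previous page. A heavily simplified toy model (Python; the real xlog page layout is far more involved) of why forgetting to set it breaks replay:

```python
# Toy model of XLP_FIRST_IS_CONTRECORD: a record spilling across a page
# boundary must be marked as a continuation on the next page's header,
# or replay stops ("there is no contrecord flag").  Greatly simplified.
PAGE = 8  # toy page capacity, in arbitrary units

def write_record(pages, length, set_flag=True):
    """Append one record, starting new pages as needed.  Each page is a
    dict {"cont": bool, "data": int}; "cont" mimics the header flag."""
    if not pages:
        pages.append({"cont": False, "data": 0})
    remaining = length
    while remaining:
        room = PAGE - pages[-1]["data"]
        if room == 0:
            # New page: flag it as a continuation iff we are mid-record
            # and the writer remembered to set the flag (the bug here).
            pages.append({"cont": set_flag, "data": 0})
            room = PAGE
        chunk = min(remaining, room)
        pages[-1]["data"] += chunk
        remaining -= chunk
    return pages

def replay_crosses_boundary(pages, start_page):
    """Replaying a record that continues onto start_page+1 succeeds only
    if the continuation page carries the flag."""
    return pages[start_page + 1]["cont"]
```

With `set_flag=False` the continuation page looks like it starts a fresh record, so replay has no choice but to stop at the boundary -- which is consistent with the "redo done" point landing just before the spilled record in the log above.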
[HACKERS] 3rd Cluster Hackers Summit, May 15th in Ottawa
Hackers,

NTT Open Source has requested that I convene the 3rd Cluster Hackers Summit at pgCon this year. As last year, it will be held on Tuesday (May 15th) during tutorials (and not conflicting with the Developer Summit). If you are a contributor to any of PostgreSQL's various replication, clustering, or virtualization and multiserver management tools, you are invited to attend. Please RSVP (see below).

Draft agenda follows. Please let me know of any contributions/changes to the agenda you have:

= Project Reports: 5 minutes from each project
* Hot Standby/Binary Replication
* pgPoolII
* PostgresXC
* Your Project Here

= Technical Issues of common interest
* SSI in cluster/replication
* Parser export
* Managing consistent views of data
* Fault detection and handling
* Node addition/removal
* Configuration and operation
* Cursor in replication/multi master
* Your Issue Here

The Cluster Summit will be from 10am to 5pm, with a break for lunch, which will be provided, sponsored by NTT.

If you will be able to attend, please respond (offlist) to this email with the following:
* Your Name
* Project(s) you work on
* If you will be giving a Project Report
* If you have additions to the agenda
* Special dietary needs for lunch, if any
* If you need travel assistance

Note that the availability of travel funding is not guaranteed; I can just agree to request it.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
Re: [HACKERS] Should I implement DROP INDEX CONCURRENTLY?
On Sun, Jan 8, 2012 at 8:19 AM, Simon Riggs wrote:
> On Wed, Jan 4, 2012 at 11:14 AM, Simon Riggs wrote:
>
>> Not having a freelist at all is probably a simpler way of avoiding the
>> lock contention, so I'll happily back that suggestion instead. Patch
>> attached, previous patch revoked.
>
> v2 attached with cleanup of some random stuff that crept onto the patch.

Hi Simon,

Based on the way this patch leaves the old code behind (using #ifdef), this looks more like a WIP patch which you want people to do performance testing with, rather than a patch proposed for committing. If that is the case, could you outline the type of performance testing where you think it would make a difference (and whether it should be done on top of the main patch from this thread, the concurrent index drop one)?

Also, it would be much easier to do the performance testing if this behavior was controlled by a temporarily added GUC, rather than an #ifdef. Do you think it is feasible to do that, or would the overhead of a single "if (some_guc)" per StrategyGetBuffer and StrategyFreeBuffer call distort things too much?

Cheers, Jeff
Re: [HACKERS] CUDA Sorting
On 12/02/2012 13:13, Oleg Bartunov wrote:
> I'm wondering if CUDA will win in geometry operations, for example,
> testing point <@ complex_polygon

I'm not sure if the algorithm you mentioned can be implemented in terms of vector algebra, BLAS, etc. There are plenty of geometry operations implemented in CUDA out there; my field of CUDA application is not this one, so I'm not that deep into it. However, I can point you to the official NVIDIA NPP library, which provides vector algebra algorithms, and some geometry algorithms as well:

http://developer.download.nvidia.com/compute/DevZone/docs/html/CUDALibraries/doc/NPP_Library.pdf
(take a look at around page 620).

Regards
Gaetano Mendola

> Oleg
>
> On Sun, 12 Feb 2012, Gaetano Mendola wrote:
>> On 19/09/2011 16:36, Greg Smith wrote:
>>> On 09/19/2011 10:12 AM, Greg Stark wrote:
>>>> With the GPU I'm curious to see how well it handles multiple
>>>> processes contending for resources; it might be a flashy feature
>>>> that gets lots of attention but might not really be very useful in
>>>> practice. But it would be very interesting to see.
>>>
>>> The main problem here is that the sort of hardware commonly used for
>>> production database servers doesn't have any serious enough GPU to
>>> support CUDA/OpenCL available. The very clear trend now is that all
>>> systems other than gaming ones ship with motherboard graphics
>>> chipsets more than powerful enough for any task but that. I just
>>> checked the 5 most popular configurations of server I see my
>>> customers deploy PostgreSQL onto (a mix of Dell and HP units), and
>>> you don't get a serious GPU from any of them.
>>>
>>> Intel's next generation Ivy Bridge chipset, expected for the spring
>>> of 2012, is going to add support for OpenCL to the built-in
>>> motherboard GPU. We may eventually see that trickle into the server
>>> hardware side of things too.
>>
>> The trend is to have servers capable of running CUDA by providing the
>> GPU via external hardware (a PCI Express interface with PCI Express
>> switches); look for example at the PowerEdge C410x PCIe Expansion
>> Chassis from DELL.
>> I did some experiments timing the sort done with CUDA and the sort
>> done with pg_qsort:
>>
>>                      CUDA       pg_qsort
>> 33 million integers: ~ 900 ms   ~ 6000 ms
>> 1 million integers:  ~ 21 ms    ~ 162 ms
>> 100k integers:       ~ 2 ms     ~ 13 ms
>>
>> The CUDA time already includes the copy operations (host->device,
>> device->host). As the GPU I was using a C2050, and the CPU doing the
>> pg_qsort was an Intel(R) Xeon(R) CPU X5650 @ 2.67GHz.
>>
>> Copy operations and kernel runs (the sort, for instance) can run in
>> parallel, so while you are sorting a batch of data, you can copy the
>> next batch in parallel. As you can see, the boost is not negligible.
>> The next Nvidia hardware (Kepler family) is PCI Express 3 ready, so
>> expect the "bottleneck" of the device->host->device copies to have
>> less impact in the near future.
>>
>> I strongly believe there is room to provide modern database engines
>> with a way to offload sorts to the GPU.
>>
>>> I've never seen a PostgreSQL server capable of running CUDA, and I
>>> don't expect that to change.
>>
>> That sounds like:
>>
>> "I think there is a world market for maybe five computers."
>> - IBM Chairman Thomas Watson, 1943
>>
>> Regards
>> Gaetano Mendola
>
> Regards,
> Oleg
> _
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
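Oleg's point <@ complex_polygon example is a natural data-parallel workload, since every point can be tested independently (on a GPU, one thread per point). As a reference for what each thread would compute, here is the classic even-odd (ray casting) test in plain Python; boundary cases are deliberately simplified:

```python
# Even-odd (ray casting) point-in-polygon test, the kind of per-point
# work that parallelises naturally on a GPU (one thread per point).
def point_in_polygon(px, py, poly):
    """poly is a list of (x, y) vertices; returns True if (px, py) is
    inside, using the even-odd rule (boundary cases simplified)."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge straddle the horizontal line through py, and is
        # the crossing to the right of px?  If so, toggle the parity.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    return inside
```

For a truly "complex" polygon with many vertices, the per-point cost grows with vertex count, which is exactly the regime where shipping a large batch of points across the bus could pay off.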
Re: [HACKERS] Checkpoint sync pause
On Tue, Feb 7, 2012 at 1:22 PM, Greg Smith wrote:
> On 02/03/2012 11:41 PM, Jeff Janes wrote:
>>>
>>> -The steady stream of backend writes that happen between checkpoints
>>> have filled up most of the OS write cache. A look at /proc/meminfo
>>> shows around 2.5GB "Dirty:"
>>
>> "backend writes" includes bgwriter writes, right?
>
> Right.
>
>> Has using a newer kernel with dirty_background_bytes been tried, so it
>> could be set to a lower level? If so, how did it do? Or does it just
>> refuse to obey below the 5% level, as well?
>
> Trying to dip below 5% using dirty_background_bytes slows VACUUM down
> faster than it improves checkpoint latency.

Does it cause VACUUM to create latency for other processes (like the checkpoint syncs do, by gumming up the I/O for everyone) or does VACUUM just slow down without affecting other tasks? It seems to me that just lowering dirty_background_bytes (while not also lowering dirty_bytes) should not cause the latter to happen, but it seems like these kernel tunables never do exactly what they advertise.

This may not be relevant to the current situation, but I wonder if we don't need a "vacuum_cost_page_dirty_seq" so that if the pages we are dirtying are consecutive (or at least closely spaced) they cost less, in anticipation that the eventual writes will be combined and thus consume less I/O resources. I would think it would be common for some regions of a table to be intensely dirtied, and some to be lightly dirtied (but still aggregating up to a considerable amount of random I/O). But the vacuum process might also need to be made more "bursty", as even if it generates sequential dirty pages, the I/O system might write them randomly anyway if there are too many delays interspersed.

> Since the sort of servers that have checkpoint issues are quite often
> ones that have VACUUM issues, too, that whole path doesn't seem very
> productive.
> The one test I haven't tried yet is whether increasing the size of the
> VACUUM ring buffer might improve how well the server responds to a
> lower write cache.

I wouldn't expect this to help. It seems like it would hurt, as it just leaves the data around for even longer (however long it takes to circumnavigate the ring buffer) before there is any possibility of it getting written. I guess it does increase the chances that the dirty pages will "accidentally" get written by the bgwriter rather than the vacuum itself, but I doubt that that would be significant.

...

>> Was the sorted checkpoint with an fsync after every file (real file,
>> not VFD) one of the changes you tried?
>
> ...
>
> I haven't had very good luck with sorting checkpoints at the PostgreSQL
> relation level on server-size systems. There is a lot of sorting
> already happening at both the OS (~3GB) and BBWC (>=512MB) levels on
> this server. My own tests on my smaller test server--with a scaled
> down OS (~750MB) and BBWC (256MB) cache--haven't ever validated sorting
> as a useful technique on top of that. It's never bubbled up to being
> considered a likely win on the production one as a result.

Without sorted checkpoints (or some other fancier method) you have to write out the entire pool before you can do any fsyncs. Or you have to do multiple fsyncs of the same file, with at least one occurring after the entire pool was written. With a sorted checkpoint, you can start issuing once-only fsyncs very early in the checkpoint process. I think that on large servers, that would be the main benefit, not the actually more efficient I/O. (On small servers I've seen sorted checkpoints be much faster on shutdown checkpoints, but not on natural checkpoints, and presumably this improvement *is* due to better ordering.)

On your servers, you need big delays between fsyncs and not between writes (as they are buffered until the fsync). But in other situations, people need the delays between the writes.
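The schedule described above -- sort dirty buffers by file, write each file's blocks together, fsync that file immediately -- can be reduced to a toy model. This Python sketch only records the ordering of operations; in the real server the equivalent work would happen against shared_buffers and the smgr layer:

```python
# Toy sketch of a sorted checkpoint: group dirty buffers by file, write
# each file's blocks in order, then fsync that file once, immediately.
# Contrast: an unsorted checkpoint must finish writing everything before
# its final fsyncs.  ("write"/"fsync" here just record the schedule.)
from collections import defaultdict

def sorted_checkpoint(dirty_buffers):
    """dirty_buffers: iterable of (file, block).  Returns the operation
    schedule as a list of ("write", file, block) / ("fsync", file)."""
    by_file = defaultdict(list)
    for f, block in dirty_buffers:
        by_file[f].append(block)
    schedule = []
    for f in sorted(by_file):
        for block in sorted(by_file[f]):
            schedule.append(("write", f, block))
        # One fsync per file, issued as soon as that file is written;
        # a pause between fsyncs could naturally be inserted here.
        schedule.append(("fsync", f))
    return schedule
```

Note how the first fsync lands right after the first file's writes rather than at the very end of the checkpoint, which is the early-fsync property being argued for.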
By using sorted checkpoints with fsyncs between each file, the delays between writes are naturally delays between fsyncs as well. So I think the benefit of using sorted checkpoints is that code to improve your situation is less likely to degrade someone else's situation, without having to introduce an extra layer of tunables.

>> What I/O are they trying to do? It seems like all your data is in RAM
>> (if not, I'm surprised you can get queries to run fast enough to
>> create this much dirty data). So they probably aren't blocking on
>> reads which are being interfered with by all the attempted writes.
>
> Reads on infrequently read data. Long tail again; even though caching
> is close to 100%, the occasional outlier client who wants some rarely
> accessed page with their personal data on it shows up. Pollute the
> write caches badly enough, and what happens to reads mixed into there
> gets very fuzzy. Depends on the exact mechanics of the I/O scheduler
> used in the kernel version deployed.

OK, but I would still think it is a minority of transactions whi
Re: [HACKERS] CUDA Sorting
I'm wondering if CUDA will win in geometry operations, for example, testing point <@ complex_polygon

Oleg

On Sun, 12 Feb 2012, Gaetano Mendola wrote:
> On 19/09/2011 16:36, Greg Smith wrote:
>> On 09/19/2011 10:12 AM, Greg Stark wrote:
>>> With the GPU I'm curious to see how well it handles multiple
>>> processes contending for resources; it might be a flashy feature
>>> that gets lots of attention but might not really be very useful in
>>> practice. But it would be very interesting to see.
>>
>> The main problem here is that the sort of hardware commonly used for
>> production database servers doesn't have any serious enough GPU to
>> support CUDA/OpenCL available. The very clear trend now is that all
>> systems other than gaming ones ship with motherboard graphics
>> chipsets more than powerful enough for any task but that. I just
>> checked the 5 most popular configurations of server I see my
>> customers deploy PostgreSQL onto (a mix of Dell and HP units), and
>> you don't get a serious GPU from any of them.
>>
>> Intel's next generation Ivy Bridge chipset, expected for the spring
>> of 2012, is going to add support for OpenCL to the built-in
>> motherboard GPU. We may eventually see that trickle into the server
>> hardware side of things too.
>
> The trend is to have servers capable of running CUDA by providing the
> GPU via external hardware (a PCI Express interface with PCI Express
> switches); look for example at the PowerEdge C410x PCIe Expansion
> Chassis from DELL.
>
> I did some experiments timing the sort done with CUDA and the sort
> done with pg_qsort:
>
>                      CUDA       pg_qsort
> 33 million integers: ~ 900 ms   ~ 6000 ms
> 1 million integers:  ~ 21 ms    ~ 162 ms
> 100k integers:       ~ 2 ms     ~ 13 ms
>
> The CUDA time already includes the copy operations (host->device,
> device->host). As the GPU I was using a C2050, and the CPU doing the
> pg_qsort was an Intel(R) Xeon(R) CPU X5650 @ 2.67GHz.
>
> Copy operations and kernel runs (the sort, for instance) can run in
> parallel, so while you are sorting a batch of data, you can copy the
> next batch in parallel. As you can see, the boost is not negligible.
> The next Nvidia hardware (Kepler family) is PCI Express 3 ready, so
> expect the "bottleneck" of the device->host->device copies to have
> less impact in the near future.
>
> I strongly believe there is room to provide modern database engines
> with a way to offload sorts to the GPU.
>
>> I've never seen a PostgreSQL server capable of running CUDA, and I
>> don't expect that to change.
>
> That sounds like:
>
> "I think there is a world market for maybe five computers."
> - IBM Chairman Thomas Watson, 1943
>
> Regards
> Gaetano Mendola

Regards, Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83