On Fri, May 24, 2013 at 01:13:21PM -0500, Jim Nasby wrote:
> On 5/13/13 9:28 AM, Noah Misch wrote:
>> It would be great if one client session could take advantage of multiple CPU
>> cores. EnterpriseDB wishes to start the trek into this problem space for 9.4
>> by implementing parallel internal (i.e. not spilling to disk) sort. This
>> touches on a notable subset of the infrastructure components we'll need for
>> parallel general query. My intent is to map out the key design topics, hear
>> about critical topics I hadn't considered, and solicit feedback on the
>> quality of the high-level plan. Full designs for key pieces will come later.
>
> Have you considered GPU-based sorting? I know there's been discussion in the
> past.

I had considered it briefly. Parallel sort is mainly valuable for expensive
comparison operators. Sorting int4, for example, is too cheap for parallelism
to be compelling. (In my test build of a 16 GiB int4 index, sorting took 11s
of the 391s build time.) However, expensive operators are also liable to be
difficult to reimplement for the GPU. In particular, implementing a GPU-based
strcoll() for bttextcmp sounds like quite a project in its own right.

> To me, the biggest advantage of GPU sorting is that most of the concerns
> you've laid out go away; a backend that needs to sort just throws data at the
> GPU to do the actual sorting; all the MVCC issues and what not remain within
> the scope of a single backend.

Those are matters we would eventually need to address as we parallelize more
things, so I regard confronting them as an advantage. Among other benefits,
this project is a vehicle for emplacing some infrastructure without inviting
the full complexity entailed by loftier goals.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers