[HACKERS] trying to study how sorting works
Hello devs, I'm trying to understand exactly how sorting works in Postgres. As I understand it, there are two sorting mechanisms, chosen by the size of the data: an in-memory qsort, used when the data fits in work_mem (1024 kB here), and an external sort, used otherwise. I tried to find more material explaining how this works in detail, but couldn't find any. I'm also studying the code with gdb, attaching it to the backend process (the PID returned by pg_backend_pid()) and setting a breakpoint at raw_parser, from where I start analyzing. Any help material or resources would be really appreciated. Thanks. --Hitesh
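The two paths described above can be illustrated with a small C sketch. It is a toy model under stated assumptions, not PostgreSQL's tuplesort code: `adaptive_sort` and its `budget_bytes` parameter are hypothetical names, `budget_bytes` stands in for work_mem, plain integers stand in for tuples, and a temporary heap buffer stands in for the on-disk runs that a real external sort would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort(): ascending order of int values. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Merge the adjacent sorted runs src[lo..mid) and src[mid..hi) into dst. */
static void merge_runs(const int *src, int *dst,
                       size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;

    while (i < mid && j < hi)
        dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
    while (i < mid)
        dst[k++] = src[i++];
    while (j < hi)
        dst[k++] = src[j++];
}

/*
 * Hypothetical helper: sorts data[0..n) in ascending order.
 * Returns 0 if the in-memory path was taken, 1 if the run-and-merge
 * path was taken, and -1 if the temporary buffer could not be allocated.
 */
static int adaptive_sort(int *data, size_t n, size_t budget_bytes)
{
    size_t run_len = budget_bytes / sizeof(int);

    assert(run_len > 0);

    /* Everything fits in the budget: one in-memory quicksort. */
    if (n <= run_len) {
        qsort(data, n, sizeof(int), cmp_int);
        return 0;
    }

    /* Phase 1: sort budget-sized runs independently. */
    for (size_t lo = 0; lo < n; lo += run_len) {
        size_t len = (n - lo < run_len) ? n - lo : run_len;
        qsort(data + lo, len, sizeof(int), cmp_int);
    }

    /* Phase 2: merge neighbouring runs, doubling the run width each pass. */
    int *tmp = malloc(n * sizeof(int));
    if (tmp == NULL)
        return -1;

    int *src = data, *dst = tmp;
    for (size_t width = run_len; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = (lo + width < n) ? lo + width : n;
            size_t hi = (lo + 2 * width < n) ? lo + 2 * width : n;
            merge_runs(src, dst, lo, mid, hi);
        }
        int *swap = src;
        src = dst;
        dst = swap;
    }

    /* After the last swap, src holds the fully merged result. */
    if (src != data)
        memcpy(data, src, n * sizeof(int));
    free(tmp);
    return 1;
}
```

Calling `adaptive_sort(values, n, 1024 * 1024)` on a small array takes the first path; shrinking the budget below `n * sizeof(int)` forces the second. The return value only exists so the chosen path can be observed from outside.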
Re: [HACKERS] GSoC - Idea Discussion
Hello devs,

Thank you so much for the feedback. To answer your questions:

Tomas:

> So you've created an array of 1M integers, and it's 7x faster on GPU compared to pg_qsort(), correct?

No, I meant general sorting, not pg_qsort().

> Well, it might surprise you, but PostgreSQL almost never sorts numbers like this. PostgreSQL sorts tuples, which is way more complicated and, considering the variable length of tuples (causing issues with memory access), rather unsuitable for GPU devices. I might be missing something, of course. Also, it often needs additional information, like collations when sorting by a text field, for example.

I totally agree with you on this point. My current target area is deliberately narrow because this is only the beginning: I'm only considering integer values in one row.

> Why don't you show us the source code? Would be simpler than explaining what it does.

You can have a look at the code here: https://github.com/hiteshramani/Postgres-CUDA

This is compiled code. You can see the call to the CUDA function in src/port/qsort.c and in the header files qsort_normal.h and qsort_cuda.h. The hello world program is in src/port/qsort_cuda.cu. Compilation happens in 2 phases, compile and link: I compiled the CUDA file with nvcc, and for linking I edited the makefile of src/timezone/, because the zic build needed the CUDA file linked in. Suggestions are welcome.

> I'd recommend discussing the code here. It's certainly quite complex, especially if this is your first encounter with it.

Yes, I found it a little complex and couldn't find many help resources online. I'm looking for help.

> PostgreSQL uses adaptive sort - in-memory when it fits into work_mem, on-disk when it does not. This is decided at runtime. You'll have to do the same thing, because the amount of memory available on GPUs is limited to a few GBs, and it needs to work for datasets exceeding that limit (the amount of data is uncertain at planning time).

Yes, I thought of that too. A call could be made with the integer array as input to the GPU, which then returns the sorted array. I want to proceed step by step, as there are methods to sort amounts of data that exceed the GPU memory.

Álvaro Herrera:

I downloaded the zip of the latest custom_join repo I saw 2 days ago. I'll check once again. Thank you. :)

KaiGai Kohei:

> Let me say CUDA is better than OpenCL :-) Because of software quality of OpenCL runtime drivers provided by each vendor, I've often faced mysterious problems. Only nvidia's runtime are enough reliable from my point of view. In addition, when we implement using OpenCL is a feature fully depends on hardware characteristics, so we cannot ignore physical hardware underlying the abstraction layer. So, I'm now reworking the code to move CUDA from OpenCL.

That's great, I'd love to help you with that and contribute to it.

> It seems to me you are a little bit optimistic. Unlike CPU code, GPU-Sorting logic has to reference device memory space, so all the data to be compared needs to be transferred to GPU devices. Any pointer on host address space is not valid on GPU calculation. Amount of device memory is usually smaller than host memory, so your code needs a capability to combined multiple chunks that is partially sorted... Probably, it is not all here.

Aren't there algorithms that help when device memory is limited and the data is massive? I only have a rough memory of this, from an online course I took, where I believe algorithms for such problems were covered.

Thanks and Regards,
Hitesh Ramani
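The points raised above about variable-length tuples and host pointers suggest one common workaround: copy only fixed-width sort keys, plus each tuple's position, into a flat array that contains no host pointers, sort that array, and then use the resulting order to visit the tuples on the host. The C sketch below illustrates the idea under loud assumptions: `ToyTuple`, `KeyIndex`, and `sort_by_extracted_keys` are hypothetical names, the key is a single 32-bit integer, and `qsort()` merely stands in for whatever device-side sort would actually run. It is not code from the Postgres-CUDA repository or from PG Strom.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a row: an integer sort key plus a
 * variable-length payload that never leaves host memory. */
typedef struct ToyTuple {
    int32_t     key;
    const char *payload;
} ToyTuple;

/* Fixed-width record with no host pointers, so an array of these
 * could be copied into a device buffer as-is. */
typedef struct KeyIndex {
    int32_t  key;
    uint32_t index;     /* position of the tuple in the host array */
} KeyIndex;

/* Order by key; ties fall back to the original position, which keeps
 * the result deterministic for equal keys. */
static int cmp_key_index(const void *a, const void *b)
{
    const KeyIndex *x = a;
    const KeyIndex *y = b;

    if (x->key != y->key)
        return (x->key > y->key) - (x->key < y->key);
    return (x->index > y->index) - (x->index < y->index);
}

/*
 * Fills order[0..n) with tuple positions in ascending key order.
 * Returns 0 on success and -1 if the flat array cannot be allocated.
 */
static int sort_by_extracted_keys(const ToyTuple *tuples, uint32_t n,
                                  uint32_t *order)
{
    if (n == 0)
        return 0;

    KeyIndex *flat = malloc((size_t) n * sizeof(KeyIndex));
    if (flat == NULL)
        return -1;

    /* Step 1: extract (key, position) pairs from the tuples. */
    for (uint32_t i = 0; i < n; i++) {
        flat[i].key = tuples[i].key;
        flat[i].index = i;
    }

    /* Step 2: sort the flat array. qsort() is only a placeholder for
     * the device sort; this is the array that would be transferred. */
    qsort(flat, n, sizeof(KeyIndex), cmp_key_index);

    /* Step 3: hand back the positions so the host can walk its own
     * tuples in sorted order. */
    for (uint32_t i = 0; i < n; i++)
        order[i] = flat[i].index;

    free(flat);
    return 0;
}
```

After the call, `tuples[order[0]]`, `tuples[order[1]]`, and so on are in key order, while the payloads themselves were never moved or copied. Keys that need collation-aware comparison, such as text, do not reduce to a fixed-width integer this easily, which is part of the objection quoted above.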
[HACKERS] GSoC - Idea Discussion
Hello devs, As stated earlier, I am thinking of proposing an integration of Postgres and CUDA to speed up ORDER BY queries by running the sorting code on the GPU. I tried to run PG Strom and ran into issues. Moreover, PG Strom is implemented in OpenCL, not CUDA. I have hardware to run CUDA, and I'm currently at a point where I have almost integrated Postgres and CUDA. This opens the gates for many features that could be optimized through CUDA and parallel processing, though here I only want to focus on sorting, which keeps the project feasible for the time period. From my research, I found that CUDA is more efficient than OpenCL not just in parallel performance but in data transfer latency too. My idea is to create a branch of Postgres with the CUDA-integrated code. As for feasibility, I believe it is very feasible, because I have almost integrated CUDA execution; the code now needs to be optimized for CUDA. Please share your suggestions and views on this. Thanks and Regards, Hitesh Ramani
Re: [HACKERS] GSoC 2015: Introduction and Doubt regarding a Project.
Hello Peter,

> 1. As I did some research on this project, I found date_trunc() supporting intervals was suggested last year but not selected as a GSoC project. Is it being floated this year too (as mentioned on the GSoC 2015 wiki page of Postgres)? If yes, what are the exact expected outputs?

> It seems to me that that would be too small for a GSoC project.

I agree, but I saw the project listed on the GSoC wiki page of Postgres, and that is why I was curious about the expected outputs: if it was considered as a GSoC project, there must be more to it than the title suggests.

--Hitesh
[HACKERS] GSoC 2015: Introduction and Doubt regarding a Project.
Hello,

I introduced myself on the pgsql-students list, but to introduce myself here too: my name is Hitesh Ramani, and I'm a student at Hyderabad Central University, Hyderabad, India. I'm currently pursuing a project in PostgreSQL as my postgraduate project, so I have hacked on the Postgres code many times to add functionality to it, specifically to the sorting code. I'm also familiar with query processing and the backend internals. I have a few doubts that I need to clear up so I can start on the proposal as soon as possible, because applications have opened.

1. As I did some research on this project, I found date_trunc() supporting intervals was suggested last year but not selected as a GSoC project. Is it being floated this year too (as mentioned on the GSoC 2015 wiki page of Postgres)? If yes, what are the exact expected outputs?
2. Can I send proposals for 2 projects to Postgres itself, out of which one can be taken forward?
3. Does PG Strom take care of sorting on GPUs as well?

Thanks and Regards,
Hitesh Ramani