[HACKERS] trying to study how sorting works

2015-03-26 Thread hitesh ramani
Hello devs,
I'm trying to understand exactly how sorting works in Postgres. I've understood 
that there are two sorting mechanisms depending on the size of the data, one 
being qsort which is initiated if workmem is  1024 kb and the other being 
external sort which is initiated in the other case. I tried to find out more 
material to understand how it exactly works but I'm unable to find any help 
material.
Moreover, I'm trying to study the code using gdb by attaching it to the 
pg_backend_pid and having a breakpoint at raw_parser, from where I start 
analyzing. Any help material or resources would be really appreciated.
Thanks.
--Hitesh  

Re: [HACKERS] GSoC - Idea Discussion

2015-03-20 Thread hitesh ramani
Hello devs,
Thank you so much for the feedback. To answer your questions:
Tomas:
So you've created an array of 1M integers, and it's 7x faster on GPU 
compared to pg_qsort(), correct?
No, I meant general sorting, not pg_qsort().
Well, it might surprise you, but PostgreSQL almost never sorts numbers like 
this. PostgreSQL sorts tuples, which is way more complicated and, considering 
the variable length of tuples (causing issues with memory access), rather 
unsuitable for GPU devices. I might be missing something, of course. Also, 
it often needs additional information, like collations when sorting by a text 
field, for example.
I totally agree with you on this point. My current target area is very confined 
as this is the beginning; I'm only considering integer values in one row.
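To make the contrast above concrete, here is a minimal C sketch. It is not PostgreSQL code: the DemoRow type and the function names are hypothetical and exist only for illustration. It shows that sorting bare integers needs only a fixed-width comparison, while sorting rows by a text key has to follow a pointer to variable-length data and consult the locale's collation (here via the standard strcoll()).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical row type for illustration; not PostgreSQL's tuple format. */
typedef struct {
    int         id;
    const char *name;   /* variable-length key, stored out of line */
} DemoRow;

/* Plain integers: fixed width, no extra context needed to compare. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/*
 * Rows ordered by a text key: the comparator dereferences a pointer and
 * asks the current locale's collation for the order. Ties fall back to id.
 */
static int cmp_row_by_name(const void *a, const void *b)
{
    const DemoRow *x = a;
    const DemoRow *y = b;
    int c = strcoll(x->name, y->name);
    if (c != 0)
        return c;
    return (x->id > y->id) - (x->id < y->id);
}

void demo_sort_ints(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(int), cmp_int);
}

void demo_sort_rows(DemoRow *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(DemoRow), cmp_row_by_name);
}
```

The integer path is the easy case to hand to a GPU, because every element is the same size and self-contained. The row path depends on out-of-line data and locale state, which is the complication raised in the quoted text.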
Why don't you show us the source code? Would be simpler than explaining what 
it does.
You can have a look at the code here: 
https://github.com/hiteshramani/Postgres-CUDA
This is compiled code. You can see the call to the CUDA function in 
src/port/qsort.c and in the .h files, qsort_normal.h and qsort_cuda.h. The 
hello world program is in src/port/qsort_cuda.cu. Compilation happens in 2 
phases, compile and link: I compiled the CUDA file with nvcc, and for linking 
I edited the makefile of src/timezone/ because the zic build needed the CUDA 
file to be linked in.
Suggestions are welcome.
I'd recommend discussing the code here. It's certainly quite complex, 
especially if this is your first encounter with it.
Yes, I felt it's a little complex but couldn't find a lot of help resources 
online. I'm looking for help.
PostgreSQL uses adaptive sort - in-memory when it fits into work_mem, on-disk 
when it does not. This is decided at runtime. You'll have to do the same 
thing, because the amount of memory available on GPUs is limited to a few 
GBs, and it needs to work for datasets exceeding that limit (the amount of 
data is uncertain at planning time).
Yes, I thought of that too. A call could be made with the integer array as an 
input to the GPU. The GPU then returns the result with a sorted array. I want 
to proceed step by step, as there are methods to sort amounts of data which 
exceed the GPU memory.
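The run-time switch described in the quoted text can be sketched in C as follows. This is an illustration under stated assumptions, not PostgreSQL's implementation: DemoSorter and all demo_* names are hypothetical, the budget is counted in values rather than bytes (a stand-in for work_mem), and the final merge of spilled runs is left out. The sorter stays in memory as long as the input fits the budget, and only when one more value arrives than fits does it sort the buffer, write it out as a run, and switch to external mode.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { DEMO_IN_MEMORY, DEMO_EXTERNAL } DemoSortMode;

/* Hypothetical sorter state; capacity plays the role of a memory budget. */
typedef struct {
    int         *buf;       /* values currently held in memory */
    size_t       used;
    size_t       capacity;  /* budget, counted in values */
    size_t       runs;      /* sorted runs written to the spill file */
    FILE        *spill;     /* NULL until the budget is first exceeded */
    DemoSortMode mode;
} DemoSorter;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

int demo_sorter_init(DemoSorter *s, size_t capacity)
{
    assert(capacity > 0);
    s->buf = malloc(capacity * sizeof(int));
    s->used = 0;
    s->capacity = capacity;
    s->runs = 0;
    s->spill = NULL;
    s->mode = DEMO_IN_MEMORY;
    return s->buf != NULL ? 0 : -1;
}

/* Sort the in-memory buffer and append it to the spill file as one run. */
static int demo_spill_run(DemoSorter *s)
{
    if (s->spill == NULL && (s->spill = tmpfile()) == NULL)
        return -1;
    qsort(s->buf, s->used, sizeof(int), cmp_int);
    if (fwrite(s->buf, sizeof(int), s->used, s->spill) != s->used)
        return -1;
    s->runs++;
    s->used = 0;
    s->mode = DEMO_EXTERNAL;    /* decided at run time, not in advance */
    return 0;
}

int demo_sorter_add(DemoSorter *s, int value)
{
    /* The budget is exceeded only when a value arrives and no room is left. */
    if (s->used == s->capacity && demo_spill_run(s) != 0)
        return -1;
    s->buf[s->used++] = value;
    return 0;
}

/*
 * End of input. In memory: sort the buffer in place and the result is in
 * buf[0..used). External: flush the last partial run; merging the runs
 * is not shown in this sketch.
 */
int demo_sorter_finish(DemoSorter *s)
{
    if (s->mode == DEMO_IN_MEMORY) {
        qsort(s->buf, s->used, sizeof(int), cmp_int);
        return 0;
    }
    return s->used > 0 ? demo_spill_run(s) : 0;
}

void demo_sorter_free(DemoSorter *s)
{
    free(s->buf);
    if (s->spill != NULL)
        fclose(s->spill);
}
```

A GPU-backed variant would face the same decision twice: once against host memory and once against device memory, since the size of the input is not known until it has been read.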
Álvaro Herrera:
I downloaded the zip of the latest custom_join repo I saw 2 days ago. I'll 
check once again. Thank you. :)
KaiGai Kohei:
Let me say CUDA is better than OpenCL :-) Because of software quality of 
OpenCL runtime drivers provided by each vendor, I've often faced mysterious 
problems. Only nvidia's runtime are enough reliable from my point of view. In 
addition, when we implement using OpenCL is a feature fully depends on 
hardware characteristics, so we cannot ignore physical hardware underlying the 
abstraction layer. So, I'm now reworking the code to move CUDA from OpenCL.
That's great, I'd love to help you with that and contribute in it.
It seems to me you are a little bit optimistic. Unlike CPU code, GPU-Sorting 
logic has to reference device memory space, so all the data to be compared 
needs to be transferred to GPU devices. Any pointer on host address space is 
not valid on GPU calculation. Amount of device memory is usually smaller than 
host memory, so your code needs a capability to combined multiple chunks that 
is partially sorted... Probably, it is not all here.
Aren't there algorithms which help you if the device memory is limited and the 
data is massive? I have a rough memory because I did a course online, where I 
saw algorithms to deal with such problems I suppose.
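One common approach of this kind is to sort the data in chunks that each fit the limited memory and then merge the sorted chunks on the host. The C sketch below illustrates the idea under stated assumptions: all demo_* names are hypothetical, chunk_cap stands in for the device memory limit, and demo_device_sort() uses qsort() on the CPU as a placeholder for "copy the chunk to the GPU, sort it there, copy it back".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Placeholder for a device-side sort of one chunk that fits device memory. */
static void demo_device_sort(int *chunk, size_t n)
{
    qsort(chunk, n, sizeof(int), cmp_int);
}

/*
 * Sort data[0..n) using chunks of at most chunk_cap values.
 * Returns 0 on success, -1 if the merge buffers cannot be allocated.
 */
int demo_chunked_sort(int *data, size_t n, size_t chunk_cap)
{
    assert(chunk_cap > 0);
    if (n == 0)
        return 0;

    size_t nchunks = n / chunk_cap + (n % chunk_cap != 0);

    /* Phase 1: sort each chunk independently. */
    for (size_t c = 0; c < nchunks; c++) {
        size_t start = c * chunk_cap;
        size_t len = (n - start < chunk_cap) ? n - start : chunk_cap;
        demo_device_sort(data + start, len);
    }
    if (nchunks == 1)
        return 0;

    /* Phase 2: k-way merge of the sorted chunks on the host. */
    int    *out = malloc(n * sizeof(int));
    size_t *pos = malloc(nchunks * sizeof(size_t));
    if (out == NULL || pos == NULL) {
        free(out);
        free(pos);
        return -1;
    }
    for (size_t c = 0; c < nchunks; c++)
        pos[c] = c * chunk_cap;     /* read cursor of each chunk */

    for (size_t i = 0; i < n; i++) {
        size_t best = nchunks;      /* sentinel: no chunk chosen yet */
        for (size_t c = 0; c < nchunks; c++) {
            size_t end = (c + 1) * chunk_cap;
            if (end > n)
                end = n;
            if (pos[c] < end &&
                (best == nchunks || data[pos[c]] < data[pos[best]]))
                best = c;
        }
        out[i] = data[pos[best]++];
    }

    memcpy(data, out, n * sizeof(int));
    free(out);
    free(pos);
    return 0;
}
```

The linear scan over chunk heads keeps the sketch short. With many chunks, a heap over the chunk heads would reduce each merge step from O(k) to O(log k).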
Thanks and Regards,
Hitesh Ramani

[HACKERS] GSoC - Idea Discussion

2015-03-18 Thread hitesh ramani
Hello devs,
As stated earlier I was thinking to propose the integration of Postgres and 
CUDA for faster execution of order by queries thru optimizing the sorting code 
and sorting it with CUDA. I saw and tried to run PG Strom and ran into issues. 
Moreover, PG Strom is implemented in OpenCL, not CUDA.
I have hardware to run CUDA and currently I'm at a point where I have almost 
integrated Postgres and CUDA. This opens the gates for a lot of features which 
can be optimized through CUDA and parallel processing, though here I only want 
to focus on sorting, which should keep the project feasible for the time period.
As I did some research, I found CUDA is more efficient in not just the parallel 
performance but data transfer latency too. My idea is to create a branch of 
Postgres with the CUDA integrated code.
For the feasibility, I guess it's very much feasible because I've almost 
integrated CUDA execution and the code needs to be optimized as per CUDA.

Please give in your valuable suggestions and views on this.
Thanks and Regards,
Hitesh Ramani

Re: [HACKERS] GSoC 2015: Introduction and Doubt regarding a Project.

2015-03-17 Thread hitesh ramani
Hello Peter,

  1. As I did some research on this project, I found *date_trunc()
  supporting intervals* was suggested last year but not selected as a
  GSoC project. Is it being floated this year too (as mentioned on the GSoC
  2015 wiki page of Postgres)? If yes, what are the exact expected outputs?
 
 It seems to me that that would be too small for a GSoC project.

I agree, but I saw the project mentioned on the GSoC wiki page of Postgres, and 
that's why I was curious to know the expected outputs, because if it was 
considered to be floated as a GSoC project, there must be something more than 
just what the title depicts.
--Hitesh  

[HACKERS] GSoC 2015: Introduction and Doubt regarding a Project.

2015-03-17 Thread hitesh ramani
Hello,
I introduced myself on the pgsql-students list but just to introduce here too, 
my name is Hitesh Ramani, I'm a student of Hyderabad Central University, 
Hyderabad, India. Currently I'm pursuing a project in PostgreSQL as my Post 
Graduation project hence I've hacked into the Postgres code a lot of times to 
add some functionalities to it, specifically to the sorting code. I'm also 
familiar with the query processing and backend internals.
I had a few doubts, which I need to ask and start on the proposal as soon as 
possible because the applications have opened.
1. As I did some research on this project, I found date_trunc() supporting 
intervals was suggested last year but not selected as a GSoC project. Is it 
being floated this year too (as mentioned on the GSoC 2015 wiki page of 
Postgres)? If yes, what are the exact expected outputs?
2. Can I send a proposal for 2 projects to Postgres itself? Out of which one 
can be taken forward.
3. Does PG Strom take care of sorting as well on the GPUs?
Thanks and Regards,
Hitesh Ramani