Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedup OLAP queries more than 10 times without changing anything in Postgres executor

2017-02-16 Thread Thom Brown
On 16 February 2017 at 17:00, Konstantin Knizhnik
 wrote:
> More progress in vectorized Postgres extension (VOPS). It is not required
> any more to use some special functions in queries.
> You can use vector operators in query with standard SQL and still get ten
> times improvement on some queries.
> VOPS extension now uses post parse analyze hook to transform query.
> I really impressed by flexibility and extensibility of Postgres type system.
> User defined types do most of the work.
>
> It is still responsibility of programmer or database administrator to create
> proper projections
> of original table. This projections need to use tiles types for some
> attributes (vops_float4,...).
> Then you can query this table using standard SQL. And this query will be
> executed using vector operations!
>
> Example of such TPC-H queries:
>
> Q1:
> select
> l_returnflag,
> l_linestatus,
> sum(l_quantity) as sum_qty,
> sum(l_extendedprice) as sum_base_price,
> sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
> sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
> avg(l_quantity) as avg_qty,
> avg(l_extendedprice) as avg_price,
> avg(l_discount) as avg_disc,
> count(*) as count_order
> from
> vops_lineitem_projection
> where
> l_shipdate <= '1998-12-01'::date
> group by
> l_returnflag,
> l_linestatus
> order by
> l_returnflag,
> l_linestatus;
>
>
>
> Q6:
> select
> sum(l_extendedprice*l_discount) as revenue
> from
> lineitem_projection
> where
> l_shipdate between '1996-01-01'::date and '1997-01-01'::date
> and l_discount between 0.08 and 0.1
> and l_quantity < 24;

>
> On 13.02.2017 17:12, Konstantin Knizhnik wrote:
>>
>> Hello hackers,
>>
>> There were many discussions concerning  possible ways of speeding-up
>> Postgres. Different approaches were suggested:
>>
>> - JIT (now we have three different prototype implementations based on
>> LLVM)
>> - Chunked (vectorized) executor
>> - Replacing pull with push
>> - Columnar store (cstore_fdw, IMCS)
>> - Optimizing and improving current executor (reducing tuple deform
>> overhead, function call overhead,...)
>>
>> Obviously the best result can be achieved in case of combining all this
>> approaches. But actually them are more or less interchangeable: vectorized
>> execution is not eliminating interpretation overhead, but it is divided by
>> vector size and becomes less critical.
>>
>> I decided to write small prototype to estimate possible speed improvement
>> of vectorized executor. I created special types representing "tile" and
>> implement standard SQL operators for them. So neither Postgres planer,
>> nether Postgres executor, nether Postgres heap manager are changed. But I
>> was able to reach more than 10 times speed improvement on TPC-H Q1/Q6
>> queries!

Impressive work!

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedup OLAP queries more than 10 times without changing anything in Postgres executor

2017-02-16 Thread Konstantin Knizhnik
More progress in vectorized Postgres extension (VOPS). It is not 
required any more to use some special functions in queries.
You can use vector operators in query with standard SQL and still get 
ten times improvement on some queries.

VOPS extension now uses post parse analyze hook to transform query.
I really impressed by flexibility and extensibility of Postgres type 
system. User defined types do most of the work.


It is still responsibility of programmer or database administrator to 
create proper projections
of original table. This projections need to use tiles types for some 
attributes (vops_float4,...).
Then you can query this table using standard SQL. And this query will be 
executed using vector operations!


Example of such TPC-H queries:

Q1:
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
vops_lineitem_projection
where
l_shipdate <= '1998-12-01'::date
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;



Q6:
select
sum(l_extendedprice*l_discount) as revenue
from
lineitem_projection
where
l_shipdate between '1996-01-01'::date and '1997-01-01'::date
and l_discount between 0.08 and 0.1
and l_quantity < 24;



On 13.02.2017 17:12, Konstantin Knizhnik wrote:

Hello hackers,

There were many discussions concerning  possible ways of speeding-up 
Postgres. Different approaches were suggested:


- JIT (now we have three different prototype implementations based on 
LLVM)

- Chunked (vectorized) executor
- Replacing pull with push
- Columnar store (cstore_fdw, IMCS)
- Optimizing and improving current executor (reducing tuple deform 
overhead, function call overhead,...)


Obviously the best result can be achieved in case of combining all 
this approaches. But actually them are more or less interchangeable: 
vectorized execution is not eliminating interpretation overhead, but 
it is divided by vector size and becomes less critical.


I decided to write small prototype to estimate possible speed 
improvement of vectorized executor. I created special types 
representing "tile" and implement standard SQL operators for them. So 
neither Postgres planer, nether Postgres executor, nether Postgres 
heap manager are changed. But I was able to reach more than 10 times 
speed improvement on TPC-H Q1/Q6 queries!


Please find more information here: 
https://cdn.rawgit.com/postgrespro/vops/ddcbfbe6/vops.html
The sources of the project can be found here: 
https://github.com/postgrespro/vops.git




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] VOPS: vectorized executor for Postgres: how to speedup OLAP queries more than 10 times without changing anything in Postgres executor

2017-02-13 Thread Konstantin Knizhnik

Hello hackers,

There were many discussions concerning  possible ways of speeding-up 
Postgres. Different approaches were suggested:


- JIT (now we have three different prototype implementations based on LLVM)
- Chunked (vectorized) executor
- Replacing pull with push
- Columnar store (cstore_fdw, IMCS)
- Optimizing and improving current executor (reducing tuple deform 
overhead, function call overhead,...)


Obviously the best result can be achieved in case of combining all this 
approaches. But actually them are more or less interchangeable: 
vectorized execution is not eliminating interpretation overhead, but it 
is divided by vector size and becomes less critical.


I decided to write small prototype to estimate possible speed 
improvement of vectorized executor. I created special types representing 
"tile" and implement standard SQL operators for them. So neither 
Postgres planer, nether Postgres executor, nether Postgres heap manager 
are changed. But I was able to reach more than 10 times speed 
improvement on TPC-H Q1/Q6 queries!


Please find more information here: 
https://cdn.rawgit.com/postgrespro/vops/ddcbfbe6/vops.html
The sources of the project can be found here: 
https://github.com/postgrespro/vops.git


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers