Re: [HACKERS] GSoC 2017

2017-02-10 Thread Dmitry Melnik
The expected result of this work is a push-based executor that works for many
types of queries (currently we are aiming at TPC-H), but it is unlikely to be a
production-ready patch that could be committed to mainline at that stage. This
work is the topic of our student's thesis, so he has already started and
has working prototypes for very simple plans. Also, he won't be working on
this alone; he will draw on the support and experience of our team
(as well as the mentor's help).
So this is not about replacing the current pull executor right away, but rather
about developing a working prototype to find out the benefits of switching
from the pull to the push model (for both the interpreter and the LLVM JIT).
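
For readers new to the distinction discussed in this thread, here is a minimal
sketch of the two iteration models on a toy "scan -> filter -> sum" plan. It is
written in Python purely for illustration; every name in it is hypothetical and
none of it is taken from PostgreSQL or from our prototype.

```python
# Toy illustration only: these functions are assumptions made for this
# sketch and do not correspond to PostgreSQL executor nodes.

def pull_scan(rows):
    """Pull-style leaf: yields a tuple each time the parent asks for one."""
    for row in rows:
        yield row

def pull_filter(child, predicate):
    """Pulls tuples from its child one at a time and passes matches up."""
    for row in child:
        if predicate(row):
            yield row

def pull_sum(child):
    """Root of the pull plan: drives execution by requesting tuples."""
    total = 0
    for row in child:
        total += row
    return total

class PushSum:
    """Blocking operator: accumulates pushed tuples until the input ends."""
    def __init__(self):
        self.total = 0

    def consume(self, row):
        self.total += row

def push_filter(predicate, parent):
    """Builds a consumer that forwards matching tuples to its parent."""
    def consume(row):
        if predicate(row):
            parent(row)
    return consume

def push_scan(rows, parent):
    """Push-style leaf: the scan drives execution and pushes tuples upward."""
    for row in rows:
        parent(row)

def is_even(row):
    return row % 2 == 0

def run_pull(rows):
    return pull_sum(pull_filter(pull_scan(rows), is_even))

def run_push(rows):
    agg = PushSum()
    push_scan(rows, push_filter(is_even, agg.consume))
    return agg.total
```

Both run_pull() and run_push() compute the same result; the difference is only
in who drives the loop. In the pull plan the root requests tuples, while in the
push plan the scan at the bottom hands each tuple to its consumer until the
blocking aggregate is reached.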

On Wed, Feb 8, 2017 at 7:06 PM, Robert Haas wrote:

> On Mon, Feb 6, 2017 at 6:51 AM, Ruben Buchatskiy wrote:
> > 2017-01-10 12:53 GMT+03:00 Alexander Korotkov:
> >> 1. What project ideas we have?
> >
> > We would like to propose a project on rewriting PostgreSQL executor from
> > traditional Volcano-style [1] to so-called push-based architecture as
> > implemented in Hyper [2][3] and VitesseDB [4]. The idea is to reverse the
> > direction of data flow control: instead of pulling up tuples one-by-one
> > with ExecProcNode(), we suggest pushing them from below to top until
> > blocking operator (e.g. Aggregation) is encountered. There’s a good
> > example and more detailed explanation for this approach in [2].
>
> I think this is very possibly a good idea but extremely unlikely to be
> something that a college student or graduate student can complete in
> one summer.  More like an existing expert developer and a year of
> doing not much else.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

-- 
Best regards,
  Dmitry


[HACKERS] JIT compiler for expressions

2016-10-28 Thread Dmitry Melnik
Hello hackers,

We'd like to present our work on adding LLVM JIT compilation of expressions
in SQL queries to PostgreSQL. The source code (based on 9.6.1), along with a
brief manual, is available on our GitHub: https://github.com/ispras/postgres .
The current speedup for TPC-H Q1 is 20% (on a 40GB workload). Please feel free
to test it and tell us what you think.
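
To illustrate why compiling an expression can beat interpreting it, here is a
minimal sketch in Python. It is only an analogy: it builds closures instead of
generating LLVM IR, the tuple-based tree format and all function names are
assumptions made for this example, and it says nothing about how our patch is
actually structured.

```python
import operator

# Hypothetical expression tree format for this sketch:
#   ("const", value), ("col", name), or (op, left, right)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def interpret(expr, row):
    """Walks the tree for every row, re-dispatching on the node kind each time."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "col":
        return row[expr[1]]
    return OPS[kind](interpret(expr[1], row), interpret(expr[2], row))

def compile_expr(expr):
    """Walks the tree once and returns a closure specialized to this expression."""
    kind = expr[0]
    if kind == "const":
        value = expr[1]
        return lambda row: value
    if kind == "col":
        name = expr[1]
        return lambda row: row[name]
    op = OPS[kind]
    left = compile_expr(expr[1])
    right = compile_expr(expr[2])
    return lambda row: op(left(row), right(row))

# l_extendedprice * (1 - l_discount), the kind of expression found in TPC-H Q1
EXPR = ("*", ("col", "l_extendedprice"),
             ("-", ("const", 1), ("col", "l_discount")))
```

Usage: call compile_expr(EXPR) once per query and apply the returned function
to each row, instead of calling interpret(EXPR, row) for every row. The
decisions about node kinds are made once at compile time, which is the part of
the per-row work that a real JIT removes far more thoroughly.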

Currently, our JIT compiles the expressions in every query, so for
short-running queries JIT compilation may take longer than the query execution
itself. We plan to address this by using the planner's estimate to decide
whether JIT compilation is worthwhile; we may also try parallel JIT
compilation. For now, we recommend testing on a large workload so that the
compilation cost pays off (we tested Q1 on a 40GB database).
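
The planner-based decision mentioned above is not implemented yet. As a rough
sketch of the idea only, the check could compare the expected total saving with
the one-time compilation cost. The function name, its parameters, and the
numbers below are all hypothetical and are not measurements from our tests.

```python
def should_jit(estimated_rows, per_row_saving_us, compile_cost_us):
    """Return True if the expected saving outweighs the one-time compile cost.

    All three inputs are assumptions for this sketch:
      estimated_rows     - planner's row estimate for the node
      per_row_saving_us  - assumed time saved per row by compiled code
      compile_cost_us    - assumed time needed to JIT compile the expressions
    """
    return estimated_rows * per_row_saving_us > compile_cost_us
```

For example, with an assumed saving of 0.05 us per row and an assumed compile
cost of 50,000 us, a query estimated at 100 rows would be interpreted, while
one estimated at 10 million rows would be compiled.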

The changes to the PostgreSQL code itself are rather small, while the biggest
part of the new code in our repository is autogenerated (LLVM IR
generators for PostgreSQL backend functions). The only real reason for
shipping prebuild_llvm_backend.cpp is that generating it requires a patched
LLVM version; otherwise it is generated directly from the PostgreSQL source
code (please see more on automatic backend generation on our GitHub site).
With the pre-generated cpp file, building our GitHub version of PostgreSQL
with JIT requires only a clean, non-patched LLVM 3.7.

JIT compilation was tested on Linux, and currently we have 5 actual tests
failing (which results in 24 errors in a regtest). It requires LLVM 3.7
(3.7.1) as a build dependency (you can specify the path to the proper
llvm-config with the --with-llvm-config= configure option; e.g. it could be
named llvm-config-3.7 on your system). Mac support is highly experimental and
hasn't been tested much, but if you'd like to give it a try, you can do so
with LLVM 3.7 from MacPorts or Homebrew.

This work is part of our larger effort to implement a full JIT compiler
in PostgreSQL, where along with JIT-compiling expressions we've changed the
iteration model from Volcano-style to a push model and reimplemented code
generation with LLVM for most of the Scan/Aggregation/Join methods. That
approach gives a much better speedup (4-5x on Q1), but it requires many
code changes, so we're developing it as a PostgreSQL extension. It's not
ready for release yet, but we're now working on performance and compatibility,
as well as on making it easier to maintain by making it possible to build
both the JIT compiler and the interpreter from the same source code. More
information about our full JIT compiler and related work is available in our
presentations at LLVM Cauldron
(http://llvm.org/devmtg/2016-09/slides/Melnik-PostgreSQLLLVM.pdf ) and PGCon
(https://www.pgcon.org/2016/schedule/attachments/411_ISPRAS%20LLVM+Postgres%20Presentation.pdf ).
We're also going to give a lightning talk at the upcoming PGConf.EU in Tallinn
and discuss further development with the PostgreSQL community. We'd
appreciate any feedback!

-- 
Best regards,
  Dmitry Melnik
  Institute for System Programming of the Russian Academy of Sciences
  ISP RAS (www.ispras.ru/en/)