Re: [HACKERS] asynchronous and vectorized execution

2016-05-10 Thread Bert
Hmm, the morsels paper looks really interesting at first sight.
Let's see if we can get a PoC working in PostgreSQL. :-)

On Tue, May 10, 2016 at 9:42 PM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

> On 05/10/2016 08:26 PM, Robert Haas wrote:
>
>> On Tue, May 10, 2016 at 3:00 AM, konstantin knizhnik
>> <k.knizh...@postgrespro.ru> wrote:
>>
>>> What's wrong with a worker being blocked? You can just have more
>>> workers (more than CPU cores) to let the others continue to do useful
>>> work.
>>>
>> Not really.  The workers are all running the same plan, so they'll all
>> make the same decision about which node needs to be executed next.  If
>> that node can't accommodate multiple processes trying to execute it at
>> the same time, it will have to block all of them but the first one.
>> Adding more processes just increases the number of processes sitting
>> around doing nothing.
>>
>
> Doesn't this actually mean that we need a normal job scheduler which
> is given a queue of jobs and, using some pool of threads, will be able to
> organize efficient execution of queries? The optimizer can build a pipeline
> (graph) of tasks which corresponds to the execution plan nodes, i.e. SeqScan,
> Sort, ... Each task is split into several jobs which can be concurrently
> scheduled by a task dispatcher.  So you will not have a blocked worker waiting
> for something, and all system resources will be utilized. Such an approach with
> a dispatcher also makes it possible to implement quotas, priorities, ... The
> dispatcher can also take care of NUMA and cache optimizations, which are
> especially critical on modern architectures. One more reference:
> http://db.in.tum.de/~leis/papers/morsels.pdf
>
> Sorry, maybe I am wrong, but I still think that async ops are "multitasking
> for the poor" :)
> Yes, maintaining threads and especially separate processes adds
> significant overhead. It leads to extra resource consumption, and context
> switches are quite expensive. And I know from my own experience that
> replacing several concurrent processes performing some IO (for example with
> sockets) with just one process using multiplexing can increase
> performance. But still, async ops are just a way to make the programmer
> responsible for managing a state machine instead of relying on the OS to make
> context switches. A manual transmission is still more efficient than an
> automatic one. But most drivers still prefer the latter ;)
>
> Seriously, I carefully read your response to Kochei, but I am still not
> convinced that async ops are what we need.  Or maybe we just understand
> different things by this notion.
>
>
>
>
>>> But there is some research, for example:
>>>
>>> http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
>>>
>>> showing that the same or even better effect can be achieved by generating
>>> native code for the query execution plan (which is not so difficult now,
>>> thanks to LLVM).
>>> It eliminates interpretation overhead and increases cache locality.
>>> I got similar results in my own experiments accelerating SparkSQL.
>>> Instead of native code generation I used conversion of query plans to C
>>> code and experimented with different data representations. A "horizontal
>>> model" with loading of columns on demand showed better performance than a
>>> columnar store.
>>>
>> Yes, I think this approach should also be considered.
>>
>>> At this moment (February) they have implemented translation of only a few
>>> PostgreSQL operators used by ExecQual and do not support aggregates.
>>> They get about a 2x increase in speed on synthetic queries and a 25%
>>> increase on TPC-H Q1 (for Q1 the most critical part is generation of
>>> native code for aggregates, because ExecQual itself takes only 6% of the
>>> time for this query).
>>> Actually these 25% for Q1 were achieved not by using dynamic code
>>> generation, but by switching from a PULL to a PUSH model in the executor.
>>> It seems to be yet another interesting PostgreSQL executor
>>> transformation.
>>> As far as I know, they are going to publish the results of their work as
>>> open source...
>>>
>> Interesting.  You may notice that in "asynchronous mode" my prototype
>> works using a push model of sorts.  Maybe that should be taken
>> further.
>>
>>
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
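The morsel-driven scheduling described in the quoted mail can be pictured with a small sketch (a toy Python illustration, not PostgreSQL code; the morsel size, worker count, and per-morsel work are all invented for the example):

```python
import queue
import threading

# Toy sketch of morsel-driven scheduling (Leis et al.): the input is
# split into fixed-size "morsels" placed on a shared queue, and a fixed
# pool of worker threads pulls morsels until the queue is drained, so
# no worker ever sits blocked waiting on a specific plan node.
MORSEL_SIZE = 4
table = list(range(20))                 # stand-in for a heap of tuples

morsels = queue.Queue()
for i in range(0, len(table), MORSEL_SIZE):
    morsels.put(table[i:i + MORSEL_SIZE])

results = []
results_lock = threading.Lock()

def worker():
    while True:
        try:
            morsel = morsels.get_nowait()
        except queue.Empty:
            return                      # no work left: exit, don't block
        partial = sum(x * 2 for x in morsel)   # per-morsel "plan" work
        with results_lock:
            results.append(partial)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))                     # 2 * sum(range(20)) == 380
```

A dispatcher built this way can also apply quotas or priorities simply by reordering what it hands out next, which is the point made in the quoted mail.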



-- 
Bert Desmet
0477/305361


Re: [HACKERS] Problems with huge_pages and IBM Power8

2016-04-14 Thread Bert
hmpf; are you sure?
I just checked on our own RHEL7 system, and RemoveIPC is set to 'no' by
default.

On Tue, Apr 12, 2016 at 10:26 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Andres Freund <and...@anarazel.de> writes:
> > On 2016-04-12 21:58:14 +0200, reiner peterke wrote:
> >> Looking for some insight into this issue.  The error from the postgres
> >> log on Ubuntu is below.  It appears to be related to semaphores.
>
> > I've a bit of a hard time believing that this is related to huge pages.
>
> I'm betting that's this:
>
>
> http://www.postgresql.org/message-id/cak7teys9-o4bterbs3xuk2bffnnd55u2sm9j5r2fi7v6bhj...@mail.gmail.com
>
> regards, tom lane
>
>
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] [COMMITTERS] pgsql: Support parallel aggregation.

2016-03-21 Thread Bert
#woopwoop! :-D great work, all!

On Mon, Mar 21, 2016 at 3:43 PM, Simon Riggs <si...@2ndquadrant.com> wrote:

> On 21 March 2016 at 14:35, David Fetter <da...@fetter.org> wrote:
>
>> On Mon, Mar 21, 2016 at 01:33:28PM +, Robert Haas wrote:
>> > Support parallel aggregation.
>>
>> ...and there was much rejoicing!
>>
>
> +1
>
> Well done all.
>
> --
> Simon Riggshttp://www.2ndQuadrant.com/
> <http://www.2ndquadrant.com/>
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] On columnar storage (2)

2016-03-03 Thread Bert
Hey Alvaro,

I was referring to https://wiki.postgresql.org/wiki/ColumnOrientedSTorage .
and yes, I'll be at the next fosdem / pgconf.eu for sure. :-)

Bert

On Thu, Mar 3, 2016 at 3:40 PM, Alvaro Herrera <alvhe...@2ndquadrant.com>
wrote:

> Bert wrote:
>
> > Alvaro,
> > You wrote that a wiki page would be opened regarding this. But I still
> > cannot find such a page (except for an old page which hasn't changed in
> > the last year). Is there already something we can look at?
>
> Yeah, I haven't done that yet.  I will post here as soon as I get that
> done.  Happy to share another beer to discuss, next time I'm over there.
> I'm also going to have code to share for you to test by then!
>
> What's the other page you mention?
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] On columnar storage (2)

2016-03-03 Thread Bert
Hello Haribabu,

Thank you for the performance test. But please note that the patch has been
'thrown away' and will be totally rewritten. However, I have no idea of the
status of the second / third attempt.
Still, it is interesting that for some queries this patch is already
on par with VCI. Which db is that exactly?

Alvaro,
You wrote that a wiki page would be opened regarding this. But I still
cannot find such a page (except for an old page which hasn't changed in the
last year). Is there already something we can look at?

Bert

On Thu, Mar 3, 2016 at 6:07 AM, Haribabu Kommi <kommi.harib...@gmail.com>
wrote:

> On Mon, Feb 1, 2016 at 12:11 AM, Alvaro Herrera
> <alvhe...@2ndquadrant.com> wrote:
> > So we discussed some of this stuff during the developer meeting in
> > Brussels and the main conclusion is that we're going to split this up in
> > multiple independently useful pieces, and write up the general roadmap
> > in the wiki so that we can discuss in detail on-list.
> >
> > I'm marking this as Returned with Feedback now.
> >
> > Thanks everybody,
>
> Attached is the DBT-3 performance report measured on the prototype patch
> written for columnar storage, as I mentioned in my earlier mail, with the
> WOS and ROS design.
>
> Currently, to measure the benefits of this design, we made the following
> changes:
> 1. Created the columnar storage index similarly to other index methods
> 2. Used a custom plan to generate a plan that can use the columnar storage
> 3. Optimized parallelism to use the columnar storage
>
> The code is not fully ready yet; I posted the performance results to get a
> view from the community on whether this approach is really beneficial.
>
> I will provide the full details of the design and WIP patches later.
>
> Regards,
> Hari Babu
> Fujitsu Australia
>
>
>
>
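The WOS/ROS split mentioned in the quoted mail can be sketched in miniature (plain Python, purely illustrative; the column names and migration policy are made up and are not the actual patch):

```python
# Toy sketch of a write-optimized store (WOS) / read-optimized store
# (ROS) split: inserts append whole rows to a row-oriented buffer, a
# background step migrates them into per-column arrays, and column
# aggregates then touch only the one array they need.
wos = []                           # write-optimized: whole rows
ros = {"id": [], "price": []}      # read-optimized: one array per column

def insert(row):
    wos.append(row)                # cheap row-at-a-time ingest

def migrate():                     # move WOS rows into the column arrays
    for row in wos:
        for col, val in row.items():
            ros[col].append(val)
    wos.clear()

for i in range(5):
    insert({"id": i, "price": i * 10})
migrate()

# A single-column aggregate reads only ros["price"], never ros["id"]:
print(sum(ros["price"]))           # 0 + 10 + 20 + 30 + 40 == 100
```

This is why a columnar layout helps scan-heavy DBT-3-style aggregates: the scan reads only the columns the query references.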


-- 
Bert Desmet
0477/305361


Re: [HACKERS] Rework the way multixact truncations work

2015-12-10 Thread Bert
+1

On Thu, Dec 10, 2015 at 9:58 AM, Peter Geoghegan <p...@heroku.com> wrote:

> On Thu, Dec 10, 2015 at 12:34 AM, Andres Freund <and...@anarazel.de>
> wrote:
> >> > Ripping it out and replacing it monolithically will not
> >> > change that; it will only make the detailed history harder to
> >> > reconstruct, and I *will* want to reconstruct it.
> >>
> >> What's something that might happen six months from now and lead you to
> inspect
> >> master or 9.5 multixact.c between 4f627f8 and its revert?
> >
> > "Hey, what has happened to multixact.c lately? I'm investigating a bug,
> > and I wonder if it already has been fixed?", "Uh, what was the problem
> > with that earlier large commit?", "Hey, what has changed between beta2
> > and the final release?"...
>
> Quite.
>
> I can't believe we're still having this silly discussion. Can we please
> move on?
>
> --
> Peter Geoghegan
>
>
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] Parallel Seq Scan

2015-11-17 Thread Bert
edit: maybe this is more useful? :)

(gdb) bt full
#0  0x00490b56 in heap_parallelscan_nextpage ()
No symbol table info available.
#1  0x00493fdf in heap_getnext ()
No symbol table info available.
#2  0x005c0733 in SeqNext ()
No symbol table info available.
#3  0x005ac5d9 in ExecScan ()
No symbol table info available.
#4  0x005a5c08 in ExecProcNode ()
No symbol table info available.
#5  0x005b5298 in ExecGather ()
No symbol table info available.
#6  0x005a5aa8 in ExecProcNode ()
No symbol table info available.
#7  0x005b68b9 in MultiExecHash ()
No symbol table info available.
#8  0x005b7256 in ExecHashJoin ()
No symbol table info available.
#9  0x005a5b18 in ExecProcNode ()
No symbol table info available.
#10 0x005b0ac9 in fetch_input_tuple ()
No symbol table info available.
#11 0x005b1eaf in ExecAgg ()
No symbol table info available.
#12 0x005a5ad8 in ExecProcNode ()
No symbol table info available.
#13 0x005c11e1 in ExecSort ()
No symbol table info available.
#14 0x005a5af8 in ExecProcNode ()
No symbol table info available.
#15 0x005ba164 in ExecLimit ()
No symbol table info available.
#16 0x005a5a38 in ExecProcNode ()
No symbol table info available.
#17 0x005a2343 in standard_ExecutorRun ()
No symbol table info available.
#18 0x0069cb08 in PortalRunSelect ()
No symbol table info available.
#19 0x0069de5f in PortalRun ()
No symbol table info available.
#20 0x0069bc16 in PostgresMain ()
No symbol table info available.
#21 0x00466f55 in ServerLoop ()
No symbol table info available.
#22 0x00648436 in PostmasterMain ()
No symbol table info available.
#23 0x004679f0 in main ()
No symbol table info available.


On Tue, Nov 17, 2015 at 12:38 PM, Bert <bier...@gmail.com> wrote:

> Hi,
>
> this is the backtrace:
> gdb /var/lib/pgsql/9.6/data/ /var/lib/pgsql/9.6/data/core.7877
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <
> http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> /var/lib/pgsql/9.6/data/: Success.
> [New LWP 7877]
> Missing separate debuginfo for the main executable file
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/02/20b77a9ab8f607b0610082794165fccedf210d
> Core was generated by `postgres: postgres tpcds [loca'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00490b56 in ?? ()
> (gdb) bt full
> #0  0x00490b56 in ?? ()
> No symbol table info available.
> #1  0x3668 in ?? ()
> No symbol table info available.
> #2  0x7f956249a008 in ?? ()
> No symbol table info available.
> #3  0x0228c498 in ?? ()
> No symbol table info available.
> #4  0x0001 in ?? ()
> No symbol table info available.
> #5  0x0228ad00 in ?? ()
> No symbol table info available.
> #6  0x00493fdf in ?? ()
> No symbol table info available.
> #7  0x021a8e50 in ?? ()
> No symbol table info available.
> #8  0x in ?? ()
> No symbol table info available.
> (gdb) q
>
> Is there something else I can do?
>
> On Mon, Nov 16, 2015 at 8:59 PM, Robert Haas <robertmh...@gmail.com>
> wrote:
>
>> On Mon, Nov 16, 2015 at 2:51 PM, Bert <bier...@gmail.com> wrote:
>> > I've just pulled and compiled the new code.
>> > I'm running a TPC-DS like test on different PostgreSQL installations,
>> but
>> > running (max) 12queries in parallel on a server with 12cores.
>> > I've configured max_parallel_degree to 2, and I get messages that
>> backend
>> > processes crash.
>> > I am running the same test now with 6queries in parallel, and parallel
>> > degree to 2, and they seem to work. for now. :)
>> >
>> > This is the output I get in /var/log/messages
>> > Nov 16 20:40:05 woludwha02 kernel: postgres[22918]: segfault at
>> 7fa3437bf104
>> > ip 00490b56 sp 7ffdf2f083a0 error 6 in
>> postgres[40+5b5000]
>> >
>> > Is there something else I should get?
>>
>> Can you enable core dumps e.g. by passing the -c option to pg_ctl
>> start?  If you can get a core file, you can then get a backtrace
>> using:
>>
>> gdb /path/to/postgres /path/to/core
>> bt full
>> q
>>
>> That should be enough to find and fix whatever the bug is.  Thanks for
>> testing.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>
>
> --
> Bert Desmet
> 0477/305361
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] Parallel Seq Scan

2015-11-17 Thread Bert
Hi,

this is the backtrace:
gdb /var/lib/pgsql/9.6/data/ /var/lib/pgsql/9.6/data/core.7877
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html
>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
/var/lib/pgsql/9.6/data/: Success.
[New LWP 7877]
Missing separate debuginfo for the main executable file
Try: yum --enablerepo='*debug*' install
/usr/lib/debug/.build-id/02/20b77a9ab8f607b0610082794165fccedf210d
Core was generated by `postgres: postgres tpcds [loca'.
Program terminated with signal 11, Segmentation fault.
#0  0x00490b56 in ?? ()
(gdb) bt full
#0  0x00490b56 in ?? ()
No symbol table info available.
#1  0x3668 in ?? ()
No symbol table info available.
#2  0x7f956249a008 in ?? ()
No symbol table info available.
#3  0x0228c498 in ?? ()
No symbol table info available.
#4  0x0001 in ?? ()
No symbol table info available.
#5  0x0228ad00 in ?? ()
No symbol table info available.
#6  0x00493fdf in ?? ()
No symbol table info available.
#7  0x021a8e50 in ?? ()
No symbol table info available.
#8  0x in ?? ()
No symbol table info available.
(gdb) q

Is there something else I can do?

On Mon, Nov 16, 2015 at 8:59 PM, Robert Haas <robertmh...@gmail.com> wrote:

> On Mon, Nov 16, 2015 at 2:51 PM, Bert <bier...@gmail.com> wrote:
> > I've just pulled and compiled the new code.
> > I'm running a TPC-DS like test on different PostgreSQL installations, but
> > running (max) 12queries in parallel on a server with 12cores.
> > I've configured max_parallel_degree to 2, and I get messages that backend
> > processes crash.
> > I am running the same test now with 6queries in parallel, and parallel
> > degree to 2, and they seem to work. for now. :)
> >
> > This is the output I get in /var/log/messages
> > Nov 16 20:40:05 woludwha02 kernel: postgres[22918]: segfault at
> 7fa3437bf104
> > ip 00490b56 sp 7ffdf2f083a0 error 6 in
> postgres[40+5b5000]
> >
> > Is there something else I should get?
>
> Can you enable core dumps e.g. by passing the -c option to pg_ctl
> start?  If you can get a core file, you can then get a backtrace
> using:
>
> gdb /path/to/postgres /path/to/core
> bt full
> q
>
> That should be enough to find and fix whatever the bug is.  Thanks for
> testing.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Bert Desmet
0477/305361


Re: [HACKERS] Parallel Seq Scan

2015-11-17 Thread Bert
Hey Robert,

Thank you for the help. As you might (not) know, I'm quite new to the
community, but I'm learning, with help from people like you.
Anyhow, find attached a third attempt at a valid backtrace file.

This run is compiled from commit 5f10b7a604c87fc61a2c20a56552301f74c9bd5f
with your latest patch attached in this mail thread.


cheers,
Bert

full_backtrace.log
<https://drive.google.com/file/d/0B_qnY25RovTmM0NtdkNSejByVGs/view?usp=drive_web>

On Tue, Nov 17, 2015 at 6:55 PM, Robert Haas <robertmh...@gmail.com> wrote:

> On Tue, Nov 17, 2015 at 6:52 AM, Bert <bier...@gmail.com> wrote:
> > edit: maybe this is more useful? :)
>
> Definitely.  But if you've built with --enable-debug and not stripped
> the resulting executable, we ought to get line numbers as well, plus
> the arguments to each function on the stack.  That would help a lot
> more.  The only things that get dereferenced in that function are
> "scan" and "parallel_scan", so it's a good bet that one of those
> pointers is pointing off into never-never land.  I can't immediately
> guess how that's happening, though.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
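For readers following along: heap_parallelscan_nextpage(), the frame at the top of the backtrace, hands each worker the next unclaimed block of the table. The idea can be modelled in miniature (a toy Python sketch; the names only loosely mirror the PostgreSQL function and none of this is the actual implementation):

```python
import threading

# Miniature model of parallel sequential scan: workers share a "next
# page" counter and each atomically claims one page at a time, which is
# roughly what heap_parallelscan_nextpage() does.  Page contents and
# worker count are invented for the example.
NPAGES = 10
PAGES = {p: list(range(p * 3, p * 3 + 3)) for p in range(NPAGES)}

state = {"next_page": 0}
claim_lock = threading.Lock()
total_lock = threading.Lock()
total = 0

def nextpage():
    """Atomically claim the next unscanned page, or None when done."""
    with claim_lock:
        if state["next_page"] >= NPAGES:
            return None
        page = state["next_page"]
        state["next_page"] += 1
        return page

def worker():
    global total
    while (page := nextpage()) is not None:
        s = sum(PAGES[page])            # scan the claimed page
        with total_lock:
            total += s

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)                            # sum(range(30)) == 435
```

If the shared state behind that counter is a dangling pointer, every dereference in the claim step can fault, which matches the segfault landing inside heap_parallelscan_nextpage().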



-- 
Bert Desmet
0477/305361


Re: [HACKERS] Parallel Seq Scan

2015-11-16 Thread Bert
Hey,

I've just pulled and compiled the new code.
I'm running a TPC-DS-like test on different PostgreSQL installations,
running (max) 12 queries in parallel on a server with 12 cores.
I've configured max_parallel_degree to 2, and I get messages that backend
processes crash.
I am now running the same test with 6 queries in parallel, and parallel
degree still 2, and they seem to work. For now. :)

This is the output I get in /var/log/messages
Nov 16 20:40:05 woludwha02 kernel: postgres[22918]: segfault at
7fa3437bf104 ip 00490b56 sp 7ffdf2f083a0 error 6 in
postgres[40+5b5000]

Is there something else I should get?

cheers,
Bert

On Mon, Nov 16, 2015 at 6:06 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> On Sat, Nov 14, 2015 at 10:12 PM, Amit Kapila <amit.kapil...@gmail.com>
> wrote:
> > On Fri, Nov 13, 2015 at 11:05 PM, Jeff Janes <jeff.ja...@gmail.com>
> wrote:
> >>
> >> On Wed, Nov 11, 2015 at 6:53 AM, Robert Haas <robertmh...@gmail.com>
> >> wrote:
> >> >
> >> > I've committed most of this, except for some planner bits that I
> >> > didn't like, and after a bunch of cleanup.  Instead, I committed the
> >> > consider-parallel-v2.patch with some additional planner bits to make
> >> > up for the ones I removed from your patch.  So, now we have parallel
> >> > sequential scan!
> >>
> >> Pretty cool.  All I had to do is mark my slow plperl functions as
> >> being parallel safe, and bang, parallel execution of them for seq
> >> scans.
> >>
> >> But, there does seem to be a memory leak.
> >>
> >
> > Thanks for the report.
> >
> > I think the main reason for the leak in workers is that one of the
> > buffers used while sending tuples (in function BuildRemapInfo)
> > from worker to master is not getting freed, and it is allocated for each
> > tuple the worker sends back to the master.  I couldn't find any use of
> > such a buffer, so I think we can avoid the allocation, or at least we
> > need to free it.  The attached patch remove_unused_buf_allocation_v1.patch
> > should fix the issue.
>
> Thanks, that patch (as committed) has fixed the problem for me.  I
> don't understand the second one.
>
> Cheers,
>
> Jeff
>
>
>



-- 
Bert Desmet
0477/305361


[HACKERS] introduction

2013-04-09 Thread Bert
Hi all,

I've just subscribed to this mailing list, so I thought it would be nice to
introduce myself.
I'm Bert Desmet, and I currently live in Belgium.
I work for Deloitte, where we are building a cloud BI platform. As the
database for the data warehouses we have been using PostgreSQL for a few
months; before that we were using DB2. Next to being THE DBA (yes, it's only
me for the moment), I also play with the Linux servers, and I code a bit,
mostly in Python or bash.

I used to do community work for the Fedora project, but since I changed
from a contributor to a normal user I decided to pick up a new challenge:
coding.
Since I use Postgres every day, this is also the product I want to code on.
But I realise every beginning is difficult; I can't just dive into a big
project like Postgres.

So I'll just monitor the mailing list a bit, pick some patches and test
them. I think this is the best way to get to know the code, and it is
probably helpful for everyone.

Anyway, I hope I can grow to become a valuable contributor. But that won't
happen overnight! :-)

cheers,
Bert