Re: [HACKERS] log_destination=file

2018-03-02 Thread Magnus Hagander
On Fri, Mar 2, 2018 at 7:29 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
> >  wrote:
> >> Sorry for the naive question, but which of these bottlenecks are we
> >> actually hitting? I don't recall dealing with an actual production
> >> system where the log collector would be an issue, so I don't have a very
> >> good idea where the actual bottleneck is in this case.
>
> > Unfortunately, neither do I -- but I do believe that the problem is real.
>
> Not to state the obvious ... but if we do not know where the bottleneck
> is, isn't it a bit premature to be proposing changes to "fix" it?
>
> But that argument just leads to the same conclusion that seems to be
> prevalent at the moment, ie boot this to v12.
>

Just to be clear, it has already been booted to v12.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] log_destination=file

2018-03-02 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
>  wrote:
>> Sorry for the naive question, but which of these bottlenecks are we
>> actually hitting? I don't recall dealing with an actual production
>> system where the log collector would be an issue, so I don't have a very
>> good idea where the actual bottleneck is in this case.

> Unfortunately, neither do I -- but I do believe that the problem is real.

Not to state the obvious ... but if we do not know where the bottleneck
is, isn't it a bit premature to be proposing changes to "fix" it?

But that argument just leads to the same conclusion that seems to be
prevalent at the moment, ie boot this to v12.

regards, tom lane



Re: [HACKERS] log_destination=file

2018-03-02 Thread Robert Haas
On Mon, Jan 22, 2018 at 4:25 PM, Tomas Vondra
 wrote:
> Sorry for the naive question, but which of these bottlenecks are we
> actually hitting? I don't recall dealing with an actual production
> system where the log collector would be an issue, so I don't have a very
> good idea where the actual bottleneck is in this case.

Unfortunately, neither do I -- but I do believe that the problem is real.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] log_destination=file

2018-03-02 Thread Magnus Hagander
On Fri, Mar 2, 2018 at 8:27 AM, Andres Freund  wrote:

> Hi,
>
> On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
> > Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).
>
> This has an open CF entry for CF 2018-03. Given that there's still a lot
> of uncertainty what we want, I don't think that's a reasonable target.
> I would suggest moving it to 'returned with feedback'. Arguments
> against?
>

I agree, let's boot it. It needs more discussion about what to actually
do, and after that, how. Let's focus on other things for 11, and have this
discussion again when the time comes. I'll go mark it.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] log_destination=file

2018-03-01 Thread Andres Freund
Hi,

On 2018-01-20 14:51:12 +0200, Magnus Hagander wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).

This has an open CF entry for CF 2018-03. Given that there's still a lot
of uncertainty what we want, I don't think that's a reasonable target.
I would suggest moving it to 'returned with feedback'. Arguments
against?

- Andres



Re: [HACKERS] log_destination=file

2018-01-22 Thread Tomas Vondra


On 01/22/2018 08:52 PM, Robert Haas wrote:
> On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander  wrote:
>> Finally found myself back at this one, because I still think this is a
>> problem we definitely need to address (whether with this file or not).
>>
>> The funneling into a single process is definitely an issue.
>>
>> But we don't really solve that problem today with logging to stderr, do we?
>> Because somebody has to pick up the log as it came from stderr. Yes, you get
>> more overhead when sending the log to devnull, but that isn't really a
>> realistic scenario. The question is what to do when you actually want to
>> collect that much logging that quickly.
> 
> I think it depends on where the bottleneck is. If you're limited by 
> the speed at which a single process can write, shutting the logging 
> collector off and letting everyone write fixes it, because now you
> can bring the CPU cycles of many processes to bear rather than just
> one. If you're limited by the rate at which you can lay the file down
> on disk, then turning off the logging collector doesn't help, but I
> don't think that's the main problem. Now, of course, if you're
> writing the file to disk faster than a single process could do all
> those writes, then you're probably also going to need multiple
> processes to keep up with reading it, parsing it, etc. But that's not
> a problem for PostgreSQL core unless we decide to start shipping an
> in-core log analyzer.
> 

Sorry for the naive question, but which of these bottlenecks are we
actually hitting? I don't recall dealing with an actual production
system where the log collector would be an issue, so I don't have a very
good idea where the actual bottleneck is in this case.

I find it hard to believe the collector would be limited by I/O when
writing the data to disk (buffered sequential writes and all that).

So I guess it's more about the process of collecting the data from all
the processes through the pipe, with the PIPE_MAX_PAYLOAD dance, etc.

I plan to do some experiments with log_statements=all, but my guess is
that the main issue is in transferring individual messages (which may be
further split into multiple chunks). What if we instead sent the log
messages in larger batches? Of course, that would require an alternative
transport mechanism (say, through shared memory), and the collector would
have to merge the batches to maintain ordering. And then there's the issue
that the collector is a pretty critical component.

FWIW I'm pretty sure we're not the only project facing the need to
collect a high volume of logs from many processes, so how do other
projects handle this issue?

>>
>> If each backend could actually log to *its own file*, then things would get
>> sped up. But we can't do that today. Unless you use the hooks and build it
>> yourself.
> 
> That seems like a useful thing to support in core.
> 

Yeah, I agree with that.

>> Per the thread referenced, using the hooks to handle the
>> very-high-rate-logging case seems to be the conclusion. But is that still
>> the conclusion, or do we feel we need to also have a native solution?
>>
>> And if the conclusion is that hooks is the way to go for that, then is the
>> slowdown of this patch actually a relevant problem to it?
> 
> I think that if we commit what you've proposed, we're making it harder
> for people who have a high volume of logging but are not currently
> using hooks.  I think we should try really hard to avoid the situation
> where our suggested workaround for a server change is "go write some C
> code and maybe you can get back to the performance you had with
> release N-1".  That's just not friendly.
> 
> I wonder if it would be feasible to set things up so that the logging
> collector was always started, but whether or not backends used it or
> wrote directly to their original stderr was configurable (e.g. dup
> stderr elsewhere, then dup whichever output source is currently
> selected onto stderr, then dup the other one if the config is changed
> later).
> 

I think the hook system is a really powerful tool, but it seems a bit
awkward to force people to use it to improve performance like this ...
That seems like something core should do out of the box.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] log_destination=file

2018-01-22 Thread Robert Haas
On Sat, Jan 20, 2018 at 7:51 AM, Magnus Hagander  wrote:
> Finally found myself back at this one, because I still think this is a
> problem we definitely need to address (whether with this file or not).
>
> The funneling into a single process is definitely an issue.
>
> But we don't really solve that problem today with logging to stderr, do we?
> Because somebody has to pick up the log as it came from stderr. Yes, you get
> more overhead when sending the log to devnull, but that isn't really a
> realistic scenario. The question is what to do when you actually want to
> collect that much logging that quickly.

I think it depends on where the bottleneck is.  If you're limited by
the speed at which a single process can write, shutting the logging
collector off and letting everyone write fixes it, because now you can
bring the CPU cycles of many processes to bear rather than just one.
If you're limited by the rate at which you can lay the file down on
disk, then turning off the logging collector doesn't help, but I don't
think that's the main problem.  Now, of course, if you're writing the
file to disk faster than a single process could do all those writes,
then you're probably also going to need multiple processes to keep up
with reading it, parsing it, etc.  But that's not a problem for
PostgreSQL core unless we decide to start shipping an in-core log
analyzer.

> If each backend could actually log to *its own file*, then things would get
> sped up. But we can't do that today. Unless you use the hooks and build it
> yourself.

That seems like a useful thing to support in core.

> Per the thread referenced, using the hooks to handle the
> very-high-rate-logging case seems to be the conclusion. But is that still
> the conclusion, or do we feel we need to also have a native solution?
>
> And if the conclusion is that hooks is the way to go for that, then is the
> slowdown of this patch actually a relevant problem to it?

I think that if we commit what you've proposed, we're making it harder
for people who have a high volume of logging but are not currently
using hooks.  I think we should try really hard to avoid the situation
where our suggested workaround for a server change is "go write some C
code and maybe you can get back to the performance you had with
release N-1".  That's just not friendly.

I wonder if it would be feasible to set things up so that the logging
collector was always started, but whether or not backends used it or
wrote directly to their original stderr was configurable (e.g. dup
stderr elsewhere, then dup whichever output source is currently
selected onto stderr, then dup the other one if the config is changed
later).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] log_destination=file

2018-01-20 Thread Magnus Hagander
On Tue, Nov 14, 2017 at 5:33 PM, Robert Haas  wrote:

> On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthali...@gmail.com>
> wrote:
> > average latency:
> >
> > clients patch   master
> > 10  0.321   0.286
> > 20  0.669   0.602
> > 30  1.016   0.942
> > 40  1.358   1.280
> > 50  1.727   1.637
>
> That's still a noticeable slowdown, though.  And we've had previous
> reports of the overhead of logging being significant as well:
>
> http://postgr.es/m/CACLsApsA7U0GCFpojVQem6SGTEkv8vnwdbfhvi+dqo+gu5g...@mail.gmail.com
>
> I seem to recall a discussion, perhaps in-person, around the time Theo
> submitted that patch where it was reported that the logging collector
> could not be used on some systems he was working with because it
> became a major performance bottleneck.  With each backend writing its
> own messages to a file, it was tolerable, but when you tried to funnel
> everything through a single process, the back-pressure slowed down the
> entire system unacceptably.
>

Finally found myself back at this one, because I still think this is a
problem we definitely need to address (whether with this file or not).

The funneling into a single process is definitely an issue.

But we don't really solve that problem today with logging to stderr, do we?
Because somebody has to pick up the log as it came from stderr. Yes, you
get more overhead when sending the log to devnull, but that isn't really a
realistic scenario. The question is what to do when you actually want to
collect that much logging that quickly.

If each backend could actually log to *its own file*, then things would get
sped up. But we can't do that today. Unless you use the hooks and build it
yourself.

Per the thread referenced, using the hooks to handle the
very-high-rate-logging case seems to be the conclusion. But is that still
the conclusion, or do we feel we need to also have a native solution?

And if the conclusion is that hooks is the way to go for that, then is the
slowdown of this patch actually a relevant problem to it?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] log_destination=file

2017-11-29 Thread Michael Paquier
On Wed, Nov 15, 2017 at 12:33 AM, Robert Haas  wrote:
> On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>> average latency:
>>
>> clients patch   master
>> 10  0.321   0.286
>> 20  0.669   0.602
>> 30  1.016   0.942
>> 40  1.358   1.280
>> 50  1.727   1.637
>
> That's still a noticeable slowdown, though.  And we've had previous
> reports of the overhead of logging being significant as well:
>
> http://postgr.es/m/caclsapsa7u0gcfpojvqem6sgtekv8vnwdbfhvi+dqo+gu5g...@mail.gmail.com
>
> I seem to recall a discussion, perhaps in-person, around the time Theo
> submitted that patch where it was reported that the logging collector
> could not be used on some systems he was working with because it
> became a major performance bottleneck.  With each backend writing its
> own messages to a file, it was tolerable, but when you tried to funnel
> everything through a single process, the back-pressure slowed down the
> entire system unacceptably.

I have moved this patch to next CF for now.. The original patch still applies.
-- 
Michael



Re: [HACKERS] log_destination=file

2017-11-14 Thread Robert Haas
On Sun, Sep 10, 2017 at 5:29 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> average latency:
>
> clients patch   master
> 10  0.321   0.286
> 20  0.669   0.602
> 30  1.016   0.942
> 40  1.358   1.280
> 50  1.727   1.637

That's still a noticeable slowdown, though.  And we've had previous
reports of the overhead of logging being significant as well:

http://postgr.es/m/caclsapsa7u0gcfpojvqem6sgtekv8vnwdbfhvi+dqo+gu5g...@mail.gmail.com

I seem to recall a discussion, perhaps in-person, around the time Theo
submitted that patch where it was reported that the logging collector
could not be used on some systems he was working with because it
became a major performance bottleneck.  With each backend writing its
own messages to a file, it was tolerable, but when you tried to funnel
everything through a single process, the back-pressure slowed down the
entire system unacceptably.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company