Re: Proposal: Adding json logging

2018-04-18 Thread Michael Paquier
On Wed, Apr 18, 2018 at 12:10:47PM -0700, Christophe Pettus wrote: > On Apr 18, 2018, at 11:59, Robert Haas wrote: >> For the record, I'm tentatively in favor of including something like >> this in contrib. > > I'm much less fussed by this in contrib/ (with the same concern you > noted), at minim

Re: Proposal: Adding json logging

2018-04-18 Thread Michael Paquier
On Wed, Apr 18, 2018 at 02:59:26PM -0400, Robert Haas wrote: > > Note that logging_collector should be enabled in postgresql.conf to > ensure consistent log outputs. As JSON strings are longer than normal > logs generated by PostgreSQL, this module increases the odds of malformed > log entrie

Re: Proposal: Adding json logging

2018-04-18 Thread Alvaro Herrera
John W Higgins wrote: > On Sun, Apr 15, 2018 at 11:08 AM, David Arnold wrote: > > > >This would appear to solve multiline issues within Fluent. > > >https://docs.fluentd.org/v0.12/articles/parser_multiline > > > > I definitely looked at that, but what guarantees do I have that the > > sequenc

Re: Proposal: Adding json logging

2018-04-18 Thread David Arnold
Excellent phrasing (thanks to Christophe!): "There is a large class of log analysis tool out there that has trouble with multiline formats and we should be good ecosystem players" > I'm much less fussed by this in contrib/ (with the same concern you noted), at a minimum as an example of how to do

Re: Proposal: Adding json logging

2018-04-18 Thread Christophe Pettus
> On Apr 18, 2018, at 11:59, Robert Haas wrote: > > I'm not sure exactly how you intended this comment, but it seems to > me that whether CSV is easy or hard to parse, somebody might > legitimately find JSON more convenient. Of course. The specific comment I was replying to made a couple of

Re: Proposal: Adding json logging

2018-04-18 Thread Robert Haas
On Sun, Apr 15, 2018 at 1:07 PM, Christophe Pettus wrote: >> On Apr 15, 2018, at 09:51, David Arnold wrote: >> 1. Throughout this vivid discussion a good portion of support has already >> been manifested for the need of a more structured (machine readable) logging >> format. There has been no s

Re: Proposal: Adding json logging

2018-04-17 Thread David Arnold
Alvaro, just to clarify for me, do you refer to the messages generated by https://github.com/postgres/postgres/blob/master/src/backend/utils/error/elog.c or other messages? Standardizing on UTF8 seems a good option. Assuming it* is* a problem, I would classify this as another second-order problem,

Re: Proposal: Adding json logging

2018-04-17 Thread Alvaro Herrera
One issue I haven't seen mentioned in this thread is the translation status of the server message (as well as its encoding): it's possible to receive messages in some random language if the lc_message setting is changed. Requiring that lc_messages must always be set to some English locale seems li

Re: Proposal: Adding json logging

2018-04-17 Thread David Arnold
This discussion is thriving, and lo and behold, we've got an opinion from Eduardo (fluent-bit): https://github.com/fluent/fluent-bit/issues/564#issuecomment-381844419 >Also consider that in not all scenarios full multiline logs are flushed right away, sometimes there are delays. I think this l

Re: Proposal: Adding json logging

2018-04-17 Thread David Arnold
>To me it's implied by the doc at: https://www.postgresql.org/docs/current/static/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG Additionally this still depends on the way some middleware might choose to stream data. Can we really be sure the risk is minimal that any middleware would ha

Re: Proposal: Adding json logging

2018-04-17 Thread Daniel Verite
David Arnold wrote: > Interesting, does that implicitly mean the whole log event would get > transmitted as a "line" (with CRLF) in CSV. To me it's implied by the doc at: https://www.postgresql.org/docs/current/static/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG > In the aff

Re: Proposal: Adding json logging

2018-04-17 Thread Peter Eisentraut
On 4/16/18 23:12, Michael Paquier wrote: >> I have also had good success using syslog. While syslog is not very >> structured, the setting syslog_split_messages allows sending log entries >> that include newlines in one piece, which works well if you have some >> kind of full-text search engine at

Re: Proposal: Adding json logging

2018-04-16 Thread Michael Paquier
On Mon, Apr 16, 2018 at 07:52:58PM -0400, Peter Eisentraut wrote: > I have used https://github.com/mpihlak/pg_logforward in the past, which > seems to be about the same thing. Didn't know this one. Thanks. > I have also had good success using syslog. While syslog is not very > structured, the s

Re: Proposal: Adding json logging

2018-04-16 Thread Peter Eisentraut
On 4/13/18 20:00, David Arnold wrote: > I have reviewed some log samples and all DO contain some kind of multi > line logs which are very uncomfortable to parse reliably in a log streamer. > > I asked Michael Paquier about his > solution: https://github.com/michaelpq/pg_plugins/tree/master/jsonlog

Re: Proposal: Adding json logging

2018-04-16 Thread David Arnold
> In CSV a line break inside a field is easy to process for a parser, because (per https://tools.ietf.org/html/rfc4180): >"Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes" Interesting, does that implicitly mean the whole log event would get trans
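
The RFC 4180 rule quoted above can be illustrated with a minimal Python sketch. It is not from the thread, and the two-column sample record is invented for illustration rather than taken from PostgreSQL's csvlog layout. It shows that a CSV-aware parser returns a field with embedded line breaks as one logical record, while a naive line splitter sees several physical lines.

```python
import csv
import io

# Hypothetical two-column record; the second field contains line breaks
# and is therefore enclosed in double quotes, as RFC 4180 recommends.
raw = '"ERROR","syntax error\nLINE 1: SELEC 1\n        ^"\r\n'

# Naive line-by-line streaming cuts the record into separate pieces.
naive_lines = raw.splitlines()

# A CSV parser honours the quoting and yields a single logical record.
# newline="" keeps the line endings untouched so csv can interpret them.
records = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_lines), "physical lines")
print(len(records), "logical record(s)")
```

The difference between the two counts is the crux of the disagreement: the record is well-formed CSV, but a consumer that forwards each physical line as soon as it arrives never sees it as a whole.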

Re: Proposal: Adding json logging

2018-04-16 Thread Daniel Verite
David Arnold wrote: > Not claiming this assumption does imply parsing of a *rolling* set > of log lines with *previously unknown cardinality*. That's expensive > on computing resources. I don't have actual numbers, but it doesn't > seem too far-fetched, either. > I filed a question to the

Re: Proposal: Adding json logging

2018-04-16 Thread David Arnold
*Hi all,* This discussion has made big steps forward. It is very encouraging to see this amount of interest. It seems that this has been around at the back of many minds for some time already... Thanks to Christophe's friendly reminder, I aim to define the problem space as concisely as possible

Re: Proposal: Adding json logging

2018-04-16 Thread David Fetter
On Mon, Apr 16, 2018 at 10:06:29AM -0400, Andrew Dunstan wrote: > On 04/15/2018 05:05 PM, Christophe Pettus wrote: > >> On Apr 15, 2018, at 12:16, David Arnold wrote: > >> > >> Core-Problem: "Multi line logs are unnecessarily inconvenient to parse and > >> are not compatible with the design of so

Re: Proposal: Adding json logging

2018-04-16 Thread Andrew Dunstan
On 04/15/2018 05:05 PM, Christophe Pettus wrote: >> On Apr 15, 2018, at 12:16, David Arnold wrote: >> >> Core-Problem: "Multi line logs are unnecessarily inconvenient to parse and >> are not compatible with the design of some (commonly used) logging >> aggregation flows." > I'd argue that the

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
> I'd argue that the first line of attack on that should be to explain to those consumers of logs that they are making some unwarranted assumptions about the kind of inputs they'll be seeing. PostgreSQL's CSV log formats are not a particularly bizarre format, or very difficult to parse. The standar

Re: Proposal: Adding json logging

2018-04-15 Thread Christophe Pettus
> On Apr 15, 2018, at 12:16, David Arnold wrote: > > Core-Problem: "Multi line logs are unnecessarily inconvenient to parse and > are not compatible with the design of some (commonly used) logging > aggregation flows." I'd argue that the first line of attack on that should be to explain to th

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>Have you asked that question? You seem to at least have opened the source code - did you try to figure out what the logging format is? 1. -> No. 2. -> Yes. I might be wrong, but something in my head tells me I have seen them the other way round. Unfortunately, I'm not experienced enough to be ab

Re: Proposal: Adding json logging

2018-04-15 Thread John W Higgins
On Sun, Apr 15, 2018 at 11:08 AM, David Arnold wrote: > >This would appear to solve multiline issues within Fluent. > >https://docs.fluentd.org/v0.12/articles/parser_multiline > > I definitely looked at that, but what guarantees do I have that the > sequence is always ERROR/STATEMENT/DETAIL?

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>Why? The newlines aren't meaningfully different from other characters you need to parse? The data isn't actually stored in a newline separated fashion, that's just one byte with that meaning I miss the details, but I believe that stdout is usually parsed and streamed simply line by line. Like in:

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>This would appear to solve multiline issues within Fluent. >https://docs.fluentd.org/v0.12/articles/parser_multiline I definitely looked at that, but what guarantees do I have that the sequence is always ERROR/STATEMENT/DETAIL? And not the other way round? And it only works with tail logging

Re: Proposal: Adding json logging

2018-04-15 Thread Christophe Pettus
> On Apr 15, 2018, at 11:00, David Arnold wrote: > > CSV Logs: https://pastebin.com/uwfmRdU7 Is the issue that there are line breaks in things like lines 7-9? -- -- Christophe Pettus x...@thebuild.com

Re: Proposal: Adding json logging

2018-04-15 Thread Andres Freund
On 2018-04-15 18:00:05 +, David Arnold wrote: > CSV shows line breaks, STDOUT shows ERROR/FATAL and detail on different > lines, not an easy problem to stream-parse reliably (without some kind of a > buffer, etc)... Why? The newlines aren't meaningfully different from other characters you need

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>It looks like the thread skipped over the problem space for the solution space pretty fast OK, I apologize, it seemed to me from the feedback that the problem was already uncontested. To verify/falsify that was the objective of my previous mail :) >Can you elaborate? Sure. CSV Logs: https://pas

Re: Proposal: Adding json logging

2018-04-15 Thread John W Higgins
On Sun, Apr 15, 2018 at 10:39 AM, David Arnold wrote: > >More specifically, JSON logging does seem to be a solution in search of a > problem. PostgreSQL's CSV logs are very easy to machine-parse, and if > there are corrupt lines being emitted there, the first step should be to > fix those, rathe

Re: Proposal: Adding json logging

2018-04-15 Thread Christophe Pettus
> On Apr 15, 2018, at 10:39, David Arnold wrote: > > In the light of the specific use case / problem for this thread to be born, > what exactly would you suggest? It looks like the thread skipped over the problem space for the solution space pretty fast; I see your note: > I have reviewed so

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>More specifically, JSON logging does seem to be a solution in search of a problem. PostgreSQL's CSV logs are very easy to machine-parse, and if there are corrupt lines being emitted there, the first step should be to fix those, rather than introduce a new "this time, for sure" logging method. >I

Re: Proposal: Adding json logging

2018-04-15 Thread Christophe Pettus
> On Apr 15, 2018, at 10:07, Christophe Pettus wrote: > > >> On Apr 15, 2018, at 09:51, David Arnold wrote: >> >> 1. Throughout this vivid discussion a good portion of support has already >> been manifested for the need of a more structured (machine readable) logging >> format. There has be

Re: Proposal: Adding json logging

2018-04-15 Thread Christophe Pettus
> On Apr 15, 2018, at 09:51, David Arnold wrote: > > 1. Throughout this vivid discussion a good portion of support has already > been manifested for the need of a more structured (machine readable) logging > format. There has been no substantial objection to this need. I'm afraid I don't see

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
Does everyone more or less agree with the following intermediate résumé? 1. Throughout this vivid discussion a good portion of support has already been manifested for the need of a more structured (machine readable) logging format. There has been no substantial objection to this need. 2. It has b

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
> Exactly - arrays, maps, nested json objects. It's more organized and easier to reason about. As postgresql becomes more and more sophisticated over time, I see flat logging becoming more unwieldy. With tools like jq, reading and querying json on the command line is simple and user friendly, and u

Re: Proposal: Adding json logging

2018-04-15 Thread Jordan Deitch
> Exactly what are you logging here ??? Why would I need to see a > multi-dimensional array in the log ? If I wanted to capture the location of errors my clients are encountering on their postgres clusters in detail, I would need to parse the 'LOCATION' string in their log entries, parse out th
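
As a rough sketch of the string parsing described above, the snippet below pulls the function name, file name and line number out of a LOCATION entry. The assumed shape "LOCATION:  function, file.c:line" and the sample line are illustrative assumptions, not taken from the thread; the helper name parse_location is hypothetical.

```python
import re

# Assumed shape of a LOCATION line: "LOCATION:  function, file.c:line".
LOCATION_RE = re.compile(
    r"^LOCATION:\s+(?P<func>\w+), (?P<file>[\w.]+):(?P<line>\d+)$"
)

def parse_location(text):
    """Return a dict with func, file and line, or None if text does not match."""
    match = LOCATION_RE.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])  # convert the line number to an int
    return fields

# Illustrative sample line (an assumption, not copied from a real log).
parsed = parse_location("LOCATION:  exec_simple_query, postgres.c:1045")
print(parsed)
```

With a structured format, these three values could be emitted as separate fields, and this kind of ad hoc pattern matching would not be needed on the consumer side.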

Re: Proposal: Adding json logging

2018-04-15 Thread Dave Cramer
On 15 April 2018 at 11:27, Jordan Deitch wrote: > > > I would suggest that the community consider whether postgres will log > > multidimensional data. That will weigh into the decision of json vs. > > another format quite significantly. I am a fan of the json5 spec ( > > https://json5.org/), tho

Re: Proposal: Adding json logging

2018-04-15 Thread Jordan Deitch
> > I would suggest that the community consider whether postgres will log > multidimensional data. That will weigh into the decision of json vs. > another format quite significantly. I am a fan of the json5 spec ( > https://json5.org/), though adoption of this is quite poor. > > What do you mean

Re: Proposal: Adding json logging

2018-04-15 Thread David Arnold
>A slightly larger lift would include escaping newlines and ensuring that JSON output is always single lines, however long. I think that's necessary; actually, I was implicitly assuming it as a prerequisite. I cannot imagine anything else being actually useful. Alternatively, I'm sure logfmt ha
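
The single-line requirement discussed here is what a standard JSON serializer already provides for string values. The following Python sketch uses a hypothetical event whose field names are illustrative only (they are not a PostgreSQL log format) to show that embedded newlines are escaped on output and restored on input.

```python
import json

# Hypothetical log event; the field names are illustrative assumptions.
event = {
    "severity": "ERROR",
    "message": 'syntax error at or near "SELEC"',
    "statement": "SELEC 1\nFROM t",
}

# json.dumps escapes the embedded newline as the two characters "\" and "n",
# so the serialized event occupies exactly one physical line.
line = json.dumps(event)

# A line-oriented consumer can recover the original multi-line value.
decoded = json.loads(line)

print(line)
print(decoded["statement"])
```

Because the escaping is part of the JSON string grammar, a line-by-line streamer needs no buffering or ordering assumptions to reassemble an event.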

Re: Proposal: Adding json logging

2018-04-14 Thread Jordan Deitch
I would suggest that the community consider whether postgres will log multidimensional data. That will weigh into the decision of json vs. another format quite significantly. I am a fan of the json5 spec (https://json5.org/), though adoption of this is quite poor. --- Jordan Deitch https://i

Re: Proposal: Adding json logging

2018-04-14 Thread Ryan Pedela
On Sat, Apr 14, 2018, 4:33 PM Andres Freund wrote: > On 2018-04-15 00:31:14 +0200, David Fetter wrote: > > On Sat, Apr 14, 2018 at 01:20:16PM -0700, Andres Freund wrote: > > > On 2018-04-14 18:05:18 +0200, David Fetter wrote: > > > > CSV is very poorly specified, which makes it at best complicate

Re: Proposal: Adding json logging

2018-04-14 Thread Andres Freund
On 2018-04-15 00:31:14 +0200, David Fetter wrote: > On Sat, Apr 14, 2018 at 01:20:16PM -0700, Andres Freund wrote: > > On 2018-04-14 18:05:18 +0200, David Fetter wrote: > > > CSV is very poorly specified, which makes it at best complicated to > > > build correct parsing libraries. JSON, whatever gr

Re: Proposal: Adding json logging

2018-04-14 Thread David Fetter
On Sat, Apr 14, 2018 at 01:20:16PM -0700, Andres Freund wrote: > On 2018-04-14 18:05:18 +0200, David Fetter wrote: > > On Sat, Apr 14, 2018 at 11:51:17AM -0400, Tom Lane wrote: > > > David Fetter writes: > > > > I think a suite of json_to_* utilities would be a good bit more > > > > helpful in thi

Re: Proposal: Adding json logging

2018-04-14 Thread Tom Lane
Andres Freund writes: > On 2018-04-14 18:05:18 +0200, David Fetter wrote: >> CSV is very poorly specified, which makes it at best complicated to >> build correct parsing libraries. JSON, whatever gripes I have about >> the format[1] is extremely well specified, and hence has excellent >> parsing l

Re: Proposal: Adding json logging

2018-04-14 Thread Andres Freund
On 2018-04-14 18:05:18 +0200, David Fetter wrote: > On Sat, Apr 14, 2018 at 11:51:17AM -0400, Tom Lane wrote: > > David Fetter writes: > > > I think a suite of json_to_* utilities would be a good bit more > > > helpful in this regard than changing our human-eye-consumable > > > logs. We already ha

Re: Proposal: Adding json logging

2018-04-14 Thread David Arnold
Given we have the following LOG_DESTINATION... Source: https://github.com/postgres/postgres/blob/9d4649ca49416111aee2c84b7e4441a0b7aa2fac/src/include/utils/elog.h#L394-L398 /* Log destination bitmap */ #define LOG_DESTINATION_STDERR 1 #define LOG_DESTINATION_SYSLOG 2 #define LOG_DESTINATION_EVEN

Re: Proposal: Adding json logging

2018-04-14 Thread David Arnold
>As to logfmt in particular, the fact that it's not standardized is probably a show-stopper. >Let's go with JSON. I agree, though I don't want to dismiss the idea of logfmt entirely yet. In container infrastructure it's a de facto standard and it solves a real problem. But I'm in favor to step ba

Re: Proposal: Adding json logging

2018-04-14 Thread David Fetter
On Sat, Apr 14, 2018 at 11:51:17AM -0400, Tom Lane wrote: > David Fetter writes: > > I think a suite of json_to_* utilities would be a good bit more > > helpful in this regard than changing our human-eye-consumable > > logs. We already have human-eye-consumable logs by default. What > > we don't

Re: Proposal: Adding json logging

2018-04-14 Thread David Arnold
>I'm dubious that JSON is "easier on machines" than CSV. Under common paradigms you are right, but if we talk of line-by-line streaming with subsequent processing, then it's a show stopper. Of course, some log aggregators have buffers for that and can do Multiline parsing on that buffer, but 1. No

Re: Proposal: Adding json logging

2018-04-14 Thread Tom Lane
David Fetter writes: > I think a suite of json_to_* utilities would be a good bit more > helpful in this regard than changing our human-eye-consumable logs. We > already have human-eye-consumable logs by default. What we don't > have, and increasingly do want, is a log format that's really easy o

Re: Proposal: Adding json logging

2018-04-14 Thread David Fetter
On Sat, Apr 14, 2018 at 03:27:58PM +, David Arnold wrote: > > Plus it's likely only a short-lived interchange format, not something to be > retained for a long period. > > Absolutely. > > There might be an argument that it's not easy on the eyes in the case it > would be consumed by a pair of

Re: Proposal: Adding json logging

2018-04-14 Thread David Arnold
> Plus it's likely only a short-lived interchange format, not something to be retained for a long period. Absolutely. There might be an argument that it's not easy on the eyes in the case it would be consumed by a pair of them. It's absolutely valid. Golang community has found a solution for that

Re: Proposal: Adding json logging

2018-04-14 Thread Craig Ringer
On 14 April 2018 at 11:24, Michael Paquier wrote: > "I proposed that a couple of years back, to be rejected as the key names > are too much repetitive and take too much place. gzip is astonishingly good at dealing with that, so I think that's actually a bit of a silly reason to block it. Plus i
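
The compression argument can be checked with a small experiment. This is a sketch under stated assumptions: the events and their key names are made up for illustration, and the resulting sizes depend entirely on the data, so no particular ratio is claimed.

```python
import gzip
import json

# Hypothetical events that repeat the same key names on every line.
lines = [
    json.dumps({
        "timestamp": f"2018-04-14 00:00:{i % 60:02d}",
        "severity": "LOG",
        "message": f"connection received: host=10.0.0.{i % 250}",
    })
    for i in range(1000)
]
raw = ("\n".join(lines) + "\n").encode("utf-8")

# gzip replaces repeated byte sequences, such as the key names, with
# short back-references, so the repetition costs little once compressed.
compressed = gzip.compress(raw)

print(f"{len(raw)} bytes raw, {len(compressed)} bytes gzipped")
```

Running the same comparison against a real log sample would be the way to judge whether repetitive key names matter in practice.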

Re: Proposal: Adding json logging

2018-04-13 Thread Michael Paquier
On Sat, Apr 14, 2018 at 12:00:16AM +, David Arnold wrote: > I'm new here. I'm David and would describe myself as an ambitious newbie, > so please take my suggestion with a grain of salt. Welcome here. > I asked Michael Paquier about his solution: > https://github.com/michaelpq/pg_plugins/tre

Proposal: Adding json logging

2018-04-13 Thread David Arnold
*Hello,* I'm new here. I'm David and would describe myself as an ambitious newbie, so please take my suggestion with a grain of salt. *Use case:* I find it difficult to properly parse postgres logs into some kind of log aggregator (I use fluent bit). My two standard options are standard and csvlog