Re: Error log format configuration syntax
On 28-07-10 15:41, Rainer Jung wrote:
On 28.07.2010 13:44, Dan Poirier wrote:
On 2010-07-28 at 03:51, Alex Wulms alex.wu...@scarlet.be wrote:
Hi, While adding some debug log statements to a module I'm working on, I ran into the problem that ap_log_error (in apache 2.2) does not support %zd or %zu conversion for type_t arguments. This made it hard to get the code to compile on both 32- and 64-bit platforms. [...] platform (32-bit or 64-bit). This violates the whole purpose of type_t, which was introduced in C exactly to provide cross-platform compatibility...
I'm confused. Neither C89 nor C99 defines a type type_t, as far as I can see. But you might find the *_FMT macro definitions from APR helpful, or else explain your problem in more detail?
It seems in C99 a length specifier z means: For integer types, causes printf to expect a size_t sized integer argument. Citing: Specifies that a following d, i, o, u, x, or X conversion specifier applies to a size_t or the corresponding signed integer type argument; or that a following n conversion specifier applies to a pointer to a signed integer type corresponding to size_t argument. At least cross checking on Solaris 10 shows it is actually implemented for printf() there. So: type_t -> size_t, and the OP should be able to use APR_SIZE_T_FMT or APR_SSIZE_T_FMT (signed) as you suggested.
Thanks for the tip about the APR formatting macros (I'm still getting up to speed on the features of the APR API). It is indeed what I need. I'll adapt the code accordingly. Cheers, Alex
Re: Error log format configuration syntax
Hi, While adding some debug log statements to a module I'm working on, I ran into the problem that ap_log_error (in apache 2.2) does not support %zd or %zu conversion for type_t arguments. This made it hard to get the code to compile on both 32- and 64-bit platforms. So I made a small wrapper (see below) to work around the problem. I suggest building support for %zd and %zu conversion into the unified logger.
/**
 * ap_log_error does not support %zd or %zu conversion for type_t arguments.
 * So with ap_log_error one would have to specify either %d or %ld, depending on the
 * platform (32-bit or 64-bit). This violates the whole purpose of type_t, which
 * was introduced in C exactly to provide cross-platform compatibility...
 * This wrapper function supports %zd and %zu conversion parameters.
 * Note that it truncates the logged message to 1000 bytes, so don't use it to log
 * messages that might be longer.
 */
void ap_log_error_wrapper(const char *file, int line, int level, apr_status_t status, const server_rec *s, const char *fmt, ...)
{
    char msg[1000];
    va_list ap;
    va_start(ap, fmt);
    /* Format with the C library so that %zd and %zu are understood. */
    vsnprintf(msg, sizeof(msg), fmt, ap);
    va_end(ap);
    /* Pass the result through "%s" so it is not parsed as a format again. */
    ap_log_error(file, line, level, status, s, "%s", msg);
}
Cheers, Alex
On 27-07-10 23:11, Stefan Fritsch wrote:
On Wednesday 21 July 2010, William A. Rowe Jr. wrote:
On 7/21/2010 4:35 PM, Stefan Fritsch wrote:
And I agree that a unified logging formatter would be nice, but this is not something that will be done before 2.3.7 and it would likely mean an incompatible API change for modules providing handlers for mod_log_config. IMHO, this can wait until 3.0.
IMHO, it must not. The simple reason is that the more code duplication we introduce, the more opportunity for flaws and maintenance headaches during the 2.4 lifecycle. I'd accept waiting for this entire custom error log feature for 3.0, but really would rather introduce it sooner and not later.
I fear a unified logging formatter may either delay 2.4 or not make it into 2.4. In 2.4, we really need some way to omit some of the fields we currently have in the error log prefix ( 100 chars of prefix is insane). But as I had less time in the last few days than I hoped, it does not look like my (non-unified) errorlog formatter would be ready for 2.3.7 in any case. So we will have time until 2.3.8 to decide what to do.
If there is a resulting API change, I think everyone is willing to accept this during the 2.3-alpha/beta cycle. 2.4.0 is our 'lockdown'. I'm willing to help with this code, although I'm just starting to dig out of the hole from my two recent hand surgeries. 6-finger typing isn't all that efficient :)
I think there are more important things that could use your and my attention than avoiding this bit of code duplication. But if you want to start a bit of technical discussion: The main technical difference between error and access log is that for the access log, everything is allocated from the request pool and then written to the log file (i.e. there is no length limit). For the error log, there is a fixed 8K buffer (allocated on the stack). For the error log, none of the existing pools can be used, because that would risk unbounded memory usage.
The unified log formatter would either have to use a pre-allocated buffer or a temp pool. For my work on the error log formatter, I have stayed with the pre-allocated buffer. Would this be a reasonable solution for the unified logger? It would mean a fixed limit for the access log lines (though the access log could use a larger buffer than the error log, I guess 16 or 32K would be enough even for the access log). The pool approach would require a per-thread temp logging pool (using apr_threadkey_private_create or the like) or creating and destroying a sub-pool for every log line. Which solution looks better to you? Cheers, Stefan
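The wrapper at the top of this message truncates silently at the size of its stack buffer. As a minimal sketch in standard C only (so it compiles without the httpd headers), the return value of vsnprintf can be used to notice that truncation. The name format_bounded is hypothetical and not part of httpd or APR.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats into a fixed buffer.
 * Returns 1 if the output was truncated (or formatting failed), else 0. */
static int format_bounded(char *buf, size_t buflen, const char *fmt, ...)
{
    va_list ap;
    int needed;

    assert(buf != NULL && buflen > 0);
    va_start(ap, fmt);
    /* C99 vsnprintf returns the length the full output would have had. */
    needed = vsnprintf(buf, buflen, fmt, ap);
    va_end(ap);
    return needed < 0 || (size_t)needed >= buflen;
}
```

A caller could use the return value to append a marker such as "..." or to log a second line, instead of losing the tail of the message without notice.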
Re: Error log format configuration syntax
On 2010-07-28 at 03:51, Alex Wulms alex.wu...@scarlet.be wrote:
Hi, While adding some debug log statements to a module I'm working on, I ran into the problem that ap_log_error (in apache 2.2) does not support %zd or %zu conversion for type_t arguments. This made it hard to get the code to compile on both 32- and 64-bit platforms. [...] platform (32-bit or 64-bit). This violates the whole purpose of type_t, which was introduced in C exactly to provide cross-platform compatibility...
I'm confused. Neither C89 nor C99 defines a type type_t, as far as I can see. But you might find the *_FMT macro definitions from APR helpful, or else explain your problem in more detail? Dan
Re: Error log format configuration syntax
On 28.07.2010 13:44, Dan Poirier wrote:
On 2010-07-28 at 03:51, Alex Wulms alex.wu...@scarlet.be wrote:
Hi, While adding some debug log statements to a module I'm working on, I ran into the problem that ap_log_error (in apache 2.2) does not support %zd or %zu conversion for type_t arguments. This made it hard to get the code to compile on both 32- and 64-bit platforms. [...] platform (32-bit or 64-bit). This violates the whole purpose of type_t, which was introduced in C exactly to provide cross-platform compatibility...
I'm confused. Neither C89 nor C99 defines a type type_t, as far as I can see. But you might find the *_FMT macro definitions from APR helpful, or else explain your problem in more detail?
It seems in C99 a length specifier z means: For integer types, causes printf to expect a size_t sized integer argument. Citing: Specifies that a following d, i, o, u, x, or X conversion specifier applies to a size_t or the corresponding signed integer type argument; or that a following n conversion specifier applies to a pointer to a signed integer type corresponding to size_t argument. At least cross checking on Solaris 10 shows it is actually implemented for printf() there. So: type_t -> size_t, and the OP should be able to use APR_SIZE_T_FMT or APR_SSIZE_T_FMT (signed) as you suggested. Regards, Rainer
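For readers who have not used the APR format macros: they are string literals that get pasted into the format string, so the same source line works on 32-bit and 64-bit builds. The sketch below shows the pattern. It assumes nothing beyond standard C; the fallback definition of APR_SIZE_T_FMT is a stand-in so the example compiles without APR (in a real module, apr.h supplies the macro), and format_bytes_read is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for illustration only: in a real module apr.h defines
 * APR_SIZE_T_FMT with the right conversion for the platform. */
#ifndef APR_SIZE_T_FMT
#define APR_SIZE_T_FMT "zu"
#endif

/* Hypothetical helper: formats a byte count portably into buf.
 * Returns what snprintf returns (the length of the full output). */
static int format_bytes_read(char *buf, size_t buflen, size_t nbytes)
{
    assert(buf != NULL && buflen > 0);
    /* Adjacent string literals are concatenated by the compiler. */
    return snprintf(buf, buflen, "read %" APR_SIZE_T_FMT " bytes", nbytes);
}
```

Inside a module the same idea would be written directly in the logging call, along the lines of ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s, "read %" APR_SIZE_T_FMT " bytes", len).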
Re: Error log format configuration syntax
On Wednesday 21 July 2010, William A. Rowe Jr. wrote:
On 7/21/2010 4:35 PM, Stefan Fritsch wrote:
And I agree that a unified logging formatter would be nice, but this is not something that will be done before 2.3.7 and it would likely mean an incompatible API change for modules providing handlers for mod_log_config. IMHO, this can wait until 3.0.
IMHO, it must not. The simple reason is that the more code duplication we introduce, the more opportunity for flaws and maintenance headaches during the 2.4 lifecycle. I'd accept waiting for this entire custom error log feature for 3.0, but really would rather introduce it sooner and not later.
I fear a unified logging formatter may either delay 2.4 or not make it into 2.4. In 2.4, we really need some way to omit some of the fields we currently have in the error log prefix ( 100 chars of prefix is insane). But as I had less time in the last few days than I hoped, it does not look like my (non-unified) errorlog formatter would be ready for 2.3.7 in any case. So we will have time until 2.3.8 to decide what to do.
If there is a resulting API change, I think everyone is willing to accept this during the 2.3-alpha/beta cycle. 2.4.0 is our 'lockdown'. I'm willing to help with this code, although I'm just starting to dig out of the hole from my two recent hand surgeries. 6-finger typing isn't all that efficient :)
I think there are more important things that could use your and my attention than avoiding this bit of code duplication. But if you want to start a bit of technical discussion: The main technical difference between error and access log is that for the access log, everything is allocated from the request pool and then written to the log file (i.e. there is no length limit). For the error log, there is a fixed 8K buffer (allocated on the stack). For the error log, none of the existing pools can be used, because that would risk unbounded memory usage.
The unified log formatter would either have to use a pre-allocated buffer or a temp pool. For my work on the error log formatter, I have stayed with the pre-allocated buffer. Would this be a reasonable solution for the unified logger? It would mean a fixed limit for the access log lines (though the access log could use a larger buffer than the error log, I guess 16 or 32K would be enough even for the access log). The pool approach would require a per-thread temp logging pool (using apr_threadkey_private_create or the like) or creating and destroying a sub-pool for every log line. Which solution looks better to you? Cheers, Stefan
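To make the pre-allocated buffer option concrete, here is a rough sketch in standard C of a bounded line buffer whose storage is supplied by the caller, so the error log could pass an 8K stack array and the access log a larger one. This is not the code from the patch; struct log_line, log_line_init and log_line_append are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded line buffer; the caller owns the storage. */
struct log_line {
    char *data;   /* caller-provided storage */
    size_t cap;   /* total size of data, including the terminating NUL */
    size_t len;   /* bytes used so far, excluding the NUL */
};

static void log_line_init(struct log_line *l, char *storage, size_t cap)
{
    assert(l != NULL && storage != NULL && cap > 0);
    l->data = storage;
    l->cap = cap;
    l->len = 0;
    l->data[0] = '\0';
}

/* Appends s, truncating once the buffer is full, so memory use stays
 * bounded no matter how long the message is.
 * Returns the number of bytes actually copied. */
static size_t log_line_append(struct log_line *l, const char *s)
{
    size_t avail = l->cap - 1 - l->len;
    size_t n = strlen(s);

    if (n > avail)
        n = avail;
    memcpy(l->data + l->len, s, n);
    l->len += n;
    l->data[l->len] = '\0';
    return n;
}
```

The trade-off discussed above shows up directly: no allocation happens per log line, but anything beyond the capacity is dropped, which is why a larger buffer would be needed for access log lines.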
Re: Error log format configuration syntax
On 20.07.2010 21:14, Stefan Fritsch wrote:
On Tue, 20 Jul 2010, Rainer Jung wrote:
[...] message and behind the message. I guess you can get rid of the latter split by assigning a format specifier also to the log message, like '%M' or similar, and then
ErrorLogFormat [%{u}t] [%l] [pid %P%{:tid }T] %F: %{}{: }E%{[client }{] }a %M %{}{, referer: }i
Doing it with two config directives was more straightforward to implement, but I agree that the configuration would be easier to read with a format specifier for the log message.
Maybe we should question how important the configurable prefixes and suffixes are. We could either provide fixed ones for the individual log patterns, or we could provide none and indeed log an empty string or - if we don't have a value. I'd say both ways are viable. I guess some users would find it nice to have a fixed column format until the error message begins, so it's easier to parse by script, others will find it more readable if the empty fields get suppressed (condensed format). What about:
- Allow choosing whether empty values get dropped (one configuration switch to choose the condensed format)
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
That's a very interesting idea. Taking it one step further, one could introduce a meta-character (e.g. ^) for separating the fields. If a format specifier produces no output, everything from the previous to the next field separator gets deleted. For example:
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
That's quite readable. I will check how much effort this is to implement.
Yes, that way the choice between condensed and keeping empty fields is also left to users. If they mark a prefix or a suffix they get it removed; if not, it will always stay (and we could print - for the missing data). Regards, Rainer
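The '^' field separator rule can be sketched in a few lines of standard C. This is only an illustration of the rule as described in the thread, not the implementation from the patch: it handles one-letter specifiers without {arg} parts, and render_fields, lookup_fn and the two sample lookup functions are hypothetical names with made-up sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup type: returns the text for a one-letter format
 * specifier, or NULL (or "") when the item has no value right now. */
typedef const char *(*lookup_fn)(char spec);

/* Expands fmt into out. '^' separates fields; a field in which any
 * %-specifier yields no output is removed entirely. */
static void render_fields(const char *fmt, lookup_fn lookup,
                          char *out, size_t outlen)
{
    size_t pos = 0;

    assert(fmt != NULL && lookup != NULL && out != NULL && outlen > 0);
    while (*fmt != '\0') {
        size_t field_start = pos;  /* roll back to here if dropped */
        int drop = 0;

        while (*fmt != '\0' && *fmt != '^') {
            char literal[2] = { '\0', '\0' };
            const char *piece;

            if (fmt[0] == '%' && fmt[1] != '\0') {
                piece = lookup(fmt[1]);
                if (piece == NULL || *piece == '\0')
                    drop = 1;
                fmt += 2;
            } else {
                literal[0] = *fmt++;
                piece = literal;
            }
            /* Copy only while the field is still wanted and there is room. */
            while (!drop && *piece != '\0' && pos + 1 < outlen)
                out[pos++] = *piece++;
        }
        if (drop)
            pos = field_start;
        if (*fmt == '^')
            fmt++;
    }
    out[pos] = '\0';
}

/* Sample data: a request handled by a threaded MPM. */
static const char *lookup_request(char spec)
{
    switch (spec) {
    case 'P': return "19220";
    case 'T': return "4132666224";
    case 'a': return "127.0.0.1:36119";
    case 'M': return "something's broken";
    default:  return NULL;
    }
}

/* Sample data: server scope, non-threaded MPM (no tid, no client). */
static const char *lookup_server(char spec)
{
    switch (spec) {
    case 'P': return "19220";
    case 'M': return "something's broken";
    default:  return NULL;
    }
}
```

With the format "[pid %P^:tid %T^] ^[client %a] ^%M", the first lookup keeps every field, while the second drops the ":tid %T" and "[client %a] " fields and leaves "[pid 19220] something's broken".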
RE: Error log format configuration syntax
-----Original Message-----
From: Rainer Jung
Sent: Wednesday, 21 July 2010 11:24
To: dev@httpd.apache.org
Subject: Re: Error log format configuration syntax
On 20.07.2010 21:14, Stefan Fritsch wrote:
On Tue, 20 Jul 2010, Rainer Jung wrote:
[...] message and behind the message. I guess you can get rid of [...]
That's a very interesting idea. Taking it one step further, one could introduce a meta-character (e.g. ^) for separating the fields. If a format specifier produces no output, everything from the previous to the next field separator gets deleted. For example:
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
That's quite readable. I will check how much effort this is to implement.
Yes, that way the choice between condensed and keeping empty fields is also left to users. If they mark a prefix or a suffix they get it removed; if not, it will always stay (and we could print - for the missing data).
Sounds reasonable. Regards Rüdiger
Re: Error log format configuration syntax
On 7/20/2010 2:14 PM, Stefan Fritsch wrote:
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
Ouch - please evaluate using %^xxx instead. I'd rather this were simply supported in a unified logging formatter, so it would apply to access, new log extensions, etc. It could imply delete-adjacent non-whitespace unless there were multiple %-escapes in the same token. I'm not fond of arbitrarily adding new escape codes, and we already had a suitable one (%).
Re: Error log format configuration syntax
On Wednesday 21 July 2010, William A. Rowe Jr. wrote:
On 7/20/2010 2:14 PM, Stefan Fritsch wrote:
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
Ouch - please evaluate using %^xxx instead. I'd rather this were simply supported in a unified logging formatter, so it would apply to access, new log extensions, etc. It could imply delete-adjacent non-whitespace unless there were multiple %-escapes in the same token. I'm not fond of arbitrarily adding new escape codes, and we already had a suitable one (%).
We want a dedicated field separator, because only deleting adjacent non-whitespace wouldn't work for things like [client %a]. I would prefer a single character as separator and have chosen '^' because it is relatively unlikely to appear literally in the log format. But I wouldn't mind using a two-character separator. In this case, '%|' is probably more readable than '%^'.
And I agree that a unified logging formatter would be nice, but this is not something that will be done before 2.3.7 and it would likely mean an incompatible API change for modules providing handlers for mod_log_config. IMHO, this can wait until 3.0.
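The objection about [client %a] is easy to reproduce with a small experiment. The sketch below is one possible reading of the "delete adjacent non-whitespace" rule, written in standard C for illustration only: tokens are runs of non-space characters, and a token is dropped when a specifier in it has no value. The names render_tokens, lookup_fn and lookup_server_scope, and the sample data, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup type: value for a one-letter specifier, or NULL. */
typedef const char *(*lookup_fn)(char spec);

/* Expands fmt token by token. A token containing a specifier without a
 * value is dropped as a whole; kept tokens are joined by one space. */
static void render_tokens(const char *fmt, lookup_fn lookup,
                          char *out, size_t outlen)
{
    size_t pos = 0;

    assert(fmt != NULL && lookup != NULL && out != NULL && outlen > 0);
    while (*fmt != '\0') {
        size_t token_start;
        int drop = 0;

        while (*fmt == ' ')
            fmt++;
        if (*fmt == '\0')
            break;
        token_start = pos;  /* roll back to here if the token is dropped */
        if (pos > 0 && pos + 1 < outlen)
            out[pos++] = ' ';
        while (*fmt != '\0' && *fmt != ' ') {
            char literal[2] = { '\0', '\0' };
            const char *piece;

            if (fmt[0] == '%' && fmt[1] != '\0') {
                piece = lookup(fmt[1]);
                if (piece == NULL || *piece == '\0')
                    drop = 1;
                fmt += 2;
            } else {
                literal[0] = *fmt++;
                piece = literal;
            }
            while (!drop && *piece != '\0' && pos + 1 < outlen)
                out[pos++] = *piece++;
        }
        if (drop)
            pos = token_start;
    }
    out[pos] = '\0';
}

/* Sample data: server scope, so there is no client address. */
static const char *lookup_server_scope(char spec)
{
    return spec == 'M' ? "something's broken" : NULL;
}
```

For the format "[client %a] %M" this yields "[client something's broken": the token "%a]" disappears, but the literal "[client" in the neighbouring token stays behind, which is the case a dedicated separator is meant to solve.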
Re: Error log format configuration syntax
On 7/21/2010 4:35 PM, Stefan Fritsch wrote: And I agree that a unified logging formatter would be nice, but this is not something that will be done before 2.3.7 and it would likely mean an incompatible API change for modules providing handlers for mod_log_config. IMHO, this can wait until 3.0. IMHO, it must not. The simple reason is that the more code duplication we introduce, the more opportunity for flaws and maintenance headaches during the 2.4 lifecycle. I'd accept waiting for this entire custom error log feature for 3.0, but really would rather introduce it sooner and not later. If there is a resulting API change, I think everyone is willing to accept this during the 2.3-alpha/beta cycle. 2.4.0 is our 'lockdown'. I'm willing to help with this code, although I'm just starting to dig out of the hole from my two recent hand surgeries. 6-finger typing isn't all that efficient :)
Re: Error log format configuration syntax
On 7/21/2010 4:35 PM, Stefan Fritsch wrote:
On Wednesday 21 July 2010, William A. Rowe Jr. wrote:
On 7/20/2010 2:14 PM, Stefan Fritsch wrote:
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
Ouch - please evaluate using %^xxx instead. I'd rather this were simply supported in a unified logging formatter, so it would apply to access, new log extensions, etc. It could imply delete-adjacent non-whitespace unless there were multiple %-escapes in the same token. I'm not fond of arbitrarily adding new escape codes, and we already had a suitable one (%).
We want a dedicated field separator, because only deleting adjacent non-whitespace wouldn't work for things like [client %a]. I would prefer a single character as separator and have chosen '^' because it is relatively unlikely to appear literally in the log format. But I wouldn't mind using a two-character separator. In this case, '%|' is probably more readable than '%^'.
Two thoughts ... some %{[client }q text-literal quoting method could keep that as a literal space, not whitespace. Or perhaps [client\ %^a] could do the same thing?
Re: Error log format configuration syntax
On 20.07.2010 00:39, Stefan Fritsch wrote:
Hi, I have been working on making the error log format configurable. It's more or less working now, but I could use some feedback about the config syntax. The difficulty is that many tokens only produce output in some situations (e.g. no remote IP in server scope, no thread id in a non-threaded MPM, etc.). Since we don't want to have many empty []s or -s in the log, such tokens should take a prefix and suffix that are only printed if there is some relevant data. Also, some tokens need an additional argument (e.g. time format, header name, ...). Due to cut'n'paste from mod_log_config, the currently implemented syntax is this:
%{arg}{prefix}{suffix}T
This results in rather ugly configuration lines. For example the format
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%{:tid }T] %F: %{}{: }E%{[client }{] }a
ErrorLogFormat suffix %{Referer}{, referer: }i
gives roughly what we currently have in trunk:
[Mon Jul 19 23:41:17.073289 2010] [debug] [pid 19220:tid 4132666224] http_request.c(300): (42)Broken Pipe: [client 127.0.0.1:36119] something's broken, referer: http://blah.com/
Ah, you used prefix and suffix here in two different contexts, once as a placeholder in describing the syntax for the per token prefixes and suffixes, and once as reserved words for defining the log format used in front of the message and behind the message. I guess you can get rid of the latter split by assigning a format specifier also to the log message, like '%M' or similar, and then
ErrorLogFormat [%{u}t] [%l] [pid %P%{:tid }T] %F: %{}{: }E%{[client }{] }a %M %{}{, referer: }i
One could use different separators:
%prefix{arg}suffixT
which would lead to things like
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%:tid T] %F: %: E%[client ] a
ErrorLogFormat suffix %, referer: {Referer} i
which is a bit better but not really good. Or maybe:
%prefix{arg}Tsuffix
resulting in
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%:tid T] %F: %E: %[client a]
ErrorLogFormat suffix %, referer: {Referer}i
Does anyone have a better idea?
Maybe we should question how important the configurable prefixes and suffixes are. We could either provide fixed ones for the individual log patterns, or we could provide none and indeed log an empty string or - if we don't have a value. I'd say both ways are viable. I guess some users would find it nice to have a fixed column format until the error message begins, so it's easier to parse by script, others will find it more readable if the empty fields get suppressed (condensed format). What about:
- Allow choosing whether empty values get dropped (one configuration switch to choose the condensed format)
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
BTW, I have also implemented once per request and once per connection logging and log ids that can be used to connect different lines in the error log and the error log with the access log.
Great.
For example, this format
ErrorLogFormat prefix [%{uc}t] [%m:%l] %{}{req:}{ }L%{C}{conn:}{ }L
ErrorLogFormat suffix
ErrorLogFormat connection [pid %P] local: %A - remote: %a
ErrorLogFormat request request %k on connection %{c}L %{Referer}{Referer: }i
gives this output, which should be rather nice for debugging:
[2010-07-19 23:35:45.076082] [core:notice] Command line: '/usr/local/apache2/bin/httpd'
[2010-07-19 23:35:46.832314] [-:-] conn:ruMCWgQNaAo [pid 18932] local: 127.0.0.1:8081 - remote: 127.0.0.1:49804
[2010-07-19 23:35:46.832367] [-:-] req:VOMCWgQNaAo request 0 on connection ruMCWgQNaAo
[2010-07-19 23:35:46.832382] [http:trace4] req:VOMCWgQNaAo Headers received from client:
[2010-07-19 23:35:46.832382] [http:trace4] req:VOMCWgQNaAo Connection: Keep-Alive
...
[2010-07-19 23:35:46.833359] [-:-] req:secCWmQNaAo request 1 on connection ruMCWgQNaAo
[2010-07-19 23:35:46.833385] [http:trace4] req:secCWmQNaAo Headers received from client:
The patch is available at http://people.apache.org/~sf/errorlog_format_v1.diff . It still needs some polishing and cleanup, though. Cheers, Stefan
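The core of the %{arg}{prefix}{suffix}T idea is that prefix and suffix are printed only when the item itself has a value. A minimal sketch of that behaviour in standard C follows; it is not taken from the patch, and format_item is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes prefix + value + suffix into out, or an
 * empty string when value is missing or empty.
 * Returns the number of characters stored (excluding the NUL). */
static size_t format_item(char *out, size_t outlen, const char *prefix,
                          const char *value, const char *suffix)
{
    int n;

    assert(out != NULL && outlen > 0 && prefix != NULL && suffix != NULL);
    if (value == NULL || *value == '\0') {
        out[0] = '\0';   /* no data: suppress prefix and suffix as well */
        return 0;
    }
    n = snprintf(out, outlen, "%s%s%s", prefix, value, suffix);
    if (n < 0) {
        out[0] = '\0';
        return 0;
    }
    return (size_t)n < outlen ? (size_t)n : outlen - 1;
}
```

Called with the prefix "[client " and suffix "] ", a connection-level message gets "[client 127.0.0.1:36119] " while a server-scope message, which has no remote address, gets nothing at all instead of an empty pair of brackets.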
Re: Error log format configuration syntax
On Tue, 20 Jul 2010, Rainer Jung wrote:
[...] message and behind the message. I guess you can get rid of the latter split by assigning a format specifier also to the log message, like '%M' or similar, and then
ErrorLogFormat [%{u}t] [%l] [pid %P%{:tid }T] %F: %{}{: }E%{[client }{] }a %M %{}{, referer: }i
Doing it with two config directives was more straightforward to implement, but I agree that the configuration would be easier to read with a format specifier for the log message.
Maybe we should question how important the configurable prefixes and suffixes are. We could either provide fixed ones for the individual log patterns, or we could provide none and indeed log an empty string or - if we don't have a value. I'd say both ways are viable. I guess some users would find it nice to have a fixed column format until the error message begins, so it's easier to parse by script, others will find it more readable if the empty fields get suppressed (condensed format). What about:
- Allow choosing whether empty values get dropped (one configuration switch to choose the condensed format)
- taking all adjacent non-whitespace as prefixes and suffixes (collapse resulting adjacent whitespace in the output by adding the whitespace in front of the prefix to the prefix and dropping leading whitespace from the resulting line)
That's a very interesting idea. Taking it one step further, one could introduce a meta-character (e.g. ^) for separating the fields. If a format specifier produces no output, everything from the previous to the next field separator gets deleted. For example:
ErrorLogFormat [%{u}t] [%l] [pid %P^:tid %T^] ^%F: ^%E: ^[client %a] ^%M ^, referer: %{Referer}i
That's quite readable. I will check how much effort this is to implement. Cheers, Stefan
Error log format configuration syntax
Hi, I have been working on making the error log format configurable. It's more or less working now, but I could use some feedback about the config syntax. The difficulty is that many tokens only produce output in some situations (e.g. no remote IP in server scope, no thread id in a non-threaded MPM, etc.). Since we don't want to have many empty []s or -s in the log, such tokens should take a prefix and suffix that are only printed if there is some relevant data. Also, some tokens need an additional argument (e.g. time format, header name, ...). Due to cut'n'paste from mod_log_config, the currently implemented syntax is this:
%{arg}{prefix}{suffix}T
This results in rather ugly configuration lines. For example the format
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%{:tid }T] %F: %{}{: }E%{[client }{] }a
ErrorLogFormat suffix %{Referer}{, referer: }i
gives roughly what we currently have in trunk:
[Mon Jul 19 23:41:17.073289 2010] [debug] [pid 19220:tid 4132666224] http_request.c(300): (42)Broken Pipe: [client 127.0.0.1:36119] something's broken, referer: http://blah.com/
One could use different separators:
%prefix{arg}suffixT
which would lead to things like
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%:tid T] %F: %: E%[client ] a
ErrorLogFormat suffix %, referer: {Referer} i
which is a bit better but not really good. Or maybe:
%prefix{arg}Tsuffix
resulting in
ErrorLogFormat prefix [%{u}t] [%l] [pid %P%:tid T] %F: %E: %[client a]
ErrorLogFormat suffix %, referer: {Referer}i
Does anyone have a better idea?
BTW, I have also implemented once per request and once per connection logging and log ids that can be used to connect different lines in the error log and the error log with the access log. For example, this format
ErrorLogFormat prefix [%{uc}t] [%m:%l] %{}{req:}{ }L%{C}{conn:}{ }L
ErrorLogFormat suffix
ErrorLogFormat connection [pid %P] local: %A - remote: %a
ErrorLogFormat request request %k on connection %{c}L %{Referer}{Referer: }i
gives this output, which should be rather nice for debugging:
[2010-07-19 23:35:45.076082] [core:notice] Command line: '/usr/local/apache2/bin/httpd'
[2010-07-19 23:35:46.832314] [-:-] conn:ruMCWgQNaAo [pid 18932] local: 127.0.0.1:8081 - remote: 127.0.0.1:49804
[2010-07-19 23:35:46.832367] [-:-] req:VOMCWgQNaAo request 0 on connection ruMCWgQNaAo
[2010-07-19 23:35:46.832382] [http:trace4] req:VOMCWgQNaAo Headers received from client:
[2010-07-19 23:35:46.832382] [http:trace4] req:VOMCWgQNaAo Connection: Keep-Alive
...
[2010-07-19 23:35:46.833359] [-:-] req:secCWmQNaAo request 1 on connection ruMCWgQNaAo
[2010-07-19 23:35:46.833385] [http:trace4] req:secCWmQNaAo Headers received from client:
The patch is available at http://people.apache.org/~sf/errorlog_format_v1.diff . It still needs some polishing and cleanup, though. Cheers, Stefan
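As an illustration of how the log ids could be used by a post-processing tool, the following standard C sketch pulls the id that follows a tag such as "req:" or "conn:" out of a log line, so that lines belonging to the same request or connection can be grouped. It assumes the layout of the sample output above (tag immediately followed by the id, id ending at the next space); extract_log_id is a hypothetical name, and a plain substring search like this could be fooled by a message text that happens to contain the tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the id that follows tag in line into id.
 * Returns 1 if a non-empty id was found, 0 otherwise. */
static int extract_log_id(const char *line, const char *tag,
                          char *id, size_t idlen)
{
    const char *p;
    size_t n;

    assert(line != NULL && tag != NULL && id != NULL && idlen > 0);
    p = strstr(line, tag);
    if (p == NULL) {
        id[0] = '\0';
        return 0;
    }
    p += strlen(tag);
    n = strcspn(p, " ");      /* the id runs up to the next space */
    if (n >= idlen)
        n = idlen - 1;        /* truncate rather than overflow */
    memcpy(id, p, n);
    id[n] = '\0';
    return n > 0;
}
```

Applied to the sample lines, the tag "req:" yields VOMCWgQNaAo for the first request and the tag "conn:" yields ruMCWgQNaAo for the connection line.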