Hi Andrew,

On Thu, Apr 09, 2015 at 08:55:20AM -0500, Andrew Hayworth wrote:
> Hi Willy -
> 
> Apologies if this comes through multiple times; I'm having some mail
> difficulties.

I received it only once FYI.

> When I attempt to use %[path] in a log-format directive, I get the
> following errors:
> 
>   parsing [/etc/haproxy/haproxy.cfg:83] : 'log-format' : sample fetch
> <path> may not be reliably used here because it needs 'HTTP request
> headers' which is not available here.
> 
> Limited testing confirms that the path is not printed in the log when
> %[path] is included in log-format either. Perhaps I'm missing a
> configuration option that allows the use of that sample fetch in the
> logs?

OK I understand now. In fact you can do it using captures but that becomes
a bit more complex, at least too complex for a general usage in my opinion.

> >> +char* strip_uri_params(char *str) {
> >> +     int spaces = 0;
> >> +     int end = strlen(str);
> >> +
> >> +     int i;
> >> +     int path_end = end;
> >> +     for (i = 0; i < end; i++) {
> >> +             if (str[i] == ' ' && spaces == 0) {
> >> +                     spaces++;
> >> +             } else if (str[i] == '?' || (str[i] == ' ' && spaces > 0)) {
> >> +                     path_end = i;
> >> +                     break;
> >> +             }
> >> +     }
> >
> > What's the purpose of counting spaces to stop at the second one ? You
> > cannot not have any in the path, so I'm a bit puzzled.
> 
> It does seem counter-intuitive, but the uri variable at src/log.c#1546
> is actually the full HTTP request line.

Ah yes I didn't remember about this, you're right.

> Consider the following requests:
> 
>   GET /foo HTTP/1.1
>   GET /foo?bar HTTP/1.1
> 
> They should both produce an identical path (as the only difference is
> the query string), but they will not unless we strip the line at either
> the first question mark *or* the second space. Stripping at only the
> first question mark would produce logs like:
> 
>   GET /foo HTTP/1.1
>   GET /foo
> 
> ...which is undesirable.

Yes indeed.

> > Here I don't understand, you seem to be doing malloc+strncpy+strlen+free
> > just to get an integer which is the position of the question mark in the
> > chain that is determined very earély in your strip function, is that it ?
> > If so, that's totally overkill and not even logical!
> >
> > Thanks,
> > Willy
> >
> 
> You're right - this patch was not the most efficient or elegant. Chalk
> it up to not fully understanding the logging system, and my general lack
> of knowledge in C. I've included another patch that I think solves the
> problem much more elegantly, building on what you suggested.

But it systematically strips the HTTP version this time. I could suggest
another approach which I think would be cleaner and more universal. Simply
create not one, but 4 new log tags (and maybe prefix all of them with 'H'
to avoid confusing naming with other protocols if similar ones are wanted
later) :

  - %HM : HTTP method, returns the method only (string before the first
    space in the uri variable)

  - %HU : HTTP URI : returns the string between the first and second space
          or end of line

  - %HP : HTTP path : same as %HU but stops at first of question mark if
          present before a space.

  - %HV : HTTP version : reports the part after the second space if it exists.

I think that this will provide the elements needed to have your desired log
format and will allow other people to have theirs as well (eg those not
interested in the version can remove it).

Would that be OK for you ?

Thanks,
Willy


Reply via email to