On 19/04/2013 5:41 a.m., Alex Rousskov wrote:
On 04/06/2013 11:48 PM, Amos Jeffries wrote:
What this does is convert ErrorState object to using the generic
libformat.la parser and macro expansions instead of its own rather
limited custom ones for error pages and deny_info URL creation.
This is a very nice improvement IMO, thank you.
class AccessLogEntry: public RefCountable
{
...
+
+ /// details of the error page presented to the client (if any)
+ ErrorState *errorDetails;
};
Please do not use a general name like "errorDetails" to store
information dedicated to a very specific use by internal ErrorPage code.
Besides, we already have a whole class called ErrorDetail and it has
nothing to do with the above member. I suggest calling this new member
"errorPage".
Done.
Please also add a comment that explains that this pointer is managed
internally by ErrorState and should not be used outside ErrorState code.
You have documented that in your email, but I think it is worth
documenting that in the code. More on the lifetime of this pointer below.
+ MemBuf *res = ConvertText(m);
+
+ // yuck! steal the res buffer for our output
+ result.buf = res->buf;
+ result.size = res->size;
+ result.capacity = res->capacity;
+ result.max_capacity = res->max_capacity;
+ res->stolen = true; // prevent delete erasing the buffer we stole.
+ delete res;
Please rewrite without stealing. One way to do that would be to pass
MemBuf pointer to ConvertText. If the passed pointer is NULL,
ConvertText() will create a new buffer.
Done now. Thanks for the reminder.
+ // XXX: if request exists, we might use ALE from
request->lientConnectionManager->http.al ??
It sounds like a great idea because error page ALE is missing
information that the http.al has. On the other hand, if we set
http.al.errorDetails to our page, it is not obvious that another error
page for the same transaction will never do that as well, overwriting
our pointer and confusing things. I wonder if we can do better:
As far as I can tell, the useful lifetime of the new errorDetails
pointer is the fmt.assemble() call. If that is true, let's document that
fact and rewrite the code to set and clear the pointer just before and
just after that call. You may even assert that the ale_->errorDetails
pointer is nil in the error page destructor.
If we can do the above, then we should use http.al when it is available.
Refcounting should make that easy.
This is all done now.
*b) Doing a whole parse cycle for every error response is very costly.
We are already loading the text page files on every error response. This
adds the un-optimized libformat parser lag on top of that.
Have you tested the performance impact of these changes? Perhaps they
are not as bad as they sound because the old code was not super
optimized either?
Not yet. However the old code was just doing a parse scan with multiple
append() operations at each macro site. The new code is doing the same
on top of a format tree structure allocate/deallocate cycle. So for now
this is guaranteeing yet another step towards bad performance.
*c) For now AccessLogEntry equired by the libformat API is generated by
ErrorState just before use. Ideally the code creating the ErrorState
should pass it an already filled out ALE object for the transaction.
Unfortunately, a lot of code creating error pages is in no better
position than ErrorState to find that valuable ALE object. This is the
old problem of a missing "master transaction" object: HttpRequest is
available nearly everywhere but is declared (probably correctly!)
unsuitable for logging in favor of ALE which is not easily available in
many contexts.
I suspect that the only reasonable mid-term solution is to carefully add
a pointer to the "master transaction" ALE to HttpRequest.
- Ip::Address src_addr;
+ Ip::Address src_addr; // client Comm::Connection ??
Please see if you can delete this member. You may have removed the only
useful code that was using it. The dumping code is not very useful and
can can probably be rewritten to get this info from elsewhere (since the
ALE formatting code does support logging of the client source address
and does not have access to this member).
This one is not quite so easy. It is supposed to be used for errors
early in parsing when there is no HttpRequest object.
I have taken advantage of the change though and enacted that "??" / TODO
about making it a connection pointer.
That has caused a few build issues around the place. New patch to follow
when it builds again.
- case '%':
- p = "%";
- break;
-
default:
- mb.Printf("%%%c", token);
- do_quote = 0;
break;
}
After your changes, if an old error page had %x text, where "x" was not
a supported error page macro name, is it possible that the finalized
error page (as seen by users) may change? If so, this should be
documented in the change log, I think.
Yes it changes. It has been remarked as a bug several times that new
macros are visible in old releases.
+ // parse the text using Format:: parser
+ Format::Format fmt("errorPageTemporary");
+ fmt.parse(text);
If possible, please create fmt later, when it is actually needed for the
assemble() call.
Done.
Amos