> The last time this came up, some people were concerned that trusting
> request-id on the wire was concerning to them because it's coming from
> random users.

TBH I don't see why a validated request-id value can't be logged on the
callee service side; I have probably missed some previous context.
Could you please give an example of such concerns?

With the service user approach I see two obstacles:
- A callee service needs to know whether the caller is a "special" user
  or not.
- Until all services use a service user, we won't get the complete
  trace.

Sean Dague writes:

> One of the things that came up in a logging Forum session is how much
> effort operators are having to put into reconstructing flows for things
> like server boot when they go wrong, as every time we jump a service
> barrier the request-id is reset to something new. The back and forth
> between Nova / Neutron and Nova / Glance would be definitely well served
> by this. Especially if this is something that's easy to query in elastic
> search.
>
> The last time this came up, some people were concerned that trusting
> request-id on the wire was concerning to them because it's coming from
> random users. We're going to assume that's still a concern by some.
> However, since the last time that came up, we've introduced the concept
> of "service users", which are a set of higher priv services that we are
> using to wrap user requests between services so that long running
> request chains (like image snapshot). We trust these service users
> enough to keep on trucking even after the user token has expired for
> this long run operations. We could use this same trust path for
> request-id chaining.
>
> So, the basic idea is, services will optionally take an inbound
> X-OpenStack-Request-ID which will be strongly validated to the format
> (req-$uuid). They will continue to always generate one as well. When the
> context is built (which is typically about 3 more steps down the paste
> pipeline), we'll check that the service user was involved, and if not,
> reset the request_id to the local generated one. We'll log both the
> global and local request ids.
> All of these changes happen in
> oslo.middleware, oslo.context, oslo.log, and most projects won't need
> anything to get this infrastructure.
>
> The python clients, and callers, will then need to be augmented to pass
> the request-id in on requests. Servers will effectively decide when they
> want to opt into calling other services this way.
>
> This only ends up logging the top line global request id as well as the
> last leaf for each call. This does mean that full tree construction will
> take more work if you are bouncing through 3 or more servers, but it's a
> step which I think can be completed this cycle.
>
> I've got some more detailed notes, but before going through the process
> of putting this into an oslo spec I wanted more general feedback on it
> so that any objections we didn't think about yet can be raised before
> going through the detailed design.
>
> -Sean

--
Thanks,

Andrey Volkov,
Software Engineer, Mirantis, Inc.
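
P.S. To make the discussion concrete, here is a rough sketch of the
flow as I understand it from Sean's description: validate the inbound
X-OpenStack-Request-ID against the req-$uuid format, always generate a
local id, and keep the inbound value as the global id only when a
service user was involved. All function names below are hypothetical;
this is not the oslo.middleware or oslo.context API, just an
illustration under those assumptions.

```python
import re
import uuid

# Hypothetical helper names; not taken from oslo.middleware or oslo.context.
# Accepts only "req-" followed by a lowercase, hyphenated UUID.
_REQUEST_ID_RE = re.compile(
    r"req-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def generate_request_id():
    """Generate a local request id in the req-$uuid format."""
    return "req-" + str(uuid.uuid4())


def is_valid_request_id(value):
    """Return True only if value strictly matches req-$uuid."""
    return isinstance(value, str) and _REQUEST_ID_RE.fullmatch(value) is not None


def build_request_ids(inbound_id, caller_is_service_user):
    """Return (global_id, local_id) for one incoming request.

    A local id is always generated. The inbound id is kept as the
    global id only if it validates and the caller is a service user
    (an assumption about how the trust check would be exposed);
    otherwise the global id is reset to the local one.
    """
    local_id = generate_request_id()
    if caller_is_service_user and is_valid_request_id(inbound_id):
        global_id = inbound_id
    else:
        global_id = local_id
    return global_id, local_id
```

With something like this, a callee would log both values on every
line, so a chain such as Nova -> Glance could be queried by the global
id while each hop keeps its own local id.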