Hi Devs,
Each OpenStack service sends a request ID header with its HTTP responses. This
request ID is useful for tracking down problems in the logs. However, when an
operation crosses service boundaries, tracking becomes difficult because each
service has its own request ID. The request ID is not returned to the caller,
so it is not easy to follow the request. This is especially problematic
when requests arrive in parallel. For example, glance calls cinder to
create an image, but that cinder instance may be handling several other
requests at the same time. If the request IDs of both services can be
correlated in the log, a user can easily find the cinder request that
corresponds to a given glance request in the g-api log. This will help
operators and developers analyse logs effectively.
To address this issue, we have come up with the following solutions:
Solution 1: Return a tuple containing headers and body from the respective
clients (also favoured by Joe Gordon)
Reference:
https://review.openstack.org/#/c/156508/6/specs/log-request-id-mappings.rst
Pros:
1. Maintains backward compatibility
2. Effective debugging and analysis of problems, as both the calling service's
request-id and the called service's request-id are logged in the same log
message
3. Build a full call graph
4. End users will be able to know the request-id of their request and can
approach the service provider to find the cause of failure of that particular
request.
Cons:
1. The changes need to be made in cross-projects first, before making changes
in the clients
2. Applications that use python-*clients need to make the required changes
(check the return type of the response)
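
To make Solution 1 concrete, here is a minimal sketch of what a caller could
do with a (headers, body) tuple. FakeVolumeClient, create_volume and
log_request_id_mapping are hypothetical names invented for this illustration;
they are not part of any python-*client, and the log format is only an
assumption.

```python
import uuid

REQUEST_ID_HEADER = "x-openstack-request-id"


class FakeVolumeClient:
    """Hypothetical stand-in for a python-*client (not a real API)."""

    def create_volume(self, size_gb):
        # A real client would take these headers from the HTTP response.
        headers = {REQUEST_ID_HEADER: "req-" + str(uuid.uuid4())}
        body = {"volume": {"size": size_gb, "status": "creating"}}
        # Solution 1: return the headers together with the body.
        return headers, body


def log_request_id_mapping(caller_request_id, headers):
    """Build one log line that ties the caller's ID to the called service's ID."""
    called_request_id = headers.get(REQUEST_ID_HEADER)
    return "request-id mapping: caller=%s called=%s" % (
        caller_request_id, called_request_id)


# The calling service (e.g. glance) unpacks the tuple and logs both IDs.
headers, body = FakeVolumeClient().create_volume(1)
line = log_request_id_mapping("req-glance-1234", headers)
print(line)
```

Existing applications that expect only the body would have to unpack the
tuple, which is the change referred to in Con 2 above.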
Solution 2: Use thread-local storage to store the 'x-openstack-request-id'
returned in the response headers (suggested by Doug Hellmann)
Reference:
https://review.openstack.org/#/c/156508/9/specs/log-request-id-mappings.rst
Add a new method 'get_openstack_request_id' to return this request-id to the
caller.
Pros:
1. Doesn't break compatibility
2. Minimal changes are required in the clients
3. Build a full call graph
Cons:
1. A malicious user can send a long request-id to fill up the disk space,
resulting in a potential DoS
2. Changes need to be made in all python-*clients
3. The last request id must be flushed on each subsequent call; otherwise a
stale request id will be returned to the caller
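
A rough sketch of Solution 2 follows. Only the method name
'get_openstack_request_id' comes from the proposal; _store_request_id,
fake_api_call and the 64-character cap are assumptions made for illustration.
The cap is one possible way to limit the effect of Con 1, not something the
spec prescribes.

```python
import threading

REQUEST_ID_HEADER = "x-openstack-request-id"
MAX_REQUEST_ID_LENGTH = 64  # assumed cap, not taken from the spec

_local = threading.local()  # each thread sees its own attributes


def _store_request_id(headers):
    """Overwrite the stored ID on every call so a stale value never survives."""
    request_id = headers.get(REQUEST_ID_HEADER)
    if request_id is not None:
        # Truncate over-long values before they can reach the logs.
        request_id = request_id[:MAX_REQUEST_ID_LENGTH]
    _local.request_id = request_id


def get_openstack_request_id():
    """Return the request ID of the last call made on this thread, or None."""
    return getattr(_local, "request_id", None)


def fake_api_call(headers, body):
    """Hypothetical client method: the return value stays the body only."""
    _store_request_id(headers)
    return body


body = fake_api_call({REQUEST_ID_HEADER: "req-abc"}, {"ok": True})
first = get_openstack_request_id()

# A response without the header flushes the previous ID (Con 3).
fake_api_call({}, {})
second = get_openstack_request_id()
```

Because the return type of the client call does not change, existing callers
keep working; they only call get_openstack_request_id() when they want the ID.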
Solution 3: Unique request-id across OpenStack Services (suggested by Jamie
Lennox)
Reference:
https://review.openstack.org/#/c/156508/10/specs/log-request-id-mappings.rst
Get 'x-openstack-request-id' from the auth plugin and add it to the request
headers. If the 'x-openstack-request-id' key is present in the request
headers, the service reuses that value; otherwise it generates a new one.
Dependencies:
https://review.openstack.org/#/c/164582/ - Include request-id in auth plugin
and add it to request headers
https://review.openstack.org/#/c/166063/ - Add session-object for glance client
Add 'UserAuthPlugin' and '_ContextAuthPlugin', as in nova, to cinder and
neutron
Pros:
1. Using the same request id for a request that crosses multiple service
boundaries will help operators/developers identify the problem quickly
2. Changes are required only in the keystonemiddleware and oslo_middleware
libraries. No changes are required in the python client bindings or OpenStack
core services
Cons:
1. As 'x-openstack-request-id' in the request headers is visible to the user,
a user can send the same request id for multiple requests. This could make
troubleshooting the cause of a failure harder, because the request_id
middleware does not check the id for uniqueness within the scope of the
running OpenStack service.
2. Having the same request ID in all services for a single user API call means
you cannot generate a full call graph. For example, if a single user's nova
API call produces two calls to glance, you want to be able to differentiate
the two calls.
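
The reuse-or-generate rule of Solution 3 can be sketched as below.
ensure_request_id is a hypothetical helper written for this mail, not the
actual oslo_middleware or keystonemiddleware code, and the "req-" + UUID
format is an assumption. Real HTTP header names are case-insensitive; this
sketch uses a plain dict with a lowercase key for simplicity.

```python
import uuid

REQUEST_ID_HEADER = "x-openstack-request-id"


def ensure_request_id(headers):
    """Return a copy of headers that carries a request ID.

    An ID already present in the incoming headers is reused as-is;
    otherwise a new one is generated.
    """
    result = dict(headers)
    if not result.get(REQUEST_ID_HEADER):
        result[REQUEST_ID_HEADER] = "req-" + str(uuid.uuid4())
    return result


# First service: no ID in the incoming request, so one is generated.
incoming = ensure_request_id({})

# Downstream service: the ID passed along in the headers is reused.
outgoing = ensure_request_id(
    {REQUEST_ID_HEADER: incoming[REQUEST_ID_HEADER]})

# Con 1: nothing stops a user from supplying the same ID again and again.
forged = ensure_request_id({REQUEST_ID_HEADER: "req-dup"})
```

Since the downstream ID equals the upstream one, two glance calls triggered by
the same nova request would carry an identical ID, which is the limitation
described in Con 2.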
During the Liberty design summit, I had a chance to discuss these designs
with some of the core members, such as Doug, Joe Gordon and Jamie Lennox.
However, we were not able to reach a conclusion on the final design, or to
learn which direction the community wants to take to use this request-id
effectively.
However, IMO solution 1 sounds more useful: whoever is debugging can build
the full call graph, which helps in analysing gate failures effectively, and
end users can learn their request-id and track their request.
I request all community members to go through these solutions and let us know
which is the most appropriate way to improve the logs by logging the
request-id.
Thanks & Regards,
Abhishek Kekane