Hi Devs,

Each OpenStack service sends a request ID header with its HTTP responses. This
request ID is useful for tracking down problems in the logs. However, when an
operation crosses service boundaries, tracking becomes difficult because each
service has its own request ID. The called service's request ID is not
returned to the caller, so it is not easy to follow the request. This becomes
especially problematic when requests arrive in parallel. For example, glance
will call cinder when creating an image, but that cinder instance may be
handling several other requests at the same time. If the request IDs are
linked in the log, a user can easily find the cinder request ID that
corresponds to a given glance request ID in the g-api log. This will help
operators and developers analyse logs effectively.

To address this issue, we have come up with the following solutions:

Solution 1: Return a tuple containing headers and body from the respective
clients (also favoured by Joe Gordon)
Reference: 
https://review.openstack.org/#/c/156508/6/specs/log-request-id-mappings.rst

Pros:
1. Maintains backward compatibility
2. Effective debugging/analysis of problems, as both the calling service's
request-id and the called service's request-id are logged in the same log
message
3. Makes it possible to build a full call graph
4. The end user will be able to learn the request-id of a request and can
approach the service provider to find the cause of failure of that particular
request.

Cons:
1. The changes need to be made first at the cross-project level before making
changes in the clients
2. Applications that use python-*clients need to make the required changes
(check the return type of the response)
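
To make the trade-off in Solution 1 concrete, here is a minimal sketch. It is
not taken from the spec or from any real python-*client: FakeVolumeClient,
get_volume, fetch_volume and the request-id values are all hypothetical names
invented for illustration. It shows a client call returning a (headers, body)
tuple and the caller logging both request-ids in one message.

```python
# Hypothetical sketch of Solution 1; none of these names exist in a real client.
REQUEST_ID_HEADER = "x-openstack-request-id"


class FakeVolumeClient:
    """Illustrative client that returns (headers, body) instead of body only."""

    def get_volume(self, volume_id):
        # A real client would perform an HTTP request here; these values
        # are made up for the example.
        headers = {REQUEST_ID_HEADER: "req-cinder-1234"}
        body = {"id": volume_id, "status": "available"}
        return headers, body


def fetch_volume(client, volume_id, caller_request_id):
    """Call the client and build one log line mapping both request-ids."""
    # Callers must now unpack a tuple; this is the change applications
    # using the client would have to make.
    headers, body = client.get_volume(volume_id)
    called_request_id = headers.get(REQUEST_ID_HEADER)
    log_line = "caller request-id %s -> called request-id %s" % (
        caller_request_id, called_request_id)
    return body, log_line


body, log_line = fetch_volume(FakeVolumeClient(), "vol-1", "req-glance-5678")
print(log_line)
```

The tuple unpacking in fetch_volume is exactly the change listed under Cons:
any application that previously received only the body has to be updated.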


Solution 2: Use thread-local storage to store the 'x-openstack-request-id'
returned in the response headers (suggested by Doug Hellmann)
Reference: 
https://review.openstack.org/#/c/156508/9/specs/log-request-id-mappings.rst

Add a new method 'get_openstack_request_id' to return this request-id to the
caller.

Pros:
1. Doesn't break compatibility
2. Minimal changes are required in the clients
3. Build a full call graph

Cons:
1. A malicious user can send a long request-id to fill up disk space,
resulting in a potential DoS
2. Changes need to be made in all python-*clients
3. The last request id must be flushed on each subsequent call; otherwise the
wrong request id will be returned to the caller
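
Solution 2 could look roughly like the sketch below. Only the name
'get_openstack_request_id' comes from the proposal; exposing it as a
module-level function, the FakeImageClient class, the _save_request_id helper
and the canned headers are assumptions made for illustration. The store is
overwritten on every response, which is one way to avoid returning a stale
request id (Con 3).

```python
import threading

REQUEST_ID_HEADER = "x-openstack-request-id"

# One independent storage slot per thread.
_store = threading.local()


def _save_request_id(headers):
    # Always overwrite, storing None when the header is absent, so an ID
    # from an earlier call is never returned for a later one.
    _store.request_id = headers.get(REQUEST_ID_HEADER)


def get_openstack_request_id():
    """Return the request-id of the last call made in this thread, if any."""
    return getattr(_store, "request_id", None)


class FakeImageClient:
    """Hypothetical client; canned_headers simulates server responses."""

    def __init__(self, canned_headers):
        self._canned_headers = list(canned_headers)

    def get_image(self, image_id):
        headers = self._canned_headers.pop(0)
        _save_request_id(headers)
        # The return type is unchanged, so existing callers keep working.
        return {"id": image_id}


client = FakeImageClient([{REQUEST_ID_HEADER: "req-1"}, {}])
client.get_image("image-a")
print(get_openstack_request_id())
```

Because the value lives in thread-local storage, a request id recorded in one
thread is not visible from another thread.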


Solution 3: Unique request-id across OpenStack Services (suggested by Jamie 
Lennox)
Reference: 
https://review.openstack.org/#/c/156508/10/specs/log-request-id-mappings.rst

Get the 'x-openstack-request-id' from the auth plugin and add it to the
request headers. If the 'x-openstack-request-id' key is present in the request
headers, the same ID is reused downstream; otherwise a new one is generated.

Dependencies:
https://review.openstack.org/#/c/164582/ - Include request-id in auth plugin 
and add it to request headers
https://review.openstack.org/#/c/166063/ - Add session-object for glance client
Add 'UserAuthPlugin' and '_ContextAuthPlugin' to cinder and neutron, same as
in nova


Pros:
1. Using the same request id for a request crossing multiple service
boundaries will help operators/developers identify the problem quickly
2. Changes are required only in the keystonemiddleware and oslo_middleware
libraries. No changes are required in the python client bindings or OpenStack
core services

Cons:
1. As the 'x-openstack-request-id' in the request header is visible to the
user, a user can send the same request id for multiple requests. This could
make troubleshooting the cause of a failure harder, as the request_id
middleware does not check the ID for uniqueness within the scope of the
running OpenStack service.
2. Having the same request ID for all services for a single user API call
means you cannot generate a full call graph. For example, if a single user's
nova API call produces two calls to glance, you want to be able to
differentiate the two calls.
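
The reuse-or-generate behaviour described for Solution 3 can be sketched as a
small WSGI middleware. This is an illustration only, not the oslo_middleware
or keystonemiddleware code: the RequestIdMiddleware class and the "req-" plus
UUID format are assumptions. Note that it trusts whatever ID the caller
sends, which is the behaviour Con 1 warns about.

```python
import uuid

# WSGI exposes the 'x-openstack-request-id' header under this environ key.
HEADER_KEY = "HTTP_X_OPENSTACK_REQUEST_ID"


class RequestIdMiddleware:
    """Hypothetical middleware: reuse an incoming request-id or create one."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        request_id = environ.get(HEADER_KEY)
        if not request_id:
            # No ID supplied by the caller, so generate a new one.
            request_id = "req-" + str(uuid.uuid4())
            environ[HEADER_KEY] = request_id

        def _start_response(status, headers, exc_info=None):
            # Echo the ID back so the caller can log it as well.
            headers = list(headers) + [("x-openstack-request-id", request_id)]
            return start_response(status, headers, exc_info)

        return self.app(environ, _start_response)


def demo_app(environ, start_response):
    # Minimal WSGI app that returns the request-id it was given.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ[HEADER_KEY].encode("utf-8")]


wrapped = RequestIdMiddleware(demo_app)
result = wrapped({HEADER_KEY: "req-abc"}, lambda status, headers, exc_info=None: None)
print(result)
```

Every service wrapped this way logs the same ID for one user call, which is
why two calls from nova to glance would be indistinguishable (Con 2).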


During the Liberty design summit, I had a chance to discuss these designs
with some of the core members, such as Doug, Joe Gordon and Jamie Lennox.
However, we were not able to reach a conclusion on the final design or learn
which direction the community prefers for using this request-id effectively.

However, IMO solution 1 sounds more useful, as someone debugging can build
the full call graph, which helps in analysing gate failures effectively, and
the end user will be able to learn their request-id and track their request.

I request all community members to go through these solutions and let us know
which is the appropriate way to improve the logs by logging the request-id.


Thanks & Regards,

Abhishek Kekane
