[ 
https://issues.apache.org/jira/browse/LENS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624253#comment-14624253
 ] 

Amareshwari Sriramadasu edited comment on LENS-602 at 7/13/15 6:14 AM:
-----------------------------------------------------------------------

Now that we have log segregation ids for each request, the API can be generic 
enough to get logs for a <logsegregationid>. The logsegregationid is a unique 
id for each REST request; it is the same as the query handle for queries and 
the prepare handle for prepared queries.

*API* :
I'm thinking about adding the API as:
GET on <lens-url>/logs/{logsegregationid} : would return the logs as an output 
stream.
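
A minimal sketch of what such an endpoint could look like, using only the 
JDK's built-in com.sun.net.httpserver package so that it is self-contained. 
This is an assumption for illustration, not how the Lens server is wired up; 
the class name LogEndpointSketch and the placeholder response body are 
hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical stand-in for the proposed GET <lens-url>/logs/{logsegregationid}. */
final class LogEndpointSketch {
    static final String PREFIX = "/logs/";

    /** Extracts the id from a path such as "/logs/abc-123"; empty if malformed. */
    static Optional<String> extractId(String path) {
        if (path == null || !path.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String id = path.substring(PREFIX.length());
        // Reject empty ids and nested paths so the id is always a single segment.
        return (id.isEmpty() || id.contains("/")) ? Optional.empty() : Optional.of(id);
    }

    /** Starts a server whose /logs/ context streams a placeholder body for the id. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext(PREFIX, exchange -> {
            if (!"GET".equals(exchange.getRequestMethod())) {
                exchange.sendResponseHeaders(405, -1); // -1 means no response body
                exchange.close();
                return;
            }
            Optional<String> id = extractId(exchange.getRequestURI().getPath());
            if (!id.isPresent()) {
                exchange.sendResponseHeaders(404, -1);
                exchange.close();
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, 0); // 0 means chunked, length not known up front
            try (OutputStream out = exchange.getResponseBody()) {
                // Placeholder: a real handler would stream the matching log lines here.
                out.write(("logs for " + id.get() + "\n").getBytes(StandardCharsets.UTF_8));
            }
        });
        server.start();
        return server;
    }
}
```

Sending the response with an unknown length (chunked) is what lets the server 
start writing log lines before it knows how many there are.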

*Implementation approaches* :

Here are the approaches I'm thinking about :

# The API call will run grep <logsegregationid> 
${lens.log.dir}/<lens-log-file>.* and return the result as an output stream. 
## I still have to explore the options for streaming shell output as the 
output stream of an HTTP response.
## This approach will allow users to pass other parameters instead of 
logsegregationid, such as timestamp, threadID/threadName or any other grep 
pattern.
# Logs corresponding to each <logsegregationid> are written to a separate 
file, and the file is served as an attachment on the API call.
## We would need a purging policy for these files. If the number of files is 
huge, we might hit the limit on files per directory in the underlying OS, so 
we will have to think of putting them in separate directories based on time 
or number of files.
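
The first approach can be sketched as follows. Note that this sketch does the 
filtering in plain Java rather than shelling out to grep, which sidesteps the 
question of streaming shell output; that substitution is mine, not something 
decided in this ticket. The class name LogGrepSketch and the method signature 
are hypothetical, and the file prefix is assumed to contain no glob 
characters.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LogGrepSketch {
    /**
     * Copies every line containing {@code id} from the files in {@code logDir}
     * whose names start with {@code filePrefix} to {@code out}.
     * Returns the number of matching lines.
     */
    static long copyMatchingLines(Path logDir, String filePrefix, String id, OutputStream out)
            throws IOException {
        // Collect the current log file and its rolled-over siblings (prefix + anything).
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> dir = Files.newDirectoryStream(logDir, filePrefix + "*")) {
            for (Path p : dir) {
                files.add(p);
            }
        }
        Collections.sort(files); // deterministic order; not necessarily chronological

        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        long matches = 0;
        for (Path file : files) {
            // Read line by line so a huge log file is never held in memory at once.
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.contains(id)) {
                        writer.write(line);
                        writer.write('\n');
                        matches++;
                    }
                }
            }
        }
        writer.flush(); // the caller owns the stream, so flush without closing it
        return matches;
    }
}
```

Because the method writes to any OutputStream, the same code could feed an 
HTTP response body directly. Supporting other patterns (timestamp, thread 
name) would only change the line.contains(id) test.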

*Feature turnoff*:

We should have a way to turn off the feature "Serving logs over REST" at 
server deployment. Since it serves huge content (the logs can be huge) over 
the network, some deployments might want to turn it off to save request-serving 
threads and to control bandwidth use.
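
A minimal sketch of such a switch, assuming a boolean server property. The 
key name lens.server.serve.logs.enabled and the default of true are both 
assumptions made for illustration; neither has been decided here.

```java
import java.util.Properties;

final class LogServingConfigSketch {
    // Hypothetical key; the real property name has not been decided.
    static final String KEY = "lens.server.serve.logs.enabled";

    /** Returns whether logs may be served over REST; assumed to default to true. */
    static boolean isLogServingEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "true"));
    }
}
```

The logs endpoint would check this once per request and reject the call 
before opening any log file when the feature is off.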

*Get underlying driver's execution logs:*

To get the underlying driver's execution logs, we should be able to serve them 
if the driver provides them. HiveServer2 provides a thrift API to fetch the 
logs corresponding to an operation before the operation is closed. So 
HiveDriver can fetch the operation logs and log them in the lens server with 
the logsegregationid (the query handle) added to the fetched logs. Again, all 
this would be configurable, because it can cause HiveServer2 to get overloaded.
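
The tagging step can be sketched as below. Fetching the lines from 
HiveServer2 is left out on purpose, since this sketch does not assume any 
particular Hive client call; it only shows lines that have already been 
fetched being stamped with the id so that a later search by logsegregationid 
finds them. The class name DriverLogTaggerSketch and the "[id] line" format 
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DriverLogTaggerSketch {
    /** Prefixes each fetched driver log line with the id, as "[id] line". */
    static List<String> tag(String logSegregationId, List<String> driverLines) {
        List<String> tagged = new ArrayList<>(driverLines.size());
        for (String line : driverLines) {
            tagged.add("[" + logSegregationId + "] " + line);
        }
        return tagged;
    }
}
```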

Thoughts?

I'm inclined towards the option of running grep on lens.log.dir and serving 
the result as an output stream over REST, and will try that out first. 

Everyone's thoughts, comments and suggestions are welcome.


> Easy access to per-query lens server logs
> -----------------------------------------
>
>                 Key: LENS-602
>                 URL: https://issues.apache.org/jira/browse/LENS-602
>             Project: Apache Lens
>          Issue Type: New Feature
>            Reporter: Angad Singh
>            Assignee: Amareshwari Sriramadasu
>              Labels: Hackathon-July
>
> Right now one has to have access to the lens server machine and find a lens 
> query's job logs manually in lensserver.log file. This is neither scalable 
> nor user-friendly for a shared multi-tenanted lens server.
> Just throwing server-exceptions to the client or showing the job ID is also 
> often not enough. Even when the query succeeds, one needs to see how 
> candidate fact tables and their columns, etc. were pruned and how join chains 
> were resolved, for example. That is only possible by seeing the lens server 
> logs.
> Instead of that, this ticket is to propose that lens store logs for each lens 
> query in a different log file on the server and that there be a REST end 
> point to access a query's log (by query ID). That URL can be pasted on the 
> client shell when the query is launched and the user can see a tailed log of 
> all that lens is doing behind the scenes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
