[ 
https://issues.apache.org/jira/browse/PHOENIX-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347364#comment-14347364
 ] 

Jan Fernando commented on PHOENIX-1452:
---------------------------------------

This is shaping up nicely, [~samarthjain]!

A few concrete pieces of feedback:

1) Instead of hardcoding the metric names in individual classes, could we 
maintain a global list of metric names? I think it will make this easier to 
maintain and manage as we add more metrics over time.

2) Can we add a cumulative metric for the # queries executed? I think that 
would be really useful and didn't see it on the list or in the patch.

3) One thing that concerns me about the overall approach, and this is related 
to the fact that I think we must have the ability to toggle metrics collection 
on and off, is that all the metrics and calculation logic is sprinkled 
throughout all the classes. 

For example with ScanningResultIterator we have the calculateSize() method and 
static member scanBytesRead. This approach makes it hard to implement 
toggleability. I think a better approach is to push all the calculations, 
metrics and collection down to a single class.

Let's use ScanningResultIterator as an example. The way ScanningResultIterator 
might work is a single call such as:
{code}
PhoenixMetrics.captureScanBytesRead(result);
{code}
We can then remove the static member scanBytesRead for the metric and the 
calculateSize() method from ScanningResultIterator and push this work down 
into PhoenixMetrics. Each instrumented class just has a single line added and 
the magic all happens in PhoenixMetrics. That way you can guard each 
PhoenixMetrics method with a quick boolean check as to whether metrics are 
enabled before doing any collection. I think this is more scalable than 
proliferating boolean checks and statistics throughout the code, and it allows 
easy on-off toggleability.
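
A rough sketch of what points 1-3 could look like together, purely for illustration. The class name PhoenixMetricsSketch, the Metric enum, setEnabled(), captureQueryExecuted() and get() are all hypothetical names, not code from the patch, and a plain byte[][] stands in for the HBase Result so the example compiles on its own:

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: all metric state and size calculation live in one class.
final class PhoenixMetricsSketch {

    // Point 1: a single global list of metric names (names are assumptions).
    enum Metric { SCAN_BYTES_READ, QUERIES_EXECUTED }

    // Point 3: one switch that turns all collection on or off.
    private static volatile boolean enabled = true;

    // Populated once in the static initializer and only read afterwards.
    private static final Map<Metric, AtomicLong> COUNTERS = new EnumMap<>(Metric.class);
    static {
        for (Metric m : Metric.values()) {
            COUNTERS.put(m, new AtomicLong());
        }
    }

    private PhoenixMetricsSketch() {}

    static void setEnabled(boolean on) {
        enabled = on;
    }

    // Stand-in for captureScanBytesRead(Result): the byte[][] plays the role
    // of the cells in a Result. The size calculation happens here, not in
    // the iterator.
    static void captureScanBytesRead(byte[][] cells) {
        if (!enabled) {
            return; // quick boolean check before doing any work
        }
        long size = 0;
        for (byte[] cell : cells) {
            size += cell.length;
        }
        COUNTERS.get(Metric.SCAN_BYTES_READ).addAndGet(size);
    }

    // Point 2: cumulative count of queries executed.
    static void captureQueryExecuted() {
        if (!enabled) {
            return;
        }
        COUNTERS.get(Metric.QUERIES_EXECUTED).incrementAndGet();
    }

    static long get(Metric m) {
        return COUNTERS.get(m).get();
    }
}
```

With something like this, the instrumented class only needs the single capture call, and turning metrics off is one call to setEnabled(false) rather than a check in every caller.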



> Add Phoenix client-side logging and capture resource utilization metrics
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-1452
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1452
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 4.2
>            Reporter: Jan Fernando
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-1452.patch, PHOENIX-1452_v2.patch, wip.patch
>
>
> For performance testing and tuning of features that use Phoenix and for 
> production monitoring it would be really helpful to easily be able to extract 
> statistics about Phoenix's client-side Thread Pool and Queue Depth usage to 
> help with tuning and being able to correlate the impact of tuning these 2 
> parameters to query performance.
> For global per JVM logging one of the following would meet my needs, with a 
> preference for #2:
> 1. A simple log line that logs the data in ThreadPoolExecutor.toString() 
> at a configurable interval
> 2. Exposing the ThreadPoolExecutor metrics in PhoenixRuntime or another 
> global client-exposed class and allowing clients to do their own logging.
> In addition to this it would also be really valuable to have a single log 
> line per query that provides statistics about the level of parallelism i.e. 
> number of parallel scans being executed. I don't need full explain plan level 
> of data, but a good heuristic to be able to track over time how queries are 
> utilizing the thread pool as data size grows etc. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
