[jira] [Comment Edited] (DRILL-6061) Feature Request: Global Query List showing queries from all Drill foreman nodes

Hari Sekhon (JIRA) Tue, 16 Jan 2018 03:09:41 -0800

    [ 
https://issues.apache.org/jira/browse/DRILL-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327012#comment-16327012
 ]


Hari Sekhon edited comment on DRILL-6061 at 1/16/18 11:08 AM:
--------------------------------------------------------------

Yes, this is what we found after raising this. We will point to MapR-FS 
/apps/drill/pstore, following Hadoop layout best-practice conventions, and test.

I think this should be documented a bit better and made easier to find, perhaps 
in the FAQ or in a section titled something like "Global Query List - how to 
see the queries on the cluster from any Drill node" in the Apache Drill 
documentation. There is a MapR community thread on this as well:

[https://community.mapr.com/thread/21498-what-are-best-practices-for-managing-drill-query-profiles]

I recommend changing the Apache Drill documentation at:

{{[https://drill.apache.org/docs/persistent-configuration-storage/]}}

{{<directory to store pstore data>}} to a standardized best practice location 
of {{/apps/drill/pstore}} to fall in line with other apps on Hadoop clusters.
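
As a rough sketch of what the documented example could look like with that path filled in (assuming the {{sys.store.provider.zk.blobroot}} setting in {{drill-override.conf}} described on that page; the cluster ID, the ZooKeeper address and the {{maprfs:///}} scheme are placeholders for illustration, not tested values):

```
drill.exec: {
  # Placeholder cluster ID and ZooKeeper quorum; substitute your own values.
  cluster-id: "my_cluster_com-drillbits",
  zk.connect: "<zkhostname>:<port>",
  # Assumed setting: shared pstore location on the distributed filesystem,
  # so that every Drill node reads and writes the same query profiles.
  sys.store.provider.zk.blobroot: "maprfs:///apps/drill/pstore"
}
```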

It's also worth documenting the algorithm used to balance load across Drill 
nodes when acquiring a drillbit via ZooKeeper quorum referral (random, round 
robin, least connections, etc.).


was (Author: harisekhon):
Yes this is what we found after raising this, we will point to MapR-FS 
/apps/drill/pstore to follow Hadoop layout best practices convention and test.

I think this should be documented a bit better / easier to find, perhaps in FAQ 
or a section stating something like "Global Query List - how to see the queries 
on the cluster from any Drill node" in the Apache Drill documentation. There is 
a MapR community connection to response to this as well:

https://community.mapr.com/thread/21498-what-are-best-practices-for-managing-drill-query-profiles

I recommend changing the Apache Drill documentation {{<directory to store 
pstore data>}} with a single best practice path of {{/apps/drill/pstore}} to 
standardize this and fall in line with other apps on Hadoop clusters.

It's also worth documenting the load balancing algorithm used for load 
balancing across Drill nodes when acquiring a drillbit via zookeeper quorum 
referral (random, round robin, least connection etc).

> Feature Request: Global Query List showing queries from all Drill foreman 
> nodes
> -------------------------------------------------------------------------------
>
>                 Key: DRILL-6061
>                 URL: https://issues.apache.org/jira/browse/DRILL-6061
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components:  Server, Documentation, Metadata, Query Planning & 
> Optimization, Tools, Build & Test, Web Server
>    Affects Versions: 1.11.0
>         Environment: MapR 5.2
>            Reporter: Hari Sekhon
>            Priority: Major
>
> Feature Request to add a Global Query List to show all queries executed 
> across all Drill nodes in a cluster for better management and auditing.
> Right now there doesn't appear to be a way to see all queries across all 
> nodes in a Drill cluster. The Web UI on any given Drill node only shows the 
> queries coordinated by that local node if acting as the foreman for the 
> query, so if using ZooKeeper or a Load Balancer to distribute queries via 
> different Drill nodes then the query list will be spread across lots of 
> different nodes with no global timeline of queries.
> This seems to leave a bit of a gap in auditing functionality. The only other 
> option I can think of that is immediately available is to route all query 
> submissions through a single foreman node so the query list is complete on 
> that node, although that doesn't seem like a great idea in terms of load 
> distribution of query planning, coordination and final aggregation steps. 
> I've made load balancing configurations for Apache Drill and similar 
> technologies that could be used for that purpose, with failover support to 
> maintain high availability (at 
> https://github.com/HariSekhon/nagios-plugins/tree/master/haproxy), but would 
> still prefer it if Drill were designed to store the global list of queries 
> submitted in a centralized place.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
