kkhatua opened a new pull request #1750: DRILL-2362: Profile Mgmt
URL: https://github.com/apache/drill/pull/1750
 
 
   This PR is a WIP for managing a large number of profiles. It involves the 
following features.
   
   1. Write profiles to indexed partitions, created on the fly. By default, 
these are nested directories organized by year, month and date.
   2. Read chronologically from the above partitioned dirs. This improves 
performance by scanning and retrieving only the most recent profiles.
   3. Leverage Guava Cache to avoid the cost of deserializing a profile 
multiple times from disk. (Even a single attempt at rendering a profile leads 
to at least 2 deserializations.)
   4. Infer which partitioned dir has a profile based on queryId alone. This 
means that rather than scanning all the directories, we reverse-engineer the 
query ID to figure out the approximate start time of the query and narrow down 
the profile's location.
   5. Trace exceptions for profiles that fail to deserialize (sample bad 
profile: qId 259432dc-7f8e-8fc5-af69-16a1ca817689), and make the UI more 
robust in handling bad profiles that can't be deserialized.
   6. Auto-index existing profiles the first time (in batches of 10000) from 
the root dir (synchronized if distributed). Using ZK, synchronization is 
maintained when multiple Drillbits share the same profile location.
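
   As a rough illustration of items 1 and 4, the sketch below maps a query ID 
to a year/month/date partition. It is not the code in this PR. It assumes the 
first 8 hex digits of the query ID hold Integer.MAX_VALUE minus the query 
start time in epoch seconds, and that partitions are named in UTC; the class 
and method names are hypothetical. The result is only approximate, so a lookup 
near midnight may also need to check the neighbouring directory.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Drill: maps a query ID to a partition dir.
class ProfilePartitionSketch {

    // Assumed default layout: nested year/month/date directories, in UTC.
    private static final DateTimeFormatter PARTITION_FORMAT =
        DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

    // Assumption: the first 8 hex digits encode
    // (Integer.MAX_VALUE - startTimeInEpochSeconds), so newer queries
    // get smaller values.
    static long approxStartEpochSeconds(String queryId) {
        String firstGroup = queryId.trim().substring(0, 8);
        long invertedSeconds = Long.parseLong(firstGroup, 16);
        return Integer.MAX_VALUE - invertedSeconds;
    }

    // Relative partition path where the profile would be expected.
    static String partitionFor(String queryId) {
        Instant start = Instant.ofEpochSecond(approxStartEpochSeconds(queryId));
        return PARTITION_FORMAT.format(start);
    }

    public static void main(String[] args) {
        // Query ID taken from item 5 above; prints 2018/01/27
        System.out.println(partitionFor("259432dc-7f8e-8fc5-af69-16a1ca817689"));
    }
}
```

   With such a mapping, a lookup by query ID only has to open one or two 
directories instead of scanning the whole profile store.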
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
