virajjasani commented on PR #1848:
URL: https://github.com/apache/phoenix/pull/1848#issuecomment-1984557213

   @dbwong here are some of the observations:
   In both HBase 1 and 2, region locations for a given table are cached at 
the HBase Connection level, and Phoenix CQSI connections are cached for ~24 hr 
by default. The issue we have seen with range scan queries on large tables 
occurs after 24 hr, at which point multiple range scan queries are hit by the 
performance issue. Sometimes the queries take ~5-10 min, and sometimes the 
thread that performs the meta table lookup gets interrupted. Even after 24 hr, 
a significant number of queries are still affected.
   
   In the incident, we observed that the base table has ~138k regions, while 
the queries are issued over tenant connections. Because we fetch all table 
regions regardless of the nature of the query, we spend significant time 
retrieving region locations and filling up the connection cache, even when the 
given query on the tenant view likely needs to touch no more than 5-10 regions.
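   
   To illustrate why a tenant query only needs a small slice of the key space, 
here is a rough sketch, not the PR's code. It assumes the tenant id is the 
leading part of the row key (so it ignores salt bytes and view index ids), and 
TenantPrefixRange and stopKeyForPrefix are hypothetical names. It computes the 
exclusive stop key that bounds every row starting with a given prefix:

```java
import java.util.Arrays;

public class TenantPrefixRange {

    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix (unsigned byte order). An empty array means the
     * range is unbounded on the right.
     */
    public static byte[] stopKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip past them.
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            // Prefix is empty or all 0xFF: no finite upper bound exists.
            return new byte[0];
        }
        // Keep bytes up to position i and increment the last kept byte.
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }
}
```

   With such a bound, only the regions overlapping [prefix, stop key) matter to 
the query, no matter how many regions the base table has.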
   
   Hence, this fix would significantly improve the performance of queries on 
large tables (and tables shared by more tenants are usually larger). I am not 
proposing any change to how the getAllTableRegions API is written. We can 
continue to follow the same pattern, because HBase still does not provide a 
single API that takes the start and end key of a scan range and returns the 
list of all region locations in that range. I can file a Jira for that as 
well.
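   
   As a rough illustration of walking only the regions that overlap a scan 
range, here is a minimal, self-contained sketch. It is not the PR's code and 
does not use the HBase client: an in-memory TreeMap stands in for the meta 
table, keys are plain strings, and RangeRegionLookup, Region, locate and 
regionsForScan are hypothetical names. The idea is to look up the region 
holding the scan start key, then repeatedly jump to the region that begins at 
the previous region's end key, stopping once the scan stop key is covered:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RangeRegionLookup {

    /** Hypothetical region: covers [startKey, endKey); "" means unbounded. */
    public static final class Region {
        public final String startKey;
        public final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Stand-in for the meta table: regions indexed by their start key.
    private final TreeMap<String, Region> regionsByStart = new TreeMap<>();

    // Counts simulated meta lookups, to show how few a narrow scan needs.
    public int lookups = 0;

    /** Builds contiguous regions from split points sorted in ascending order. */
    public RangeRegionLookup(List<String> splitPoints) {
        String prev = "";
        for (String split : splitPoints) {
            regionsByStart.put(prev, new Region(prev, split));
            prev = split;
        }
        regionsByStart.put(prev, new Region(prev, ""));
    }

    /** Simulated single-row lookup: the region whose range contains rowKey. */
    Region locate(String rowKey) {
        lookups++;
        return regionsByStart.floorEntry(rowKey).getValue();
    }

    /** Regions overlapping [scanStart, scanStop); "" as scanStop means no upper bound. */
    public List<Region> regionsForScan(String scanStart, String scanStop) {
        List<Region> result = new ArrayList<>();
        String current = scanStart;
        while (true) {
            Region region = locate(current);
            result.add(region);
            if (region.endKey.isEmpty()) {
                break; // reached the last region of the table
            }
            if (!scanStop.isEmpty() && region.endKey.compareTo(scanStop) >= 0) {
                break; // this region already covers the end of the scan
            }
            current = region.endKey; // next region starts where this one ends
        }
        return result;
    }
}
```

   In this sketch the number of lookups equals the number of regions the scan 
overlaps, so a narrow range costs a handful of lookups, and only a scan with 
no bounds visits every region.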
   
   The current state of the PR addresses the performance issues of the 
following queries:
   
   - Range Scan
   - Any queries using tenant connection - full scan, range scan, point lookup
   - Point lookup or Range scan on Salted table
   - Point lookup or Range scan on Salted table with Tenant id and/or View 
index id
   
   The only queries that should require all table region locations are the ones 
that need full base table scan.
   
   Does this look good to you?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

