[ https://issues.apache.org/jira/browse/PHOENIX-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353718#comment-15353718 ]
Josh Elser commented on PHOENIX-2724:
-------------------------------------

bq. Mujtaba Chohan - as a first test, can you try increasing that cache size phoenix.stats.cache.maxSize?

This will be an important config parameter. We might want to switch it to being a percentage of the heap instead of an absolute size. Additionally, enabling {{TRACE}} on {{org.apache.phoenix.query.TableStatsCache}} will tell you when entries are added to or evicted from the client-side cache.

bq. Mujtaba Chohan did try updating the client side cache by adjusting phoenix.client.maxMetaDataCacheSize to 1GB but that didn't help either.

If it is related to the stats not being cached, altering that property wouldn't change anything.

bq. Previously, the server-side cache was being used (which I think is bigger). If the cache is too small, we end up making an RPC each time to get the stats.

I'm also wondering if there's an optimization to be had in avoiding this case in TableStatsCache. We should be able to determine when this is happening (the cache not actually acting as a cache for some configuration reason) and just short-circuit the RPC, sending back {{EMPTY_STATS}}.

[~samarthjain], [~mujtabachohan], sorry you both got sucked into debugging this one. I'm lamenting even more the lack of insight we have into this (ideally, it should have been very easy to tell after the fact). I've been rolling around the idea of some mechanism we can plug into on the client to better understand execution (nothing fancy). Maybe we need to think about this soon after 4.8.0.

I'm at a conference most of this week, but I'll try to keep an eye on my inbox and help out where possible.
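To make the "percentage of the heap" idea a bit more concrete, here is a rough sketch of how a client could derive a value for phoenix.stats.cache.maxSize from the JVM's maximum heap. The class {{StatsCacheSizing}} and its methods are hypothetical names for illustration only, not Phoenix code, and the sketch assumes the property is interpreted as a size in bytes.

```java
import java.util.Properties;

// Hypothetical helper, not part of Phoenix: derives a stats cache size from a heap fraction.
final class StatsCacheSizing {
    // Property name as mentioned in the comment above; treating its value as bytes is an assumption.
    static final String STATS_CACHE_MAX_SIZE = "phoenix.stats.cache.maxSize";

    // Returns the number of bytes corresponding to the given fraction of the heap (at least 1).
    static long cacheBytesForHeapFraction(long maxHeapBytes, double fraction) {
        if (maxHeapBytes <= 0) {
            throw new IllegalArgumentException("maxHeapBytes must be positive");
        }
        if (!(fraction > 0.0 && fraction <= 1.0)) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return Math.max(1L, (long) (maxHeapBytes * fraction));
    }

    // Builds client properties with the cache size set to a fraction of this JVM's max heap.
    static Properties clientProperties(double fraction) {
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        Properties props = new Properties();
        props.setProperty(STATS_CACHE_MAX_SIZE,
                Long.toString(cacheBytesForHeapFraction(maxHeapBytes, fraction)));
        return props;
    }

    public static void main(String[] args) {
        // Example: reserve 5% of the max heap for the stats cache.
        Properties props = clientProperties(0.05);
        System.out.println(STATS_CACHE_MAX_SIZE + "=" + props.getProperty(STATS_CACHE_MAX_SIZE));
    }
}
```

The resulting {{Properties}} could then be passed along when opening the JDBC connection, assuming the client picks this setting up from connection properties; whether 5% is a sensible fraction is untested here.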
> Query with large number of guideposts is slower compared to no stats
> --------------------------------------------------------------------
>
> Key: PHOENIX-2724
> URL: https://issues.apache.org/jira/browse/PHOENIX-2724
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Environment: Phoenix 4.7.0-RC4, HBase-0.98.17 on a 8 node cluster
> Reporter: Mujtaba Chohan
> Assignee: Samarth Jain
> Fix For: 4.8.0
>
> Attachments: PHOENIX-2724.patch, PHOENIX-2724_addendum.patch,
> PHOENIX-2724_v2.patch
>
>
> With 1MB guidepost width for ~900GB/500M rows table. Queries with short scan
> range gets significantly slower.
> Without stats:
> {code}
> select * from T limit 10; // query execution time <100 msec
> {code}
> With stats:
> {code}
> select * from T limit 10; // query execution time >20 seconds
> Explain plan: CLIENT 876085-CHUNK 476569382 ROWS 876060986727 BYTES SERIAL
> 1-WAY FULL SCAN OVER T SERVER 10 ROW LIMIT CLIENT 10 ROW LIMIT
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)