The thing is that RVC will not work for case where the SQL query does aggregation on truncated primary key columns eg. SELECT pk1, TO_CHAR(CONVERT_TZ(TO_DATE(time_col,'yyyyMMddHHmm'), 'UTC','Asia/Kolkata'),'yyyyMMdd') as truncated_time_sample, SUM(col1) as agg_value, FROM ... GROUP BY pk1, truncated_time_sample..LIMIT 1000 OFFSET 1000
This is the reason to pick LIMIT OFFSET for pagination. >From your answer it seems for OFFSET queries, all the previous pages rows are scanned aggregated to rows and rejected till offset. So in such a scenario wouldn't those rows in block cache help during the pagination duration. Thanks, Abhishek On Wed, Aug 22, 2018 at 12:07 AM Thomas D'Silva <tdsi...@salesforce.com> wrote: > When you do an OFFSET Phoenix will scan rows and filter them out until it > reaches the offset count which can end up being very costly for large > offsets. > If you can use a RVC where the order matches the PK of the data table or > index the start key of the scan will be set based on the RVC, which is much > more efficient > (see http://phoenix.apache.org/paged.html). > > On Tue, Aug 21, 2018 at 8:06 AM, Abhishek Gupta <abhila...@gmail.com> > wrote: > >> Hi, >> >> Could you help me understand how LIMIT OFFSET queries work under the hood >> in Phoenix, is the filtering out of rows done in heap or is there some sort >> of optimisation where it can skip at disk level. >> My idea about posting this question was to understand if the rows from >> paste pages of the query in block cache can optimize the subsequent page >> call that would use the cache hits for the filter out rows and not seek at >> disk. >> >> Thanks, >> Abhishek >> >> On Sat, Aug 18, 2018 at 4:16 AM Thomas D'Silva <tdsi...@salesforce.com> >> wrote: >> >>> Shouldn't you pass the NO_CACHE hint for the LIMIT-OFFSET queries, >>> since you will be reading and filtering out lots of rows on the server? >>> I guess using the block cache for RVC queries might help depending on >>> how many rows you read per query, you should be able to easily test this >>> out. >>> >>> On Fri, Aug 17, 2018 at 4:25 AM, Abhishek Gupta <abhila...@gmail.com> >>> wrote: >>> >>>> Hi Team, >>>> >>>> I am working on a use case where SQL aggregated queries are made such >>>> that RVC cannot be used (aggregation on truncated primary key columns) >>>> instead LIMIT-OFFSET has to be used. RVC is used for some user user cases >>>> >>>> Currently I have disabled BLOCKCACHE for the table. I wanted to check >>>> if it would be more performant to instead enable BLOCKCACHE on the >>>> table and pass NO_CACHE hint for RVC queries because it uses >>>> non-LIMIT-OFFSET scans and not pass NO_CACHE for the LIMIT-OFFSET >>>> queries so that for the subsequent page calls can leverage prior page data >>>> in block cache. >>>> >>>> Thanks, >>>> Abhishek >>>> >>> >>> >