[ https://issues.apache.org/jira/browse/CARBONDATA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor updated CARBONDATA-3492:
-------------------------------------
    Issue Type: Improvement  (was: Bug)

> Cache Pre-Priming
> -----------------
>
>                 Key: CARBONDATA-3492
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3492
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Akash R Nilugal
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: Cache_Pre_Priming_V1.pdf
>
>          Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> Currently, we have an index server, which handles distributed caching of
> the datamaps in a separate Spark application.
> The caching of the datamaps in the index server starts only when a query is
> fired on the table for the first time: all the datamaps are loaded if
> count(*) is fired, and only the required ones are loaded for a filter query.
> The bottleneck is that until a query is fired on the table, no caching is
> done for the table's datamaps.
> Consider a scenario where we only load data into the table for a whole day
> and query it the next day: all the segments start loading into the cache at
> that point, so the first query is slow.
> What if we load the datamaps into the cache, that is, pre-prime the cache,
> without waiting for any query on the table?
> What if we load the cache after every load is done, or load the cache for
> all the segments at once, so that the first query need not do all this work
> and is therefore faster?
> I have attached the design document for pre-priming the cache in the index
> server. Please have a look at it.
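
The difference between caching on first query and pre-priming after each load can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not CarbonData code: `ToyIndexCache`, `prime`, `query`, and `run_day` are hypothetical names, and a "datamap load" is reduced to adding a segment id to a set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndexCache:
    """Hypothetical stand-in for the index server's datamap cache."""
    cached: set = field(default_factory=set)  # segment ids whose datamaps are cached
    loads_during_query: int = 0               # cache fills that a query had to wait for

    def prime(self, segment_ids):
        """Load datamaps for the given segments ahead of any query."""
        self.cached.update(segment_ids)

    def query(self, segment_ids):
        """Serve a query; segments not yet cached are loaded on demand.

        Returns the number of segments that had to be loaded during the query.
        """
        missing = [s for s in segment_ids if s not in self.cached]
        self.loads_during_query += len(missing)
        self.cached.update(missing)
        return len(missing)


def run_day(cache, segments, preprime):
    """Simulate a day of data loads followed by the first count(*) query."""
    for seg in segments:
        # The data load for `seg` is assumed to finish here.
        if preprime:
            cache.prime([seg])      # pre-prime right after the load
    return cache.query(segments)    # count(*) touches every segment


segments = [f"segment_{i}" for i in range(5)]
lazy_misses = run_day(ToyIndexCache(), segments, preprime=False)
primed_misses = run_day(ToyIndexCache(), segments, preprime=True)
print(lazy_misses, primed_misses)  # prints: 5 0
```

In this model the first query pays for every segment when nothing is pre-primed, and for none when each load is followed by a prime step, which is the effect the proposal is after.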