Hi Akash,

1. Will the performance of the end-to-end data load operation be impacted if the segment datamap is loaded into the cache once the load is finished?
2. Will there be a notification in the logs stating that loading of the datamap cache has completed?

Regards

On 2019/08/15 12:03:09, Akash Nilugal <akashnilu...@gmail.com> wrote:
> Hi Community,
>
> Currently, we have an index server, which helps with distributed
> caching of the datamaps in a separate Spark application.
>
> Caching of the datamaps in the index server starts only when a query is
> fired on the table for the first time. All the datamaps are loaded
> if a count(*) is fired, and only the required ones are loaded for any
> filter query.
>
> The problem, or bottleneck, here is that until a query is fired
> on the table, the caching won’t be done for the table's datamaps.
>
> So consider a scenario where we just load data into the table for a
> whole day and then query it the next day: all the segments will start
> loading into the cache, so the first query will be slow.
>
> What if we load the datamaps into the cache, or pre-prime the cache,
> without waiting for any query on the table?
>
> Yes: what if we load the cache after every load is done? What if we load
> the cache for all the segments at once, so that the first query need
> not do all this work, which makes it faster?
>
> I have attached the design document for the pre-priming of the cache in
> the index server. Please have a look at it;
> any suggestions or inputs on this are most welcome.
>
> https://drive.google.com/file/d/1YUpDUv7ZPUyZQQYwQYcQK2t2aBQH18PB/view?usp=sharing
>
> Regards,
> Akash R Nilugal